
Cell Architectures

A consequence of Service Oriented Architectures is the burning need to provide services at scale. The architecture that has evolved to satisfy this requirement is a little-known technique called the Cell Architecture.

A Cell Architecture is based on the idea that massive scale requires parallelization, and parallelization requires that components be isolated from each other. These islands of isolation are called cells. A cell is a self-contained installation that can satisfy all the operations for a shard. A shard is a subset of a much larger dataset, typically a range of users.
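To make the cell/shard relationship concrete, here is a minimal sketch of routing a request to the cell that owns a user's shard. It assumes shards are contiguous ranges of numeric user IDs; `Cell`, `RangeRouter`, the cell names, and the boundaries are all hypothetical names chosen for illustration, not anything taken from a real system.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A self-contained installation serving one shard of users (hypothetical)."""
    name: str
    # Stands in for the cell's own storage; nothing is shared with other cells.
    store: dict = field(default_factory=dict)

    def handle(self, user_id: int, key: str, value: str) -> None:
        self.store.setdefault(user_id, {})[key] = value


class RangeRouter:
    """Maps a user ID to the cell that owns the ID range containing it."""

    def __init__(self, boundaries, cells):
        # boundaries[i] is the first user ID owned by cells[i + 1].
        if len(cells) != len(boundaries) + 1:
            raise ValueError("need exactly one more cell than boundaries")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        self.boundaries = list(boundaries)
        self.cells = list(cells)

    def cell_for(self, user_id: int) -> Cell:
        # bisect_right counts how many boundaries are <= user_id,
        # which is exactly the index of the owning cell.
        return self.cells[bisect_right(self.boundaries, user_id)]


cells = [Cell("cell-a"), Cell("cell-b"), Cell("cell-c")]
router = RangeRouter([1000, 2000], cells)

# User 1500 falls in [1000, 2000), so the write lands only in cell-b.
router.cell_for(1500).handle(1500, "theme", "dark")
```

Range-based placement is only one option; a lookup table or hash-based placement would fit the same interface.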

Cell Architectures have several advantages:

  • Cells provide a unit of parallelization that can be adjusted to any size as the user base grows.
  • Cells are added in an incremental fashion as more capacity is required.
  • Cells isolate failures. One cell failure does not impact other cells.
  • Cells provide isolation because the storage and application horsepower needed to process requests are independent of other cells.
  • Cells enable nice capabilities like the ability to test upgrades, implement rolling upgrades, and test different versions of software.
  • Cells can fail, be upgraded, and be distributed across datacenters independently of other cells.
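Two of the advantages above, failure isolation and rolling upgrades, can be illustrated with a small simulation. This is a sketch under stated assumptions, not a production design: the modulo placement in `cell_for`, the `healthy` flag, and the `check` callback are all hypothetical simplifications.

```python
from dataclasses import dataclass


class CellUnavailable(Exception):
    """Raised when a request reaches a cell that is down."""


@dataclass
class Cell:
    name: str
    version: str = "v1"
    healthy: bool = True

    def serve(self, user_id: int) -> str:
        if not self.healthy:
            raise CellUnavailable(self.name)
        return f"{self.name}/{self.version} served user {user_id}"


def cell_for(cells, user_id):
    # Simple modulo placement, an assumption made for brevity.
    return cells[user_id % len(cells)]


def rolling_upgrade(cells, new_version, check):
    """Upgrade one cell at a time; stop at the first cell that fails its check."""
    upgraded = []
    for cell in cells:
        previous = cell.version
        cell.version = new_version
        if not check(cell):
            cell.version = previous  # roll back only the cell that failed
            break
        upgraded.append(cell.name)
    return upgraded


cells = [Cell("cell-0"), Cell("cell-1"), Cell("cell-2")]
cells[1].healthy = False  # simulate one cell failing

results = {}
for user_id in range(6):
    try:
        results[user_id] = cell_for(cells, user_id).serve(user_id)
    except CellUnavailable:
        results[user_id] = None  # only users homed to cell-1 are affected

# The upgrade halts at the unhealthy cell, leaving later cells on the old version.
upgraded = rolling_upgrade(cells, "v2", check=lambda c: c.healthy)
```

Because each cell is upgraded and checked on its own, a bad release is contained to the first cell that fails its check rather than reaching every user.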

A number of startups make use of Cell Architectures:

  • Tumblr: Users are mapped into cells and many cells exist per data center. Each cell has an HBase cluster, service cluster, and Redis caching cluster. Users are homed to a cell and all cells consume all posts via firehose updates. Background tasks consume from the firehose to populate tables and process requests. Each cell stores a single copy of all posts.
  • Flickr: Uses a federated approach in which all of a user’s data is stored on a shard, which is a cluster of different services.
  • Facebook: The basic building block of the Messages service is a cluster of machines and services called a cell. A cell consists of ZooKeeper controllers, an application server cluster, and a metadata store.
  • Salesforce: Salesforce is architected in terms of pods. Pods are self-contained sets of functionality consisting of 50 nodes, Oracle RAC servers, and Java application servers. Each pod supports many thousands of customers. If a pod fails only the users on that pod are impacted.
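The pattern in the first example, where users are homed to one cell while every cell consumes the full stream of posts, can be sketched roughly as follows. This is an illustrative toy and not any company's actual code: `Firehose`, `Cell`, `feed_for`, and the in-memory dictionaries are hypothetical stand-ins for the real streaming and storage systems.

```python
from collections import defaultdict


class Cell:
    """Holds its homed users plus its own copy of every post (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.posts = {}                  # post_id -> (author, text)
        self.homed_users = set()
        self.follows = defaultdict(set)  # homed user -> authors they follow

    def consume(self, post_id, author, text):
        # Called for every post on the firehose, whoever the author is.
        self.posts[post_id] = (author, text)

    def feed_for(self, user):
        # Answered entirely from this cell's data; no other cell is contacted.
        if user not in self.homed_users:
            raise KeyError(f"{user} is not homed to {self.name}")
        followed = self.follows[user]
        return [text for _, (author, text) in sorted(self.posts.items())
                if author in followed]


class Firehose:
    """Delivers every published post to every cell."""

    def __init__(self, cells):
        self.cells = list(cells)

    def publish(self, post_id, author, text):
        for cell in self.cells:
            cell.consume(post_id, author, text)


cell_a, cell_b = Cell("cell-a"), Cell("cell-b")
cell_a.homed_users.add("alice")
cell_a.follows["alice"].add("carol")
cell_b.homed_users.add("bob")
cell_b.follows["bob"].update({"carol", "dave"})

firehose = Firehose([cell_a, cell_b])
firehose.publish(1, "carol", "hello")
firehose.publish(2, "dave", "hi")
```

The trade-off in this sketch is storage for isolation: each cell pays to keep all posts so that a read for a homed user never depends on another cell being up.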

The key to the cell is that you are creating a scalable, robust, MTBF-friendly service, one that can be used as a bedrock component in a system of other services coordinated by a programmable orchestration layer. It works just as well in a data center as in a cloud. If you are looking for a higher-level organization pattern, the Cell Architecture is a solid choice.

Reader Comments (8)

Can you elaborate on what you mean by "...coordinated by a programmable orchestration layer"? It almost sounds like you are describing an ESB; hopefully that is not the case.

May 10, 2012 | Unregistered Commenter anonymous

Do you think cells make sense in typical enterprises not focused on one service? I guess the question is at what scale does the idea of a cell break down? We support multiple LOBs with different app/solution design philosophies - does the cell concept work at the micro level within a solution, or for diverse compute environments like ours does EA need to build a macro vision and change behaviours across the org? Perhaps if the orchestration layer is robust enough, and our service catalogue matured enough, we can take advantage of cell based architecture even on smaller scales?

May 10, 2012 | Unregistered Commenter Chris

Anonymous, not an ESB, but really the application glue code you write around the collection of services that are used to implement some function. When that glue code becomes substantial enough, it itself becomes a candidate service. Programmable in the sense of it often being more like scripting, though the logic can be deeply complex. Orchestration in the sense of making all the services work together in ways that they weren't intended to, hooking up inputs to outputs, a pipe with transformation points.

May 12, 2012 | Registered Commenter Todd Hoff

Chris, from the outside I would think of a cell as a service and the service can be implemented very simply or to the degree shown in the examples. Behind the service interface the evolution of the cell will be transparent, so do what makes sense at the time. Sharing between different groups I think would be tough, both functionally and organizationally. Functionally you can go a long way with parameterization and configuration, but eventually there will be schedule problems and budgeting problems and bug fix priority problems and feature priority problems etc. So it might make sense to keep the services closest to the application domain organizationally specific and seek more commonality lower in the stack, say around databases, file storage, and other more generic services. As to what can be a cell, I agree, it can be anything you need to rely upon in your application.

May 12, 2012 | Registered Commenter Todd Hoff

I'm pleasantly surprised that 'cell architecture' resembles the 'stateful shard' approach outlined in my comment at http://highscalability.com/blog/2011/12/14/virtualization-and-cloud-computing-is-changing-the-network-t.html :

"[A stateful shard] has enough information for request processing with minimal (ideally, zero) external communications. A 'shard' is a set of tightly coupled processing and storage units with fast interconnect. Ideally these units must share the same physical machine [or a set of physical machines]."

Industry moves in the correct direction :)

May 12, 2012 | Unregistered Commenter valyala

I completely agree with Cell Architectures and have experience building them. The benefit is availability; the drawback is cost. I find it challenging to build a true "cell" architecture without complete redundancy, including the datacenter and network layers. This is where leadership is usually not willing to pay the cost of the highly improbable 5x9's. However, you may see 4x9's quite often with a cell architecture, with the exception of core network and datacenter redundant build-outs, BUT it would mostly depend on people not doing something dumb, and let's face it, people cause most outages in my experience. To caveat that, you need a truly automated failover capability to a completely isolated and separate network and facility, and due to the cost this is where I would push the cloud. Run what you can out of it if you're an enterprise. Just my 2 cents.


May 14, 2012 | Unregistered Commenter Tuxninja

I doubt that I can use this cell architecture for my application. Is there any patent on this architecture?

June 4, 2012 | Unregistered Commenter Ron Park

I authored a Cell-based Reference Architecture (https://git.io/fpwtf) some time back and came across this post.

February 14, 2019 | Unregistered Commenter Asanka Abeysinghe
