Google Finds: Centralized Control, Distributed Data Architectures Work Better than Fully Decentralized Architectures

For years a war has been fought in the software architecture trenches between the ideal of decentralized services and the power and practicality of centralized services. Centralized architectures, at least at the management and control plane level, are winning. And Google not only agrees, they are enthusiastic adopters of this model, even in places you don't think it should work.

Here's an excerpt from Google Lifts Veil On “Andromeda” Virtual Networking, an excellent article by Timothy Morgan, that includes a money quote from Amin Vahdat, distinguished engineer and technical lead for networking at Google:

"What we have seen is that a logically centralized, hierarchical control plane with a peer-to-peer data plane beats full decentralization,” explained Vahdat in his keynote. “All of these flew in the face of conventional wisdom,” he continued, referring to all of those projects above, and added that everyone was shocked back in 2002 that Google would, for instance, build a large-scale storage system like GFS with centralized control. “We are actually pretty confident in the design pattern at this point. We can build a fundamentally more efficient system by prudently leveraging centralization rather than trying to manage things in a peer-to-peer, decentralized manner.

The context of the article is Google's impressive home brew SDN (software defined network) system that uses a centralized control architecture instead of the Internet's decentralized Autonomous System model, which thinks of the Internet as individual islands that connect using routing protocols.

SDN completely changes that model as explained by Greg Ferro:

The major difference between SDN and traditional networking lies in the model of controller-based networking. In a software-defined network, a centralized controller has a complete end-to-end view of the entire network, and knowledge of all network paths and device capabilities resides in a single application. As a result, the controller can calculate paths based on both source and destination addresses; use different network paths for different traffic types; and react quickly to changing networking conditions.
In addition to delivering these features, the controller serves as a single point of configuration. This full programmability of the entire network from a single location, which finally enables network automation, is the most valuable aspect of SDN.

So a centralized controller knows all and sees all and hardwires routes by directly programming routers. In the olden says slow BGP convergence times after a fault was detected would kill performance. With your own SDN on your hardware failure response times can be immediate, as the centralized controller will program routers with a possibly precalculated alternative route. This is a key feature for today's cloud based systems that demand highly available, low latency connections, even across the WAN.

Does this mean the controller is a single process? Not at all. It's logically centralized, but may be split up among numerous machines as is typical in any service architecture. This is how it can scale. With today's big iron, big memory, and fast networks the motivation for adopting a completely decentralized architecture for capacity reasons is not compelling except for all but the largest problems.

At Internet scale, the Autonomous System model of being logically and physically decentralized is still a win, it can scale wonderfully, but at the price of high coordination costs and slow reactions times. Which was fine in the past, but doesn't work for today's networking needs.

Google isn't running an Internet. They are running a special purpose network for their own particular portfolio of needs. Why should they use an over generalized technology meant for a completely different purpose?

We can see centralization winning in the services that people choose to use.

Email and NNTP, both fully decentralized services, while not dead by any means, have given way to centralized services like Twitter, Facebook, G+, WhatsApp, and push notifications. While decentralization plays an important part in the back-end of most every software service, the services themselves are logically centralized.

Centralization makes a lot of things easier. Search, for example. If you want great search you need all the data in one place. That's why Google crawls the web and stashes it in their very large back pocket. Identity is a dish best served centralized. As are things like follow lists, joins, profiles, A/B testing, frequent pushes, iterative design, fraud detection, DDoS mitigation, deep learning, and virtually any kind of high value add feature you want to create.

Also, having a remote entity not under your control as a key component to your product is inviting a high latency and a variable user experience due to failures. Not something you want in your service. End-to-end control is key for creating an experience.

So when you argue for a fully decentralized architecture it's hard to argue based on features or scalability, you have to look elsewhere.

Decentralization is also a political choice.

Attempts to make a decentralized or federated Twitter service, for example, while technically feasible, have not busted out into general adoption. The simple reason is centralization works and as a user what you want is something that works. That's primary. Secondary qualities like security, owning your own data, resilience, free speech, etc. while of great importance to some, barely register as issues to the many.

But for the few, these secondary qualities are exactly what they prize the most. Doc Searls in articles like Escaping the Black Holes of Centralization makes the case that decentralization is important for human rights and personal sovereignty reasons. A fully distributed and encrypted P2P chat system is a lot harder to compromise than a centralized service run by a large faceless corporation.

When you are thinking about the architecture of your own system...

If it is for personal sovereignty purposes, or it operates at Internet or inter-planetary scale, or it must otherwise operate autonomously then federation is your friend.

If your system is smallish then a completely centralized architecture is still quite attractive.

For the vast middle ground Google has shown centralized management and control combined with distributed data is probably now the canonical architecture. Don't get caught up trying to make distributed everything work. You probably don't need it and it's really really hard.

But then again Oceania has always been at war with Eastasia.