The essence of my work is coming into daily contact with innovative technologies. A recent example was at the request of a partner company who wanted to answer- which one of these tools will best solve my virtualized datacenter headache? After initial analysis all the products could be classified as tools that troubleshoot VM sprawl, but there was no universally accepted term for them. The most descriptive term that I found was Virtual Resource Manager (VRM) from DynamicOps. As I delved deeper into their workings, the distinction between VRMs and Private Clouds became blurred. What are the differences?
Drupal 7 is having a scalability makeover. Karoly Negyesi, Drupal Core Developer and Public Development Team Lead, explains the process in this video: Drupal 7 APIs, scalability mindset. Karoly states the general theme of the changes as: You give up some control and you get back scalability. An interesting comment on the politics of scalability?
Makeover may not be quite the right word though. A makeover implies a cosmetic change, looking better by changing the surface. Drupal's changes will go deeper than that, right to Drupal's core. It's a genuine and authentic change that will hopefully allow one of the Internet's most venerable Content Management Systems (CMSs) to compete with a constant stream of younger and sexier models.
Drupal is based on an older LAMP stack approach where PHP modules are scooped up and merged together each time a request is made to Drupal. Drupal's most intriguing idea is how it is built, expands, and changes by weaving together a single system out of individual components called modules. Built-in modules include comments, RSS, contact forms, forums, and Clean URLs. Add in modules include things like CSE to add Google's Custom Search Engine, modules to add in AdSense, CAPTCHA, and Sitemaps. Drupal establishes AOP extension points that allow modules to work remarkably well together, creating a site that feels like one single site even though it has been constructed from dozens of modules hunted and gathered from all over the digital world.
The problem is the PHP code can directly access the database and directly render to the UI, there is little required layering. Part of Drupal's amazing configurability and extensibility has been how easy it is for everything to work together by changing the database. But when there's no layering it's almost impossible to optimize the system. If you have 20 different modules they each can make 20 separate calls to the database when what we really want is one call. And because of the direct SQL access when the number of writes increases there's no systematic way to distribute the writes across multiple servers. So we see as Drupal sites grow in the number of modules and the number of users both performance and scalability tank.
The younger models architect their systems differently. Sites like Google, Amazon, Facebook are written terms of an API and a framework, a service based approach. Using a service based approach the web tier can be programmed in terms of services that themselves are scalable so the entire system is scalable. When the API is skipped there are no leverage points that can be made to scale. It becomes a big ball of mud.
More layering and more APIs is exactly the direction Drupal is taking. Exactly how is Drupal changing?
Online Social Networks (OSN) face serious scalability challenges due to their rapid growth and popularity. To address this issue we present a novel approach to scale up OSN called One Hop Replication (OHR). Our system combines partitioning and replication in a middleware to transparently scale up a centralized OSN design, and therefore, avoid the OSN application to undergo the costly transition to a fully distributed system to meet its scalability needs. OHR exploits some of the structural characteristics of Social Networks: 1) most of the information is one-hop away, and 2) the topology of the network of connections among people displays a strong community structure. We evaluate our system and its potential benefits and overheads using data from real OSNs: Twitter and Orkut. We show that OHR has the potential to provide out-of-the-box transparent scalability while maintaining the replication overhead costs in check.
Real-time social graphs (connectivity between people, places, and things). That's why scaling Facebook is hard says Jeff Rothschild, Vice President of Technology at Facebook. Social networking sites like Facebook, Digg, and Twitter are simply harder than traditional websites to scale. Why is that? Why would social networking sites be any more difficult to scale than traditional web sites? Let's find out.
Traditional websites are easier to scale than social networking sites for two reasons:
Jeff Rothschild, Vice President of Technology at Facebook gave a great presentation at UC San Diego on our favorite subject: "High Performance at Massive Scale – Lessons learned at Facebook". The abstract for the talk is:
Facebook has grown into one of the largest sites on the Internet today serving over 200 billion pages per month. The nature of social data makes engineering a site for this level of scale a particularly challenging proposition. In this presentation, I will discuss the aspects of social data that present challenges for scalability and will describe the the core architectural components and design principles that Facebook has used to address these challenges. In addition, I will discuss emerging technologies that offer new opportunities for building cost-effective high performance web architectures.
There's a lot of interesting about this talk that we'll get into later, but I thought you might want a head start on learning how Facebook handles 30K+ machines, 300 million active users, 20 billion photos, and 25TB per day of logging data.
I'm not sure how many people who follow this have even tried collectl but I wanted to let you all know that I just released a set of utilities called strangely enough collectl-utils, which you can get at http://collectl-utils.sourceforge.net. One web-based utility called colplot gives you the ability to very easily plot data from multiple systems in a way that makes correlating them over time very easy.
Update: Short presentation NYC by Bryan Fink demonstrating the riak web-shaped data storage engine
- Scalable, decentralized key-value store
- Standard get, put, and delete operations.
- Distributed, fault-tolerant storage solution.
- Configurable levels of consistency, availability, and partition tolerance
- open source and NoSQL
- Pluggable backends
- Eventing system
- Inter-cluster replication
- Links between records that can be traversed.
- Map/Reduce. Functions are executed on the data node. One interesting difference is that a list keys are required to specify which values are operated on as apposed to running calculations on all values.
- Hacker News Thread. More juicy details on how Riak compares to Cassandra, mongodb, couchdb, etc.
This MySQL article lists 5 problems to avoid when scaling out:
- Don't Think Synchronously. Introduce asynchronous communication, parallelization, and strategies to deal with approximate or slightly outdated data.
- Don't Think Vertically. Scaling by bigger machines won't work. Plan on horizontal scaling and asynchronous architectures form the start which make it easy to add capacity on demand.
- Don't Mix Transactions with Business Intelligence. Transactions and analytics are inherently different. Separate out different types of data onto different databases.
- Avoid Mixing Hot and Cold Data. Static and fast changing data are inherently different. Separate out different types of data onto different databases.
- Don't Forget the Power of Memory. Make data accessible in RAM by smartly partitioning data across servers.
More information at Scale-Out & Replication Best Practices for High-Growth Businesses.
There are many reasons to roll your own data storage solution on top of existing technologies. We've seen stories on HighScalability about custom databases for very large sets of individual data(like Twitter) and large amounts of binary data (like Facebook pictures). However, I recently ran into a unique type of problem. I was tasked with recording and storing bandwidth information for more than 20,000 servers and their associated networking equipment. This data needed to be accessed in real-time, with less than a 5 minute delay between the data being recorded and the datashowing up on customer bandwidth graphs on our customer portal.
After numerous false starts with off the shelf components and existing database clustering technology, we decided we must roll our own system. The real key to our problem (literally) was the ratio of the size of the key to the size of the actual data. Because the tracked metric was so small (a 64-bit counter) compared to the unique identifier (32-bit network component ID, 32-bit timestamp, 16-bit data type identifier) existing database technologies would choke on the key sizes.
Eventually it was decided that the best solution was to write our own wrapper for standard MySQL databases. No fancy features, no clustering, no merge tables or partitioning, no extra indexes, just hundreds of thousands of flat tables on as many physical machines as was necessary. I chronicled the whole decision making process in the full article, located here, on our developers' blog.