Update: Presentation: Behind the Scenes at MySpace.com. Dan Farino, Chief Systems Architect at MySpace, shares details of some of MySpace's cool internal operations tools.
MySpace.com is one of the fastest-growing sites on the Internet, with 65 million subscribers and 260,000 new users registering each day. Often criticized for poor performance, MySpace has had to tackle scalability issues few other sites have faced. How did they do it?
- 500,000 Users: A Simple Architecture Stumbles
- 1 Million Users: Vertical Partitioning Solves Scalability Woes
- 3 Million Users: Scale-Out Wins Over Scale-Up
- 9 Million Users: Site Migrates to ASP.NET, Adds Virtual Storage
- 26 Million Users: MySpace Embraces 64-Bit Technology
- They used a database architecture built around the concept of vertical partitioning, with separate databases for parts of the website that served different functions, such as the log-in screen, user profiles, and blogs (see the partitioning sketch after this list).
- The vertical partitioning scheme helped divide up the workload for database reads and writes alike, and when users demanded a new feature, MySpace would put a new database online to support it.
- MySpace switched from using storage devices directly attached to its database servers to a storage area network (SAN), in which a pool of disk storage devices is tied together by a high-speed, specialized network, and the databases connect to the SAN. The change to a SAN boosted performance, uptime, and reliability.
- The vertical partitioning solution didn't last, because some horizontal data, such as user accounts, had to be replicated across every vertical slice. With that many copies to keep in sync, one would inevitably fail and slow down the system.
- Individual applications like blogs on sub-sections of the Web site would grow too large for a single database server.
- Reorganized all the core data into one logical database.
- Split the user base into chunks of 1 million accounts and put all the data keyed to those accounts in a separate instance of SQL Server (see the shard-map sketch after this list).
- Moved to ASP.NET, which used fewer resources than their previous architecture: 150 servers running the new code did the same work that had previously required 246.
- Saw storage bottlenecks again. Implementing a SAN had solved some early performance problems, but the Web site's demands were now periodically overwhelming the SAN's I/O capacity: the speed with which it could read and write data to and from disk storage.
- Hit the limits of the 1-million-accounts-per-database split as the load on individual databases outgrew the storage dedicated to them.
- Moved to a virtualized storage architecture where the entire SAN is treated as one big pool of storage capacity, without requiring that specific disks be dedicated to serving specific applications. MySpace standardized on equipment from a relatively new SAN vendor, 3PARdata.
- Moved to 64-bit SQL Server to work around memory bottlenecks. Their standard database server configuration uses 64 GB of RAM.
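To make the vertical-partitioning idea concrete, here is a minimal sketch in Python of how an application might route queries by site function rather than by user. The feature names and connection strings are hypothetical; MySpace's actual implementation (on SQL Server) is not public.

```python
# Function-based ("vertical") partitioning: each site feature gets its
# own database, and the application routes by feature name.
FEATURE_DATABASES = {
    "login":    "Server=db-login;Database=Login",        # hypothetical DSNs
    "profiles": "Server=db-profiles;Database=Profiles",
    "blogs":    "Server=db-blogs;Database=Blogs",
}

def connection_string_for(feature: str) -> str:
    """Return the connection string for the database serving `feature`.

    Launching a new feature means adding a new entry here, i.e.
    bringing a new database online to support it.
    """
    return FEATURE_DATABASES[feature]

# A blog page talks only to the blogs database...
print(connection_string_for("blogs"))

# ...but data every feature needs, like user accounts, has no single
# home under this scheme and must be replicated into every slice,
# which is the weakness that eventually sank the approach.
```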
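The 1-million-accounts-per-database split is a simple form of range-based horizontal partitioning. Below is a minimal sketch of the idea, assuming a hypothetical shard map; the instance names are illustrative, not MySpace's.

```python
ACCOUNTS_PER_SHARD = 1_000_000

# Hypothetical mapping from shard number to SQL Server instance;
# growth is handled by appending an instance per million accounts.
SHARD_INSTANCES = [
    "Server=sql-shard-00;Database=Users00",
    "Server=sql-shard-01;Database=Users01",
    "Server=sql-shard-02;Database=Users02",
]

def instance_for(account_id: int) -> str:
    """Map an account ID to the SQL Server instance holding its data.

    Accounts 0-999,999 land in shard 0, accounts 1,000,000-1,999,999
    in shard 1, and so on, so all data keyed to one account stays
    together on a single instance.
    """
    return SHARD_INSTANCES[account_id // ACCOUNTS_PER_SHARD]

print(instance_for(2_345_678))  # -> Server=sql-shard-02;Database=Users02
```

One upshot of keying everything to the account is that a typical page view touches a single shard, while anything that spans users has to fan out across instances.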