MySpace Architecture

Todd Hoff's picture

MySpace.com is one of the fastest growing sites on the Internet, with 65 million subscribers and 260,000 new users registering each day. Often criticized for poor performance, MySpace has had to tackle scalability issues few other sites have faced. How did they do it?

Site: http://myspace.com

Information Sources

  • Inside MySpace.com

    Platform

  • ASP.NET 2.0
  • Windows
  • IIS
  • SQL Server

    What's Inside?

  • MySpace processes 1.5 billion page views per day and handles 2.3 million concurrent users during the day.
  • Membership Milestones:
    - 500,000 Users: A Simple Architecture Stumbles
    - 1 Million Users: Vertical Partitioning Solves Scalability Woes
    - 3 Million Users: Scale-Out Wins Over Scale-Up
    - 9 Million Users: Site Migrates to ASP.NET, Adds Virtual Storage
    - 26 Million Users: MySpace Embraces 64-Bit Technology
  • 500,000 accounts was too much load for two web servers and a single database.
  • At 1-2 Million Accounts
    - They used a database architecture built around the concept of vertical partitioning, with separate databases for parts of the website that served different functions such as the log-in screen, user profiles and blogs.
    - The vertical partitioning scheme helped divide up the workload for database reads and writes alike, and when users demanded a new feature, MySpace would put a new database online to support it.
    - MySpace switched from using storage devices directly attached to its database servers to a storage area network (SAN), in which a pool of disk storage devices is tied together by a high-speed, specialized network, and the databases connect to the SAN. The change to a SAN boosted performance, uptime and reliability.
  • At 3 Million Accounts
    - The vertical partitioning solution didn't last because some horizontal information, such as user accounts, had to be replicated across all vertical slices. With so many replications running, one of them failing was enough to slow down the system.
    - Individual applications, such as blogs, on sub-sections of the Web site would grow too large for a single database server.
    - Reorganized the core data so that it is logically treated as one database.
    - Split the user base into chunks of 1 million accounts and put all the data keyed to each chunk in a separate instance of SQL Server.
  • 9 Million–17 Million Accounts
    - Moved to ASP.NET, which used fewer resources than their previous architecture. 150 servers running the new code were able to do the same work that had previously required 246.
    - Saw storage bottlenecks again. Implementing a SAN had solved some early performance problems, but now the Web site's demands were starting to periodically overwhelm the SAN's I/O capacity, that is, the speed with which it could read and write data to and from disk storage.
    - The 1-million-accounts-per-database division also hit its limits as demand on individual databases exceeded what they could serve.
    - Moved to a virtualized storage architecture where the entire SAN is treated as one big pool of storage capacity, without requiring that specific disks be dedicated to serving specific applications. MySpace standardized on equipment from a relatively new SAN vendor, 3PARdata.
  • Added a caching tier: a layer of servers placed between the Web servers and the database servers whose sole job was to capture copies of frequently accessed data objects in memory and serve them to the Web application without the need for a database lookup.
  • 26 Million Accounts
    - Moved to 64-bit SQL Server to work around their memory bottleneck issues. Their standard database server configuration uses 64 GB of RAM.
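
The two partitioning schemes described above can be sketched as simple lookup functions. This is a minimal illustration in Python, not MySpace's routing code: the database names, the function names, and the assumption that account IDs are sequential integers starting at 1 are all hypothetical. Only the features (log-in, profiles, blogs) and the 1-million-account chunk size come from the text.

```python
# Hypothetical sketch: every name below is invented for illustration.
ACCOUNTS_PER_DATABASE = 1_000_000  # chunk size stated in the article

# Vertical partitioning: one database per site function.
FEATURE_DATABASES = {
    "login": "db-login",
    "profiles": "db-profiles",
    "blogs": "db-blogs",
}


def database_for_feature(feature: str) -> str:
    """Vertical scheme: route by which part of the site is being served."""
    return FEATURE_DATABASES[feature]


def database_for_account(account_id: int) -> str:
    """Horizontal scheme: route by account-ID range, 1 million accounts per instance.

    Assumes sequential integer IDs starting at 1 (an assumption, not from the source).
    """
    if account_id < 1:
        raise ValueError("account IDs are assumed to start at 1")
    shard = (account_id - 1) // ACCOUNTS_PER_DATABASE
    return f"db-accounts-{shard:03d}"
```

Under these assumptions, accounts 1 through 1,000,000 land on the first instance and account 1,000,001 on the second, so all data keyed to one account lives in one place instead of being replicated across feature databases.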

    Lessons Learned

  • You can build big websites using Microsoft tech.
  • A cache should have been used from the beginning.
  • The cache is a better place to store transitory data that doesn't need to be recorded in a database, such as temporary files created to track a particular user's session on the Web site.
  • Built-in OS features to detect denial of service attacks can cause inexplicable failures.
  • Distribute your data to geographically diverse data centers to handle power failures.
  • Consider using virtualized storage or clustered file systems from the start. They let you massively parallelize I/O access and add disk as needed without reorganizing existing data.
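
The caching tier mentioned above follows what is commonly called the cache-aside pattern: check the cache first, and go to the database only on a miss. The sketch below is a hedged illustration in Python, not MySpace's implementation. `CacheTier`, `fetch_profile`, the `profile:` key format, and the 60-second expiry are all assumptions made for the example, and an in-process dictionary stands in for what would be separate cache servers.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class CacheTier:
    """In-memory stand-in for a cache server; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale data
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        """Store a value together with its expiry time."""
        self._entries[key] = (self._clock() + self._ttl, value)


def fetch_profile(account_id: int, cache: CacheTier,
                  load_from_db: Callable[[int], Any]) -> Any:
    """Cache-aside read: serve from memory, fall back to the database on a miss."""
    key = f"profile:{account_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(account_id)  # the only path that touches the database
    cache.put(key, value)
    return value
```

The same `put` and `get` calls could hold transitory data such as session state, which never needs to reach the database at all. Note that this sketch treats a stored `None` as a miss, so a real cache would need a separate marker for "known to be absent".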

    Comments

    Resourceful web site!

    I have never seen such a web site which is so much focussed on web site architecture. I am enlightened by the bountiful information it contains!

    Windows vs linux

    64-bit hardware is the key here. Windows got its bad name for poor performance because of the use of 32-bit hardware. Most Linux/Unix systems migrated to 64-bit hardware much earlier. The additional RAM supported by 64-bit machines makes the whole difference in the performance of a machine.

    "You can build big websites

    "You can build big websites using Microsoft tech."

    It seems that you still can't, speaking from personal Myspace experience. It doesn't work as it should with errors popping all over the place. Looks like "scalability" through dropping every second or third request, thus reducing load on the system.

    Cold Fusion

    I think they combine ASP.NET with ColdFusion rather than using only .NET.

    MySpace uses BlueDragon.NET

    MySpace uses BlueDragon.NET, provided by NewAtlanta, to run their CFML code on .NET. Read more about it here: http://blog.newatlanta.com/index.cfm?mode=entry&entry=764C1F4A-89D7-A61A...

    Re: MySpace Architecture

    Thanks for publishing this series, it's a huge resource for anyone that wants to scale. I've referenced it in my blog:

    http://smoothspan.wordpress.com/2007/09/17/who-doesnt-love-java-youd-be-...

    Cheers,

    BW

    cbmeeks's picture

    Re: MySpace Architecture

    150 servers for 26 million accounts?

    That's 173,333 accounts per server.

    Is that really a good benchmark? I don't know either way. How many servers does Facebook use?

    http://codershangout.com
    A place for coders to hangout!

    Re: MySpace Architecture

    Thanks for putting up the information.

    "Their standard database server configuration uses 64 GB of RAM." Wow, 64 GB of RAM, that's huge!

    Re: MySpace Architecture

    Really interesting and useful article, thanks a lot for it!

    Re: MySpace Architecture

    MySpace is a useful site, like Facebook.

    Re: MySpace Architecture

    That's a fascinating article, and also inspiring to see that SQL Server and ASP.NET can (albeit with clever organisation and a lot of hardware) support such a vast user base.

    Thanks for sharing this information.
