How Gravatar scales on hardware

Automattic recently purchased Gravatar and has switched the server onto its hosting platform. They host over 1.7 million blogs, with well over 60,000 new posts submitted each day, generating 10 to 12 million page views per day. Barry has a great post on the changes they've introduced to help Gravatar scale.



Paper: Wikipedia's Site Internals, Configuration, Code Examples and Management Issues

Wikipedia and Wikimedia have some of the best, most complete real-world documentation on how to build highly scalable systems. This paper by Domas Mituzas covers a lot of details about how Wikipedia works, including: an overview of the different packages used (Linux, PowerDNS, LVS, Squid, lighttpd, Apache, PHP5, Lucene, Mono, Memcached), how they use their CDN, how caching works, how they profile their code, how they store their media, how they structure their database access, how they handle search, how they handle load balancing and administration. All with real code examples and examples of configuration files. This is a really useful resource.

Related Articles

  • Wikimedia Architecture
  • Domas Mituzas' Blog


  • Thursday

    Should JSPs be avoided for high scalability?

    I just heard about some web sites where Velocity templates are used to render HTML instead of JSPs, and all the processing is performed in servlets. Can JSPs cause issues with scalability? Thanks, Unmesh



    Who can answer or analyze the image store and visit solution about




    Scaling Operations Saves Money and Scales Faster

    Jesse Robbins at O'Reilly Radar has a nice post on how spending a little up-front time figuring out how to scale your operations process saves money on ops people and saves time when adding and upgrading servers. Adding, monitoring, and upgrading servers can get so incredibly screwed up that a herd of squirrels has to work overtime just to put out a release. Or it can be one-button simple from your automated build system out to your servers. This is one area where "do the simplest thing that could possibly work" is a dumb idea, and Jesse does a good job capturing the advantages of doing it right.



    Hire Facebook, Ning, and Salesforce to Scale for You

    One of the premier scaling strategies is always: get someone else to do the work for you. But unlike Tom Sawyer, you won't have to trick anyone into whitewashing a fence for you. Times have changed. Companies like Ning, Facebook, and Salesforce are more than happy to help. Their price: lock-in.

    Previously you had few options when building a "real" website. You needed to do everything yourself. Infrastructure and application were all yours. Then companies stepped in by commoditizing parts of the infrastructure, but the application was still yours. The next step is full-on, Borg-style, take-no-prisoners assimilation where the infrastructure and application are built as one collective. What you have to decide as someone faced with building a scalable website is whether these new options are worth the price.

    Feeding this explosion of choice is one of the new strategy games on the intertubes: the Internet Platform Game. Ning's Marc Andreessen defines a platform as:

    "a system that can be programmed and therefore customized by outside developers -- users -- and in that way, adapted to countless needs and niches that the platform's original developers could not have possibly contemplated, much less had time to accommodate."

    The idea is you'll win great rewards in exchange for coding to someone else's internet platform. From Ning you'll win a featureful and customizable social networking platform that they are completely responsible for scaling. The cost ranges from free to very reasonable. From Facebook you'll win prime space on the profile page of over 40 million virally infected customers. It's free, but you must make your application scalable enough to handle all those millions. By coding to the Salesforce platform you'll win the same infrastructure that executes 100 million Salesforce transactions a day. The cost of their service is unknown at this time.

    The Three Levels of Internet Platforms

    Mr. Andreessen then went a step further and defined a three level platform categorization scheme:
  • Level 1: Access API. A platform provided in the form of a REST/SOAP web services API. Examples: eBay, Paypal, Flickr, Digg. Your application lives outside the service and their API is your only access point to the system. Scalability is completely up to you. You are basically building a mashup from distributed parts in your own data center.
  • Level 2: Plug-In API. A platform provided in the form of a system for embedding your application inside another application. Examples: Facebook, Eclipse, Firefox. You still use an API, but the user sees an integrated application because your application is using their screen real estate, log in, user accounts and so on. For internet plug-ins scalability is still up to you. The millions of Facebook users running your application must run completely on your servers.
  • Level 3: Deep hosting. A platform provided in the form of an API, Plug-in, and fully hosted runtime environment. Examples: Ning, Salesforce, and Second Life. Your application is completely integrated with a host application framework and runs completely on the host servers. They are responsible for adding machines, maintenance, and management. You are free to just write your application. Amazon is on his original list, but I don't put it there. If Amazon exposed their Dynamo service I would, but since with EC2 you are stuck worrying about database storage they really don't belong here.

    Like the typical depiction of human ascent from amoeba to weapon-wielding, art-appreciating primate, the levels are meant to indicate progress. In reality, though, evolution isn't about progress at all. It's all about survival through adaptation to local ecological niches. And that's how I look at the levels. At each level you gain something and you lose something. You need to select your niche by looking at your talents and needs.

    Why Use an Access API?

    Using open APIs to access services is what has made the internet great. APIs provide the most flexibility at the greatest cost. You get access to a huge number of wonderful services for virtually nothing. The linkage between websites is a relatively simple API and a data definition. You can do anything you want, but you have to build the infrastructure to do it. That's a lot better than building your own map service, your own SMS service, or your own photo sharing service, yet there's still so much work to do. Grid services make the job easier, but the level of expertise it takes to create a scalable site is still very high.

    Why Use a Plug-In?

    Since Facebook is the only internet company in this category the answer is clear why you want to be a Facebook plug-in: to get access to a lot of users, connected by an exploitable social graph, for the purpose of exponentially propagating your application along the graph. Most would be ecstatic to get to hundreds of thousands of regular users on their own standalone site. With Facebook that's very possible.

    The reward is great, but the costs are great too. Your application must be something that can be deconstructed onto Facebook. I don't see Gmail making it as a Facebook app. You must subject yourself to a lot of restrictions to use the Facebook infrastructure. You must trust yourself to a poorly documented system in which it is hard to get anything done. And to top it off: Facebook does not host your application.

    This really blew me away when I first heard about it. When someone says they are offering a platform my immediate assumption is they are hosting your application. That's what a platform is, isn't it? But your application must run on your own hardware.

    Imagine going from 0 to millions of users in the space of a few days. How would you handle that? Well, that's exactly the problem iLike (a popular music sharing site) had when they released their Facebook app. Mr. Andreessen gives a wonderful if somewhat self-serving account of iLike's troubles with viral growth. After launching they posted this on their blog:

    "In our first 20 hours of opening doors we had 50,000 users sign up, and it is only accelerating. (10,000 users joined in the first 12 hrs. 10,000 more users in the next 3 hrs. 30,000 more users in the next 5 hrs!!) We started the system not knowing what to expect, with only 2 servers, but ready with backup. Facebook's rabid userbase chewed up our 2 servers almost instantly. We doubled our capacity to catch up. And then we doubled it again. And again. And again. Oh crap - we ran out of servers!!

    Although [iLike] has a very healthy level of Web traffic, and even though about half of all the servers in our datacenter were sitting unused, idle, as backup capacity, we are now completely maxed out. We just emailed everybody we know across over a dozen Bay Area startups, corporations, and venture firms in a desperate plea to find spare servers so we can triple our capacity for the continued onslaught. Tomorrow we are picking up over 100 servers from different companies to have them installed just to handle the weekend's traffic. (For those who responded to our late night pleas, thank you!)"

    iLike says they now have over 3 million Facebook users and are growing at an astonishing rate of 300,000 users per day. That number of users and growth rate will make almost anyone salivate. Yet how many can afford the hundreds and hundreds of servers it would take to handle all those users, especially with an unclear monetization strategy? Which brings us to Deep Hosting and Mr. Andreessen's end game for the internet's evolution.
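
To get a feel for how fast that growth accelerated, the figures in iLike's post can be run through a little arithmetic. This is only a back-of-the-envelope sketch in Python: the signup numbers and the starting count of 2 servers come from the quoted post, while reading "doubled ... again. And again. And again." as four doublings is my interpretation.

```python
# Signup figures quoted from iLike's blog post: (hours elapsed, new users).
phases = [(12, 10_000), (3, 10_000), (5, 30_000)]

total_hours = sum(hours for hours, _ in phases)
total_users = sum(users for _, users in phases)

# Signups per hour in each phase shows how quickly growth accelerated.
rates = [users / hours for hours, users in phases]

# Starting from 2 servers and doubling four times (an interpretation of the post).
start_servers = 2
doublings = 4
server_history = [start_servers * 2**i for i in range(doublings + 1)]

print(total_hours, total_users)
print([round(rate) for rate in rates])
print(server_history)
```

The hourly signup rate in the last phase is several times the rate in the first, which is why a plan sized for the first day's traffic did not survive the second.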

    Why Use Deep Hosting?

    The trouble with handling application growth under Facebook's large user base has an obvious solution: host your application on their infrastructure. This is exactly what Mr. Andreessen has done with Ning. Out of the Ning box you get an exceptionally functional social networking package. So functional, in fact, that it makes almost anyone think "do I really need to reinvent all this stuff when they've already done it? Can't I just tweak a few things and make it my own?" And that's exactly what Ning wants to hear.

    They've made it so you can completely rebrand their software and add your own features using normal programming tools, yet still host your application on their platform, on their servers, in their datacenter. So you don't have to worry about scaling. It's Ning's job to scale the database, back it up, manage the infrastructure, add servers, and do all the other nasty bits that keep so many people away from deploying successful websites.

    So the temptation is clear. Go with Ning and you immediately get a cool system that will scale and that you can still program if you feel the need. But with all that power comes a price, as usual. You are locked inside a gilded cage. If your application slows down there's not much you can do about it. I found their documentation better than Facebook's, but not very useful for someone looking to get going quickly, and that makes me very nervous when adopting a platform. Yet when they add features, as they frequently do, your app gets them for free. You see some of the same effects here that all Google apps get when the Google stack is improved. And not having to worry about scalability is very attractive, especially at such a reasonable cost.

    Problems with Deep Hosting

    Mr. Andreessen thinks that "in the long run, all credible large-scale Internet companies will provide Level 3 platforms." There are three problems with this argument.
  • One: Ning has the same problem as Salesforce: only their part of the application infrastructure is scalable. What if I want to add a new service that is specific to my application? Let's say I want to send mass emailings for an invitation feature, for example. How do I make my infrastructure for this run inside their platform? I don't. Which means I have to be able to build a scalable infrastructure anyway. Which means I might as well do the whole thing. But Ning might say their functionality is so compelling that it's worth the trade-off. You can always build those as external services. Which brings us back to: if I have to do one part I might as well do it all. And it also brings us to the second problem with the L3 platform model.
  • Two: How compelling will each L3 domain be? You have to be very, very attractive to even get someone to consider assimilating into a platform. Ning has done an excellent job at this. But how many other companies in how many other domains will do as good a job? Precious few, I would think.
  • Three: Mr. Andreessen maintains it is "really easy to learn how to program -- in fact, it's never been easier." So centering the L3 platform definition around programmability is not seen as a concern. But programming is not easy. It's very hard. Especially with such poorly documented systems. The more code you have to write the further you are away from your goal and the further you are away from adoption. This is why we see systems like Drupal with well defined plug-in architectures being very popular. Most people can't and won't ever program, so building things from pre-existing parts (like how our bodies evolved) allows people to get a lot of core functionality with the chance for specialization and expandability.

    What does this mean for you?

    I've found it difficult to reconcile all the different pros and cons of each approach. There is a definite value in all these alternatives. If you have a vision for an application then building it yourself is the only way you'll achieve that vision. So do it yourself. But what good is a vision without users? So go Facebook. But I could get something going very quickly in Ning and then expand over time with much less hassle, even if it's not exactly what I want. So go Ning. What to do? The point of this post isn't to come to a conclusion. The point has been to cover some new and different approaches to scalability so you can spend a few sleepless nights pondering your options too :-)

    Related Articles

  • The three kinds of platforms you meet on the Internet by Marc Andreessen
  • Analyzing the Facebook Platform, three weeks in by Marc Andreessen
  • Q&A with iLike’s Ali Partovi, on Facebook By Eric Eldon
  • I want to understand Ning's architecture and how it works
  • Response to Three Platforms You Meet by Joshua
  • Ning's Developer Documentation
  • Facebook's Application Architecture
  • Salesforce's On-Demand Computing Platform
  • Building a Business on Virtual Infrastructure, Using Google and


  • Sunday

    Paper: Standardizing Storage Clusters (with pNFS)

    pNFS (parallel NFS) is the next generation of NFS and its main claim to fame is that it's clustered, which "enables clients to directly access file data spread over multiple storage servers in parallel. As a result, each client can leverage the full aggregate bandwidth of a clustered storage service at the granularity of an individual file." About pNFS StorageMojo says: pNFS is going to commoditize parallel data access. In 5 years we won’t know how we got along without it. Something to watch.
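
The idea of reading one file from several storage servers in parallel can be sketched in a few lines of Python. This is a toy illustration only, not the pNFS protocol: the "servers" are in-memory dictionaries, and the names `stripe`, `read_parallel`, and `STRIPE_SIZE` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny for illustration


def stripe(data, num_servers):
    """Distribute stripe units round-robin across hypothetical storage servers."""
    servers = [{} for _ in range(num_servers)]
    for offset in range(0, len(data), STRIPE_SIZE):
        index = offset // STRIPE_SIZE
        servers[index % num_servers][index] = data[offset:offset + STRIPE_SIZE]
    return servers


def read_parallel(servers):
    """Fetch every server's stripe units concurrently, then reassemble in order."""
    def fetch(server):
        return dict(server)  # stands in for a network read from one server

    units = {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        for result in pool.map(fetch, servers):
            units.update(result)
    return b"".join(units[i] for i in sorted(units))


original = b"parallel NFS spreads one file over many servers"
servers = stripe(original, num_servers=3)
restored = read_parallel(servers)
print(restored == original)
```

Because each server holds only a slice of the file, a single client can pull from all of them at once instead of queuing behind one file server.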



    Should you build your next website using 3tera's grid OS?

    Update 2: 3tera has added Dynamic Appliances, which are "packaged data center operations like backup, migration or SLAs that users can add to their applications to provide functionality."

    Update: in an effort to help cross the chasm of how to start building a website using their grid OS, 3tera is offering their Assured Success Plan. The idea is to provide training, consulting, and support so you can get started with some confidence you'll end up succeeding.

    If you are starting or extending a website you have a problem: what technologies should you use? Now there are more answers to that question than ever. One new and refreshingly innovative answer is 3tera's grid OS. In this podcast interview with Bert Armijo from 3tera, we'll learn how 3tera wants to change how you build websites. How? By transforming the physical into the virtual and then allowing the virtual to be manipulated as if it were real. Could I possibly be more abstract? Not really. But when I think of what they are doing that's the mental model I see whirling around in my mind. Don't worry, I promise we'll drill down to how it can help you in the real world. Let's see how.

    I think of 3tera's product as like staying at a nice hotel. At home you are in charge. If something needs doing you must do it. If something breaks you must fix it. But at a nice hotel everything just happens for you. Your room is cleaned, beds are made, outrageously expensive candy bars are replaced in the mini-bar, food arrives when you order it and plates disappear when you are done, and the courtesy mint is placed just so on your pillow. You are free to simply enjoy your stay. All the other details of living just happen. That's the same sort of experience 3tera is trying to provide for your website. You can concentrate on your application, and 3tera, through their GUI on the front end and their AppLogic grid operating system on the back end, worries about all the housekeeping.

    I think Bert summed up their goal wonderfully when he said their aim is to:

    Get people's hands off physical boxes and to give them a way to define complex infrastructures in a reusable way that they can then instantiate, trade, sell, or replicate, back up and manage as individual units. This is what AppLogic does incredibly well.

    What they are doing is taking hard physical resources like CPU and storage and decoupling them from their physical sources so you can just order and use them on demand without worrying how it's done under the covers. This is a trend that has been happening for a while, but their grid OS takes that process to the next level. Your physical co-lo cage is now a private virtual data center. Physical boxes, once lovingly spec'ed, bought, and installed, are now allocated on demand from a phalanx of preconfigured and separately maintained servers. Physical storage, once lovingly pieced together from disks, controllers, and networks, is now allocated from a vast unending sea of virtual storage. Physical load balancers are now programs you can create.

    What this means for you is you can take a website architecture you've drawn up on your white board and simply and quickly create it in a data center. It's all configurable from a GUI. You can bring on 10 new web servers with a simple drag and drop operation. It's basically your white board diagram come to life, only you get to skip all the nasty implementation bits. In the virtual world the nasty non-application-related implementation bits are someone else's problem. 3tera's value proposition is pretty easy to understand:
  • Simplify the data center. You no longer need to locate, outfit, staff, maintain, and support a co-lo space.
  • Simplify operations. A few people can manage a lot of machines.
  • Simplify disaster recovery. Failover is complicated and often doesn't work as planned. With AppLogic your redundant data center is always the same because the virtual data center is copied as a unit. You can pick it up and move it anywhere you want.
  • Simplify the cost model for growth. If you grow how are you going to fund your hardware? Growing on a grid is more agile, incremental, and requires less upfront investment.
  • Simplify your architecture. The grid OS provides a powerful implementation model of how you should structure, grow, and maintain your system. You don't need to code it from scratch or think it up yourself.

    In short: customers don't care about your servers. Hardware and the data center do not add value. Your core competency is in your application and running your business, not playing with servers. Well, that's it for the overview. Please listen to this podcast for all the nitty-gritty details. Download audio file (1:16 minutes, mp3).

    Podcast Notes

    I know what you are probably saying. You are saying: "But Todd, the podcast is over an hour long, couldn't you have please made it longer? I have nothing else to do today and I need to waste more time!" What can I say, Bert was very knowledgeable and helpful, and this is a new model for building scalable websites, so I was trying to figure out how I could physically make a website using their product. That takes a lot of questions. I am happy with the result though. I think I have a good picture of how their system works and I think it's well worth investigating if you are in the market for creating or expanding a website. Here are some notes taken from the podcast.
  • They started 3 years ago. At that time nobody could understand what they were trying to build. They have just now been able to build the higher level features, like Smart Appliances, that they wanted to build originally. They've been concentrating on making all the plumbing work.
  • The AppLogic grid operating system allows you to take hard infrastructure (servers, load balancers, firewalls, VPNs, all the boxes you need to make a website) and deploy it in a virtual data center.
  • A virtual data center (VDC) is like a cage you would buy from a co-location service except you operate and manage it through a browser. You can be anywhere in the world and you can use hosting services anywhere in the world.
  • An entry level package ranges from $500 to a few thousand a month. The starting point is 4 - 32 CPUs, some amount of storage, and some amount of bandwidth. You add resources as you need to. Overage charges are passed through to you from the data center provider. They don't mark it up.
  • They don't own any servers. They contract with hosting providers' data centers, like Softlayer and Layeredtech, for a uniform set of resources.
  • They offer templates for a scalable virtualized LAMP infrastructure as a starting point for building your own applications.
  • Their GUI shows you the architecture. You don't have to think of physical boxes.
  • There's a controller for the VDC through which you can provision your system.
  • You can still log in to any physical or logical service. You have root access. You can install anything and manage the system, but you don't have to worry about where it physically resides.
  • To create an application:
    - You use the controller to provision a LAMP cluster.
    - Then you log into the Apache server and configure it how you wish.
    - Then restart and it begins to serve.
    - Say you want 10 front-end web servers.
    - The load balancer is a virtual load balancer you program.
    - You use virtual NAS.
    - Upload code to the NAS.
    - Then have all Apache servers run off the NAS, so you don't have to log into each one and upload code.
  • Shared storage is part of the virtual data center by definition. You can create as many volumes as you wish. All are mirrored for high availability. If a virtual server goes down AppLogic will simply restart on another available resource in the data center.
  • Partners build the grid backbone to which nodes and other resources are attached. AppLogic runs that grid backbone. When you sign up and provision the virtual data center, nodes on the backbone are assigned to your VDC. A controller allows you to provision your VDC. Anything you can do in a co-lo cage you can do here, but there's nothing physical. AppLogic carries out your commands on the grid.
  • They provide standards for the hosting service. A variety of machine classifications are available. They have customers with 50TB of storage. The largest number of CPUs in a single VDC is over 450.
  • To see if the VDC meets your requirements you run a test on the VDC. Once you have resources in your VDC they are not shared with anyone else so you can be confident the performance will be as tested. It's not a VPS. Their customers run production systems. They are all running a business of some sort.
  • Pricing is designed to be attractive for startups, but not artificially low to over-subscribe.
  • Currently there's no data center API. It's scriptable from the CLI. Smart Appliances can package up data center operations into a drag and drop package. You can drag them into any application. Their first Smart Appliance is "follow me," which can move your application to a data center that is close to you. If you are in Asia you can move your data center to Asia. So your data center can follow you around. No coding is needed on your part. Just drag it into your VDC.
  • With AppLogic instead of managing a bunch of different things you manage your application. You do it once. AppLogic maintains the infrastructure for you.
  • In an upgrade of 10 Apache servers you don't upgrade standing infrastructure. You take a copy of your application and upgrade the copy.
  • Let's say you have an Apache server you want to patch. You create one prototype, which they call a class volume. Then when the application restarts all the new changes will be picked up everywhere.
  • The power of what it means to be virtual can be seen in their rollback model. You don't upgrade in place. You upgrade a copy. Because everything is virtual it's easy to make copies of your entire data center. So you can copy your data center, keep the original running, and switch to the upgraded version. If the upgraded version doesn't work you can roll back to the original version of your VDC. This would be almost impossible using traditional methods. An application is the full state of the application with all its data. So you are operating a full, complete copy of the application with all of its data. You can roll back to a complete running instance of the application. You just restart the old version.
  • For upgrades that require transformations, like database upgrades, you can write a script to run a database transformation.
  • They don't over automate. They don't only want to have their way of doing things.
  • They model an application as having two parts, the appliance and the content. For a web server this means:
    - The web server.
    - The content that it's serving.
  • You first create a prototype of what you want your system to look like. This becomes a class from which you later can create instances. There are templates, like the Linux appliance, to build from. Through their on-line system you configure your system, install packages, etc. When it works the way you want you can drag it into your catalog as a template for building new instances. You can create hundreds of copies if you choose.
  • Content would be served off a mount location from inside the VDC.
  • You can upgrade the catalog element and restart the appliance and it will automatically upgrade for you. This is not transactional. It's only on an individual basis.
  • You can pin machines. You can get the environment to make machine specific configurations. You can put appliances into standby so you can quickly add additional resources on demand.
  • Their load balancer is Pound. No spam detection, but it is session aware. You can use others if you want.
  • They specialize in the code that runs the grid. They aren't specialists in load balancers and routers, etc.
  • In the VDC you can share infrastructure. You can email each other a clustered database, for example. You can save and package up an integration effort as an assembly. Save it. Sell it. Share it.
  • You can create an active-active redundancy scheme and pay for only resources you need because you can bring on resources like the front-end when you need them.
  • Many companies periodically make a local copy of their VDC and move it to their disaster center.
    - Remember, with a VDC it's easy to pick up your whole data center and move it somewhere else. The catalog doesn't have to be copied each time. Just the data for applications can be copied over. Not so bad with a fast backbone.
    - Disaster recovery can be triggered by a 3rd party or scripts.
    - This model is sufficient for companies that can accept some down time.
    - If no data loss can be tolerated you need to replicate in an active-active architecture.
  • Some companies maintain fungible data centers. They constantly copy their data center over to backup locations. If an app goes down they can fire up a replacement.
  • With AppLogic you can create a stub that can start an application on demand if it's not already running. This allows you to share resources. You can shut it down at night and save those resources for other applications.
  • Here's how they would handle TechCrunch:
    - Let's say you have an 8 grid data center. Let's say your normal load takes 20-30% of that.
    - The first thing you'll do is use more resources from within the grid.
    - Then reconfigure appliances with more resources and restart them.
    - Then call your provider to add more resources.
    - Softlayer, for example, has a 500-1000 server inventory. So you can add servers to your grid within an hour or two. Currently this process requires human intervention.
  • Finding good ops people is difficult. So with the VDC you can automate most of it and you don't need a big ops team.
  • In your VDC your data center configuration is in the metadata, so it's not kept as tacit knowledge. One or two people can run a thousand servers because you aren't really running servers, you are running applications.
  • Monitoring:
    - AppLogic is in control of all resources. You can build dashboards right off the bat.
    - You can plug your monitored variables into their monitoring system.
    - The data are available over the web.
    - Widgets are available for the display of live stats.
  • A Different Way of Thinking about Your System:
    - Typically you put the database on the fastest server. Instead, they recommend allocating high end machines to everything so your database can run anywhere.
    - Same with a SAN. You don't need a SAN with the storage in the VDC. You are locking yourself into certain ways of thinking that don't apply in the VDC. The concept of using a SAN is just another lock-in.
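
The "upgrade a copy, then switch or roll back" model from the notes above can be sketched in Python. This is a minimal illustration under stated assumptions, not AppLogic's interface: the environment is a plain dictionary, and `upgrade_with_rollback`, the version strings, and the health check are all hypothetical.

```python
import copy


def upgrade_with_rollback(live, upgrade, healthy):
    """Upgrade a copy of the environment; keep the original if the copy fails.

    live    -- dict describing the running environment (hypothetical shape)
    upgrade -- function that mutates an environment in place
    healthy -- function that returns True if an environment passes its checks
    """
    candidate = copy.deepcopy(live)  # the original keeps running untouched
    upgrade(candidate)
    if healthy(candidate):
        return candidate             # switch over to the upgraded copy
    return live                      # roll back: keep using the original


# Hypothetical environment state, including its data.
vdc = {"apache": "2.0", "servers": 10, "data": ["row1", "row2"]}


def good_upgrade(env):
    env["apache"] = "2.2"


def bad_upgrade(env):
    env["apache"] = "broken"


def check(env):
    return env["apache"] != "broken"


after_good = upgrade_with_rollback(vdc, good_upgrade, check)
after_bad = upgrade_with_rollback(vdc, bad_upgrade, check)
print(after_good["apache"], after_bad["apache"], vdc["apache"])
```

The point of the design is that the original is never modified, so rolling back is just a matter of continuing to use it.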

    Some Observations and Conclusions

  • I think the grid/virtualization approach, in one form or another, is the wave of the future. It simply makes it easier for companies to scale applications. And as applications themselves are structured to run natively on a grid, it will become even easier.
  • Reaching the full potential of the virtual data center depends on having a more granular billing strategy and more fine grained control over resource management. For example, if I have a 6 CPU grid and I want to upgrade, I don't want to pay for a 12 CPU data center just so I can upgrade a copy. I don't need 12 normally, I just transiently need 12. So during my upgrade I want my script to trigger allocating a copy of my VDC, do the upgrade, switch to it, and then decommission the old VDC. So I want to have 6 extra servers only for the time it takes to upgrade. Then the old VDC should go away. And I should only be billed for the resources I am using while I am using them. This would also give a more satisfactory solution to the TechCrunch scenario.
  • You need to architect your system to take advantage of the grid. To me this means a shared nothing architecture that can be grown horizontally by adding more machines on demand. Applications should read their configuration off shared storage so the configuration doesn't need to be set up on each machine and you can bring up new machines based on a template. If you need to scale, a new machine should come up and automatically start handling load. Queuing architectures, for example, have this attribute.
  • They need a data center API so you can treat the data center like an object. This would allow you to orchestrate various data centers around the world as a single cooperating unit.
  • Operations within a grid would benefit from standardization. I know this enters the application realm, but operations like upgrade and failover are common and hard. So it would be useful if common processes could be developed and easily deployed.
  • They need turnkey options for those new to the game. As it stands the path from signing up for their service to deploying a web service is a little scary. They are very honest in saying they do only one part of the overall picture. But many people need a painting, not a brush and paint. It would be helpful to have out of the box plans for solving the most common problems people face.

    I would like to thank Bert again for taking the time for the interview! May the grid be with you, always.
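
As an aside on the shared nothing point in the observations above, here is a minimal Python sketch of why queuing architectures scale by simply adding workers. It uses threads and an in-process queue as stand-ins for machines and a real message queue; the `worker` and `run` names and the squaring "job" are hypothetical.

```python
import queue
import threading


def worker(name, jobs, results):
    """Pull jobs until a None sentinel arrives; share no state with other workers."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.put((name, job * job))  # stand-in for real request handling


def run(num_workers, num_jobs):
    jobs, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(f"w{i}", jobs, results))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()       # a new worker starts taking load as soon as it is up
    for n in range(num_jobs):
        jobs.put(n)
    for _ in threads:
        jobs.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()
    return sorted(results.get()[1] for _ in range(num_jobs))


squares = run(num_workers=4, num_jobs=10)
print(squares)
```

Nothing in the producer changes when `num_workers` goes from 1 to 4, which is the property that lets a grid add machines on demand.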

    Related Sites and Articles

  • On-Demand Infinitely Scalable Database Seed the Amazon EC2 Cloud


  • Saturday

    Strategy: Send XHR Request on Lost Focus Instead of For Every Character

    Robert Stewart shared this useful Ajax-related scalability strategy: "We avoided XMLHttpRequests for individual keystrokes, choosing to go back to the server only when a field lost focus. Google can afford all the servers to handle the load for that, but we didn't want to." Do you have a scalability strategy to share? Then share it!
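
To see how much server load the strategy saves, here is a small Python model that counts round trips under each policy. It is a simulation rather than browser code: the event stream, the field names, and the `count_requests` function are hypothetical.

```python
def count_requests(events, policy):
    """Count server round trips for a stream of (kind, field) UI events.

    policy "keystroke" sends one request per key event;
    policy "blur" sends one request when a changed field loses focus.
    """
    requests = 0
    dirty = set()
    for kind, field in events:
        if kind == "key":
            dirty.add(field)
            if policy == "keystroke":
                requests += 1
        elif kind == "blur" and policy == "blur" and field in dirty:
            requests += 1
            dirty.discard(field)
    return requests


# A user types 5 characters into one field and 7 into another.
events = (
    [("key", "name")] * 5 + [("blur", "name")]
    + [("key", "city")] * 7 + [("blur", "city")]
)

per_key = count_requests(events, "keystroke")
per_blur = count_requests(events, "blur")
print(per_key, per_blur)
```

In this model the per-keystroke policy costs one request per character typed, while the lost-focus policy costs one request per edited field, regardless of how much was typed.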



    another approach to replication

    File replication based on erasure codes can reduce the total size of the replicas by a factor of 2 or more.
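
The simplest erasure code, a single XOR parity block, shows where the savings come from. The Python sketch below is an illustration under stated assumptions rather than the scheme the post refers to: it tolerates the loss of only one block, and the `encode`, `recover`, and `xor_blocks` names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data, k):
    """Split data into k equal blocks (zero padded) plus one XOR parity block."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return blocks + [xor_blocks(blocks)]


def recover(blocks, missing):
    """Rebuild the block at index `missing` by XOR-ing all the other blocks."""
    return xor_blocks([b for i, b in enumerate(blocks) if i != missing])


data = b"erasure codes trade CPU for disk"
k = 4
stored = encode(data, k)

replication_bytes = 2 * len(data)            # two full copies survive one loss
erasure_bytes = sum(len(b) for b in stored)  # k data blocks + 1 parity block

lost = 2
rebuilt = recover(stored, lost)
print(replication_bytes, erasure_bytes, rebuilt == stored[lost])
```

Both layouts survive the loss of any one block, but mirroring stores every byte twice while the parity scheme adds only one extra block per k data blocks. Codes with more parity blocks extend the same trade to multiple failures.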
