6 Lessons from Dropbox - One Million Files Saved Every 15 minutes

Dropbox saves one million files every 15 minutes, more than even Twitterers tweet. That mind-blowing statistic was revealed by Rian Hunter, a Dropbox Engineer, in his presentation How Dropbox Did It and How Python Helped at PyCon 2011.

The first part of the presentation is some Dropbox lore, origin stories and other foundational myths. We learn that Dropbox is a startup company located in San Francisco that probably has one of the most popular file synchronization and sharing tools in the world, shipping Python on the desktop, supporting millions of users, and growing every day.

About halfway through, the talk turns technical. Not a lot of info was shared on how Dropbox handles this massive scale, but there were a number of good lessons to ponder:

Click to read more ...


Google and Netflix Strategy: Use Partial Responses to Reduce Request Sizes

This strategy targets reducing the amount of protocol data in packets by sending only the attributes that are needed. Google calls this Partial Response and Partial Update.
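As a rough illustration of the idea, here is a minimal Python sketch of server-side field selection. The `select_fields` helper, the comma-separated `fields` string, and the sample record are all hypothetical names invented for this example; this is not Google's or Netflix's actual implementation or field syntax.

```python
def select_fields(resource, fields):
    """Return a copy of `resource` holding only the requested top-level attributes.

    `fields` is a comma-separated string such as "id,title" (a hypothetical
    format chosen for this sketch). Unknown names are silently ignored.
    """
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return {name: resource[name] for name in wanted if name in resource}


# Hypothetical full resource as the server might hold it.
movie = {
    "id": 42,
    "title": "Example Movie",
    "synopsis": "A long description the client may not need...",
    "cast": ["Actor A", "Actor B"],
}

# The client asks for two attributes, so only those would be serialized.
partial = select_fields(movie, "id,title")
```

The client names what it needs and the server drops everything else before serializing, so the response shrinks without adding a round trip.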

Netflix posted about adopting this strategy in their recent Netflix API redesign. We've seen previously how Netflix improved performance by creating less chatty protocols.

As a consequence, packet sizes rise as more data is stuffed into each packet in order to reduce the number of round trips. But we don't like large packets either (memory usage and packet processing overhead), so we have to think of creative ways to shrink them back down.

The change Netflix is making is to conceptualize their API as a database. What does this mean?

Click to read more ...


Productivity vs. Control tradeoffs in PaaS

Gartner recently published an interesting paper: Productivity vs. Control: Cloud Application Platforms Must Split to Win. (The paper requires registration.)

The paper does a pretty good job covering the evolution taking place in the PaaS market toward a more open platform and compares the two main categories: aPaaS (essentially a PaaS running as a service) and CEAP (Cloud Enabled Application Platform), which is the *P* out of PaaS that gives you the platform to build your own PaaS in a private or public cloud.

While I was reading through the paper, I felt that something continued to bother me about this definition, even though I tend to agree with the overall observation. If I follow the logic of this paper, then I have to give away productivity to gain control. Hmm… that’s a hard choice.

The issue seems to be with the way we define productivity. Read the full details here


Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge

Mobile developers have a huge scaling problem ahead: doing something useful with massive continuous streams of telemetry data from millions and millions of devices. This is a really good problem to have. It means smartphone sales are finally fulfilling their destiny: slaughtering PCs in the sales arena. And it also means mobile devices aren't just containers for simple standalone apps anymore, they are becoming the dominant interface to giant backend systems.

Click to read more ...


Stuff The Internet Says On Scalability For March 4, 2011

Submitted for your reading pleasure on this beautifully blue and sunny Friday...

  • @Werner: Each day #AWS adds enough computing muscle to power one whole Amazon.com circa 2000, when it was a $2.8 billion business
  • Building servers to rule in hell. Datacenters spend a lot of energy on cooling down processors. Why can't they operate at higher temperatures? This is the proposition addressed by James Hamilton in Exploring the Limits of Datacenter Temprature and Datacenter Knowledge in What’s Next? Hotter Servers with ‘Gas Pedals’.
  • Quotable Quotes for 200 Watson:
    • @jreichhold: One thing working at Twitter teaches me daily is that all scale is relative. What seemed impossible last year is now the daily case.
  • Scalability Porn:
    • Storage, you ain't seen nothing yet, wait until every smart phone is equipped with a new gigapixel camera. These new one billion plus pixel images will take upwards of 30GB to store.

    Click to read more ...


Stack Overflow Architecture Update - Now at 95 Million Page Views a Month

A lot has happened since my first article on the Stack Overflow Architecture. Contrary to the theme of that last article, which lavished attention on Stack Overflow's dedication to a scale-up strategy, Stack Overflow has both grown up and out in the last few years.

Stack Overflow has grown up by more than doubling in size to over 16 million users and multiplying its number of page views nearly 6 times to 95 million page views a month.

Stack Overflow has grown out by expanding into the Stack Exchange Network, which includes Stack Overflow, Server Fault, and Super User for a grand total of 43 different sites. That's a lot of fruitful multiplying going on.

Click to read more ...


Sponsored Post: ScaleOut, aiCache, WAPT, Karmasphere, Kabam, Opera Solutions, Newrelic, Cloudkick, Membase, Joyent, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

Cool Products and Services

  • ScaleOut StateServer - Scale Out Your Server Farm Applications!
  • aiCache creates a better user experience by increasing the speed, scale and stability of your website.
  • WAPT is a load, stress and performance testing tool for websites and web-based applications.
  • Karmasphere is bringing Apache Hadoop power to developers and analysts. Download your Free Community Edition today!
  • Newrelic - What are you doing to ensure the performance of your apps?
  • Cloudkick - monitor & manage your servers better with a FREE Cloudkick developer account.
  • Learn how two game developers prepared for rapid user growth in this recorded Joyent webinar.
  • CloudSigma. Instantly scalable European cloud servers.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • Site24x7: Monitor End User Experience from a global monitoring network.

Click to read more ...


A Practical Guide to Varnish - Why Varnish Matters

This is a guest post by Jeff Su from Factual.

What is Varnish?

Varnish is an open source, high performance HTTP accelerator that sits in front of a web stack and caches pages. This caching layer is very configurable and can be used for both static and dynamic content.

One great thing about Varnish is that it can improve the performance of your website without requiring any code changes. If you haven’t heard of Varnish (or have heard of it, but haven’t used it), please read on. Adding Varnish to your stack can be completely noninvasive, but if you tweak your stack to play along with some of Varnish’s more advanced features, you’ll be able to increase performance by orders of magnitude.
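To make the "caching layer in front of a web stack" idea concrete, here is a minimal Python sketch of a time-based cache wrapping a backend. It is only an illustration of the general pattern and does not reflect how Varnish is implemented or configured; `TTLCache`, `slow_backend`, and the 60 second lifetime are hypothetical names and values chosen for this example.

```python
import time


class TTLCache:
    """Tiny time-based cache standing in for a caching layer in front of a backend."""

    def __init__(self, backend, ttl_seconds, clock=time.monotonic):
        self.backend = backend      # callable: url -> response body
        self.ttl = ttl_seconds      # how long a cached body stays fresh
        self.clock = clock          # injectable so expiry can be exercised in tests
        self.store = {}             # url -> (expires_at, body)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: ask the backend and remember the answer.
        self.misses += 1
        body = self.backend(url)
        self.store[url] = (now + self.ttl, body)
        return body


def slow_backend(url):
    # Hypothetical stand-in for the web stack that renders a page.
    return "<html>page for %s</html>" % url


cache = TTLCache(slow_backend, ttl_seconds=60)
first = cache.get("/home")    # miss: rendered by the backend
second = cache.get("/home")   # hit: served from the cache
```

The backend is untouched in this sketch, which mirrors the point above: the cache can be added in front without changing application code.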

Some of the high profile companies using Varnish include: Twitter, Facebook, Heroku and LinkedIn.

Click to read more ...


Strategy: Eliminate Unnecessary SQL

MySQL Expert Ronald Bradford explains that one key way to improve the scalability of a MySQL server, and undoubtedly nearly every other server, is to eliminate unnecessary SQL. As he puts it, the most efficient way to improve an SQL statement is to eliminate it:

The MySQL kernel can only physically process a certain number of SQL statements for a given time period (e.g. per second). Regardless of the type of machine you have, there is a physical limit. If you eliminate SQL statements that are unwarranted and unnecessary, you automatically enable more important SQL statements to run. There are numerous other downstream affects, however this is the simple math. To run more SQL, reduce the number of SQL you need to run.

Ronald shows how to use mk-query-digest to look at query execution times and determine which ones can be profitably whacked. 
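For a feel of how repeated statements can be spotted, here is a much simplified Python sketch that groups queries by shape and counts them. It is not mk-query-digest's actual logic and it ignores execution times entirely; `fingerprint`, `most_repeated`, and the sample log are hypothetical and exist only for this illustration.

```python
import re
from collections import Counter


def fingerprint(sql):
    """Collapse literals so queries that differ only in values group together."""
    text = sql.strip().lower()
    text = re.sub(r"'[^']*'", "?", text)   # quoted string literals
    text = re.sub(r"\b\d+\b", "?", text)   # bare numeric literals
    return re.sub(r"\s+", " ", text)       # normalize whitespace


def most_repeated(queries, top=3):
    """Return (fingerprint, count) pairs for the most frequently seen query shapes."""
    return Counter(fingerprint(q) for q in queries).most_common(top)


# Hypothetical query log captured while serving one page.
log = [
    "SELECT name FROM users WHERE id = 7",
    "SELECT name FROM users WHERE id = 9",
    "SELECT name  FROM users WHERE id = 7",
    "SELECT * FROM settings WHERE site = 'main'",
]
report = most_repeated(log)
```

Query shapes that show up many times per request are natural candidates to cache, batch, or remove outright.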

Click to read more ...


This stuff isn't taught, you learn it bit by bit as you solve each problem.

"For the things we have to learn before we can do them, we learn by doing them." -- Aristotle

A really nice Internet moment happened in the HackerNews thread Disqus: Scaling the World’s Largest Django Application, when David Kitchen crafted an awesome response to a question about how you learn to build scalable systems. It's so good I thought I would reproduce it here.

Question: asked by grovulent:

Not like this is a problem I have to worry about. But where on earth does one learn this stuff? The talk is useful - as an overview of what they use - but I know nothing of how to implement a single step.

Answer: answered by David Kitchen of buro9:

Click to read more ...