How many times have we all seen performance tests pass with flying colors on the test systems, only to watch the same software perform poorly once it is deployed in production? Read More Here...
"But it is not complicated. [There's] just a lot of it."
-- Richard Feynman on how the immense variety of the world arises from simple rules.
- Have We Reached the End of Scaling?
- Applications Become Black Boxes Using Markets to Scale and Control Costs
- Let's Welcome our Neo-Feudal Overlords
- The Economic Argument for the Ambient Cloud
- What Will Kill the Cloud?
- The Amazing Collective Compute Power of the Ambient Cloud
- Using the Ambient Cloud as an Application Runtime
- Applications as Virtual States
We have not yet begun to scale. The world is still fundamentally disconnected and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scaling applications.
Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider we have no planet-wide applications yet. None.
Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as different as the image of a bare lifeless earth looks to us today. We will have 10 billion people, we will have trillions of things, and we will have a great multitude of social networks densely interconnecting all these people to people, things to things, and people to things.
How can we possibly build planet-scalable systems to handle this massive growth if building much smaller applications currently stresses architectural best practices past breaking? We can't. We aren't anywhere close to building applications at this scale, except for perhaps Google and a few others, and there's no way you and I can reproduce what they are doing. Companies are scrambling to raise hundreds of millions of dollars in order to build even more datacenters. As the world becomes more and more global and more and more connected, handling the load may require building applications 4 or 5 orders of magnitude larger than any current system. The cost for an infrastructure capable of supporting planet-scale applications could be in the trillion dollar range (very roughly estimated at $100 million a data center times 10K).
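As a back-of-envelope check, the two rough figures quoted above ($100 million per data center, 10K data centers) can simply be multiplied out. This is only a sketch: the variable names are illustrative, and the inputs are the post's own very rough estimates, not measured costs.

```python
# Back-of-envelope check of the infrastructure estimate quoted above.
# Both inputs are the post's own rough figures, not measured data.
COST_PER_DATACENTER_USD = 100_000_000  # $100 million per data center
DATACENTER_COUNT = 10_000              # "10K" data centers

total_usd = COST_PER_DATACENTER_USD * DATACENTER_COUNT
total_trillions = total_usd / 1e12

print(f"Estimated total: ${total_usd:,} (about {total_trillions:g} trillion dollars)")
```

With these inputs the product is on the order of $1 trillion; reaching the 10 trillion range would take ten times as many data centers or ten times the cost per data center.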
If you aren't Google, or a very few other companies, how can you possibly compete? For a glimmer of a possible direction that may not require a kingdom's worth of resources, please take a look at this short video:
This post draws out some of the common patterns behind the various NOSQL alternatives and describes how they address the database scalability challenge.
Read the full story here
One of the core assumptions behind many of today’s databases is that disks are reliable. In other words, your data is “safe” if it is stored on a disk, and indeed most database solutions rely heavily on that assumption. Is it a valid assumption?
Read the full story here
I try to keep this blog targeted and on topic. So even though I may be thankful for the song of the tiniest sparrow at sunrise, I'll save you from all that. It's hard to tie scalability and the giving of thanks together, especially as it sometimes occurs to me that this blog may be a self-indulgent waste of time. But I think I found a sentiment in A New THEORY of AWESOMENESS and MIRACLES by James Bridle that manages to marry the topic of this blog and giving thanks meaningfully together:
I distrust commercial definitions of innovation, and particularly of awesomeness. It’s an overused term. When I think of awesomeness, I want something awe-inspiring, vast and mind-expanding.
So I started thinking about things that I think are awesome, or miraculous, and for me, it kept coming back to scale and complexity.
We’re not actually very good about thinking about scale and complexity in real terms, so we have to use metaphors and examples. Douglas Adams writes somewhere about how big the Hitchhiker’s Guide to the Galaxy actually is—imagine a sheet of paper, then a filing cabinet full of sheets of paper, then a room full of filing cabinets, then a skyscraper full of rooms, then a city full of skyscrapers, a country, a planet, a solar system and so on. I couldn’t find the exact quote, so his thoughts on space will have to do:
Just wonderful. I especially love the quote "So I started thinking about things that I think are awesome, or miraculous, and for me, it kept coming back to scale and complexity." This perfectly sums up why the topic of scalability is so endlessly diverting. It can take you anywhere you want to go and everything eventually ends up back again.
Thanks for reading and...
- Eventual Consistency by Example by Sergio Bossa. Attempts to clear up some misconceptions about eventual consistency as discussed in Amazon's Dynamo paper.
- Boston Big Data Summit keynote outline by Curt Monash. Interesting topics: Big Data and the cloud actually have relatively little to do with each other and The NoSQL movement is a lot like the Ron Paul campaign.
- I think RDBMS has set the industry back by 10 years by Henry G. Baker, Ph.D, from 1992. "I can categorically state that relational databases set the commercial data processing industry back at least ten years and wasted many of the billions of dollars that were spent on data processing." Henry thought OO databases would change things. They didn't. The question is why?
- Intel cloud service tests the scalability of your code. Intel has a cloud-based tool that can test how your application will perform on a number of multicore processor configurations -- 1, 2, 4, 8, or 16 hardware threads.
- Mapreduce 1, a lecture by Brian Harvey.
- Gear6 has released a software version of their cache product. Interesting departure from the appliance model. Appliances are good because they allow you complete control and something to hang some margin off of. Yet if you want to sell into the cloud you have to build software components, not a hardware solution. Seems like a good idea for those who want a tricked out memcached solution out of the box.
- Hadoop at Twitter (part 1): Splittable LZO Compression. How Twitter is using Hadoop to analyze a tweasure trove of tweets.
- A funny/insightful/sad/truish Dilbert cartoon on how clouds fit into Dilbert's world.
Contributed by Wolfgang Gentzsch:
Now that we have a new computing paradigm, Cloud Computing, how can Clouds help our data? Replace our internal data vaults as we hoped Grids would? Are Grids dead now that we have Clouds? Despite all the promising developments in the Grid and Cloud computing space, and the avalanche of publications and talks on this subject, many people still seem to be confused about internal data and compute resources, versus Grids versus Clouds, and they are hesitant to take the next step. I think there are a number of issues driving this uncertainty.
read more at: BigDataMatters.com
You don't even have to make a bid: Randy Shoup, an eBay Distinguished Architect, gives this presentation on how eBay scales for free. Randy has done a fabulous job in this presentation, and in other talks listed at the end of this post, of getting at the heart of the principles behind scalability. It's more about ideas of how things work and fit together than a focus on a particular technology stack.
In case you weren't sure, eBay is big, with lots of: users, data, features, and change...
- Over 89 million active users worldwide
- 190 million items for sale in 50,000 categories
- Over 8 billion URL requests per day
- Hundreds of new features per quarter
- Roughly 10% of items are listed or ended every day
- In 39 countries and 10 languages
- 70 billion read / write operations / day
- Processes 50TB of new, incremental data per day
- Analyzes 50PB of data per day
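To get a feel for what the daily numbers above mean, here is a rough sketch that converts two of them into average per-second rates. The names `daily_figures` and `per_second` are made up for illustration, and these are plain averages: real traffic is bursty, so peak rates would be higher.

```python
# Convert the per-day figures listed above into average per-second rates.
# Averages only: real traffic is bursty, so peaks would be higher.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_figures = {
    "URL requests": 8_000_000_000,            # "over 8 billion URL requests per day"
    "read/write operations": 70_000_000_000,  # "70 billion read / write operations / day"
}


def per_second(per_day: int) -> float:
    """Average rate per second for a daily total."""
    return per_day / SECONDS_PER_DAY


rates = {name: per_second(total) for name, total in daily_figures.items()}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} per second on average")
```

That works out to roughly 93 thousand URL requests and 810 thousand read/write operations per second on average.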