Strategy: Cache Application Start State to Reduce Spin-up Times

Using this strategy, Valyala, a commenter on Are Long VM Instance Spin-Up Times In The Cloud Costing You Money?, was able to reduce their GAE application start-up times from 15 seconds down to 1.5 seconds:

Click to read more ...
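The excerpt above does not show how Valyala actually did it, so what follows is only a generic sketch of the idea in the title: build the expensive start-up state once, serialize it, and have later instances load the serialized copy instead of rebuilding. The names `load_or_build`, `cache_path`, and `build` are hypothetical, and the sketch assumes a writable local disk; on a platform without one, the same pattern would need a different store.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build):
    """Return cached start-up state if present, otherwise build and cache it.

    `build` is a hypothetical zero-argument function that performs the slow
    initialization (parsing config, compiling templates, and so on).
    """
    try:
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        pass  # No usable cache: fall through to a full build.

    state = build()

    # Write to a temp file and rename it, so a concurrently starting
    # instance never reads a half-written cache file.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(cache_path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, cache_path)
    return state
```

The cache has to be invalidated whenever the code or configuration that produces the state changes, for example by putting a version string in the file name. Only unpickle files your own application wrote, since loading a pickle can execute arbitrary code.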


Paper: NoSQL Databases - NoSQL Introduction and Overview

Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases. The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper:

Click to read more ...


Sponsored Post: Gazillion, Edmunds, OPOWER, ClearStone, deviantART, ScaleOut, aiCache, WAPT, Karmasphere, Kabam, Newrelic, Cloudkick, Membase, Joyent, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

  • Gazillion Entertainment is looking for a Web Developer Generalist to work on massively multiplayer online games. Please apply here
  • Edmunds helps people find the car that meets their every need. We’re currently hiring talented Java Developers in the Los Angeles area.
  • OPOWER motivates millions to become more energy efficient, and we're hiring!
  • deviantART is looking for Infrastructure and Database Operations Engineers! 
  • Kabam is looking for a Quantitative Analyst and a Senior Data Engineer to join the Business Intelligence group at our social gaming startup.

Fun and Informative Events

Cool Products and Services

  • APM (Application Performance Management) for NOSQL, Java and More - Try ClearStone 5.0. Download ClearStone 5.0 today!
  • ScaleOut StateServer - Scale Out Your Server Farm Applications!
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your web-site.
  • WAPT is a load, stress and performance testing tool for websites and web-based applications.
  • Karmasphere is bringing Apache Hadoop power to developers and analysts. Download your Free Community Edition today!
  • Newrelic - What are you doing to ensure the performance of your apps?
  • Cloudkick - monitor & manage your servers better with a FREE Cloudkick developer account.
  • Learn how two game developers prepared for rapid user growth in this recorded Joyent webinar:
  • CloudSigma. Instantly scalable European cloud servers.
  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.
  • Site24x7: Monitor End User Experience from a global monitoring network.
To read more on each sponsor please click below...

Click to read more ...


Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast

Mozilla processes terabytes of Firefox crash reports daily using HBase, Hadoop, Python and the Thrift protocol. The project is called Socorro, a system for collecting, processing, and displaying crash reports from clients. Today the Socorro application stores about 2.6 million crash reports per day. During peak traffic, it receives about 2.5K crashes per minute.

In this article we are going to demonstrate a proof of concept showing how Mozilla could integrate Hazelcast into Socorro to cache and process 2TB of crash reports in memory with a 50-node Hazelcast cluster. The video for the demo is available here.


To read the rest of the article please click below...

Click to read more ...


Stuff The Internet Says On Scalability For April 8, 2011

Submitted for your reading pleasure on this tomato killing frosty morn...

For the rest of the Stuff the Internet Says please read on...

Click to read more ...


Paper: A Co-Relational Model of Data for Large Shared Data Banks

Let's play a quick game of truth or sacrilege: are SQL and NoSQL really just two sides of the same coin? That's what Erik Meijer and Gavin Bierman would have us believe in their "we can all get along and make a lot of money" article in the Communications of the ACM, A Co-Relational Model of Data for Large Shared Data Banks. You don't believe it? It's math, so it must be true :-) Some key points:

In this article we present a mathematical data model for the most common noSQL databases—namely, key/value relationships—and demonstrate that this data model is the mathematical dual of SQL's relational data model of foreign-/primary-key relationships

...we believe that our categorical data-model formalization and monadic query language will allow the same economic growth to occur for coSQL key-value stores.

...In contrast to common belief, the question of big versus small data is orthogonal to the question of SQL versus coSQL. While the coSQL model naturally supports extreme sharding, the fact that it does not require strong typing and normalization makes it attractive for "small" data as well. On the other hand, it is possible to scale SQL databases by careful partitioning.

What this all means is that coSQL and SQL are not in conflict, like good and evil. Instead they are two opposites that coexist in harmony and can transmute into each other like yin and yang. Because of the common query language based on monads, both can be implemented using the same principles.
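To make the "dual" claim a little more concrete, here is a toy illustration and not the paper's categorical formalization. In the relational view, child rows point at their parent through a foreign key; in the key/value view the arrow is reversed and the parent holds the keys of its children. All table names, keys, and helper functions below are made up for the example.

```python
# Relational (SQL) view: each track row carries a foreign key to its album.
albums = [{"id": 1, "title": "Kind of Blue"}]
tracks = [
    {"id": 10, "album_id": 1, "title": "So What"},
    {"id": 11, "album_id": 1, "title": "Blue in Green"},
]


def tracks_of(album_id):
    """SQL-style lookup: scan the child table for matching foreign keys."""
    return [t["title"] for t in tracks if t["album_id"] == album_id]


# Key/value (coSQL) view: the album value holds the keys of its tracks.
store = {
    "album:1": {"title": "Kind of Blue", "tracks": ["track:10", "track:11"]},
    "track:10": {"title": "So What"},
    "track:11": {"title": "Blue in Green"},
}


def tracks_of_kv(album_key):
    """Key/value-style lookup: follow the keys stored in the parent."""
    return [store[key]["title"] for key in store[album_key]["tracks"]]
```

Both functions return the same track titles for the same album; the only difference is which side of the relationship stores the pointer.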

I'm certainly in no position to judge this work, or what it means at some deep level. After reading a thousand treatments of monads I still have no idea what they are. But, like the Standard Model in physics, it would be satisfying if some unifying principles underlay all this stuff. Would we all get along? That's a completely different question...


Netflix: Run Consistency Checkers All the time to Fixup Transactions

You might have consistency problems if you have: multiple datastores in multiple datacenters, without distributed transactions, and with the ability to alternately execute out of each datacenter; syncing protocols that can fail or sync stale data; distributed clients that cache data and then write old data back to the central store; a NoSQL database that doesn't have transactions between updates of multiple related key-value records; application level integrity checks; client driven optimistic locking.

Sounds a lot like many evolving, loosely coupled, autonomous, distributed systems these days. How do you solve these consistency problems? Siddharth "Sid" Anand of Netflix talks about how they solved theirs in his excellent presentation, NoSQL @ Netflix : Part 1, given to a packed crowd at a Cloud Computing Meetup.

You might be inclined to say how silly it is to have these problems in the first place, but just hold on. See if you might share some of their problems, before getting all judgy:

Click to read more ...
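As a rough idea of what a background consistency checker can look like, here is a minimal sketch. It is not Netflix's implementation; it assumes two stores modeled as plain dictionaries and a hypothetical record shape of `(version, value)`, and it repairs any divergence by copying the higher-versioned record to both sides.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str]  # (version, value); hypothetical record shape


def check_and_repair(primary: Dict[str, Record],
                     replica: Dict[str, Record]) -> List[str]:
    """Compare two stores key by key and fix up any divergence.

    Returns the sorted list of keys that had to be repaired.
    """
    repaired = []
    for key in sorted(primary.keys() | replica.keys()):
        p, r = primary.get(key), replica.get(key)
        if p == r:
            continue
        # The record with the higher version wins; a missing record loses.
        # If versions tie but values differ, the primary's copy is kept.
        candidates = [rec for rec in (p, r) if rec is not None]
        winner = max(candidates, key=lambda rec: rec[0])
        primary[key] = winner
        replica[key] = winner
        repaired.append(key)
    return repaired
```

Run on a schedule, a checker like this converges the two stores even when individual syncs fail. Note that this sketch treats a missing record as "not yet replicated", so a real deletion would be resurrected; handling deletes would need tombstones or a similar marker.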


Scaling Social Ecommerce Architecture Case Study

A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source)

Sears Architect Tomer Gabel provides an insightful overview of how they built a Social Ecommerce solution that can handle complex relationship queries in real time. Tomer goes through:

  • the architectural considerations behind their solution
  • why they chose memory over disk
  • how they partitioned the data to gain scalability
  • why they chose to execute code with the data using GigaSpaces Map/Reduce execution framework
  • how they integrated with Facebook
  • why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale

In this post I tried to summarize the main takeaways from the interview.

You can also watch the full interview (highly recommended).

Read the full story here


Stuff The Internet Says On Scalability For April 1, 2011

Submitted for your reading pleasure, no foolin'...

  • Quotable Quotes:
    • @zateriosystems: thinking about scalability?, are you OK to double your capacity in one week?, a startup should be ready...ready to jump.
    • @sklacy: Maybe what I should have said is "Design for scalability, deploy without it."
    • @MikeHale: Scalability is customer 2000 having the same experience as customer 1 #sqlsat67
    • @LusciousPear: The meaning of #NoSQL is shut up
    • @deobrat: The biggest bottleneck to scalability are ignorant developers. Most don't even try saving extra CPU cycles or memory bytes :(
    • @w_westendorp: .@ijansch: Cloud computing is like outsourcing your scalability problems
    • @edyavno: Billy Newport essentially just affirmed the theme I've been propagating: "Distributed Caching is the enterprise NoSQL" #strangeloop #nosql
    • @monkchips: HP CEO Leo Apotheker says "relational databases are becoming less and less relevant to the future stack"
For more Stuff the Internet Says please keep on reading...

Click to read more ...


8 Lessons We Can Learn from the MySpace Incident - Balance, Vision, Fearlessness

A surprising amount of heat and light was generated by the whole Microsoft vs MySpace discussion. Why people feel so passionate about this I'm not quite sure, but fortunately for us, in the best sense of the web, it generated an amazing number of insightful comments and observations. If we stand back and take a look at the whole incident, what can we take away that might help us in the future?

Click to read more ...