Saturday
Apr 16, 2011

The NewSQL Market Breakdown

Matt Aslett of the 451 Group coined the term “NewSQL”. On the definition of NewSQL, Aslett writes:

“NewSQL” is our shorthand for the various new scalable/high performance SQL database vendors. We have previously referred to these products as ‘ScalableSQL’ to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all the products, we adopted the term ‘NewSQL’ in the new report.

And to clarify, like NoSQL, NewSQL is not to be taken too literally: the new thing about the NewSQL vendors is the vendor, not the SQL.

As with NoSQL, the NewSQL umbrella covers a variety of vendors offering quite different solutions.

I think these can be divided into several sub-types:

Click to read more ...

Friday
Apr 15, 2011

Stuff The Internet Says On Scalability For April 15, 2011

Submitted for your reading pleasure...

Luxury is an ancient notion.  There was once a Chinese mandarin who had himself wakened three times every morning simply for the pleasure of being told it was not yet time to get up.  ~Argosy

  • We have a Quotable Quote machine for you today:
    • @kevinweil: Twitter monthly signups have increased more than 50% since December, and we're now doing well over 150 million Tweets per day.
    • @ChrisShain: Prediction: Black art of query optimization will become black art of #nosql data modeling, for same reasons. Minimize IOs, query time.
    • @ui_matters: Infrastructure as a Service = no hardware headaches. Platform as a Svc = no scalability headaches. SaaS = common dev platform #amchamtech
    • @plcstpierre: Thinking about high scalability stuff... I never thought database stuff can be interesting...
    To read more of what the Internet is saying on scalability please read below...

    Click to read more ...

Thursday
Apr 14, 2011

Strategy: Cache Application Start State to Reduce Spin-up Times

Using this strategy, Valyala, a commenter on Are Long VM Instance Spin-Up Times In The Cloud Costing You Money?, was able to reduce their GAE application start-up time from 15 seconds down to 1.5 seconds:

Click to read more ...
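
To give a feel for the general idea in the title, here is a minimal sketch of caching start state. It is not Valyala's actual code: the `load_or_build` helper, the pickle format, and the local cache file are all assumptions made for illustration, and the file simply stands in for whatever cache your platform offers. The expensive initialization runs once, and later starts just deserialize the result.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build):
    """Return the cached start state if it can be read; otherwise build and cache it.

    `build` is a hypothetical zero-argument function that does the slow
    start-up work (parsing config, compiling templates, and so on).
    """
    try:
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        # Missing or unreadable cache: fall through to the slow path.
        state = build()

    # Write to a temp file and rename it, so a crash mid-write
    # cannot leave a half-written cache behind.
    directory = os.path.dirname(cache_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, cache_path)
    return state
```

The cached state must be invalidated whenever the code or data it was built from changes; the sketch leaves that out.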

Wednesday
Apr 13, 2011

Paper: NoSQL Databases - NoSQL Introduction and Overview

Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction to and overview of NoSQL databases. The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a short taste of what he was trying to accomplish in his paper:

Click to read more ...

Tuesday
Apr 12, 2011

Sponsored Post: Gazillion, Edmunds, OPOWER, ClearStone, deviantART, ScaleOut, aiCache, WAPT, Karmasphere, Kabam, Newrelic, Cloudkick, Membase, Joyent, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

  • Gazillion Entertainment is looking for a Web Developer Generalist to work on massively multiplayer online games. Please apply here
  • Edmunds.com helps people find the car that meets their every need.  We’re currently hiring talented Java Developers in the Los Angeles area.
  • OPOWER motivates millions to become more energy efficient, and we're hiring!
  • deviantART is looking for Infrastructure and Database Operations Engineers! 
  • Kabam is looking for a Quantitative Analyst and a Senior Data Engineer to join the Business Intelligence group at our social gaming startup.

Fun and Informative Events

Cool Products and Services

  • APM (Application Performance Management) for NOSQL, Java and More - Try ClearStone 5.0. Download ClearStone 5.0 today!  http://www.evidentsoftware.com/download/
  • ScaleOut StateServer - Scale Out Your Server Farm Applications!
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website.
  • WAPT is a load, stress and performance testing tool for websites and web-based applications.
  • Karmasphere is bringing Apache Hadoop power to developers and analysts. Download your Free Community Edition today!
  • Newrelic - What are you doing to ensure the performance of your apps?
  • Cloudkick - monitor & manage your servers better with a FREE Cloudkick developer account.
  • Learn how two game developers prepared for rapid user growth in this recorded Joyent webinar: http://bit.ly/hzBoib.
  • CloudSigma. Instantly scalable European cloud servers.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.
To read more on each sponsor please click below...

Click to read more ...

Tuesday
Apr 12, 2011

Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast

Mozilla processes terabytes of Firefox crash reports daily using HBase, Hadoop, Python, and the Thrift protocol. The project, called Socorro, is a system for collecting, processing, and displaying crash reports from clients. Today the Socorro application stores about 2.6 million crash reports per day. During peak traffic, it receives about 2.5K crashes per minute.

In this article we demonstrate a proof of concept showing how Mozilla could integrate Hazelcast into Socorro to cache and process 2TB of crash reports in memory with a 50-node Hazelcast cluster. The video for the demo is available here.
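
As a rough sanity check on the figures above, the arithmetic can be worked through in a few lines. The daily volume, peak rate, data size, and node count come from the text; the assumptions are that TB means 10^12 bytes, that data is spread evenly across nodes, and that backup copies and per-object overhead are ignored, so real per-node memory needs would be higher.

```python
# Figures quoted in the text.
reports_per_day = 2_600_000
peak_per_minute = 2_500
nodes = 50

# Assumption: 2TB means 2 * 10**12 bytes (decimal terabytes).
cluster_bytes = 2 * 10**12

# Average arrival rate, for comparison with the quoted peak.
avg_per_minute = reports_per_day / (24 * 60)
peak_to_avg = peak_per_minute / avg_per_minute

# Even split across nodes, ignoring backups and overhead.
per_node_gb = cluster_bytes / nodes / 10**9

print(f"average: {avg_per_minute:.0f} reports/minute")
print(f"peak is {peak_to_avg:.2f}x the average")
print(f"data per node: {per_node_gb:.0f} GB")
```

So the quoted peak is only about 1.4 times the daily average, and an even split works out to roughly 40 GB of primary data per node.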

 

To read the rest of the article please click below...

Click to read more ...

Friday
Apr 08, 2011

Stuff The Internet Says On Scalability For April 8, 2011

Submitted for your reading pleasure on this tomato killing frosty morn...

For the rest of the Stuff the Internet Says please read on...

Click to read more ...

Thursday
Apr 07, 2011

Paper: A Co-Relational Model of Data for Large Shared Data Banks

Let's play a quick game of truth or sacrilege: are SQL and NoSQL really just two sides of the same coin? That's what Erik Meijer and Gavin Bierman would have us believe in their "we can all get along and make a lot of money" article in the Communications of the ACM, A Co-Relational Model of Data for Large Shared Data Banks. You don't believe it? It's math, so it must be true :-) Some key points:

In this article we present a mathematical data model for the most common noSQL databases—namely, key/value relationships—and demonstrate that this data model is the mathematical dual of SQL's relational data model of foreign-/primary-key relationships

...we believe that our categorical data-model formalization and monadic query language will allow the same economic growth to occur for coSQL key-value stores.

...In contrast to common belief, the question of big versus small data is orthogonal to the question of SQL versus coSQL. While the coSQL model naturally supports extreme sharding, the fact that it does not require strong typing and normalization makes it attractive for "small" data as well. On the other hand, it is possible to scale SQL databases by careful partitioning.
What this all means is that coSQL and SQL are not in conflict, like good and evil. Instead they are two opposites that coexist in harmony and can transmute into each other like yin and yang. Because of the common query language based on monads, both can be implemented using the same principles.
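
To make the duality in the first quote a little more concrete, here is a toy sketch. It is not the paper's categorical formalism. In the relational form, child rows point at their parent through a foreign key; in the key/value form, the parent holds the keys of its children. The function names, the `("parent", id)` key scheme, and the sample product and rating rows are all made up for illustration.

```python
from collections import defaultdict


def to_key_value(parents, children, fk):
    """Reverse the arrows: child -> parent foreign keys become parent -> child key lists.

    Children whose foreign key matches no parent are stored but unreferenced.
    """
    kids = defaultdict(list)
    for child in children:
        kids[child[fk]].append(child["id"])

    store = {}
    for parent in parents:
        store[("parent", parent["id"])] = {**parent, "children": kids[parent["id"]]}
    for child in children:
        # The foreign key is dropped; the parent's key list carries that information now.
        store[("child", child["id"])] = {k: v for k, v in child.items() if k != fk}
    return store


def to_relational(store, fk):
    """Go back the other way: rebuild the foreign key on each referenced child row."""
    parents, children = [], []
    for (kind, _), value in store.items():
        if kind != "parent":
            continue
        parents.append({k: v for k, v in value.items() if k != "children"})
        for child_id in value["children"]:
            children.append({**store[("child", child_id)], fk: value["id"]})
    return parents, children


products = [{"id": 1, "title": "The Right Stuff"}]
ratings = [
    {"id": 10, "product_id": 1, "stars": 4},
    {"id": 11, "product_id": 1, "stars": 5},
]

store = to_key_value(products, ratings, "product_id")
round_trip = to_relational(store, "product_id")
```

For data like this, where every child has a parent, the round trip gives back the original rows, which is the everyday intuition behind "they can transmute into each other".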

I'm certainly in no position to judge this work, or what it means at some deep level. After reading a thousand treatments of monads I still have no idea what they are. But, like the Standard Model in physics, it would be satisfying if some unifying principles underlay all this stuff. Would we all get along? That's a completely different question...

Wednesday
Apr 06, 2011

Netflix: Run Consistency Checkers All the time to Fixup Transactions

You might have consistency problems if you have: multiple datastores in multiple datacenters, without distributed transactions, and with the ability to alternately execute out of each datacenter; syncing protocols that can fail or sync stale data; distributed clients that cache data and then write old data back to the central store; a NoSQL database that doesn't have transactions between updates of multiple related key-value records; application-level integrity checks; client-driven optimistic locking.

Sounds a lot like many evolving, loosely coupled, autonomous, distributed systems these days. How do you solve these consistency problems? Siddharth "Sid" Anand of Netflix talks about how they solved theirs in his excellent presentation, NoSQL @ Netflix : Part 1, given to a packed crowd at a Cloud Computing Meetup.

You might be inclined to say how silly it is to have these problems in the first place, but just hold on. See if you might share some of their problems, before getting all judgy:

Click to read more ...
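
For readers who have not met the idea before, a background consistency checker can be sketched in a few lines. This is a generic illustration and not Netflix's implementation: the two in-memory dicts, the `(version, value)` record shape, and the "highest version wins" repair rule are all assumptions made for this example.

```python
def check_and_repair(primary, replica):
    """Compare two stores of key -> (version, value) and fix up any mismatch.

    The record with the higher version overwrites the other one; a record
    missing from one store is copied over from the other. Returns the
    sorted list of keys that needed repair.
    """
    repaired = []
    for key in sorted(primary.keys() | replica.keys()):
        a, b = primary.get(key), replica.get(key)
        if a == b:
            continue
        # Ties on version favor the primary here; a real system would
        # need a proper conflict-resolution rule for that case.
        candidates = [record for record in (a, b) if record is not None]
        newest = max(candidates, key=lambda record: record[0])
        primary[key] = replica[key] = newest
        repaired.append(key)
    return repaired


primary = {"u1": (2, "gold"), "u2": (1, "basic")}
replica = {"u1": (1, "silver"), "u3": (1, "trial")}
fixed = check_and_repair(primary, replica)
```

Run on a schedule, a checker like this turns silent divergence into a stream of repairs you can count and alert on.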

Monday
Apr 04, 2011

Scaling Social Ecommerce Architecture Case study

A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source)

Sears Architect Tomer Gabel provides an insightful overview of how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through:

  • the architectural considerations behind their solution
  • why they chose memory over disk
  • how they partitioned the data to gain scalability
  • why they chose to execute code with the data using GigaSpaces Map/Reduce execution framework
  • how they integrated with Facebook
  • why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale

In this post I have tried to summarize the main takeaways from the interview.

You can also watch the full interview (highly recommended).

Read the full story here