Thursday
Oct282010

Notes from A NOSQL Evening in Palo Alto 

I, along with 180 other people and a veritable who's who of NoSQL vendors, attended the A NoSQL Evening in Palo Alto meetup on Tuesday. The format was a panel of 10 vendors--10gen, Basho, CouchOne, Cloudant, Cloudera, GoGrid, InfiniteGraph, Membase, Riptano, Scality--sitting in two rows of chairs in front of what seemed like a pretty diverse audience. Tim Anglade (founder, A NOSQL Summer) moderated. Tim kept things moving by asking a few leading questions, and the panel chimed in with answers. Quite a few questions came from the audience, which was refreshing.

Overall a genial evening with some good discussion. I was pleased that the panel members didn't just automatically slip into marketing speak. Most of the discussions were on point rather than just another excuse to hit the talking points. There were some complaints about the talk not being technical enough, but I don't think that was really the purpose of this kind of talk. The panel format is excellent at giving a wide range of views on general topics, and that's exactly how the evening went.

Some key takeaways:

  • Good energy. A lot of people are trying to do good things and are excited to be in a space where technology still matters more than politics. Real problems are being solved for customers and that's motivating.
  • NoSQL took away the relational model and gave nothing back. Using NoSQL for complex data puts way too much pressure on the programmer.
  • NoSQL will not converge. There's no consensus on what the next thing will be, so we are unlikely to see any standardization in the NoSQL world any time soon. There is a convergence on some features, but it seems the products will evolve to serve specific markets. This is not a bad thing. NoSQL doesn't need to converge on one stack. Products can remain differentiated by being able to solve specific problems.
  • NoSQL has a parallel to the "back to the land movement". As the relational world and the framework world got ever more complex and expensive, a counter movement developed that sought out simplicity and transparency. 

Click to read more ...

Thursday
Oct282010

Sponsored Post: Amazon, Membase, Playfish, Electronic Arts, Tagged, Undertone, Joyent, Appirio, Tuenti, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

  • AWS Start-Up Challenge 2010. Startups can win $100,000 in cash and credits, they won't take any of your stock, and it's easy to enter. You must enter before October 31.
  • Membase Meetups Coming to Major US Cities. The first of these technical meetups is on October 28 at Zynga’s San Francisco offices.

Cool Products and Services

Click to read more ...

Tuesday
Oct262010

Marrying memcached and NoSQL

Memcached is one of the most common in-memory cache implementations. It was originally developed by Danga Interactive for LiveJournal, but is now used by many other sites as a side cache to speed up read-mostly operations. It gained popularity in the non-Java world, too, especially since it’s a language-neutral side cache for which few alternatives existed.

As a side cache, memcached clients rely on the database as the system of record: the database is still used for write, update, and complex query operations. Since the memcached specification includes no query operations, memcached is not a database alternative, unlike most of the NoSQL offerings. This also excludes memcached from being a real solution for write scalability. As a result, many heavily trafficked sites have started to move away from memcached and replace it with NoSQL alternatives, as noted in a recent highscalability post, MySQL And Memcached: End Of An Era?
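The side-cache arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: `SideCache` and `Database` are hypothetical in-process stand-ins (not a real memcached client or database driver), and `get_user` / `update_user` are made-up names. Reads try the cache first and fall back to the system of record; writes always go to the database, and the cached copy is simply invalidated.

```python
class SideCache:
    """Hypothetical stand-in for a memcached client: key lookups only, no queries."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value

    def delete(self, key):
        self._items.pop(key, None)


class Database:
    """Hypothetical system of record; counts reads so the cache's effect is visible."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.reads = 0

    def read(self, user_id):
        self.reads += 1
        return self._rows.get(user_id)

    def write(self, user_id, value):
        self._rows[user_id] = value


def get_user(cache, db, user_id):
    """Read path: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = db.read(user_id)
        if value is not None:
            cache.set(key, value)  # populate the cache for later reads
    return value


def update_user(cache, db, user_id, value):
    """Write path: the database takes the write; the cached copy is invalidated."""
    db.write(user_id, value)
    cache.delete(f"user:{user_id}")
```

Note that every write still lands on the database, which is why a side cache of this kind helps read-mostly workloads but does nothing for write scalability.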

The transition away from memcached to NoSQL could represent a large investment, as many sites are already heavily invested in memcached usage. In this post, I'll illustrate an alternative approach in which we’ll extend the use of memcached for write scaling, add other goodies such as high availability and elasticity by plugging in GigaSpaces as the backend datastore, and avoid the need for a rewrite. The pure Java implementation could also be seen as a benefit, as it can increase the adoption of memcached within the Java community and leverage the portability of Java to other platforms. more...

Tuesday
Oct262010

Scaling DISQUS to 75 Million Comments and 17,000 RPS

This presentation and video by Jason Yan and David Cramer discuss how they scaled DISQUS, a comments-as-a-service offering for easily adding comments to your site and connecting communities. The presentation is very good, so here are just a few highlights:

  • Traffic: 17,000 requests/second peak; 450,000 websites; 15 million profiles; 75 million comments; 250 million visitors; 40 million monthly users / developer.
  • Forces: unpredictable traffic patterns because of celebrity gossip and events like disasters; discussions never expire, which means they can't fit in memory; must always be up.
  • Machines: 100 servers; 30% web servers (Apache + mod_wsgi); 10% databases (PostgreSQL); 25% cache servers (memcached); 20% load balancing / high availability (HAProxy + heartbeat); 15% utility servers (Python scripts).
  • Architecture: Requests are load balanced across an Apache cluster. Apache talks to memcached, to HAProxy/pgbouncer (which handle connection pooling to the database), and to a central queue service.
  • Strategies: make sure indexes fit in memory; log slow queries; use connection pooling; the data model consists of user, forum, thread, post; partitions horizontally (Disqus, Your blog, etc) and vertically (forums, posts, users, sentry) at application level; joins performed in Python; Hudson is used for continuous integration; Redmine is used for bug tracking; extensive test suite; feature switches are used to turn off features; isolate slow functions from transactions; use autocommit for read slaves; a queue is used for low priority tasks; Django QuerySet caching is turned off to save memory.
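As a rough illustration of "joins performed in Python" over vertically partitioned data, here is a minimal sketch. It is not DISQUS's code: the `posts_store` and `users_store` lists are hypothetical stand-ins for two separate databases, and all function and field names are assumptions made for the example. The idea is to fetch rows from each partition independently and stitch them together in application code, since a SQL JOIN cannot span the two stores.

```python
def fetch_posts(thread_id, posts_store):
    """Stand-in for a query against the posts partition."""
    return [p for p in posts_store if p["thread_id"] == thread_id]


def fetch_users(user_ids, users_store):
    """Stand-in for one batched lookup against the users partition."""
    return {u["id"]: u for u in users_store if u["id"] in user_ids}


def posts_with_authors(thread_id, posts_store, users_store):
    """Application-level join: posts from one store, author names from another."""
    posts = fetch_posts(thread_id, posts_store)
    user_ids = {p["user_id"] for p in posts}      # collect the keys to look up
    users = fetch_users(user_ids, users_store)    # one batched fetch, not one per post
    return [
        {**p, "author": users.get(p["user_id"], {}).get("name")}
        for p in posts
    ]
```

Batching the user lookup keeps the number of round trips to the second store constant per request instead of growing with the number of posts.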

Tuesday
Oct262010

Sponsored Post: Membase, Playfish, Electronic Arts, Tagged, Undertone, Joyent, Appirio, Tuenti, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

  • Membase Meetups Coming to Major US Cities. The first of these technical meetups is on October 28 at Zynga’s San Francisco offices.

Cool Products and Services

For more information on each sponsor please read the rest of the post...

Click to read more ...

Sunday
Oct242010

Hot Scalability Links For Oct 24, 2010

On a cold and rainy Fall day, a day stolen from winter rather than our usual gorgeous Indian Summers, a day not even the SF Giants winning the pennant can help warm, here are some hot links to read by a digital flame: 

Friday
Oct222010

Paper: Netflix’s Transition to High-Availability Storage Systems 

In an audacious move for such an established property, Netflix is moving their website out of the comfort of their own datacenter and into the wilds of the Amazon cloud. This paper by Netflix's Siddharth “Sid” Anand, Netflix’s Transition to High-Availability Storage Systems, gives a detailed look at this transition and does a deep dive on SimpleDB best practices, focussing especially on techniques useful to those who are making the move from an RDBMS.

Sid is going to give a talk at QCon based on this paper and he would appreciate your feedback. So if you have any comments or thoughts, please comment here, email Sid at r39132@hotmail.com, or reach him on Twitter at @r39132. Here's the introduction from the paper:

Click to read more ...

Thursday
Oct212010

What is Network-based Application Virtualization and Why Do You Need It?

With all the attention being paid these days to VDI (virtual desktop infrastructure) and application virtualization and server virtualization and <insert type> virtualization it’s easy to forget about network-based application virtualization. But it’s the one virtualization technique you shouldn’t forget because it is a foundational technology upon which myriad other solutions will be enabled.

WHAT IS NETWORK-BASED APPLICATION VIRTUALIZATION?

This term may not be familiar to you but that’s because since its inception oh, more than a decade ago, it’s always just been called “server virtualization”. After the turn of the century (I love saying that, by the way) it was always referred to as service virtualization in SOA and XML circles. With the rise of the likes of VMware and Citrix and Microsoft server virtualization solutions, it’s become impossible to just use the term “server virtualization” and “service virtualization” is just as ambiguous so it seems appropriate to give it a few more modifiers to make it clear that we’re talking about the network-based virtualization (aggregation) of applications.

Click to read more ...

Thursday
Oct212010

Machine VM + Cloud API - Rewriting the Cloud from Scratch

Write a little "Hello World" program these days and it runs inside a bewildering Russian Doll of nested environments, each layer adding its own special performance and complexity tax. First, a language executes in its own environment of data structure libraries, memory management, and so on. That, more often than not, will run inside a language VM like the JVM, CLR, or V8. The language VM will in turn run inside a process that runs inside an OS. An application will run in one or more threads inside a process. And the whole thing will run inside a machine-sharing VM layer like Xen. And across all of that are frameworks for monitoring, elasticity, storage, and so on. That's a lot of overhead for such a little program.

What if we could remove all these taxes and run directly on the new bare metal, which some consider to be a combination of Machine VM + Cloud API? That's exactly what a system called Mirage, described in the paper Turning down the LAMP: Software Specialisation for the Cloud, sets out to do by treating the cloud virtual hardware as a compiler target, and converting high-level language source code directly into kernels that run on it.

Click to read more ...

Tuesday
Oct192010

Sponsored Post: Playfish, Electronic Arts, Tagged, Undertone, Box.net, Wiredrive, Joyent, DeviantART, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

  • Membase Meetups Coming to Major US Cities. The first of these technical meetups is on October 28 at Zynga’s San Francisco offices.

Cool Products and Services

Click to read more ...