Wednesday
Dec 29, 2010

Pinboard.in Architecture - Pay to Play to Keep a System Small  

How do you keep a system small enough, while still being successful, that a simple scale-up strategy becomes the preferred architecture? StackOverflow, for example, could stick with a tool chain they were comfortable with because they had a natural brake on how fast they could grow: there are only so many programmers in the world. If this doesn't work for you, here's another natural braking strategy to consider: charge for your service.

This interesting point, one I hadn't properly considered before, was brought up by Maciej Ceglowski, co-founder of Pinboard.in, in an interview with Leo Laporte and Amber MacArthur on their net@night show.

Pinboard is a lean, mean, pay-for-bookmarking machine, a timely replacement for the nearly departed Delicious. And as a self-professed anti-social bookmarking site, it emphasizes speed over socializing. Maciej considers Pinboard a personal archive, where you can keep a history of what you are reading: forever. When the demise of Delicious was announced, if Pinboard had been a free site they'd have been down immediately, but being a paid site helped flatten out their growth curve.

Bookmarking sites used to be about sharing links with your friends, but Twitter has largely taken over that role. Twitter, however, is infamous for presenting only a small slice of your tweet history. What you really want is a big server sucking down your bookmarks from wherever you might bookmark them, and that's just what Pinboard does.

A few points struck me as particularly cool about Pinboard:

Click to read more ...

Tuesday
Dec 28, 2010

Netflix: Continually Test by Failing Servers with Chaos Monkey

In 5 Lessons We’ve Learned Using AWS, Netflix's John Ciancutti says the best way to avoid failure is to fail constantly. In the cloud it's expected that instances can fail at any time, so you always have to be prepared. In the real world we prepare by running drills. Remember all those exciting fire drills? It's not just fire drills, of course. The military, football teams, firefighters, beach rescue, virtually any entity that must react quickly and efficiently to disaster hones its responsiveness by running drills.

Netflix aggressively moves this strategy into the cloud by randomly failing servers using a tool they built called Chaos Monkey. The idea is:

If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
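
The post doesn't show Chaos Monkey itself; it's an internal Netflix tool. But the core idea is easy to picture: on some schedule, pick a random production instance and kill it, then watch whether the system degrades gracefully. Below is a minimal sketch of that idea in Python using boto3, not Netflix's implementation; it assumes AWS credentials are configured and that candidate instances carry a hypothetical chaos-opt-in=true tag.

```python
# A minimal Chaos-Monkey-style sketch (illustrative, not Netflix's tool).
# Assumes boto3 is installed, AWS credentials are configured, and instances
# that opt in to chaos testing carry a hypothetical "chaos-opt-in=true" tag.
import random
import boto3

def kill_one_random_instance(region="us-east-1", really_terminate=False):
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.describe_instances(Filters=[
        {"Name": "tag:chaos-opt-in", "Values": ["true"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ])
    instances = [i["InstanceId"]
                 for r in resp["Reservations"]
                 for i in r["Instances"]]
    if not instances:
        return None
    victim = random.choice(instances)
    if really_terminate:                      # defaults to a dry run
        ec2.terminate_instances(InstanceIds=[victim])
    return victim
```

Run something like this on a schedule during working hours, when engineers are around to watch how the system responds, rather than at 3AM when nobody is looking.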

They respond to failures by degrading service, but they always respond:

Click to read more ...

Thursday
Dec 23, 2010

Paper: CRDTs: Consistency without concurrency control

For a great Christmas read forget The Night Before Christmas, the heart-warming poem Clement Moore wrote for his children that created the modern idea of Santa Claus we all know and anticipate each Christmas Eve. Instead, curl up with some potent eggnog, nog being any drink made with rum, and read CRDTs: Consistency without concurrency control by Mihai Letia, Nuno Preguiça, and Marc Shapiro, which talks about CRDTs (Commutative Replicated Data Types), data types whose operations commute when they are concurrent.
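
The paper builds up to a concrete CRDT for cooperative text editing, but the flavor of the idea fits in a few lines. Below is a sketch of the classic grow-only counter, not code from the paper: each replica increments its own slot, and merging takes an element-wise max, which is commutative, associative, and idempotent, so replicas converge without any concurrency control.

```python
# Grow-only counter, the "hello world" of CRDTs (illustrative sketch).
class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                      # replica_id -> local increments

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max: merge order doesn't matter, duplicates are harmless.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

# Two replicas accept increments concurrently, then exchange state in any order.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
```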

From the introduction, which also serves as a nice concise overview of distributed consistency issues:

Click to read more ...

Tuesday
Dec 21, 2010

SQL + NoSQL = Yes!


This is a guest post by Frédéric Faure (architect at Ysance); you can follow him on Twitter.

Data storage has always been one of the most difficult problems to address, especially as the quantity of stored data keeps growing. This is not simply due to the growing number of people regularly using the Internet, particularly with all the social networks, games and gizmos now available. Companies are also amassing more and more detailed information relevant to their business in order to optimize productivity and ROI (Return On Investment). I find the positioning of SQL and NoSQL (Not Only SQL) as opposites rather a shame: it's true that the marketing wave of NoSQL has renewed interest in a class of systems that has been around for quite a while but was rarely considered, since, after all, everything could be fitted into the "good old SQL model". The reverse trend of wanting to make everything fit the NoSQL model isn't very productive either.
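
To make "SQL and NoSQL together" concrete, here is a minimal sketch of one common hybrid pattern (my illustration, not the author's): the relational database stays the system of record, while a key-value store serves a denormalized, read-heavy view. The sqlite3 schema is made up, and the plain dict stands in for something like Redis or Membase.

```python
# Hybrid sketch: SQL as the system of record, a key-value store for hot reads.
import sqlite3

sql = sqlite3.connect(":memory:")             # the relational system of record
sql.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user TEXT, amount REAL)")

kv = {}                                       # stand-in for Redis/Membase

def place_order(user, amount):
    cur = sql.execute("INSERT INTO orders (user, amount) VALUES (?, ?)",
                      (user, amount))
    sql.commit()
    # Maintain a denormalized per-user total so the hot path never touches SQL.
    kv["total:" + user] = kv.get("total:" + user, 0.0) + amount
    return cur.lastrowid

def user_total(user):
    return kv.get("total:" + user, 0.0)       # cheap read from the NoSQL side

place_order("alice", 19.50)
place_order("alice", 5.25)
print(user_total("alice"))                    # 24.75
```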

So, what’s new … and what isn’t?

Click to read more ...

Tuesday
Dec 21, 2010

Sponsored Post: Electronic Arts, Joyent, Membase, CloudSigma, ManageEngine, Site24x7 

Who's Hiring?

Fun and Informative Events

  • A new round of Membase meetups has been planned for January 2011 in San Diego, Denver, Seattle, Vancouver and Chicago.

Cool Products and Services

Click to read more ...

Monday
Dec 20, 2010

Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes

Updated on Friday, February 11, 2011 at 11:26AM by Todd Hoff

In 5 Lessons We’ve Learned Using AWS, Netflix's John Ciancutti says one of the big lessons they've learned is to create less chatty protocols:

In the Netflix data centers, we have a high capacity, super fast, highly reliable network. This has afforded us the luxury of designing around chatty APIs to remote systems. AWS networking has more variable latency. We’ve had to be much more structured about “over the wire” interactions, even as we’ve transitioned to a more highly distributed architecture.

There's not a lot of advice out there on how to create protocols. Combine that with a rush to the cloud and you have a perfect storm for chatty applications crushing application performance. Netflix is far from the first to be surprised by the less than stellar networks inside AWS. 

A chatty protocol is one where a client makes a series of requests to a server and the client must wait on each reply before sending the next request. On a LAN this can work great. LANs are typically fast, wide, and drop few packets.

Move that same application to a different network, one where round trip times can easily be an order of magnitude larger or more because the network is slow, lossy, or poorly designed, and a protocol that takes many requests to complete a transaction will see a dramatic hit to performance.

My WAN acceleration friends say Microsoft's Common Internet File System (CIFS) is infamous for being chatty. Transferring a 30MB file could tally something like 300msecs of latency on a LAN. On a WAN that could stretch to 7 minutes. Very unexpected results. What is key here is how the quality characteristics of the pipe interact with the protocol design.
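
A quick back-of-envelope model shows why: when every request has to wait for the previous reply, total time is roughly round trips multiplied by round trip time, so the pipe's latency, not its bandwidth, dominates. The numbers below are assumptions chosen to line up with the CIFS anecdote above, not measurements.

```python
# Toy latency model for a chatty protocol: total time ~= round_trips * RTT.
def chatty_time(round_trips, rtt_seconds):
    return round_trips * rtt_seconds

ROUND_TRIPS = 1000                         # assumed sequential request/reply pairs

lan = chatty_time(ROUND_TRIPS, 0.0003)     # ~0.3 ms LAN RTT -> ~0.3 seconds
wan = chatty_time(ROUND_TRIPS, 0.420)      # ~420 ms WAN RTT -> ~7 minutes
batched = chatty_time(10, 0.420)           # same work in 10 batched calls -> ~4.2 s

print(f"LAN: {lan:.1f}s  WAN: {wan/60:.1f}min  WAN batched: {batched:.1f}s")
```

Batching, pipelining, and fetching coarser-grained resources all attack the same term: the number of sequential round trips.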

OK, chatty protocols are bad. What can you do about it?

Click to read more ...

Friday
Dec 17, 2010

Stuff the Internet Says on Scalability For December 17th, 2010

  • If you missed it here's a link to my webinar and here's the slide deck for the talk, with a bunch of additional slides that I didn't have a chance to talk about. The funky picture of Lincoln is classic.
  • Can MySQL really handle 1,000,000 req/sec? Sure, when you turn it into a NoSQLish database, skip all the SQL processing, and access the backend store directly. Percona is making this possible with their HandlerSocket plugin based on the work of Yoshinori Matsunobu. (A rough sketch of the idea follows this list.)
  • Quotable Quotes:
    • @labsji: If SQL is an abstraction of Big machines....NoSQL is an abstration of distributed computing.
    • : man this eventual consistency #nosql thingy makes #facebook even more annoying. "you have a new comment, no you dont"
  • Nice racks. Time has pictures of a Facebook datacenter. 
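
On the HandlerSocket item above, the gist is that the plugin exposes a thin, SQL-free protocol straight to the storage engine. The sketch below shows roughly what talking to it over a raw socket looks like; the wire format is simplified (HandlerSocket's byte-escaping is ignored and exact responses may differ), and the test.user table, columns, and port are illustrative assumptions.

```python
# Rough sketch of a HandlerSocket-style lookup by primary key, no SQL involved.
# Table, columns, and port are illustrative; the protocol is simplified.
import socket

def hs_get(key, host="127.0.0.1", port=9998):
    s = socket.create_connection((host, port))
    try:
        # Open index 1 on test.user via its primary key, returning two columns.
        s.sendall(b"P\t1\ttest\tuser\tPRIMARY\tuser_id,user_name\n")
        print(s.recv(4096))                  # typically b"0\t1\n" on success
        # Fetch the row whose primary key equals `key`.
        s.sendall(("1\t=\t1\t%s\n" % key).encode())
        print(s.recv(4096))                  # e.g. b"0\t2\t42\talice\n" on a hit
    finally:
        s.close()
```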

Click to read more ...

Thursday
Dec 16, 2010

7 Design Patterns for Almost-infinite Scalability

Good article from manageability.com summarizing design patterns from Pat Helland's amazing paper Life beyond Distributed Transactions: an Apostate's Opinion.

  1. Entities are uniquely identified - each entity which represents disjoint data (i.e. no overlap of data between entities) should have a unique key.
  2. Multiple disjoint scopes of transactional serializability - in other words, there are these 'entities' and you cannot perform atomic transactions across them.
  3. At-Least-Once messaging - that is, an application must tolerate message retries and out-of-order arrival of messages.
  4. Messages are addressed to entities - that is, one can't abstract away from the business logic the existence of the unique keys for addressing entities. Addressing, however, is independent of location.
  5. Entities manage conversational state per party - that is, to ensure idempotency an entity needs to remember that a message has been previously processed (see the sketch after this list). Furthermore, in a world without atomic transactions, outcomes need to be 'negotiated' using some kind of workflow capability.
  6. Alternate indexes cannot reside within a single scope of serializability - that is, one can't assume the indices or references to entities can be updated atomically. There is the potential that these indices may become out of sync.
  7. Messaging between entities is tentative - that is, entities need to accept some level of uncertainty, and messages that are sent are requests for commitment and may possibly be cancelled.
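
Patterns 3 and 5 in particular fit in a few lines of code. Here is a minimal sketch (mine, not Helland's) of an entity that survives at-least-once delivery by keeping conversational state per party, i.e., remembering which message ids it has already processed; the AccountEntity name and the deposit message are made-up illustrations.

```python
# Idempotent handling of at-least-once messages, keyed by party and message id.
class AccountEntity:
    def __init__(self, key):
        self.key = key                      # pattern 1: unique entity key
        self.balance = 0
        self.seen = {}                      # party -> set of processed message ids

    def handle_deposit(self, party, msg_id, amount):
        processed = self.seen.setdefault(party, set())
        if msg_id in processed:             # duplicate or retry: ignore it
            return self.balance
        self.balance += amount
        processed.add(msg_id)
        return self.balance

acct = AccountEntity("account-42")
acct.handle_deposit("billing", msg_id="m1", amount=100)
acct.handle_deposit("billing", msg_id="m1", amount=100)   # redelivery: no double credit
assert acct.balance == 100
```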

The article then compares these principles to some of the design principles used to develop S3:

Click to read more ...

Monday
Dec 13, 2010

Still Time to Attend My Webinar Tomorrow: What Should I Do? Choosing SQL, NoSQL or Both for Scalable Web Applications

It's time to do something a little different, and for me that doesn't mean cutting off my hair and joining a monastery, nor does it mean buying a cherry red convertible (yet); it means doing a webinar!

  • On December 14th, 2:00 PM - 3:00 PM EST, I'll be hosting What Should I Do? Choosing SQL, NoSQL or Both for Scalable Web Applications.
  • The webinar is sponsored by VoltDB, but it will be completely vendor independent, as that's the only honor preserving and technically accurate way of doing these things.
  • The webinar will run about 60 minutes, with 40 minutes of speechifying and 20 minutes for questions.
  • The hashtag for the event on Twitter will be SQLNoSQL. I'll be monitoring that hashtag, so use it if you have any suggestions for the webinar or would like to ask questions during it.

Click to read more ...

Wednesday
Dec 8, 2010

How To Get Experience Working With Large Datasets


I think I have been lucky that several of the projects I have worked on have exposed me to managing large volumes of data. The largest dataset was probably at MailChannels, though Livedoor.com also had some sizeable data for their book store and department store. Most of the pain with Livedoor’s data was from it being in Japanese. Other than that, it was pretty static. This was similar to the data I worked with at the BBC. You would be surprised at how much data can be involved with a single episode of a TV show. With any in-house generated data the update size and frequency is much less dramatic, even if the data is being regularly pumped in from third parties.

Click to read more ...