advertise
« Stuff The Internet Says On Scalability For September 6, 2013 | Main | Wide Fast SATA: the Recipe for Hot Performance »
Thursday
Sep052013

Paper: MillWheel: Fault-Tolerant Stream Processing at Internet Scale

Ever wonder what powers Google's world spirit sensing Zeitgeist service? No, it's not a homunculus of Georg Wilhelm Friedrich Hegel sitting in each browser. It's actually a stream processing (think streaming MapReduce on steroids) system called MillWheel, described in this very well written paper: MillWheel: Fault-Tolerant Stream Processing at Internet Scale. MillWheel isn't just used for Zeitgeist at Google, it's also used for streaming joins for a variety of Ads customers, generalized anomaly-detection service, and network switch and cluster health monitoring.

Abstract:

MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework’s fault-tolerance guarantees.

 

This paper describes MillWheel’s programming model as well as its implementation. The case study of a continuous anomaly detector in use at Google serves to motivate how many of MillWheel’s features are used. MillWheel’s programming model provides a notion of logical time, making it simple to write time-based aggregations. MillWheel was designed from the outset with fault tolerance and scalability in mind. In practice, we find that MillWheel’s unique combination of scalability, fault tolerance, and a versatile programming model lends itself to a wide variety of problems at Google.

Related Articles

 

Reader Comments (1)

Any details on a release?

September 10, 2013 | Unregistered CommenterNils

PostPost a New Comment

Enter your information below to add a new comment.
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>