Map Reduce

Database People Hating on MapReduce

Update: Typical Programmer tackles the technical issues in Relational Database Experts Jump The MapReduce Shark. The culture clash is still what fascinates me.

David DeWitt writes in the Database Column that MapReduce is a major step backwards:

A giant step backward in the programming paradigm for large-scale data intensive applications

A sub-optimal implementation, in that it uses brute force instead of indexing

Not novel at all -- it represents a specific implementation of well known techniques developed nearly 25 years ago

Missing most of the features that are routinely included in current DBMS

Incompatible with all of the tools DBMS users have come to depend on

Listening to databasers and map reducers talk is like eavesdropping on your average family holiday mashup. Every holiday people who have virtually nothing in common are thrown together because they incidentally share a little DNA or are married to the shared DNA. In desperation everyone gravitates to some shared enemy they can all confidently bash. But after that moment is relieved and awkward silence once again looms, nothing is left but more drinking and tackling sensitive topics you just know will end badly.

Database folks love their schemas, relational purity and their swiss army knife indexes. You soon learn that really map reduce is just another form of an index and indexes really can scale to any heights with just a little tweaking. Map reducers lover their pure functional models, their self-healing clustery filled ecosystems, and the shear joy of the semi-organized chaos of letting a 10,000 CPUs simultaneously bloom.

I for one stand firmly by the relish tray. Transactions have a place and so does structured data. That's why Google contributes heavily to MySQL. Yet, I too like my map reduce engine, distributed file system combo platter. With map reduce I can implement any complex behavior over any data set. With enough machines that work can be performed in a predictable amount of time. You aren't limit to set logic, SQL types, and tweaked indexes. That's pretty good stuff too.

Much like a staunchly conservative nail crunching father and his too soft pansy liberal son, these two camps will never understand each other. Every sign of beauty in one person's eyes is just another confirmation to the other side of impending senility. Why even try? Just hug in a manly way and agree to meet again next year.

Kafka 101

This is a guest article by Stanislav Kozlovski, an Apache Kafka Committer. If you would like to connect with Stanislav, you can do so on Twitter and LinkedIn. Originally developed in LinkedIn during 2011, Apache Kafka is one of the most popular open-source Apache projects out there. So far it

Capturing A Billion Emo(j)i-ons

This blog post was written by Dedeepya Bonthu. This is a repost from her Medium article, approved by the author. In stadiums, sports fans love to express themselves by cheering for their favorite teams, holding up placards and team logos. Emoji’s allow fans at home to rapidly express themselves,

Brief History of Scaling Uber

This blog post was written by Josh Clemm, Senior Director of Engineering at Uber Eats. This is a repost from his LinkedIn article, approved by the author. On a cold evening in Paris in 2008, Travis Kalanick and Garrett Camp couldn't get a cab. That's when

Behind AWS S3’s Massive Scale

This is a guest article by Stanislav Kozlovski, an Apache Kafka Committer. If you would like to connect with Stanislav, you can do so on Twitter and LinkedIn. AWS S3 is a service every engineer is familiar with. It’s the service that popularized the notion of cold-storage to the

Read more

Kafka 101

Capturing A Billion Emo(j)i-ons

Brief History of Scaling Uber

Behind AWS S3’s Massive Scale