Product: Happy = Hadoop + Python

Has a Java-only Hadoop been getting you down? Now you can be Happy. Happy is a framework for writing map-reduce programs for Hadoop using Jython. It files off Hadoop's sharp edges and makes writing map-reduce programs a breeze. There's not much history on Happy yet, but I'm delighted at the idea of being able to map-reduce in other languages. The more ways the better.

From the website:

Happy is a framework that allows Hadoop jobs to be written and run in Python 2.2 using Jython. It is an
easy way to write map-reduce programs for Hadoop, and includes some useful new features as well.
The current release supports Hadoop 0.17.2.

Map-reduce jobs in Happy are defined by subclassing happy.HappyJob and implementing
map(records, task) and reduce(key, values, task) methods. Then you create an instance of the
class, set the job parameters (such as inputs and outputs), and call run().
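
As a concrete illustration, here is a minimal word-count sketch based on the description above. The happy.HappyJob class, the map and reduce signatures, and run() come from the text; the task.collect() collector call, the assumption that records iterates over (key, value) pairs, and the inputpaths/outputpath parameter names are guesses at the API, so check the Happy documentation for the exact names.

    import happy

    class WordCount(happy.HappyJob):
        def map(self, records, task):
            # assumes records yields a (key, value) pair per input record
            for key, value in records:
                for word in value.split():
                    task.collect(word, "1")   # collector call is an assumption

        def reduce(self, key, values, task):
            # sum the counts emitted for each word
            total = 0
            for v in values:
                total += int(v)
            task.collect(key, str(total))

    job = WordCount()
    job.inputpaths = ["wc-input"]    # parameter names for the inputs and
    job.outputpath = "wc-output"     # outputs are assumptions, not from the text
    results = job.run()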

When you call run(), Happy serializes your job instance and copies it, along with all accompanying
libraries, out to the Hadoop cluster. Then, for each task in the Hadoop job, your job instance is
deserialized and map or reduce is called.

The task results are written out using a collector, but aggregate statistics and other roll-up
information can be stored in the happy.results dictionary, which is returned from the run() call.
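
A short sketch of how that might look, assuming happy.results is an ordinary dictionary visible to tasks; how Happy merges entries written by separate tasks is not described above, so the increment below is purely illustrative:

    import happy

    class CountKeys(happy.HappyJob):
        def map(self, records, task):
            for key, value in records:
                task.collect(value, "1")

        def reduce(self, key, values, task):
            total = 0
            for v in values:
                total += int(v)
            task.collect(key, str(total))
            # roll-up statistics go into happy.results instead of the output
            happy.results["distinct"] = happy.results.get("distinct", 0) + 1

    job = CountKeys()
    job.inputpaths = ["input"]       # assumed parameter names, as above
    job.outputpath = "count-output"
    results = job.run()              # run() returns the results dictionary
    print "distinct keys:", results.get("distinct")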

Jython modules and Java jar files that your code calls can be specified using the
HAPPY_PATH environment variable. These are added to the Python path at startup and
are automatically included when jobs are sent to Hadoop. The path is stored in happy.path
and can be edited at runtime.
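
For example, a job's dependencies could be declared in the environment before launching, or adjusted in code; the text does not say what type happy.path is, so this sketch assumes it behaves like a list:

    # in the shell, before launching the job:
    #   export HAPPY_PATH=lib/mymodule.py:lib/helpers.jar

    import happy

    # ...or edit the path at runtime (assuming happy.path is list-like)
    happy.path.append("lib/extra.jar")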