Tuesday, January 13, 2009

Product: Gearman - Open Source Message Queuing System

Update: New Gearman Server & Library in C, MySQL UDFs.

Gearman is an open source message queuing system that makes it easy to do distributed job processing using multiple languages. With Gearman you can farm out work to other machines, dispatch function calls to machines that are better suited to the work, do work in parallel, load balance lots of function calls, call functions between languages, and spread CPU usage around your network.

Gearman is used by companies like LiveJournal, Yahoo!, and Digg. Digg, for example, runs 300,000 jobs a day through Gearman without any issues. Most large sites use something similar. Why would anyone ever even need a message queuing system?


Message queuing is a handy way to move work (like image manipulation) off your web servers, to generate thousands of documents in the background, to run in parallel the multiple requests needed to build a web page, or to perform tasks that can comfortably run in the background rather than as part of the main request loop for servicing a web request.

There's a gearmand server and clients written in Perl, Ruby, Python, or C. Use at least two gearmand server daemons for higher availability. The tasks each worker can perform are registered with gearmand, which distributes requests for those functions to the workers that can implement them.

Gearman uses a very robust, if somewhat higher latency, signal-and-pull architecture.

  • According to dormando, the flow goes like this. Worker does:
    * Connects to all gearmand servers.
    * Registers what functions it supports.
    * Asks for jobs.
    * If no jobs, sends command 'pre_sleep' to all gearmands and sleeps.

  • Client does:
    * Connects to gearmand.
    * Submits a job for a particular function.

  • Gearmand does:
    * Acks the job, finds all *sleeping workers* related to the function.
    * Sends them all a 'noop' command to wake them up.

  • Worker does:
    * Urk, I'm awake now.
    * Worker asks for jobs.
    * If jobs, do work.
    * If no jobs, sends command 'pre_sleep' to all gearmands, etc.
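
The signal-and-pull flow above can be sketched as a toy, in-memory simulation. This is not Gearman code: the `JobServer` and `Worker` classes and their method names are hypothetical, there is only one server instead of several, and nothing goes over the network. It only illustrates the pre_sleep / noop / grab cycle.

```python
from collections import defaultdict, deque


class JobServer:
    """Toy stand-in for a single gearmand (hypothetical, in-memory only)."""

    def __init__(self):
        self.queues = defaultdict(deque)   # function name -> pending payloads
        self.sleeping = defaultdict(set)   # function name -> sleeping workers

    def pre_sleep(self, worker):
        # The worker announces it is going to sleep until it is signalled.
        for func in worker.functions:
            self.sleeping[func].add(worker)

    def grab_job(self, worker):
        # Pull model: the worker asks for work, the server never pushes a job.
        for func in worker.functions:
            if self.queues[func]:
                return func, self.queues[func].popleft()
        return None

    def submit(self, func, payload):
        # Queue the job, then wake every sleeping worker registered for func.
        self.queues[func].append(payload)
        woken = list(self.sleeping[func])
        for worker in woken:
            for name in worker.functions:
                self.sleeping[name].discard(worker)
        for worker in woken:
            worker.noop(self)


class Worker:
    def __init__(self, name, functions):
        self.name = name
        self.functions = functions  # function name -> callable
        self.results = []
        self.asleep = False

    def poll(self, server):
        # Ask for jobs until there are none left, then send pre_sleep.
        while True:
            job = server.grab_job(self)
            if job is None:
                break
            func, payload = job
            self.results.append(self.functions[func](payload))
        self.asleep = True
        server.pre_sleep(self)

    def noop(self, server):
        # "Urk, I'm awake now": the wake-up carries no job, so go and ask.
        self.asleep = False
        self.poll(server)


server = JobServer()
w1 = Worker("w1", {"reverse": lambda s: s[::-1]})
w2 = Worker("w2", {"reverse": lambda s: s[::-1]})
w1.poll(server)                  # no jobs yet, so both workers go to sleep
w2.poll(server)
server.submit("reverse", "abc")  # both are woken, only one gets the job
```

Both workers wake up, one of them grabs the job, and the other finds an empty queue and goes back to sleep. That extra round trip is the "somewhat higher latency" part of the trade-off.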

Gearman uses an efficient binary protocol and no XML. There's also a line-based text protocol for admin, so you can use telnet and hook into Nagios plugins.
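
To give a feel for the binary protocol, here is a minimal sketch of packet framing. It assumes the layout in the published Gearman protocol document: a 4-byte magic (`\0REQ` or `\0RES`), a 4-byte big-endian packet type, a 4-byte big-endian body size, then NUL-separated arguments. The helper names `pack_packet` and `unpack_packet` are made up for this example, and the type numbers should be checked against the protocol document for the server version you run.

```python
import struct

REQ_MAGIC = b"\0REQ"   # packets sent to the job server
RES_MAGIC = b"\0RES"   # packets sent by the job server

# A few packet types, numbered as in the protocol document (assumption:
# verify these against the PROTOCOL file shipped with your gearmand).
CAN_DO, PRE_SLEEP, NOOP, SUBMIT_JOB, GRAB_JOB, NO_JOB = 1, 4, 6, 7, 9, 10


def pack_packet(magic, ptype, *args):
    """Build one packet: 12-byte header followed by NUL-joined arguments."""
    body = b"\0".join(args)
    return magic + struct.pack(">II", ptype, len(body)) + body


def unpack_packet(data):
    """Split one complete packet back into (magic, type, arguments).

    Simplification: this splits on every NUL byte. In the real protocol the
    last argument may itself contain NUL bytes, so a real parser needs to
    know how many arguments each packet type carries.
    """
    magic = data[:4]
    ptype, size = struct.unpack(">II", data[4:12])
    body = data[12:12 + size]
    args = body.split(b"\0") if body else []
    return magic, ptype, args


# A client submitting a job: function name, unique id (empty here), payload.
submit = pack_packet(REQ_MAGIC, SUBMIT_JOB, b"reverse", b"", b"hello")

# A worker going to sleep: header only, no arguments.
sleep = pack_packet(REQ_MAGIC, PRE_SLEEP)
```

A fixed 12-byte header with no arguments is all it takes for a worker to say it is going to sleep, which is a good part of why the protocol is cheap compared to anything XML-based.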

The system makes no guarantees. If there's a failure, the client is told about it and is responsible for retries. The queue isn’t persistent either: if Gearman is restarted, the queue is gone.
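
Since retries are the client's job, a client-side wrapper might look something like the sketch below. Everything here is hypothetical: `JobFailed`, `submit_with_retries`, and the `submit` callable stand in for whatever your client library raises and calls, and the exponential backoff is just one reasonable policy, not something Gearman prescribes.

```python
import time


class JobFailed(Exception):
    """Hypothetical error raised when the job server reports a failure."""


def submit_with_retries(submit, payload, attempts=3, base_delay=0.5,
                        sleep=time.sleep):
    """Call submit(payload), retrying on JobFailed with exponential backoff.

    `submit` is any callable that returns the job result or raises JobFailed.
    `sleep` is injectable so the backoff can be observed without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return submit(payload)
        except JobFailed:
            if attempt == attempts:
                raise  # out of attempts, let the caller deal with it
            sleep(base_delay * 2 ** (attempt - 1))


# Usage with a stand-in job that fails twice and then succeeds.
calls = []
delays = []


def flaky_reverse(payload):
    calls.append(payload)
    if len(calls) < 3:
        raise JobFailed("worker died")
    return payload[::-1]


result = submit_with_retries(flaky_reverse, "abc", sleep=delays.append)
```

Retrying blindly is only safe when the job is idempotent, so it is worth deciding per function whether a failed job should be resubmitted at all.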

Related Articles

  • Gearman Wiki
  • Gearman Google Groups
  • Queue everything and delight everyone by Leslie Michael Orchard.
  • USENIX 2007. Starts at slide 83.
  • PEAR and Gearman by Daniel O'Connor.
  • Amazon Architecture

Reader Comments (8)

    Message Queue users might want to consider the Open Message Queue as an alternative which is an enterprise quality, production ready, scalable messaging server. It provides a complete Java Message Service (JMS) implementation for message oriented system integration.

    November 29, 1990 | Unregistered Commenter geekr

    The latest version of the Gearman server is written in C, not Perl.

    November 29, 1990 | Unregistered Commenter Anonymous

    I don't see Gearman's java API both for client and worker though they have mentioned java as one of those mix-n-match languages. Is it a separate project?

    November 29, 1990 | Unregistered Commenter Suman Ganta

    Does anyone know if Yahoo used Gearman to implement their Capacity Scheduler used in Hadoop?

    November 29, 1990 | Unregistered Commenter Anonymous

    I wrote a small piece on handling poison messages/jobs with gearman C-based server that I believe will be helpful to others:

    Gearman and Poison Messages/Jobs
    [http://endertech.blogspot.com/2009/10/gearman-and-poison-messages-or-jobs.html]

    October 17, 2009 | Unregistered Commenter Rob O.

    Hi all, I need a persistent queue that writes jobs to a flat file (not a database), so that if the job server crashes and dies, when restarted it will run all the jobs that didn't run...

    December 4, 2009 | Unregistered Commenter abc

    Gearman DOES support persistent queues: http://gearman.org/index.php?id=manual:job_server#persistent_queues

    January 27, 2010 | Unregistered Commenter Yugene

    Gearman supports retries too http://www.hermanradtke.com/blog/retrying-failed-gearman-jobs/

    January 10, 2012 | Unregistered Commenter Cherian
