
13 Simple Tricks for Scaling Python and Django with Apache from HackerEarth

HackerEarth is a coding skill practice and testing service. In a series of well-written articles, the team describes the trials and tribulations of building their site and how they overcame them: Scaling Python/Django application with Apache and mod_wsgi; Programming challenges, uptime, and mistakes in 2013; Post-mortem: The big outage on January 25, 2014; The Robust Realtime Server; 100,000 strong - CodeFactory server; Scaling database with Django and HAProxy; Continuous Deployment System; HackerEarth Technology Stack.

What characterizes these articles and makes them especially helpful is a drive for improvement and an openness towards reporting what didn't work and how they figured out what would work.

As they say, mistakes happen when you are building a complex product with a team of just 3-4 engineers, but investing in infrastructure allowed them to take more breaks and roam the streets of Bangalore while their servers happily served thousands of requests every minute, and to reach a 50,000-user base with ease.

Here's a gloss on how they did it:

Current Architecture at HackerEarth: Frontend server(s); API server(s); Code-checker server(s); Search server(s) - Apache Solr & Elasticsearch; Realtime server - written using Tornado; Status server; Toolchain server (mainly used for continuous deployment); Integration Test server; Log server; Memcached server; a few more servers for data crunching, processing the analytics database, and background jobs; RabbitMQ, Celery, etc., which glue many of the servers together; monitoring servers; databases are sharded and load balanced behind HAProxy.

  1. Remove unnecessary Apache modules. Saves memory and improves performance. By including only what you need you can cut in half the number of modules loaded.
  2. Use Apache MPM (Multi-Processing Module) worker. Generally a better choice for high-traffic servers because it has a smaller memory footprint than the prefork MPM.
  3. KeepAlive Off. Static files are served from CloudFront, and experimentation showed that turning KeepAlive off was more efficient: processes/threads are free to handle new requests immediately rather than waiting for another request to arrive on an older connection.
  4. Daemon Mode of mod_wsgi. The number of threads and processes is constant, which makes resource consumption predictable and protects against traffic spikes. 
  5. Tweaking mpm-worker configuration. They show the configuration they settled on after much experimentation; it favors their application type, which is more CPU intensive than memory intensive.
  6. Check configuration. Enable the modules mod_status.so and mod_info.so to see how Apache is being run. This information helped them significantly reduce the number of servers they had to run and made the application more stable and resilient to traffic bursts.
  7. Nothing scales automatically. 100% uptime is a constant struggle. Roll up your sleeves and work towards that goal.
  8. Don’t take pride in running 100 servers. Write better code and tune your system. There's no pride in throwing servers at a large number of requests. This means making sure, for example, that a request doesn't query the database 20 times.
  9. Asynchronous code-checker server queueing system. Rewriting the code-checker server queueing system to make it asynchronous significantly reduced the process overhead on their frontend servers.
  10. Use Tornado for serious parallel work. In their experience the “socket.io” module was not able to scale past 150 simultaneous connections, and Nowjs leaked file descriptors.
  11. Shard the database and use database routers. Sharding the database reduced the overhead on any single database and further reduced query latencies.
  12. Cache it. Over a million key-value pairs live in memcached, sessions are maintained in Redis, and any other persistent data goes into MySQL or S3, but most of it is cached for some suitable lifetime.
  13. Deploy continuously. Updating code changes in production manually would have driven them crazy and would have been a total waste of time.
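
Tip 8's warning about a request that queries the database 20 times can be made concrete with a small sketch. This is not HackerEarth's code: it uses Python's built-in sqlite3 instead of their Django models, and the users/submissions schema, the CountingConnection wrapper, and the function names are all hypothetical, chosen only to show the difference between one query per row and a single join.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE submissions (
            id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, 21)],
    )
    conn.executemany(
        "INSERT INTO submissions (user_id, score) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, 21)],
    )
    conn.commit()
    return conn


class CountingConnection:
    """Thin wrapper that counts how many queries a code path issues."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def scores_naive(db):
    """One query for the users, then one more query per user."""
    result = {}
    users = db.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    for user_id, name in users:
        row = db.execute(
            "SELECT score FROM submissions WHERE user_id = ?", (user_id,)
        ).fetchone()
        result[name] = row[0]
    return result


def scores_joined(db):
    """Same data fetched with a single JOIN."""
    rows = db.execute(
        "SELECT u.name, s.score FROM users u "
        "JOIN submissions s ON s.user_id = u.id ORDER BY u.id"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    naive_db = CountingConnection(make_db())
    joined_db = CountingConnection(make_db())
    assert scores_naive(naive_db) == scores_joined(joined_db)
    print("naive queries:", naive_db.queries)
    print("joined queries:", joined_db.queries)
```

In a Django application the same idea is usually expressed with select_related or prefetch_related on the queryset; the counting wrapper here simply stands in for whatever query logging you already have.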

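Tip 12's "cached for some suitable lifetime" describes the cache-aside pattern with a time-to-live. The sketch below is only an illustration under stated assumptions: TTLCache is a hypothetical in-process stand-in for memcached, and cached_fetch, loader, and the 60-second default are invented names and values, not anything taken from HackerEarth's setup.

```python
import time


class TTLCache:
    """In-memory stand-in for a memcached client (hypothetical API)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock  # injectable so expiry can be tested

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry outlived its lifetime: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def cached_fetch(cache, key, loader, ttl=60):
    """Cache-aside read: try the cache, fall back to the slow loader.

    Note: a loader result of None is treated as a miss and is
    therefore never cached in this simplified version.
    """
    value = cache.get(key)
    if value is None:
        value = loader(key)  # e.g. a MySQL query in a real system
        cache.set(key, value, ttl)
    return value


if __name__ == "__main__":
    cache = TTLCache()

    def slow_lookup(key):
        print("loading", key, "from the database")
        return key.upper()

    cached_fetch(cache, "profile:42", slow_lookup)  # hits the loader
    cached_fetch(cache, "profile:42", slow_lookup)  # served from cache
```

The lifetime is the tuning knob: a longer ttl means fewer database reads but staler data, so the "suitable" value depends on how quickly each kind of record changes.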
Reader Comments (6)

I actually like the article, but please don't start using link-bait titles. I mostly ignore stuff with titles like that, and I almost skipped this one until I realized it was on High Scalability.

February 10, 2014 | Unregistered Commenter G Gordon Worley III

Indeed. this is not some stupid buzzfeed.. Content of these articles does not warrant degrading to "# simple tricks to X" titles.

February 11, 2014 | Registered Commenter mxx

Actually I try to have a wide mix of content. Some long, some short. Some original, some from others. Some curated with the details elsewhere if you are interested, some self-contained. Some for beginners, some for experts. Some very specific, some very general.

So any single post may not be your thing, but that doesn't mean it won't be someone's thing.

The bullet format is the quickest way to get specific ideas across. Those ideas may be old hat to many or new and interesting. If interesting you can see the referenced source materials for more details, which is exactly the idea. Otherwise you can just move on, which is exactly the idea.

February 11, 2014 | Registered Commenter Todd Hoff

I think the first two comments just meant they didn't like the title because it looked like a cheap trick. But as they also note, your articles are consistently excellent, which is why they went ahead and read it anyway.

February 11, 2014 | Unregistered Commenter jpj

Indeed, there is no problem with the content. It's just that title reeks of linkbait (unjustly).

February 12, 2014 | Unregistered Commenter Kor

jpj, exactly. :)

February 12, 2014 | Registered Commenter mxx
