Wednesday, July 30, 2014

Preventing the Dogpile Effect - Problem and Solution

This is a guest repost by Przemek Sobstel, who believes that the dogpile effect is not covered enough, especially in the PHP world. Original article: Preventing dogpile effect.

The dogpile effect occurs when a cache entry expires and a website is hit by numerous requests at the same time. From my own experience working on high-traffic websites, this is what I consider the best solution. It was used successfully in the wild and it worked. Many people mention storing two redundant values, FRESH + STALE, but on high-traffic websites that was killing our network. We thought it worth sharing our solution and starting a discussion to exchange experiences.

Preventing Dogpiles

Implementing caching in web apps seems simple. You check whether a value is cached. If it is, you fetch it from the cache and serve it. If it’s not, you generate a new value and store it in the cache for future requests. Simple as that.

However, what if the value expires and then you get hundreds of requests? It cannot be served from the cache anymore, so your databases are hit by numerous processes all trying to regenerate the value. The more requests the databases receive, the slower and less responsive they get. Load spikes until, eventually, they likely go down.

See picture below (green - in cache, red - no cache).

Preventing the dogpile effect boils down to having just one process (the first one to come) regenerate the new value while all subsequent processes serve the stale value from the cache until it’s refreshed by the first process.

Worried about serving stale data? Well, if your databases are overloaded and suffering, serving stale data is the smallest inconvenience you can have. And if it takes a long time to regenerate a new value, having multiple processes do this (instead of one) won’t really help. It will just add more load.

Dogpile effect - prevention/implementation

The dogpile effect can be prevented using a semaphore lock. If the value has expired, the first process acquires a lock and starts generating a new value. All subsequent requests check whether the lock is held and, if so, serve stale content. After the new value is generated, the lock is released.

It is important to note that values should in fact be given an extended lifetime, so they’re not physically removed when they expire and can still be served if needed.

Here’s how it works in detail.

Get the cached value from the cache store.

$value = $this->store->get($key);

$value is a value object.

Check whether the cached value has expired. If it has not, serve it.

if ($value && !$value->isStale()) {
	return $value->getResult();
}
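
For illustration, here is a minimal sketch of what such a value object could look like. The class below is an assumption made for this example, not the library's actual code: it only pairs a result with a logical expiration timestamp, and the optional $now argument is added purely so the behaviour can be checked without waiting.

```php
<?php
// Hypothetical value object: pairs a cached result with its logical expiry time.
final class Value
{
    private $result;
    private $expirationTimestamp;

    public function __construct($result, int $expirationTimestamp)
    {
        $this->result = $result;
        $this->expirationTimestamp = $expirationTimestamp;
    }

    public function getResult()
    {
        return $this->result;
    }

    // Stale means "past the logical TTL"; the entry itself may still be stored.
    // $now is injectable (an assumption of this sketch) to keep checks deterministic.
    public function isStale(?int $now = null): bool
    {
        return ($now ?? time()) > $this->expirationTimestamp;
    }
}

$now = 1000;                                  // fixed clock for the example
$value = new Value('rendered page', $now + 60);
var_dump($value->isStale($now + 30));         // still within the 60 second TTL
var_dump($value->isStale($now + 61));         // past the TTL, so stale
```

Keeping the expiration timestamp inside the stored object is what lets the cache server hold the entry longer than its logical lifetime.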

Otherwise, acquire a lock so there’s just one process regenerating the new value.

$lock_acquired = $this->acquireLock($key, $grace_ttl);

If the lock cannot be acquired, it means another process is already regenerating the value, so let’s just serve the current (stale) value.

if (!$lock_acquired) {
	return $value->getResult();
}
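
One common way to build such a lock on a cache server is an atomic "add if absent" operation, such as Memcached::add(), which fails when the key already exists. The sketch below imitates that behaviour with a hypothetical in-memory ArrayStore so it runs without a server. The ".lock" key suffix, the function signatures and the injectable $now are assumptions of this example.

```php
<?php
// Hypothetical in-memory stand-in for a cache server. Its add() imitates
// Memcached::add(): the key is stored only if no live entry exists yet.
final class ArrayStore
{
    private $items = [];

    public function add(string $key, $value, int $ttl, ?int $now = null): bool
    {
        $now = $now ?? time();
        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return false; // somebody else already holds the key
        }
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
        return true;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// The lock is just a short-lived key next to the cached one (assumed naming).
function acquireLock(ArrayStore $store, string $key, int $lockTtl, ?int $now = null): bool
{
    return $store->add($key . '.lock', 1, $lockTtl, $now);
}

function releaseLock(ArrayStore $store, string $key): void
{
    $store->delete($key . '.lock');
}

$store = new ArrayStore();
var_dump(acquireLock($store, 'homepage', 30)); // first process wins the lock
var_dump(acquireLock($store, 'homepage', 30)); // second process is refused
releaseLock($store, 'homepage');
var_dump(acquireLock($store, 'homepage', 30)); // free again after release
```

Because the lock key carries its own TTL, a lock that is never released simply expires, which bounds how long stale content can be served.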

Otherwise (the lock has been acquired), regenerate the value.

$result = ...

Save the regenerated value in the cache store. Add a grace period, so that the stale result can still be served by other processes if needed.

$expiration_timestamp = time() + $ttl;
$value = new Value($result, $expiration_timestamp);

$real_ttl = $ttl + $grace_ttl;
$this->store->set($key, $value, $real_ttl);

Release lock.

$this->releaseLock($key);
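
Put together, the steps above can be sketched as a single function. This is a simplified, single-process illustration rather than the library's code: cachedFetch(), the plain array used as the store, the ".lock" suffix and the explicit $now clock are all assumptions. A real store would need an atomic add for the lock, which the read-then-write below is not. The sketch also makes one choice the steps above leave open: when there is no stale copy to fall back on, it regenerates even if the lock is held elsewhere.

```php
<?php
// Hypothetical helper: fetch $key through a dogpile-protected cache.
// $store is a plain array standing in for the cache server.
function cachedFetch(array &$store, string $key, callable $generate, int $ttl, int $graceTtl, int $now)
{
    $entry = $store[$key] ?? null;
    // Entries past their physical TTL are gone, as they would be on a real server.
    if ($entry !== null && $entry['evictAt'] <= $now) {
        $entry = null;
    }

    // 1. Fresh value: serve it.
    if ($entry !== null && $entry['staleAt'] > $now) {
        return $entry['result'];
    }

    // 2. Is another process already regenerating?
    $lockKey = $key . '.lock';
    $lockHeld = isset($store[$lockKey]) && $store[$lockKey]['evictAt'] > $now;
    if ($lockHeld && $entry !== null) {
        return $entry['result']; // serve the stale copy
    }
    // Assumption: with no stale copy available, regenerate regardless.
    if (!$lockHeld) {
        $store[$lockKey] = ['evictAt' => $now + $graceTtl];
    }

    // 3. Regenerate, store with the extended lifetime, release our lock.
    try {
        $result = $generate();
        $store[$key] = [
            'result'  => $result,
            'staleAt' => $now + $ttl,             // logical expiry
            'evictAt' => $now + $ttl + $graceTtl, // physical expiry
        ];
        return $result;
    } finally {
        if (!$lockHeld) {
            unset($store[$lockKey]);
        }
    }
}

$store = [];
$calls = 0;
$generate = function () use (&$calls) { $calls++; return "v$calls"; };

echo cachedFetch($store, 'page', $generate, 60, 30, 1000), "\n"; // miss: generates
echo cachedFetch($store, 'page', $generate, 60, 30, 1030), "\n"; // fresh: from cache
$store['page.lock'] = ['evictAt' => 1100]; // pretend another process holds the lock
echo cachedFetch($store, 'page', $generate, 60, 30, 1070), "\n"; // stale copy served
```

With a TTL of 60 seconds and a grace period of 30, the entry counts as stale after 60 seconds but stays available for another 30, which is the window in which stale content can cover for the regenerating process.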

Full implementation: https://github.com/sobstel/metaphore/blob/master/src/Cache.php.

Metaphore

Metaphore is an open-source library that prevents the dogpile effect in PHP apps. It’s actually a rewrite of LSDCache, which has been successfully used in many high-traffic production web apps. I just believe that LSDCache has grown too big, into a multi-purpose cache library, while Metaphore strives to be simple: to do just one thing and do it well.

Usage is really simple.

In composer.json file:

"require": {
	"sobstel/metaphore": "dev-master"
}

In your PHP file:

use Metaphore\Cache;

// initialize $memcached object (new Memcached())

$cache = new Cache($memcached);
$cache->cache($key, function(){
    // generate content
}, $ttl);

Thanks

Thanks to Mariusz Gil for his talk about Memcached back in 2010 at PHPCon - which made me aware of the dogpile effect issue - and for allowing me to use pics from his slides.

Reader Comments (5)

How do you handle locks not being released, i.e. still held by crashed processes?

July 30, 2014 | Unregistered Commenter GH

Recently did similar (added lease functionality) but this approach has caveats:
- add operation for some unknown reason is pretty slow in memcached.
- instead of simple get and db access we get three roundtrips to memcached: get miss, lease, db, set, release lease. and this is for all keys. So solution is to use this for real hot or hard to get keys only
- does not work well with multi get - when one query to db populates different keys

July 30, 2014 | Unregistered Commenter Denis

That's a very tricky part to handle. Generally you should keep the lock TTL (time-to-live) low enough that it does not affect your app. Even if it's just 1s, you still allow at most 1 request per second to regenerate data.

At library level, as a partial remedy: releaseLock() may be called in __destruct() if lock has been acquired and not released earlier.
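
A minimal sketch of that destructor idea, under assumptions: LockGuard is a hypothetical class invented for this example, and the release callback stands in for whatever releaseLock() does against the real store.

```php
<?php
// Hypothetical lock handle: releases its lock when the object is destroyed,
// unless release() was already called explicitly.
final class LockGuard
{
    private $release;
    private $held = true;

    public function __construct(callable $release)
    {
        $this->release = $release;
    }

    public function release(): void
    {
        if ($this->held) {
            $this->held = false;      // make sure the callback runs only once
            ($this->release)();
        }
    }

    public function __destruct()
    {
        $this->release();
    }
}

$locks = ['homepage.lock' => true];   // pretend the lock was just acquired
$guard = new LockGuard(function () use (&$locks) {
    unset($locks['homepage.lock']);
});
unset($guard);                        // handle goes away without an explicit release()
var_dump(isset($locks['homepage.lock']));
```

It remains a partial remedy only: a process that is killed outright never runs its destructors, so a low lock TTL is still needed as the backstop.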

July 31, 2014 | Unregistered Commenter sobstel

Re: "How do you handle locks not being released, i.e. still held by crashed processes?"
--

You could consider using MySQL with GET_LOCK(), or implementing something similar in a server programming language. The key semantic is that when the connection goes away, the lock is automatically released. This may also be combined with a longer lock TTL.

July 31, 2014 | Unregistered Commenter Ryan Bastic

My solution in Yii1 component: https://gist.github.com/m8rge/4554478

using:

$cache = Yii::app()->cache;
if (false === $result = $cache->get($cacheKey)) {
    // setWithLocking() also accepts optional trailing arguments: $waitTimeout, $lockTimeout
    $result = $cache->setWithLocking($cacheKey, function($cacheKey) use ($cache) {
        $result = file_get_contents('http://slow.service.ru');
        $cache->set($cacheKey, $result);

        return $result;
    });
}

August 4, 2014 | Unregistered Commenter m8rge
