Using Varnish for Paywalls: Moving Logic to the Edge
This is a guest post from Per Buer, founder and CEO of Varnish Software, provider of Varnish Cache, an open source web application accelerator freely available at varnish-cache.org. Varnish powers a lot of really big websites worldwide.
We at Varnish Software are all about speed. Varnish Cache is built for speed. It executes its policy code roughly a thousand times faster than a typical Java- or PHP-based application server, mostly because the configuration is compiled into machine code that makes no system calls.
System calls require expensive context switches, stall the CPU and wreak havoc in the CPU cache, so avoiding them makes the code fly. There are strong limitations on what kind of logic you can move into Varnish Cache, but the logic that you do move there will run very fast.
One example is access control: using Varnish to serve access-controlled content directly from the caching edge layer.
The Varnish Paywall
Who gets to access your content? In a traditional environment the caching layer only serves up pieces of content without giving any thought to who gets access to them. Since the rules governing access control can be rather complex, they have traditionally been implemented in the application server, which is slow.
We’ve seen companies struggle with performance when introducing a paywall suddenly forces them to serve content from their application layer again. With a bit of effort and some open source magic you can have your cake and eat it too: serve access-controlled content from the caching edge layer.
How would it work?
Varnish Cache would need two pieces of information. One would be a header coming from the origin server indicating that this piece of content is under access control, maybe X-Access-Control. If the header is present, Varnish would then check whether the user is logged in, using a cookie. This cookie would be set by an authentication service, and if you are worried about users cheating you could secure it by signing the cookie cryptographically. It’s possible, but not recommended, to implement the actual authentication in Varnish itself, using modules to access data in a database or another data source. As each user usually only logs in once, this is not a performance-critical path, so it makes more sense to do it on your regular application servers.
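The decision described above can be sketched in a few lines. This is not VCL and not how any particular VMOD does it; it is a minimal Python illustration of the logic, assuming an HMAC-SHA256 signature, a cookie value of the form `user_id.signature`, and a shared secret. The names `SECRET_KEY`, `sign_cookie`, `verify_cookie` and `may_serve` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret shared by the authentication service and the edge.
SECRET_KEY = b"change-me"


def sign_cookie(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Build the cookie value the authentication service would set."""
    sig = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"


def verify_cookie(value: str, key: bytes = SECRET_KEY) -> bool:
    """Check that the signature matches the user id it is attached to."""
    user_id, sep, sig = value.rpartition(".")
    if not sep or not user_id:
        return False
    expected = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison, done on bytes so odd input cannot raise.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))


def may_serve(response_headers: dict, auth_cookie) -> bool:
    """Serve freely unless the origin flagged the object as access controlled.

    Simplification: real HTTP header names are case-insensitive; this
    sketch does an exact-case dictionary lookup.
    """
    if "X-Access-Control" not in response_headers:
        return True
    return auth_cookie is not None and verify_cookie(auth_cookie)
```

Because the edge only recomputes a hash, it never has to contact the authentication service on the hot path; the cost is that both sides must hold the same secret.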
Authenticated access is not the only option. You might also want to limit access to, say, 5 articles per user per week. If so, you would store the read-article count in a signed cookie or in a NoSQL database like Redis or MemcacheDB. To do this you extend Varnish with a Varnish Module (VMOD), as VCL itself lacks flow control structures such as for loops.
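As a rough illustration of the signed-cookie variant of such a metered paywall, here is a Python sketch of the counting logic. It is not VMOD code. The cookie layout (`week:count.signature`), the fixed seven-day window counted from the Unix epoch, and the names `SECRET_KEY`, `read_count` and `register_read` are all assumptions made for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"      # hypothetical signing secret
WEEKLY_LIMIT = 5               # articles per user per week, as in the text
WEEK_SECONDS = 7 * 24 * 3600


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def read_count(cookie, week: int) -> int:
    """Return the verified count for `week`.

    A missing, forged, or previous-week cookie counts as zero reads.
    """
    if not cookie:
        return 0
    payload, sep, sig = cookie.rpartition(".")
    if not sep or not hmac.compare_digest(sig.encode("utf-8"),
                                          _sign(payload).encode("utf-8")):
        return 0
    cookie_week, _, count = payload.partition(":")
    if cookie_week != str(week) or not count.isdecimal():
        return 0
    return int(count)


def register_read(cookie, now: float):
    """Return (allowed, cookie_to_send_back) for one article request."""
    week = int(now // WEEK_SECONDS)
    count = read_count(cookie, week)
    if count >= WEEKLY_LIMIT:
        return False, cookie
    payload = f"{week}:{count + 1}"
    return True, f"{payload}.{_sign(payload)}"
```

Note the trade-off in this sketch: the signature stops a user from editing the count, but deleting the cookie resets it to zero. Keeping the count server-side, for example in Redis, avoids that at the price of a lookup per request.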
What tools are needed? How much effort?
Getting a proof of concept up and running should be pretty fast. You could just check for the presence of a certain value in a cookie and you’ve proven the concept. The more advanced controls will require more effort, maybe a week or two of work, depending on the complexity.
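The proof-of-concept check really is this small. The Python sketch below only shows the test itself, using the standard library cookie parser; the cookie name `subscriber` and the value `yes` are made up for the example, and an unsigned cookie like this is trivially forgeable, so it proves the concept and nothing more.

```python
from http.cookies import SimpleCookie


def is_subscriber(cookie_header: str) -> bool:
    """Return True if the Cookie header carries subscriber=yes."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get("subscriber")  # hypothetical cookie name
    return morsel is not None and morsel.value == "yes"
```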
Caution
Be wary when moving logic away from your application servers. You must always maintain clear guidelines on what goes where or you’ll quickly end up with a messy infrastructure. In some organizations the edge cache is considered infrastructure rather than part of the web application, and is handled by a different team.
Resources
- Digest VMOD, https://www.varnish-cache.org/vmod/digest
- Redis VMOD, https://www.varnish-cache.org/vmod/redis
- Writing VMODs, http://blog.zenika.com/index.php?post%2F2012%2F08%2F21%2FCreating-a-Varnish-module
- The Varnish Book, https://www.varnish-software.com/book
- The Official Varnish Documentation, https://www.varnish-cache.org/docs/