Try Squid as a Reverse Proxy

This scalability strategy is brought to you by Erik Osterman:

My recommendation for anyone dealing with explosive growth on a limited budget with lots of cacheable content (e.g. content capable of returning valid expiration headers) is to employ a reverse proxy, as mentioned in this article.
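To make "valid expiration headers" concrete, here is a minimal Python sketch of the headers an origin server could attach so that a shared cache such as a reverse proxy is allowed to reuse a response. The function name `expiration_headers` and the five-minute lifetime are assumptions chosen for illustration; they do not come from the original setup.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiration_headers(max_age_seconds, now=None):
    """Return headers that mark a response as cacheable by shared caches.

    `now` is injectable so the result is reproducible; it must be an
    aware datetime in UTC.
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # "public" allows shared caches (not only browsers) to store it.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Expires is the older HTTP/1.0 equivalent, as an HTTP date.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in expiration_headers(300).items():
        print(f"{name}: {value}")
```

Responses without headers like these (or with `Cache-Control: private` or `no-store`) generally have to be fetched from the origin every time, which is exactly the load a reverse proxy is meant to absorb.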

In the last week, we had a site get AP'd, triggering 100K unique visitors to a single IIS server in under 5 hours. It took out the IIS server. Placing a single Squid in front of the server handled the entire onslaught with a max server load of 0.10 on a modest Intel IV 3 GHz.

It's trivial to implement for anyone interested...

Comments

squid as reverse proxy

I've been using Squid for different things for about 10 years now. Though it works for most cases, I've been advised to be cautious before putting it in front of high-traffic sites. The person who told me this mentioned that they had to restart Squid every few hours under very heavy load.

re: squid as reverse proxy

Yep, YouTube also mentioned some problems with squid.

From http://highscalability.com/youtube-architecture:

Used squid (reverse proxy) in front of Apache. This worked for a while, but as load increased performance eventually decreased. Went from 300 requests/second to 20.

So it's not all roses and candy.

re: Squid as a Reverse Proxy

I think that it's a kind of truism to say that as "load increased, performance eventually decreased." That's to be expected with most servers; once capacity is reached, performance often degrades exponentially as the server works through its backlog of requests while still accepting new ones. I see this as an indicator that it's time to scale out. I haven't done the math to calculate the cost effectiveness, but my gut instinct is that Squid is still one of the most cost-effective ways to attack the problem, as opposed to throwing more web servers into the pool. With its own built-in peer-to-peer caching network, Squid makes it far easier and more efficient to scale than web servers. This means that as you scale out the Squids, they can just request content from cache peers, leaving the web servers free to handle new requests. Squids can handle enormous amounts of traffic well, but will get overwhelmed at a certain point; that's inevitable.
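The cache-peer idea above can be sketched in a few lines of Python. This is a toy model, not Squid's actual peering protocol: the class `CacheNode`, its `peek` method, and the `fetch_from_origin` callback are all hypothetical names invented for the illustration. It only shows the lookup order (local cache, then peers, then the web server) and why the origin sees fewer requests as caches are added.

```python
class CacheNode:
    """A toy cache that asks sibling caches before going to the origin."""

    def __init__(self, name, fetch_from_origin):
        self.name = name
        self.store = {}          # url -> cached body
        self.peers = []          # sibling CacheNode instances
        self.fetch_from_origin = fetch_from_origin

    def peek(self, url):
        """Answer a peer's query from the local store only (no forwarding)."""
        return self.store.get(url)

    def get(self, url):
        """Return (body, source), where source names who supplied the body."""
        body = self.store.get(url)
        if body is not None:
            return body, "local"
        for peer in self.peers:
            body = peer.peek(url)
            if body is not None:
                self.store[url] = body
                return body, f"peer:{peer.name}"
        body = self.fetch_from_origin(url)
        self.store[url] = body
        return body, "origin"


if __name__ == "__main__":
    origin_requests = []

    def origin(url):
        origin_requests.append(url)
        return f"<html>{url}</html>"

    a, b = CacheNode("a", origin), CacheNode("b", origin)
    a.peers, b.peers = [b], [a]

    print(a.get("/thumb/1")[1])   # first request goes to the origin
    print(b.get("/thumb/1")[1])   # sibling already has it
    print(b.get("/thumb/1")[1])   # now cached locally
    print(len(origin_requests))   # origin was contacted once
```

A real deployment also has to deal with expiry, peer timeouts, and objects too large to keep, none of which this sketch attempts.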

I am frankly surprised that YouTube had trouble serving its thumbnail traffic from Squid. Google has been using Squid for its thumbnails on their image search, Orkut and likely on many other properties. To be fair, cache hydration is an issue when dealing with millions of small objects. I also wish Squid would use its own object store that didn't involve storing each individual object as a file on disk.

As for Squid's stability under high load, I am curious if it still suffers from problems like that. Squid is an actively maintained project, frequently releasing updates. It's come a long way from v1. If someone is genuinely experiencing lockups under high load, I really hope they get in touch with the maintainers of Squid so as to get to the root of the problem.

Squid and scaling

No piece of your architecture will scale to infinity, and Squid is no exception.

However, for sites that have very little content compared to the number of HTTP requests, and where the content can be made cacheable, the performance can be blazingly fast.

Squid can be configured to function with a memory cache, btw. I think the disk cache is optional, but I might be wrong.

Squid vs. Varnish?

Has anyone tested Squid against Varnish? I've read some articles, and watched a couple of presentations by the author, and it looks pretty promising. The author claims that Varnish will provide significantly better performance and scalability when compared with squid.

Varnish

Worth reading this link to see why Varnish is better than Squid. It's all about virtual memory usage, and not fighting the operating system.
http://varnish.projects.linpro.no/wiki/ArchitectNotes

Re: Try Squid as a Reverse Proxy

I actually found that Squid performs better when compiled with TCMalloc (from Google's perftools).
Squid used to degrade over time (several months of uptime) with an average load of 3000 requests/minute on my web server. After compiling it with perftools, I'm no longer seeing the degradation.
In fact, perftools is so good that I decided to compile MySQL with it as well, and my server runs happily now.

Re: Try Squid as a Reverse Proxy

@Dan Kubb
> Has anyone tested Squid against Varnish? I've read some articles, and

From my testing, Varnish crushes Squid, it's not even close. I've worked with a few different load levels in a corp, mid-sized web environment - Varnish will come up 5-20x faster than Squid depending on load on identical hardware. I've worked on our Squid config for over a month and can't improve upon it, while I've hardly touched Varnish's. While I'm sure there's still a place for Squid, I would say it is more for a filter/forward proxy, and not a reverse proxy anymore.

fak3r

Re: Try Squid as a Reverse Proxy

fak3r:

With your Squid vs. Varnish comparisons, are you working with constantly full caches? Meaning, is the working set larger than the cache size? Varnish only recently had LRU eviction put in, so it can handle a consistently churning cache. Or are you testing with what can be served just from memory?
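Why the working-set question matters can be shown with a small LRU simulation in Python. It is only a sketch under stated assumptions: `LRUCache` and `hit_rate` are hypothetical helpers, not code from Squid or Varnish, and the cyclic request pattern is deliberately the worst case for LRU. Real traffic usually lands somewhere between the two extremes.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.evictions = 0

    def get(self, key):
        """Return True on a hit and mark the key as most recently used."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        return False

    def put(self, key, value):
        """Store a value, evicting the oldest entry if over capacity."""
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
            self.evictions += 1


def hit_rate(capacity, requests):
    """Replay a request sequence; return (hit ratio, eviction count)."""
    cache = LRUCache(capacity)
    hits = 0
    for key in requests:
        if cache.get(key):
            hits += 1
        else:
            cache.put(key, b"object")
    return hits / len(requests), cache.evictions


if __name__ == "__main__":
    fits = [i % 4 for i in range(40)]    # working set of 4 objects
    churns = [i % 5 for i in range(40)]  # working set of 5 objects
    print(hit_rate(4, fits))    # everything after warm-up is a hit
    print(hit_rate(4, churns))  # every request misses and evicts
```

A benchmark where everything fits in memory measures only the first case, so it says little about how a cache behaves once it is constantly evicting.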

Re: Try Squid as a Reverse Proxy

From my testing, Varnish crushes Squid, it's not even close. I've worked with a few different load levels in a corp, mid-sized web environment - Varnish will come up 5-20x faster than Squid depending on load on identical hardware. I've worked on our Squid config for over a month and can't improve upon it, while I've hardly touched Varnish's. While I'm sure there's still a place for Squid, I would say it is more for a filter/forward proxy, and not a reverse proxy anymore.

Re: Try Squid as a Reverse Proxy

Oyun: were your tests done with working sets that were larger than could fit in memory, or on disk? I.e., was Varnish constantly evicting objects at the same time? Or were you just serving everything out of memory each time? I've yet to hear of a Varnish install that was running with constantly full caches, which is why I'm sticking with Squid.

Re: Try Squid as a Reverse Proxy

Congratulations
