Caching

Scalability Perspectives #2: Van Jacobson – Content-Centric Networking

Scalability Perspectives is a series of posts that highlights the ideas that will shape the next decade of IT architecture. Each post is dedicated to a thought leader of the information age and his vision of the future. Be warned though – the journey into the minds and perspectives of these people requires an open mind.

Van Jacobson

Van Jacobson is a Research Fellow at PARC. Prior to that he was Chief Scientist and co-founder of Packet Design. Prior to that he was Chief Scientist at Cisco. Prior to that he was head of the Network Research group at Lawrence Berkeley National Laboratory. He's been studying networking since 1969. He still hopes that someday something will start to make sense.

Scaling the Internet – Does the Net need an upgrade?

As the Internet is being overrun with video traffic, many wonder if it can survive. With challenges being thrown down over the imbalances that have been created and their impact on the viability of monopolistic business models, the Internet is under constant scrutiny. Will it survive? Or will it succumb to the burden of the billion plus community that is constantly demanding more and more?

Does the Net need an upgrade? To answer this question, a distinguished panel of Van Jacobson, Rick Hutley, Norman Lewis, and David S. Isenberg discussed the issue at the Supernova conference. In this compelling debate, available on IT Conversations, the panel addresses the question and offers some differing perspectives. One of those perspectives is Content-based networking, described by Van Jacobson.

Todd Hoff's picture

A High Performance Memory Database for Web Application Caches

Abstract: This paper presents the architecture and characteristics of a memory database intended to be used as a cache engine for web applications. Primary goals of this database are speed and efficiency while running on SMP systems with several CPU cores (four and more). A secondary goal is the support for simple metadata structures associated with cached data that can aid in efficient use of the cache. Due to these goals, some data structures and algorithms normally associated with this field of computing needed to be adapted to the new environment.

Oracle opens Coherence Incubator

During the Coherence Special Interest Group meeting in London, Brian Oliver from Oracle yesterday announced the start of the Coherence Incubator project. Coherence Incubator is a new online repository of projects that provides reference implementation examples for commonly used design patterns and integration solutions based on Oracle Coherence.


Product: ScaleOut StateServer is Memcached on Steroids

ScaleOut StateServer is an in-memory distributed cache that runs across a server farm or compute grid. Unlike the products of middleware vendors, StateServer aims to be a very good data cache; it doesn't try to handle job scheduling as well.

StateServer is what you might get when you take Memcached and merge in all the value-added distributed caching features you've ever dreamed of. True, Memcached is free and ScaleOut StateServer is very far from free, but for those looking for a satisfying out-of-the-box experience, StateServer may be just the caching solution you are looking for. Yes, "solution" is one of those "oh my God I'm going to pay through the nose" indicator words, but it really applies here. Memcached is a framework, whereas StateServer has already prepackaged most of the features you would otherwise need to add through your own programming efforts.

Why use a distributed cache? Because it combines the holy quadrinity of computing: better performance, linear scalability, high availability, and fast application development. Performance is better because data is accessed from memory instead of through a database to a disk. Scalability is linear because as more servers are added, data is transparently load balanced across them, which amounts to automated in-memory sharding. Availability is higher because multiple copies of data are kept in memory and the entire system reroutes on failure. Application development is faster because there's only one layer of software to deal with, the cache, and its API is simple. All the complexity is hidden from the programmer, which means all a developer has to do is get and put data.
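To make the "get and put" plus sharding idea concrete, here is a minimal Python sketch. It is not StateServer's API or placement algorithm; the `ShardedCache` class, the node names, and the modulo placement are all assumptions made for illustration, and each "server" is just a local dict.

```python
import hashlib


class ShardedCache:
    """Toy cache that spreads keys across several 'servers' (plain dicts here)."""

    def __init__(self, server_names):
        # Hypothetical servers, each modelled as an in-process dict.
        self.names = sorted(server_names)
        self.servers = {name: {} for name in self.names}

    def _server_for(self, key):
        # Hash the key and map it onto one server with simple modulo placement.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def put(self, key, value):
        self.servers[self._server_for(key)][key] = value

    def get(self, key, default=None):
        return self.servers[self._server_for(key)].get(key, default)


cache = ShardedCache(["node-a", "node-b", "node-c"])
for i in range(300):
    cache.put(f"user:{i}", {"id": i})

# How many keys landed on each hypothetical server.
sizes = {name: len(store) for name, store in cache.servers.items()}
```

The caller only ever sees `get` and `put`; where a key lives is hidden. Note that plain modulo placement moves most keys when a server is added, so real products typically use consistent hashing or a partition table instead, and they also replicate entries, which this sketch does not.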

StateServer follows the RAM is the new disk credo. Memory is assumed to be the system of record, not the database. If you want data to be stored in a database and have the two kept in sync, then you'll have to add that layer yourself. All the standard memcached techniques should work as well for StateServer. Consider however that a database layer may not be needed. Reliability is handled by StateServer because it keeps multiple data copies, reroutes on failure, and has an option for geographical distribution for another layer of added safety. Storing to disk wouldn't make you any safer.

Via email I asked them a few questions. The key question was how they stack up against Memcached. As that is surely one of the more common challenges they get in any sales cycle, I was very curious about their answer. And they did a great job differentiating themselves. What did they say?


Strategy: Serve Pre-generated Static Files Instead Of Dynamic Pages

Pre-generating static files is an oldie but a goodie, and as Thomas Brox Røst says, it's probably an underused strategy today. At one time this was the dominant technique for structuring a web site. Then the age of dynamic web sites arrived and we spent all our time worrying about how to make the database faster and how to add more caching to recover the speed we had lost in the transition from static to dynamic.

Static files have the advantage of being very fast to serve. Read from disk and display. Simple and fast. Especially when caching proxies are used. The issue is how do you bulk generate the initial files, how do you serve the files, and how do you keep the changed files up to date? This is the process Thomas covers in his excellent article "Serving static files with Django and AWS - going fast on a budget", where he explains how he converted 600K previously dynamic pages to static pages for his site Eventseer.net, a service for tracking academic events.

Eventseer.net was experiencing performance problems as search engines crawled their 600K dynamic pages. As a solution you could imagine scaling up, adding more servers, adding sharding, etc etc, all somewhat complicated approaches. Their solution was to convert the dynamic pages to static pages in order to keep search engines from killing the site. As an added bonus non logged-in users experienced a much faster site and were more likely to sign up for the service.

The article does a good job explaining what they did, so I won't regurgitate it all here, but I will cover the highlights and comment on some additional potential features and alternate implementations...
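As a rough illustration of the bulk-generate and keep-up-to-date steps, here is a small Python sketch. It is not the article's Django and AWS implementation; the page template, the `events` records, and the function names are hypothetical, and files are written to a temporary directory rather than to a real web root.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real site would use its own templating system.
PAGE = Template("<html><head><title>$title</title></head>"
                "<body><h1>$title</h1><p>$body</p></body></html>")


def render(event):
    """Render one record to an HTML string, escaping user-supplied text."""
    return PAGE.substitute(title=html.escape(event["title"]),
                           body=html.escape(event["body"]))


def write_if_changed(out_dir, event):
    """Write <out_dir>/<slug>.html; return True only if the file was (re)written."""
    path = Path(out_dir) / f"{event['slug']}.html"
    page = render(event)
    if path.exists() and path.read_text(encoding="utf-8") == page:
        return False  # Already up to date, nothing to do.
    path.write_text(page, encoding="utf-8")
    return True


def generate_all(out_dir, events):
    """Bulk-generate every page and return how many files were written."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    return sum(write_if_changed(out_dir, event) for event in events)


events = [
    {"slug": "conf-a", "title": "Conference A", "body": "Call for papers."},
    {"slug": "conf-b", "title": "Conference B", "body": "Registration open."},
]
out_dir = tempfile.mkdtemp()
first_run = generate_all(out_dir, events)    # initial bulk generation
events[0]["body"] = "Deadline extended."
second_run = generate_all(out_dir, events)   # only the changed page is rewritten
```

The generated directory can then be served directly by a web server or a caching proxy, with the dynamic application only handling logged-in users.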

Product: Terracotta - Open Source Network-Attached Memory

Update: Evaluating Terracotta by Piotr Woloszyn. Nice writeup that covers resilience, failover, DB persistence, Distributed caching implementation, OS/Platform restrictions, Ease of implementation, Hardware requirements, Performance, Support package, Code stability, partitioning, Transactional, Replication and consistency.

Terracotta is Network Attached Memory (NAM) for Java VMs. It provides up to a terabyte of virtual heap for Java applications that spans hundreds of connected JVMs.

NAM is best suited for storing what they call scratch data. Scratch data is defined as object oriented data that is critical to the execution of a series of Java operations inside the JVM, but may not be critical once a business transaction is complete.

The Terracotta Architecture has three components:

  1. Client Nodes - Each client node is an application instance in the cluster that runs on a standard JVM
  2. Server Cluster - A Java process that provides the clustering intelligence. The current Terracotta implementation operates in an Active/Passive mode
  3. Storage used as
    • Virtual Heap storage - as objects are paged out of the client nodes, into the server, if the server heap fills up, objects are paged onto disk
    • Lock Arbiter - To ensure that there is no possibility of the classic "split-brain" problem, Terracotta relies on the disk infrastructure to provide a lock.
    • Shared Storage - to transmit the object state from the active to passive, objects are persisted to disk, which then shares the state to the passive server(s).

JVM-level clustering can turn single-node, multi-threaded apps into distributed, multi-node apps, often with no code changes. This is possible by plugging in to the Java Memory Model in order to maintain key Java semantics of pass-by-reference, thread coordination and garbage collection across the cluster. Terracotta enables this using only declarative configuration with minimal impact to existing code and provides fine-grained field-level replication which means your objects no longer need to implement Java serialization.

Ari Zilka, the founder and CTO of Terracotta, gave a video session organized by Skills Matter. In it he shows how it works and how you can start clustering your POJO-based Web applications (based on Spring, Struts, Wicket, RIFE, EHCache, Quartz, Lucene, DWR, Tomcat, JBoss, Jetty or Geronimo etc.).


Strategy: Let Google and Yahoo Host Your Ajax Library - For Free

Don't have a CDN? Why not let Google and Yahoo be your CDN? At least for Ajax libraries. No charge. Google runs a content distribution network and loading architecture for the most popular open source JavaScript libraries, which include: jQuery, prototype, script.aculo.us, MooTools, and dojo. The idea is web pages directly include your library of choice from Google's global, fast, and highly available network. Some have found much better performance and others experienced slower performance. My guess is the performance may be slower if your data center is close to you, but far away users will be much happier. Some negatives: not all libraries are included, you'll load more than you need because all functionality is included. Yahoo has had a similar service for YUI for a while. Remember to have a backup plan for serving your libraries, just in case.


Ehcache - A Java Distributed Cache

Ehcache is a pure Java cache with the following features: fast, simple, small footprint, minimal dependencies, provides memory and disk stores for scalability into gigabytes, scalable to hundreds of caches, is a pluggable cache for Hibernate, tuned for high concurrent load on large multi-CPU servers, provides LRU, LFU and FIFO cache eviction policies, and is production tested. Ehcache is used by LinkedIn to cache member profiles. The user guide says it's possible to get a 2.5 times system speedup for persistent Object Relational Caching, a 1000 times system speedup for Web Page Caching, and a 1.6 times system speedup for Web Page Fragment Caching.
From the website:

Economies of Non-Scale

Scalability forces us to think differently. What worked on a small scale doesn't always work on a large scale -- and costs are no different. If 90% of our application is free of contention, and only 10% is spent on shared resources, we will need to grow our compute resources by a factor of 100 to scale by a factor of nearly 10! Another important thing to note is that 10x, in this case, is the limit of our ability to scale, even if more resources are added.

1. The cost of non-linearly scalable applications grows exponentially with the demand for more scale.
2. Non-linearly scalable applications have an absolute limit of scalability. According to Amdahl's Law, with 10% contention, the maximum scaling limit is 10. With 40% contention, our maximum scaling limit is 2.5 - no matter how many hardware resources we throw at the problem.
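The numbers above follow from Amdahl's Law, where the speedup on n resources is 1 / (s + (1 - s) / n) and s is the contended (serial) fraction. A quick Python check, with function names chosen here purely for illustration:

```python
def amdahl_speedup(serial_fraction, n):
    """Speedup on n resources when serial_fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def amdahl_limit(serial_fraction):
    """Upper bound on speedup as n grows without limit."""
    return 1.0 / serial_fraction


speedup_100x = amdahl_speedup(0.10, 100)   # 10% contention, 100x the resources
limit_10pct = amdahl_limit(0.10)           # ceiling with 10% contention
limit_40pct = amdahl_limit(0.40)           # ceiling with 40% contention
```

With 10% contention, 100 times the resources gives a speedup of roughly 9.2, close to the factor of 10 quoted above, and no amount of hardware gets past the ceilings of 10 and 2.5.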

This post discusses in further detail how to measure the true cost of non-linearly scalable systems and suggests a model for reducing that cost significantly.

Scaling Out MySQL

This post covers two main options for scaling out MySQL and compares them. The first is based on database clustering and the second is based on in-memory clustering, a.k.a. Data Grid. Special emphasis is given to a pattern that shows how to scale an existing database without changing it, by combining a Data Grid with the database running as a background service. This pattern is referred to as Persistency as a Service (PaaS). The post also addresses many of the frequently asked questions about how performance, reliability and scalability are achieved with this pattern.
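One common way to put a database behind an in-memory grid is write-behind: the application reads and writes memory, and changes are persisted asynchronously. The Python sketch below illustrates that general idea only; it is an assumption about how such a pattern can look, not the post's implementation. SQLite stands in for MySQL, and the `WriteBehindGrid` class and `kv` table are hypothetical.

```python
import sqlite3
from collections import deque


class WriteBehindGrid:
    """Toy data grid: serves reads and writes from memory, persists later."""

    def __init__(self, conn):
        self.conn = conn          # database connection used only by flush()
        self.data = {}            # in-memory system of record
        self.pending = deque()    # changes not yet written to the database

    def put(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def get(self, key):
        return self.data.get(key)

    def flush(self):
        """Persist queued changes; in practice a background worker would call this."""
        written = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
            written += 1
        self.conn.commit()
        return written


conn = sqlite3.connect(":memory:")   # SQLite stands in for MySQL here
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

grid = WriteBehindGrid(conn)
grid.put("user:1", "alice")
grid.put("user:1", "alice-updated")

rows_before = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
flushed = grid.flush()
stored = conn.execute("SELECT v FROM kv WHERE k = ?", ("user:1",)).fetchone()[0]
```

The application never waits on the database, which is where the performance gain comes from; the trade-off is that the grid itself must be reliable, since unflushed changes exist only in memory.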

Database War Stories #3: Flickr

[Tim O'Reilly] Continuing my series of queries about how "Web 2.0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. How do you manage a tag cloud?"

Speed up (Oracle) database code with result caching

One of the most interesting new features of Oracle 11 is the new function result caching mechanism. Until now, making sure that a PL/SQL function gets executed only as many times as necessary was a black art. The new caching system makes that quite easy -- here is how it works.

Golden rule of web caching

Effective content caching is one of the key features of scalable web sites. Although there are several out-of-the-box options for caching with modern web technologies, a custom built cache still provides the best performance.


IBMer Says LAMP Can't Scale

A very entertaining and somewhat educational article: IBM Poopheads say LAMP Users Need to "grow up". The physical three-tier architecture turns out to be the root of all evil, and shared-nothing architectures bring simplicity and light.

In the comments Simon Willison makes an insightful point on why fine-grained caching works for personalized pages and proxies don't:
Great post, but I have to disagree with you on the finely grained caching part. If you look at big LAMP deployments such as Flickr, LiveJournal and Facebook the common technology component that enables them to scale is memcached - a tool for finely grained caching. That's not to say that they aren't doing shared-nothing, it's just that memcached is critical for helping the database layer scale. LiveJournal serves around 50% of its page views "permission controlled" (friends only) so an HTTP proxy on the front end isn't the right solution - but memcached reduces their database hits by 90%.


Product: Tugela Cache

Tugela Cache is a cache system like memcached, but instead of storing data just in RAM, it stores data in the file system using a b-tree. You trade latency in order to have a very large cache. It's useful for sites that have caching requirements that exceed their available memory. It uses the same wire protocol as memcached so it can be dropped in without a hassle. From the website:

As large MediaWiki deployments may gain performance using Memcached, at some level cost of RAM to store all objects becomes too high. In order to balance resource usage and make more use of our Apache server disks, Tugela, the distributed cached on-disk hash database, has arrived.

Tugela Cache is derived from Memcached. Much of the code remains the same, but notably, these changes:
* Internal slab allocator replaced by BerkeleyDB B-Tree database.
* Expiry policy management moved to external program tugela-expire
* Much statistics code made obsolete.

An interesting point brought up in the comments is the choice between running memcached with a cache size larger than physical RAM and letting the OS swap, versus using a b-tree to access data on disk. Nginx seems to use the "let the OS swap" approach to good effect. It would be interesting to see which approach works better.

For an idea of how an in-process cache and a disk based cache hierarchy can work together take a look at Kevin Burton's IDEA: Hierarchy of caches for high performance AND high capacity memcached.
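To give a feel for how such a hierarchy behaves, here is a minimal Python sketch of a two-tier cache. It is a generic illustration, not Kevin Burton's design and not Tugela's code: the `TwoTierCache` class is hypothetical, the fast tier is a small in-process LRU, and the standard-library `shelve` module stands in for the on-disk b-tree store.

```python
import os
import shelve
import tempfile
from collections import OrderedDict


class TwoTierCache:
    """Small in-process LRU tier in front of a larger on-disk tier."""

    def __init__(self, disk_path, memory_items=2):
        self.memory = OrderedDict()          # fast tier, limited capacity
        self.memory_items = memory_items
        self.disk = shelve.open(disk_path)   # slow tier, large capacity

    def set(self, key, value):
        self.disk[key] = value               # the disk tier holds everything
        self._remember(key, value)

    def get(self, key):
        if key in self.memory:               # fast path: memory hit
            self.memory.move_to_end(key)
            return self.memory[key]
        if key in self.disk:                 # slow path: promote from disk
            value = self.disk[key]
            self._remember(key, value)
            return value
        return None                          # miss in both tiers

    def _remember(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.memory_items:
            self.memory.popitem(last=False)  # evict the least recently used key

    def close(self):
        self.disk.close()


cache = TwoTierCache(os.path.join(tempfile.mkdtemp(), "l2cache"))
cache.set("a", 1)
cache.set("b", 2)
cache.set("c", 3)              # "a" is evicted from memory but stays on disk
value_a = cache.get("a")       # served from disk, then promoted to memory
in_memory = list(cache.memory)
cache.close()
```

Hot keys are answered from memory, while the long tail pays a disk read; that is the latency-for-capacity trade described above.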

There's also an interesting variation called Memcachedb which is said to be "a better and simplified Tugela." It's more of a persistence mechanism than a cache. It enables transactions, replication, and there's no expiration.


Try Squid as a Reverse Proxy

This scalability strategy is brought to you by Erik Osterman:

My recommendation for anyone dealing with explosive growth on a limited budget with lots of cachable content (e.g. content capable of returning valid expiration headers) is to employ a reverse proxy as mentioned in this article.

In the last week, we had a site get AP'd, triggering 100K unique visitors to a single IIS server in under 5 hours. It took out the IIS server. Placing a single Squid in front of the server handled the entire onslaught with a max server load of 0.10 on a modest Intel IV 3GHz.

It's trivial to implement for anyone interested...


Product: Memcached

memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.


Product: eAccelerator a PHP Accelerator

eAccelerator is a free open-source PHP accelerator, optimizer, and dynamic content cache. It increases the performance of PHP scripts by caching them in their compiled state, so that the overhead of compiling is almost completely eliminated. It also optimizes scripts to speed up their execution. eAccelerator typically reduces server load and increases the speed of your PHP code by 1-10 times.
