ScaleOut StateServer is an in-memory distributed cache across a server farm or compute grid. Unlike middleware vendors, StateServer is aims at being a very good data cache, it doesn't try to handle job scheduling as well.
StateServer is what you might get when you take Memcached and merge in all the value added distributed caching features you've ever dreamed of. True, Memcached is free and ScaleOut StateServer is very far from free, but for those looking a for a satisfying out-of-the-box experience, StateServer may be just the caching solution you are looking for. Yes, "solution" is one of those "oh my God I'm going to pay through the nose" indicator words, but it really applies here. Memcached is a framework whereas StateServer has already prepackaged most features you would need to add through your own programming efforts.
Why use a distributed cache? Because it combines the holly quadrinity of computing: better performance, linear scalability, high availability, and fast application development. Performance is better because data is accessed from memory instead of through a database to a disk. Scalability is linear because as more servers are added data is transparently load balanced across the servers so there is an automated in-memory sharding. Availability is higher because multiple copies of data are kept in memory and the entire system reroutes on failure. Application development is faster because there's only one layer of software to deal with, the cache, and its API is simple. All the complexity is hidden from the programmer which means all a developer has to do is get and put data.
StateServer follows the RAM is the new disk credo. Memory is assumed to be the system of record, not the database. If you want data to be stored in a database and have the two kept in sync, then you'll have to add that layer yourself. All the standard memcached techniques should work as well for StateServer. Consider however that a database layer may not be needed. Reliability is handled by StateServer because it keeps multiple data copies, reroutes on failure, and has an option for geographical distribution for another layer of added safety. Storing to disk wouldn't make you any safer.
Via email I asked them a few questions. The key question was how they stacked up against Memcached? As that is surely one of the more popular challenges they would get in any sales cycle, I was very curious about their answer. And they did a great job differentiation themselves. What did they say?
First, for an in-depth discussion of their technology take a look ScaleOut Software Technology, but here a few of the highlights:
- Development Edition: No Charge
- Professional Edition: $1,895 for 2 servers
- Data Center Edition: $71,995 for 64 servers
- GeoServer Option First two data centers $14,995, Each add'l data center $7,495.
- Support: 25% of software license fee
Some potential negatives about ScaleOut StateServer:
In the next section the headings are my questions and the responses are from ScaleOut Software.
Why use ScaleOut StateServer instead of Memcached?
I've [Dan McMillan, VP Sales] included some data points below based on our current understanding of the Memcached product. We don't use and haven't tested Memcached internally, so this comparison is based in part upon our own investigations and in part what we are hearing from our own customers during their evaluation and comparisons. We are aware that Memcached is successfully being used on many large, high volume sites. We believe strong demand for ScaleOut is being driven by companies that need a ready-to-deploy solution that provides advanced features and just works. We also hear that Memcached is often seen as a low cost solution in the beginning, but development and ongoing management costs sometimes far exceed our licensing fees.
What sets ScaleOut apart from Memcached (and other competing solutions) is that ScaleOut was architected from the ground up to be a fully integrated and automated caching solution. ScaleOut offers both scalability and high availability, where our competitors typically provide only one or the other. ScaleOut is considered a full-featured, plug-n-play caching solution at a very reasonable price point, whereas we view Memcached as a framework in which to build your own caching solution. Much of the cost in choosing Memcached will be in development and ongoing management. ScaleOut works right out of the box.
I asked ScaleOut Software founder and chief architect, Bill Bain for his thoughts on this. He is a long-time distributed caching and parallel computing expert and is the architect of ScaleOut StateServer. He had several interesting points to share about creating a distributed cache by using an open source (i.e. build it yourself) solution versus ScaleOut StateServer.
First, he estimates that it would take considerable time and effort for engineers to create a distributed cache that has ScaleOut StateServer's fundamental capabilities. The primary reason is that the open source method only gives you a starting point, but it does not include most capabilities that are needed in a distributed cache. In fact, there is no built-in scalability or availability, the two principal benefits of a distributed cache. Here is some of the functionality that you would have to build:
In addition to the above, we hope that the fact ScaleOut Software provides a commercial solution that is reasonably priced, supported and constantly improved would be viewed as an important plus for our customers. In many cases, in-house and open source solutions are not supported or improved once the original developer is gone or is assigned to other priorities.
Do you find yourself in competition with the likes of Terracotta, GridGain, GridSpaces, and Coherence type products?
Our ScaleOut technology has previously been targeted to the ASP.Net space. Now that we are entering the Java/Linux space, we will be competing with companies like the ones you mentioned above, which are mainly Java/Linux focused as well.
We initially got our start with distributed caching for ecommerce applications, but grid computing seems to be a strong growth area for us as well. We are now working with some large Wall Street firms on grid computing projects that involve some (very large) grid operations.
I would like to reiterate that we are very focused on data caching only. We don't try to do job scheduling or other grid computing tasks, but we do improve performance and availability for those tasks via our distributed data cache.
What architectures your customers are using with your GeoServer product?
A. GeoServer is a newer, add-on product that is designed to replicate the contents of two or more geographically separated ScaleOut object stores (caches). Typically a customer might use GeoServer to replicate object data between a primary data center site and a DR site. GeoServer facilitates continuous (async.) replication between sites, so if site A goes offline, the other site B is immediately available to handle the workload.
Our ScaleOut technology offers 3 primary benefits: Scalability, performance & high availability. From a single web farm perspective, ScaleOut provides high availability by making either 1 or 2 (this is configurable) replica copies of each master object and storing the replica on an alternate host server in the farm. ScaleOut provides uniform access to the object from any server, and protects the object in the case of a server failure. With GeoServer, these benefits are extended across multiple sites.
It is true that distributed caches typically hold temporary, fast-changing data, but that data can still be very critical to ecommerce, or grid computing applications. Loss of this data during a server failure, worker process recycle or even a grid computation process is unacceptable. We improve performance by keeping the data in-memory, while still maintaining high availability.