Product

Product: Infinispan - Open Source Data Grid

Todd Hoff

07 Sep 2009 — 2 min read

Infinispan is a highly scalable, open source licensed data grid platform in the style of GigaSpaces and Oracle Coherence.

From their website:

The purpose of Infinispan is to expose a data structure that is highly concurrent, designed ground-up to make the most of modern multi-processor/multi-core architectures while at the same time providing distributed cache capabilities. At its core Infinispan exposes a JSR-107 (JCACHE) compatible Cache interface (which in turn extends java.util.Map). It is also optionally is backed by a peer-to-peer network architecture to distribute state efficiently around a data grid.

Offering high availability via making replicas of state across a network as well as optionally persisting state to configurable cache stores, Infinispan offers enterprise features such as efficient eviction algorithms to control memory usage as well as JTA compatibility.

In addition to the peer-to-peer architecture of Infinispan, on the roadmap is the ability to run farms of Infinispan instances as servers and connecting to them using a plethora of clients - both written in Java as well as other popular platforms.

A few observations:

Open source is an important consideration, depending on your business model. As you scale out your costs don't go up. The downside is you'll likely put in more programming effort to implement capabilities the commercial products have already solved.

It's from the makers of Jboss Cache so it's likely to have a solid implmentation, even so early in it's development cycle. The API looks very well thought out.

Java only. Plan is to add more bindings in the future.

Distributed hash table only. Commercial products have very advanced features like distributed query processing which can make all the difference during implementation. We'll see how the product expands from its caching roots into a full fledged data manipulation platform.

MVCC and a STM-like approach provide lock- and synchronization-free data structures. This means dust off all those non-blocking algorithms you've never used before. It will be very interesting to see how this approach performs under real-life loads programmed by real-life programmers not used to such techniques.

Data is made safe using a configurable degree of redundancy. State is distributed across a cluster. And it's peer-to-peer, there's no central server.

API based (put and get operations). XML, bytecode manipulation and JVM hooks aren't used.

Future plans call for adding a compute-grid for map-reduce style operations.

Distributed transactions across multiple objects are supported. It also offers eviction strategies to ensure individual nodes do not run out of memory and passivation/overflow to disk. Warm-starts using preloads are also supported.

It's exciting to have an open source grid alternative. It will be interesting to see how Infinispan develops in quality and its feature set. Making a mission critical system of this type is no simple task.

I don't necessarily see Infinispan as just a competitor for obvious players like GigaSpaces and Coherence, it may play even more strongly in the NoSQL space. For people looking for a reliable, highly performant, scalable, transaction aware hash storage system, Ininispan may look even more attractive than a lot of the disk based systems.

Video Interview with Manik Surtani, Founder & Project Lead at JBoss Cache, Infinispan Data Grid

Infinispan Interview by Mark Little on InfoQ.

Are Cloud Based Memory Architectures the Next Big Thing?

Infinispan - data grids meets open source on TheServerSide.com

Technical FAQs

Anti-RDBMS: A list of distributed key-value stores

Infinispan Wiki

Distribution instead of Buddy Replication