Java World Interview on Scalability and Other Java Scalability Secrets
OK, this interview is with me on Java scalability issues. I sound like a bigger idiot than I would like, but I suppose it could have been worse. The Java World folks were very nice and did a good job, so there’s no blame on them :-)
The interview went an interesting direction, but there’s more I’d like add and I will do so here. Two major rules regarding Java and scalability have popped out at me:
The fecundity of the Java ecosystem can most readily be seen with the efforts to tame our multi-core future. There’s a multi-core crisis going in case you haven’t heard. It’s all you’ll hear mentioned in the halls of the Pentagon. The CPU wizards have maxed out on clock speed and the only way we can scale is by adding more cores. And we don't know how to do that. At 100 cores common ways of doing things break down. Locks don't scale. Cache contention for shared memory slows us down. Bandwidth on the bus is limited. TLB for managing more memory is in short supply. And we need more high speed network cards to handle faster CPUs.
And that’s just at the hardware level. It’s worse for programmers. Locking is just a nightmare. I didn't believe that at first. Early in my career I worked on several multi-core systems. I thought everything was cool. Be careful and it will all work out. But work with a group and it all goes to hell. People add functions, locks, take too much time. Problems like deadlock, priority inversion, and high latency all kill a system.
What can we do?
As we’ll see, Java and more importantly the JVM have become a platform for many interesting technologies and scalability patterns. Let’s take a look at few.
How Java affects both Performance and Scalability
Peter Williams in this very informative blog post discusses how Java affects both performance and scalability.The main points are:
Java’s culture encourages practices that scale only to medium size system because it encourages techniques that do not scale well :
- favors multi-threading
- shared state
- vertical scaling
- large monolithic components
- multiple tiers, lots of layers
Alain Penders makes some good points in the comments:
Ron takes an apposing view and says Java is less scalable:
Greg Frank weighs in with some excellent points:
The Top 10 Ways to Botch Enterprise Java Application Scalability and Reliability
This is a wonderful presentation by Oracle’s Cameron Purdy. Here’s a PDF.Cameron was CEO of Tangosol before Oracle bought them out. Tangosol made Coherence, a distributed cache. Cameron is a long time prolific contributor to the Java community. His presentation is a must see. He’s both entertaining and technically excellent.
The main points he makes in the presentation are:
1. Avoid proprietary features/Believe product claims.
2. Assume the network works
3. Use big JVM machine heaps
4. Use a one-size-fits-all architecture
5. Assume disaster-recovery can be added when it becomes necessary
6. Abuse Abstractions/Avoid abstractions
7. Introduce a single point of bottleneck/Introduce a single point of failure
8. Abuse the database/Avoid the database
9. Assume you are smarter than the infrastructure/Follow the rules blindly
10. Optimize performance assuming that it will translate to scalability/Ignore the potential impact of performance on scalability (and vice-versa).
Remember, iff these seem backwards remember these are botching strategies. You'll want to see the presentation to see each point fleshed out in more detail.
Azul
Azul is a Java Compute Appliance and is the ultimate scale-up play for Java. It kind of does what Google App Engine does at the framework level but does it to the JVM at the hardware level. Current standard practice is to deploy Java application across a cluster of commodity servers. Azul does the opposite. It goes big. The most recent release can contain up to 864 processor cores and 768 GB of memory. That’s big.
Azul transparently runs unmodified Java applications on their specialized hardware platform which allows even the most mild mannered of Java apps to scale. Their hardware-assisted garbage collector dramatically reduces application pauses and gives access to hundred of gigabytes of RAM.
Some very impressive performance improvements are possible. In one case study Breakthrough Scalability of an Application Constrained to an x86 Server, an application was given access to 384 cores and 128 GB of memory on an Azul compute appliance. The result was a 45x improvements in scalability.
Scalability was increased along a number of dimensions (quoted from their article):
If Azul is so cool why aren’t all applications being run on Azul? Buying a completely proprietary hardware platform is too big a risk without a gigantic throbbing pain point. I would like to see Azul open their own cloud so we could get in with low cost and risk.
X10
X10 is not just an inexpensive home automation control system. X10 is also:A type-safe, modern, parallel, distributed object-oriented language intended to be very easily accessible to Java(TM) programmers. It is targeted to future low-end and high-end systems with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in scalable cluster configurations.
A member of the Partitioned Global Address Space (PGAS) family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; constructs for termination detection (finish) and phased computation (clocks); the use of lock-free synchronization (atomic blocks); and the manipulation of global arrays and data structures.
An Eclipse-based Integrated Development Environment (IDE) has been developed at IBM for X10 to help further increase programmer productivity by providing state-of-the-art functionality for viewing, editing, navigating, executing, and manipulating X10 programs.
X10 is built on top of Java. X10 adds:
* Multi-dimensional arrays,
* aggregate operations
* activities (async, future), atomic
* blocks, clocks
* places
* distributed arrays
X10 does not have:
X10 restricts:
The result is hopefully a language that can be scaled across a cluster of mulit-core processors yet still has the familiar Java syntax and is developed using familiar Java development tools like Eclipse.
Clojure, Jruby, Jpython
More of the "it’s the platform stupid." We have many different and interesting languages being built on the JVM platform.Jruby
Jruby is an 100% pure-Java implementation of the Ruby programming language.Jpython
Jpython is an 100% pure-Java implementation of the Python programming language.Clojure
I first came upon Clojure researching software transactional memory (STM) as solution to the problem of how to create easy to write massively parallel programs. STM is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It functions as an alternative to lock-based synchronization. It’s supposed to make writing parallel programs easier. The idea is you can do away with all those nasty locks that cause so many problems. Some have found STM’s performance very disappointing for larger scale applications. And it may ultimately fail simply because everything that inside a transaction boundary is not memory. Programs routinely call out to other services and peripherals. How can STM work in real world environments?STM may not turn out to be the savior of the multi-core world, but Clojure explores some very new Java territory:
Clojure is a dynamic programming language that targets the Java Virtual Machine. It is designed to be a general-purpose language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. Clojure is a compiled language - it compiles directly to JVM bytecode, yet remains completely dynamic. Every feature supported by Clojure is supported at runtime. Clojure provides easy access to the Java frameworks, with optional type hints and type inference, to ensure that calls to Java can avoid reflection.Clojure doesn’t fit my aging mental model. The message-passing actor model of Erlang is more my style. Interestingly the difference between Erlang and Clojure is quite purposeful. Clojure wants to be efficient while operating in the same process rather than taking a message passing hit for every operation. Clojure requires specifying an agent as the receiver of a message where I prefer a more publish-subscribe approach where message senders and consumers are independent. Clojure's use of Java threads makes latency difficult to control. And I'm not sure a S-expression based language can ever become popular.
Clojure is a dialect of Lisp, and shares with Lisp the code-as-data philosophy and a powerful macro system. Clojure is predominantly a functional programming language, and features a rich set of immutable, persistent data structures. When mutable state is needed, Clojure offers a software transactional memory system and reactive Agent system that ensure clean, correct, multithreaded designs.
But these are relatively minor issues compared to the task of making Java safe for parallelism. Java does OOP well enough, but sucks at concurrency. Clojure is a nice middle-ground that may be able to make concurrency-oriented programming by real humans in Java a reality.
GigaSpaces, GridGain, Terracotta
GigaSpaces, GirdGain, and Terracotta take Java objects and fairly transparently turn them into highly distributed in-memory grids with amazing out-of-the-box functionality. If RAM is the new disk these products make it very easy to hop on the next generation architecture.Again, their ability to do so much behind the scenes is based on the power of the JVM. Try doing any of this with C++. Can’t happen. Only in the Java world will you see so much cutting edge innovation.
Open Services Gateway Initiative (OSGi)
One of the major problems with using Java is it’s a pain to deploy a new release across many machines in a data center. Deploying patches and upgrades requires bringing containers down. There’s also isn’t a good way to architect WAR files. Creating one WAR file with multiple services doesn’t work for developers. Creating N WAR files with one service doesn’t scale for containers. And how do you run multiple versions of the same service in the same container?OSGI is a solution that should make dynamic and high availability deployment of Java web services a reality. OSGi is a dynamic module system for Java. Class loading done right. OSGi defines an architecture for developing and deploying modular applications and libraries by creating a microkernel-style architecture. There’s a core set of modules that make up a basic platform and new functionality is dynamically layered in with a plugin. Using OSGi these plugins are isolated, secured and controlled from the rest of code. The unit of deployment is an OSGi bundle, which is simply a JAR file with an OSGi manifest.
This approach allows loosely-coupled application modules to be developed by a team of developers. Everything is kept in-sync using version numbers and module dependency ranges. If you’ve ever worked with Linux this should sound familiar. It’s basically how packages are installed on Linux.
Many Companies are Successfully Using Java
Many companies are using Java on their websites, they just don't use the full stack. Java is the ultimate service implementation language. There’s a trend like Amazon to develop in terms of separate services that are composed together to produce pages. Put up a cluster of applications, load balance between them and you are set. This is a big move for internal architecture. Web services now have external APIs. Those same APIs can be used internally to build your site. Java is great for larger web sites who need to start thinking in terms of services.Each Service has its own domain-specific database (i.e., vertical partitioning).
We’ve covered a lot of ground. We’ve seen how the excesses of old J2EE scalability failures can be routed around with ease using a number of different scalability patterns. We’ve seen some really innovative and amazing products available to Java developers. And we’ve seen a lot of successful websites use Java. Ah, after all that hard work it’s time for another cup of java. Peet’s coffee is my favorite.