Java

Gigaspaces curbs latency outliers with Java Real Time

Today, most banks have migrated their internal software development from C/C++ to the Java language because of well-known advantages in development productivity (Java Platform), robustness & reliability (Garbage Collector) and platform independence (Java Bytecode). They may even have gotten better throughput performance through the use of standard architectures and application servers (Java Enterprise Edition). Among the few banking applications that have not been able to benefit yet from the Java revolution, you find the latency-critical applications connected to the trading floor. Why? Because of the unpredictable pauses introduced by the garbage collector which result in significant jitter (variance of execution time).

In this post Frederic Pariente Engineering Manager at Sun Microsystems posted a summary of a case study on how the use of Sun Real Time JVM and GigaSpaces was used in the context of of a customer proof-of-concept this summer to ensure guaranteed latency per message under 10 msec, with no code modification to the matching engine.

[ANN] New Open Source Cache System

The SHOP.COM Cache System is now available at http://code.google.com/p/sccache/

The SHOP.COM Cache System is an object cache system that...
* is an in-process cache and external, shared Cache
* is horizontally scalable
* stores cached objects to disk
* supports associative keys
* is non-transactional
* can have any size key and any size data
* does auto-GC based on TTL
* is container and platform neutral

It was built in-house at SHOP.COM (by me) and has powered our website for years. We are open-sourcing it in the hope that it will be useful to others and to get some help in its maintenance.

This is our first open source attempt and we'd appreciate any help and comments.

Scaling MySQL on a 256-way T5440 server using Solaris ZFS and Java 1.7

How to scale MySQL on a 32 core system with 256 threads? Diagonal scalability in a box.
An impressive benchmark that achieved more than 79,000 SQL queries per second on a single 4 RU server! Is this real? If so what is the role of good old horizontal scalability?

The goals of the benchmark:

  1. Reach a high throughput of SQL queries on a 256-way Sun SPARC Enterprise T5440
  2. Do it 21st century style i.e. with MySQL and ZFS , not 20th century style i.e with OraSybInf... and VxFS
  3. Do it with minimal tuning i.e as close as possible as out-of-the-box
Todd Hoff's picture

Java World Interview on Scalability and Other Java Scalability Secrets

OK, this interview is with me on Java scalability issues. I sound like a bigger idiot than I would like, but I suppose it could have been worse. The Java World folks were very nice and did a good job, so there’s no blame on them :-)

The interview went an interesting direction, but there’s more I’d like add and I will do so here. Two major rules regarding Java and scalability have popped out at me:

  • Java – It’s the platform stupid. Java the language isn’t the big win. What is the big win is the ecosystem building up around the JVM, libraries, and toolsets.
  • Java – It’s the community stupid. A lot of creativity is being expended on leveraging the Java platform to meet scalability challenges. The amazing community that has built up around Java is pushing Java to the next level in almost every direction imaginable.

    The fecundity of the Java ecosystem can most readily be seen with the efforts to tame our multi-core future. There’s a multi-core crisis going in case you haven’t heard. It’s all you’ll hear mentioned in the halls of the Pentagon. The CPU wizards have maxed out on clock speed and the only way we can scale is by adding more cores. And we don't know how to do that. At 100 cores common ways of doing things break down. Locks don't scale. Cache contention for shared memory slows us down. Bandwidth on the bus is limited. TLB for managing more memory is in short supply. And we need more high speed network cards to handle faster CPUs.

    And that’s just at the hardware level. It’s worse for programmers. Locking is just a nightmare. I didn't believe that at first. Early in my career I worked on several multi-core systems. I thought everything was cool. Be careful and it will all work out. But work with a group and it all goes to hell. People add functions, locks, take too much time. Problems like deadlock, priority inversion, and high latency all kill a system.

    What can we do?

  • Sun's High-Performance and Reliable Web Proxy Solution

    As individuals and businesses depend on the Web more than ever to conduct business, rapid and reliable content retrieval is critical. Reducing wait time improves productivity and increases user satisfaction. Web proxy technology has emerged as an effective solution to improve performance, help ensure content availability and enhance network security by caching and filtering Web content. The combination of Sun SPARC Enterprise servers with CoolThreads technology and the Sun Java System Web Proxy Server software provides a compelling foundation for a robust Web proxy solution. Sun SPARC Enterprise T1000 and T2000 servers include the UltraSPARC T1 processor with CoolThreads technology, offering six or eight cores with four threads per core. The Sun Java System Web Proxy Server software is highly threaded and takes advantage of the large number of threads supported by Sun UltraSPARC T1 processors with CoolThreads technology. Together, these products provide a highly scalable solution that accommodates a large number of requests, addresses peak loads, and provides future headroom for growth. This document explores the use of a Sun SPARC Enterprise T1000 server and the Sun Java System Web Proxy Server software as a replacement for an existing Web proxy implementation that used the SQUID Web proxy server software deployed on x86 servers.

    Todd Hoff's picture

    Google Architecture

    Update 2: Sorting 1 PB with MapReduce. PB is not peanut-butter-and-jelly misspelled. It's 1 petabyte or 1000 terabytes or 1,000,000 gigabytes. It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers and the results were replicated thrice on 48,000 disks.
    Update: Greg Linden points to a new Google article MapReduce: simplified data processing on large clusters. Some interesting stats: 100k MapReduce jobs are executed each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory.

    Google is the King of scalability. Everyone knows Google for their large, sophisticated, and fast searching, but they don't just shine in search. Their platform approach to building scalable applications allows them to roll out internet scale applications at an alarmingly high competition crushing rate. Their goal is always to build a higher performing higher scaling infrastructure to support their products. How do they do that?

    Olio Web2.0 Toolkit - Evaluate Web Technologies and Tools

    How do you evaluate and decide which web technologies (and there are myriads out there) to use for your new web application, which one potentially gives you the best performance, which one will likely give you the shortest time-to-market? The Apache incubator project Olio might help.

    Olio is a is an open source web 2.0 toolkit to help evaluate the suitability, functionality and performance of web technologies. Olio defines an example web2.0 application (an events site somewhat like yahoo.com/upcoming) and provides three initial implementations : PHP, Java EE and RubyOnRails (ROR). The toolkit also defines ways to drive load against the application in order to measure performance.

    Apache Olio could be used to

    • Understand how to use various web 2.0 technologies such as AJAX, memcached, mogileFS etc. Use the code in the application to understand the subtle complexities involved and how to get around issues with these technologies.
    • Evaluate the differences in the three implementations: php, ruby and java to understand which might best work for your situation.
    • Within each implementation, evaluate different infrastructure technologies by changing the servers used (e.g: apache vs lighttpd, mysql vs postgre, ruby vs Jruby etc.)
    • Drive load against the application to evaluate the performance and scalability of the chosen platform.
    • Experiment with different algorithms (e.g. memcache locking, a different DB access API) by replacing portions of code in the application.

    Olio started it's life as the web2.0kit developed by Sun Microsystems in colloboration with U.C. Berkeley RAD Lab and was presented on Velocity2008.

    Server load balancing architectures, Part 1: Transport-level load balancing

    Server farms achieve high scalability and high availability through server load balancing, a technique that makes the server farm appear to clients as a single server. In this two-part article, Gregor Roth explores server load balancing architectures, with a focus on open source solutions. Part 1 covers server load balancing basics and discusses the pros and cons of transport-level server load balancing.

    The barrier to entry for many Internet companies is low. Anyone with a good idea can develop a small application, purchase a domain name, and set up a few PC-based servers to handle incoming traffic. The initial investment is small, so the start-up risk is minimal. But a successful low-cost infrastructure can become a serious problem quickly. A single server that handles all the incoming requests may not have the capacity to handle high traffic volumes once the business becomes popular. In such a situations companies often start to scale up: they upgrade the existing infrastructure by buying a larger box with more processors or add more memory to run the applications.

    Read the rest of the article on JavaWorld.

    Need help with your Hadoop deployment? This company may help!

    A group of top Silicon Valley engineers (ex-Yahoo, Facebook, Google) have come together to launch a new startup called Cloudera.
    Not yet launched, it intends to help other companies adopt a promising software platform called Hadoop.

    Hadoop is an open-source software project (written in Java) designed to let developers write and run applications that process huge amounts of data. While it could potentially improve a wide range of other software, the ecosystem supporting its implementation is still developing. Which is where Cloudera hopes to make a place for itself.

    More on Hadoop: It uses the Google-introduced MapReduce systems framework that divides applications into small blocks of work, creating multiple replicas of data blocks that it places on various computer nodes.

    It is already in use at large companies like Yahoo.
    Read more about Cloudera here.

    EE-Appserver Clustering OR Terracota OR Coherence OR something else?

    Hi,

    I am very glad that this site exists, as I have learned more about clustering on this site than for quite some time reading stuff elsewhere. Oftentimes, one can find lots of material about clustering, but the practical real-life information is missing. Not so wih this site.

    I am currently planning the development of an application which has a lot of enterprise features and requirements. On the other side (if the tiny chance of success might strike us), this application would not be an in-house application of a financial institution, or something like that, but some kind of communit/web 2.0 web site. Thus it is an enterprise application with (hopefully, but surely unlikely) the user numbers of a social networking site. Each user initiated transaction involves huge resssources business logic wise (including insane amounts of encryption oprations).

    Of course, I do not intend to induldge into premature scaling, but to invest every minute I have into the implementation of business logic features. Nevertheless, I do not want to make some extremely bad choices which would force a complete reimplementation straight after the first tiny success - i.e. I want to start with the right technology and architecture, but wait with the implementation of the scalability and high availyability features.

    Because of the enterprise aspects of this software, my first thought was to use Java SE 6 and Java EE 5 technologies only in order to get all the JEE features and to be vendor independent at the same time.

    For implementation and testing purposes I thought of Glassfish v2UR2, Postgresql 8.3 and Solaris 10. As all of the major JEE-Appserver vendors advertise the clustering capabilities, I thought that this could not be a bad move. Hopefully, Glassfish would provide HA and scalability, if not there would always be Geronimo, JBoss, Weblogic, or Websphere.

    Now it seems that there are vast differences between different products:
    - JEE-Application servers are scaling only to some degree(?). It seems that JEE is almost exclusively used for enterprise applications like SAP ERP or applications at financial institutions? Therefore, there is no need for extreme scalability.
    - Terracotta seems to be very nice, as one do not have to learn the insanely huge JEE-technology stack, but can just write a mostly Java-SE-only threaded application(?). But Terracotta does not seem to scale very well either (bottleneck with write-operations caused by the master-worker architecture?) and we would be dependend on the future of the Terracotta Corporation. JEE on the other side is vendor neutral.
    - Oracle Coherence. This product seems to be the best distributed caching product and the holy grail of scalability(?). But it is oracle-expensive. Absolutely nothing for a tiny start-up with no financing. JEE is vendor neutral and thus possibly much cheaper.

    Do you think that it is possible that one could produce a JEE-Architecture which could provide massive scalability (many hundreds of AppServer) using only the Glassfish clustering features?

    Or am I on a completely wrong track? Do we have to plan for Oracle Coherence usage? Are there other possibilities?

    Thanks a lot for any opinions or hints!

    regards,
    mike

    Product: Terracotta - Open Source Network-Attached Memory

    Update: Evaluating Terracotta by Piotr Woloszyn. Nice writeup that covers resilience, failover, DB persistence, Distributed caching implementation, OS/Platform restrictions, Ease of implementation, Hardware requirements, Performance, Support package, Code stability, partitioning, Transactional, Replication and consistency.

    Terracotta is Network Attached Memory (NAM) for Java VMs. It provides up to a terabyte of virtual heap for Java applications that spans hundreds of connected JVMs.

    NAM is best suited for storing what they call scratch data. Scratch data is defined as object oriented data that is critical to the execution of a series of Java operations inside the JVM, but may not be critical once a business transaction is complete.

    The Terracotta Architecture has three components:

    1. Client Nodes - Each client node corresponds to a client node in the cluster which runs on a standard JVM
    2. Server Cluster - java process that provides the clustering intelligence. The current Terracotta implementation operates in an Active/Passive mode
    3. Storage used as
      • Virtual Heap storage - as objects are paged out of the client nodes, into the server, if the server heap fills up, objects are paged onto disk
      • Lock Arbiter - To ensure that there is no possibility of the classic "split-brain" problem, Terracotta relies on the disk infrastructure to provide a lock.
      • Shared Storage - to transmit the object state from the active to passive, objects are persisted to disk, which then shares the state to the passive server(s).

    JVM-level clustering can turn single-node, multi-threaded apps into distributed, multi-node apps, often with no code changes. This is possible by plugging in to the Java Memory Model in order to maintain key Java semantics of pass-by-reference, thread coordination and garbage collection across the cluster. Terracotta enables this using only declarative configuration with minimal impact to existing code and provides fine-grained field-level replication which means your objects no longer need to implement Java serialization.

    Ari Zilka, the founder and CTO of Terracotta had a
    video session
    organized by Skills Matter. He will show you how it works and how you can start clustering your POJO-based Web applications (based on Spring, Struts, Wicket, RIFE, EHCache, Quartz, Lucene, DWR, Tomcat, JBoss, Jetty or Geronimo etc.).

    Todd Hoff's picture

    Ehcache - A Java Distributed Cache

    Ehcache is a pure Java cache with the following features: fast, simple, small foot print, minimal dependencies, provides memory and disk stores for scalability into gigabytes, scalable to hundreds of caches
    is a pluggable cache for Hibernate, tuned for high concurrent load on large multi-cpu servers, provides LRU, LFU and FIFO cache eviction policies, and is production tested. Ehcache is used by LinkedIn to cache member profiles. The user guide says it's possible to get at 2.5 times system speedup for persistent Object Relational Caching, a 1000 times system speedup for Web Page Caching, and a 1.6 times system speedup Web Page Fragment Caching.
    From the website:

    Todd Hoff's picture

    eBay Architecture

    Update 2: EBay's Randy Shoup spills the secrets of how to service hundreds of millions of users and over two billion page views a day in Scalability Best Practices: Lessons from eBay on InfoQ. The practices: Partition by Function, Split Horizontally, Avoid Distributed Transactions, Decouple Functions Asynchronously, Move Processing To Asynchronous Flows, Virtualize At All Levels, Cache Appropriately.
    Update: eBay Serves 5 Billion API Calls Each Month. Aren't we seeing more and more traffic driven by mashups composed on top of open APIs? APIs are no longer a bolt on, they are your application. Architecturally that argues for implementing your own application around the same APIs developers and users employ.

    Who hasn't wondered how eBay does their business? As one of the largest most loaded websites in the world, it can't be easy. And the subtitle of the presentation hints at how creating such a monster system requires true engineering: Striking a balance between site stability, feature velocity, performance, and cost.

    You may not be able to emulate how eBay scales their system, but the issues and possible solutions are worth learning from.

    Todd Hoff's picture

    Yandex Architecture

    Update: Anatomy of a crash in a new part of Yandex written in Django. Writing to a magic session variable caused an unexpected write into an InnoDB database on every request. Writes took 6-7 seconds because of index rebuilding. Lots of useful details on the sizing of their system, what went wrong, and how they fixed it.

    Yandex is a Russian search engine with 3.5 billion pages in their search index. We only know a few fun facts about how they do things, nothing at a detailed architecture level. Hopefully we'll learn more later, but I thought it would still be interesting. From Allen Stern's interview with Yandex's CTO Ilya Segalovich, we learn:

    Todd Hoff's picture

    Mailinator Architecture

    Update: A fun exploration of applied searching in How to search for the word "pen1s" in 185 emails every second. When indexOf doesn't cut it you just trie harder.

    Has a drunken friend ever inspired you to create a first of its kind internet service that is loved by millions, deemed subversive by thousands, all while handling over 1.2 billion emails a year on one rickity old server? That's how Paul Tyma came to build Mailinator.

    Mailinator is a free no-setup web service for thwarting evil spammers by creating throw-away registration email addresses. If you don't give web sites you real email address they can't spam you. They spam Mailinator instead :-)

    I love design with a point-of-view and Mailinator has a big giant harry one: performance first, second, and last. Why? Because Mailinator is free and that allows Paul to showcase his different perspective on design. While competitors buy big Iron to handle load, Paul uses a big idea instead: pick the right problem and create a design to fit the problem. No more. No less. The result is a perfect system architecture sonnet, beauty within the constraints of form.

    How does Mailinator carry out its work as a spam busting super hero?

    Future of EJB3 !! ??

    What is the future of EJB3 in the industry , given the current trends ?
    There are a lot of arguments regarding EJB3 being heavy weighted .....
    Also, what could be the alternatives of EJB3 ?
    How about the scalability, persistence, performance and other factors ?

    Database-Clustering: a8cjdbc - update: version 1.3

    The new version of a8cjdbc finished some limitations. Now Clobs and Blobs are supported, and some fixes using binary data. The version was also fully tested with Postgres and mySQL.

    Since Version 1.3 there is also a free trail version for download available. Check it out and test yourself...

    Take a look at: http://www.activ8.at/homepage/en/a8cjdbc.php

    I've downloaded the latest version and setup a environment with one virtual database and two database backends.
    I tried to make a "non real life szenario": The first backend was a Postgres node, the second was a mySQL node.
    Everything works fine - failover - recoverylog, etc... with to different backend database types.

    So check out the trial version and test yourself the clustered driver and give me some results about your experience with a8cjdbc.
    As I only tested mySQL and Postgres (and the non real life szenario with two different backend types) - maybe someone else have experiences with out databases?

    greetings
    Wolfgang

    a8cjdbc - update verision 1.3

    The new version of a8cjdbc finished some limitations. Now Clobs and Blobs are supported, and some fixes using binary data. The version was also fully tested with Postgres and mySQL.

    Since Version 1.3 there is also a free trail version for download available. Check it out and test yourself...

    Take a look at: http://www.activ8.at/homepage/en/a8cjdbc.php

    I've downloaded the latest version and setup a environment with one virtual database and two database backends.
    I tried to make a "non real life szenario": The first backend was a Postgres node, the second was a mySQL node.
    Everything works fine - failover - recoverylog, etc... with to different backend database types.

    So check out the trial version and test yourself the clustered driver and give me some results about your experience with a8cjdbc.
    As I only tested mySQL and Postgres (and the non real life szenario with two different backend types) - maybe someone else have experiences with out databases?

    greetings
    Wolfgang