
Batoo JPA - The new JPA Implementation that runs over 15 times faster...

This post is by Hasan Ceylan, an Open Source software enthusiast from Istanbul.

I loved JPA 1.0 back in the early 2000s. I started using it together with EJB 3.0 even before the stable releases. I loved it so much that I contributed bits and pieces to the JBoss 3.x implementations.

Those were the days when our company was still fairly small. Creating new features and applications had higher priority than performance, because we had a lot of ideas and needed to develop and market them as fast as we could. Now we no longer needed to write tedious and error-prone XML descriptions for the data model and deployment descriptors. Nor did we need to use the curse called “XDoclet”.

Meanwhile, our company grew steadily and our web site became the top portal in the country for live events and ticketing. We now had performance problems! Although the company grew considerably, due to the economics of the industry we did not make a lot of money. The challenge was that we were a ticketing company. Every e-commerce business has high and low seasons, but ticketing has low seasons and high hours. While you sell an average of x tickets an hour, when a blockbuster event goes on sale demand suddenly becomes thousands of times x for an hour. Welcome to hell!

We worked day and night to tweak and enhance the application, using whatever was available to keep it up on a big day. To be frank, there was always a bigger event capable of bringing the site down no matter how hard we tried.

The dream was over. I came to realize that developing applications on top of frameworks calls for a bit of “be careful!” along with the “fun”.

I Kept Learning

I loved programming, I loved Java, I loved open source. I developed almost every possible type of application on every platform I could. For the rest, I went in and discovered stuff. I learned a lot from the masters thanks to open source. Unlike most, I read articles and code written by great programmers like Linus Torvalds, Gavin King, Ed Merks and so many others.

With the experience I had gathered, I quit the ticketing company I loved and became a software consultant. This opened a new era for me, with a lot of different industries and platforms.

In each project I became the performance police of the application.

I am now the performance freak!

I Took The Red Pill!

One day I asked myself: could JPA be faster? If yes, how fast could it be? I spent about two weeks creating an entity manager that persisted and loaded entities. Then I ran it and compared the results to those of Hibernate. The results were not really promising: I was only about 50% faster than Hibernate in persisting and finding entities. I spent another week tweaking the loops, caching metamodel chunks, changing access to classes from interfaces to abstract classes, replacing lists with arrays, and so many other things. Suddenly I had a prototype that was 50+ times faster than Hibernate!

Development of Batoo JPA

I was astonished by how drastically performance went up just by paying attention to performance-centric coding. Until then I had been using VisualVM to measure the time spent in the JPA layer. I got down and wrote a self-profiling tool that measured the CPU resources spent in the JPA layer, and started implementing every aspect of the JPA 2.0 specification. After each iteration I re-ran the benchmark, and when performance dropped considerably I went back to the changes and inspected the new code line by line; the profiling tool I created reported the performance hit of every line of the JPA stack.

It took about 6 months to implement the specification as a whole. On top of it, I introduced a Maven plugin to perform bytecode instrumentation at build time and a complementary Eclipse plugin to allow the use of instrumentation in the Eclipse IDE.

After a gestation of 6 months, Batoo JPA was born in August 2012. It measured over 15 times faster than Hibernate.


As stated earlier, a benchmark was used to measure every micro development iteration of Batoo JPA. This benchmark was not created to showcase the areas where Batoo JPA is fast so that others would believe in it; it was created to bring together the most common domain model and persistence operations found in almost every JPA application, so that I knew how fast Batoo JPA was.

Performance Metrics

The scenario is:

  • A Person object
    • With phone numbers - PhoneNumber object
    • With addresses - Address object
      • That point to a country - Country object
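As a rough illustration of the scenario above, here is a minimal plain-Java sketch of the four types. It is not the benchmark's actual source: field names such as `name`, `city` and `number`, and the helper `samplePerson`, are assumptions made for the example. In a real JPA application these classes would also carry mapping annotations such as `@Entity`, `@OneToMany` and `@ManyToOne`, which are left out here so the sketch compiles without a JPA library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java version of the benchmark domain model.
// Field names are illustrative; the article only names the four types.

final class Country {
    final String name;
    Country(String name) { this.name = name; }
}

final class Address {
    final String city;
    final Country country; // every Address points to a Country
    Address(String city, Country country) {
        this.city = city;
        this.country = country;
    }
}

final class PhoneNumber {
    final String number;
    PhoneNumber(String number) { this.number = number; }
}

final class Person {
    final String name;
    final List<PhoneNumber> phoneNumbers = new ArrayList<>();
    final List<Address> addresses = new ArrayList<>();
    Person(String name) { this.name = name; }
}

final class DomainModelSketch {
    // Builds a Person with two phone numbers and two addresses,
    // the shape used by the persist task of the benchmark.
    static Person samplePerson(String name, Country country) {
        Person person = new Person(name);
        person.phoneNumbers.add(new PhoneNumber("555-0001"));
        person.phoneNumbers.add(new PhoneNumber("555-0002"));
        person.addresses.add(new Address("Istanbul", country));
        person.addresses.add(new Address("Ankara", country));
        return person;
    }

    public static void main(String[] args) {
        Person p = samplePerson("Sample Person", new Country("Turkey"));
        System.out.println(p.name + ": " + p.phoneNumbers.size()
                + " phone numbers, " + p.addresses.size() + " addresses");
    }
}
```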

Common life-cycle tasks have been introduced:

  • Persist 100K person objects with two phone numbers and two addresses in lots of 10 per session
  • Locate and load 250K person objects in lots of 10 per session
  • Remove 5K person objects in lots of 5 per session
  • Update 100K person objects in lots of 100
  • Query person objects 25K times using the object-oriented Criteria API.
  • Query person objects 25K times using JPQL - Java Persistence Query Language, an SQL-like query scripting language.
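To get a feel for the size of this workload, the lot sizes above can be turned into session counts. The sketch below is only arithmetic on the figures in the list; the class and method names are hypothetical, and it assumes that the update task's "lots of 100" also means one lot per session, which the list does not state explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WorkloadPlan {
    // Sessions needed to process `total` objects in lots of `lotSize`, rounded up.
    static int sessions(int total, int lotSize) {
        if (lotSize <= 0) {
            throw new IllegalArgumentException("lotSize must be positive");
        }
        return (total + lotSize - 1) / lotSize;
    }

    // Session counts for the four lot-based tasks listed in the article.
    static Map<String, Integer> sessionsPerTask() {
        Map<String, Integer> plan = new LinkedHashMap<>();
        plan.put("persist", sessions(100_000, 10));
        plan.put("locate", sessions(250_000, 10));
        plan.put("remove", sessions(5_000, 5));
        plan.put("update", sessions(100_000, 100)); // assumes one lot per session
        return plan;
    }

    public static void main(String[] args) {
        sessionsPerTask().forEach((task, count) ->
                System.out.println(task + ": " + count + " sessions"));
    }
}
```

Under these assumptions the persist task alone opens 10,000 sessions and the locate task 25,000, so per-session overhead in the JPA layer is exercised heavily.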

For the sake of simplicity, the benchmark was run on top of in-memory embedded Derby with the profiler slicing the times spent at the

  • Unit Test Layer
  • JPA Layer
  • Derby Layer

The time spent in the Unit Test Layer is omitted from the results as irrelevant.


The times given in the tables below are milliseconds spent in the JPA layer while running the benchmark scenario. The same tests are run for Batoo JPA and Hibernate in separate runs to isolate the effects of boot, memory, cache, garbage collection, etc.

The tables below show

  • the total time spent at Derby Layer as DB Operation Total
  • the type of the test as Test
  • the times for each test at Derby Layer as DB Operation
  • the times for each test at JPA Layer as Core Operation
  • the total time spent at JPA Layer as Core Operation Total
  • the total time spent at both JPA and Derby Layers as Operation Total

Below are the ratios of CPU resources spent by Hibernate and Batoo JPA. It is assumed that an application generates, on average, 1 save, 5 locates, 2 removes and 3 updates, plus 5 + 5 for a total of ten queries. Although these numbers are extremely dependent on the nature of the application, some assumption is needed to compare overall speed.
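The article does not spell out how the operation mix is applied, so the following is only one plausible reading: weight each operation's per-call cost by its share in the mix and compare the weighted sums. The names `WEIGHTS`, `weightedCost` and `speedup` are hypothetical, the split of "5 + 5" into Criteria and JPQL queries is an assumption, and the values in `main` are placeholders, not measured results.

```java
final class WeightedComparison {
    // Assumed order: save, locate, remove, update, Criteria query, JPQL query.
    static final double[] WEIGHTS = {1, 5, 2, 3, 5, 5};

    // Weighted sum of per-operation costs (any consistent time unit).
    static double weightedCost(double[] perOperationCost) {
        if (perOperationCost.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " costs");
        }
        double total = 0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            total += WEIGHTS[i] * perOperationCost[i];
        }
        return total;
    }

    // How many times cheaper the candidate is than the baseline for this mix.
    static double speedup(double[] baseline, double[] candidate) {
        return weightedCost(baseline) / weightedCost(candidate);
    }

    public static void main(String[] args) {
        // Placeholder per-operation costs, NOT figures from the benchmark.
        double[] baseline = {2, 4, 6, 8, 10, 12};
        double[] candidate = {1, 2, 3, 4, 5, 6};
        System.out.println("speedup = " + speedup(baseline, candidate));
    }
}
```

A different mix of operations would change the overall figure, which is why the article stresses that the weights are an assumption.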

Given the scenario above, Batoo JPA measures over 15 times faster than Hibernate - the leading JPA implementation.

As you may have noticed, Batoo JPA not only performs insanely fast in the JPA layer, it also employs a number of optimizations to relieve the pressure on the database. This is why Batoo JPA spends half the time of Hibernate in the DB layer.

Interpretation of Results

We do appreciate that JPA is not the only part of an application. But we do believe that current JPA implementations consume quite a bit of your server budget. Since a typical application cluster spends about 20% to 40% of its CPU resources on the persistence layer, Batoo JPA may well be able to bring your cluster down to half its size, allowing you to save a lot on licensing, administration and hardware, and leaving room to scale up even non-cluster-friendly applications. In my experience, I have seen applications running on 96-core Solaris systems simply because they are not scalable.


We have managed to create a JPA product that lets you enjoy the great features of JPA technology without requiring you to compromise on performance!

On top of that, Batoo JPA is developed using the Apache coding standards and has valuable documentation within the code. The project codebase is released under the LGPL license, there is absolutely no closed-source part, and we envision that it will stay that way forever.

As stated earlier, it also has complementary Maven and Eclipse plugins to provide instrumentation for the build and development phases.

Batoo JPA deviates almost not at all from the specification, making it easy to migrate existing JPA applications to Batoo JPA, with no additional learning phase required to start using it.

Last but not least, Batoo JPA saves you time not only when you run your application but also when you deploy it. Batoo JPA employs parallel deployment managers to handle deployment in parallel. Considering that a developer deploys the application well over 10 times a day during development, if not 100, with a moderately large domain model this can add up to quite a bit of developer time. Although we haven’t made a concrete benchmark of deployment speed, we know that Batoo JPA deploys about 3 to 4 times faster than Hibernate.




Reader Comments (37)

I totally disagree; basing your communication on an in-memory database is a strong choice. It doesn't reflect at all how a real production system will behave.
In order to validate your numbers, you must put users in front of real-life configurations.

Here are my final conclusions when comparing the implementations in a real-life QA environment, with very long benchmark settings.
Both implementations offer much the same response times.
In terms of CPU usage, I'm getting these results:
- 1.5% of the global system CPU capacity with Hibernate
- 1.25% of the global system CPU capacity with Batoo

All the details are available in my blog.

By global system, I mean only the test itself. So apart from the JPA internals, there is really nothing more.

At the same time, the database CPU is at 80%+.

Explain to me why, in these conditions, users would care about these extreme tunings you are making on every single line of code.
Is this interesting? Yes
Should we respect that? Yes
Is this useful? No, sorry, this is not solving real performance issues at all.
Is it important when comparing to Hibernate? No. To solve real performance issues you need features, which you don't have many of for now; Hibernate has tons of features that can help.

I'll change my mind if you publish a test case with:
- a remote database config
- mapping, config, and APIs well used
If you manage to hit a CPU issue because of Hibernate BEFORE hitting the database crash, I'll trust you.

Best regards,

October 23, 2012 | Unregistered Commenter Anthony Patricio


About the in-memory database, I think we have to agree to disagree. On the other hand, you can easily change the datasource to MySQL (or any local or remote database). Why not make that change and post the stats here? I can certainly do that for you, but the idea here is that you run the tests and share the results. I do not know if you have subscribed to notifications here, but just before you sent your comments, I released another full application that again demonstrates the performance difference between the two. For reference, it is in the Batoo JPA repository and is called HelloRest...

On your conclusions:

You are again stuck on response times. I never mentioned response times in my article, nor is the focus here on response times. Yes, there are some applications that make a lot of database iterations and can be slow in response times, but this is usually due to bad architecture, incorrect domain model mappings, etc. Again, you base your assertion on a single-user system. I would love to see your QA test with concurrent requests; then we can come back to response times. As the number of requests increases, Hibernate will starve more and more on its CPU-intensive operations and performance will drop. You can see this again in the HelloJSF and HelloRest test applications.

Back to your single-user QA test... While Batoo JPA comes with a full-blown profiler that measures CPU time spent down to the line number, you run a test yourself based on vmstat... I really do not want to comment on that, let alone on measuring with numbers in the range of 0% to 1.5%... I think most people are aware that any Java application server has services that run in the background, and some of them use static CPU resources, meaning they do not increase as the load on the system increases (a system with more than one user on it), e.g. deployers, connection pool managers, log rotators, etc.

Those services will consume x CPU no matter what, so what you should really be comparing is (1.5 - x) / (1.25 - x).
Again, I like to work with numbers that make sense when measured.
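To make the effect of that correction concrete, here is a small sketch of the baseline-adjusted ratio. The 1.5 and 1.25 figures come from the comment being answered; the values tried for x are purely hypothetical, since the fixed background cost is not stated anywhere in the thread, and the class and method names are made up for the example.

```java
final class BaselineAdjustedRatio {
    // (a - x) / (b - x): the ratio of two CPU readings after removing a
    // fixed background cost x that both readings include.
    static double ratio(double a, double b, double x) {
        if (x < 0 || x >= a || x >= b) {
            throw new IllegalArgumentException("x must be in [0, min(a, b))");
        }
        return (a - x) / (b - x);
    }

    public static void main(String[] args) {
        double hibernate = 1.5; // % CPU, from the quoted test
        double batoo = 1.25;    // % CPU, from the quoted test
        // Hypothetical background costs; the real x is unknown.
        for (double x : new double[] {0.0, 0.5, 1.0}) {
            System.out.println("x = " + x + " -> ratio = " + ratio(hibernate, batoo, x));
        }
    }
}
```

With x = 0 the ratio is 1.2, while a background cost of 1.0 would make it 2.0, which is why readings this close to the noise floor say little without knowing x.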

Yes, developers and architects can make bad designs. But guess what: when their system is tuned or good enough, there is still room for improvement by simply switching to Batoo JPA. And that argument (in addition to the current Hibernate lead) alone gives me the impression that Hibernate is all about features, not about performance (after all, that's how Batoo JPA started). Imagine if XML parsers, JDBC drivers and UI frameworks were written with that approach; no one would ever come close to Java, as it would be a performance hog. Again, all the tweaks and

"Massive insert / update / delete / query operations". I guess those are the operations all systems perform, and if the load is high, then they become massive, right?

" I'll change my mind if you publish a test case with...(clipped)". As said before, it is already in the repo: HelloRest. As for the "mapping, config, and APIs well used" comment, this is very specific; why don't you offer a domain model and a test so that we can again benchmark both frameworks? Ever since the article was published, I have kept asking the Hibernate team to modify the tests the way they like, but we are yet to see a response to that. All we see from the Hibernate side is some numbers spat out of a mysterious QA environment that is not publicly available, and everyone must just believe what the Hibernate team says. So for the sake of transparency, kindly share your test bench so everyone can run the tests themselves; otherwise, I am sorry, but the numbers mean nothing. For the HelloRest and HelloJSF apps, I will be awaiting your results, either here or on your blog.

From your blog: "you’ll notice that the author of the benchmark left the auto commit mode. For someone who claims to be implementing the fastest JPA implementation ever, he should review the basics." If that should be on, why is it off by default? Batoo JPA doesn't need simple tweaks like that. It is smart and fast in all environments, including Android, on which AFAIK Hibernate cannot run. Oh wait, it is a mobile, low-performance CPU, right? With 50 settings all doing some magic in Hibernate, you expect your users to learn them all, right? I thought specs existed just for that; I'll take that as an attempt at vendor lock-in...

I see on your blog that you took a great deal of time to run some tests. Again, please share the resources so everyone can verify, interpret and change the tests for themselves.

Lastly, about the features: I would love to have your ideas on how popular each Hibernate feature is, so that I can start building from the popular ones. But one thing I will again be careful about is retaining performance and quality, as I see that feature-rich Hibernate has 2905 bugs as of today of which 58 blocker, 141 Critical, 2122 important. Good job!

"This sort of buzz happens every 2 years, it’s a cycle, there is nothing we can do about that.". I'd like to know about the other projects and your comprehensive (yet not fact-based) tests against them.

See you in two years! :)

Hasan Ceylan

October 23, 2012 | Unregistered Commenter Hasan Ceylan

Again, for the sake of completeness, I'll paste my comments from Anthony's blog, which provide evidence that in a benchmark including the DB times, Batoo JPA is ~40% faster end to end:

" least I hope so, that most of the time is spent on network and database so is this really important to focus _that much_ on JPA internals". Certainly you would hope so; the longer it takes, the less visible Hibernate's cost becomes. FYI, while the JDBC driver is sleeping on a socket waiting for data, the CPU usage is '0'. So take that argument and multiply it by hundreds of requests. If you were right, on an 8-core system (like your QA) you could only have 8 concurrent requests. But there is something called a context switch, another thing one should know to build high-performance frameworks.

"I’m not going to rely on the timers provided by the benchmark". Why not take a well-known profiler off the shelf, or take JMeter, which spits out what the user will see? The whole article is based on vmstat. Come on...

"Well used Hibernate is ways faster than noob Hibernate". Why doesn't Hibernate use instrumentation well? Why do people need to tweak 50 settings to adjust one aspect, only to see another aspect get worse? Why can't Hibernate make those educated guesses? So my counter pack-shot is "Well written JPA is ways faster than others".

Shoot 1: (13 * 200) / (9 * 210) = 137.6%
Shoot 2: (44 * 64) / (36 * 55) = 142.2%
Shoot 3: (301 * 29) / (292 * 22) = 135.9% (Your tweaks mean anything?)
Shoot 4: (990 * 6) / (1075 * 5) = 110.5% (Yep, it is you; Hibernate still consumed more. It is just that you threw in a piece of junk as the server, so the poor database server is starving and cannot serve the requests. Otherwise you need to explain why Java and the DB running on the same server consume ~5 mins at 20% load, while the new DB server consumes ~16 mins at 80% CPU load)
Shoot 5: When you provide the numbers I'll be happy to make the calculation for you.

On Shoot 4, as the numbers are really low, rounding matters: if, say, 6 secs is really 6.49 and 5 secs is really 4.51, then the calculation gives 132.5%. Looks consistent? :)
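The shoot figures above can be re-derived with a few lines of code. The comment does not label its operands, so this sketch only assumes that the two numbers in each numerator belong to one implementation and the two in the denominator to the other; the class, method and parameter names are hypothetical.

```java
final class ShootRatios {
    // Product of the two numerator figures over the product of the two
    // denominator figures, expressed as a percentage.
    static double percent(double n1, double n2, double d1, double d2) {
        return 100.0 * (n1 * n2) / (d1 * d2);
    }

    public static void main(String[] args) {
        double[][] shoots = {
            {13, 200, 9, 210},      // Shoot 1
            {44, 64, 36, 55},       // Shoot 2
            {301, 29, 292, 22},     // Shoot 3
            {990, 6, 1075, 5},      // Shoot 4 as posted
            {990, 6.49, 1075, 4.51} // Shoot 4 with the rounding scenario
        };
        for (double[] s : shoots) {
            System.out.println(String.format("%.1f%%", percent(s[0], s[1], s[2], s[3])));
        }
    }
}
```

Recomputing this way reproduces the posted 137.6%, 142.2%, 135.9% and 110.5%, and the rounding scenario for Shoot 4 does land at about 132.5%.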

October 23, 2012 | Unregistered Commenter Hasan Ceylan

I am most certainly directly related: I'm a developer and in your primary target group. Who is the more likely user of your implementation, me or Mr. Ebersole? ;-)

Most code bases start out as 90% nice interfaces, clean code and good performance, but in the end the remaining 10% needed for real-world applications takes down the performance and strains the architecture. But perhaps this will be the exception to the rule.

October 23, 2012 | Unregistered Commenter Nicklas Karlsson

It seems there is no way you can open your eyes and admit that in a real architecture, with a remote database, your optimizations are less relevant.

Do you really think users are getting JPA to code tiny toy domain models like the one you are using in your benchmarks? Why would they waste time when it would be quicker to use plain JDBC?

You are getting really dishonest by saying things like
"Hibernate has 2905 bugs as of today of which 58 blocker, 141 Critical, 2122 important". Your numbers include:
- every subproject, not only the JPA implementation
- every release, including releases done in 2004!
- feature requests, which are not bugs
- unchecked requests, meaning all the users using JIRA instead of the user forum
- duplicate requests

Now please learn how to use JIRA and set up proper filters, then come back before saying stupid things.

Now people can understand your way of manipulating contexts and numbers.

I'm not going to waste more time here, because we both have jobs and I really have better things to do.

October 23, 2012 | Unregistered Commenter Anthony Patricio

BTW you said you implemented every aspect of JPA 2, did you run the TCK also?

October 23, 2012 | Unregistered Commenter Nicklas Karlsson


I am sorry I missed the 'implementation' word. :)

So again I would like to emphasize: not all projects have a profiler and a benchmark as a build-time aid. Do not worry; if there is a feature that concerns 1% of the audience yet drops performance considerably, it will either be done right or not at all.

The TCK is on the list of things to do. It is important but not crucial, as it passes a framework that nonetheless has ~3000 issues. :)


No, my eyes are open. That is why there is Batoo JPA. I have been your customer for so long and have felt the pain of slow J2EE stacks. I simply started with the layer where performance is most spoiled. So if there is no performance issue with Hibernate, why does your article end, in the largest font, with "Conclusion: I don’t care these CPU optimizations I prefer FEATURES!"?

Talking about honesty and respect? Below are excerpts from your and Steve's comments.

- "Unfortunately I threw away the results from my earlier run that came up with the specific 37,956 number for Hibernate persist run (you know in the garbage where argument like this really belong)".

- Manipulative 'vmstat' tests,

- Title "Decrypting another JPA benchmark",

- "For someone who claims to be implementing the fastest JPA implementation ever, he should review the basics".

- Doing 5 shoots of tests but copying the vaguest number here.

- Making assertions on your blog but censoring the comments?

I am sorry that the discussion has come this far. But that is due to the attitude you guys have shown. Otherwise I wouldn't have pulled the Hibernate JIRA gun.

You go around asserting that Hibernate is not slow, yet performance issues have popped up in the Hibernate JIRA in the last month.

On the other hand, please consult [1]. Since I do not know how to use JIRA, please educate me. By the way, this is only the Hibernate ORM page, which does not include Shards, Tools, etc. Just Hibernate ORM...

Unresolved by Priority
Blocker: 58
Critical: 141
Major: 2122
Minor: 632
Trivial: 85
TOTAL: 3038

Status Summary
Open: 2906
In Progress: 6
Reopened: 35
Resolved: 675
Closed: 3889
Awaiting Test Case: 86

TOTAL: 7597

Is it my lack of JIRA skills, or has the Hibernate team not resolved ~38% of the issues ever opened? :)

Unresolved by Component
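For readers who want to check the ~38% figure, the sketch below re-adds the counts quoted above and computes the share of issues in the Open status. It assumes that "~38%" refers to Open over the status total, which the comment does not state explicitly; the class and method names are hypothetical.

```java
final class IssueStats {
    static int sum(int... counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    static double percent(int part, int total) {
        return 100.0 * part / total;
    }

    public static void main(String[] args) {
        // Unresolved by priority: blocker, critical, major, minor, trivial.
        int byPriority = sum(58, 141, 2122, 632, 85);
        // Status summary: open, in progress, reopened, resolved, closed, awaiting test case.
        int byStatus = sum(2906, 6, 35, 675, 3889, 86);
        System.out.println("unresolved by priority: " + byPriority);
        System.out.println("all statuses: " + byStatus);
        System.out.println(String.format("open share: %.1f%%", percent(2906, byStatus)));
    }
}
```

Both posted totals (3038 and 7597) add up, and Open over the status total comes to roughly 38%.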
annotations: 223
caching (L2): 82
core: 1364
documentation: 104
entity-manager: 154
metamodel: 125
query-criteria: 138
query-hql: 243
query-sql: 80
build: 19
envers: 94
spatial: 4
testsuite: 26

That is what I see on the Hibernate JIRA. Please share your numbers or tell me where I am wrong in interpreting the stats.
While you throw away performance and go "All that matters is features", when it comes to bugs these numbers are about all the features. If there are historical bugs, you either close them or migrate them to the new version. Even bugs with no component are another indication of JIRA misuse.

Looking at unresolved by version, I see 2848.

So do teach me how to use JIRA; I would love to improve my skills on that. Certainly issue management is an important task in OSS, and as I did better in other areas, in building a better-served community I wouldn't make the mistake Hibernate did.

Lastly, "Do you really think users are getting JPA to code tiny toy domain models like the one you are using in your benchmarks". Please do share the code in your QA that is large enough for you to base Hibernate's release quality on. After all, unlike making claims like "this is in-memory db", "this is only benchmark", "this is not remote database", "this is tiny tinny", I made my statement right from the beginning: please prepare the test you like; I challenge Hibernate in every setup, in every application. Batoo JPA can beat Hibernate by far.

[1] Hibernate ORM - Issues

Hasan Ceylan

PS: You may start by paying attention to writing people's names right!

October 23, 2012 | Unregistered Commenter Hasan Ceylan

I'm sorry, I didn't get what you were referring to in the comment "I am sorry I missed the 'implementation' word. :)"

I fail to see how the number of bugs in Hibernate in any way affects your project, but since you dedicated that much attention to the detail I am forced to ask: "surely you see the connection between the number of active users and the number of bug reports"? If I write code for two years and then say "hey, it's bug-free, just look at my empty JIRA", you would give me a funny look, right? Have you seen the issue tracker of the Linux kernel (you know, that crappy piece of code that nobody uses)?

October 23, 2012 | Unregistered Commenter Nicklas Karlsson

Hey Nicklas,

You said, "I'm no JPA implementation coders so I'm not going to comment on the technical aspects". I took that to mean "I am a developer, but I do not code on top of JPA". Hope that clarifies the little misunderstanding; my apologies if this was in any way offensive, as I am not a native English speaker. Apart from that, of course people and organizations like you are my target audience, but we got so caught up in the performance argument.

On the other hand, yes, any project with a large community will receive a lot of issues; that is totally acceptable. But IMHO this should be within limits: the project should also address the issues, and the open numbers should be in the range of hundreds, if not below 100. That is also why I did not call Batoo JPA bug-free while it has only a few improvement issues. I certainly need to build my community to see my numbers, and I can assure you that even if I received 2 issues every day I would be happy to see them. But when the community is mature, the number of OPEN issues in Batoo JPA will not be a 3-digit number, let alone a 4-digit number! To be frank, a new project can never be declared bug-free because of low numbers (much like how meaningful 1% vmstat numbers are). Having said that, I promise you will never see over a hundred issues open in Batoo JPA.

To back up that assertion: for a couple of questions on StackOverflow, I have proven with Batoo JPA (refer to the community subproject in the repo) that what the OP describes is perfectly legal in terms of JPA, but Hibernate fails to execute it. But as Hibernate fellows started -1ing my comments, I stopped doing so.

But now that we seem to be through with the "Performance Argument" and I see the Hibernate team has thrown in the towel, I may go back to advocating to Hibernate users on StackOverflow. :)

Going back to the topic! I refer to Hibernate as full of bugs because they concluded "(they) don’t care these CPU optimizations I prefer FEATURES!". I simply put forward that what they call features only brings in more bugs. On top of that, for every additional 1% of audience, they lower the whole project's performance.


To compare apples to apples, take Apache Tomca... Ages comparable, community comparable, etc. They have 68 bugs, of which half were reported within the last few months, and the other half are mostly enhancements rather than critical bugs, with only 3 blockers. The open issue ratio is 60 / ~1000 = 6%. Fewer issues, a lower open / total ratio, more recent. Guess what: Tomcat has a way larger audience than Hibernate.

OK, another example: take MyFaces, again mature enough and in the same league as Hibernate. It has 0 blockers and only 2 critical issues, with ~300 / 3000, a 10% open ratio.


Now, both projects above are from Apache, so arguably they share the culture of Apache. Now check OpenJPA: it looks exactly like Hibernate, except the numbers are divided by two. So does EclipseLink... That's why I felt the need for a quality JPA implementation along with performance, and saw the niche from a different angle.


Now, although not directly comparable, as the Linux kernel is maintained by hundreds of developers and has a code base at least 100 times larger, its ratio of open to total issues is 1500 / 22500, so ~7%. Again, the kernel takes far better care of its community. And although it addresses every possible hardware, need and use, guess what: it can still run on tiny mobile processors, and so can Batoo JPA.

Hope that makes sense...


October 23, 2012 | Unregistered Commenter Hasan Ceylan

Hasan - you are my personal hero! You disarmed your rude opponents with such style, unspoken ease and calmness. And who would ever think that the lead dev of Hbrnt is such a nasty guy? :)

Personal issues aside, I have worked with Hbrnt in many projects and I must admit: it is big and fat, and twice its issues were the last straw before switching to JdbcTemplate and EclipseLink. Also, I know that in many companies in my country JPA libraries are usually blacklisted because of how easy it is to screw up performance and functionality. I know it is a problem of inexperienced developers and bad designs, but hey, do you remember who proposed the Open Session in View filter? In my opinion, OSIVF is the biggest **** ever made by man in terms of "patterns"... Read that (
" If you access detached objects that have been loaded in the Session inside your JSP (or any other view rendering mechanism), you might hit an unloaded collection or a proxy that isn't initialized.".

Yeah, a developer who wants to access detached entities. So it means that he doesn't know what's been fetched from the DB? That's just great... :-)


October 30, 2012 | Unregistered Commenter Piotr Szarański


Thank you for all the kudos. Hopefully Batoo JPA will whitelist JPA in the companies that stay away from it.

BTW, open session in view is not necessary with Batoo JPA, as in EclipseLink (AFAIK). I do not agree with "violation of transaction demarcation". I validated by reading the Hibernate source that, while the session / entity manager is connected, it opens and closes connections at will if there is no active transaction. However, when the session / entity manager is closed but still in the same VM, Hibernate refuses to re-connect, fetch and disconnect. That is an entirely read-only process, and I do not see any valid reason not to re-connect. Should one arise, I am inclined to keep the feature as the default but allow it to be turned off by configuration.


October 30, 2012 | Unregistered Commenter Hasan Ceylan


Can you configure and run your implementation against at least one of the databases in the open source JPA benchmark provided here:


January 17, 2013 | Unregistered Commenter Andrew
