Bio
Gavin King is a Distinguished Engineer at Red Hat. He's the creator of Hibernate, a popular persistence solution for Java, and of the Ceylon programming language. He contributed to the Java Community Process as the JBoss and then Red Hat representative for the EJB and JPA specifications, and as spec lead and author of the CDI specification. He's currently a major contributor to the design of Jakarta Data and Jakarta Persistence. He lives in Barcelona with his wife and three daughters. His active interests include theoretical physics and quantum technologies.
Abe White of Solarmetric replies to my criticisms of JDO on TSS. I'm actually not interested in getting into a lengthy debate over this, but since there /was/ an error in my first post, I must certainly acknowledge that.
Yesterday, Linda DeMichiel announced the changes coming in EJB 3.0. There was a lot to digest in her presentation, and I think it will take a while for people to figure out the full implications of the new spec. So far, most attention has focused upon the redesign of entity beans, but that is most certainly not all that is new! The expert group has embraced annotations aggressively, finally eliminating deployment descriptor XML hell. Taking a leaf from Avalon, Pico, Spring, HiveMind, etc., EJB will use dependency injection as an alternative to JNDI lookups. Session beans will be POJOs with a business interface, and home objects have been eliminated. Along with various other changes, this means that EJB 3.0 will be a much more appropriate solution for web-based applications with servlets and business logic colocated in the same process (which is by far the most sane deployment topology for most - but not all - applications), without losing the ability to handle more complex distributed physical architectures.
Like many Java developers, I suppose, I have read so often about the supposed scalability problems associated with stateful session beans that I simply accepted that these problems were real, and refused to even consider using stateful beans. I guess this was laziness, but we don't have time to verify everything we read - and I'd never had cause to doubt that what I read was correct.
After more than a year of activity, development of the Hibernate2 branch has finally been wound up; Hibernate 2.1.3 will be one of the last releases and represents a rock-solid POJO persistence solution with essentially all the functionality needed by a typical Java application. Any future release of Hibernate 2.1 will contain only bugfixes. The branch that we have been calling 2.2 will actually be released as version 3.
If you ever work with relational databases, you should go out and buy O'Reilly's /SQL Tuning/, by Dan Tow. The book is all about how to represent a SQL query in a graphical form and then, using some simple rules of thumb, determine an optimal execution plan for the query. Once you have found the optimal execution plan, you can add indexes, query hints, or use some other tricks to persuade your database to use this execution plan. Fantastic stuff. There is even sufficient introductory material for those of us (especially me) who know less than we should about the actual technical details of full table scans, index scans, nested loops joins, hash joins, etcetera to be able to start feeling confident reading and understanding execution plans. Unlike most database books out there, this book is not very platform-specific, though it does often refer specifically to Oracle, DB2 and SQL Server.
developerWorks is featuring the best article I have ever read on the subject of Java performance. The authors dispose of the canard that temporary object creation is expensive in Java, by explaining how generational garbage collection works in the Sun JVM (this is a somewhat more detailed explanation than the typical one, by the way). Well, I already knew this; Hibernate rejected the notion of object pooling right from the start (unfortunately, the EJB spec has not yet caught up).
I have to repeat this cliche to myself at least once a week:
I just released 2.1.2. This is a maintenance release, meaning no especially exciting new features (the interesting work is all going on in the 2.2 branch). However, there are some small changes that might make a big performance difference in certain specific cases, especially if you are using a second-level cache. I'm hoping that this release brings the 2.1 branch to the same level of maturity that we were able to achieve with 2.0.3.
I just finished a consulting job at a large retailer where we managed to increase the performance of a Hibernate application by perhaps two orders of magnitude with just some fairly simple changes. It really drove home to me how almost all performance problems I've ever seen can be solved by either or both of:
One of the reasons we use relational database technology is that existing RDBMS implementations provide extremely mature, scalable and robust concurrency control. This means much more than simple read/write locks. For example, databases that use locking are built to scale efficiently when a particular transaction obtains /many/ locks - this is called /lock escalation/. On the other hand, some databases (for example, Oracle and PostgreSQL) don't use locks at all - instead, they use the multiversion concurrency model. This sophisticated approach to concurrency is designed to achieve higher scalability than is possible using traditional locking models. Databases even let you specify the required level of transaction isolation, allowing you to trade isolation for scalability.
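As a rough illustration of that last point, here is a minimal sketch of how an application might ask for a particular isolation level through plain JDBC. The class name `IsolationLevels` and the helpers `levelFor` and `apply` are hypothetical names invented for this example; only `java.sql.Connection`, its `TRANSACTION_*` constants, `setTransactionIsolation` and `DatabaseMetaData.supportsTransactionIsolationLevel` come from the standard JDBC API. Which levels are actually available depends on the database, so the sketch checks before applying.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, not part of JDBC or Hibernate.
public class IsolationLevels {

    // The four standard JDBC isolation levels, weakest first.
    // Weaker levels generally allow more concurrency; stronger levels give more isolation.
    static final Map<String, Integer> LEVELS = new LinkedHashMap<>();
    static {
        LEVELS.put("READ_UNCOMMITTED", Connection.TRANSACTION_READ_UNCOMMITTED);
        LEVELS.put("READ_COMMITTED", Connection.TRANSACTION_READ_COMMITTED);
        LEVELS.put("REPEATABLE_READ", Connection.TRANSACTION_REPEATABLE_READ);
        LEVELS.put("SERIALIZABLE", Connection.TRANSACTION_SERIALIZABLE);
    }

    // Translate a level name (case-insensitive) into the JDBC constant.
    static int levelFor(String name) {
        Integer level = LEVELS.get(name.trim().toUpperCase(Locale.ROOT));
        if (level == null) {
            throw new IllegalArgumentException("Unknown isolation level: " + name);
        }
        return level;
    }

    // Apply the named level to a connection, if the database supports it.
    // Intended to be called before the transaction begins.
    static void apply(Connection connection, String name) throws SQLException {
        int level = levelFor(name);
        if (!connection.getMetaData().supportsTransactionIsolationLevel(level)) {
            throw new SQLException("Isolation level not supported by this database: " + name);
        }
        connection.setTransactionIsolation(level);
    }

    public static void main(String[] args) {
        // No database is needed to list the level names and their JDBC constants.
        for (Map.Entry<String, Integer> entry : LEVELS.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

With a real connection in hand, a call such as `IsolationLevels.apply(connection, "READ_COMMITTED")` would request the weaker, more scalable level, while `"SERIALIZABLE"` would request the strongest isolation at some cost in concurrency.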