Red Hat



In Relation To Gavin King

In Relation To Gavin King

Internationalized data in Hibernate

Posted by    |       |    Tagged as

We've seen a few people using internationalized reference data where labels displayed in the user interface depend upon the user's language. It's not immediately obvious how to deal with this in Hibernate, and I've been meaning to write up my preferred solution for a while now.

Suppose I have a table which defines labels in terms of a unique code, together with a language.

create table Label (
    code bigint not null,
    language char(2) not null,
    description varchar(100) not null,
    primary key(code, langauge)
)

Other entities refer to labels by their code. For example, the Category table needs category descriptions.

create table Category (
    category_id bigint not null primary key,
    discription_code bigint not null,
    parent_category_id foreign key references(category)
)

Note that for each description_code, there are potentially many matching rows in the Label table. At runtime, my Java Category instances should be loaded with the correct description for the user's language preference.

UI Labels should certainly be cached between transactions. We could implement this cache either in our application, or by mapping a Label class and using Hibernate's second-level cache. How we implement this is not very relevant, we'll assume that we have some cache, and can retrieve a description using:

Label.getDescription(code, language)

And get the code back using:

Label.getCode(description, language)

Our Category class looks like this:

public class Category {
    private Long id;
    private String description;
    private Category parent;
    ...
}

The description field holds the String-valued description of the Category in the user's language. But in the database table, all we have is the code of the description. It seems like this situation can't be handled in the a Hibernate mapping.

Whenever it seems like you can't do something in Hibernate, you should think UserType! We'll use a UserType to solve this problem.

public class LabelUserType {
    
    public int[] sqlTypes() { return Types.BIGINT; }
    
    public Class returnedClass() { return String.class; }
    
    public boolean equals(Object x, Object y) throws HibernateException {
        return x==null ? y==null : x.equals(y);
    }
    
    public Object nullSafeGet(ResultSet rs, String[] names, Object owner) 
        throws HibernateException, SQLException {
        
        Long code = (Long) Hibernate.LONG.nullSafeGet(rs, names, owner);
        return Label.getDescrption( code, User.current().getLanguage() );
    }
    
    public void nullSafeSet(PreparedStatement st, Object value, int index) 
        throws HibernateException, SQLException {
        
        Long code = Label.getCode( (String) value, User.current().getLanguage() );
        Hibernate.LONG.nullSafeSet(st, code, index);
    }
    
    public Object deepCopy(Object value) throws HibernateException {
        return value; //strings are immutable
    }
    
    public boolean isMutable() {
        return false;
    }
}

(We can get the current user's language preference by calling User.current().getLanguage().)

Now we can map the Category class:

<class name="Categoy">
    <id name="id" column="category_id">
        <generator class="native"/>
    </id>
    <property 
        name="description" 
        type="LabelUserType" 
        column="discription_code"
        not-null="true"/>
    <many-to-one 
        name="parent" 
        column="parent_category_id"/>
</class>

Note that we can even write queries against Category.description. For example:

String description = ...;
session.createQuery("from Category c where c.description = :description")
    .setParameter("description", description, Hibernate.custom(LabelUserType.class))
    .list();

or, to specify the code:

Long code = ...;
session.createQuery("from Category c where c.description = :code")
    .setLong("description", code)
    .list();

Unfortunately, we can't perform text-based searching using like, nor can we order by the textual description. We would need to perform sorting of (or by) labels in memory.

Notice that this implementation is very efficient, we never need to join to the Label table in our queries - we never need to query that table at all, except at startup time to initialize the cache. A potential problem is keeping the cache up to date if the Label data changes. If you use Hibernate to implement the Label cache, there's no problem. If you implement it in your application, you will need to manually refresh the cache when data changes.

This pattern can be used for more than internationalization, by the way!

Debunked?

Posted by    |       |    Tagged as

Abe White of Solarmetric replies to my criticisms of JDO on TSS. I'm actually not interested in getting into a lengthy debate over this, but since there /was/ an error in my first post, I must certainly acknowledge that.

First, Abe displays a trivial query that looks superficially similar in HQL and JDOQL. I'm not certain exactly what this is intended to prove, but I'm not going to get into a lengthy debate over this. I encourage anyone who is interested to compare the two query languages for themselves. It is my firm belief that a query language for ORM should look like SQL. (Unlike HQL and EJBQL, JDOQL is not targetted specifically at ORM, which explains some of the difference of opinion.) I guess I should make it clear that it is really the query language that is the showstopper for me personally.

Next, Abe points out that I am wrong about the semantics for unfetched associations in JDO detachment. I stand corrected. From the JDO1 spec, and from earlier conversations with JDO EG members, I had understood that the JDO guys were insistent that enhanced classes not be needed on the client - which means that interception could not be done for serialized instances on the client side. Apparently they have changed their minds and dropped that requirement. That's reasonable, I suppose. It probably means that there would be difficulties with enhancing at classloading time, but I'm not certain of this and I do accept that build-time enhancement is a reasonable approach to supporting detach/reattach.

There was some suggestion that my other objections to field interception might be wrong, but I think I'm right on those. I think you'll see this if you carefully consider how polymorphic associations and detachment complicate the picture. (Remember, we cannot decide the concrete subclass of an associated instance without hitting its database table or tables.)

Last, Abe argues that JDO really only has exactly the same kinds of identity and instance states as Hibernate. I agree - and that's the whole problem! Let's take identity. There is really just one kind of identity that is interesting: persistent identity. The only thing that should vary is how the identity value (primary key value) is assigned. But JDO has a different representation for datastore identity (a surrogate key generated by the persistence layer) and application identity (a natural key assigned by the application). JDO2 adds simple identity to paper over the problems with this, and complicates the picture further.

OK, no time for more, I gotta catch a plane. Of course, everyone thinks their own technology is best and that everyone who criticizes it is wrong/stupid/spreading FUD/etc. I have absolutely zero expectation that a debate about this will produce any clear technical winner (rather than smeared reputations). I'm fallible, the JDO guys are fallible, everyone else is fallible. I prefer to surrender to darwinism, and watch what solution is actually adopted in real projects.

EJB3

Posted by    |       |    Tagged as

Yesterday, Linda DeMichiel announced the changes coming in EJB 3.0. There was a lot to digest in her presentation, and I think it will take a while for people to figure out the full implications of the new spec. So far, most attention has focused upon the redesign of entity beans, but that is most certainly not all that is new! The expert group has embraced annotations aggressively, finally eliminating deployment descriptor XML hell. Taking a leaf from Avalon, Pico, Spring, Hivemind, etc, EJB will use dependency injection as an alternative to JNDI lookups. Session beans will be POJOs, with a business interface, home objects have been eliminated. Along with various other changes, this means that EJB 3.0 will be a much more appropriate solution for web-based applications with servlets and business logic colocated in the same process (which is by far the most sane deployment topology for most - but not all - applications), without losing the ability to handle more complex distributed physical architectures.

What is amazing is the broad consensus in the EJB Expert Group - all the way from traditional J2EE vendors like BEA, to we open source kiddies - about what was most needed in EJB3. There has been a lot of listening to users going on, which took me by surprise. Linda's leadership is also to be credited.

But anyway, this is the Hibernate blog, so I'm going to discuss persistence....

EJB 3.0 adopted a POJO-based entity bean programming model, very much similar to what we use in Hibernate. Entity beans may be serializable. They will not need to extend or implement any interfaces from javax.ejb. They will be concrete classes, with JavaBeans-style property accessors. Associations will be of type Set or Collection and always bidirectional, but un-managed. (Referential integrity of bidirectional associations is trivial to implement in your object model, and then applies even when the persistent objects are detached.) This model facilitates test-driven development, allows re-use of the domain model outside the context of the EJB container (especially, DTOs will be a thing of the past for many applications) and emphasizes the business problem, not the container.

Hibernate Query Language was originally based upon EJBQL, ANSI SQL and ODMG OQL, and now the circle is complete, with features of HQL (originally stolen from SQL and OQL) making their way back into EJBQL. These features include explicit joins (including outer joins), projection, aggregation, subselects. No more Fast Lane Reader type antipatterns!

In addition, EJBQL will grow support for bulk update and bulk delete (a feature that we do not currently have). Many users have requested this.

For the occasional cases where the newly enhanced EJBQL is not enough, you will be able to write queries in native SQL, and have the container return managed entities.

EJB 3.0 will replace entity bean homes with a singleton EntityManager object. Entities may be instantiated using new, and then made persistent by calling create(). They may be made transient by calling remove(). The EntityManager is a factory for Query objects, which may execute named queries defined in metadata, or dynamic queries defined via embedded strings or string manipulation. The EntityManager is very similar to a Hibernate Session, JDO PersistenceManager, TopLink UnitOfWork or ODMG Database - so this is a very well-established pattern! Association-level cascade styles will provide for cascade save and delete.

There will be a full ORM metadata specification, defined in terms of annotations. Inheritance will finally be supported, and perhaps some nice things like derived properties.

Because everyone will ask....

What was wrong with JDO? Well, the EJB expert group - before I joined - came to the conclusion that JDO was just not a very appropriate model for ORM, which echoes the conclusion reached by the Hibernate team, and many other people. There are some problems right at the heart of the JDO spec that are simply not easy to fix.

I'm sure I will cop enormous amounts of flak for coming out and talking about these problems in public, but I feel we need to justify this decision to the community, since it affects the community, and since the EG should be answerable to the community. We also need to put to rest the impression that this was just a case of Not Invented Here.

First, JDOQL is an abomination. (There I said it.) There are four standard ways of expressing object-oriented queries: query language, query by criteria, query by example and native SQL. JDOQL is none of these. I have no idea how the JDO EG arrived at the design they chose, but it sort of looks as if they couldn't decide between query language and query by criteria, so went down some strange middle road that exhibits the advantages of neither approach. My suggestion to adopt something like HQL was a nonstarter. The addition of support for projection and aggregation in JDO2 makes JDOQL even uglier and more complex than before. This is /not/ the solution we need!

Second, field interception - which is a great way to implement stuff like Bill Burke's ACID POJOs or the fine-grained cache replication in JBossCache - turns out, perhaps surprisingly, to be a completely inappropriate way to implement POJO persistence. The biggest problem rears its head when we combine lazy association fetching with detached objects. In a proxy-based solution, we throw an exception from unfetched associations if they are accessed outside the context of the container. JDO represents an unfetched association using null. This, at best, means you get a meaningless NPE instead of a LazyInitializationException. At worst, your code might misinterpret the semantics of the null and assume that there is no associated object. This is simply unnacceptable, and there does not appear to be any way to fix JDO to remedy this problem, without basically rearchitecting JDO. (Unlike Hibernate and TopLink, JDO was not designed to support detachment and reattachment.)

Proxy-base solutions have some other wonderful advantages, such as the ability to create an association to an object without actually fetching it from the database, and the ability to discover the primary key of an associated object without fetching it. These are both very useful features.

Finally, the JDO spec is just absurdly over-complex, defining three - now four - types of identity, where one would do, umpteen lifecycle states and transitions, when there should be just three states (persistent, transient, detached) and is bloated with useless features such as transient transactional instances. Again, this stuff is just not easy to change - the whole spec would need to be rewritten.

So, rather than rewriting JDO, EJB 3.0 entities will be based closely upon the (much more widely adopted) dedicated ORM solutions such as Hibernate and TopLink.

Stateful Session Beans Rock

Posted by    |       |    Tagged as

Like, I suppose, many Java developers, I have so often read about the supposed scalability problems associated with stateful session beans, that I simply accepted that these problems were real, and refused to even consider using stateful beans. I guess this was laziness, but we don't have time to verify everything we read - and I'd never had cause to doubt that what I read was correct.

A few months ago, as Christian and I were nailing down our notion of application transactions in a hotel room in Kohn Kaen, Thailand, it really struck me hard that a stateful bean is the perfect place to keep state associated with the application transaction. (An application transaction is a unit of work /from the point of view of the user/; it spans multiple database/JTA transactions.) For example, you could keep a dedicated Hibernate session for the lifetime of the stateful bean, obviating the need for detaching and reattaching the object graph at every request.

This made me wonder about the cause of the scalability problems that everyone talks about. Once I really started thinking about this, it just didn't add up! A stateful bean without failover is quite equivalent to a HttpSession with server affinity, a design that is commonly understood to scale acceptably well. Similarly, failover for stateful beans should be no more difficult to implement than HttpSession failover.

Indeed, it seemed that the use of stateful beans should actually improve performance, since we would no longer need to wear the cost of serializing state relating to the application transaction to and from either the web tier or database /upon each request/. It seems far more natural to keep the state right there, in the middle tier, along with the business logic, where it belongs.

The conclusion I came to at that time was that the scalability problems must be due to implementation problems in existing appservers.

Well, I was talking about this with a friend who works for one of the other J2EE vendors the other day and she pointed me to this excellent paper:

HTTP Session Object vs Stateful EJB

which really debunks that particular superstition.

Actually, there is a lot of nonsense written about the desirability of so-called stateless architectures. It certainly might be true that a truly stateful design has some nice scalability characteristics. The trouble is that a stateless application can't really /do/ anything of consequence. In a real life application, the state /has to live somewhere/. Serializing it to and from the web tier is just a waste of breath. Serializing it to the database is even worse.

In future, I think I'll find myself using stateful beans all the time.

Lesson: beware J2EE folklore!

The Scope of Hibernate Three

Posted by    |       |    Tagged as Java EE

After more than a year of activity, development of the Hibernate2 branch has finally been wound up; Hibernate 2.1.3 will be one of the last releases and represents a rock-solid POJO persistence solution with essentially all the functionality needed by a typical Java application. Any future release of Hibernate 2.1 will contain only bugfixes. The branch that we have been calling 2.2, will actually be released as version 3.

The Hibernate project has previously had quite circumscribed goals - we have limited ourselves to thinking about just the very next release, and just what our users are asking for right now. That approach has finally reached the end of its usefulness. Hibernate3 is concieved in hubris, with the goal of innovating beyond what our users are asking for, or have even thought of. Certainly we will be adding features that go well beyond the functionality of the best commercial ORM solutions such as TopLink. (Some of these features will be most interesting to people working with certain specialized kinds of problems.)

So, in this spirit of hubris, I'll drop my normal policy of not boasting about things we have not yet implemented, and give a sketch of some of the things that are planned for Hibernate3 alpha (as usual, we do not commit to dates for future releases).

Virtualization

A number of important and interesting problems may be solved by presenting the user with a certain, filtered, subset of the data. For example, a user might want to see data valid at a particular point in time, or may have permissions to view data only in a particular region. Rather than forcing application code to specify this filter criteria in tedious query conditions, Hibernate 3 will allow the (parametrized) filters to be applied at the session level.

(This feature has not yet been implemented, though some of the needed refactorings are currently sitting on my laptop.)

More Mapping Flexibility

For the overwhelming majority of cases, Hibernate2 provides all the O/R mapping options you will need. However, there are now a few new features that are powerful, if used judiciously.

  • single-class-to-multiple-table mappings using <join>
  • table-per-concrete-class-mappings using <union-subclass>
  • flexible discriminators using SQL formula mappings

We have long argued favor of extremely fine grained classes, and have viewed multi-table mappings as a huge misfeature (except for the very important special case of table-per-subclass inheritance mappings). Unfortunately, several commercial vendors insist upon trying to spin this misfeature as an actual competitive advantage of their products, causing us regular irritation. So we have provided the <join> mapping mainly to shut them up.

Well, I do conceed that there is one great use-case for <join>. We can now mix together table-per-hierarchy and table-per-subclass mappings in the same hierarchy. For example:

<class name="Document" table="DOCUMENTS">
    <id name="id">...</id>
    <discriminator column="TYPE" type="string"/>
    ...
    <subclass name="Book" discriminator-value="BOOK"/>
    <subclass name="Newspaper" discriminator-value="NEWSPAPER"/>
    <subclass name="XML"  discriminator-value="XML">
        <join table="XML_DOCUMENTS">
            ....
        </join>
    </subclass>
</class>

Hibernate's implicit polymorphism is a nice way to achieve most of the functionality of table-per-concrete-class without placing unnatural restrictions upon column types of different tables. It also nicely allows table-per-concrete-class to be mixed with other inheritance mapping strategies in the same hierarchy. Unfortunately, it has two limitations:

  • we cannot have a polymorphic collection of the superclass type
  • queries against the superclass resolve to muktiple SQL queries, which could be slower than using an SQL UNION.

The new <union-subclass> construct provides an explicit way to map classes to a table-per-concrete-class model and is implemented using SQL UNIONs.

If your table-per-class-hierarchy mapping does not feature a nice simple discriminator column, where values map one-to-one to the different classes, formula discriminators are for you. You can now discriminate between subclasses using multiple columns, multiple values of the same column, arbitrary SQL expressions or functions, even subqueries! An example:

<class name="Document" table="DOCUMENTS">
    <id name="id">...</id>
    <discriminator type="string"
        formula="case when TYPE='MONOGRAPH' then 'BOOK' when type='TABLOID'\
        or type='BROADSHEET' then 'NEWSPAPER' else TYPE end"/>
    <property name="type" column="TYPE"/>
    ....
    <subclass name="Book" discriminator-value="BOOK"/>
    <subclass name="Newspaper" discriminator-value="NEWSPAPER"/>
</class>

All of these features have been implemented in the Hibernate3 branch.

Representation Independence

Hibernate was conceived as a persistence solution for POJO domain models, and that remains the focus. Occasionally, we run into people who would like to represent their persistent entities in some more dynamic way, as a map, essentially. We used to point them to OFBiz Entity Engine. Well, Hibernate3 lets you represent your domain model as trees of HashMaps, or, with a little bit of user-written code, as just about anything. Here are three immediate applications of this new feature:

  • reimplement JBoss CMP 2.1 engine on top of Hibernate
  • reimplement OFBiz EntityEngine using Hibernate
  • natively persist SDOs

Indeed, the end goal is to allow the application to switch between whichever representation is appropriate to the task at hand; the same entity might be represented as a typesafe POJO, a Map, or an SDO, all with just a single Hibernate mapping.

This feature has been implemented in the Hibernate3 branch.

JDK 1.5 Support

JSR 175 annotations are a perfect fit for Hibernate metadata and we will embrace them aggressively. Emmanuel Bernard is working on this.

We will also need to support Java generics, which basically boils down to allowing typesafe collections (which is very trivial).

Stored Procedure Support

We are not enormous fans of using stored procedures for CRUD operations (of course, there are some other use cases where SPs make wonderful sense) but people working with legacy databases often need Hibernate to call a SP instead of generating its own SQL statement. In Hibernate 2.1, it is possible to achieve this with a custom persister. Hibernate3 will allow arbitrary SQL statements to be specified in the mapping document for create, update and delete. Max Andersen is currently implementing this feature.

Full Event Driven Design

In the early days of the project, I applied YAGNI aggressively. Hibernate was occasionally criticized for supposed architectural deficiencies, due to the absence of a clear upfront design. (In my view, this lack of upfront design was not actually an architectural problem - at least by my understanding of the word architecture - but that debate is for another day...) Anyway, it was my firm belief that an elegant and flexible design would eventually grow naturally, and that our scarce attention and development resources were better spent solving actual user-visible problems. I now feel quite vindicated in this decision; Hibernate has flourished, and the design that has emerged is both powerful and reasonably elegant. Take that, YAGNI-skeptics!

There is one final step of this process, that is now in the hands of Steve Ebersole (and already partially implemented). Hibernate3 will feature an event-oriented design, with event objects representing any interesting thing that occurs, and listener classes that implement standard Hibernate behaviours, or customized user-defined behaviors. This user extension capability has applications ranging for auditing to implementing strange cascade semantics. (Our previous attempt to solve these problems - the Interceptor interface - proved to be insufficient.) Importantly, the redesign simplifies the current monolithic SessionImpl class.

New AST-driven Query Parser

Well, actually /two/ final steps. (No-one expected the Spanish inquisition, did they?)

When I wrote the HQL parser, I knew zip about parsers. Hibernate ended up with a wierd-ass handwritten pure-look-behind parser which, to my great and enduring surprise, has actually served us very well and been the source of very few bugs. (YAGNI, once again.) But it is now well past time we had a proper ANTLR-grammar-based AST, and Joshua Davis is currently writing one. This probably won't mean much in the way of user-visible changes, but it should allow us to more easily support other query languages such as EJBQL.

Declarative Session Management

This is a JBoss-specific feature for now. New users often find session management to be tricky at first. It can take a bit of thought to get your head around the schizophrenic role played by the Hibernate session as both a kind of cache, and a kind of connection. A JDBC Connection is stateless; a Hibernate Session, on the other hand, is a stateful connection! We further confuse the issue by telling people that, actually, the session is really representing a kind of transaction!

Rather than force people to implement their own session handling, messing with threadlocals and exception handling, it would be nice to have some way to specify the session model (session-per-database-transaction, or session-per-application-transaction) declaratively, in the metadata for a session bean. The Spring Framework already provides some support for this kind of approach. Michael Gloegl is working on implementing this functionality using a JBoss Interceptor. It would be wonderful to have this facility in appservers other than JBoss. J2EE needs portable Interceptors for EJBs (they already exist for servlets).

Well, I have some more things on my list, but that will do for now!

SQL Tuning

Posted by    |       |    Tagged as

If you ever work with relational databases, you should go out and buy O'Reilly's /SQL Tuning/, by Dan Tow. The book is all about how to represent a SQL query in a graphical form and then, using some simple rules of thumb, determine an optimal execution plan for the query. Once you have found the optimal execution plan, you can add indexes, query hints, or use some other tricks to persuade your database to use this execution plan. Fantastic stuff. There is even sufficient introductory material for those of us (especially me) who know less than we should about the actual technical details of full table scans, index scans, nested loops joins, hash joins, etcetera to be able to start feeling confident reading and understanding execution plans. Unlike most database books out there, this book is not very platform-specific, though it does often refer specifically to Oracle, DB2 and SQL Server.

Finalizers are even eviler than you think

Posted by    |       |    Tagged as

Developerworks is featuring the best article I have ever read on the subject of Java performance. The authors dispose of the canard that temporary object creation is expensive in Java, by explaining how generational garbage collection works in the Sun JVM (this is a bit more detailed explanation than the typical one, by the way). Well, I already knew this; Hibernate rejected the notion of object pooling right from the start (unfortunately, the EJB spec has not yet caught up).

What I did /not/ know was that objects which implement finalize() require two full garbage collection cycles to be released. Now, everyone knows that finalize() cannot be relied upon and we should not write important code in a finalizer. But /this/ finalize() method, taken from Hibernate's SessionImpl class seemed like a really good idea:

/**
 * Just in case user forgot to call close()
 */
protected void finalize() throws Throwable {
  
  log.debug("running Session.finalize()");
  
  if (isCurrentTransaction) log.warn("afterTransactionCompletion() was never called");
  
  if (connection!=null) { //ie it was never disconnected
    if ( connection.isClosed() ) {
      log.warn("finalizing unclosed session with closed connection");
    }
    else {
      log.warn("unclosed connection");
      if (autoClose) connection.close();
    }
  }
}

The main thing that this method is doing is checking to see if the naughty application forgot to close the session and, if so, log a WARN. This is a really good idea! It is otherwise quite hard to noticed unclosed sessions, and the JDBC connections they own. Unfortunately it has the terrible side-effect of preventing the session from being garbage collected immediately. Now, even after reading the article, I didn't think that this would be such a big deal, since I dereference almost all of the session's state from close(). However, My performance tests are showing a really /big/ difference in performance, just from removing the finalizer. For one problematic test, I actually /halved/ the overhead of Hibernate!

I can barely believe this result, but I've been successfully reproducing it for the last two hours.

Wrestling With Pigs

Posted by    |       |    Tagged as

I have to repeat this cliche to myself at least once a week:

/Never wrestle with a pig; you both get dirty and the pig loves it./

One of the problems with online forums is that, naturally, they are dominated by the people with the most time on their hands - and by the people with the most dogmatic views. As in any community, the loudest views are often the least-informed. When criticized in a forum like TSS , it's usually better to just stay out of the mud. As difficult as it is to let uninformed statements go unchallenged, it is almost always the best decision. Let the pig be. Disputing a post brings attention to it. If the poster is of a particular personality type, the disputation will very quickly turn personal. Maintaining your dignity once that happens is virtually impossible.

In fact, what most amazes me about IT communities is the sheer ubiquity of /argumentum ad hominem/. I've always associated computing with the pursuit of understanding via scientifically inclined methodology. Yet most of the debate that occurs in the Java community consists of name-calling. I got so mad about this today that I broke all my own rules and launched some /ad hominem/ of my own, which is really quite self-defeating, I suppose.

The big problem from my point of view is that I can't simply ignore the online forums; as an open source project they are an absolutely indispensible way for us to get our ideas heard.

Clay Shirky has written insightfully about how online communities can be designed, so it is interesting to speculate about what kind of adjustments could be made to a community like TSS if we wanted to bring out our good sides, and encourage technical arguments rather than personal ones. But perhaps the very strength of TSS is the freewheeling nature of the debate there. Flame wars get attention; they generate the most traffic.

Well, I'm a big boy. Hibernate has been subject to all kinds of outlandish criticisms right from the start. But we are growing every month. We often joke that criticisms of Hibernate invariably begin with I've never used Hibernate but... and indeed that is still true. If our actual /users/ start bitching, /then/ we will need to start listening harder!

Apologies for the nontechnical post ;)

Hibernate 2.1.2

Posted by    |       |    Tagged as Hibernate ORM

I just released 2.1.2 . This is a maintenence release, meaning no especially exciting new features (the interesting work is all going on in the 2.2 branch). However there are some small changes that might make a big performance difference in certain specific cases, especially if you are using a second-level cache. I'm hoping that this release brings the 2.1 branch to the same level of maturity that we were able to achieve with 2.0.3.

If you are having problems...

Posted by    |       |    Tagged as

I just finished a consulting job at a large retailer where we managed to increase the performance of a Hibernate application by perhaps two orders of magnitude with just some fairly simple changes. It really drove home to me how almost all performance problems I've ever seen can be solved by either or both of:

  • appropriate session handling
  • appropriate association fetching strategies

(Note that I have not yet met a serious performance problem in Hibernate 2.x that I could not solve quite quickly.)

Hibernate's Session object is a powerful abstraction that allows some extremely flexible architectural choices. Unfortunately, this flexibility comes at a cost: many people seem to stuff it up! There are three well-understood patterns for managing Hibernate sessions correctly (actually, three-and-a-half, as I've recently discovered) and three common antipatterns. The fact that the antipatterns are more common than they should be suggests a real problem with our existing documentation, and highlights the fact that /we need to get this book out!/

The main reason we've previously been unable to explain the correct ways to handle Hibernate sessions is that we simply havn't had a decent language for describing our ideas. Since we've developed this language in the process of writing the book, explanations are much easier. The key concept is the notion of an /application transaction/. An application transaction is a unit of work from the point of view of the user; it spans multiple requests and multiple database transactions - it does, however, have a well-defined beginning and end. Even if you don't currently use this notion explicitly, you probably /do/ use it implicitly in your application.

Briefly, the three acceptable approaches are: session-per-request, session-per-request-with-detached-objects and session-per-application-transaction. A variation of the third approach is the newly-discovered session-per-application-transaction-with-flush-delayed-to-the-last-request (phew!). The three broken approaches are: session-per-operation (ie. many-sessions-per-request), session-per-user-session and session-per-application. If you are using any of these approaches, please stop.

The three acceptable approaches each have different performance and architectural implications and there is no best solution. It is incredibly important to choose the approach that is most suitable to your particular application (this was the key to the two-orders-of-magnitude improvement described above).

Association fetching is, I think, covered quite well in our documentation, but we still sometimes see people struggling with the dreaded n+1 SELECTs problem. So let me be very clear: Hibernate /completely/ solves the n+1 SELECTs problem! However, it takes some thought and a (very) little work on the part of the user to take full advantage of this fact. We recommend that all associations be configured for lazy fetching by default. Then, for particular use cases, eager outer join fetching may be chosen by specifying a LEFT JOIN FETCH clause in a HQL query, or by calling setFetchMode() for a Criteria query. (If you are too lazy to do this work, you could even try the new batch fetching features of Hibernate 2.1. We don't recommend this less elegant approach, however.)

Less common performance problems may be fixed by using a second-level cache, or occasionally by managing flushing manually (set the session flush mode to FlushMode.NEVER and flush manually when required). However, in almost all cases, acceptable performance can be achieved by concentrating upon the two items listed above.

back to top