Red Hat

In Relation To

The Hibernate team blog on everything data.

Finally final review


I finally got the comments from our Hibernate in Action reviewers back, so it's time to give everyone an update on the current state. First, thanks to all of you guys for your feedback. We really appreciate the many hours you spent reading and commenting on the book. As one of you said: A good book is only possible with excellent reviewers.

Your feedback (and the overall response so far) has been very positive, so we'll soon have a great Hibernate manual for everyone. Some of the minor issues you found will be fixed immediately, as we now move into copyediting and typesetting (and then, finally, to press).

The current schedule is to have Chapter 1 on the Manning Early Access Program next week. For all of you who don't know the process: with MEAP, you can get the chapters of a Manning book while it's being finalized. This means you get PDFs just after typesetting, and after a few weeks, when all chapters are ready for print (remember that at the same time the copyeditor has to proofread everything), you will also get a hardcopy of the finished book. I'll give you an update as soon as we have something to show.

I'm still busy fixing code bugs in our examples and, most importantly, finishing the example application. We also use this example app in our Hibernate public training, so expect some very interesting mapping techniques and Hibernate tricks. I'll have to add some kind of presentation layer on top, but I wasn't really happy with the WebWork2/Hibernate demo app we built earlier this year (WW2 needs documentation). I have to get back into web frameworks now (poor me), an area I successfully avoided for more than a year. I think I should give Tapestry a look, as it is at least properly documented. Any recommendations? No, not Struts!

Recent and upcoming presentations on EJB3 and object persistence


I've posted the slides for two recent presentations.

The first is one I did for this year's TSS symposium. It discusses some of the issues surrounding persistent identity, and how they affect everything from how you define equals() for a persistent object, to what kind of cache architecture makes sense.
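To illustrate why persistent identity makes equals() tricky, here is a minimal sketch of the commonly recommended approach: base equals() and hashCode() on an immutable business key rather than the database identifier, so a transient instance and its persistent counterpart compare equal across sessions. (The Product class and its sku business key are hypothetical, not from the presentation.)

```java
// Hypothetical entity: equality is defined by an immutable business key (sku),
// not by the surrogate database identifier, which may be null before the first
// save and therefore cannot safely participate in equals()/hashCode().
class Product {
    private Long id;          // surrogate key, assigned by the database
    private final String sku; // business key, never changes

    Product(String sku) { this.sku = sku; }

    void setId(Long id) { this.id = id; }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Product)) return false;
        return sku.equals(((Product) other).sku);
    }

    @Override public int hashCode() {
        return sku.hashCode();
    }
}
```

With this definition, a newly instantiated Product equals the detached copy loaded from the database, even though only one of them has an id.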

The next is a presentation about EJB3, focusing on entity beans. (Note that, so far, almost all of the blog coverage of EJB3 has been written by people who haven't actually read the spec - the draft is not available just yet.)

Finally, I am hosting a BOF at JavaOne, on Monday at 10:30 PM in the North Meeting Room. I'll be talking about how ORM solutions work with /graphs/ of objects, especially in the context of detach/reattach. I'll compare three different approaches: persistence by reachability, Hibernate, and EJB3.

Free support

Tagged as Java EE

Recently, we've started to hear complaints that we don't put enough effort into free support in the Hibernate forums. This really kinda hurts, since everyone /used/ to comment that we gave such /great/ support, and since I still spend hours most days reading and responding to forum posts. I don't get paid for this, and I rarely get thanked for it either (even by the people I do get time to respond to). I've been doing this for almost three years now.

Now, these complaints, I suppose, are mostly from lazy people who can't be bothered solving their own problems, and come to us before actually reading the documentation and FAQ thoroughly, searching the forum for previous posts, and/or trying to isolate the problem and step through their code with a debugger. I highly doubt that any of the complainers have ever been on the other side of the fence: actually /giving/ support for free. Personally, I don't think we have much responsibility to help these people, but I guess it's sometimes hard to tell who is who...

I would like to believe that the nice people who actually /deserve/ free support do still get it. But I freely admit that support is not up to the standard it was at two years ago. Why?

Well, the problem here is that free support is fundamentally unscalable, and a lot of people don't seem to realize this. So let me give you some idea of the actual numbers. When Hibernate had a couple of hundred users, and only a few posts per day in the forum, it was possible to give extended responses to each question. However, the Hibernate forum got an average of 120 posts per day over the past year, including weekends (that means over 200 on many days). And that doesn't even include JIRA issues. I estimate (conservatively) that it takes about 15 minutes on average to understand and respond to any post. That means 210 hours per week of work responding to forum posts. There are maybe 5 or 6 members of the team who are active on the forums. Do the division yourself (35-42 hours per person per week, something like that). That means, for every one of us, answering questions is a /full time job/ that we don't get paid for. We would have to spend as much time in the forum as /you spend at work/. And then, somehow, we have to do our actual jobs. Now, that's ok for me, I'm used to working 12 - 16 hour days. But I'm certainly not going to expect that of the guys who help out /purely as volunteers/!

So, free support was a casualty of the success of the project. Is that a reason to not use Hibernate? I guess it's a reason to not use any successful open source project. But we want our project to scale even further! So, we've been trying to think outside the box.

OK, I confess: I am an opensource zealot. I want OSS to gradually replace commercial software in most fields, starting with middleware. (I have both moral and practical reasons for wanting this.) And I'm serious about seeing how that can actually become /real/. Now, one of the selling features of OSS is the rapid, free support that supposedly exists in free software communities. I've often been sceptical of this particular item, and my own personal experience is that it's been oversold. Certainly, free support is a wonderful thing, but making it work is Hard. We certainly /want/ to make it work!

First, we realized that a book was needed. We're really putting a lot of hope into this taking some of the pressure off. It was a really draining effort that took a lot of energy away from other areas of the Hibernate project, but it's done now and available at Manning's website. Having a userbase that has read the book, knows the basics, and knows our private language will make giving free support just so much easier!

Second, we've been (successfully, so far) building a business around Hibernate. Our dev support customers can be guaranteed responses to their questions. So, if you can't be bothered to RTFM, that's fine - just buy support, and we'll be at your service, for the most basic of questions, and the most arcane.

But now comes the interesting thing: the more paying customers we have, the more people we can get working full time on Hibernate, and so the more forum posts we can handle. Conversely, the better we handle free support in the forums, the bigger our user base grows, and the more paying customers we see. This demonstrates how free support and commercial support are /complementary/ - it's not zero sum, by any means.

And I think this is going to be common to any open source project that really wants to win against its commercial competitors: it simply must have some commercial aspect to it, to level the playing field. We're continuing to win only because we have JBoss behind us now. A year ago, I had to take annual leave and beg or pay my own way if I wanted to speak at conferences. Meanwhile, our users were on their own if they had a problem too complex to be addressed online. Now, we can actually get out there, in the field, in front of people!

Finally, we realized that the only way that free support can scale is if the Hibernate community really starts to pitch in and help answer questions. I guess we got off on the wrong foot here, when I used to answer all questions personally. Christian told me to stop answering as many questions, to get people used to the idea of helping each other out.

Our honeymoon as a cult project is long over. We've moved into the space where other very successful open source projects like Struts or JBoss found themselves long ago: our user base no longer feels a personal connection to the developers of the project, and is much less likely to be forgiving of our wrinkles. We now start to get many developers who use Hibernate not by their own choice, but because someone else made the choice for them. We also start to see developers forced to use Hibernate where ORM is /not/ appropriate. All these things mean we start to get more negative feedback than before. We get lots of people who expect Hibernate to be perfect - especially less experienced developers who have no real appreciation of just how hard Java object/relational persistence was before solutions like Hibernate came along.

That's all very distressing for those of us who are putting our life into the project, but we need to take it for what it is: a measure of our success. We'll keep innovating regardless...

Metadata driven applications


Hibernate is great at representing strongly-typed, static object models. Not all applications are like this. Metadata-driven applications define entity type information in the database. Both the object model and the relational model support dynamic addition of new types, and perhaps even redefinition of existing types. Actually, most complex applications contain a mix of both static models and dynamic models.

Suppose that our system supports various types of item, each with specialized attributes. If there were a static, predefined set of item types, we would probably model this using inheritance. But what if new types can be defined dynamically by the user - with the type definitions stored in the database?

We could define an ItemType class representing the definition of an item type. Each ItemType instance would own a collection of ItemTypeAttribute instances, each representing a named attribute that applies to that particular item type. ItemType and ItemTypeAttribute define the /meta-model/.

Item instances would each have a unique ItemType and would own a collection of ItemAttributeValue instances, representing concrete values of the applicable ItemTypeAttributes.
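The meta-model and instance-level classes described above might be sketched as plain Java like this (a hypothetical sketch; field names are illustrative and accessors are omitted for brevity):

```java
import java.util.HashSet;
import java.util.Set;

// Meta-model: ItemType owns the attribute definitions that apply to it.
class ItemType {
    String name;
    Set<ItemTypeAttribute> attributes = new HashSet<>();
    ItemType(String name) { this.name = name; }
    void addAttribute(ItemTypeAttribute a) { a.itemType = this; attributes.add(a); }
}

class ItemTypeAttribute {
    String name;
    ItemType itemType; // owning type definition
    ItemTypeAttribute(String name) { this.name = name; }
}

// Instance level: each Item has exactly one ItemType and owns its values.
class Item {
    ItemType type;
    Set<ItemAttributeValue> attributeValues = new HashSet<>();
    Item(ItemType type) { this.type = type; }
}

class ItemAttributeValue {
    ItemTypeAttribute type; // which attribute definition this value is for
    Object value;           // a Long, Double, Date or String at runtime
    ItemAttributeValue(ItemTypeAttribute type, Object value) {
        this.type = type; this.value = value;
    }
}
```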

http://hibernate.sourceforge.net/metadata.gif

The mappings for the metamodel classes are quite straightforward. The only really interesting thing is that ItemType and ItemTypeAttribute are perfect examples of classes that should have second-level caching enabled: updates are infrequent, there are relatively few instances, and instances are shared between many users and many instances of the Item class.

The mappings for ItemType and ItemTypeAttribute might look like this:

<class name="ItemType">
    <cache usage="nonstrict-read-write"/>
    <id name="id">
        <generator class="native"/>
     </id>
    <property name="name" 
            not-null="true" 
            length="20"/>
    <property name="description" 
            not-null="true" 
            length="100"/>
    <set name="attributes" 
            lazy="true" 
            inverse="true">
        <key column="itemType"/>
        <one-to-many class="ItemTypeAttribute"/>
    </set>
</class>

<class name="ItemTypeAttribute">
    <cache usage="nonstrict-read-write"/>
    <id name="id">
        <generator class="native"/>
    </id>
    <property name="name" 
             not-null="true" 
             length="20"/>
    <property name="description" 
             not-null="true" 
             length="100"/>
    <property name="type" 
             type="AttributeTypeUserType" 
             not-null="true"/>
    <many-to-one name="itemType" 
             class="ItemType" 
             not-null="true"/>
</class>

We do not enable proxies for these classes, since we expect that instances will always be cached. We'll leave the definition of the custom type AttributeTypeUserType to you!

The mappings for Item and ItemAttributeValue are also straightforward:

<class name="Item" 
        lazy="true">
    <id name="id">
        <generator class="native"/>
    </id>
    <many-to-one name="type" 
        class="ItemType" 
        not-null="true" 
        outer-join="false"/>
    <set name="attributeValues" 
            lazy="true" 
            inverse="true">
        <key column="item"/>
        <one-to-many class="ItemAttributeValue"/>
    </set>
</class>

<class name="ItemAttributeValue" 
        lazy="true">
    <id name="id">
        <generator class="native"/>
    </id>
    <many-to-one name="item" 
        class="Item" 
        not-null="true"/>
    <many-to-one name="type" 
        class="ItemTypeAttribute" 
        not-null="true" 
        outer-join="false"/>
    <property name="value" type="AttributeValueUserType">
        <column name="intValue"/>
        <column name="floatValue"/>
        <column name="datetimeValue"/>
        <column name="stringValue"/>
    </property>
</class>

Notice that we must explicitly set outer-join="false" to prevent Hibernate from outer join fetching the associated objects which we expect to find in the cache.

Finally, we need to define the custom type AttributeValueUserType, which takes the value of an ItemAttributeValue and stores it in the correct database column for its type.

public class AttributeValueUserType implements UserType {

    public int[] sqlTypes() {
        return new int[] { Types.BIGINT, Types.DOUBLE, Types.TIMESTAMP, Types.VARCHAR };
    }
    
    public Class returnedClass() { return Object.class; }
    
    public Object nullSafeGet(ResultSet rs, String[] names, Object owner) 
        throws HibernateException, SQLException {
        
        Long intValue = (Long) Hibernate.LONG.nullSafeGet(rs, names[0], owner);
        if (intValue!=null) return intValue;
        
        Double floatValue = (Double) Hibernate.DOUBLE.nullSafeGet(rs, names[1], owner);
        if (floatValue!=null) return floatValue;
        
        Date datetimeValue = (Date) Hibernate.TIMESTAMP.nullSafeGet(rs, names[2], owner);
        if (datetimeValue!=null) return datetimeValue;
        
        String stringValue = (String) Hibernate.STRING.nullSafeGet(rs, names[3], owner);
        return stringValue;
        
    }
    
    public void nullSafeSet(PreparedStatement st, Object value, int index) 
        throws HibernateException, SQLException {
        
        Hibernate.LONG.nullSafeSet( st, (value instanceof Long) ? value : null, index );
        Hibernate.DOUBLE.nullSafeSet( st, (value instanceof Double) ? value : null, index+1 );
        Hibernate.TIMESTAMP.nullSafeSet( st, (value instanceof Date) ? value : null, index+2 );
        Hibernate.STRING.nullSafeSet( st, (value instanceof String) ? value : null, index+3 );
    }
    
    public boolean equals(Object x, Object y) throws HibernateException {
        return x==null ? y==null : x.equals(y);
    }
    
    public Object deepCopy(Object value) throws HibernateException {
        return value;
    }
    
    public boolean isMutable() {
        return false;
    }
    
}

That's it!

UPDATE: I don't know what I was thinking! Of course, we need to be able to query the attributes of our items, so AttributeValueUserType should be a /composite/ custom type!

public class AttributeValueUserType implements CompositeUserType {
    
    public String[] getPropertyNames() {
        return new String[] { "intValue", "floatValue", "datetimeValue", "stringValue" };
    }
    
    public Type[] getPropertyTypes() {
        return new Type[] { Hibernate.LONG, Hibernate.DOUBLE, Hibernate.TIMESTAMP, Hibernate.STRING };
    }
            
    public Object getPropertyValue(Object component, int property) throws HibernateException {
        switch (property) {
            case 0: return (component instanceof Long) ? component : null;
            case 1: return (component instanceof Double) ? component : null;
            case 2: return (component instanceof Date) ? component : null;
            case 3: return (component instanceof String) ? component : null;
        }
        throw new IllegalArgumentException();
    }
    
    public void setPropertyValue(Object component, int property, Object value) throws HibernateException {
        throw new UnsupportedOperationException();
    }
    
    public Class returnedClass() {
        return Object.class;
    }
    
    public boolean equals(Object x, Object y) throws HibernateException {
        return x==null ? y==null : x.equals(y);
    }
    
    public Object nullSafeGet(ResultSet rs, String[] names, SessionImplementor session, Object owner) 
        throws HibernateException, SQLException {
        
        //as above!
    }
    
    public void nullSafeSet(PreparedStatement st, Object value, int index, SessionImplementor session) 
        throws HibernateException, SQLException {
        
        //as above!
    }
    
    public Object deepCopy(Object value) throws HibernateException {
        return value;
    }
    
    public boolean isMutable() {
        return false;
    }
    
    public Serializable disassemble(Object value, SessionImplementor session) throws HibernateException {
        return value;
    }
    
    public Object assemble(Serializable cached, SessionImplementor session, Object owner) throws HibernateException {
        return cached;
    }
}

Now we can write queries like this one:

from Item i join i.attributeValues value where value.type.name = 'foo' and value.value.intValue = 69

JDK 1.5 breaks my ObjectFactory


This is /just great/ ...

JDK 1.5 changes the method signature of javax.naming.spi.ObjectFactory.getObjectInstance() from this:

public Object getObjectInstance(Object reference, Name name, Context ctx, Hashtable env)

to this:

public Object getObjectInstance(Object reference, Name name, Context ctx, Hashtable<String, ?> env)

AFAICT, this means that it is now impossible to write an ObjectFactory that compiles in both JDK 1.4 and 1.5. Ugh.

Internationalized data in Hibernate


We've seen a few people using internationalized reference data where labels displayed in the user interface depend upon the user's language. It's not immediately obvious how to deal with this in Hibernate, and I've been meaning to write up my preferred solution for a while now.

Suppose I have a table which defines labels in terms of a unique code, together with a language.

create table Label (
    code bigint not null,
    language char(2) not null,
    description varchar(100) not null,
    primary key(code, language)
)

Other entities refer to labels by their code. For example, the Category table needs category descriptions.

create table Category (
    category_id bigint not null primary key,
    description_code bigint not null,
    parent_category_id bigint references Category(category_id)
)

Note that for each description_code, there are potentially many matching rows in the Label table. At runtime, my Java Category instances should be loaded with the correct description for the user's language preference.

UI labels should certainly be cached between transactions. We could implement this cache either in our application, or by mapping a Label class and using Hibernate's second-level cache. How we implement it is not very relevant; we'll assume that we have some cache and can retrieve a description using:

Label.getDescription(code, language)

And get the code back using:

Label.getCode(description, language)
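One possible in-memory implementation of this Label cache, keyed by code and language in both directions (a hypothetical sketch; a real implementation would populate the maps from the Label table at startup, via JDBC or a mapped Hibernate class):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Label cache: two maps provide the code->description and
// description->code lookups used by LabelUserType.
class Label {
    private static final Map<String, String> byCode = new HashMap<>();
    private static final Map<String, Long> byDescription = new HashMap<>();

    // Register one label row (code, language, description).
    static void put(Long code, String language, String description) {
        byCode.put(code + "|" + language, description);
        byDescription.put(description + "|" + language, code);
    }

    static String getDescription(Long code, String language) {
        return byCode.get(code + "|" + language);
    }

    static Long getCode(String description, String language) {
        return byDescription.get(description + "|" + language);
    }
}
```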

Our Category class looks like this:

public class Category {
    private Long id;
    private String description;
    private Category parent;
    ...
}

The description field holds the String-valued description of the Category in the user's language. But in the database table, all we have is the code of the description. It seems like this situation can't be handled in a Hibernate mapping.

Whenever it seems like you can't do something in Hibernate, you should think UserType! We'll use a UserType to solve this problem.

public class LabelUserType implements UserType {
    
    public int[] sqlTypes() { return new int[] { Types.BIGINT }; }
    
    public Class returnedClass() { return String.class; }
    
    public boolean equals(Object x, Object y) throws HibernateException {
        return x==null ? y==null : x.equals(y);
    }
    
    public Object nullSafeGet(ResultSet rs, String[] names, Object owner) 
        throws HibernateException, SQLException {
        
        Long code = (Long) Hibernate.LONG.nullSafeGet(rs, names, owner);
        return Label.getDescription( code, User.current().getLanguage() );
    }
    
    public void nullSafeSet(PreparedStatement st, Object value, int index) 
        throws HibernateException, SQLException {
        
        Long code = Label.getCode( (String) value, User.current().getLanguage() );
        Hibernate.LONG.nullSafeSet(st, code, index);
    }
    
    public Object deepCopy(Object value) throws HibernateException {
        return value; //strings are immutable
    }
    
    public boolean isMutable() {
        return false;
    }
}

(We can get the current user's language preference by calling User.current().getLanguage().)
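User.current() could be as simple as a ThreadLocal holder, typically populated by a servlet filter at the start of each request (a hypothetical sketch; the class and method names are assumptions, not an existing API):

```java
// Hypothetical request-scoped user holder: a servlet filter would call
// User.set(...) when a request arrives and User.set(null) when it completes.
class User {
    private static final ThreadLocal<User> CURRENT = new ThreadLocal<>();

    private final String language;

    User(String language) { this.language = language; }

    String getLanguage() { return language; }

    static void set(User user) { CURRENT.set(user); }

    static User current() { return CURRENT.get(); }
}
```

Because the value is thread-local, the UserType can reach the current user's language without any reference to the web layer being passed through Hibernate.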

Now we can map the Category class:

<class name="Categoy">
    <id name="id" column="category_id">
        <generator class="native"/>
    </id>
    <property 
        name="description" 
        type="LabelUserType" 
        column="discription_code"
        not-null="true"/>
    <many-to-one 
        name="parent" 
        column="parent_category_id"/>
</class>

Note that we can even write queries against Category.description. For example:

String description = ...;
session.createQuery("from Category c where c.description = :description")
    .setParameter("description", description, Hibernate.custom(LabelUserType.class))
    .list();

or, to specify the code:

Long code = ...;
session.createQuery("from Category c where c.description = :code")
    .setLong("code", code)
    .list();

Unfortunately, we can't perform text-based searching using like, nor can we order by the textual description. We would need to perform sorting of (or by) labels in memory.
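Such in-memory sorting could look like this (a hypothetical sketch; in the real application you would sort the loaded Category objects by their description property, here reduced to a list of strings). A locale-aware Collator gives correct ordering for accented characters, which a plain String comparison would not:

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: sorts label descriptions in memory using the
// collation rules of the given locale, since the database column only
// stores the numeric code and cannot be used in an 'order by'.
class CategorySorter {
    static List<String> sortDescriptions(List<String> descriptions, Locale locale) {
        List<String> sorted = new ArrayList<>(descriptions);
        sorted.sort(Collator.getInstance(locale)); // Collator implements Comparator
        return sorted;
    }
}
```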

Notice that this implementation is very efficient: we never need to join to the Label table in our queries - we never need to query that table at all, except at startup time to initialize the cache. A potential problem is keeping the cache up to date if the Label data changes. If you use Hibernate to implement the Label cache, there's no problem. If you implement it in your application, you will need to manually refresh the cache when the data changes.

This pattern can be used for more than internationalization, by the way!

Comparing ORM tools

Tagged as Hibernate ORM

I've seen three or four ORM tool comparisons in the last three weeks: on some weblogs, on our forum, and I've even been part of several decisions.

I have the impression that many developers have problems categorizing and evaluating ORM tools, no matter if it's Hibernate, Cayenne, PrIdE (I hope that spelling is correct), or some home-made JDBC framework. I got really frustrated at some point, but what brings me to this blog entry is probably a posting made today by Scott Ferguson. He compares EJB CMP, JDO, and Hibernate. I wasn't really happy with his list of points. Don't get me wrong, I'm not complaining about Scott's conclusions (our precious Hibernate!); in fact, I usually listen to Scott. I even followed Resin's development closely several years ago, nearly got it approved for a medium-sized installation (politics...), and even reported and fixed some bugs.

So, this entry, after a long introduction, is about comparing ORM solutions. What all the reviews and articles had in common was a very obscure set of criteria. In one article, I saw someone comparing loading and saving a single object and counting the lines of code needed for this operation. Next, we hear vague statements like /my ORM should work with objects/ that, in practice, probably won't help you decide what you should use.

I did my research for Hibernate in Action, and I think we have found an excellent taxonomy for ORM solutions. Actually, Mark Fussel started using these categories in 1997; we merely rewrote his list and set it in the context of Java application development:

Pure relational

The whole application, including the user interface, is designed around the relational model and SQL-based relational operations. Direct SQL can be fine-tuned in every aspect, but the drawbacks, such as difficult maintenance and lack of portability, are significant, especially in the long run. Applications in this category often make heavy use of stored procedures, shifting some of the work out of the business layer and into the database.

Light object mapping

Entities are represented as classes that are mapped manually to the relational tables. Hand-coded SQL/JDBC is hidden from the business logic using well-known design patterns (such as DAO). This approach is extremely widespread and is successful for applications with a small number of entities, or applications with generic, metadata-driven data models. Stored procedures might have a place in this kind of application.

Medium object mapping

The application is designed around an object model. SQL is generated at build time using a code generation tool, or at runtime by framework code. Associations between objects are supported by the persistence mechanism, and queries may be specified using an object-oriented expression language. Objects are cached by the persistence layer. A great many ORM products and homegrown persistence layers support at least this level of functionality. It's well suited to medium-sized applications with some complex transactions, particularly when portability between different database products is important. These applications usually don't use stored procedures.

Full object mapping

Full object mapping supports sophisticated object modeling: composition, inheritance, polymorphism, and persistence by reachability or a more flexible transitive persistence solution. The persistence layer implements transparent persistence; persistent classes do not inherit any special base class or have to implement a special interface. The persistence layer does not enforce a particular programming model for the domain model implementation. Efficient fetching strategies (lazy and eager fetching) and caching strategies are implemented transparently to the application. This level of functionality can hardly be achieved by a homegrown persistence layer - it's equivalent to months or years of development time.

In my experience, it is quite easy to find the category for a given product. In Hibernate in Action, we also have a list of interesting questions that you should ask if you compare ORM tools:

  • What do persistent classes look like? Are they fine-grained JavaBeans?
  • How is mapping metadata defined?
  • How should we map class inheritance hierarchies?
  • How does the persistence logic interact at runtime with the objects of the business domain?
  • What is the lifecycle of a persistent object?
  • What facilities are provided for sorting, searching, and aggregating?
  • How do we efficiently retrieve data with associations?

In addition, two issues are common to any data-access technology. They also impose fundamental constraints on the design and architecture of an ORM:

  • Transactions and concurrency
  • Cache management (and concurrency)

Find the answers to those questions, and you can compare ORM software. Scott in fact started right with the lifecycle, but he hasn't given enough information in his article for a real discussion; it's mostly his opinion (which is fine on a weblog).

There are, as always in life, many solutions, and no single product, project, or specification will be perfect in all scenarios. You don't have to try to get to the top of the list and always use Full object mapping (and the appropriate tool). There are very good reasons to use a Light object mapping tool (iBatis, for example) in some situations! In many situations, JDBC and SQL are the best choice. I'm talking about a comparison at the same level, and I've had good experience with the categories and questions I've shown above. Read the book. :)

Thanks for listening

History triggers and Hibernate


Recently, I helped one of our customers migrate a legacy database to Hibernate; one of the more interesting topics was versioning and audit logging. Actually, in the last couple of months, the subject of historical data has come up several times. No matter whether it was a legacy SQL schema or a migration from a broken object-oriented database, everyone had their own way to log data changes.

In this entry, I'll introduce a clean and nice solution for this issue. My proposal naturally integrates with Hibernate. Let's use database triggers and views instead of code in the application layer.

While it is in fact quite easy to write a Hibernate Interceptor for audit logging (an example can be found in Hibernate in Action or on the Hibernate Wiki), we always like to use the features of the database system. Implementing audit logging in the database is the best choice if many applications share the same schema and data, and it is usually much less hassle to maintain in the long run.

First, let's create an entity we want to implement a change history for, a simple Item. In Java, this entity is implemented as the Item class. As usual for a Hibernate application that uses Detached Objects and automatic optimistic concurrency control, we give it an id and a version property:

public class Item {

    private Long id = null;
    private int version;
    private String description;
    private BigDecimal price;

    Item() {}
    
    ... // Accessor and business methods    
}

This class is then mapped to a table using Hibernate metadata:

<hibernate-mapping>
<class name="Item" table="ITEM_VERSIONED">
    <id name="id" column="ITEM_ID">
        <generator class="native"/>
    </id>
    <version name="version" column="VERSION"/>
    <property name="description" column="DESC"/>
    <property name="price" column="PRICE"/>
</class>
</hibernate-mapping>

The name of the mapped table is ITEM_VERSIONED. This is actually not a normal base table, but a database view that joins the data from two base tables. Let's have a look at the two tables in Oracle:

create table ITEM (
    ITEM_ID    NUMBER(19) NOT NULL,
    DESC       VARCHAR(255) NOT NULL,
    PRICE      NUMBER(19,2) NOT NULL,
    PRIMARY KEY(ITEM_ID)
)

create table ITEM_HISTORY (
    ITEM_ID    NUMBER(19) NOT NULL,
    DESC       VARCHAR(255) NOT NULL,
    PRICE      NUMBER(19,2) NOT NULL,
    VERSION    NUMBER(10) NOT NULL,
    PRIMARY KEY(ITEM_ID, VERSION)
)

The ITEM table is our real entity relation. The ITEM_HISTORY table has a different primary key, using the ITEM_ID and VERSION column. Our goal is to have one row per entity instance in ITEM (the newest version of our data) and one row for each item version in ITEM_HISTORY:

ITEM_ID   DESC            PRICE
1         A nice Item.    123,99
2         Another one.     34,44

ITEM_ID   DESC            PRICE      VERSION
1         The original.   123,99     0
1         An update.      123,99     1
1         A nice Item.    123,99     2
2         Another one.     34,44     0

So, instead of mapping our Java entity to any of the two tables, we map it to a new virtual table, ITEM_VERSIONED. This view merges the data from both base tables:

create or replace view ITEM_VERSIONED (ITEM_ID, VERSION, DESC, PRICE) as
    select I.ITEM_ID as ITEM_ID,
        (select max(HI.VERSION)
            from ITEM_HISTORY HI
            where HI.ITEM_ID = I.ITEM_ID) as VERSION,
        I.DESC as DESC,
        I.PRICE as PRICE
    from   ITEM I

The ITEM_VERSIONED view uses a correlated subquery to look up the highest version number for a particular item in the history table, while selecting the current values from the row in ITEM. Of course, we could also read all data directly from ITEM_HISTORY, but this query is more flexible, for example if you don't want to include all of the original columns in the history.

Hibernate can now read entities, and it has a version number for automatic optimistic locking. However, we cannot save entities, since the view is read-only. (In Oracle and most other databases, views created using a join cannot be updated.) You will get an exception if you try to update an entity.
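To see what the version number buys us: on flush, Hibernate includes the version in the UPDATE's WHERE clause and increments it, so a concurrent modification makes the update match zero rows and an exception is thrown. Here is a minimal plain-Java sketch of that check - this is not Hibernate code, and the class and method names (OptimisticStore, VersionedRow) are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// A row with a version counter, mimicking the VERSION column.
class VersionedRow {
    String description;
    int version;
    VersionedRow(String description, int version) {
        this.description = description;
        this.version = version;
    }
}

// Sketch of the version check behind
// "UPDATE ... WHERE ITEM_ID = ? AND VERSION = ?".
class OptimisticStore {
    private final Map<Long, VersionedRow> rows = new HashMap<Long, VersionedRow>();

    void insert(long id, String description) {
        rows.put(id, new VersionedRow(description, 0));
    }

    // Succeeds only if the caller saw the current version;
    // otherwise another transaction got there first.
    boolean update(long id, String description, int expectedVersion) {
        VersionedRow row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return false; // Hibernate would throw a staleness exception here
        }
        row.description = description;
        row.version++;   // the <version> property is incremented on every update
        return true;
    }

    int currentVersion(long id) {
        return rows.get(id).version;
    }
}
```

Two transactions that both read version 0 illustrate the race: the first update wins and bumps the version to 1, the second fails because it still expects version 0.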

We solve this problem by writing a database trigger. The trigger will intercept all updates and insertions for the view and redirect the data to the base tables. This kind of trigger is called an /INSTEAD OF/ trigger. Let's first handle insertion:

create or replace trigger ITEM_INSERT
    instead of insert on ITEM_VERSIONED
begin

    insert into ITEM (ITEM_ID, DESC, PRICE)
           values (:new.ITEM_ID, :new.DESC, :new.PRICE);

    insert into ITEM_HISTORY (ITEM_ID, DESC, PRICE, VERSION)
           values (:new.ITEM_ID, :new.DESC, :new.PRICE, :new.VERSION);
end;

This trigger will execute two inserts and split the data between the entity and entity history table. Next, update operations:

create or replace trigger ITEM_UPDATE
    instead of update on ITEM_VERSIONED
begin

    update ITEM set
            DESC = :new.DESC,
            PRICE = :new.PRICE
           where
            ITEM_ID = :new.ITEM_ID;

    insert into ITEM_HISTORY (ITEM_ID, DESC, PRICE, VERSION)
           values (:new.ITEM_ID, :new.DESC, :new.PRICE, :new.VERSION);
end;

The entity table is updated first, with the new data. Then, a new row is written to the ITEM_HISTORY table.

This is actually all you need to implement basic history functionality; just check /INSTEAD OF/ trigger support in your database management system. You can even enhance this pattern and make it much more flexible: write a new AuditInfo value type class with user and timestamp information and add an auditInfo property to your entity class in Java. Map this to some new columns in your view using a Hibernate custom UserType and track the information by setting the property in a Hibernate Interceptor when updates and inserts occur. Use AOP to externalize this aspect from your POJOs...
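For instance, the value type for that enhancement might look something like the following - a sketch only, with an invented class name and fields, not part of any Hibernate API. A custom UserType would map it to the extra columns in the view, and an Interceptor would populate it before inserts and updates:

```java
import java.io.Serializable;
import java.util.Date;

// Hypothetical immutable value type holding audit information.
class AuditInfo implements Serializable {
    private final String user;
    private final Date timestamp;

    AuditInfo(String user, Date timestamp) {
        this.user = user;
        this.timestamp = timestamp;
    }

    public String getUser() { return user; }
    public Date getTimestamp() { return timestamp; }

    // Value types should implement equals()/hashCode() by value,
    // so Hibernate can detect whether the property is dirty.
    public boolean equals(Object o) {
        if (!(o instanceof AuditInfo)) return false;
        AuditInfo other = (AuditInfo) o;
        return user.equals(other.user) && timestamp.equals(other.timestamp);
    }

    public int hashCode() {
        return 31 * user.hashCode() + timestamp.hashCode();
    }
}
```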

HTH

Debunked?

Posted by    |       |    Tagged as

Abe White of Solarmetric replies to my criticisms of JDO on TSS. I'm actually not interested in getting into a lengthy debate over this, but since there /was/ an error in my first post, I must certainly acknowledge that.

First, Abe displays a trivial query that looks superficially similar in HQL and JDOQL. I'm not certain exactly what this is intended to prove. I encourage anyone who is interested to compare the two query languages for themselves. It is my firm belief that a query language for ORM should look like SQL. (Unlike HQL and EJBQL, JDOQL is not targeted specifically at ORM, which explains some of the difference of opinion.) I guess I should make it clear that it is really the query language that is the showstopper for me personally.

Next, Abe points out that I am wrong about the semantics for unfetched associations in JDO detachment. I stand corrected. From the JDO1 spec, and from earlier conversations with JDO EG members, I had understood that the JDO guys were insistent that enhanced classes not be needed on the client - which means that interception could not be done for serialized instances on the client side. Apparently they have changed their minds and dropped that requirement. That's reasonable, I suppose. It probably means that there would be difficulties with enhancing at classloading time, but I'm not certain of this and I do accept that build-time enhancement is a reasonable approach to supporting detach/reattach.

There was some suggestion that my other objections to field interception might be wrong, but I think I'm right on those. I think you'll see this if you carefully consider how polymorphic associations and detachment complicate the picture. (Remember, we cannot decide the concrete subclass of an associated instance without hitting its database table or tables.)

Last, Abe argues that JDO really only has exactly the same kinds of identity and instance states as Hibernate. I agree - and that's the whole problem! Let's take identity. There is really just one kind of identity that is interesting: persistent identity. The only thing that should vary is how the identity value (primary key value) is assigned. But JDO has a different representation for datastore identity (a surrogate key generated by the persistence layer) and application identity (a natural key assigned by the application). JDO2 adds simple identity to paper over the problems with this, and complicates the picture further.

OK, no time for more, I gotta catch a plane. Of course, everyone thinks their own technology is best and that everyone who criticizes it is wrong/stupid/spreading FUD/etc. I have absolutely zero expectation that a debate about this will produce any clear technical winner (rather than smeared reputations). I'm fallible, the JDO guys are fallible, everyone else is fallible. I prefer to surrender to darwinism, and watch what solution is actually adopted in real projects.

EJB3

Posted by    |       |    Tagged as

Yesterday, Linda DeMichiel announced the changes coming in EJB 3.0. There was a lot to digest in her presentation, and I think it will take a while for people to figure out the full implications of the new spec. So far, most attention has focused upon the redesign of entity beans, but that is most certainly not all that is new! The expert group has embraced annotations aggressively, finally eliminating deployment descriptor XML hell. Taking a leaf from Avalon, Pico, Spring, Hivemind, etc, EJB will use dependency injection as an alternative to JNDI lookups. Session beans will be POJOs with a business interface; home objects have been eliminated. Along with various other changes, this means that EJB 3.0 will be a much more appropriate solution for web-based applications with servlets and business logic colocated in the same process (which is by far the most sane deployment topology for most - but not all - applications), without losing the ability to handle more complex distributed physical architectures.

What is amazing is the broad consensus in the EJB Expert Group - all the way from traditional J2EE vendors like BEA, to we open source kiddies - about what was most needed in EJB3. There has been a lot of listening to users going on, which took me by surprise. Linda's leadership is also to be credited.

But anyway, this is the Hibernate blog, so I'm going to discuss persistence....

EJB 3.0 adopted a POJO-based entity bean programming model, very similar to what we use in Hibernate. Entity beans may be serializable. They will not need to extend or implement any interfaces from javax.ejb. They will be concrete classes, with JavaBeans-style property accessors. Associations will be of type Set or Collection and always bidirectional, but unmanaged. (Referential integrity of bidirectional associations is trivial to implement in your object model, and then applies even when the persistent objects are detached.) This model facilitates test-driven development, allows re-use of the domain model outside the context of the EJB container (especially, DTOs will be a thing of the past for many applications) and emphasizes the business problem, not the container.
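To illustrate the point about referential integrity: a single association-management method in the object model keeps both ends of a bidirectional link consistent, with no container involved. Item and Bid here are hypothetical example classes, not names from the spec:

```java
import java.util.HashSet;
import java.util.Set;

// Many side of a bidirectional one-to-many association.
class Bid {
    private Item item;
    Item getItem() { return item; }
    void setItem(Item item) { this.item = item; }
}

// One side; the association-management method sets both links,
// so the object graph stays consistent even when detached.
class Item {
    private final Set<Bid> bids = new HashSet<Bid>();

    Set<Bid> getBids() { return bids; }

    void addBid(Bid bid) {
        bid.setItem(this); // maintain the inverse pointer
        bids.add(bid);     // and the collection, in one place
    }
}
```

Callers only ever use addBid(), so there is no way to end up with a Bid that points at one Item while sitting in another Item's collection.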

Hibernate Query Language was originally based upon EJBQL, ANSI SQL and ODMG OQL, and now the circle is complete, with features of HQL (originally stolen from SQL and OQL) making their way back into EJBQL. These features include explicit joins (including outer joins), projection, aggregation, and subselects. No more Fast Lane Reader type antipatterns!

In addition, EJBQL will grow support for bulk update and bulk delete (a feature that we do not currently have). Many users have requested this.

For the occasional cases where the newly enhanced EJBQL is not enough, you will be able to write queries in native SQL, and have the container return managed entities.

EJB 3.0 will replace entity bean homes with a singleton EntityManager object. Entities may be instantiated using new, and then made persistent by calling create(). They may be made transient by calling remove(). The EntityManager is a factory for Query objects, which may execute named queries defined in metadata, or dynamic queries defined via embedded strings or string manipulation. The EntityManager is very similar to a Hibernate Session, JDO PersistenceManager, TopLink UnitOfWork or ODMG Database - so this is a very well-established pattern! Association-level cascade styles will provide for cascade save and delete.

There will be a full ORM metadata specification, defined in terms of annotations. Inheritance will finally be supported, and perhaps some nice things like derived properties.

Because everyone will ask....

What was wrong with JDO? Well, the EJB expert group - before I joined - came to the conclusion that JDO was just not a very appropriate model for ORM, which echoes the conclusion reached by the Hibernate team, and many other people. There are some problems right at the heart of the JDO spec that are simply not easy to fix.

I'm sure I will cop enormous amounts of flak for coming out and talking about these problems in public, but I feel we need to justify this decision to the community, since it affects the community, and since the EG should be answerable to the community. We also need to put to rest the impression that this was just a case of Not Invented Here.

First, JDOQL is an abomination. (There I said it.) There are four standard ways of expressing object-oriented queries: query language, query by criteria, query by example and native SQL. JDOQL is none of these. I have no idea how the JDO EG arrived at the design they chose, but it sort of looks as if they couldn't decide between query language and query by criteria, so went down some strange middle road that exhibits the advantages of neither approach. My suggestion to adopt something like HQL was a nonstarter. The addition of support for projection and aggregation in JDO2 makes JDOQL even uglier and more complex than before. This is /not/ the solution we need!

Second, field interception - which is a great way to implement stuff like Bill Burke's ACID POJOs or the fine-grained cache replication in JBossCache - turns out, perhaps surprisingly, to be a completely inappropriate way to implement POJO persistence. The biggest problem rears its head when we combine lazy association fetching with detached objects. In a proxy-based solution, we throw an exception from unfetched associations if they are accessed outside the context of the container. JDO represents an unfetched association using null. This, at best, means you get a meaningless NPE instead of a LazyInitializationException. At worst, your code might misinterpret the semantics of the null and assume that there is no associated object. This is simply unacceptable, and there does not appear to be any way to fix JDO to remedy this problem, without basically rearchitecting JDO. (Unlike Hibernate and TopLink, JDO was not designed to support detachment and reattachment.)

Proxy-based solutions have some other wonderful advantages, such as the ability to create an association to an object without actually fetching it from the database, and the ability to discover the primary key of an associated object without fetching it. These are both very useful features.
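To see why the identifier comes for free, here is a hand-rolled sketch of the proxy idea - not Hibernate's actual implementation, and all names are invented. The proxy is constructed with just the primary key, so getId() never touches the database, while any other property access triggers the load:

```java
import java.util.function.Supplier;

// Hypothetical lazy reference to an Item row.
class ItemProxy {
    private final Long id;                 // known up front, no fetch needed
    private final Supplier<String> loader; // stands in for a database lookup
    private String description;            // state, loaded lazily

    ItemProxy(Long id, Supplier<String> loader) {
        this.id = id;
        this.loader = loader;
    }

    Long getId() {
        return id; // no database access
    }

    String getDescription() {
        if (description == null) {
            description = loader.get(); // first access triggers the load
        }
        return description;
    }
}
```

The same trick is what lets you set up an association to a row you never fetched: creating the proxy needs only the foreign key value.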

Finally, the JDO spec is just absurdly over-complex, defining three - now four - types of identity, where one would do, umpteen lifecycle states and transitions, when there should be just three states (persistent, transient, detached) and is bloated with useless features such as transient transactional instances. Again, this stuff is just not easy to change - the whole spec would need to be rewritten.

So, rather than rewriting JDO, EJB 3.0 entities will be based closely upon the (much more widely adopted) dedicated ORM solutions such as Hibernate and TopLink.

back to top