Hibernate3 Join Tricks

Posted by    |      

One of the joys of working on an open source project with commercial competitors is having to implement features that our users simply don't ask for, and probably won't use in practice, just because those competitors try to spin their useless features as a competitive advantage. We realized ages ago that it's really hard to tell people that they don't need and shouldn't use a feature if you don't have it.

Multi-table mappings started out as a good example of that kind of features. We have been repeating the your object model should be at /least/ as fine-grained as your relational schema mantra for years now. Unfortunately, we keep hearing this echo back as Hibernate can't do multitable mappings. Nobody has ever once shown me a truly compelling usecase for multitable mappings in a real application, but apparently, if our competitors are to be believed, it is common to find schemas with attributes of the same entity scattered randomly across several different physical tables. I'll have to take their word on that one. I'm not saying you will /never/ run into this kind of thing and, indeed, I've seen a few borderline cases, though nothing that wasn't at least arguably better represented as an association. But certainly, to my mind, valid usecases for multitable mappings are not something you run into commonly enough for this to be an important feature. Perhaps the difference in perception is due to the fact that only /sane/ organizations use Hibernate.

Anyway, we introduced the <join/> mapping, just so we could tell people not to use it. Actually, it was fun to implement, and helped me make some really nice refactorings to the EntityPersister hierarchy.

Then a funny thing happened. I started to think of all kinds of useful things to do with <join/>, none of which had anything much to do with multitable mappings, as usually understood. And I'm pretty certain that these things were not what the other guys were talking about!

The first application I came up with is a mixed inheritance mapping strategy. Before, you had a choice between <subclass/> and <joined-subclass/> (now also <union-subclass/>), and you had to stick with that one strategy for the whole hierarchy.

It's now possible to write a mapping like this:

<class name="Superclass" 
        table="parent"
        discriminator-value="0">
    <id name="id">.....</id>
    <discriminator column="type" type="int"/>
    <property ...../>
    ...
    
    <subclass name="Subclass" 
            discriminator-value="1">
        <property .... >
        ...
    </subclass>
    
    <subclass name="JoinedSubclass" 
            discriminator-value="-1">
        <join table="child">
            <property ...../>
            ....
        </join>
    </subclass>
    
</class>

That's /really/ useful.

The next thing that <join/> can be used for required a little tweak. I added an inverse attribute to the join element, to declare that the joined table should not be updated by the owning entity. Now, it's possible to map an association (link) table - which usually represents a many-to-many association - with one-to-many multiplicity in the domain model. First, we have a basic many-to-many mapping, on the Parent side:

<class name="Parent">
    ...
    <set name="children" table="ParentChild" lazy="true">
        <key column="parentId"/>
        <many-to-many column="childId" class="Child"/>
    </set>
</class>

Now, we use a <join> mapping, to hide the association table from the Child end:

<class name="Child">
    ...
    <join table="ParentChild" inverse="true">
        <key column="childId"/>
        <many-to-one name="parent" column="parentId"/>
    </join>
</class>

Well, I'm not sure really how useful this is, but I was always jealous of the TopLink guys when they bragged how they could do this, and we got it /almost/ for free!

A third trick was also inspired by TopLink. A number of former TopLink users porting code to Hibernate found that Hibernate's table-per-class mapping strategy has significantly different performance characteristics to TopLink's. Hibernate has what seems to be a unique implementation of the table-per-class mapping strategy, in that no discriminator column is required to achieve polymorphism. Instead, Hibernate performs an outer join across all sublass tables, and checks which primary keys values are null in each returned row of results in order to determine the subclass that the row represents. In most circumstances, this offers an excellent performance balance, since it is not vulnerable to the dreaded N+1 selects problem. Furthermore, it does not require the addition of a type discriminator column to the table of the root class, which really feels extremely unnatural and redundant for this relational model.

An alternative approach, that TopLink uses, is to perform an initial query, check the value of a discriminator column, and then issue an extra query if the row represents a subclass instance. This isn't usually very efficient for shallow inheritance trees, but what we've seen is that some ex-TopLink users have created very deep or wide inheritance trees, in which case Hibernate's strategy can result in a single query with simply too many joins.

So, I added the outer-join attribute to <join/>. Its effect is slightly subtle. Consider the following mapping:

<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

When we execute a HQL query against the subclass Bar, Hibernate will generate SQL with an inner join between foos and bars. If we query against the superclass Foo, Hibernate will use an outer join.

(Note that you would not write the above mapping in practice; instead you would use <joined-subclass/> and eliminate the need for the discriminator.)

Suppose we set outer-join="false":

<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars" outer-join="false">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

Now, when we query the subclass, the same SQL inner join will be used. But when we query the superclass, Hibernate won't use an outer join. Instead, it will issue an initial query against the foos table, and a sequential select against the bars table, whenever it finds a row with a discriminator value of 1.

Well, that's not such a great idea in this case. But imagine if Foo had a very large number of immediate subclasses. Then we might be avoiding a query with very many outer joins, in favor of several queries with no joins. Well, perhaps some people will find this useful....


Back to top