
In Relation To

The Hibernate team blog on everything data.

The Seam Component Model


One of the distinctive features of Seam is that a lot more things are treated as components than what you might be used to from other architectures. In fact, pretty much every object you write will be treated as a Seam component.

For example, it is not normal in other frameworks to think of entity objects as components. But in Seam, we would typically treat User as a session-scope component (to model the currently logged in user) and entities like Order and Document as conversation-scope components (most conversations center on some entity instance).

Objects that listen for events are also components. Just like JSF, Seam does not have any notion of an Action class. Rather, it lets events be bound to methods of any Seam component. This means, in particular, that one Seam component can model a whole conversation, with the attributes of the component holding conversational state, and the methods of the component defining the functionality that occurs for each step in the conversation. (Note that this would be possible in plain JSF, except for the fact that JSF does not define a conversation context.)
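For instance, a single conversation-scoped component might look something like the following sketch (the entity, field, and method names here are hypothetical, written in the same style as the examples further down):

@Stateful @Scope(CONVERSATION)
public class CheckoutConversation {

    // conversational state lives in the component's attributes
    Order order;
    PaymentDetails payment;

    // each no-argument method is an event listener for one step of the conversation
    public String selectPayment() {
        payment = new PaymentDetails();
        return "confirmation";
    }

    public String confirm() {
        return "checkoutComplete";
    }
}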

The vision for Seam is that the notion of event will also be a unifying one. An event might be a UI event, it might be a Web Services request, a transition event for the long-running business process, a JMS message, or something else we haven't thought of yet.

Again like JSF, an event listener method never takes any arguments. This is quite in contrast to GUI frameworks like Swing, or Action based web frameworks like Struts, where some Event object is passed as a parameter to the action method (a Servlet HttpRequest is an example of this pattern). Alternatively, some other frameworks will expose some thread-bound context object to the action listener. Both JSF and Seam offer thread-bound contexts as a secondary mechanism, but in the case of Seam, this mechanism is for exceptional cases only.

JSF has the right idea here. Ideally, the whole state of the system can be represented by components, that are assembled together automatically by the container. This eliminates the event object or context object from the view of the application, resulting in tidier and more naturally object-oriented code. Under the covers, the JSF implementation locates dependent beans of a managed bean using named context variables, and automagically instantiates a new instance of the needed managed bean if the context variable is empty.

Unfortunately, JSF is limited in three ways.

First, initializing or modifying the value of a context variable requires direct access to the context object, FacesContext, breaking the abstraction. Seam fixes this problem by introducing outjection.

@In @Out Document document;

public void updateDocument() {
    document = entityManager.merge(document);
}

@Out User user;

public void login() {
    user = entityManager.createQuery("from User user ....")
           .setParameter("username", username)
           .setParameter("password", password)
           .getSingleResult();
}

Second, assembly (dependency injection) is performed when a managed bean is instantiated, which means that (a) a component in a wide scope (the session, say) cannot have a reference to a component in a narrow scope (the request, for example), and (b) if we modify the value of a context variable, components which were already assembled will not see the new value and will keep working with the obsolete object. Seam fixes this problem by making assembly (injection/outjection) a process that occurs at component invocation time.

@Entity @Scope(REQUEST)
public class OrderLine { ... }

@Stateful @Scope(CONVERSATION)
public class CreateOrderConversation {
   Order order;              // conversational state held by this component
   @In OrderLine orderLine;  // request-scoped component, injected at invocation time

   public void addOrderLine()
   {
       order.addOrderLine(orderLine);
   }
}

At this time, most other IoC frameworks have the same two limitations, and this is perfectly defensible in the case of something like Spring where the components being injected are understood to be stateless, and hence any two instances of a component are interchangeable. It's not so good if components are stateful, as they are in JSF.

Third, JSF has just three contexts (request, session, application). So, if I want to hold state relating to a particular user across several requests, I have to put it in a session scoped component. This makes it effectively impossible to use JSF managed beans to create an application where the user can be doing two different things, concurrently, working in two windows at once! It also leaves it up to you to clean up state when you're finished with it, by manually removing session context variables via the context object (FacesContext).
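For example, cleaning up a session-scoped attribute in plain JSF looks something like this (the context variable name is hypothetical):

FacesContext.getCurrentInstance()
            .getExternalContext()
            .getSessionMap()
            .remove("currentDocument");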

Seam is the first attempt to create a truly uniform and unifying model of contexts which are meaningful to the /application/. The five basic contexts are event, conversation, session, business process and application. There is also the stateless pseudo-context.

The conversation context is a logical (application demarcated) context scoped to a particular view (browser window). For example, in an insurance claims application, where user input is collected over several screens, the Claim object might be a conversation scoped component.

The business process context holds state associated with the long running business process being orchestrated by jBPM. If review and approval of the insurance claim involves interaction with several different users, the Claim could be a business process scoped component, and would be available for injection during each user interaction.

You might object that a class might need one scope some of the time, and another scope at other times. Actually, I think this happens /much/ less frequently than you might expect, and if it does occur, Seam will support the idea of the same class having multiple roles in the system.

For applications with /extremely/ complex workflows, nested conversations and nested business processes are almost certainly needed, which opens the possibility for an /arbitrary/ set of scopes. Seam does not currently implement this, but the context model of Seam is designed for eventual extension to cover this kind of thing.

We've even discussed introducing more exotic contexts. Do transaction scoped components make sense? Probably not for application components, but possibly for infrastructural components. (Yes, the Seam component model has uses beyond application component management.) For now I'd prefer not to add things like this until we see a very clear need.

So, by this stage you're probably asking what this idea of contextual components actually /buys/ you? Well, for me there are three key things.

First, it allows us to bind stateful components, especially entity beans, directly to the webpage. (Note that if you are going to bind your entities directly to JSF forms, you will also need some nice way to do validation, which is where Seam's integration with Hibernate Validator comes into the picture.) So, you can build a whole application from just pages, session-beans bound to events produced by the page, and entity beans bound to the page. It is this possibility for an unlayered architecture which makes Seam such a potentially productive environment. (Of course, if you want to introduce extra layers yourself, you can certainly do that, but it is not forced upon you.)

Second, it means that the container (Seam) can guarantee cleanup of state from ended or abandoned conversations. In the case of abandoned conversations, Seam gives you a choice: for server-side conversations, there is a conversation timeout that is independent of the session timeout. Alternatively, you can use client-side conversations.

Finally, the model allows stateful components to interact in a relatively loosely coupled way. Clients of a component need not be aware of its lifecycle, or of its relationships to other components. All they need to know is what /kind/ of thing it is, and what its operations are.

Seam


We released Seam today.

http://jboss.com/products/seam

I feel like this is something I'm obligated to blog about. Unfortunately, I've just spent the past two weeks writing docs and webpages and blurbs and I feel if I sit down now and try to describe what Seam is all about, then I'll just be saying the same stuff, yet again, in another slightly different way. (And I'm afraid I get more grandiose each time.)

So let me just link to this page instead:

http://docs.jboss.com/seam/reference/en/html/pr01.html

Hopefully, I'll find the time to get a bit deeper into specific bits of Seam in future posts.

Thanks to everyone who made this release possible.

Generic DAO pattern with JDK 5.0


One of the things in Hibernate in Action that needs serious improvement is the section about data access objects, the good old DAO pattern.

Things gone wrong

Some of the mistakes I've made in the first edition:

I assumed that DAO is not only a well known pattern, but that most people already had experience writing /state-oriented/ data access object interfaces. It turns out that most developers are very familiar with /statement-oriented/ DAOs, like you would implement with plain JDBC or iBatis (or the new stateless Hibernate Session in Hibernate 3.1), but never thought about persistence services with active object state management before - and how to design an application data access interface for such a service.

All examples had internal exception handling. Now, in a real application I'd just take the Hibernate 2.x source code, replace HibernateException extends Exception with HibernateException extends RuntimeException and leave it to my controlling code, not the data access objects, to convert and handle exceptions (including transaction rollback).

But, in the book, I was trying to avoid this issue, something we should actually have resolved in Hibernate 2.x a long time ago. We finally switched to unchecked exceptions in Hibernate 3.x but I still see many DAO snippets that include some kind of exception handling (or even worse, an exception handling /framework/) for unchecked exceptions.

Finally, lazy transaction demarcation, that is, starting a transaction if the current thread doesn't already have one, was only shown as part of the DAO constructor. There are many other ways to do this. One would be the DAO factory, which I also skipped in the book, (wrongly) assuming that everybody knows how to write a factory.

Some other common mistakes I've seen in the past (even in recent other Hibernate books) and you should avoid when writing your DAOs:

Excessive exception handling and wrapping. You should never have to wrap and re-throw an unchecked exception in a DAO method, certainly not for logging, or worse, to swallow the exception. Avoid, change, or wrap persistence services that only throw checked exceptions. The only exception states of interest, in higher level code, are retry the unit of work (e.g. database connection failed, lock couldn't be acquired, etc.) or don't retry the unit of work (constraint violation, SQL error, etc.). Everything else, like exceptions typed to the exact SQL error, is sugar on top - at most good for better looking error message screens. You can't use database exceptions for validation or any other regular data processing. Don't let yourself be fooled by the elaborate exception hierarchies in Hibernate and other frameworks.
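In other words, the retry decision belongs in the code that controls the unit of work, not in the DAO. A rough sketch (the transaction handle and the retry helper are placeholders; org.hibernate.exception.LockAcquisitionException is just one example of a retryable condition in Hibernate 3.x):

try {
    // ... execute the unit of work through the DAOs ...
    tx.commit();
} catch (LockAcquisitionException ex) {
    tx.rollback();
    retryUnitOfWork();   // transient problem: the unit of work is worth retrying
} catch (RuntimeException ex) {
    tx.rollback();
    throw ex;            // constraint violation, SQL error, etc.: don't retry
}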

Ignorance of object state management. If you have a full ORM solution like Hibernate, you get automatic object state management. If a property value of a business object instance is modified you don't have to call saveThisObject() on a DAO, automatic dirty checking is provided inside a transaction. Or, if you write your DAO with state management in mind, understand the state transitions to avoid unnecessary code: session.update(o) does not need to be called before session.delete(o), for example.
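For example, a sketch with the CaveatEmptor Item entity (the identifier value and the property name are assumed for illustration):

Session session = HibernateUtil.getCurrentSession();
Transaction tx = session.beginTransaction();

Item item = (Item) session.load(Item.class, itemId);
item.setDescription("Shiny new description");   // no saveThisObject() call needed

tx.commit();   // automatic dirty checking flushes the UPDATE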

Session per operation anti-pattern. You should never use a new Hibernate Session for each DAO operation. A unit of work has a larger scope, and your Hibernate application will perform horribly with this anti-pattern. This is very easy to avoid: just set the DAO's Session when it is constructed, or look it up from a thread local. In fact, the scope of the Session often seems to be handled ad-hoc and on a "this makes the code look better" basis, when it should actually be a conscious design decision.
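A minimal sketch of the thread local approach (the real HibernateUtil linked at the end of this post does a bit more, e.g. transaction handling):

public class HibernateUtil {

    private static final SessionFactory sessionFactory =
            new Configuration().configure().buildSessionFactory();

    private static final ThreadLocal<Session> threadSession = new ThreadLocal<Session>();

    public static Session getCurrentSession() {
        Session s = threadSession.get();
        if (s == null) {
            s = sessionFactory.openSession();   // one Session per thread, not per operation
            threadSession.set(s);
        }
        return s;
    }

    public static void closeSession() {
        Session s = threadSession.get();
        threadSession.set(null);
        if (s != null) s.close();
    }
}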

So, in the second edition I plan to include more than just 4 pages about the DAO pattern and write more about state/statement-oriented APIs, exception handling, how to best deal with lazy transactions, how to add factories into the mix, and so on. I'll even include DAOs written as stateless EJB 3.0 session beans.

New DAO pattern for JDK 5.0

I wanted to actually defer updating the DAO section until I would work on that chapter in the book. However, some guys made proposals for /generic/ DAOs on the CaveatEmptor forum and I realized it really is time to put some of the new JDK 5.0 features to good use (besides annotations, of course). If you are still using JDK 1.4 primarily you might want to stop reading now... but I strongly encourage you to read the Hibernate in Action DAO examples (and any other examples, for that matter) with the caveats mentioned above in mind.

If you are using JDK 5.0, get the CaveatEmptor alpha2 release now. Let's walk through some of the DAO source code.

This time I based the DAO example on interfaces. Tools like Hibernate already provide database portability, so persistence layer portability shouldn't be a driving motivation for interfaces. However, DAO interfaces make sense in more complex applications, when several persistence services are encapsulated in one persistence layer, and they deserve more than a note (which was all I did in the first edition of HiA).

I use one interface per persistent entity, with a super interface for common CRUD functionality:

public interface GenericDAO<T, ID extends Serializable> {

    T findById(ID id, boolean lock);

    List<T> findAll();

    List<T> findByExample(T exampleInstance);

    T makePersistent(T entity);

    void makeTransient(T entity);
}

You can already see that this is going to be a pattern for a state-oriented data access API, with methods such as makePersistent() and makeTransient(). Furthermore, to implement a DAO you have to provide a type and an identifier argument. As for most ORM solutions, identifier types have to be serializable.

The DAO interface for a particular entity extends the generic interface and provides the type arguments:

public interface ItemDAO extends GenericDAO<Item, Long> {

    public static final String QUERY_MAXBID = "ItemDAO.QUERY_MAXBID";
    public static final String QUERY_MINBID = "ItemDAO.QUERY_MINBID";

    Bid getMaxBid(Long itemId);
    Bid getMinBid(Long itemId);

}

We basically separate generic CRUD operations and actual business-related data access operations from each other. (Ignore the named query constants for now; they are convenient if you use annotations.) However, even if only CRUD operations are needed for a particular entity, you should still write an interface for it, even if it is going to be empty. It is important to use a concrete DAO in your controller code, otherwise you will face some refactoring once you have to introduce specific data access operations for this entity.

An implementation of the interfaces could be done with any state-management capable persistence service. First, the generic CRUD implementation with Hibernate:

import java.io.Serializable;
import java.util.List;

import org.hibernate.Criteria;
import org.hibernate.LockMode;
import org.hibernate.Session;
import org.hibernate.criterion.Criterion;
import org.hibernate.criterion.Example;

public class GenericHibernateDAO<T, ID extends Serializable> implements GenericDAO<T, ID> {

    private Class<T> persistentClass;
    private Session session;

    public GenericHibernateDAO(Class<T> persistentClass, Session session) {
        this.persistentClass = persistentClass;
        this.session = session;
    }

    protected Session getSession() {
        return session;
    }

    public Class<T> getPersistentClass() {
        return persistentClass;
    }

    public T findById(ID id, boolean lock) {
        T entity;
        if (lock)
            entity = (T) getSession().load(getPersistentClass(), id, LockMode.UPGRADE);
        else
            entity = (T) getSession().load(getPersistentClass(), id);

        return entity;
    }

    @SuppressWarnings("unchecked")
    public List<T> findAll() {
        return findByCriteria();
    }

    @SuppressWarnings("unchecked")
    public List<T> findByExample(T exampleInstance) {
        return findByCriteria( Example.create(exampleInstance) );
    }

    @SuppressWarnings("unchecked")
    public T makePersistent(T entity) {
        getSession().saveOrUpdate(entity);
        return entity;
    }

    public void makeTransient(T entity) {
        getSession().delete(entity);
    }

    /**
     * Use this inside subclasses as a convenience method.
     */
    @SuppressWarnings("unchecked")
    protected List<T> findByCriteria(Criterion... criterion) {
        Criteria crit = getSession().createCriteria(getPersistentClass());
        for (Criterion c : criterion) {
            crit.add(c);
        }
        return crit.list();
   }

}

There are some interesting things in this implementation. First, it clearly needs a Session to work, provided to the constructor. How you set the Session, and what scope this Session has is of no concern to the actual DAO implementation. What follows are the implementations of the generic CRUD operations, quite straightforward. The last method is quite nice, using another JDK 5.0 feature, /varargs/. It helps us to build Criteria queries in concrete entity DAOs. This is an example of a concrete DAO that extends the generic DAO implementation for Hibernate:

public class ItemDAOHibernate
        extends     GenericHibernateDAO<Item, Long>
        implements  ItemDAO {

    public ItemDAOHibernate(Session session) {
        super(Item.class, session);
    }

    public Bid getMaxBid(Long itemId) {
        Query q = getSession().getNamedQuery(ItemDAO.QUERY_MAXBID);
        q.setParameter("itemid", itemId);
        return (Bid) q.uniqueResult();
    }

    public Bid getMinBid(Long itemId) {
        Query q = getSession().getNamedQuery(ItemDAO.QUERY_MINBID);
        q.setParameter("itemid", itemId);
        return (Bid) q.uniqueResult();
    }

}

Another example which uses the findByCriteria() method of the superclass with variable arguments:

public class CategoryDAOHibernate
        extends     GenericHibernateDAO<Category, Long>
        implements  CategoryDAO {

    public CategoryDAOHibernate(Session session) {
        super(Category.class, session);
    }

    public Collection<Category> findAll(boolean onlyRootCategories) {
        if (onlyRootCategories)
            return findByCriteria( Expression.isNull("parent") );
        else
            return findAll();
    }
}

We bring it all together in a DAO factory, which not only sets the Session when a DAO is constructed, but can also take care of lazy transaction starting, and contains inline classes to implement CRUD-only DAOs with no business-related operations:

public class HibernateDAOFactory extends DAOFactory {

    public ItemDAO getItemDAO() {
        return new ItemDAOHibernate(getCurrentSession());
    }

    public CategoryDAO getCategoryDAO() {
        return new CategoryDAOHibernate(getCurrentSession());
    }

    public UserDAO getUserDAO() {
        // Inline implementation, constructor only
        class UserDAOHibernate
                extends GenericHibernateDAO<User, Long>
                implements UserDAO {

            public UserDAOHibernate(Session session) {
                super(User.class, session);
            }

        }

        return new UserDAOHibernate(getCurrentSession());
    }

    protected Session getCurrentSession() {
        // Get a Session and begin a database transaction. If the current
        // thread/EJB already has an open Session and an ongoing Transaction,
        // this is a no-op and only returns a reference to the current Session.
        HibernateUtil.beginTransaction();
        return HibernateUtil.getCurrentSession();
    }

}

This concrete factory for Hibernate DAOs extends the abstract factory, which is the interface we'll use in application code:

public abstract class DAOFactory {

    public static final DAOFactory EJB3_PERSISTENCE =
            new org.hibernate.ce.auction.dao.ejb3.Ejb3DAOFactory();

    public static final DAOFactory HIBERNATE =
            new org.hibernate.ce.auction.dao.hibernate.HibernateDAOFactory();

    public static final DAOFactory DEFAULT = HIBERNATE;

    // Add your DAO interfaces here
    public abstract ItemDAO getItemDAO();
    public abstract CategoryDAO getCategoryDAO();
    public abstract CommentDAO getCommentDAO();
    public abstract UserDAO getUserDAO();
    public abstract CategorizedItemDAO getCategorizedItemDAO();
}

Of course, the DEFAULT could be externalized to some configuration setting if your application needs to switch persistence layers for each deployment. Note that this factory example is suitable for persistence layers which are primarily implemented with a single persistence service, such as Hibernate or EJB 3.0 persistence. If you have to mix persistence APIs, for example, Hibernate and plain JDBC, the pattern changes slightly. I'll write more about it in the book - keep in mind that you can also call session.connection() /inside/ a Hibernate-specific DAO, or use one of the many bulk operation/SQL support options in Hibernate 3.1 to avoid plain JDBC.
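For example, a Hibernate-specific DAO method that drops down to plain JDBC for a single query could look something like this (just a sketch; the query and table names are made up):

// inside a DAO that extends GenericHibernateDAO
public int countOrphanedItems() throws SQLException {
    PreparedStatement ps = getSession().connection().prepareStatement(
            "select count(*) from ITEM where SELLER_ID is null");
    ResultSet rs = ps.executeQuery();
    rs.next();
    int count = rs.getInt(1);
    rs.close();
    ps.close();
    return count;
}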

Finally, this is what data access now looks like in controller/command handler code:

ItemDAO itemDAO = DAOFactory.DEFAULT.getItemDAO();
UserDAO userDAO = DAOFactory.DEFAULT.getUserDAO();

public void execute() {

    Bid currentMaxBid = itemDAO.getMaxBid(itemId);
    Bid currentMinBid = itemDAO.getMinBid(itemId);

    Item item = itemDAO.findById(itemId, true);

    newBid = item.placeBid(userDAO.findById(userId, false),
                            bidAmount,
                            currentMaxBid,
                            currentMinBid);
}

Two links you might find useful: ThreadLocal Session and Open Session in View

I also plan to write a section about /integration testing/ of DAOs. What do you think about this pattern? Did I miss anything and what else do you want covered in the book?

P.S. Credit has to be given to Eric Burke, who first posted the basics for this pattern on his blog. Unfortunately, not even the Google cache is available anymore.

Updated CaveatEmptor with DAO pattern and Nested Set


As promised, a current snapshot of my work on Hibernate in Action, second edition. The CaveatEmptor alpha2 release has some quite interesting new examples:

  • New DAO pattern based on JDK 5.0 generics
  • First implementation of a Nested Set model for tree-structured read-mostly data

I'll blog about both later this week. Progress on the second edition of HiA is good but there is just so much new material to write about (esp. annotation mappings). Let's see if we can keep the early Q4 release date.

Hibernate Tools Alpha 5 released


A new updated version of the Hibernate Tools (http://tools.hibernate.org) project has been made available.

The tools are for both Eclipse and Ant; the Ant-related docs are at http://www.hibernate.org/hib_docs/tools/ant/index.html

The Eclipse plugins are now compatible with WTP 0.7 and contain a bunch of new features and improvements.

My personal favorite at the moment is the Dynamic Query Translator view, which continuously shows you the SQL Hibernate will generate for the query you are typing in the HQL editor. Great for learning and tuning HQL queries - note that EJB3-QL works here too.

The HQL editor is also new and replaces the HQL view and introduces code formatting, syntax highlighting and code completion.

Furthermore, we have an initial class diagram view as well as many other improvements to wizards, templates, and code generators. See the complete list, with screenshots and class diagrams, at http://www.hibernate.org/hib_docs/tools/newandnoteworthy/hibernate-eclipse-news-3.1.0.alpha5.html for more information.

As always feedback, ideas, contributions, patches, bug-reports are more than welcome by the usual means at http://forum.hibernate.org and JIRA.

par-tition your application


Packaging has always been a manual operation in the ORM world. In Hibernate, you have to list the mapped entities either through the configuration API or through the hibernate.cfg.xml file. For a while now, JBoss AS has had the notion of a .har, basically an archive scanned by the deployer to discover the Hibernate configuration and the hbm.xml files in it.

Packaging inside an EJB3 container

The EJB3 expert group has introduced the same notion in the EJB3 public draft. A PAR archive is basically a jar file with the .par extension. All you have to do is put all your annotated entities in the archive, and the container has the responsibility to scan it and find all annotated entities. A PAR archive is a persistence unit definition that will be used to create an EntityManagerFactory (aka SessionFactory in the Hibernate world). You will then be able to use your persistence unit (by looking up or injecting an EntityManager or an EntityManagerFactory) named by the name of the PAR file without the extension (i.e. mypersistenceunit.par will be referred to as mypersistenceunit).
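For example, in a session bean you could simply let the container inject an entity manager for that unit (a sketch using the draft injection annotations; the bean and entity names are made up):

@Stateless
public class AuctionManagerBean implements AuctionManager {

    // "mypersistenceunit" is the name derived from mypersistenceunit.par
    @PersistenceContext(unitName="mypersistenceunit")
    private EntityManager em;

    public Item findItem(Long id) {
        return em.find(Item.class, id);
    }
}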

Since you might want to customize your persistence unit configuration, a persistence.xml file can be added in the META-INF directory.

<?xml version="1.0" encoding="UTF-8"?>
<entity-manager>
   <name>FinancialPU</name>
   <provider>org.hibernate.ejb.HibernatePersistence</provider>
   <jta-data-source>jdbc/MyDB</jta-data-source>
   <class>com.acme.MyClass</class>
   <jar-file>externalEntities.jar</jar-file>
   <properties>
       <property name="hibernate.max_fetch_depth" value="4"/>
   </properties>
</entity-manager>

Let's analyze this small but comprehensive example.

The name element allows you to override the persistence unit name (which defaults to the PAR file name minus the .par suffix).

The provider element allows you to express the Entity Manager implementation you want to use for this persistence unit. The value defaults to Hibernate Entity Manager if none is specified. This is an interesting one: it basically means that you can use several Entity Manager implementations in the same application, or use the Hibernate Entity Manager implementation in lieu of your vendor's EJB3 persistence implementation, in a standard way!

The jta-data-source element, together with non-jta-data-source, lets you specify the datasource the persistence unit will work on.

The class element allows you to explicitly add entities to be mapped. These entities are typically outside of the PAR archive, and the Entity Manager will search for them in the EAR classpath. This is particularly convenient when you want to share the same entity definition across several persistence units.

The jar-file element allows you to ask the entity manager implementation to add all the entities contained in a particular JAR and include them in the configuration. In the case of the Hibernate Entity Manager, it will also look at the hbm.xml files. This is particularly convenient for sharing a number of entity definitions across several persistence units.

There is also a mapping-file element currently not supported in Hibernate Entity Manager's implementation.

The properties element is a way to provide implementation-specific properties to your entity manager. In the case of Hibernate you can add most of the hibernate.* properties. You can also define the second level cache configuration using hibernate.ejb.classcache.* and hibernate.ejb.collectioncache.*; please refer to the reference documentation for more information.

This is good news for JBoss users: the .har archive is now standardized. Packaging, which has always been a strong concept in J2EE, is now extended to the ORM world in a very easy-to-use manner.

Packaging in J2SE environment

The very new point is that the PAR packaging simplicity works in exactly the same manner in the J2SE world. The only difference is that you'll need to define your datasource not through the jta-data-source element but through the classic hibernate.* connection properties. The PAR archive is still scanned to find its contained entities and hbm.xml files. In order to let the Hibernate Entity Manager discover the PAR files, they need to have a persistence.xml file in the META-INF directory (Hibernate Entity Manager basically requests any resources named META-INF/persistence.xml and deduces the PAR archive location from it).

Let's imagine the following acmedomainmodel.par archive structure

com/acme/model/Animal.class (an @Entity annotated class)
com/acme/model/Dog.class (an @Entity annotated class)
com/acme/model/Cat.class (an @Entity annotated class)
com/acme/model/Customer.class (a non annotated POJO)
com/acme/model/Customer.hbm.xml (the metadata definitions of Customer)
META-INF/persistence.xml

where persistence.xml is

<?xml version="1.0" encoding="UTF-8"?>
<entity-manager>
   <properties>
       <property name="hibernate.max_fetch_depth" value="4"/>
       <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLInnoDBDialect"/>
       <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver"/>
       <property name="hibernate.connection.username" value="emmanuel"/>
       <property name="hibernate.connection.password" value="secret"/>
       <property name="hibernate.connection.url" value="[=>jdbc:mysql:///test]"/>
       <property name="hibernate.cache.provider_class" value="org.hibernate.cache.EhCacheProvider"/>
   </properties>
</entity-manager>

My persistence unit named acmedomainmodel will then automatically contain Animal, Dog, Cat, and Customer. Note that the Customer class doesn't really have to be in the PAR archive, it just needs to be in the classpath as long as its hbm.xml definition itself is inside the PAR archive.

Note that you can tune the discovery mechanism through the hibernate.ejb.autodetection property. The possible values are none (no auto detection), class (auto detection of the annotated entities), hbm (auto detection of the hbm files) and class,hbm (auto detection of both annotated entities and hbm files).

With a simple ant task you can then create a PAR archive which contains automatically your persistent domain model. No need to manually add the mapped entities inside the hibernate.cfg.xml file anymore.

Several persistence units in my application

You can of course use several PAR archives in your application. The appropriate PAR archive will be processed based on the name you provide.

//create and keep the emf for later entity manager creations
EntityManagerFactory emf = Persistence.createEntityManagerFactory("acmedomainmodel");
...
EntityManager em = emf.createEntityManager();
em.getTransaction().begin();
em.persist(customer);
Dog wolfy = em.find(Dog.class, wolfyId);
em.getTransaction().commit();
em.close();

Note that if there is only one PAR archive in your classpath, you don't have to pass the name to the createEntityManagerFactory() method, but doing so is considered good practice.

The PAR archive mechanism offers a very convenient and standard way to package your ORM persistence units. Thanks to its autodiscovery mechanism, the packaging setup scales in a very elegant manner.

Hibernate Entity Manager 3.1 beta 2 and Hibernate Annotations 3.1 beta 4


New releases of both Hibernate Entity Manager and Hibernate Annotations are available.

Hibernate Entity Manager now fully supports .par archives, including auto discovery of entities and hbm files (see the previous blog entry for more details on this feature), and fixes some critical bugs (see the release notes at http://sourceforge.net/project/shownotes.php?release_id=346915).

The new Hibernate Annotations release is focused on better support for Hibernate-specific features (support for all Hibernate id generators, column index generation, non-PK referencedColumnName support for @OneToOne, @ManyToOne, and @OneToMany...) and of course bug fixes (see the release notes at http://sourceforge.net/project/shownotes.php?release_id=346914 for more details).

These releases are compatible with the latest Hibernate core release (3.1 beta1).

Multi-table Bulk Operations


As I mentioned in my previous blog about Bulk Operations, both UPDATE and DELETE statements are challenging to handle against single entities contained across multiple tables (not counting associations), which might be the case with:

  • inheritance using <joined-subclass/>
  • inheritance using <union-subclass/>
  • entity mapping using the <join/> construct

For illustration purposes, let's use the following inheritance hierarchy:

        Animal
       /      \
   Mammal    Reptile
   /     \
Human     Dog

all of which is mapped using the joined-subclass strategy.

Deletes

There are three related challenges with deletes.

  • deletes against a multi-table entity need to recursively cascade to:
      • all sub-class(es) row(s) matched by primary key (PK) value
      • its super-class row
  • all these orchestrated deletes need to occur in an order that avoids constraint violations
  • which rows need to get deleted?

Consider the following code:

session.createQuery( "delete Mammal m where m.age > 150" ).executeUpdate();

Obviously we need to delete from the MAMMAL table. Additionally, every row in the MAMMAL table has a corresponding row in the ANIMAL table; so for any row deleted from the MAMMAL table, we need to delete that corresponding ANIMAL table row. This fulfills cascading to the super-class. If the Animal entity itself had a super-class, we'd need to delete that row also, etc.

Next, rows in the MAMMAL table might have corresponding rows in either the HUMAN table or the DOG table; so, again, for each row deleted from the MAMMAL table, we need to make sure that any corresponding row gets deleted from the HUMAN or DOG table. This fulfills cascading to the sub-class. If either the Human or Dog entities had further sub-classes, we'd need to delete any of those rows also, etc.

The other challenge I mentioned is proper ordering of the deletes to avoid violating any constraints. The typical foreign key (FK) set up in our example structure is to have the FKs pointing up the hierarchy. Thus, the MAMMAL table has a FK from its PK to the PK of the ANIMAL table, etc. So we need to be certain that we order the deletes:

( HUMAN | DOG ) -> MAMMAL -> ANIMAL

Here, it does not really matter whether we delete from the HUMAN table first, or from the DOG table first.

So exactly which rows need to get deleted (a lot of this discussion applies to update statements as well)? Most databases do not support joined deletes, so we definitely need to perform the deletes separately against the individual tables involved. The naive approach is to simply use a subquery returning the restricted PK values with the user-defined restriction as the restriction for the delete statement. That actually works in the example given before. But consider another example:

session.createQuery( "delete Human h where h.firstName = 'Steve'" ).executeUpdate();

I said before that we need to order the deletes so as to avoid violating defined FK constraints. Here, that means that we need to delete from the HUMAN table first; so we'd issue some SQL like:

delete from HUMAN where ID IN (select ID from HUMAN where f_name = 'Steve')

So far so good; perhaps not the most efficient way, but it works. Next we need to delete the corresponding row from the MAMMAL table; so we'd issue some more SQL:

delete from MAMMAL where ID IN (select ID from HUMAN where f_name = 'Steve')

Oops! This won't work because we previously deleted any such rows from the HUMAN table.

So how do we get around this? Definitely we need to pre-select and store the PK values matching the given where-clause restriction. One approach is to select the PK values through JDBC and store them within the JVM memory space; then later the PK values are bound into the individual delete statements. Something like:

PreparedStatement ps = connection.prepareStatement( 
        "select ID from HUMAN where f_name = 'Steve'"
);
ResultSet rs = ps.executeQuery();
HashSet ids = extractIds( rs );
int idCount = ids.size();

rs.close();
ps.close();

....

// issue the delete from HUMAN
ps = connection.prepareStatement(
        "delete from HUMAN where ID IN (" +
        generateCommaSeperatedParameterHolders( idCount ) +
        ")"
);
bindParameters( ps, ids );
ps.executeUpdate();

...

The other approach, the one taken by Hibernate, is to utilize temporary tables; where the matching PK values are stored on the database server itself. This is far more performant in quite a number of ways, which is the main reason this approach was chosen. Now we have something like:

// where HT_HUMAN is the temporary table (varies by DB)
PreparedStatement ps = connection.prepareStatement( 
        "insert into HT_HUMAN (ID) select ID from HUMAN where f_name = 'Steve'"
);
int idCount = ps.executeUpdate();
ps.close();

....

// issue the delete from HUMAN 
ps = connection.prepareStatement(
        "delete from HUMAN where ID IN (select ID from HT_HUMAN)"
);
ps.executeUpdate();

In the first step, we avoid the overhead of potential network communication associated with returning the results; we also avoid some JDBC overhead; we also avoid the memory overhead of needing to store the id values. In the second step, we again minimized the amount of data traveling between us and the database server; the driver and server can also recognize this as a repeatable prepared statement and avoid execution plan creation overhead.

Updates

There are really only two challenges with multi-table update statements:

  • partitioning the assignments from the set-clause
  • which rows need to get updated? This one was already discussed above...

Consider the following code:

session.createQuery( "update Mammal m set m.firstName = 'Steve', m.age = 20" )
        .executeUpdate();

We saw from before that the age property is actually defined on the Animal super-class and thus is mapped to the ANIMAL.AGE column; whereas the firstName property is defined on the Mammal class and thus mapped to the MAMMAL.F_NAME column. So here, we know that we need to perform updates against both the ANIMAL and MAMMAL tables (no other tables are touched, even though the Mammal might further be a Human or a Dog). Partitioning the assignments really just means identifying which tables are affected by the individual assignments and then building appropriate update statements. A minor challenge here was accounting for this fact when actually binding user-supplied parameters. Though, for the most part, partitioning the assignments and parameters was a fairly academic exercise.
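For the update above, the generated SQL ends up being something along these lines (simplified sketch; the id collection uses the same temporary table idea described for deletes):

insert into HT_MAMMAL (ID) select ID from MAMMAL

update MAMMAL set F_NAME = 'Steve' where ID in (select ID from HT_MAMMAL)
update ANIMAL set AGE = 20 where ID in (select ID from HT_MAMMAL)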

Bulk Operations


The EJB3 persistence specification calls for implementors to support Bulk Operations in EJB-QL (the EJB Query Language). As part of Hibernate's implementation of EJB3 persistence, HQL (the Hibernate Query Language : which is a superset of EJB-QL) needed to support these Bulk Operations. This support is now code complete, even going beyond what is offered in the EJB3 persistence specification. There is one task outstanding against this bulk operation support in HQL, but this is completely beyond the scope of the support called for in the EJB3 persistence specification. I'll blog about this one later as it simply rocks ;)

So what exactly are Bulk Operations? Well for those of you familiar with SQL, it is analogous to Data Manipulation Language (DML) but, just like HQL and EJB-QL, defined in terms of the object model. What is DML? DML is the SQL statements which actually manipulate the state of the tabular data: INSERT, UPDATE, and DELETE.

Essentially, all that is to say that EJB-QL and HQL now support UPDATE and DELETE statements (HQL also supports INSERT statements, but more about that at a later time).

In its basic form, this support is not really all that difficult. I mean Hibernate already knows all the information pertaining to tables and columns; it already knows how to parse WHERE-clauses and the like. So what's the big deal? Well, in implementation, we ran across a few topics that make this support more challenging; which of course made it all the more fun to implement ;)

Update Statements

From the EJB3 persistence specification:

Bulk update and delete operations apply to entities of a single entity class 
(together with its subclasses, if any). Only one entity abstract schema type 
may be specified in the FROM or UPDATE clause.

The specification-defined pseudo-grammar for the update syntax:

update_statement ::= update_clause [where_clause]

update_clause ::= UPDATE abstract_schema_name [[AS ] identification_variable]
    SET update_item {, update_item}*

update_item ::= [identification_variable.]state_field = new_value

new_value ::=
    simple_arithmetic_expression |
    string_primary |
    datetime_primary |
    boolean_primary

The basic gist is:

  • There can only be a single entity (abstract_schema_name) named in the update-clause; it can optionally be aliased. If the entity name is aliased, then any property references must be qualified using that alias; if the entity name is not aliased, then it is illegal for any property references to be qualified.
  • No joins (either implicit or explicit) can be specified in the update. Sub-queries may be used in the where-clause; the subqueries, themselves, can contain joins.
  • The where-clause is also optional.

Two interesting things to point out:

  • According to the specification, an UPDATE against a versioned entity should not cause the version to be bumped
  • According to the specification, the assigned new_value does not allow subqueries; HQL supports this!

Even though the spec disallows bumping the version on an update of a versioned entity, this is more-often-than-not the desired behavior. Because of the spec, Hibernate cannot do this by default so we introduced a new keyword VERSIONED into the grammar instead. The syntax is update versioned MyEntity ..., which will cause the version column values to get bumped for any affected entities.
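For example (the entity and property names here are made up):

int updatedEntities = session.createQuery(
        "update versioned Item i set i.description = :newDescription" )
        .setParameter( "newDescription", newDescription )
        .executeUpdate();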

Delete Statements

From the EJB3 persistence specification:

Bulk update and delete operations apply to entities of a single entity class 
(together with its subclasses, if any). Only one entity abstract schema type 
may be specified in the FROM or UPDATE clause.

A delete operation only applies to entities of the specified class and its 
subclasses. It does not cascade to related entities.

The specification-defined pseudo-grammar for the delete syntax:

delete_statement ::= delete_clause [where_clause]

delete_clause ::= DELETE FROM abstract_schema_name [[AS ] identification_variable]

The basic gist is:

  • There can only be a single entity (abstract_schema_name) named in the from-clause; it can optionally be aliased. If the entity name is aliased, then any property references must be qualified using that alias; if the entity name is not aliased, then it is illegal for any property references to be qualified.
  • No joins (either implicit or explicit) can be specified in the delete. Sub-queries may be used in the where-clause; the subqueries, themselves, can contain joins.
  • The where-clause is also optional.

One very interesting thing to point out here: the specification specifically disallows cascading of the delete to related entities (not including, obviously, db-level cascades).

Caching

Automatic and transparent object/relational mapping is concerned with the management of object state. This implies that the object state is available in memory. Bulk Operations, to a large extent, undermine that concern. The biggest issue is that of caching performed by the ORM tool/EJB3 persistence implementor.

The spec even makes a point to caution regarding this:

Caution should be used when executing bulk update or delete operations because 
they may result in inconsistencies between the database and the entities in the 
active persistence context. In general, bulk update and delete operations 
should only be performed within a separate transaction or at the beginning of a 
transaction (before entities have been accessed whose state might be affected 
by such operations).

In Hibernate terms, be sure to perform any needed Bulk Operations prior to pulling entities into the session, as failing to do so poses a risk for inconsistencies between the session (the /active persistence context/) and the database.
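Something like this keeps the persistence context consistent (a sketch; the entity names are borrowed from the earlier examples, the query itself is made up):

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

// 1. bulk operation first, before any entities are loaded into the session
session.createQuery( "delete Bid b where b.amount < :minimum" )
        .setParameter( "minimum", minimumAmount )
        .executeUpdate();

// 2. only now start pulling entities into the session
Item item = (Item) session.load( Item.class, itemId );
// ... work with the managed entity ...

tx.commit();
session.close();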

Hibernate also offers, as do most ORM tools, a shared cache (the second level cache). Executing Bulk Operations also poses a risk of inconsistencies between the shared cache and the database. Hibernate actually takes the responsibility of managing this risk for you. Upon completion of a Bulk Operation, Hibernate invalidates any needed region(s) within the shared cache to maintain consistency. It has to be done through invalidation because the UPDATE or DELETE is executed solely on the database server; thus Hibernate has no idea about the ids of any affected entities, nor (in the case of updates) what the new state might be.

Conclusion

Bulk Operations are complementary to the functionality provided by ORM tools. Especially in the case of batch processes, Bulk Operations coupled with the new StatelessSession functionality (available in releases after 3.1 beta 1) offer a more performant alternative to the normal row-based ORM focus.

This-n-that

Entities which are contained across multiple tables (not counting associations) cause particular challenges that I'll blog about later.

Have a look at the reference manual for discussion of these Bulk Operations within HQL.

For those of you familiar with ANTLR and its grammar definitions, the authoritative source for what is supported by HQL is the grammar files themselves.

Hibernate in Action Second Edition and EJB3


The first edition of Hibernate in Action has spread quite successfully. On training or consulting somewhere on-site, I often see people with a copy on their desk. And it has proven to be invaluable to me (and others at JBoss) to bring a few copies along every time. There is simply no better additional training material than a professionally edited full-length book. The only downside is that it only covers Hibernate 2.x.

Soon after the release of Hibernate in Action about a year ago we thought about an update. After all, development on Hibernate3 had already started and we knew that interesting stuff would happen in EJB3 persistence as well. I've mentioned a second edition a few times on the forum but we haven't been very specific about release dates and new or updated content - hence this blog entry to keep everybody up-to-date. The reason why we were quiet for some time is that we simply had to finish Hibernate3 first, an effort that was completed only a few months ago. But also the EJB3 specification and its influence on Hibernate had to be watched before we could start updating the book. Since Hibernate 3.0 is now long stable and even 3.1 is already on the horizon, and with EJB 3.0 available in public draft, we can continue updating the manuscript for Hibernate in Action, Second Edition.

I guess most of you first want to know when it is going to be available. Both Hibernate 3.1 and EJB 3.0 are being finalized (despite the current alpha tag on Hibernate 3.1, it's soon feature complete and not a big release anyway) but some things might still change. Usually these minor changes have not much impact on development but can make whole sections of documentation obsolete. After discussing the issue with our editor and publisher at Manning, we think that the updated edition can be available end of September 2005, or early in Q4. As always, the eBook edition might be available earlier than the print version.

We'll update the book for Hibernate3 and EJB3, and based on the feedback we got from readers and during training (our first Hibernate training last year was following the books structure) we'll make some major changes:

  • the Toolset chapter will be removed and integrated into a new beginners tutorial that also shows the new Eclipse-based and Ant tools, with a hands-on basic project setup
  • a new chapter will be added with best practices, patterns, and general tips & tricks - this will include a lot of FAQs from the forum and our customers, such as caching tricks, metadata-driven applications, dealing with large values, complex deployment scenarios, etc.
  • more illustrations will be included with many mapping examples

So, you can expect quite a lot of new content, especially wrt EJB3 API usage for all of you who want to learn the new interfaces and lifecycle (it's easy if you know Hibernate...) and more best practices.

We also have an updated version of CaveatEmptor for the second edition. I've packaged an alpha release you can already download. It includes a complete mapping of the domain model with EJB3/Hibernate3 annotations and ready-to-run EJB3 persistence unit tests in straightforward J2SE, using Hibernate EntityManager and Hibernate Annotations.

I'll keep you updated here and release new versions of CaveatEmptor as I work on the manuscript.

P.S. Don't miss the new EJB3 TrailBlazer tutorial for the JBoss EJB3 Application Server and send feedback to the expert group on the public draft.
