
I'm the person behind annotations in Hibernate: Hibernate Annotations, Hibernate EntityManager, Hibernate Validator and Hibernate Search. I am a member of the JPA 2.0 expert group as well as the JSR-303 Bean Validation spec lead. You can check out my book, Hibernate Search in Action, published by Manning.

Location: CET
Occupation: Core developer at JBoss, by Red Hat

Today is a big day. The first release of Hibernate OGM with final status. Ever! Don't be fooled by the 4.1 number. Hibernate OGM is an object mapper for various NoSQL stores and offers the familiar JPA APIs. This final version offers mapping for MongoDB, Neo4J, Infinispan and Ehcache.

It has been a long journey to reach this release, much longer than we thought. And there is a long journey ahead of us to implement our full (and exciting!) vision. But today is the time to celebrate: download this puppy and try it out.

What is Hibernate OGM?

Hibernate OGM is an object mapper. It persists object graphs into your NoSQL datastore.

We took great care to map the object structures in the most natural way possible; we considered the best practices for each of the NoSQL stores we support. Storing an association in a document store is vastly different from storing the same association in a graph database.

// Unidirectional one to many association

// Basket
{
  "_id" : "davide_basket",
  "owner" : "Davide",
  "products" : [ "Beer", "Pretzel" ]
}

// Products
{
  "_id" : "Pretzel",
  "description" : "Glutino Pretzel Sticks"
}
{
  "_id" : "Beer",
  "description" : "Tactical nuclear penguin"
}
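
For reference, here is a minimal sketch of JPA entities that could produce the document layout above; the class and property names are assumed from the JSON, not taken from a real schema.

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.OneToMany;

// Basket.java - owns the unidirectional one-to-many
@Entity
public class Basket {
    @Id
    private String id;

    private String owner;

    // stored as an array of Product ids inside the Basket document
    @OneToMany
    private List<Product> products;
}

// Product.java
@Entity
public class Product {
    @Id
    private String name;

    private String description;
}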

Hibernate OGM is 90% Hibernate ORM. We changed the parts that are specific to SQL and JDBC, but most of the engine remains untouched. Same power, same flexibility.

What is the API like?

Very simple. It's JPA. Or Hibernate ORM. Map your entities with JPA annotations (or via XML), then use JPA or the Hibernate native APIs to manipulate your objects.

@PersistenceContext EntityManager em;

// the transaction boundary is really here to express the flush time
@Transactional
public void createSomeUser() {
    Employer redHat =
        em.createQuery("from Employer e where e.name = :name")
        .setParamater("name", "Red Hat")
        .getSingleResult();
    User emmanuel = new User("Emmanuel", "Bernard");
    user.setTwitterHandle("emmanuelbernard");
    user.setEmployer(redHat);
    em.persist(user);
}

Our goal is to have a zero barrier of entry to NoSQL object mappers for people familiar with JPA or Hibernate ORM.

Hibernate OGM also has a flexible option system that lets you customize NoSQL store specifics and mapping options. For example, which MongoDB write concern should be used for a given entity (see the code example below), or should associations be stored in the owning entity's document?

@Entity
@WriteConcern(JOURNALED) // MongoDB write concern
public class User {
    ...
}

And queries?

We cannot talk about JPA without mentioning JP-QL. Offering JP-QL support is challenging at many levels. To mention only two, joins usually don't exist in NoSQL, and each store has a very different set of query capabilities.

Hibernate OGM can convert JP-QL queries to the underlying native query language of the datastore. This functionality is still limited, however. Besides, some queries will never map to JP-QL. So we also let you write native queries specific to your NoSQL store and map the results to managed entities.
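
For instance, a regular JP-QL query like the following is translated by Hibernate OGM into the store's native query language where possible (the Poem entity is borrowed from the native query example below):

// JP-QL query, converted by Hibernate OGM to the datastore's native query
List<Poem> poems = em.createQuery( "select p from Poem p where p.author = :author", Poem.class )
    .setParameter( "author", "Oscar Wilde" )
    .getResultList();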

// native query using Cypher
String query = "MATCH ( n:Poem { name:'Portia', author:'Oscar Wilde' } ) RETURN n";
Poem poem = (Poem) em.createNativeQuery( query, Poem.class ).getSingleResult();

Where can I use Hibernate OGM?

It works anywhere Hibernate ORM or any JPA provider works: Java SE, Java EE, all should be good. We do require JPA 2.1 though. If you use WildFly (8.2), we have a dedicated module to make things even easier.
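
In a plain Java SE environment, bootstrapping looks like any other JPA provider. A minimal sketch, assuming a hypothetical persistence unit named "ogm-pu" whose persistence.xml declares Hibernate OGM's persistence provider (org.hibernate.ogm.jpa.HibernateOgmPersistence) and the target datastore:

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

// "ogm-pu" is a hypothetical persistence unit name
EntityManagerFactory emf = Persistence.createEntityManagerFactory( "ogm-pu" );
EntityManager em = emf.createEntityManager();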

For which NoSQL store?

MongoDB, Neo4J, Infinispan and Ehcache are the ones we consider stable. We are working on CouchDB and Cassandra. But really, any motivated person can try to map other NoSQL stores: that's how a few of them got started. We have an API that has proven flexible enough so far.

Can Hibernate OGM do X, Y, Z?

Probably. Maybe not.

The best thing is to talk to us on our forums, check our documentation (we spent a lot of time on it), and simply give it a try.

Generally, the mapping support is complete. Our query support is still a bit limited compared to where we want it to be. It will improve quickly now that the foundations are here.

We want to know what you need out of Hibernate OGM, which features you miss, and which ones should be changed. Come and talk to us on our forum or anywhere you can find us.

How to get started?

Most of what you need is available on our website. There is a getting started guide and the more complete reference documentation. Get the full Hibernate OGM distribution. And last but not least, for any help, reach us via our forum.

It would be impossible to mention all the people who contributed to Hibernate OGM and how: conversations, support, code, documentation, bootstrapping new datastore providers... Many thanks to all of you for making this a reality.

We are not done yet, far from it. We have plenty of ideas on where we want to bring Hibernate OGM. That's a discussion for another day.

After a huge push, we are now one release away from our Final version. So without further ado, I present to you Hibernate OGM 4.1.0.CR1. To sum it up:

  • stable mapping for each of the supported datastores (Infinispan, Ehcache, MongoDB, Neo4J)
  • a new and better one-cache-per-entity structure for key/value stores
  • improvements in Neo4J and MongoDB around embedded objects and composite ids
  • better documentation

Our goal for Hibernate OGM 4.1 is to offer a good Object Mapper for each of the primary datastores we target. Go test it before the final and give us feedback!

Mapping stability and documentation

This CR release signals the stable version of how we persist each data structure on the various datastores. We strive to offer a mapping that is natural to each individual datastore. We made some final improvements and are confident we can support this version.

For each datastore, we documented how each JPA mapping is persisted (entities, *-to-one, *-to-many, embedded ids, etc.). It makes for tree-killing documentation, but it shows the ground truth for each mapping.

We took the opportunity to improve the documentation even further and plan to finish that work for the final version.

Additional key/value cache structure

Our tests have shown that storing each entity type and association in a dedicated cache is actually more efficient than sharing the same cache for all entities. Since we also think it is a more natural mapping, we now offer this option and make it the default.

User and Address entities would lead to the following caches:

  • User: contains the users
  • Address: contains the addresses
  • associations_User_Address: contains the navigation from a user to its list of addresses
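
For illustration, a minimal sketch of the kind of mapping behind this example (names and details assumed):

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;

// User.java - entities are stored in the User cache
@Entity
public class User {
    @Id @GeneratedValue
    private Long id;

    // navigation stored in the associations_User_Address cache
    @OneToMany
    private List<Address> addresses;
}

// Address.java - entities are stored in the Address cache
@Entity
public class Address {
    @Id @GeneratedValue
    private Long id;

    private String city;
}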

An interesting side effect is that it makes the keys smaller in size and faster to compare. Both Infinispan and Ehcache benefit from this.

Improvements around embedded objects, embedded ids and properties

In Neo4J, (non-id) embedded objects are now represented as individual nodes. This is more in line with how graph databases connect data.

In MongoDB, embedded id foreign keys have been improved and are now represented as nested documents, as embedded ids already were. We did not hear complaints about this one, so we think you guys don't use composite ids with MongoDB. That's good, keep doing that :)
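
As a reminder, a composite id in JPA looks like the following minimal sketch (hypothetical News/NewsId names); it is this kind of id that now maps to a nested document in MongoDB:

import java.io.Serializable;
import javax.persistence.Embeddable;
import javax.persistence.EmbeddedId;
import javax.persistence.Entity;

@Embeddable
public class NewsId implements Serializable {
    private String author;
    private String title;
}

@Entity
public class News {
    // persisted as a nested document by the MongoDB backend
    @EmbeddedId
    private NewsId id;

    private String content;
}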

Null properties are no longer stored in any of the datastores. While this might make some queries involving null values a bit harder, it is the more natural mapping for these datastores.

Go, go, go

This post is about validation, Bean Validation specifically, and what it solves. It is a reaction to a post by Julien Tournay which basically claims - if I sum it up - that:

  • Scala is faster than Java (he is coming from far far away on that one - du diable vauvert, as we say in French)
  • Bean Validation is kind of crap
  • The new Play Unified Validation API is awesome

I usually am old and wise enough not to answer these kinds of posts. But I want to feel the blood of my youth again, damn it! After all, I am from an era when The Server Side was all the rage, trolls had real teeth, and things were black or white. That was good fun.

Why the "You missed the point like Mars Climate Orbiter" title? With a sensationalist title like "Scala is faster than Java", I had to step up my game. Plus, Mars Climate Orbiter missed Mars due to a conversion error, and we will talk about conversion today.

I won't refute things point by point but rather highlight a few fundamental misunderstandings and explain why things are like they are in Bean Validation and why it makes sense.

Java was limiting

IMHO, @emmanuelbernard a créer son API avec les outils a sa disposition - @skaalf
IMHO, @emmanuelbernard has created his API with the tools he had at his disposal - @skaalf

We certainly pushed the boundaries of Java when both CDI and Bean Validation were designed, and I wished I had more freedom here and there. But no, Bean Validation is the way it is because that's the most natural way validation is expressed in the Java ecosystem for what we wanted to achieve. It is no good to offer something that does not fit into the ecosystem.

For example, in Java, people use mutable objects. Mutable, believe it or not, is not a swear word in this language. My understanding is that the Play Unified Validation API makes use of the Scala community's inclination toward immutability.

Validation, conversion and marshalling

In the use case Julien takes as an example, a JSON string is unmarshalled, conversion is applied (for dates - darn JSON) and then values are validated. He considers it weird that Bean Validation only does the latter and finds the flow wrong.

We kept this separation on purpose. The expert group vastly agreed that Bean Validation was not in the business of conversion and that these steps ((un)marshalling, conversion, validation) should be separated. Here is a key reason:

  • marshalling and conversion only happen at the Java boundaries (web frameworks, service endpoints, datastore endpoints, etc.)
  • validation happens at these boundaries but also at key lifecycle events within the Java boundaries

Conceptually, separating validation from the rest makes a lot of sense to enable technology composition and avoid repetitions - more on that latter point later. What is important is that for a given boundary, they are properly integrated. It's not unlike inlining in compilation.

Bean Validation is in the business of not being called

One key point in the design of Bean Validation is that it has been built to be integrated within a cohesive stack more than to be used individually.

JPA transparently calls Bean Validation on your objects at the right time. JSF calls Bean Validation transparently before populating the beans. JAX-RS calls Bean Validation on the inbound and outbound resources transparently. Note that these three technologies do address the marshalling and conversion problem already. That's their job.

So the key is that an application rarely has to call the Bean Validation API. I consider it a failure or an unfinished work when it has to.

If you need to explicitly validate some piece of data in a fire-and-forget way, Bean Validation might not be the best tool. And that's OK.

Back to the JSON case. When the JSON Binding spec is out, we will have the same integration that is currently present in Play's unified parsing, conversion and validation API (it could also be done today in Jackson; I'm not sure what extension points this library offers). While the three layers - marshalling, conversion, validation - will remain conceptually separated, and I'm sure will report different types of errors for better tracking, the implementation will use specific APIs of Bean Validation to inline validation with the unmarshalling and conversion. That API exists, by the way: it is Validator.validateValue, which lets you validate a value without creating the associated POJO. It is already used by web frameworks. Pretty much what the Play Unified Validation API does.
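
A minimal sketch of that API; the User class and its constraint are hypothetical, but Validator.validateValue is the standard Bean Validation entry point:

import java.util.Set;
import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.Validator;
import javax.validation.constraints.Pattern;

// hypothetical bean, used here only as constraint metadata
class User {
    @Pattern(regexp = "[a-z0-9_]{1,15}")
    String twitterHandle;
}

Validator validator = Validation.buildDefaultValidatorFactory().getValidator();

// validate a raw value against User.twitterHandle without instantiating User
Set<ConstraintViolation<User>> violations =
    validator.validateValue( User.class, "twitterHandle", "not a handle!" );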

As for JAXB and XML binding, well, let's say that there is an X in it and that it's not safe for children. More seriously, we have plans to integrate by mapping between XSD and Bean Validation constraints, but we haven't gotten around to it yet.

Bean Validation is in the business of declaring constraints once

From what I see of the Play Unified Validation API, you put the declaration / implementation of the validation next to the marshalling logic. In other words, you cannot share the constraint declaration between different marshalling operations or even between object transformations.

Bean Validation has been designed to let the user declare the constraints once and have them validated across the whole stack. From the web form and service entry points down to the database input/output and schema definition. Some even use our metadata facility to propagate the constraints up to the client side (JavaScript).
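
For example (a minimal sketch, with names assumed), a constraint declared once on the domain model is then enforced by JPA before flush, by JSF when populating the bean, and by JAX-RS on inbound resources:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

@Entity
public class Employer {
    @Id
    private Long id;

    // declared once, validated across the whole stack
    @NotNull
    @Size(max = 100)
    private String name;
}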

And that's future-proof: when we add JSON-B support, boom, the constraints already declared are used automatically and for free.

Conclusion

Bean Validation cannot be understood if you take it in isolation. It is useful and works in isolation for sure but it shines when integrated in a platform from top to bottom. We did and keep doing it in Java EE but we also make sure to keep our APIs and SPIs open for other platforms to do the same. Spring famously has been doing some of it.

Let's mention a few key features of Bean Validation that are not addressed at all in Play's approach:

  • i18n
  • constraint inheritance
  • method validation
  • custom and context sensitive programmatic error report
  • partially valid object graph - yes that's important in a mutable world
  • etc

Even a few of these would make Play's stuff much much different.

Now, with this context properly set, if you read back Julien's complaints about Bean Validation (especially about its design), they pretty much fall one by one. I have to admit their composition approach is much nicer than Bean Validation's, which relies on annotated annotations. That's the main thing I miss.
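
For the curious, here is what composition via annotated annotations looks like; @ValidHandle is a hypothetical composed constraint:

import static java.lang.annotation.ElementType.FIELD;
import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;

import java.lang.annotation.Retention;
import java.lang.annotation.Target;
import javax.validation.Constraint;
import javax.validation.Payload;
import javax.validation.ReportAsSingleViolation;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Pattern;

// the composing constraints are annotations placed on the annotation itself
@NotNull
@Pattern(regexp = "[a-z0-9_]{1,15}")
@ReportAsSingleViolation
@Constraint(validatedBy = { })
@Target({ METHOD, FIELD })
@Retention(RUNTIME)
public @interface ValidHandle {
    String message() default "invalid handle";
    Class<?>[] groups() default { };
    Class<? extends Payload>[] payload() default { };
}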

Design is choice. Without knowledge of the designer's choices, you can too easily misunderstand her design and motives. Julien has been comparing apples with oranges, which is ironic for someone working on one of the most type-safe languages out there :o)

But it's good that more people take data validation holistically and seriously and step up the game. Less work for app developers, safer apps, happier customers. Judging by the location, I could even share some thoughts over a few rounds of whisky with Julien :)

Peace.

Time for a new release of Hibernate OGM.

On the new feature side, we have added integration modules for both:

But the bulk of the work has been around polishing and fixing issues based on your feedback. In a nutshell, we polished MongoDB's support (security, native queries) and Neo4J's. We fixed a few bugs around collections and query support. We kept improving the option framework, which we use to define backend-specific behaviors.

How to try it

You can take a look at the documentation or check how you can download Hibernate OGM 4.1.0.Beta2.

Many thanks to all the contributors that made this release possible whether it be via pull requests, bug reports or feedback on the forum.

One of the benefits of rewriting our Hibernate website is that we made our roadmaps more prominent. And in the case of Hibernate OGM, we had to write it :)

Hibernate OGM, in a nutshell, is JPA for NoSQL backends (more here).

Gunnar and Davide have made significant forays into the feature set we wanted for our first stable version. This has made writing the roadmap much easier. You can get all the (up to date) details on Hibernate OGM's roadmap page. Here I will sum up what is between us and our first final version.

Option system: offer the ability to define options per property, per association, per entity, or globally. Options will be offered via annotations and via a programmatic API. This system will enable declarative denormalization, backend-specific optimization settings, and so on.

CouchDB support: offer CRUD support for CouchDB

Performance check: we keep an eye on performance, but a dedicated round of optimization will happen.

Better navigation with Neo4J: our initial mapping works but we know we can access associations in a faster way.

Batch changes: use the backend batching capability if it exists.

Query support for known backends: bring our current query support to Neo4J and CouchDB as much as possible.

Cache query plans: it's all in the title :) Hibernate ORM already does that; we need to catch up.

I'm sure we will have to chop off a few additional tasks here and there but our goal is to stay focused on these as much as we can.
