NoSQL with Hibernate OGM 101: Persisting your first entities

The first final version of Hibernate OGM is out and we’ve recovered a bit from the release frenzy. So we thought it’d be a good idea to begin the new year with a tutorial-style blog series, which shows how to get started with Hibernate OGM and what it can do for you. Just in case you missed the news, Hibernate OGM is the newest project under the Hibernate umbrella and allows you to persist entity models in different NoSQL stores via the well-known JPA.

We’ll cover these topics in the following weeks:

Persisting your first entities (this instalment)
Querying for your data
Running on WildFly
Running with CDI on Java SE
Store data into two different stores in the same application

If you’d like us to discuss any other topics, please let us know. Just add a comment below or tweet your suggestions to us.

In this first part of the series we are going to set up a Java project with the required dependencies, create some simple entities and write/read them to and from the store. We’ll start with the Neo4j graph database and then we’ll switch to the MongoDB document store with only a small configuration change.

Project set-up

Let’s first create a new Java project with the required dependencies. We’re going to use Maven as a build tool in the following, but of course Gradle or others would work equally well.

Add this to the dependencyManagement block of your pom.xml:

...
<dependencyManagement>
    <dependencies>
        ...
        <dependency>
            <groupId>org.hibernate.ogm</groupId>
            <artifactId>hibernate-ogm-bom</artifactId>
            <type>pom</type>
            <version>4.1.1.Final</version>
            <scope>import</scope>
        </dependency>
            ...
    </dependencies>
</dependencyManagement>
...

This will make sure that you are using matching versions of the Hibernate OGM modules and their dependencies. Then add the following to the dependencies block:

...
<dependencies>
    ...
    <dependency>
        <groupId>org.hibernate.ogm</groupId>
        <artifactId>hibernate-ogm-neo4j</artifactId>
    </dependency>
    <dependency>
        <groupId>org.jboss.jbossts</groupId>
        <artifactId>jbossjta</artifactId>
    </dependency>
    ...
</dependencies>
...

The dependencies are:

The Hibernate OGM module for working with an embedded Neo4j database; This will pull in all other required modules such as Hibernate OGM core and the Neo4j driver. When using MongoDB, you’d swap that with hibernate-ogm-mongodb.
JBoss' implementation of the Java Transaction API (JTA), which is needed when not running within a Java EE container such as WildFly

The domain model

Our example domain model is made up of three classes: Hike, HikeSection and Person.

There is a composition relationship between Hike and HikeSection, i.e. a hike comprises several sections whose life cycle is fully dependent on the Hike. The list of hike sections is ordered; This order needs to be maintained when persisting a hike and its sections.

The association between Hike and Person (acting as hike organizer) is a bi-directional many-to-one/one-to-many relationship: One person can organize zero ore more hikes, whereas one hike has exactly one person acting as its organizer.

Mapping the entities

Now let’s map the domain model by creating the entity classes and annotating them with the required meta-data. Let’s start with the Person class:

@Entity
public class Person {

    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid2")
    private long id;

    private String firstName;
    private String lastName;

    @OneToMany(mappedBy = "organizer", cascade = CascadeType.PERSIST)
    private Set<Hike> organizedHikes = new HashSet<>();

    // constructors, getters and setters...
}

The entity type is marked as such using the @Entity annotation, while the property representing the identifier is annotated with @Id.

Instead of assigning ids manually, Hibernate OGM can take care of this, offering several id generation strategies such as (emulated) sequences, UUIDs and more. Using a UUID generator is usually a good choice as it ensures portability across different NoSQL datastores and makes id generation fast and scalable. But depending on the store you work with, you also could use specific id types such as object ids in the case of MongoDB (see the reference guide for the details).

Finally, @OneToMany marks the organizedHikes property as an association between entities. As it is a bi-directional entity, the mappedBy attribute is required for specifying the side of the association which is in charge of managing it. Specifying the cascade type PERSIST ensures that persisting a person will automatically cause its associated hikes to be persisted, too.

Next is the Hike class:

@Entity
public class Hike {

    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid2")
    private String id;

    private String description;
    private Date date;
    private BigDecimal difficulty;

    @ManyToOne
    private Person organizer;

    @ElementCollection
    @OrderColumn(name = "sectionNo")
    private List<HikeSection> sections;

    // constructors, getters and setters...
}

Here the @ManyToOne annotation marks the other side of the bi-directional association between Hike and Organizer. As HikeSection is supposed to be dependent on Hike, the sections list is mapped via @ElementCollection. To ensure the order of sections is maintained in the datastore, @OrderColumn is used. This will add one extra "column" to the persisted records which holds the order number of each section.

Finally, the HikeSection class:

@Embeddable
public class HikeSection {

    private String start;
    private String end;

    // constructors, getters and setters...
}

Unlike Person and Hike, it is not mapped via @Entity but using @Embeddable. This means it is always part of another entity (Hike in this case) and as such also has no identity on its own. Therefore it doesn’t declare any @Id property.

Note that these mappings looked exactly the same, had you been using Hibernate ORM with a relational datastore. And indeed that’s one of the promises of Hibernate OGM: Make the migration between the relational and the NoSQL paradigms as easy as possible!

Creating the persistence.xml

With the entity classes in place, one more thing is missing, JPA's persistence.xml descriptor. Create it under src/main/resources/META-INF/persistence.xml:

<?xml version="1.0" encoding="utf-8"?>

<persistence xmlns="http://java.sun.com/xml/ns/persistence"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd"
    version="2.0">

    <persistence-unit name="hikePu" transaction-type="RESOURCE_LOCAL">
        <provider>org.hibernate.ogm.jpa.HibernateOgmPersistence</provider>

        <properties>
            <property name="hibernate.ogm.datastore.provider" value="neo4j_embedded" />
            <property name="hibernate.ogm.datastore.database" value="HikeDB" />
            <property name="hibernate.ogm.neo4j.database_path" value="target/test_data_dir" />
        </properties>
    </persistence-unit>
</persistence>

If you have worked with JPA before, this persistence unit definition should look very familiar to you. The main difference to using the classic Hibernate ORM on top of a relational database is the specific provider class we need to specify for Hibernate OGM: org.hibernate.ogm.jpa.HibernateOgmPersistence.

In addition, some properties specific to Hibernate OGM and the chosen back end are defined to set:

the back end to use (an embedded Neo4j graph database in this case)
the name of the Neo4j database
the directory for storing the Neo4j database files

Depending on your usage and the back end, other properties might be required, e.g. for setting a host, user name, password etc. You can find all available properties in a class named <BACK END>Properties, e.g. Neo4jProperties, MongoDBProperties and so on.

Saving and loading an entity

With all these bits in place its time to persist (and load) some entities. Create a simple JUnit test shell for doing so:

public class HikeTest {

    private static EntityManagerFactory entityManagerFactory;

    @BeforeClass
    public static void setUpEntityManagerFactory() {
        entityManagerFactory = Persistence.createEntityManagerFactory( "hikePu" );
    }

    @AfterClass
    public static void closeEntityManagerFactory() {
        entityManagerFactory.close();
    }
}

The two methods manage an entity manager factory for the persistence unit defined in persistence.xml. It is kept in a field so it can be used for several test methods (remember, entity manager factories are rather expensive to create, so they should be initialized once and be kept around for re-use).

Then create a test method persisting and loading some data:

@Test
public void canPersistAndLoadPersonAndHikes() {
    EntityManager entityManager = entityManagerFactory.createEntityManager();

    entityManager.getTransaction().begin();

    // create a Person
    Person bob = new Person( "Bob", "McRobb" );

    // and two hikes
    Hike cornwall = new Hike(
            "Visiting Land's End", new Date(), new BigDecimal( "5.5" ),
            new HikeSection( "Penzance", "Mousehole" ),
            new HikeSection( "Mousehole", "St. Levan" ),
            new HikeSection( "St. Levan", "Land's End" )
    );
    Hike isleOfWight = new Hike(
            "Exploring Carisbrooke Castle", new Date(), new BigDecimal( "7.5" ),
            new HikeSection( "Freshwater", "Calbourne" ),
            new HikeSection( "Calbourne", "Carisbrooke Castle" )
    );

    // let Bob organize the two hikes
    cornwall.setOrganizer( bob );
    bob.getOrganizedHikes().add( cornwall );

    isleOfWight.setOrganizer( bob );
    bob.getOrganizedHikes().add( isleOfWight );

    // persist organizer (will be cascaded to hikes)
    entityManager.persist( bob );

    entityManager.getTransaction().commit();

    // get a new EM to make sure data is actually retrieved from the store and not Hibernate's internal cache
    entityManager.close();
    entityManager = entityManagerFactory.createEntityManager();

    // load it back
    entityManager.getTransaction().begin();

    Person loadedPerson = entityManager.find( Person.class, bob.getId() );
    assertThat( loadedPerson ).isNotNull();
    assertThat( loadedPerson.getFirstName() ).isEqualTo( "Bob" );
    assertThat( loadedPerson.getOrganizedHikes() ).onProperty( "description" ).containsOnly( "Visiting Land's End", "Exploring Carisbrooke Castle" );

    entityManager.getTransaction().commit();

    entityManager.close();
}

Note how both actions happen within a transaction. Neo4j is a fully transactional datastore which can be controlled nicely via JPA’s transaction API. Within an actual application one would probably work with a less verbose approach for transaction control. Depending on the chosen back end and the kind of environment your application runs in (e.g. a Java EE container such as WildFly), you could take advantage of declarative transaction management via CDI or EJB. But let’s save that for another time.

Having persisted some data, you can examine it, using the nice web console coming with Neo4j. The following shows the entities persisted by the test:

Hibernate OGM aims for the most natural mapping possible for the datastore you are targeting. In the case of Neo4j as a graph datastore this means that any entity will be mapped to a corresponding node.

The entity properties are mapped as node properties (see the black box describing one of the Hike nodes). Any not natively supported property types will be converted as required. E.g. that’s the case for the date property which is persisted as an ISO-formatted String. Additionally, each entity node has the label ENTITY (to distinguish it from nodes of other types) and a label specifying its entity type (Hike in this case).

Associations are mapped as relationships between nodes, with the association role being mapped to the relationship type.

Note that Neo4j does not have the notion of embedded objects. Therefore, the HikeSection objects are mapped as nodes with the label EMBEDDED, linked with the owning Hike nodes. The order of sections is persisted via a property on the relationship.

Switching to MongoDB

One of Hibernate OGM's promises is to allow using the same API - namely, JPA - to work with different NoSQL stores. So let’s see how that holds and make use of MongoDB which, unlike Neo4j, is a document datastore and persists data in a JSON-like representation. To do so, first replace the Neo4j back end with the following one:

...
<dependency>
    <groupId>org.hibernate.ogm</groupId>
    <artifactId>hibernate-ogm-mongodb</artifactId>
</dependency>
...

Then update the configuration in persistence.xml to work with MongoDB as the back end, using the properties accessible through MongoDBProperties to give host name and credentials matching your environment (if you don’t have MongoDB installed yet, you can download it here):

...
<properties>
    <property name="hibernate.ogm.datastore.provider" value="mongodb" />
    <property name="hibernate.ogm.datastore.database" value="HikeDB" />
    <property name="hibernate.ogm.datastore.host" value="mongodb.mycompany.com" />
    <property name="hibernate.ogm.datastore.username" value="db_user" />
    <property name="hibernate.ogm.datastore.password" value="top_secret!" />
</properties>
...

And that’s all you need to do to persist your entities in MongoDB rather than Neo4j. If you now run the test again, you’ll find the following BSON documents in your datastore:

# Collection "Person"
{
    "_id" : "50b62f9b-874f-4513-85aa-c2f59015a9d0",
    "firstName" : "Bob",
    "lastName" : "McRobb",
    "organizedHikes" : [
        "a78d731f-eff0-41f5-88d6-951f0206ee67",
        "32384eb4-717a-43dc-8c58-9aa4c4e505d1"
    ]
}

# Collection Hike
{
    "_id" : "a78d731f-eff0-41f5-88d6-951f0206ee67",
    "date" : ISODate("2015-01-16T11:59:48.928Z"),
    "description" : "Visiting Land's End",
    "difficulty" : "5.5",
    "organizer_id" : "50b62f9b-874f-4513-85aa-c2f59015a9d0",
    "sections" : [
        {
            "sectionNo" : 0,
            "start" : "Penzance",
            "end" : "Mousehole"
        },
        {
            "sectionNo" : 1,
            "start" : "Mousehole",
            "end" : "St. Levan"
        },
        {
            "sectionNo" : 2,
            "start" : "St. Levan",
            "end" : "Land's End"
        }
    ]
}
{
    "_id" : "32384eb4-717a-43dc-8c58-9aa4c4e505d1",
    "date" : ISODate("2015-01-16T11:59:48.928Z"),
    "description" : "Exploring Carisbrooke Castle",
    "difficulty" : "7.5",
    "organizer_id" : "50b62f9b-874f-4513-85aa-c2f59015a9d0",
    "sections" : [
        {
            "sectionNo" : 1,
            "start" : "Calbourne",
            "end" : "Carisbrooke Castle"
        },
        {
            "sectionNo" : 0,
            "start" : "Freshwater",
            "end" : "Calbourne"
        }
    ]
}

Again, the mapping is very natural and just as you’d expect it when working with a document store like MongoDB. The bi-directional one-to-many/many-to-one association between Person and Hike is mapped by storing the referenced id(s) on either side. When loading back the data, Hibernate OGM will resolve the ids and allow to navigate the association from one object to the other.

Element collections are mapped using MongoDB's capabilities for storing hierarchical structures. Here the sections of a hike are mapped to an array within the document of the owning hike, with an additional field sectionNo to maintain the collection order. This allows to load an entity and its embedded elements very efficiently via a single round-trip to the datastore.

Wrap-up

In this first instalment of NoSQL with Hibernate OGM 101 you’ve learned how to set up a project with the required dependencies, map some entities and associations and persist them in Neo4j and MongoDB. All this happens via the well-known JPA API. So if you have worked with Hibernate ORM and JPA in the past on top of relational databases, it never was easier to dive into the world of NoSQL.

At the same time, each store is geared towards certain use cases and thus provides specific features and configuration options. Naturally, those cannot be exposed through a generic API such as JPA. Therefore Hibernate OGM lets you make usage of native NoSQL queries and allows to configure store-specific settings via its flexible option system.

You can find the complete example code of this blog post on GitHub. Just fork it and play with it as you like.

Of course storing entities and getting them back via their id is only the beginning. In any real application you’d want to run queries against your data and you’d likely also want to take advantage of some specific features and settings of your chosen NoSQL store. We’ll come to that in the next parts of this series, so stay tuned!

In Relation To