Red Hat

In Relation To Elasticsearch

In Relation To Elasticsearch

This summer was relatively quiet in terms of releases, but many have been testing and improving the Beta1 release of our Hibernate Search / Elasticsearch integration.

So today we release version 5.6.0.Beta2 with 45 fixes and enhancements!

For a detailed list of all improvements, see this JIRA query.

The day of a Final release gets closer, but highly depends on your feedback. Please keep the feedback coming!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developer’s developer’s mailing lists, or posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the following coordinates:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.6.0.Beta2</version>
</dependency>

Downloads from Sourceforge are available as well.

Notes on compatibility

This version is compatible with Apache Lucene versions from 5.3.x to 5.5.x, and with Hibernate ORM versions 5.0.x and 5.1.x.

Compatibility with Hibernate ORM 5.2.x is not a reality yet - we expect to see that materialize in early October. Compatibility with Lucene 6.x is scheduled for Hibernate Search 6.0, which will take longer - probably early 2017.

Finally, the version we used of Elasticsearch for all developing and tests of this version was Elasticsearch v. 2.3.1. We will soon upgrade this to the latest version, and discuss strategies to test against multiple versions.

After over 60 resolved tasks, we’re proud to release Hibernate Search version 5.6.0.Beta1.

The Elasticsearch integration made significant progress, and we believe it to be ready for wider usage.

Progress of the Elasticsearch integration

Improvements since the previous milestone:

MassIndexer

significant better performance as it now uses bulk operations.

Calendar, Dates, numbers and mapping details

several corrections and improvements were made to produce a cleaner schema.

Cluster state

we now wait for a newly started Elasticsearch cluster to be "green" - or optionally "yellow" - before starting to use it.

WildFly modules

a critical bug was resolved, the modules should work fine now.

Many more

for a full list of all 63 improvements, see this JIRA query.

What is missing yet?

Performance testing

we didn’t do much performance testing, it’s probably not as efficient as it could be.

Relax the expected Elasticsearch version

it’s being tested with version 2.3.1 but we have plans to support a wider range of versions.

Explicit refresh requests

we plan to add methods to issue an indexreader refresh request, as the changes pushed to Elasticsearch are not immediately visible by default.

Your Feedback!

we think it’s in pretty good shape, it would be great for more people to try it out and let us know what is missing and how it’s working for you.

Notable differences between using embededd Lucene vs Elasticsearch

Unless you reconfigure Hibernate Search to use an async worker, by default when using the Lucene backend after you commit a transaction the changes to the index are immediately applied and any subsequent search will "see" the changes. On Elasticsearch the default is different: changes received by the cluster are only "visible" to searches after some seconds (1 by default).

You can reconfigure Hibernate Search to force a refresh of indexes after each write operation by using the hibernate.search.default.elasticsearch.refresh_after_write configuration setting.

This setting defaults to false as that’s the recommended setting for optimal performance on Elasticsearch. You might want to set this to true to make it simpler to write unit tests, but you should take care to not rely on the synchronous behaviour for your production code.

Improvements for embedded Lucene users

While working on Elasticsearch, we also applied some performance improvements which apply to users of the traditional Lucene embedded users.

Special thanks to Andrej Golovnin, who contributed several patches to reduce allocation of objects on the hot path and improve overall performance.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.

Feedback

Feedback always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, or by sending an email to the developer’s developer’s mailing lists, or posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

Having fixed several issues and tasks since the previous milestone, it’s time to publish our third milestone towards Elasticsearch integration: Hibernate Search version 5.6.0.Alpha3 is now available!

Migration from Hibernate Search 5.5.x

Even if you’re not interested in the new Elasticsearch support, you might want to try out this version as it benefits from Apache Lucene 5.5.0.

If you ignore the new features and want to simply use Lucene in embedded mode the migration is easy, and as usual we are maintaining notes regarding relevant API changes in the Migration Guide to Hibernate Search 5.6.

Elasticsearch support progress

  • you can now use the Analyzers from Elasticsearch

  • Multiple operations will now be sent to Elasticsearch as a single batch to improve both performance and consistency

  • Spatial indexing and querying is now feature complete

  • We’ll wait for Elasticsearch to be "green" before attempting to use it at boot

  • Many improvements in the query translation

  • Error capture and reporting was improved

  • the Massindexer is working now, but is not yet using efficient bulk operations

  • the Elasticsearch extensions are now included in the WildFly modules

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.

Feedback

Feedback always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, or by sending an email to the developer’s developer’s mailing lists, or posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

During my talk at VoxxedVienna on using Hibernate Search with Elasticsearch earlier this week, there was an interesting question which I couldn’t answer right away:

"When running a full-text query with a projection of fields, is it possible to return the result as a list of POJOs rather than as a list of arrays of Object?"

The answer is: Yes, it is possible, result transformers are the right tool for this.

Let’s assume you want to convert the result of a projection query against the VideoGame entity shown in the talk into the following DTO (data transfer object):

public static class VideoGameDto {

    private String title;
    private String publisherName;
    private Date release;

    public VideoGameDto(String title, String publisherName, Date release) {
        this.title = title;
        this.publisherName = publisherName;
        this.release = release;
    }

    // getters...
}

This is how you could do it via a result transformer:

FullTextEntityManager ftem = ...;

QueryBuilder qb = ftem.getSearchFactory()
    .buildQueryBuilder()
    .forEntity( VideoGame.class )
    .get();

FullTextQuery query = ftem.createFullTextQuery(
    qb.keyword()
        .onField( "tags" )
        .matching( "round-based" )
        .createQuery(),
    VideoGame.class
    )
    .setProjection( "title", "publisher.name", "release" )
    .setResultTransformer( new BasicTransformerAdapter() {
        @Override
        public VideoGameDto transformTuple(Object[] tuple, String[] aliases) {
            return new VideoGameDto( (String) tuple[0], (String) tuple[1], (Date) tuple[2] );
        }
    } );

List<VideoGameDto> results = query.getResultList();

I’ve pushed this example to the demo repo on GitHub.

There are also some ready-made implementations of the ResultTransformer interface which you might find helpful. So be sure to check out its type hierarchy. For this example I found it easiest to extend BasicTransformerAdapter and implement the transformTuple() method by hand.

To the person asking the question: Thanks, and I hope this answer is helpful to you!

Today we’re proud to announce the first Alpha release sporting experimental integration with Elasticsearch!

We also updated to use Apache Lucene 5.5 - the latest stable release of our favourite search engine - as of course we’re not abandoning our traditional users!

What is the Hibernate Search / Elasticsearch integration?

Both Hibernate Search and Elasticsearch can do much more, but for the purpose of explaining this integration at an high level I’ll simplify their definitions as:

Hibernate Search

is a popular extension of the super popular Hibernate ORM framework, which makes it easy to index and query your entities using Apache Lucene.

Elasticsearch

is a very popular REST server which wraps the capabilities of Apache Lucene and makes it easier to scale this service horizontally.

Until today when using Hibernate Search you’d be using Apache Lucene directly, in what we will now call "embedded mode". In this mode a query is executed by the same process of your application, and while indexing happens in background still the overhead of such processing is happening within the same server and within the same JVM process as your Hibernate powered application.

With the Elasticsearch integration, rather than indexing your entities directly by managing the Lucene resources, we will be sending RPCs to an Elasticsearch cluster to achieve a very similar purpose: after all it is also Lucene based, so the feature match is extremely close!

This means that we’re able to transparently map all the current features to this new alternative backend, and by so doing give you more architectural choices at minimum required changes in your applications: the goal is that for most users the differences will be mostly in configuration details.

When using Elasticsearch we will need to send RPCs over the network to run queries and index updates, but on the other hand you benefit from Microservices - style decoupling and all the nice features that Elasticsearch can provide in terms of running and managing an horizontally scalable cluster.

In a nutshell…​

Elasticsearch will manage the index for you, Hibernate ORM will manage your objects and the transactions, Hibernate Search will transparently keep Elasticsearch synchronized to your database transactions and let you query your Domain Model returning managed objects, but moving the heavy lifting to an independent Elasticsearch service.

Elasticsearch integration status

This is literally being developed right now, so do not expect this to be feature complete nor reliable enough to run in a production system. Still, we already have a great set of features working so it’s a nice time to start playing with it and hopefully provide some feedback.

The documentation sports a new Elasticsearch chapter, which is a good place to start.

Configuring the Elasticsearch backend

Dependencies

You’ll need a new dependency, named org.hibernate:hibernate-search-backend-elasticsearch.

The Maven dependencies for Hibernate Search with Elasticsearch:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-backend-elasticsearch</artifactId>
   <version>5.6.0.Alpha2</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.6.0.Alpha2</version>
</dependency>

Configuration options

Point it to your Elasticsearch node

hibernate.search.elasticsearch.host http://127.0.0.1:9200

Enable the backend to be used for all your indexed entities

hibernate.search.default.indexmanager elasticsearch

Allow it to destroy and create indexes

hibernate.search.elasticsearch.index_management_strategy CREATE_DELETE

For details about these configuration options see the Reference documentation.

Updating the indexes

As with Lucene in embedded mode, the indexes are updated automatically when you create or update entities which are mapped to Hibernate Search using the same annotations already familiar from our traditional index mapping (see Mapping entities to the index structure).

Running a query on an Elasticsearch mapped entity

In many cases the existing way (see Querying) of running queries should work: we do automatically translate the most common types of Apache Lucene queries and many of the queries generated by the Hibernate Search DSL.

On top of translating Lucene queries, you can directly create Elasticsearch queries by using either its String format or a JSON format:

Creating an Elasticsearch native query from a string:

FullTextSession fullTextSession = Search.getFullTextSession(session);
QueryDescriptor query = ElasticsearchQueries.fromQueryString("title:tales");
List<?> result = fullTextSession.createFullTextQuery(query, ComicBook.class).list();

Creating an Elasticsearch native query from JSON:

FullTextSession fullTextSession = Search.getFullTextSession(session);
QueryDescriptor query = ElasticsearchQueries.fromJson(
      "{ 'query': { 'match' : { 'lastName' : 'Brand' } } }");
List<?> result = session.createFullTextQuery(query, GolfPlayer.class).list();

Remaining work ahead

This is an early preview, and while we’re proud of some of the progress there are several areas which still need much coding. On the other hand, implementing some of these is not very hard: this might be the perfect time to join the project.

Please check with JIRA and the mailing lists for updates, but at the time of writing this at least the following features are known to not work yet:

  • Analyzer definitions are not being applied

  • Spatial queries need more work

  • Filters can’t be applied yet

  • Faceting is mostly implemented

  • Scheduled index optimisation is not applied

  • Query timeouts

  • Delete by queries

  • Resolution for Date type mapping is ignored

  • Scrolling on large results won’t work yet

  • MoreLikeThis queries

  • Mixing Lucene based indexes and Elasticsearch based indexes

Any aspect related to performance and efficiency will also be looked at only at the end of basic feature development.

API Changes

In the 5.x series we will keep backward compatibility.

That might come at a cost of not perfect Hibernate Search / Elasticsearch integration API wise. This is something we will address in the 6.x series. But our focus is on offering the right set of features and get feedback in 5.x before improving the APIs.

In a nutshell, 6.x will depend on how you use this feature in 5.6.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well, but these don’t contain the Elasticsearch integration components yet. Similarly the WildFly modules also are not including the new Elasticsearch extensions yet.

Feedback

Feedback always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, or by sending an email to the developer’s developer’s mailing lists, or posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

back to top