Red Hat

In Relation To Hibernate Search

Introducing Hibernate Search Sort DSL

Tagged as Hibernate Search

With Elasticsearch support coming as a technological preview in Hibernate Search 5.6, you might think we’re leaving other features out. Well, think again! Enter the Sort DSL, which will work with Elasticsearch of course, but also with the good ol' Lucene backend.

The point here is to provide an API for building sort descriptions easily, without having to know everything about the Hibernate Search features added on top of Lucene, such as DistanceSortField. And while we’re at it, we’re making it a modern, fluid API.

Most common case: sorting by field

The QueryBuilder interface now has an additional sort() method:

QueryBuilder builder = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity( Book.class ).get();
Query luceneQuery = builder.all().createQuery();
FullTextQuery query = fullTextSession.createFullTextQuery( luceneQuery, Book.class );
Sort sort = builder
    .sort()
    .byField("author").desc() // Descending order
    .andByField("title") // Default order (ascending)
    .createSort();
query.setSort( sort );
List results = query.list();

Of course, other kinds of sort are available. Let’s have a look!

Sorting by relevance

The relevance sort is also available, with byScore(). Obviously, there’s one key difference with that one: the sort is descending by default, so you get the most relevant results (highest scores) first. If you need the least relevant results first, fear not, we’ve got you covered with byScore().asc().

Sorting by distance

If your entity has some spatial fields, you may also build spatial sorts, for instance (the field name and coordinates here are illustrative):

    Sort sort = builder
        .sort()
        .byDistance()
            .onField("location")
            .fromLatitude(40.718)
            .andLongitude(-73.997)
        .createSort();
Stabilizing with byIndexOrder()

byIndexOrder() offers an arbitrary, yet deterministic sort. This comes in handy when you want to stabilize your sort, for instance by sorting on a title field and then on index order:

    Sort sort = builder
        .sort()
        .byField("title_sort")
        .andByIndexOrder()
        .createSort();
That way, if there are two books with the same title in your index, they will always keep the same relative order from one query to another.
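
The effect is the same as adding a deterministic secondary key to a comparator. Here is a plain-Java analogy (this is not the Hibernate Search API; the titles and ids are made up, with the id playing the role of byIndexOrder() as an arbitrary but deterministic tie-breaker):

```java
import java.util.*;

public class StableSortDemo {

    // Sort "title|id" entries by title, then by id as a deterministic
    // tie-breaker, playing the role byIndexOrder() plays in the DSL.
    static List<String> sortBooks(List<String> books) {
        List<String> sorted = new ArrayList<>(books);
        sorted.sort(Comparator
            .<String, String>comparing(b -> b.split("\\|")[0])
            .thenComparing(b -> b.split("\\|")[1]));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> books = List.of("Dune|b2", "Dune|b1", "Hyperion|b3");
        // The two copies of "Dune" always come back in the same order:
        System.out.println(sortBooks(books)); // [Dune|b1, Dune|b2, Hyperion|b3]
    }
}
```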

Handling missing values

What if you’re sorting books by publishing date, and some of them haven’t been published yet? No worries, you can decide whether the unpublished books will appear first or last:

    Sort sort = builder
        .sort()
        .byField("publishingDate_sort").desc() // Most recently published first
            .onMissingValue().sortFirst() // Not published yet => put these first on the list
        .andByField("custom_id_sort") // Tie-breaker for when multiple books have no publishing date
        .createSort();
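
The behaviour is analogous to null handling with plain Java comparators. As a self-contained illustration (this is not the Hibernate Search API; null stands for "no publishing date"):

```java
import java.util.*;

public class MissingValueDemo {

    // Sort publication years descending, with null (unpublished) first,
    // mirroring onMissingValue().sortFirst() in the DSL.
    static List<Integer> sortYears(List<Integer> years) {
        List<Integer> sorted = new ArrayList<>(years);
        sorted.sort(Comparator.nullsFirst(Comparator.reverseOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        // Arrays.asList is used because List.of rejects nulls.
        List<Integer> years = Arrays.asList(1999, null, 2015, 2007);
        System.out.println(sortYears(years)); // [null, 2015, 2007, 1999]
    }
}
```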

Accessing native features

Let’s assume you’re using an external backend, such as Elasticsearch. You may want to take advantage of a brand-new feature that appeared in the latest snapshot of this backend, that feature you just spotted this morning and that would really save you a lot of trouble on your project. But, hey, the Hibernate Search team is not in the same time zone, and even if they provide fast support, you’re not getting the feature pushed into Hibernate Search in time to meet your deadline. Which is this evening, by the way.

Well, guess what: you can use that feature anyway. The sorting API also allows using native sorts. When using the Elasticsearch backend, it means passing the JSON description of this sort, which will be added to the Elasticsearch query as is:

    .byNative("", "{'order':'asc', 'mode': 'min'}")


Of course, one could point out that this API is not really backend-independent. The interfaces and methods themselves mostly are, but the returned type (Sort) is clearly bound to Apache Lucene.

Well, one day at a time: the API in its current form can be adapted to be completely backend-agnostic, so it’s paving the way to Hibernate Search 6.x, while still requiring no change to any other contract such as FullTextQuery.setSort(Sort). And that means it’s available directly in 5.6.0.Beta3!

So be sure to check it out, and to check the documentation for more information. Or, you know, since it’s a fluid API, you can simply use your IDE autocomplete feature and see what’s available!

In any case, feel free to contact us for any question, problem or simply to give us your feedback!

Today we have three releases of Hibernate Search!

I’m proud to announce that our team is a bit larger nowadays, and more contributors are volunteering too, so we have managed to increase the development pace. Today we release versions 5.6.0.Beta3, 5.7.0.Alpha1 and 5.5.5.Final.

Version 5.6.0.Beta3

the latest version of our main development branch, with experimental Elasticsearch integration.

Version 5.7.0.Alpha1

essentially the same as 5.6.0.Beta3, but compatible with Hibernate ORM version 5.2.x.

Version 5.5.5.Final

a maintenance release of our stable branch.

A 5.7 preview released when 5.6 isn’t out yet?

Let me explain: this unusual decision was taken to accommodate the needs of you all.

The 5.6 series is creating a lot of anticipation, the Elasticsearch integration being a very welcome new feature. It’s meant to be an experimental feature, as we won’t commit to stable APIs while all integration needs are being analyzed. Still, it’s taking a bit longer than expected, and even though it’s experimental we don’t want to rush it and need to finish it up properly.

In the meantime the Hibernate ORM project released the 5.2.x series, and several users have been asking for a Hibernate Search version compatible with it. We could not upgrade our 5.6 series yet, as people using an older Hibernate ORM would then no longer be able to play with the Elasticsearch integration.

So now that 5.6 is in good shape - we decided the next release will be a candidate release - we felt we could already publish a 5.7 version, which is exactly the same but lives in a new branch made compatible with the very latest Hibernate ORM.

How is the Elasticsearch integration coming?

It’s maturing at high speed. The biggest obstacles have been resolved, so we are definitely looking for more feedback at this point; as mentioned, the next version will be a candidate release.

Hibernate Search now has a proper Sorting API: watch this space as we’ll publish a dedicated blog about it, or get a peek at the query sorting paragraph in the documentation.

This is an important milestone, as it makes sorting queries on Elasticsearch possible through our DSL.

How to get these releases

All versions are available on Hibernate Search’s web site.

Ideally use a tool to fetch it from Maven central; these are the coordinates:


To use the experimental Elasticsearch integration you’ll also need:


Downloads from Sourceforge are available as well.

This summer was relatively quiet in terms of releases, but many have been testing and improving the Beta1 release of our Hibernate Search / Elasticsearch integration.

So today we release version 5.6.0.Beta2 with 45 fixes and enhancements!

For a detailed list of all improvements, see this JIRA query.

The day of a Final release gets closer, but highly depends on your feedback. Please keep the feedback coming!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developers’ mailing list, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the following coordinates:


Downloads from Sourceforge are available as well.

Notes on compatibility

This version is compatible with Apache Lucene versions from 5.3.x to 5.5.x, and with Hibernate ORM versions 5.0.x and 5.1.x.

Compatibility with Hibernate ORM 5.2.x is not a reality yet - we expect to see that materialize in early October. Compatibility with Lucene 6.x is scheduled for Hibernate Search 6.0, which will take longer - probably early 2017.

Finally, the Elasticsearch version used for all development and testing of this release was 2.3.1. We will soon upgrade to the latest version, and discuss strategies to test against multiple versions.

Hi, I’m Mincong, an engineering student from France. I’m glad to present my Google Summer of Code 2016 project, which provides an alternative to the current mass indexer implementation of Hibernate Search, using the Java Batch architecture (JSR 352). I’ve been working on this project for 4 months. Before getting started, I want to thank Google for sponsoring the project, the Hibernate team for accepting my proposal and my mentors Gunnar and Emmanuel for their help during this period. Now, let’s begin!

What is it about?

Hibernate Search brings full-text search capabilities to your Hibernate/JPA applications by synchronizing the state of your entities with a search index maintained by Lucene (or Elasticsearch as of Hibernate Search 5.6!). Index synchronization usually happens on the fly as entities are modified, but there may be cases where an entire index needs to be re-built, e.g. when enabling indexing for an existing entity type or after changes have been applied directly to the database, bypassing Hibernate (Search).

Hibernate Search provides the mass indexer for this purpose. It was the goal of my GSoC project to develop an alternative using the API for Java batch applications standardized by JSR 352.

What do we gain from JSR 352?

Implementing the mass indexing functionality using the standardized batching API allows you to use the existing tools of your runtime environment for starting/stopping and monitoring the status of the indexing process. E.g. in WildFly you can use the CLI to do so.

Also, JSR 352 provides a way to restart specific job runs. This is very useful if re-indexing of an entity type failed mid-way, for instance due to connectivity issues with the database. Once the problem is solved, the batch job will continue where it left off, without processing again the items that were already processed successfully.

As JSR 352 defines common concepts of batch-oriented applications such as item readers, processors and writers, the job architecture and workflow are very easy to follow. In JSR 352, the workflow is written in an XML file (the "job XML"), which is used to specify a job and its steps and to direct their execution. So you can understand the process without diving into the code.

<job id="massIndex">
    <step id="beforeChunk" next="produceLuceneDoc">
        <batchlet ref="beforeChunkBatchlet"/>
    </step>

    <step id="produceLuceneDoc" next="afterChunk">
        <chunk checkpoint-policy="custom">
            <reader ref="entityReader"/>
            <processor ref="luceneDocProducer"/>
            <writer ref="luceneDocWriter"/>
            <checkpoint-algorithm ref="checkpointAlgorithm"/>
        </chunk>
    </step>

    <step id="afterChunk">
        <batchlet ref="afterChunkBatchlet"/>
    </step>
</job>
As you can see, it follows a typical batch-processing workflow. Anyone with experience in ETL processes should have no difficulty understanding our new implementation.
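
For readers unfamiliar with the pattern, the read-process-write chunk loop at the heart of such a step can be sketched in a few lines of plain Java (this is an analogy, not the JSR 352 API; all names here are invented for the example):

```java
import java.util.*;
import java.util.function.Function;

public class ChunkLoopSketch {

    // A minimal read-process-write loop with a chunk size, in the spirit
    // of a JSR 352 chunk step. Each completed chunk is a checkpoint
    // boundary in the real thing.
    static <I, O> List<List<O>> runChunked(Iterator<I> reader,
                                           Function<I, O> processor,
                                           int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        List<O> buffer = new ArrayList<>();
        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));  // read + process
            if (buffer.size() == chunkSize) {
                written.add(new ArrayList<>(buffer));    // "write" the chunk
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) written.add(buffer);      // final partial chunk
        return written;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5);
        List<List<String>> chunks =
            runChunked(items.iterator(), i -> "doc-" + i, 2);
        System.out.println(chunks); // [[doc-1, doc-2], [doc-3, doc-4], [doc-5]]
    }
}
```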

Example usages

Here are some example usages of the new mass indexer in its current draft version. It allows you to add one or multiple entity types. If you have more than one root entity to index, you can use the addRootEntities(Class<?>…​) method.

How to use the new MassIndexer
long executionId = new MassIndexer()
        .addRootEntity( Company.class )
        .start();
Another example with a more customized configuration:
long executionId = new MassIndexer()
        .addRootEntities( Company.class, Employee.class )
        .cacheable( false )
        .checkpointFreq( 1000 )
        .rowsPerPartition( 100000 )
        .maxThreads( 10 )
        .purgeAtStart( true )
        .optimizeAfterPurge( true )
        .optimizeAtEnd( true )
        .start();


In order to maximize performance, we highly recommend speeding up the mass indexer using parallelism. Parallelism is activated by default. In the JSR 352 standard, the exact term is "partitioning". The indexing step may run as multiple partitions, one per thread. Each partition has its own partition ID and parameters. If there are more partitions than threads, partitions are treated as a queue to consume: each thread can only run one partition at a time and won’t take the next partition until the previous one is finished.

massIndexer = massIndexer.rowsPerPartition( 500 );
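
To make the arithmetic concrete, here is a tiny sketch in plain Java (independent of the mass indexer itself; class and method names are made up for illustration) of how a row count and rowsPerPartition translate into a number of partitions:

```java
// Illustrative only: this class is not part of Hibernate Search.
public class PartitionMath {

    // Each partition covers at most rowsPerPartition rows, so the number
    // of partitions is the row count divided by rowsPerPartition,
    // rounded up.
    static long partitionCount(long rowCount, long rowsPerPartition) {
        return (rowCount + rowsPerPartition - 1) / rowsPerPartition;
    }

    public static void main(String[] args) {
        System.out.println(partitionCount(1_000_000, 500)); // 2000
        System.out.println(partitionCount(1_001, 500));     // 3
    }
}
```

With ten threads, those 2000 partitions would be consumed ten at a time until the queue is drained.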


The mass indexer supports a checkpoint algorithm. If the job is interrupted for any reason, the mass indexer can be restarted from the last checkpoint, stored by the batch runtime. The entities already indexed won’t be lost, because they have already been flushed to the directory provider. If N is the checkpoint frequency, then a partition reaches a checkpoint every N items processed inside that partition. You can override this value to fit your business requirements.

massIndexer = massIndexer.checkpointFreq( 1000 );
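
As a back-of-the-envelope illustration (plain Java, names invented for the example), the position a restart resumes from is simply the last multiple of the checkpoint frequency that was reached:

```java
// Illustrative only: this class is not part of the mass indexer.
public class CheckpointSketch {

    // With a checkpoint every freq items, a restart resumes from the last
    // checkpoint, i.e. the largest multiple of freq not exceeding the
    // number of items processed so far in the partition.
    static long restartPosition(long itemsProcessed, long freq) {
        return (itemsProcessed / freq) * freq;
    }

    public static void main(String[] args) {
        // 2,500 items processed with checkpointFreq(1000): resume at 2,000,
        // so at most freq - 1 items are re-read after a crash.
        System.out.println(restartPosition(2_500, 1_000)); // 2000
    }
}
```

A smaller frequency means less re-reading after a failure, at the cost of more frequent checkpoint writes.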


For further usage, please check my GitHub repo gsoc-hsearch. If you want to play with it, you can download the code and build it with Maven:

$ git clone -b 1.0 git://
$ cd gsoc-hsearch
$ mvn install

Current status and next steps

Currently, the new implementation accepts different entity types as input, provides a high level of customization of the job properties, and indexes entities in parallel. The job periodically saves its progress to enable restart from the last point of consistency. Load balancing has been considered to avoid overloading any single thread. This indexing batch job is available in both Java SE and Java EE.

There are still many things to do, e.g. related to performance improvements, integration into WildFly, monitoring, more fine-grained selection of entities to be re-indexed etc. Here are some of the ideas:

  • Core: partition mapping for composite ID

  • Integration: package the batch job as a WildFly module

  • Integration: start the indexing batch job from FullTextSession and FullTextEntityManager

  • Integration: embed this project into Hibernate Search

  • Monitoring: enhance the basic monitoring, e.g. progress status for restarted job

  • Performance: ensure great performance of this implementation

These tasks are tracked as GitHub issues; you can check the complete TODO list here.


If you are using Hibernate Search and ever wished for a more standardized approach to mass indexing, this project clearly is for you.

We still need to apply some improvements and polishing before integrating it as a module into the Hibernate Search core code base, but any bug reports or comments on the project will be very helpful. So please give it a try and let us know about your feedback. Just drop a comment below or raise an issue on GitHub.

Looking forward to hearing from you!

We are making good progress on our next major release, which focuses on Elasticsearch integration, but we haven’t forgotten our beloved users of Hibernate Search 5.5.x, and here is a new stable release to prove it!

This bugfix release is entirely based on user feedback so keep it coming!

Hibernate Search version 5.5.4.Final is available now and fixes the following issues:

  • HSEARCH-2301 - CriteriaObjectInitializer is suboptimal when we query only one subtype of a hierarchy

  • HSEARCH-2286 - DistanceSortField should support reverse sorting

  • HSEARCH-2306 - Upgrade 5.5.x to Hibernate ORM 5.0.9

  • HSEARCH-2307 - Documentation shouldn’t suggest need for @Indexed of embedded association fields

A small focus on HSEARCH-2301, as it might significantly improve your performance if you index a complex hierarchy of objects. Prior to this fix, when querying the database to hydrate the objects, Hibernate Search was using the root type of the hierarchy, potentially leading to queries with a lot of joins. Hibernate Search now builds the most efficient query possible depending on the effective results.

You can see two instances of this issue on Stack Overflow here and here.


How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us in our forums or mailing lists.

We also monitor closely the hibernate-search tag on Stack Overflow.

After over 60 resolved tasks, we’re proud to release Hibernate Search version 5.6.0.Beta1.

The Elasticsearch integration made significant progress, and we believe it to be ready for wider usage.

Progress of the Elasticsearch integration

Improvements since the previous milestone:


significantly better performance, as it now uses bulk operations.

Calendar, Dates, numbers and mapping details

several corrections and improvements were made to produce a cleaner schema.

Cluster state

we now wait for a newly started Elasticsearch cluster to be "green" - or optionally "yellow" - before starting to use it.

WildFly modules

a critical bug was resolved; the modules should work fine now.

Many more

for a full list of all 63 improvements, see this JIRA query.

What is still missing?

Performance testing

we didn’t do much performance testing yet, so it’s probably not as efficient as it could be.

Relax the expected Elasticsearch version

it’s being tested with version 2.3.1 but we have plans to support a wider range of versions.

Explicit refresh requests

we plan to add methods to issue an index reader refresh request, as changes pushed to Elasticsearch are not immediately visible by default.

Your Feedback!

we think it’s in pretty good shape; it would be great for more people to try it out and let us know what is missing and how it’s working for you.

Notable differences between using embedded Lucene vs Elasticsearch

Unless you reconfigure Hibernate Search to use an async worker, with the Lucene backend the changes to the index are applied as soon as you commit a transaction, and any subsequent search will "see" them. On Elasticsearch the default is different: changes received by the cluster only become "visible" to searches after some seconds (1 by default).

You can reconfigure Hibernate Search to force a refresh of indexes after each write operation by using the configuration setting.

This setting defaults to false as that’s the recommended setting for optimal performance on Elasticsearch. You might want to set this to true to make it simpler to write unit tests, but you should take care to not rely on the synchronous behaviour for your production code.

Improvements for embedded Lucene users

While working on Elasticsearch, we also applied some performance improvements which benefit users of the traditional embedded Lucene backend.

Special thanks to Andrej Golovnin, who contributed several patches to reduce allocation of objects on the hot path and improve overall performance.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.


Feedback always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developers’ mailing list, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

While the team has been busy implementing great new features such as the Elasticsearch integration for the next 5.6 release, some of you provided interesting feedback on our stable release.

The summary of the feedback I heard is that migrating to the new sorting requirements can be confusing, and there were some issues with our Faceting implementation.

Hibernate Search version 5.5.3.Final is available now, fixing the reported issues and improving the error messages around sorting.

The changelog is rather small, so this time I’ll post it verbatim:

  • HSEARCH-1917 - Cannot index null or empty values for faceted fields

  • HSEARCH-2082 - Documentation refers to @SortField when it should be @SortableField

  • HSEARCH-2085 - Typo in hibernate-search-engine logger

  • HSEARCH-2086 - Long and Date range faceting doesn’t honor hasZeroCountsIncluded

  • HSEARCH-2179 - Hanging during shutdown of SyncWorkProcessor

  • HSEARCH-2193 - LuceneBackendQueueTask does not release the Directory lock on update failures

  • HSEARCH-2200 - Typo in log message

  • HSEARCH-2240 - Parallel service lookup might fail to find the service

  • HSEARCH-2199 - Allows the use of CharFilter in the programmatic API of SearchMapping

  • HSEARCH-2084 - Upgrade to WildFly 10.0.0.Final

  • HSEARCH-2089 - Ensure the performance tests do not use the WildFly embedded version of Search

  • HSEARCH-1951 - Improve resulting error message when applying the wrong Sort Type

  • HSEARCH-2090 - Using the wrong header in the distribution/pom.xml

  • HSEARCH-2241 - Clarify deprecation of setFilter() method on FullTextQuery

Spot inefficient sorting operations easily in test suites

While Hibernate Search would already log a warning when forced to perform a query using a sub-optimal sorting strategy, that didn’t make it very easy to spot mapping or usage mistakes.

Set this property: = false

and you’ll have your tests fail with an exception rather than log warnings.

This property is not new in this release, but it’s worth a reminder, as it makes it much easier to validate your migration from previous versions.


What are we working on?

The Elasticsearch integration is almost feature complete; we expect to be able to release a Beta1 version in a few weeks.

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us in our forums or mailing lists.

Having fixed several issues and tasks since the previous milestone, it’s time to publish our third milestone towards Elasticsearch integration: Hibernate Search version 5.6.0.Alpha3 is now available!

Migration from Hibernate Search 5.5.x

Even if you’re not interested in the new Elasticsearch support, you might want to try out this version as it benefits from Apache Lucene 5.5.0.

If you ignore the new features and want to simply use Lucene in embedded mode the migration is easy, and as usual we are maintaining notes regarding relevant API changes in the Migration Guide to Hibernate Search 5.6.

Elasticsearch support progress

  • You can now use the Analyzers from Elasticsearch

  • Multiple operations will now be sent to Elasticsearch as a single batch to improve both performance and consistency

  • Spatial indexing and querying is now feature complete

  • We’ll wait for Elasticsearch to be "green" before attempting to use it at boot

  • Many improvements in the query translation

  • Error capture and reporting was improved

  • The MassIndexer is working now, but is not yet using efficient bulk operations

  • The Elasticsearch extensions are now included in the WildFly modules

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.


Feedback always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developers’ mailing list, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

In this post, I’d like you to meet Martin, who, in spite of his young age, has been very active in the Hibernate Search project development, implementing some interesting extensions or helping with pull request reviewing.

Because I’d love to see more university students getting involved with open source software, I took this opportunity and interviewed Martin about this experience.

  1. Hi, Martin. You are one of the youngest contributors we’ve ever had. Can you please introduce yourself?

    Hi, Vlad. I am a 22-year-old Master’s Degree student at the University of Bayreuth, Germany, and have been interested in Hibernate Search and full-text search (Lucene, Solr) for quite some time now. I am also a firm believer in Open Source and have actually always wanted to become a contributor to a tool (or piece of software) many other developers use in their projects. Knowing that a piece of code you wrote is running in other systems is quite the rewarding feeling.

  2. I understand that you took part in the Google Summer of Code event. Can you tell us a little bit about this program?

    Yes, I took part in last year’s Google Summer of Code program and was mentored by Sanne Grinovero while working on adapting Hibernate Search to work with any JPA provider. It gave me the opportunity to dive more deeply into the codebase, as it allowed me to concentrate on nothing but my project work-wise. In general, Google Summer of Code is one of the best learning experiences any student who wants to get into Open Source can have.

  3. Contributing to an open-source project is a great learning experience. Has this activity helped you improve your skills?

    Definitely. While building new features or tracking down bugs, you encounter loads of different pieces of code you have to work through. With that comes learning new technologies and APIs. Also, the general process of submitting JIRA issues, discussing them and implementing the solutions is something you learn while working on an open source project. Trying out the process yourself is invaluable and cannot be compared to just learning it on paper. This is also something I always tell new coders: try it out or you will not get it 100%.

  4. Do you think the entry barrier is high for starting to contribute to an open source project? How should we encourage students to get involved with open source?

    In the case of the Hibernate team, I can only say that it was quite easy to get into contact with the other developers. I just got onto IRC and asked questions about problems I had. They helped me with every question I had, so I stuck around. Then, I started reporting issues or making feature requests and was immediately incorporated into discussions. So no, the barrier is not high (at least for me in the case of the Hibernate team).

    I think open source needs to be encouraged more at a university level. I think many students don’t realize what they are missing. Yes, open standards are encouraged and teaching uses open APIs all over the place, but universities tend to keep much of the work that is suitable for open source behind closed doors (btw: I don’t think that closed source is always a bad thing, but it sometimes is in the way of innovation).

  5. What are your plans for the future?

    Firstly, I want to finish my Master’s degree at university. I haven’t fully decided yet whether I want to stay at university or not. Time will tell, I guess. Secondly, I want to keep contributing to Hibernate Search and finish merging the features of last year’s Google Summer of Code into the core code base.

Thank you, Martin, and keep up the good work.

During my talk at VoxxedVienna on using Hibernate Search with Elasticsearch earlier this week, there was an interesting question which I couldn’t answer right away:

"When running a full-text query with a projection of fields, is it possible to return the result as a list of POJOs rather than as a list of arrays of Object?"

The answer is: Yes, it is possible, result transformers are the right tool for this.

Let’s assume you want to convert the result of a projection query against the VideoGame entity shown in the talk into the following DTO (data transfer object):

public static class VideoGameDto {

    private String title;
    private String publisherName;
    private Date release;

    public VideoGameDto(String title, String publisherName, Date release) {
        this.title = title;
        this.publisherName = publisherName;
        this.release = release;
    }

    // getters...
}

This is how you could do it via a result transformer:

FullTextEntityManager ftem = ...;

QueryBuilder qb = ftem.getSearchFactory()
    .buildQueryBuilder()
    .forEntity( VideoGame.class )
    .get();

FullTextQuery query = ftem.createFullTextQuery(
        qb.keyword()
            .onField( "tags" )
            .matching( "round-based" )
            .createQuery(),
        VideoGame.class
    )
    .setProjection( "title", "", "release" )
    .setResultTransformer( new BasicTransformerAdapter() {
        @Override
        public VideoGameDto transformTuple(Object[] tuple, String[] aliases) {
            return new VideoGameDto( (String) tuple[0], (String) tuple[1], (Date) tuple[2] );
        }
    } );

List<VideoGameDto> results = query.getResultList();

I’ve pushed this example to the demo repo on GitHub.

There are also some ready-made implementations of the ResultTransformer interface which you might find helpful. So be sure to check out its type hierarchy. For this example I found it easiest to extend BasicTransformerAdapter and implement the transformTuple() method by hand.
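
Stripped of the Hibernate Search machinery, what a result transformer does is plain tuple-to-POJO mapping. A self-contained sketch (class and record names here are invented for the example, not part of any library):

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class TupleMappingSketch {

    record GameDto(String title, String publisher) {}

    // A result transformer is conceptually just a function from a
    // projection tuple (Object[]) to a POJO, applied to every row.
    static List<GameDto> transform(List<Object[]> tuples,
                                   Function<Object[], GameDto> transformer) {
        return tuples.stream().map(transformer).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Object[]> tuples = List.of(
            new Object[]{"Fallout", "Interplay"},
            new Object[]{"Myst", "Broderbund"});
        List<GameDto> dtos = transform(tuples,
            t -> new GameDto((String) t[0], (String) t[1]));
        System.out.println(dtos.get(0).title()); // Fallout
    }
}
```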

To the person asking the question: Thanks, and I hope this answer is helpful to you!
