Red Hat

In Relation To Hibernate Search

Today we are releasing two new versions of Hibernate Search: 5.6.0.Beta4 and 5.7.0.Beta1!

Version 5.6.0.Beta4 brings the latest bugfixes and previously missing features for our experimental Elasticsearch integration. This is the version to use with Hibernate ORM versions 5.0.x and 5.1.x.

Version 5.7.0.Beta1 brings the exact same changes as 5.6.0.Beta4, on top of the compatibility with Hibernate ORM version 5.2.x that was introduced with 5.7.0.Alpha1.

What’s new?

  • HSEARCH-402: A new async reader strategy has been added for the Lucene indexing service, bringing performance boosts when you are okay with your queries being run on an out-of-date index (how far out of date is configurable).

  • HSEARCH-2260: A new VALIDATE index schema management strategy has been added for Elasticsearch, allowing you to automatically check on startup that your Hibernate Search mappings are in line with the Elasticsearch mappings.

  • Issues with @IndexedEmbedded in the Elasticsearch integration have been addressed: everything should now work properly, with the notable exception of @IndexedEmbedded.indexNullAs (not to be confused with @Field.indexNullAs).

  • HSEARCH-2235: You can now configure Hibernate Search to send requests to Elasticsearch servers in round-robin, enabling load-balancing. Failover is not supported yet, but we’ll be working on it.

  • HSEARCH-2360: Elasticsearch projections now use source filtering, greatly reducing the bandwidth needs when retrieving results.

  • We now test our Elasticsearch integration against version 2.4.2, which fixed an issue with date formats that impacted Hibernate Search. We strongly recommend updating your 2.4.x instances to the latest available version in the 2.4.x series.

  • …​ and much more. The full change log can be found on our JIRA instance or on our GitHub repository.
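To make the round-robin behaviour mentioned for HSEARCH-2235 more concrete, here is a minimal plain-Java sketch of handing out hosts in turn. It is only an illustration: `RoundRobinHosts` and the host URLs used with it are hypothetical names, not part of Hibernate Search, and the integration's actual implementation may well differ. Like the feature described above, the sketch does no failover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper that hands out hosts in turn; not a Hibernate Search class. */
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinHosts(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts); // defensive copy
    }

    /** Returns the host for the next request; safe to call from several threads. */
    String nextHost() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }
}
```

With two hosts, successive calls to `nextHost()` alternate between them, so each server receives roughly half of the requests.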

When will this 5.6.0 be released?

We’ve been caught up in the polishing work with the Elasticsearch integration lately, but we’re seeing the end of the tunnel: the list of open tasks is getting shorter and shorter. The first release candidate for Hibernate Search 5.6.0 will land by the end of next week.

So, if you haven’t tested 5.6 already, now’s the time! Should you find any bugs, please report them on our JIRA instance.

What about Elasticsearch 5 support?

Please be aware that we’re not currently supporting Elasticsearch 5.x. The main reason is that it brings several backward-incompatible changes that would require quite a bit of work if we still want to support the 2.x series. And we don’t want to postpone the Hibernate Search 5.6.0 release any further.

Our plan is to release a 5.6.0 supporting Elasticsearch 2.x, and add Elasticsearch 5 support in Hibernate Search 6.0 or, maybe, in an early 5.8 release. You may refer to HSEARCH-2434 to track the status of Elasticsearch 5.0 support.

When will 5.7.0 be released?

Everything is going smoothly with this version, and very few bugs have been reported. As soon as 5.6.0 is released, we’ll publish the release candidate for 5.7.0.

How to get these releases

All versions are available on Hibernate Search’s web site.

Ideally use a tool to fetch it from Maven central; these are the coordinates:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.6.0.Beta4</version>
</dependency>

Or, for Hibernate Search 5.7:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.7.0.Beta1</version>
</dependency>

To use the experimental Elasticsearch integration you’ll also need:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-elasticsearch</artifactId>
   <version>5.6.0.Beta4</version>
</dependency>

(Change the version to 5.7.0.Beta1 in order to test the Elasticsearch integration within Hibernate Search 5.7)

Downloads from Sourceforge are available as well.

Meet Michael Simons

Tagged as: Discussions, Hibernate ORM, Hibernate Search

In this post, I’d like you to meet Michael Simons, a long-time Spring and Hibernate user, and NetBeans Dream Team member.

  1. Hi, Michael. Would you like to introduce yourself and tell us a little bit about your developer experience?

    My name is Michael Simons, @rotnroll666 on Twitter.

    I am a 37-year-old developer living in Aachen, Germany. As such, I qualify, at least age-wise, as a senior developer. I’ve been in the software industry for nearly 15 years now. I have a technical apprenticeship and a half-finished degree in applied mathematics. I never finished those studies, as I landed in the company I still work for, ENERKO INFORMATIK.

    Apart from being a husband and a father of two, I run our local JUG, the Euregio JUG.

    I’m also a member of the NetBeans Dream Team.

    We, at ENERKO INFORMATIK, create software in the energy and utility markets and work both in the technical and geographical parts as well as sales.

    I did really database-centric software for at least 4 years, and I am still fluent in SQL and PL/SQL. We once did XML processing inside Oracle databases, which works surprisingly well. After the slow death of Oracle Forms Client / Server, we migrated our desktop applications to Java Swing from 2004 onwards.

    Around that time I played around with Hibernate for the first time. It seemed like the golden bullet back then to get table rows into objects. It wasn’t. But that I learned much later. As Gavin King said: „Just because you’re using Hibernate, doesn’t mean you have to use it for everything.“

    Apart from „boring“ business applications, I’ve not only been blogging for a long time but have also run several online communities, one of them still alive. Daily Fratze is a daily photo project that started in 2005 as a PHP application. In 2006 it became a Ruby on Rails site, and in 2010 I started migrating it to a Spring and Hibernate backend.

    Looking back at that code, I notice how much I didn’t know about the purpose and intention of JPA / Hibernate. As is true for a lot of things, you have to know your tools and what they are designed for. Not all relations are meant to be materialized in objects, and not all queries can be generated. Knowing when a Session is flushed and when it is not is essential. It’s not enough to know SQL to fully utilize Hibernate, but it’s also not enough to know Hibernate to abstract persistence away. I fixed the flaws in the site code, but you, as the reader of this interview, don’t have to learn it the hard way: I really recommend Vlad Mihalcea's book High-Performance Java Persistence.

    My biking project and the site and API of the Euregio JUG are my latest public projects that represent the stuff learned with the experience above. I use those projects as reference projects for my work.

    For several years now, I have used NetBeans almost exclusively for all kinds of software development. It supports me in plain Java and Spring, insanely well with JPA entities (built-in checks for queries, coding style), and on the front end (HTML as well as JavaScript). Most importantly, it has great integration with code-quality tools like JaCoCo, Sonar and more.

  2. You’ve designed the Java User Group Euregio Maas-Rhine site. Can you tell us what frameworks have you used?

    For Euregio JUG (source is on GitHub), I chose the following:

    • Spring Boot as the application container and glue for

      • Spring Framework and MVC

      • JPA and its implementation Hibernate

      • Spring Data JPA

    • The frontend is done pretty old school by server side rendered templates using Thymeleaf.

      As a rule of thumb, I’d choose the following stack when using any kind of SQL data store:

    • automatic database migrations using Flyway or Liquibase

    • JPA / Hibernate together with Spring Data JPA as long as I can express my domain model as entities and tables

    • JPQL queries if necessary with the benefit that they are checked at application start

    • No native queries hidden away in some annotations

    • If I have to do native queries, I’ll choose Spring’s JDBC template or, since 2015, jOOQ if applicable.

    • My rule for switching to native queries is when I have projections or „hard“ analytics that would take an awful lot of Java code instead of a few lines of SQL.

  3. Why did you choose Hibernate ORM and Search over other frameworks and did it match your expectations?

    From my background, the data model has always been essential. I worked with awesome models and not so great ones. If you can map your domain to a data model, having a working domain driven design, Hibernate helps you a lot to materialize it back into Java.

    We want to publish articles (posts) and events on the site. People should be able to register for those events, one time each. This fits perfectly into a simple and well understandable domain model that maps perfectly to objects and entities.

    Why bother writing those plain SQL statements for selecting and updating stuff myself?

    I chose the combination of Hibernate and Spring Data JPA for a reason: The domain I map in Hibernate facilitates all the „magic“ Spring Data JPA does: Generating queries, conditions and such: I hardly have to write anything myself.

    If you choose JPA / Hibernate ORM for your project, I really recommend adding Spring Data JPA to the mix, even if it’s a Java EE project. Spring Data JPA is a bit harder to configure there but provides the user with a lot of helpful stuff.

    Using the Hibernate Search integration was an experiment. I’ve been using it for a long time now on my daily photo project. With little effort, my entities provide access to a Lucene-based index and I don’t have to fight with SOLR.

    The EuregJUG site has no local storage, in contrast to my daily photo project. So I had to test drive the upcoming Elasticsearch integration in 5.6.0, which works with the same set of annotations and the same entities, not against a local index but against a remote Elasticsearch index. You can see it in those commits described here and use it here.

    It really isn’t much stuff added, it fits into the repository / DDD approach and matches my expectations.

    Regarding Spring Boot, I’ve been doing Spring now for more than 7 years and Boot since early 2014. It has an awesome community and actually never disappointed me.

  4. We always value feedback from our users, so can you tell us what you’d like us to improve or are there features that we should add support for?

    As a Hibernate user, the most common problems I had date from the time when I didn’t know exactly what I was doing: problems mapping my tables, slow queries and such. Throwing Hibernate at problems that would have been a better fit for plain SQL also caused trouble. In both cases, those problems could be solved on my end.

    The experience in Hibernate’s bug tracker has improved a lot lately, and I would appreciate an even better integration with Spring Data JPA.

    To close this, I have to say that I really like the stack mentioned above. We have reached a quality of components where you can really work well from a Domain Driven Design perspective, switch to SQL if needed and still have a clean architecture with clean, testable code. Problems that were around 10 years ago are mostly gone.

    It’s really not hard to get an application up and running, including unit and integration tests. If you leave aside the hypes, you can concentrate on both actual problems and on enabling people to learn and do their jobs.

    I really like the fact that Hibernate Spatial has found its way into ORM itself. If you have correctly mapped entities and you need to query them via spatial regions, that stuff is really helpful and works quite well. I’d appreciate more information there.

Thank you, Michael, for taking your time. It is a great honor to have you here. To reach Michael, you can follow him on Twitter.

Introducing Hibernate Search Sort DSL

Tagged as: Hibernate Search

With Elasticsearch support coming as a technological preview in Hibernate Search 5.6, you would think we’re leaving out other features. Well, think again! Enter the Sort DSL, which will work with Elasticsearch of course, but also with the good ol' Lucene backend.

The point here is to provide an API to build sort descriptions easily, without knowing everything about Hibernate Search features added on top of Lucene, such as DistanceSortField. And while we’re at it, we’re making it a modern, fluent API.

Most common case: sorting by field

The QueryBuilder interface now has an additional sort() method:

QueryBuilder builder = fullTextSession.getSearchFactory()
  .buildQueryBuilder().forEntity(Book.class).get();
Query luceneQuery = builder.all().createQuery();
FullTextQuery query = fullTextSession.createFullTextQuery( luceneQuery, Book.class );
Sort sort = builder
  .sort()
    .byField("author").desc() // Descending order
    .andByField("title") // Default order (ascending)
  .createSort();
query.setSort(sort);
List results = query.list();
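For readers less familiar with multi-level sorts, the ordering requested above (author descending, then title ascending) can be pictured with a plain-Java comparator. This is only an analogy run on an in-memory list: the `Book` class and `FieldSortAnalogy` helper below are hypothetical, and the real sort is of course executed by Lucene or Elasticsearch on indexed fields, not by Java comparators.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an indexed entity. */
final class Book {
    final String author;
    final String title;

    Book(String author, String title) {
        this.author = author;
        this.title = title;
    }
}

final class FieldSortAnalogy {
    /** Mirrors byField("author").desc().andByField("title"). */
    static final Comparator<Book> AUTHOR_DESC_THEN_TITLE =
            Comparator.comparing((Book b) -> b.author).reversed()
                    .thenComparing(b -> b.title);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Book> sorted(List<Book> books) {
        List<Book> copy = new ArrayList<>(books);
        copy.sort(AUTHOR_DESC_THEN_TITLE);
        return copy;
    }
}
```

The second criterion only matters for books whose authors compare as equal, which is exactly the role of andByField in the DSL.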

Of course, other kinds of sort are available. Let’s have a look!

Sorting by relevance

The relevance sort is also available with byScore(). Obviously, there’s one key difference with that one: the sort is descending by default, so you get the most relevant results (higher scores) first. If you need the least relevant results, fear not, we’ve got you covered with byScore().asc().

Sorting by distance

If your entity has some spatial fields you may also build spatial sorts:

  .sort()
    .byDistance()
      .onField("location")
      .fromLatitude(24.0d)
      .andLongitude(32.0d)
  .createSort()
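As a rough idea of the quantity such a sort orders by, the sketch below computes a great-circle distance with the haversine formula. It is an illustration only: the `GreatCircle` class is hypothetical, the Earth radius of 6371 km is an assumed mean value, and the backend may use a different formula or radius internally.

```java
/** Hypothetical helper; shows the kind of distance a spatial sort is based on. */
final class GreatCircle {
    /** Assumed mean Earth radius in kilometres. */
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Haversine distance in kilometres between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }
}
```

Sorting by distance from latitude 24.0 and longitude 32.0 then amounts to ordering results by this value, computed between that point and each entity's location.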

Stabilizing with byIndexOrder()

byIndexOrder offers an arbitrary, yet deterministic sort. This comes in handy when you want to stabilize your sort:

  .sort()
    .byField("title")
    .andByIndexOrder()
  .createSort()

That way, if there are two books with the same title in your index, they will always keep the same relative order from one query to another.

Handling missing values

What if you’re sorting books by publishing date, and some of them haven’t even been published yet? No worries, you may decide whether the unpublished books will appear first or last:

  .sort()
    .byField("publishingDate_sort").desc() // Most recently published first
      .onMissingValue().sortFirst() // Not published yet => put these at the top of the list
    .andByField("custom_id_sort") // Default for the case when multiple books have no publishing date
  .createSort()
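The effect of sortFirst() on missing values can be mimicked in plain Java with Comparator.nullsFirst. The sketch below is an in-memory analogy, not what the backend executes; `DatedBook` and `MissingValueSortAnalogy` are hypothetical names, and a null publishing date stands for "not published yet".

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a book whose publishing date may be missing. */
final class DatedBook {
    final long id;
    final LocalDate publishingDate; // null means "not published yet"

    DatedBook(long id, LocalDate publishingDate) {
        this.id = id;
        this.publishingDate = publishingDate;
    }
}

final class MissingValueSortAnalogy {
    /** Most recent first, missing dates on top, ties broken by id. */
    static final Comparator<DatedBook> NEWEST_FIRST_UNPUBLISHED_ON_TOP =
            Comparator.comparing((DatedBook b) -> b.publishingDate,
                            Comparator.nullsFirst(Comparator.<LocalDate>reverseOrder()))
                    .thenComparingLong(b -> b.id);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<DatedBook> sorted(List<DatedBook> books) {
        List<DatedBook> copy = new ArrayList<>(books);
        copy.sort(NEWEST_FIRST_UNPUBLISHED_ON_TOP);
        return copy;
    }
}
```

Swapping nullsFirst for nullsLast would correspond to asking for the unpublished books at the end of the list instead.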

Accessing native features

Let’s assume you’re using an external backend, such as Elasticsearch. You may want to take advantage of a brand-new feature that appeared in the last snapshot of this backend, that feature you just spotted this morning and that would really save you a lot of trouble on your project. But, hey, the Hibernate Search team is not in the same time zone, and even if they’re providing fast support, you’re not getting the feature pushed into Hibernate Search in time to meet your deadline. Which is this evening, by the way.

Well, guess what: you can use that feature anyway. The sorting API also allows using native sorts. When using the Elasticsearch backend, it means passing the JSON description of this sort, which will be added to the Elasticsearch query as is:

  .sort()
    .byNative("authors.name", "{'order':'asc', 'mode': 'min'}")
    .andByField("title")
  .createSort()

Next…​

Of course, one could point out that this API is not really backend-independent. The API itself, its interfaces and methods, mostly are, but the returned type (Sort) is clearly bound to Apache Lucene.

Well, one day at a time: the API in its current form can be adapted to be completely backend-agnostic, so it’s paving the way to Hibernate Search 6.x, while still requiring no change to any other contract such as FullTextQuery.setSort(Sort). And that means it’s available directly in 5.6.0.Beta3!

So be sure to check it out, and to check the documentation for more information. Or, you know, since it’s a fluent API, you can simply use your IDE’s autocomplete feature and see what’s available!

In any case, feel free to contact us for any question, problem or simply to give us your feedback!

Today we have three releases of Hibernate Search!

I’m proud to announce our team is a bit larger nowadays, and more contributors are volunteering too, so we managed to increase the development pace. Today we release versions 5.6.0.Beta3, 5.7.0.Alpha1 and 5.5.5.Final.

  • Version 5.6.0.Beta3: the latest version of our main development branch, with experimental Elasticsearch integration.

  • Version 5.7.0.Alpha1: essentially the same as 5.6.0.Beta3, but compatible with Hibernate ORM version 5.2.x.

  • Version 5.5.5.Final: a maintenance release of our stable branch.

A 5.7 preview released when 5.6 isn’t out yet?

Let me explain: this unusual decision was taken to accommodate the needs of all of you.

The 5.6 series is creating a lot of anticipation, with the Elasticsearch integration being a very welcome new feature. It’s meant to be an experimental feature, as we won’t break our APIs yet while all integration needs are analyzed. Still, it’s taking a bit longer than expected, and even though it’s an experimental feature we don’t want to rush it: we need to finish it up properly.

In the meantime the Hibernate ORM project released its 5.2.x series, and several users have been asking for a Hibernate Search version compatible with it. We could not upgrade our 5.6 series yet, as then people using an older Hibernate ORM would not be able to play with the Elasticsearch integration.

So now that 5.6 is in good shape - we decided the next release will be a candidate release - we felt we could already publish a 5.7 version, which is just exactly the same but in a new branch made compatible with the very latest Hibernate ORM.

How is the Elasticsearch integration coming?

It’s maturing at high speed. The biggest obstacles have been resolved, so we are definitely looking for more feedback at this point; as mentioned, the next version will be a candidate release.

Hibernate Search now has a proper Sorting API: watch this space as we’ll publish a dedicated blog about it, or get a peek at the query sorting paragraph in the documentation.

This is an important milestone, as it makes sorting queries on Elasticsearch possible through our DSL.

How to get these releases

All versions are available on Hibernate Search’s web site.

Ideally use a tool to fetch it from Maven central; these are the coordinates:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.6.0.Beta3</version>
</dependency>

To use the experimental Elasticsearch integration you’ll also need:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-elasticsearch</artifactId>
   <version>5.6.0.Beta3</version>
</dependency>

Downloads from Sourceforge are available as well.

This summer was relatively quiet in terms of releases, but many have been testing and improving the Beta1 release of our Hibernate Search / Elasticsearch integration.

So today we release version 5.6.0.Beta2 with 45 fixes and enhancements!

For a detailed list of all improvements, see this JIRA query.

The day of a Final release gets closer, but highly depends on your feedback. Please keep the feedback coming!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developers’ mailing list, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the following coordinates:

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.6.0.Beta2</version>
</dependency>

Downloads from Sourceforge are available as well.

Notes on compatibility

This version is compatible with Apache Lucene versions from 5.3.x to 5.5.x, and with Hibernate ORM versions 5.0.x and 5.1.x.

Compatibility with Hibernate ORM 5.2.x is not a reality yet - we expect to see that materialize in early October. Compatibility with Lucene 6.x is scheduled for Hibernate Search 6.0, which will take longer - probably early 2017.

Finally, the Elasticsearch version we used for all development and testing of this release was Elasticsearch 2.3.1. We will soon upgrade to the latest version, and discuss strategies to test against multiple versions.

Hi, I’m Mincong, an engineering student from France. I’m glad to present my Google Summer of Code 2016 project, which provides an alternative to the current mass indexer implementation of Hibernate Search, using the Java Batch architecture (JSR 352). I’ve been working on this project for 4 months. Before getting started, I want to thank Google for sponsoring the project, the Hibernate team for accepting my proposal and my mentors Gunnar and Emmanuel for their help during this period. Now, let’s begin!

What is it about?

Hibernate Search brings full-text search capabilities to your Hibernate/JPA applications by synchronizing the state of your entities with a search index maintained by Lucene (or Elasticsearch as of Hibernate Search 5.6!). Index synchronization usually happens on the fly as entities are modified, but there may be cases where an entire index needs to be re-built, e.g. when enabling indexing for an existing entity type or after changes have been applied directly to the database, bypassing Hibernate (Search).

Hibernate Search provides the mass indexer for this purpose. It was the goal of my GSoC project to develop an alternative using the API for Java batch applications standardized by JSR 352.

What do we gain from JSR 352?

Implementing the mass indexing functionality using the standardized batching API allows you to use the existing tools of your runtime environment for starting/stopping and monitoring the status of the indexing process. E.g. in WildFly you can use the CLI to do so.

Also JSR 352 provides a way to restart specific job runs. This is very useful if re-indexing of an entity type failed mid-way, for instance due to connectivity issues with the database. Once the problem is solved, the batch job will continue where it left off, not processing again those items already processed successfully.

As JSR 352 defines common concepts of batch-oriented applications such as item readers, processors and writers, the job architecture and workflow are very easy to follow. In JSR 352, the workflow is written in an XML file (the "job XML"), which specifies a job and its steps and directs their execution. So you can understand the process without jumping into the code.

<job id="massIndex">
    <step id="beforeChunk" next="produceLuceneDoc">
        <batchlet ref="beforeChunkBatchlet"/>
    </step>

    <step id="produceLuceneDoc" next="afterChunk">
        <chunk checkpoint-policy="custom">
            <reader ref="entityReader"/>
            <processor ref="luceneDocProducer"/>
            <writer ref="luceneDocWriter"/>
            <checkpoint-algorithm ref="checkpointAlgorithm"/>
        </chunk>
        ...
    </step>

    <step id="afterChunk">
        <batchlet ref="afterChunkBatchlet"/>
    </step>
</job>

As you can see, it follows a familiar batch-processing workflow. Anyone with experience in ETL processes should have no difficulty understanding our new implementation.

Example usages

Here are example usages of the new mass indexer, based on a draft version. You can register one or several entity types: if you have more than one root entity to index, use the addRootEntities(Class<?>…) method.

How to use the new MassIndexer:

long executionId = new MassIndexer()
        .addRootEntity( Company.class )
        .start();

Another example with a more customized configuration:

long executionId = new MassIndexer()
        .addRootEntities( Company.class, Employee.class )
        .cacheable( false )
        .checkpointFreq( 1000 )
        .rowsPerPartition( 100000 )
        .maxThreads( 10 )
        .purgeAtStart( true )
        .optimizeAfterPurge( true )
        .optimizeAtEnd( true )
        .start();

Parallelism

To maximize performance, we highly recommend running the mass indexer in parallel. Parallelism is activated by default. Under the JSR 352 standard, the exact term is "partitioning". The indexing step may run as multiple partitions, one per thread. Each partition has its own partition ID and parameters. If there are more partitions than threads, the partitions are treated as a queue to consume: each thread runs only one partition at a time and won’t consume the next partition until the previous one is finished.

massIndexer = massIndexer.rowsPerPartition( 500 );
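The queueing behaviour described above can be sketched with a fixed-size thread pool from the JDK. This is a simplified model under stated assumptions, not the project's code: `PartitionSketch` is a hypothetical class, the "indexing" of a partition is replaced by a counter update, and a real JSR 352 runtime manages partitions through its own partition plan rather than an ExecutorService.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of partitions being consumed by a bounded set of threads. */
final class PartitionSketch {

    /** Number of partitions needed to cover all rows (rounded up). */
    static int partitionCount(long totalRows, int rowsPerPartition) {
        return (int) ((totalRows + rowsPerPartition - 1) / rowsPerPartition);
    }

    /** Runs every partition on at most maxThreads threads; returns the rows handled. */
    static long indexAll(long totalRows, int rowsPerPartition, int maxThreads) {
        AtomicLong indexed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        int partitions = partitionCount(totalRows, rowsPerPartition);
        for (int p = 0; p < partitions; p++) {
            long first = (long) p * rowsPerPartition;
            long last = Math.min(first + rowsPerPartition, totalRows);
            // Partitions beyond maxThreads wait in the pool's queue until a thread is free.
            pool.execute(() -> indexed.addAndGet(last - first)); // stand-in for real indexing
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        }
        return indexed.get();
    }
}
```

For instance, 1050 rows with 500 rows per partition yield three partitions; with two threads, the third partition starts only once one of the first two has finished.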

Checkpoints

The mass indexer supports a checkpoint algorithm. If the job is interrupted for any reason, the mass indexer can be restarted from the last checkpoint, which is stored by the batch runtime. The entities already indexed won’t be lost, because they have already been flushed to the directory provider. If N is the checkpoint frequency, a partition reaches a checkpoint every N items processed inside that partition. You can override this value to suit your requirements.

massIndexer = massIndexer.checkpointFreq( 1000 );
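To see how a checkpoint frequency interacts with a restart, consider the following toy model of a single partition. It is a sketch under stated assumptions: `CheckpointSketch` is a hypothetical class, the checkpoint is kept in a field rather than in the batch runtime's job repository, and items processed after the last checkpoint are simply processed again on restart.

```java
/** Hypothetical toy model of one partition with periodic checkpoints. */
final class CheckpointSketch {
    /** Index of the first item not yet covered by a checkpoint. */
    private long lastCheckpoint = 0;

    /**
     * Processes items from the last checkpoint up to total (exclusive) and records
     * a checkpoint every checkpointFreq items. A non-negative failAt simulates an
     * interruption just before that item. Returns the items processed in this run.
     */
    long run(long total, int checkpointFreq, long failAt) {
        long processed = 0;
        for (long item = lastCheckpoint; item < total; item++) {
            if (item == failAt) {
                return processed; // interrupted: the last checkpoint is kept
            }
            processed++; // stand-in for indexing one entity
            if ((item + 1) % checkpointFreq == 0) {
                lastCheckpoint = item + 1;
            }
        }
        lastCheckpoint = total; // job completed
        return processed;
    }

    long lastCheckpoint() {
        return lastCheckpoint;
    }
}
```

With 2500 items and a frequency of 1000, an interruption at item 2300 leaves the checkpoint at 2000, so the restarted run handles only the remaining 500 items instead of all 2500.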

Run

For further usage, please check my GitHub repo gsoc-hsearch. If you want to play with it, you can download the code and build it with Maven:

$ git clone -b 1.0 git://github.com/mincong-h/gsoc-hsearch.git
$ cd gsoc-hsearch
$ mvn install

Current status and next steps

Currently, the new implementation accepts different entity types as input, and provides a high level of customization of the job properties as well as parallel indexing. The job periodically saves its current progress to enable a restart from the last point of consistency. Load balancing has been considered to avoid overloading any single thread. This indexing batch job is available under Java SE and Java EE.

There are still many things to do, e.g. related to performance improvements, integration into WildFly, monitoring, more fine-grained selection of entities to be re-indexed etc. Here are some of the ideas:

  • Core: partition mapping for composite ID

  • Integration: package the batch job as a WildFly module

  • Integration: start the indexing batch job from FullTextSession and FullTextEntityManager

  • Integration: embed this project into Hibernate Search

  • Monitoring: enhance the basic monitoring, e.g. progress status for restarted job

  • Performance: Ensure a great performance of this implementation

These tasks are tracked as GitHub issues; you can check the complete TODO list here.

Feedback

If you are using Hibernate Search and ever wished for a more standardized approach to mass indexing, this project clearly is for you.

We still need to apply some improvements and polishing before integrating it as a module into the Hibernate Search core code base, but any bug reports or comments on the project will be very helpful. So please give it a try and let us know about your feedback. Just drop a comment below or raise an issue on GitHub.

Looking forward to hearing from you!

We are making good progress on our next major release, which focuses on Elasticsearch integration, but we haven’t forgotten our beloved users of Hibernate Search 5.5.x, and here is a new stable release to prove it!

This bugfix release is entirely based on user feedback so keep it coming!

Hibernate Search version 5.5.4.Final is available now and fixes the following issues:

  • HSEARCH-2301 - CriteriaObjectInitializer is suboptimal when we query only one subtype of a hierarchy

  • HSEARCH-2286 - DistanceSortField should support reverse sorting

  • HSEARCH-2306 - Upgrade 5.5.x to Hibernate ORM 5.0.9

  • HSEARCH-2307 - Documentation shouldn’t suggest need for @Indexed of embedded association fields

A quick focus on HSEARCH-2301, as it might significantly improve performance if you index complex object hierarchies. Prior to this fix, when querying the database to hydrate the objects, Hibernate Search was using the root type of the hierarchy, potentially leading to queries with a lot of joins. Hibernate Search now builds the most efficient query possible depending on the effective results.

You can see two instances of this issue on Stack Overflow here and here.

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.4.Final</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.9.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.1</version>
</dependency>

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us on our forums or mailing lists.

We also monitor closely the hibernate-search tag on Stack Overflow.

After over 60 resolved tasks, we’re proud to release Hibernate Search version 5.6.0.Beta1.

The Elasticsearch integration made significant progress, and we believe it to be ready for wider usage.

Progress of the Elasticsearch integration

Improvements since the previous milestone:

  • MassIndexer: significantly better performance, as it now uses bulk operations.

  • Calendar, Dates, numbers and mapping details: several corrections and improvements were made to produce a cleaner schema.

  • Cluster state: we now wait for a newly started Elasticsearch cluster to be "green" - or optionally "yellow" - before starting to use it.

  • WildFly modules: a critical bug was resolved; the modules should work fine now.

  • Many more: for a full list of all 63 improvements, see this JIRA query.

What is still missing?

  • Performance testing: we didn’t do much performance testing, so it’s probably not as efficient as it could be.

  • Relax the expected Elasticsearch version: it’s being tested with version 2.3.1 but we have plans to support a wider range of versions.

  • Explicit refresh requests: we plan to add methods to issue an index reader refresh request, as the changes pushed to Elasticsearch are not immediately visible by default.

  • Your Feedback! We think it’s in pretty good shape; it would be great for more people to try it out and let us know what is missing and how it’s working for you.

Notable differences between using embedded Lucene vs Elasticsearch

By default, when using the Lucene backend, changes to the index are applied immediately after you commit a transaction, and any subsequent search will "see" them (unless you reconfigure Hibernate Search to use an async worker). On Elasticsearch the default is different: changes received by the cluster only become "visible" to searches after a delay (1 second by default).

You can reconfigure Hibernate Search to force a refresh of indexes after each write operation by using the hibernate.search.default.elasticsearch.refresh_after_write configuration setting.

This setting defaults to false as that’s the recommended setting for optimal performance on Elasticsearch. You might want to set this to true to make it simpler to write unit tests, but you should take care to not rely on the synchronous behaviour for your production code.
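For test setups, the setting could be collected in a small helper like the one below. Only the property name comes from the text above; the `TestSearchSettings` class is a hypothetical name, and how the properties reach Hibernate Search depends on how you bootstrap it.

```java
import java.util.Properties;

/** Hypothetical helper gathering search settings intended for tests only. */
final class TestSearchSettings {

    /** Returns properties that force an index refresh after each write. */
    static Properties forTests() {
        Properties props = new Properties();
        // Property name taken from the text above; keep the default (false) in production.
        props.setProperty("hibernate.search.default.elasticsearch.refresh_after_write", "true");
        return props;
    }
}
```

One plausible way to apply it, assuming a JPA bootstrap, is to pass the returned properties as the second argument of Persistence.createEntityManagerFactory(String, Map) in your test fixture.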

Improvements for embedded Lucene users

While working on Elasticsearch, we also made some performance improvements that benefit users of the traditional embedded Lucene backend.

Special thanks to Andrej Golovnin, who contributed several patches to reduce allocation of objects on the hot path and improve overall performance.

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.

Feedback

Feedback is always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developer’s mailing lists, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.

While the team has been busy implementing great new features such as the Elasticsearch integration for the next 5.6 release, some of you provided interesting feedback on our stable release.

The summary of the feedback I heard is that migrating to the new sorting requirements can be confusing, and there were some issues with our Faceting implementation.

Hibernate Search version 5.5.3.Final is available now, fixing the reported issues and improving the error messages around sorting.

The changelog is rather small, so this time I’ll post it verbatim:

  • HSEARCH-1917 - Cannot index null or empty values for faceted fields

  • HSEARCH-2082 - Documentation refers to @SortField when it should be @SortableField

  • HSEARCH-2085 - Typo in hibernate-search-engine logger

  • HSEARCH-2086 - Long and Date range faceting doesn’t honor hasZeroCountsIncluded

  • HSEARCH-2179 - Hanging during shutdown of SyncWorkProcessor

  • HSEARCH-2193 - LuceneBackendQueueTask does not release the Directory lock on update failures

  • HSEARCH-2200 - Typo in log message

  • HSEARCH-2240 - Parallel service lookup might fail to find the service

  • HSEARCH-2199 - Allows the use of CharFilter in the programmatic API of SearchMapping

  • HSEARCH-2084 - Upgrade to WildFly 10.0.0.Final

  • HSEARCH-2089 - Ensure the performance tests do not use the WildFly embedded version of Search

  • HSEARCH-1951 - Improve resulting error message when applying the wrong Sort Type

  • HSEARCH-2090 - Using the wrong header in the distribution/pom.xml

  • HSEARCH-2241 - Clarify deprecation of setFilter() method on FullTextQuery

Spot inefficient sorting operations easily in test suites

Hibernate Search already logged a warning when forced to run a query with a sub-optimal sorting strategy, but a warning alone did not make it easy to spot mapping or usage mistakes.

Set this property:

    hibernate.search.index_uninverting_allowed = false

and you’ll have your tests fail with an exception rather than log warnings.

This property is not new in this release, but it’s worth a reminder, as it makes it much easier to validate your migrations from previous versions.
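One possible way to apply the property only in a test suite is to layer it on top of the regular settings. The sketch below assumes the settings are held in a java.util.Properties object. The class and method names are hypothetical, and only the property name and value come from the text above.

```java
import java.util.Properties;

// Hypothetical test helper, not part of Hibernate Search.
class StrictSortingTestSettings {

    // Property name as quoted in the text above.
    static final String INDEX_UNINVERTING_ALLOWED =
            "hibernate.search.index_uninverting_allowed";

    /** Returns a copy of the base settings with index uninverting disallowed. */
    static Properties strict(Properties base) {
        Properties settings = new Properties();
        settings.putAll(base);
        // With this set to false, a sub-optimal sort is reported
        // as an exception instead of a logged warning.
        settings.setProperty(INDEX_UNINVERTING_ALLOWED, "false");
        return settings;
    }
}
```

Copying the base settings rather than mutating them keeps the stricter behaviour confined to the tests that ask for it.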

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.3.Final</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.6.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.1</version>
</dependency>

What are we working on?

The Elasticsearch integration is almost feature complete; we expect to release a Beta1 version in a few weeks.

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us on our forums or mailing lists.

Having fixed several issues and tasks since the previous milestone, it’s time to publish our third milestone towards Elasticsearch integration: Hibernate Search version 5.6.0.Alpha3 is now available!

Migration from Hibernate Search 5.5.x

Even if you’re not interested in the new Elasticsearch support, you might want to try out this version as it benefits from Apache Lucene 5.5.0.

If you ignore the new features and simply want to use Lucene in embedded mode, the migration is easy. As usual, we are maintaining notes regarding relevant API changes in the Migration Guide to Hibernate Search 5.6.

Elasticsearch support progress

  • You can now use the Analyzers from Elasticsearch

  • Multiple operations will now be sent to Elasticsearch as a single batch to improve both performance and consistency

  • Spatial indexing and querying are now feature complete

  • We now wait at boot for Elasticsearch to be "green" before attempting to use it

  • Many improvements in the query translation

  • Error capture and reporting was improved

  • The MassIndexer is working now, but is not yet using efficient bulk operations

  • The Elasticsearch extensions are now included in the WildFly modules

How to get this release

Everything you need is available on Hibernate Search’s web site.

Get it from Maven Central using the above coordinates.

Downloads from Sourceforge are available as well.

Feedback

Feedback is always welcome!

Please let us know of any problem or suggestion by creating an issue on JIRA, by sending an email to the developer’s mailing lists, or by posting on the forums.

We also monitor Stack Overflow; when posting on SO please use the tag hibernate-search.
