Hibernate Search 4 is coming

Posted by    |       Hibernate Search

Hibernate Search is a library that integrates Hibernate ORM with Apache Lucene or Elasticsearch by automatically indexing entities, enabling advanced search functionality: full-text, geospatial, aggregations and more. For more information, see Hibernate Search on hibernate.org.

The release cycle of Hibernate Search 4 has begun. Alpha 1 is out. We already have many things implemented, so this release is a substantial one, and more releases will follow quickly.

The goals of Hibernate Search 4 are twofold:

  • Be compatible with the new Hibernate Core 4 releases.
  • Make the necessary architecture change to reach the future goals of Hibernate Search.

In particular, we want to make Hibernate Search independent from Hibernate Core and to allow more scalable, cloud-oriented backends.

This release already includes lots of changes.

Split between API / SPI and implementation classes

Each class is now categorized as either an API, an SPI or an implementation class.

  • APIs (in regular packages) are safe to use in your application, and we try very hard not to break these contracts.
  • SPIs (in .spi packages) are classes used by frameworks integrating with Hibernate Search (like Infinispan's search module). These contracts are fairly stable but might change more often than APIs.
  • Implementations (in .impl packages) are implementation details. Don't let your application depend on these.
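
To make the convention concrete, here is a minimal sketch of a helper that classifies a fully qualified class name by its package segments. It is not part of Hibernate Search; the `ContractKind` class and the `org.example` class names are hypothetical and only illustrate the naming rule described above.

```java
import java.util.Arrays;
import java.util.List;

/** Classifies class names by the package convention described above (illustrative only). */
final class ContractKind {
    enum Kind { API, SPI, IMPLEMENTATION }

    /** Returns the contract kind implied by the package segments of a fully qualified class name. */
    static Kind of(String className) {
        // Look only at the package part, so a class that is merely named "impl" is not misclassified.
        int lastDot = className.lastIndexOf('.');
        String packageName = lastDot < 0 ? "" : className.substring(0, lastDot);
        List<String> segments = Arrays.asList(packageName.split("\\."));
        if (segments.contains("impl")) {
            return Kind.IMPLEMENTATION;
        }
        if (segments.contains("spi")) {
            return Kind.SPI;
        }
        return Kind.API;
    }

    public static void main(String[] args) {
        // Hypothetical class names, for illustration only.
        List<String> names = Arrays.asList(
                "org.example.search.SomeApiType",
                "org.example.search.spi.SomeIntegratorContract",
                "org.example.search.backend.impl.SomeInternalHelper");
        for (String name : names) {
            System.out.println(name + " -> " + of(name));
        }
    }
}
```

A check like this could run over the imports of an application to flag dependencies on `.impl` packages before an upgrade.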

If you were a good citizen and already used the API only, you should not be affected. If you were using SPI or internal classes, you will have to adjust. Check our wiki page for the migration guide.

Move to JBoss Logging and error codes

JBoss Logging has some nice features, including error internationalization and error codes in messages. You will be able to Google HSEARCH00043 and see why you are having that problem.
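
Since the codes have a recognizable shape, they are also easy to pull out of log files. The sketch below assumes a code is the prefix `HSEARCH` followed by digits, as in the example above; the `ErrorCodes` class and the sample log line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts error codes from log text (illustrative only). */
final class ErrorCodes {
    // Assumed shape: the "HSEARCH" prefix followed by one or more digits.
    private static final Pattern CODE = Pattern.compile("\\bHSEARCH\\d+\\b");

    /** Returns every error code found in the given text, in order of appearance. */
    static List<String> find(String logText) {
        List<String> codes = new ArrayList<>();
        Matcher matcher = CODE.matcher(logText);
        while (matcher.find()) {
            codes.add(matcher.group());
        }
        return codes;
    }

    public static void main(String[] args) {
        // Made-up log line; the message text is not taken from Hibernate Search.
        String line = "ERROR HSEARCH00043: something went wrong";
        System.out.println(find(line));
    }
}
```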

Integration with Hibernate Core 4

'Nuff said.

Move to the per index backend architecture

This gives you more flexibility in how your entities are indexed, and gives us room for additional optimizations down the road. You can use a different technology for each index: for example, a Lucene backend for some indexes and an Infinispan index for others that need real-time clustering. It is also now possible to configure the performance parameters of each index separately, from the async/sync option to the number of worker threads and the queue size of the backend executor.
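
As a rough sketch of what per-index tuning can look like, the snippet below builds a set of configuration properties in which one index overrides the defaults. The property names follow the `hibernate.search.[default|<index name>].worker.*` pattern as we understand it for the 4.x line; treat them as assumptions and check them against the documentation of the release you use. The index name `Books` and the values are hypothetical.

```java
import java.util.Properties;
import java.util.TreeSet;

/** Builds per-index worker settings (property names are assumptions, see the text above). */
final class PerIndexConfig {
    /** Composes a property key for the given index name ("default" applies to all indexes). */
    static String key(String indexName, String option) {
        return "hibernate.search." + indexName + "." + option;
    }

    static Properties build() {
        Properties properties = new Properties();
        // Default for every index: synchronous updates.
        properties.setProperty(key("default", "worker.execution"), "sync");
        // Hypothetical index "Books": asynchronous updates with its own executor tuning.
        properties.setProperty(key("Books", "worker.execution"), "async");
        properties.setProperty(key("Books", "worker.thread_pool.size"), "4");
        properties.setProperty(key("Books", "worker.buffer_queue.max"), "1000");
        return properties;
    }

    public static void main(String[] args) {
        Properties properties = build();
        // Print in sorted order so the default and per-index settings are easy to compare.
        for (String name : new TreeSet<>(properties.stringPropertyNames())) {
            System.out.println(name + " = " + properties.getProperty(name));
        }
    }
}
```

The same keys can of course go into a properties file or persistence.xml instead of being set programmatically.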

MassIndexer is no longer an exclusive mode

The MassIndexer no longer locks out the main backend listening for Hibernate events, so it can be started while other transactions are running. Until it has finished, however, some results might be missing from the index.

New binary format of communication between remote backends

Lucene no longer guarantees the Serializable contract for Documents and Fields. This is a problem when you use a clustered model for Hibernate Search.

So we have introduced a new communication protocol in the JMS and JGroups backends so they can pass along Lucene works in a safe way. This aligns with our quest to shield you from incompatible changes Lucene may make in future versions. We also want to make it easier to upgrade a cluster of Hibernate Search nodes and let them interact without issue when possible.

As a result, you can now use NumericField in a clustered environment, which was previously not possible because it has never been Serializable.
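
To illustrate the general idea of an explicit wire format, here is a minimal sketch that encodes a numeric field by hand instead of relying on Java serialization. This is not the protocol used by the JMS and JGroups backends; the `FieldWire` class, its layout and the version number are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hand-written binary encoding of a numeric field (hypothetical layout, illustrative only). */
final class FieldWire {
    // Hypothetical version tag, written first so that readers can reject formats they do not know.
    static final int FORMAT_VERSION = 1;

    /** Encodes the field as: version (int), name (UTF), value (int). */
    static byte[] write(String name, int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(FORMAT_VERSION);
            out.writeUTF(name);
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decodes a field written by write() and returns it as "name=value". */
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            if (version != FORMAT_VERSION) {
                throw new IllegalArgumentException("Unsupported format version " + version);
            }
            String name = in.readUTF();
            int value = in.readInt();
            return name + "=" + value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = write("price", 42);
        System.out.println(read(message));
    }
}
```

Because the layout is owned by the sender and receiver rather than by the class being transported, it keeps working even if that class stops implementing Serializable, and the version tag leaves room for nodes on different releases to detect a mismatch.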

Aggressive on performance

The backend is now quite aggressive about write performance: exclusive_index_use is enabled by default, and some of the performance tricks from the MassIndexer have been merged back into the standard backend so that you benefit from them all the time. For example, the new backend design allows us to analyse Documents in multiple threads while still guaranteeing that writes happen in the proper order. This is configured with the worker.thread_pool.size property, which defaults to one and applies even to backends configured for synchronous updates.
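
The pattern of analysing in parallel while writing in order can be sketched with a plain thread pool. This is only an illustration of the technique, not the backend's actual code: the `OrderedPipeline` class is hypothetical, and `analyse` is a trivial stand-in for the real analysis step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Analyses documents in parallel but applies the results in submission order (illustrative only). */
final class OrderedPipeline {
    /** Stand-in for the expensive, thread-safe analysis step. */
    static String analyse(String document) {
        return document.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the analysed documents in the same order as the input, whatever order the threads finish in. */
    static List<String> process(List<String> documents, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // Submit every document; the list of futures remembers the original order.
            List<Future<String>> pending = new ArrayList<>();
            for (String document : documents) {
                Callable<String> task = () -> analyse(document);
                pending.add(pool.submit(task));
            }
            // "Write" strictly in submission order, waiting for each result in turn.
            List<String> written = new ArrayList<>();
            for (Future<String> future : pending) {
                written.add(future.get());
            }
            return written;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList(" First ", "SECOND", " third"), 3));
    }
}
```

The single consuming loop is what keeps the writes ordered, so the pool size only affects how much analysis work overlaps.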

Get the release

That's all for now. Check out the release and make sure to read the Migration Guide.

Many thanks to the community and particularly Davide D'Alto for his contributions and Adam Harris, Samppa Saarela and Elmer van Chastelet for their suggestions for performance and design improvements.

