I work in the Hibernate and Infinispan teams at JBoss, looking after Lucene integration in the products we support: striving to make it easier to use, to integrate it with well-known APIs and patterns, and finally to make it scale better. I love clean and well-performing code.

I was an early adopter of cloud deployments, scaling Lucene to a huge number of requests on EC2 using Hibernate Search; after that I worked with Sourcesense to make JIRA clusterable via Infinispan. I have also been a trainer on Seam and Hibernate courses.

Location: Newcastle, UK
Occupation: Doing stuff at JBoss, a Division of Red Hat Inc

Last night I uploaded two bugfix releases of Hibernate Search stable branches:

  • 4.4.3.Final (Hibernate ORM 4.2 and JPA 2.0 users, JBoss 7.2 and EAP6)
  • 4.5.1.Final (Hibernate ORM 4.3 and JPA 2.1 users, WildFly 8)

They both contain several backported fixes, thanks to the excellent testing efforts of Guillaume Smet and Yoann Rodiere, who found very sophisticated issues and also helped with patches. I have now added Yoann as a committer too, congratulations!

Details of fixes can be found in the 4.4.3.Final changelog.txt and 4.5.1.Final changelog.txt.

Happy searching!

Version 5.0.0.Alpha3 is now available: now integrating with Apache Lucene 4.7.1, which was released just 24 hours before.


More Like This

Introduced and better described in our previous post and in the Query DSL chapter, the new feature now also works with compressed fields and @IndexedEmbedded fields.

OSGi and ClassLoaders

On our path to 5.0 we're aiming for a full internal refactoring of ClassLoader handling, service loading strategies and so on, with the goal of being reliable in complex modular deployments, including OSGi. To reach full OSGi compatibility, some public API packages will need to change in the next version too!

Many smaller details

There is also a list of smaller polishing tasks, like more reliable JGroups and Infinispan tests, a diet program for dependencies, and updates to the latest Hibernate ORM, JGroups and Infinispan versions.

Performance tuning

The primary performance bottleneck I've observed in the new Lucene 4 backend is tied to the max_thread_states option on Lucene's IndexWriter, which needs tuning. This option controls the level of parallelism you want to allow to the IndexWriter. The default is a very reasonable 8, but it is now configurable using the usual format described in the Lucene Tuning chapter: [default|<indexname>].indexwriter.max_thread_states
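As a minimal sketch of how that key format expands, the plain Java below assembles the property names for the default scope and for one specific index. It only builds a java.util.Properties object and does not start Hibernate Search. The "hibernate.search." prefix is an assumption (the format above shows only the suffix), and the index name "coffees" and the value 16 are hypothetical.

```java
import java.util.Properties;

// Sketch only: builds configuration keys, it does not start Hibernate Search.
class IndexWriterTuning {

    // Assumption: Hibernate Search property names start with this prefix.
    static final String PREFIX = "hibernate.search.";

    // Builds the key for a scope, which is either "default" or an index name.
    static String maxThreadStatesKey(String scope) {
        return PREFIX + scope + ".indexwriter.max_thread_states";
    }

    // Sets one value for all indexes and an override for a single index.
    static Properties tuning(String indexName, int defaultStates, int indexStates) {
        Properties props = new Properties();
        props.setProperty(maxThreadStatesKey("default"), Integer.toString(defaultStates));
        props.setProperty(maxThreadStatesKey(indexName), Integer.toString(indexStates));
        return props;
    }

    public static void main(String[] args) {
        // "coffees" is a hypothetical index name; 16 is an arbitrary example value.
        Properties props = tuning("coffees", 8, 16);
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The resulting entries could then be copied into whatever configuration source your application already uses; whether a higher value helps depends on how many threads actually write to the index concurrently.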

What's next?

We're currently busy with OSGi tests, an easy way to extend the set of FieldBridges supported by the engine, improved handling of dynamic types and the overall structure of how you define your indexed model. It's also worth noting that all of this will be integrated in the Infinispan Query engine soon. You can find a high-level overview on our Roadmap page.

The release 5.0.0.Alpha2 is now available on our shiny new website: as the alpha1 release also did, it integrates with Apache Lucene 4.6.1, but now we do it better ;-)


More Like This queries

New features! A More Like This query is a special kind of query which takes a model document as input, rather than a traditional string. It has been available for a long time via Lucene's MoreLikeThis Query implementation, but that implementation was rather tricky to use on our richer entity-based model. Hibernate Search now provides direct support for this functionality via our Query builder DSL, and in its simplest form it looks like this:

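Coffee exampleCoffee = ...

// Obtain a query builder for the Coffee entity
QueryBuilder qb = fullTextSession.getSearchFactory()
        .buildQueryBuilder()
        .forEntity( Coffee.class )
        .get();

// Build a More Like This query comparing all fields to the example
Query mltQuery = qb
        .moreLikeThis()
            .comparingAllFields()
            .toEntity( exampleCoffee )
            .createQuery();

// Run the query and fetch the similar Coffee instances
List results = fullTextSession
        .createFullTextQuery( mltQuery, Coffee.class )
        .list();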

What does it do? It returns a list of Coffee instances which are similar to the exampleCoffee instance. The definition of similar is as usual controlled by the analyzers and indexing options you choose. By default the list is of course ordered according to the scoring model, so the top match would be the example entity itself (this might be surprising but is often useful in practice).
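If you would rather not show the example entity among its own matches, one option is to filter it out of the returned list afterwards. The sketch below is plain Java post-processing, not part of the Hibernate Search API; the class and method names are hypothetical, strings stand in for Coffee entities, and it assumes equals() identifies the example instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: client-side filtering of a result list.
class SimilarResults {

    // Returns a copy of the results without any element equal to the example,
    // preserving the original (score-based) order of the remaining elements.
    static <T> List<T> withoutExample(List<T> results, T example) {
        List<T> filtered = new ArrayList<>(results.size());
        for (T candidate : results) {
            if (!candidate.equals(example)) {
                filtered.add(candidate);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Strings stand in for Coffee entities; the first one plays the example.
        List<String> results = Arrays.asList("Espresso", "Ristretto", "Lungo");
        System.out.println(withoutExample(results, "Espresso")); // prints [Ristretto, Lungo]
    }
}
```

Filtering after the fact keeps the query itself unchanged, at the cost of returning one result fewer than requested when pagination is in use.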

A more extensive blogpost about this will follow, but if you can't wait to learn more see all details in the Building queries chapter.

Faceting improvements

This was one of the highest voted improvement requests on JIRA: it is now finally possible to facet on embedded collections. Hardy also started exploring possible performance improvements and how to use the new Lucene 4 features; feedback, use cases or patches would be very welcome, as we're eager to improve faceting further.

Watch the migration guide

If you're updating an application from previous versions of Hibernate Search, we highly recommend keeping an eye on the Migration Guide, as the changes in the Lucene API are significant and not always self-documenting. Suggestions for the migration guide are also very welcome.

The Apache Lucene Migration Guide might also be useful, but we applied most of it already to the internal engine for you to use transparently.

The hibernate-search-analyzers module is removed

This module was created years ago when we had to fork some Lucene code to allow an easy migration path, but it has long been an empty module just depending on various commonly used analyzers. It's time for a spring cleaning of dependencies, so the no longer needed module is removed: if you were using it, just remove it from your project and include a direct dependency on the analyzers you need from the Apache Lucene ecosystem.

What's next?

You can find a high-level overview on our Roadmap page, or check the fine-grained breakdown on this JIRA filter. Essentially we're now aiming at OSGi compatibility and at usability improvements which had to be postponed to a major release.

The first milestone using the latest Apache Lucene is now available as version 5.0.0.Alpha1.


Not just an Alpha release

Since this is the starting point of a new major release 5.0, we will be making many API improvements too. But since migrating to Lucene 4 is not a simple drop-in replacement, this will probably force you to make several changes in code using Lucene APIs directly. For this reason, during this initial Alpha1 milestone we intentionally avoided making any change in the Hibernate Search APIs so that you can use this version as a safe harbor milestone to simplify your migration.

You'll need the migration guide

As always, our Migration Guide was updated; you're probably going to need it. If I've failed to document some needed change, or if any aspect is unclear, please let us know: we'll be happy to evolve the guide.

The Apache Lucene Migration Guide might also be useful, but we applied most of it already to the internal engine for you to use transparently.

No longer depending on Apache Solr

We never used much from Solr, other than taking advantage of its powerful and extensive collection of Analyzer helpers. These have now all moved into Apache Lucene, a welcome cleanup of our dependency tree.

What's next?

We will now start adapting our APIs to make the most of the new Lucene capabilities. As usual, refer to JIRA and our Roadmap, and feel free to make suggestions.

Hibernate Search 4.5.0.Final is available now.

This minor release could be promoted quickly as we didn't include any new features compared to the 4.4 series, focusing instead on compatibility with Hibernate ORM 4.3 and WildFly 8 (JPA 2.1 and JavaEE 7 respectively).

WildFly 8 Integration

The WildFly application server will include this Hibernate Search version, making it even simpler to get started. Our documentation explains how to activate the module, but this will be outdated soon!

Essentially you need to either

  • Add a line to the MANIFEST of your deployment
  • Declare the dependency in a jboss-deployment-structure.xml file included in your deployment

The documentation still instructs you to download the necessary modules, as that's required with WildFly 8.0.0.CR1, but this step should not be necessary in the final version of WildFly 8!

Of course, we'll still provide the same modules in the future, so that you won't be limited to using the version included in WildFly but will always have the option to choose a different one.

OpenShift users

Since Hibernate Search is being included in WildFly 8, we're looking forward to it being available to all OpenShift users via the WildFly cartridge.

Why should you upgrade?

As a reminder of all the good reasons to update, these are the most notable improvements of the 4.5 branch:

JPA 2.1 compatibility

This Hibernate Search version is meant to work with the Hibernate ORM 4.3.x series, our implementation of the JPA 2.1 standard, now included in WildFly 8.

Improved performance

Both Hibernate ORM and Hibernate Search are getting leaner with each release, allowing you to make better use of your memory.

Simplified MassIndexer

The MassIndexer is now simpler to tune, and some problems related to lazy initialization exceptions were resolved.


Next steps

We expect to start rolling out preview tags of Hibernate Search 5 very soon: this is going to be based on the highly requested Apache Lucene 4.

Consequently the 4.5 branch is from now on in maintenance mode, and will receive only critical fixes or fixes contributed by willing users.

See our Roadmap for an overview of the plan, and don't hesitate to send suggestions our way!
