I'm working in the Hibernate and Infinispan teams at JBoss, caring about Lucene integration in products we support, striving to make it easier to use and to integrate in well known APIs and patterns, and finally to make it scale better; I love clean and well performing code.

I've been an early adopter of cloud deployments, scaling Lucene to a huge number of requests on EC2 using Hibernate Search, and after that I worked with Sourcesense to make JIRA clusterable via Infinispan. I have also been a trainer on Seam and Hibernate courses.

Location: Newcastle, UK
Occupation: Doing stuff at JBoss, a Division of Red Hat Inc
Archive

Today we released yet another milestone for the Hibernate Search 5 train. We worked in parallel on multiple fronts; the most notable changes are:

OSGi support, API changes

All our modules now provide OSGi metadata to make life easier for users running in OSGi containers. We also included an example features file and integration tests using Karaf.

Keep in mind that Apache Lucene does not provide this same metadata, so it might be worth looking into the features file we use to learn how the Lucene modules need to be wrapped.

Class Relocations

A consequence of being OSGi compliant is that we had to move some packages of well-known APIs; please see the migration guide for all details.

Apache Lucene 4.8.1, Java7 now required

Apache Lucene has required Java7 since version 4.8, and we don't want you to miss out on the great improvements it provides, or on potential bugfixes in the near future, so we now require Java7 too.

Apache Lucene 4.8.1 was released today, so we could include it in this release too.

Bridge Providers loaded by auto-discovery

We always had a strong differentiation between the FieldBridge(s) included in Hibernate Search and custom (application provided) FieldBridges. From this release the discovery of built-in bridges uses the Service Loader pattern, so that we can move some bridge implementations to optional modules. Eventually this will also let us provide support for the new date/time types defined in Java8 and for those of Joda Time, and potentially for your own custom types, but that will need some further refinement work.
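To illustrate the Service Loader pattern mentioned above, here is a minimal sketch built only on the JDK's java.util.ServiceLoader. The ExampleBridgeProvider interface and the BridgeDiscoverySketch class are hypothetical names made up for this example; they are not the Hibernate Search SPI, and the bridges are represented as plain strings to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical contract, for illustration only: NOT the Hibernate Search SPI.
// An implementation would be registered by listing its class name in a file
// named META-INF/services/<fully qualified interface name> inside its jar.
interface ExampleBridgeProvider {
    // Returns a bridge identifier for the given type, or null if unsupported.
    String bridgeFor(Class<?> propertyType);
}

class BridgeDiscoverySketch {

    // Collects every provider registered on the given class loader.
    // With no registration file on the classpath the result is empty.
    static List<ExampleBridgeProvider> discover(ClassLoader classLoader) {
        List<ExampleBridgeProvider> providers = new ArrayList<>();
        for (ExampleBridgeProvider provider
                : ServiceLoader.load(ExampleBridgeProvider.class, classLoader)) {
            providers.add(provider);
        }
        return providers;
    }

    // Asks each provider in turn; the first non-null answer wins.
    static String resolve(List<ExampleBridgeProvider> providers, Class<?> propertyType) {
        for (ExampleBridgeProvider provider : providers) {
            String bridge = provider.bridgeFor(propertyType);
            if (bridge != null) {
                return bridge;
            }
        }
        return null;
    }
}
```

The point of the pattern is that an optional module only has to ship its provider class and the registration file: the engine discovers it at startup without any hard-coded reference to it.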

Several other improvements

These won't make the headlines the way a Java requirements change does, but we still have some more relevant news:

  • Infinispan upgraded to 7.0.0.Alpha4: now also requires Java7 and supports the distributed Lucene Directory for Apache Lucene 4.8
  • the needed Infinispan update implies using latest JGroups 3.5.0.Beta5
  • all our documentation was migrated to AsciiDoc; it's now much easier to contribute to documentation!

<dependency>
 <groupId>org.hibernate</groupId>
 <artifactId>hibernate-search-orm</artifactId>
 <version>5.0.0.Alpha4</version>
</dependency>

Last night I uploaded two bugfix releases of Hibernate Search stable branches:

  • 4.4.3.Final (Hibernate ORM 4.2 and JPA 2.0 users, JBoss 7.2 and EAP6)
  • 4.5.1.Final (Hibernate ORM 4.3 and JPA 2.1 users, WildFly 8)

They both contain several backported fixes, thanks to the excellent testing efforts of Guillaume Smet and Yoann Rodiere, who found very sophisticated issues and also helped with patches. I have now added Yoann as a committer too, congratulations!

Details of fixes can be found in the 4.4.3.Final changelog.txt and 4.5.1.Final changelog.txt.

Happy searching!

Version 5.0.0.Alpha3 is now available: now integrating with Apache Lucene 4.7.1, which was released just 24 hours before.

<dependency>
 <groupId>org.hibernate</groupId>
 <artifactId>hibernate-search-orm</artifactId>
 <version>5.0.0.Alpha3</version>
</dependency>

More Like This

Introduced and better described in our previous post and in the Query DSL chapter, the new feature now also works with compressed fields and @IndexedEmbedded fields.

OSGi and ClassLoaders

On our path to 5.0 we're aiming at a full internal refactoring of ClassLoader handling, Service loading strategies and so on, with the goal of being reliable in complex modular deployments, including OSGi. To reach full OSGi compatibility some public API packages will need to change in the next version too!

Many smaller details

There is also a list of smaller polishing tasks, like more reliable JGroups and Infinispan tests, a diet program for dependencies, and updates to the latest Hibernate ORM, JGroups and Infinispan versions.

Performance tuning

The primary performance bottleneck I've observed in the new Lucene 4 backend is the need to tune the max_thread_states option on Lucene's IndexWriter. This option controls the level of parallelism you want to allow to the IndexWriter. The default is a very reasonable 8, but this is now configurable using the typical format as expressed in the Lucene Tuning chapter:

hibernate.search.[default|<indexname>].indexwriter.max_thread_states
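As a rough sketch of how that key expands, the following helper builds the property name for either the default scope or a named index and stores a value in a java.util.Properties object. The class and method names are hypothetical, and the value 16 and the index name "coffees" used in the usage note are arbitrary illustrations rather than recommendations; how the resulting properties reach Hibernate Search (persistence.xml, hibernate.properties, programmatic configuration) depends on your setup.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
class IndexWriterTuningSketch {

    // Builds the configuration key; a null index name selects the "default" scope.
    static String maxThreadStatesKey(String indexName) {
        String scope = (indexName == null) ? "default" : indexName;
        return "hibernate.search." + scope + ".indexwriter.max_thread_states";
    }

    // Returns a copy of the given properties with max_thread_states set.
    static Properties withMaxThreadStates(Properties base, String indexName, int value) {
        if (value < 1) {
            throw new IllegalArgumentException("max_thread_states must be at least 1");
        }
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty(maxThreadStatesKey(indexName), Integer.toString(value));
        return copy;
    }
}
```

For example, withMaxThreadStates(new Properties(), null, 16) sets the option for all indexes, while passing "coffees" as the index name would scope it to that single index.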

What's next?

We're currently busy with OSGi tests, an easy way to extend the set of FieldBridges supported by the engine, improved handling of dynamic types and the overall structure of how you define your indexed model. It's also worth noting that all of this will be integrated in the Infinispan Query engine soon. You can find a high-level overview on our Roadmap page.

The release 5.0.0.Alpha2 is now available on our shiny new website: as the alpha1 release also did, it integrates with Apache Lucene 4.6.1, but now we do it better ;-)

<dependency>
 <groupId>org.hibernate</groupId>
 <artifactId>hibernate-search-orm</artifactId>
 <version>5.0.0.Alpha2</version>
</dependency>

More Like This queries

New features! A More Like This query is a special kind of query which takes a model document as an input, rather than a traditional string. It has been available for a long time via Lucene's MoreLikeThis Query implementation, but this implementation was rather tricky to use on our richer entity-based model. Hibernate Search now provides direct support for this functionality via our Query builder DSL, and in its simplest form it looks like this:

Coffee exampleCoffee = ...

QueryBuilder qb = fullTextSession.getSearchFactory()
        .buildQueryBuilder()
        .forEntity( Coffee.class )
        .get();

Query mltQuery = qb
        .moreLikeThis()
            .comparingAllFields()
            .toEntity( exampleCoffee )
            .createQuery();

List results = fullTextSession
        .createFullTextQuery( mltQuery, Coffee.class )
        .list();

What does it do? It returns a list of Coffee instances which are similar to the exampleCoffee instance. The definition of similar is as usual controlled by the analyzers and indexing options you choose. By default the list is of course ordered according to the scoring model, so the top match would be the example entity itself (this might be surprising but is often useful in practice).

A more extensive blogpost about this will follow, but if you can't wait to learn more see all details in the Building queries chapter.

Faceting improvements

Faceting on embedded collections, one of the highest-voted improvement requests on JIRA, is now finally possible. Hardy also started exploring possible performance improvements, and how to use the new Lucene 4 features: feedback, use cases or patches would be very welcome as we're eager to improve faceting more.

Watch the migration guide

If you're updating an application from previous versions of Hibernate Search, we highly recommend keeping an eye on the Migration Guide, as the changes in the Lucene API are significant and not always self-documenting. Suggestions for the migration guide are also very welcome.

The Apache Lucene Migration Guide might also be useful, but we applied most of it already to the internal engine for you to use transparently.

The hibernate-search-analyzers module is removed

This module was created years ago when we had to fork some Lucene code to allow an easy migration path, but it has long been an empty module that just depends on various commonly used analyzers. It's time for a spring cleaning of dependencies, so the no longer needed module is removed: if you were using it, just remove it from your project and include a direct dependency on the analyzers you need from the Apache Lucene ecosystem.

What's next?

You can find a high-level overview on our Roadmap page, or check the fine-grained breakdown on this JIRA filter. Essentially we're now aiming at OSGi compatibility and at usability improvements which had to be postponed to a major release.

The first milestone using the latest Apache Lucene is now available as version 5.0.0.Alpha1.

<dependency>
 <groupId>org.hibernate</groupId>
 <artifactId>hibernate-search-orm</artifactId>
 <version>5.0.0.Alpha1</version>
</dependency>

Not just an Alpha release

Since this is the starting point of a new major release 5.0, we will be making many API improvements too. But since migrating to Lucene 4 is not a simple drop-in replacement, this will probably force you to make several changes in code using Lucene APIs directly. For this reason, during this initial Alpha1 milestone we intentionally avoided making any change in the Hibernate Search APIs so that you can use this version as a safe harbor milestone to simplify your migration.

You'll need the migration guide

As always our Migration Guide was updated; you're probably going to need it. If I've failed to document some needed change, or if any aspect is unclear, please let us know: we'll be happy to evolve the guide.

The Apache Lucene Migration Guide might also be useful, but we applied most of it already to the internal engine for you to use transparently.

No longer depending on Apache Solr

We never used much from Solr, other than taking advantage of its powerful and extensive collection of Analyzer helpers. These have now all moved into Apache Lucene, a welcome cleanup of our dependency tree.

What's next?

We will now start adapting our APIs to make the most of Lucene's new capabilities. As usual refer to JIRA and our Roadmap, and feel free to make suggestions.
