Red Hat

In Relation To Hibernate Search

In Relation To Hibernate Search

Eclipse tools for Hibernate Search

Posted by Dmitry Bocharov    |       |    Tagged as Hibernate Search JBoss Tools

I’m glad to announce the first release of the Eclipse plugin for Hibernate Search. In this post I want to describe its features and ask you for any comments, positive or (even more important for me) negative.

Installation

The plugin was made as a feature of jbosstools-hibernate plugin, which can be downloaded and installed on its own or together with the full JBoss Tools distribution. After that you can install the Hibernate Search plugin via the Eclipse Marketplace.

All requirements, such as eclipse version and platform support, are listed in the link.

In order to work with Hibernate Search you have to set the Hibernate configuration properties hibernate.search.default.directory_provider and hibernate.search.default.indexBase.

Functionality

The plugin was thought to be some kind of a Luke tool inside Eclipse. It was thought to be more convenient than launching a separate application, and picks up the configuration directly from your Hibernate configuration.

Three options were added to the console configurations: Index rebuild, Explore documents and Try analysers.

Index Rebuild Action screenshot

Index rebuild

When introducing Hibernate Search in an existing application, you have to create an initial Lucene index for the data already present in your database.

The option "Rebuild index" will do so by re-creating the Lucene index in the directory specified by the hibernate.search.default.indexBase property.

Screenshot of Hibernate Search indexed entities
Screenshot for Hibernate Search configuration properties

Explore Documents

After creating the initial index you can now inspect the Lucene Documents it contains.

All entities annotated as @Indexed are displayed in the Lucene Documents tab. Tick the checkboxes as needed and load the documents. Iterate through the documents using arrows.

Screenshot of Lucene Documents inspection

Try the Analyzers

The "try analyzers" instrument allows you to view the result of work of different Lucene Analyzers. The combo-box contains all classes in the workspace which extend org.apache.lucene.analysis.Analyzer, including custom implementations created by the user. While you type the text you want to analyse, the result immediately appears in the AnalysisResultTab view.

Screenshot of Try Analyzers

Possible issues

One problem which you might have is that "Index rebuild" option seems to do nothing. As a temporary workaround try to set the Hibernate configuration property "hibernate.search.autoregister_listeners" to "true" explicitly.

If you have any other problems, such as unexpected behaviour, strange windows with exceptions or any errors in the Error log view feel free to contact me directly anywhere or just create an issue in the plugin github page.

Plans

  • Make options "Index rebuild" and "Explore documents" available not only for configurations, but also for concrete entities under session factory.

  • Make Lucene Documents view more comfortable to use and add there more features from Luke tool, for example, the ability to search over documents.

  • Increasing stability of the plugin and implementing your suggestions!

Version 5.5.2.Final is now available, our latest stable version sporting integration with Hibernate ORM 5 and Apache Lucene 5.3 - the state of the art.

Creating this version to be compatible with these two great OSS projects kept us busy for a good deal of this past year; I remember discussing this option with superstar OSS contributors Uwe Schindler (Apache Lucene developer) and Gustavo Nalle (Infinispan developer) at FOSDEM in January 2015! I am grateful to both for their guidance and suggestions, as driving progress forward is sometimes challenging when we strive to keep backwards compatibility as best as we can.

On top of that, our same small but amazing team as been working hard on Hibernate OGM 5, a bit of Hibernate Validator, incredible performance improvements on Hibernate ORM "classic" and is still tinkering on the internals of Hibernate Search to make an Elasticsearch backend an alternative to plain Lucene.

Feel like helping with the Elasticsearch integration?

Version 5.6 will feature an experimental integration with Elasticsearch. Early feedback would be very welcome! If you feel like helping, quite some code is integrated in the master branch already and there are clear TODO and FIXME comments to be found in the code.

Feel free to join the hacking! But best to let us know which area you feel like working on to avoid duplication of efforts and conflicting patches.

The 5.5.2.Final release and our great community

Technically I pushed the release button the 24th, but I couldn’t publish the blog until now because of travels.

On top of the improvements from previous 5.5.1.Final release which sported significant performance improvements, this version now incorporates several important corrections by Yoann Rodière and Guillaume Smet around metadata which was producing incorrect range queries for embedded numeric types.

Yoann also contributed another great performance improving patch, by limiting the recursion triggered by @ContainedIn more strictly, by considering the indexed paths and not just the recursion depth. You might not notice this if you have very simple flat models, but this can provide a very significant boost for all those of you using Hibernate Search to index rich models with extensive usage of @ContainedIn annotations.

Many thanks to Yoann and Guillaume, they are "power users" who do the right thing: their hands-on regular feedback is invaluable, not to mention they send such great patches for everybody else to benefit from. Not least, this is very encouraging for us: let us know which great things you’re building with these libraries!

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.2.Final</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.6.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.1</version>
</dependency>

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us in our forums or mailing lists.

Hibernate Search 5.5.1.Final is now available!

Feedback about our recent 5.5.0.Final release has been great, and while the good news is that nobody reported significant issues, some people also pointed out that the new sorting system was a bit limited.

So we’ve been working on enhancing the FieldBridge API to make sure that those more expert users who implement their own bridging would have a better control on how sorting works as well.

We’ve also started some work to push the performance higher, and overall I’m proud to state that this 5.5.1.Final release is including some small internal polish, but results in measurable improvements.

Sorting & FieldBridge improvements

If you create custom FieldBridge implementations, you can now declare field metadata to benefit from the improved sorting performance.

Since Hibernate Search 5.5, we recommend you explicitly mark which fields will be used for sorting via the @SortableField annotation.

This has an effect on indexing and triggers improved query performance as long as Hibernate Search understands it can use the new more effective sorting strategy.

As kindly reported by Ashot Golovenko, when implementing a custom FieldBridge there was no way to let the Hibernate Search Engine component know which custom created fields are valid to sort on, so even if your implementation was clever enough to index fields appropriately to take advantage of the new sorting capabilities, this wouldn’t be used at query time.

This has been fixed by introducing an extension to FieldBridge: org.hibernate.search.bridge.MetadataProvidingFieldBridge which allows you to configure the metadata correctly. Beware though: this approach is meant for expert users, and the Query engine is going to trust that the metadata you define is actually reflected by what your bridge implementation writes into the index. In case of mismatches, you’ll have runtime exceptions during query time when sorting on a field which wasn’t indexed as declared.

Spot inefficient sorting operations easily in test suites

While Hibernate Search already would log a warning when forced to perform a query using a sub-optimal sorting strategy, that wasn’t making it very easy to spot mapping or usage mistakes.

You can now set this property:

    hibernate.search.index_uninverting_allowed = false

and you’ll have your tests fail with a reasonable exception rather than log the warning.

Performance improvements

There are many internal improvements related to performance.

The most interesting one is that now we’ll be able to automatically skip scoring in various index housekeeping operations. This implies you’ll see lower CPU usage on some index write and update operations, and also improved query performance when certain automatic filtering needs to be applied on your queries, such as for narrowing down the entity types, apply sharding related filtering.

The sorting operations have been improved as well: we can now skip the index uninverting process when sorting on distances during spatial queries, or when sorting by scores.

Memory usage has been reduced as well! Special thanks to Andrej Golovnin to have diagnosed and reported HSEARCH-2029, that fix alone will reduce our permanent memory usage.

We also reduced allocation of several short lived but heavy objects being used during indexing and query execution, overall this should improve the efficiency of the JVM.

The performance work is an on-going challenge: I’m quite happy to see very respectable figures, but we’re planning even more improvements, and if you have profiling data or other useful data to share don’t be selfish and share it! Always happy to improve it further.

Components upgrades

Several components were upgraded; most notably we’re now using the latest Apache Lucene version 5.3.1.

  • Upgrade Narayana to 5.2.5.Final

  • Upgrade JGroups to 3.6.6.Final

  • Upgrade Hibernate ORM to 5.0.4.Final

  • Upgrade Apache Lucene to 5.3.1

  • Upgrade to Hibernate Commons Annotations 5.0.1.Final

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.1.Final</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.4.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.1</version>
</dependency>

Next?

We’re working on version 5.6, as previously announced it’s going to sport an experimental integration with Elasticsearch.

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us in our forums or mailing lists.

Hibernate Search 5.5 Final is out

Posted by Davide D'Alto    |       |    Tagged as Hibernate Search Releases

I’m happy to announce the latest final release of Hibernate Search: Hibernate Search 5.5 Final. We mainly consolidated the features included in the latest candidate release and worked on fixing some bugs.

As a reminder on versions:

Hibernate Search now requires at least Hibernate ORM 5.0.0.Final, and at the time of writing only Infinispan 8.0.1.Final supports real time replication of an Apache Lucene 5.3 index.

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.0.Final</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.1.Final</version>
</dependency>
<dependency>
   <groupId>org.infinispan</groupId>
   <artifactId>infinispan-directory-provider</artifactId>
   <version>8.0.1.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.0</version>
</dependency>

WildFly 10, Lucene 5

Last week, we released Hibernate Search 5.4 Final with the minimal set of changes to make it work with Hibernate ORM 5. This version uses Lucene 5.3; if you haven’t upgraded already, we recommend to start with version 5.4.0.Final, first.

Hibernate Search 5.5 will be included in WildFly 10.

Out of the box indexing of java.time types

Hibernate Search is now able to automatically provide a sensible mapping for java.time.Year, java.time.Duration java.time.Instant, java.time.LocalDate, java.time.LocalTime, java.time.LocalDateTime, java.time.LocalTime, java.time.MonthDay, java.time.OffsetDateTime, java.time.OffsetTime, java.time.Period, java.time.YearMonth, java.time.ZoneDateTime, java.time.ZoneId, java.time.ZoneOffset.

That means that it includes an out of the box FieldBridge for each of these types, and allows transparent indexing and querying of properties of these types. You can of course customize the indexing by providing your own FieldBridge, as usual.

This feature is only available if you are running on a Java 8 runtime, although all other features of Hibernate Search are still backwards compatible with Java 7.

Sorting improvements

Since Apache Lucene 5.0 (and including 5.3 as we currently require) we can provide a significant performance improvement if you let us know in advance which fields you intend to use for sorting.

For this purpose a new annotation org.hibernate.search.annotations.SortableField is available. If you start using this annotation in your projects you’ll benefit from improved performance, but for those who don’t want to update their mapping yet we will fallback to the older strategy.

This subject is discussed in detail in this follow-up post.

Encoding null tokens in your index

When using @Field(indexNullAs=) to encode some marker value in the index, the type of the marker is now required to be of a compatible field type as all other values which are indexed in that same field.

This was problematic for `NumericField`s, the ones optimised for range queries on numbers, as we would previously allow you to encode a string keyword like 'null': this is no longer allowed, you will have to pick a number to be used to represent the null value.

As an example for an "age" property you might want to use:

@Field(indexNullAs = "-1")
Integer nullableAge;

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from SourceForge, or get it from Maven Central, and don’t hesitate to reach us in our forums or mailing lists.

Order, ooorder! Sorting results in Hibernate Search 5.5

Posted by Gunnar Morling    |       |    Tagged as Hibernate Search

"Order, ooorder!" - Sometimes not only the honourable members of the House of Commons need to be called to order, but also the results of Hibernate Search queries need to be ordered in a specific way.

To do so, just pass a Sort object to your full-text query before executing it, specifying the field(s) to sort on:

FullTextSession session = ...;
QueryParser queryParser = ...;

FullTextQuery query = session.createFullTextQuery( queryParser.parse( "summary:lucene" ), Book.class );
Sort sort = new Sort( new SortField( "title", SortField.Type.STRING, false ) );
query.setSort( sort );
List<Book> result = query.list();

As of Lucene 5 (which is what Hibernate Search 5.5 is based on), there is a big performance gain if the fields to sort on are known up front. In this case these fields can be stored as so-called "doc value fields", which is much faster and less memory-consuming than the traditional approach of index un-inverting.

For that purpose, Hibernate Search provides the new annotation @SortableField (and it’s multi-valued companion, @SortableFields) for tagging those fields that should be available for sorting. The following example shows how do it:

@Entity
@Indexed(index = "Book")
public class Book {

    @Id
    private Integer id;

    @Field
    @SortableField
    @DateBridge(resolution = Resolution.DAY)
    private Date publicationDate;

    @Fields({
        @Field,
        @Field(name = "sortTitle", analyze = Analyze.NO, store = Store.NO, index = Index.NO)
    })
    @SortableField(forField = "sortTitle")
    private String title;

    @Field
    private String summary;

    // constructor, getters, setters ...
}

@SortableField is used next to the known @Field annotation. In case a single field exists for a given property (e.g. publicationDate) just specifying the @SortableField annotation is enough to make that field sortable. If several fields exist (see the title property), specify the field name via @SortableField#forField().

Note that sort fields must not be analyzed. In case you want to index a given property analyzed for searching purposes, just add another, un-analyzed field for sorting as it is shown for the title property. If the field is only needed for sorting and nothing else, you may configure it as un-indexed and un-stored, thus avoid unnecessary index growth.

For using the configured sort fields when querying nothing has changed. Just specify a Sort with the required field(s):

FullTextQuery query = ...;
Sort sort = new Sort(
    new SortField( "publicationDate", SortField.Type.LONG, false ),
    new SortField( "sortTitle", SortField.Type.STRING, false )
);
query.setSort( sort );

Now what happens if you sort on fields which you have not explicitly declared as sortable, e.g. when migrating an existing application over to Hibernate Search 5.5? The good news is that the sort will be applied as expected from a functional perspective. Hibernate Search detects the missing sort fields and transparently falls back to index-univerting.

But be aware that this comes with a performance penalty (also it is quite memory-intensive as uninverting is RAM-only operation) and it even might happen that this functionality will be removed from Lucene in a future version altogether. Thus watch out for messages like the following in your log files and follow the advice to declare the missing sort fields:

WARN ManagedMultiReader:142 - HSEARCH000289: Requested sort field(s) summary_forSort are not configured for entity \
type org.hibernate.search.test.query.Book mapped to index Book, thus an uninverting reader must be created. You \
should declare the missing sort fields using @SortField.

When migrating an existing application, be sure to rebuild the affected index(es) as described in the reference guide.

With all the required sort fields configured, your search results with be in order, just as the British parliament members after being called to order by Mr. Speaker.

Releasing Hibernate Search 5.5.0.CR1

Posted by Sanne Grinovero    |       |    Tagged as Hibernate Search Releases

The first Candidate Release for Hibernate Search 5.5 is now available, introducing integration with Apache Lucene 5.3 and several nice improvements.

As a reminder on versions: Hibernate Search now requires Hibernate ORM 5.0.0.Final at least, and at the time of writing only Infinispan 8.0.1.Final supports real time replication of an Apache Lucene 5.3 index.

<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-search-orm</artifactId>
   <version>5.5.0.CR1</version>
</dependency>
<dependency>
   <groupId>org.hibernate</groupId>
   <artifactId>hibernate-core</artifactId>
   <version>5.0.1.Final</version>
</dependency>
<dependency>
   <groupId>org.infinispan</groupId>
   <artifactId>infinispan-directory-provider</artifactId>
   <version>8.0.1.Final</version>
</dependency>
<dependency>
   <groupId>org.apache.lucene</groupId>
   <artifactId>lucene-core</artifactId>
   <version>5.3.0</version>
</dependency>

Out of the box indexing of java.time types

Hibernate Search is now able to automatically provide a sensible mapping for java.time.Year, java.time.Duration java.time.Instant, java.time.LocalDate, java.time.LocalTime, java.time.LocalDateTime, java.time.LocalTime, java.time.MonthDay, java.time.OffsetDateTime, java.time.OffsetTime, java.time.Period, java.time.YearMonth, java.time.ZoneDateTime, java.time.ZoneId, java.time.ZoneOffset.

That means that it includes an out of the box FieldBridge for each of these types, and allows transparent indexing and querying of properties of these types. You can of course customize the indexing by providing your own FieldBridge, as usual.

This feature is only available if you are running on a Java 8 runtime, although all other features of Hibernate Search are still backwards compatible with Java 7.

Sorting improvements

Since Apache Lucene 5.0 (and including 5.3 as we currently require) we can provide a significant performance improvement if you let us know in advance which fields you intend to use for sorting.

For this purpose a new annotation org.hibernate.search.annotations.SortableField is available. If you start using this annotation in your projects you’ll benefit from improved performance, but for those who don’t want to update their mapping yet we will fallback to the older strategy.

This subject is discussed in detail in this follow-up post.

Encoding null tokens in your index

When using @Field(indexNullAs=) to encode some marker value in the index, the type of the marker is now required to be of a compatible field type as all other values which are indexed in that same field.

This was problematic for `NumericField`s, the ones optimised for range queries on numbers, as we would previously allow you to encode a string keyword like 'null': this is no longer allowed, you will have to pick a number to be used to represent the null value.

As an example for an "age" property you might want to use:

@Field(indexNullAs = "-1")
Integer nullableAge;

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central using the above coordinates, and don’t hesitate to reach us in our forums or mailing lists.

Feedback always welcome! Please let us know of any problem before we release the Final version.

After two candidate releases and a bit of quiet time in waiting for Hibernate ORM 5.0.0.Final, we now published Hibernate Search 5.4.0.Final: the first stable release compatible with ORM 5.

Minimal API changes

This version is designed to minimize your work to upgrade the Hibernate stack, so that you can focus on the changes needed to get running with Hibernate ORM 5 without having to deal with changes in the Search API.

Plans for WildFly 10, Lucene 5

WildFly 10 is coming fast and will feature both Hibernate ORM 5 and Hibernate Search, so please take advantage of this Hibernate Search 5.4.0.Final release as a milestone to get your applications up to date.

In WildFly 10 however we plan to include a further evolved version of Hibernate Search, to also update Apache Lucene to the latest (currently Lucene 5.3).

So the plan is that we’ll include some evolution of Hibernate Search 5.5.x in WildFly 10, but upgrading both ORM and Lucene could be a lot of changes for you to face in one sprint so I highly recommend using this version 5.4.0.Final as an intermediate, reliable refactoring step.

Other improvements

Since the main mantra for this version was "don’t change anything" (except the Hibernate ORM version), the new and cool stuff will be coming in the next minor release.

Of course we also fixed some minor bugs; the most relevant ones are in the area of the Query DSL, improving the handling of null-tokens during query generation and better recognition of Numeric Fields.

For details, see:

  • HSEARCH-1981 QueryBuilder should not tokenize the null-token when searching for null

  • HSEARCH-1973 Make sure the metadata recognizes a field as Numeric even if it’s wrapped into a NullEncodingTwoWayFieldBridge

  • HSEARCH-1968 Avoid the NPE on empty results during a Faceting Query

Special thanks to Marcel Barbosa Pinto for identifying and fixing the NPE on Faceting queries!

How to get this release

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central, and don’t hesitate to reach us in our forums or mailing lists.

In the last couple of months we’ve been working to upgrade Hibernate Search to use Apache Lucene 5, to keep up with the greatest releases from the Lucene community.

Today Hibernate Search 5.5.0.Alpha1 is available!

As the version suggests this is the first cut, but we’ll also need your feedback and suggestions to better assess the needed steps to evolve this into a great, efficient and stable Final release.

A stable API

As Uwe Schindler - a great developer from the Lucene community - had suggested me to try, it turns out it was possible to update from Apache Lucene version 4 to 5 (a major upgrade) in a minor upgrade of Hibernate Search: many things improved, some things are slightly different, but overall the API was not affected by any significant change.

You might be affected by some minor changes of the Lucene API if you were writing power-use extensions to Hibernate Search, in that case:

Please let us know if there are difficulties in upgrading: we can try to improve on that, or at least improve the migration guide. If you don’t tell us, we might not know and many other developers will have a hard time!

Improved Performance

Several changes in the Apache Lucene code relate to improved efficiency, in particular of memory usage.

However our aim for this release was functionality and correctness; we didn’t have time to run performance tests, and to be fair even if we had the time nothing beats feedback from the community of users.

We would highly appreciate your help in trying it out and run some performance tests, especially if you can compare with previous versions of Hibernate Search (based on earlier versions of Apache Lucene).

If you think you can try this, it’s a unique opportunity to both contribute and to get our insight and support to analyse and understand your performance reports!

Sorting and Faceting: use the right types!

It appears that when using Apache Lucene 4, if you were running a Numeric Faceting Query but the target field was not Numeric, it would still work - or appear to work.

Similarly, if you Sort a query using the wrong SortType, Apache Lucene 4 appears to be forgiving.

When using Apache Lucene 5 now, and also because of some internal changes we had to do to keep your migration cost easier, the validation on the right types got much stricter.

Please be aware that you’ll get runtime exceptions like the following when mistmaching the types:

Type mismatch: ageForIntSorting was indexed with multiple values per document, use SORTED_SET instead

I’m aware those are confusing, so we’ll working on improving the errors: HSEARCH-1951; still even with those error messages, the meaning is you have to fix your mapping. No shame in that, we figured this problem out as several unit tests were too relaxed ;-)

How to get it

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central, and don’t hesitate to reach us in our forums or mailing lists.

While Hibernate ORM applies some more polish before releasing the final version, so did the Hibernate Search team.

Hibernate Search version 5.4.0.CR2 is now available! It was built and tested with Hibernate ORM 5.0.0.CR3, again we’re just waiting for that to be final but decided to release another Candidate Release as some fixes and improvements were recently applied.

The "Meridian crossing issue"

The Spatial query feature was affected by a math sign mistake which would reveal itself on distance based queries which would cross the 180' meridian. Thanks to Alan Field for the test and Nicholas Helleringer for debugging and fixing the issue!

Serialization issue on enabling Term Vectors

Kudos to Benoit Guillon, who debugged and fixed a mistake in the Avro based serializer, for which values of Term Vectors would not be indexed when enabling Term Vectors in combination with a remote backend. See also HSEARCH-1936.

Various small improvements

You can now have the MassIndexer set a different timeout for the internal transactions it will start, so if you’re running Hibernate Search in a container like WildFly you no longer have to make a choice between having a deadline of 5 minutes or changing the default timeout of the whole container.

  • Improved validation when choosing an invalid Sort definition

  • The Infinispan module will now get you a warning if your Maven project is using the old artifact ids

  • The performance integration tests have been re-enabled, and fixed to run on latest WildFly and Java8 and Java9

  • Fixed a potential issue with timezone encoding for indexed dates

  • Improved the code generating the path for index storage to handle "." elements in the configured paths

  • Upgraded to Hibernate ORM 5.0.0.CR3 and WildFly 10.0.0.Alpha6 (WildFly only used for testing)

Where to download it from

Everything you need is available on Hibernate Search’s web site. Download the full distribution from here, or get it from Maven Central, and don’t hesitate to reach us in our forums

What’s next?

Soon we’ll be releasing a preview of 5.5 to have you upgrade to Apache Lucene 5.2. The Hibernate Search version 5.4.0.Final will be released when Hibernate ORM 5 is released as stable.

Handling queries on complex types in Hibernate Search

Posted by Emmanuel Bernard    |       |    Tagged as Hibernate Search

Writing queries using complex types can be a bit surprising in Hibernate Search. For these multi-fields types, the key is to target each individual field in the query. Let’s discuss how this works.

What’s a complex type?

Hibernate Search lets you write custom types that take a Java property and create Lucene fields in a document. As long as there is a one property for one field relationship, you are good. It becomes more subtle if your custom bridge stores the property in several Lucene fields. Say an Amount type which has the numeric part and the currency part.

Let’s take a real example from a user using Infinispan's search engine - proudly served by Hibernate Search.

The FieldBridge
public class JodaTimeSplitBridge implements TwoWayFieldBridge {

    /**
     * Set year, month and day in separate fields
     */
    @Override
    public void set(String name, Object value, Document document, LuceneOptions luceneoptions) {
        DateTime datetime = (DateTime) value;
        luceneoptions.addFieldToDocument(
            name+".year", String.valueOf(datetime.getYear()), document
        );
        luceneoptions.addFieldToDocument(
            name+".month", String.format("%02d", datetime.getMonthOfYear()), document
        );
        luceneoptions.addFieldToDocument(
            name+".day", String.format("%02d", datetime.getDayOfMonth()), document
        );
    }

    @Override
    public Object get(String name, Document document) {
        IndexableField fieldyear = document.getField(name+".year");
        IndexableField fieldmonth = document.getField(name+".month");
        IndexableField fieldday = document.getField(name+".day");
        String strdate = fieldday.stringValue()+"/"+fieldmonth.stringValue()+"/"+fieldyear.stringValue();
        DateTime value = DateTime.parse(strdate, DateTimeFormat.forPattern("dd/MM/yyyy"));
        return String.valueOf(value);
    }

    @Override
    public String objectToString(Object date) {
        DateTime datetime = (DateTime) date;
        int year = datetime.getYear();
        int month = datetime.getMonthOfYear();
        int day = datetime.getDayOfMonth();
        String value = String.format("%02d",day)+"/"+String.format("%02d",month)+"/"+String.valueOf(year);
        return String.valueOf(value);
    }
}
The entity using the bridge
[...]
@Indexed
class BlogEntry {
    [...]

    @Field(store=Store.YES, index=Index.YES)
    @FieldBridge(impl=JodaTimeSplitBridge.class)
    DateTime creationdate;
}

Let’s query this field

A naive but intuitive query looks like this.

Incorrect query
QueryBuilder qb = sm.buildQueryBuilderForClass(BlogEntry.class).get();
Query q = qb.keyword().onField("creationdate").matching(new DateTime()).createQuery();
CacheQuery cq = sm.getQuery(q, BlogEntry.class);
System.out.println(cq.getResultSize());

Unfortunately that query will always return 0 result. Can you spot the problem?

It turns out that Hibernate Search does not know about these subfields creationdate.year, creationdate.month and creationdate.day. A FieldBridge is a bit of a blackbox for the Hibernate Search query DSL, so it assumes that you index the data in the field name provided by the name parameter (creationdate in this example).

We have plans in a not so future version of Hibernate Search to address that problem. It will only require you to provide a bit of metadata when you write such advanced custom field bridge. But that’s the future, so what to do now?

Use a single field

I am cheating here but as much as you can, try and keep the one property = one field mapping. Life will be much simpler to you. In this specific JodaTime type example, this is extremely easy. Use the custom bridge but instead of creating three fields (for year, month, day), keep it as a single field in the form of yyyymmdd.

Let’s again use our user real life solution.

A bridge using one field
public class JodaTimeSingleFieldBridge implements TwoWayFieldBridge {

    /**
     * Store the data in a single field in yyymmdd format
     */
    @Override
    public void set(String name, Object value, Document document, LuceneOptions luceneoptions) {
        DateTime datetime = (DateTime) value;
        luceneoptions.addFieldToDocument(
            name, datetime.toString(DateTimeFormat.forPattern("yyyyMMdd")), document
        );
    }


    @Override
    public Object get(String name, Document document) {
        IndexableField strdate = document.getField(name);
        return DateTime.parse(strdate.stringValue(), DateTimeFormat.forPattern("yyyyMMdd"));
    }

    @Override
    public String objectToString(Object date) {
        DateTime datetime = (DateTime) date;
        return datetime.toString(DateTimeFormat.forPattern("yyyyMMdd"));
    }
}

In this case, it would even be better to use a Lucene numeric format field. They are more compact and more efficient at range queries. Use luceneOptions.addNumericFieldToDocument( name, numericDate, document );.

The query above will work as expected now.

But my type must have multiple fields!

OK, OK. I won’t avoid the question. The solution is to disable the Hibernate Query DSL magic and target the fields directly.

Let’s see how to do it based on the first FieldBridge implementation.

Query targeting multiple fields
int year = datetime.getYear();
int month = datetime.getMonthOfYear();
int day = datetime.getDayOfMonth();

QueryBuilder qb = sm.buildQueryBuilderForClass(BlogEntry.class).get();
Query q = qb.bool()
    .must( qb.keyword().onField("creationdate.year").ignoreFieldBridge().ignoreAnalyzer()
                .matching(year).createQuery() )
    .must( qb.keyword().onField("creationdate.month").ignoreFieldBridge().ignoreAnalyzer()
                .matching(month).createQuery() )
    .must( qb.keyword().onField("creationdate.day").ignoreFieldBridge().ignoreAnalyzer()
                .matching(day).createQuery() )
   .createQuery();

CacheQuery cq = sm.getQuery(q, BlogEntry.class);
System.out.println(cq.getResultSize());

The key is to:

  • target directly each field,

  • disable the field bridge conversion for the query,

  • and it’s probably a good idea to disable the analyzer.

It’s a rather advanced topic and the query DSL will do the right thing most of the time. No need to panic just yet.

But in case you hit a complex type needs, it’s interesting to understand what is going on underneath.

back to top