
Hibernate Search 4.2.0.Beta2, the latest beta, is available!

In this iteration we introduce Apache Tika integration, Spatial Queries can now sort by distance, and as usual there is a list of less noticeable improvements.

Apache Tika integration

Apache Tika lets you extract and index text from many kinds of documents, such as MP3 metadata, PDF text, and office files. Annotate a Blob field if the media files are loaded from a database, or annotate a String field whose value points to a resource or file path.

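import java.sql.Blob;

import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.TikaBridge;

@Entity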
@Indexed
public class Book {

        Integer id;
        Blob content;

        @Id @GeneratedValue
        public Integer getId() {
                return id;
        }

        public void setId(Integer id) {
                this.id = id;
        }

        @Lob @Basic(fetch = FetchType.LAZY)
        @Field @TikaBridge // <- the TikaBridge adapts the Blob so that its extracted text can be indexed
        public Blob getContent() {
                return content;
        }

        public void setContent(Blob content) {
                this.content = content;
        }
}

The @TikaBridge annotation supports more options to tune the kind of text extraction; refer to the documentation for more details. Consider this feature experimental for now: we have not yet added an option to make text extraction asynchronous, so we might need to change the API to introduce one.

Spatial Queries sorted by distance

Thanks to all of Nicolas Helleringer's work, it's now easy to:

  • Return the distance from the search center to each hit (via a projection)
  • Apply a sort criterion on the distance

Let's see an example from our large collection of self-documenting examples (the test suite!):

QueryBuilder builder = em.getSearchFactory().buildQueryBuilder().forEntity( Cafe.class ).get();

org.apache.lucene.search.Query luceneQuery = builder.spatial()
    .onCoordinates( "location" )
    .within( 100, Unit.KM )
        .ofLatitude( centerLatitude )
        .andLongitude( centerLongitude )
    .createQuery();

FullTextQuery hibQuery = em.createFullTextQuery( luceneQuery, Cafe.class );

Sort distanceSort = new Sort( new DistanceSortField( centerLatitude, centerLongitude, "location" ) );

hibQuery.setSort( distanceSort );

hibQuery.setProjection( FullTextQuery.THIS, FullTextQuery.SPATIAL_DISTANCE );

hibQuery.setSpatialParameters( centerLatitude, centerLongitude, "location" );

List results = hibQuery.getResultList();
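
To make the shape of those results more concrete, here is a plain-JDK sketch of what a distance projection combined with a distance sort conceptually yields: one Object[] row per hit, holding the entity followed by its distance from the search center, nearest first. This is only an illustration, not the way Hibernate Search computes anything internally. The Cafe stand-in class, the haversineKm and nearest helpers, the haversine formula itself, and the 6371 km Earth radius are all assumptions of this sketch, and the distances the library returns may differ slightly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-JDK illustration only; no Hibernate Search classes are involved.
final class DistanceSortSketch {

    // Mean Earth radius in kilometres (an assumption of this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Hypothetical stand-in for the indexed Cafe entity.
    static final class Cafe {
        final String name;
        final double latitude;
        final double longitude;

        Cafe(String name, double latitude, double longitude) {
            this.name = name;
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    // Great-circle distance in kilometres, using the haversine formula.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Builds rows shaped like { entity, distance }, keeps only the hits
    // within radiusKm of the center, and orders them nearest first.
    static List<Object[]> nearest(List<Cafe> cafes, double centerLat,
                                  double centerLon, double radiusKm) {
        List<Object[]> rows = new ArrayList<>();
        for (Cafe cafe : cafes) {
            double distance = haversineKm(centerLat, centerLon,
                    cafe.latitude, cafe.longitude);
            if (distance <= radiusKm) {
                rows.add(new Object[] { cafe, distance });
            }
        }
        rows.sort(Comparator.comparingDouble((Object[] row) -> (Double) row[1]));
        return rows;
    }

    public static void main(String[] args) {
        List<Cafe> cafes = new ArrayList<>();
        cafes.add(new Cafe("Far", 0.0, 2.0));
        cafes.add(new Cafe("Middle", 0.0, 0.5));
        cafes.add(new Cafe("Near", 0.0, 0.1));

        // Each row is read the same way as a projected query result:
        // element 0 is the entity, element 1 is the distance.
        for (Object[] row : nearest(cafes, 0.0, 0.0, 100.0)) {
            Cafe cafe = (Cafe) row[0];
            Double distanceKm = (Double) row[1];
            System.out.println(cafe.name + ": " + distanceKm + " km");
        }
    }
}
```

With a projection in place, each element of the result list is read by casting it to Object[] and picking the entries in the order they were requested, as the loop in main does for the simulated rows.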

Several more reasons to upgrade

  • Apache Lucene upgraded to version 3.6.1
  • JMS and JMX integrations improved
  • The MassIndexer now correctly applies EntityIndexingInterceptor
  • Lower memory usage
  • Spatial Queries improved
  • Improved some classloaders for better integration with other libraries

The complete list of changes can be found here. Check the Migration Guide.

It has been a while since 4.2.0.Beta1, but the summer is over, so try this release soon: we'll move to Final shortly! As always, feedback is very welcome.
