Hibernate Search is a library that integrates Hibernate ORM with Apache Lucene or Elasticsearch by automatically indexing entities, enabling advanced search functionality: full-text, geospatial, aggregations and more. For more information, see Hibernate Search on hibernate.org.

The most significant part, by far, of Hibernate Annotations 3.2.1 is the complete rewriting and feature expansion of Hibernate Search formerly known as Hibernate Lucene.

Long story short

Hibernate Search allows you to search your domain model (google it) without the hassle and mismatches introduced by the full text technology. Indexing is done automatically, the mapping between the object model and the index documents is described through annotations, the querying capability is integrated with the regular Hibernate querying system. Hibernate Search use Apache Lucene underneath and lowers the barrier to entry to such a technology technology.

What is it all about

In a few words, bringing Google search capabilities to your domain model.

For most people, queries are synonym of SQL query, and this is indeed the case of most applications. There is a spectrum of queries, however, that are not handled by SQL (at least without proprietary extension): free text search, proximity search, phrase search, synonyms, approaching terms, result by relevance... Full-text search engines solve these classes of problems.

Full text search queries involve two steps. Indexing, ie maintain coherence between the database information and the full text index information. Querying, the ability to query in a free form the indexed information.

Integrating such a search capability to a system is not that easy. In most systems, a mismatch exists between the data structure used by the application core and the data structure used by the full text search. For applications using ORM such as Hibernate, the former is mostly designed around the object model, while the latter is designed around the notion of documents containing several fields of strings. Handling this mismatch and maintaining the data synchronized between both part of the system tend to be too tedious for a massive adoption.

Hibernate Search aims to tackle the mismatch complexity for you, and to lower the barrier to entry of full text technology such as Apache Lucene in most applications.

Indexing

Hibernate Search is a glue code between Hibernate and Apache Lucene. Apache Lucene is a fantastic full-text index Java library, and the de facto standard in the open source world. Hibernate Search listen to any changes made to the domain model thanks to the Hibernate event model. All modifications made to your persistent objects will be propagated to the Apache Lucene index transparently. Under the hood, Hibernate Search is optimizing indexing by batching the works. The current implementation queue the work per transaction. Other pluggable implementations will be possible shortly. Finally, you can force indexing of a given set of objects, which is particularly useful when initializing the index. How indexes are organized is pretty much up to you, you can have one Directory (index) per entity type (recommended) or share the same Directory for several entities.

Metadata

Indexing means translating the java object attributes to a (potentially degraded) string representation. The bridge between properties and index fields is driven by annotations metadata and defaulted to a built-in set of bridges. Flexibility is provided by the ability to use a custom bridge (very similar to the notion of /UserType/).

Searching

One of the mismatch is that full text queries in Apache Lucene returns /Documents/ and not regular domain objects. Hibernate Search implements /org.hibernate.Query/ and gives you a unified query model regardless of the query engine (criteria, HQL, SQL, Lucene). In particular, you have access to pagination and all the query APIs like /scroll()/, /list()/ etc... All queries will return managed objects (ie attached to a session), that you will be able to use and modify at will like a regular hibernate managed object (because it /is/ a regular managed object). You can decide to query on all entities or only a subset of them. Like in Hibernate, querying on a subset of entities is polymorphic.

Try it out

Hibernate Search goal is really to lower the barrier to entry of full text engine technology. Apache Lucene is often criticized by beginners for its low level API and inherent complexity. Hibernate Search make it simple to use, removing some of the complexity, but let you access all the power and flexibility of Apache Lucene if you need to. Hibernate Search is part of Hibernate Annotations. Check it out and download it , using it is much simpler than explaining it :-)


Back to top