I have written an article / tutorial over on Thinking in Seam regarding using Hibernate Search with JBoss Seam. You can read the article by following this link. Comments are, as ever, appreciated.
In Relation To Hibernate Search
It has been a long time since the last Hibernate Search release, but we have not been lazy. We are pleased to announce 3.1.0 Beta1 with tons of new features and enhancements. This release uses Lucene 2.3.x and works with Hibernate Core 3.3, Hibernate Annotations 3.4 and Hibernate EntityManager 3.4. Here is a list of some of the major new features and enhancements:
- more flexible analyzer support (see below)
- the Hibernate Search engine is no longer tied to Hibernate Core (see below)
- performance enhancements on projections (Hibernate Search is now as fast as pure Lucene)
- performance enhancements in the object loading algorithm (when multiple object types are requested)
- better memory management on large index copies
- better mass indexing approach by explicitly flushing changes to indexes via a programmatic API (deprecating the old batch_size approach)
- better resource sharing through the shared-segments reader provider strategy
- better and more transparent filter caching solution
- access to more Lucene features including term position, similarity and query explanations
- simplification of configuration (events)
- more built in bridges
Hibernate Search lets you define analyzers declaratively and decouple tokenizer and token filter usage thanks to the Solr analyzer framework. It is now very easy to index a field for phonetic, synonym, snowball (stemming) and many other kinds of matching. A small dependency bug has leaked into this Beta1 version: if you want to use @AnalyzerDef with some filters, you will need to replace apache-solr-analzers.jar with a full Solr distribution jar, which you can download at apache.org.
The core engine is now abstracted from Hibernate Core thanks to the work done by Navin, our Google Summer of Code student. Hibernate Search is now the JBoss Cache full-text search engine (more on that in a later post) and is now open to supporting alternative data stores (including other ORMs).
We will likely post new entries to zoom on some of these features.
Hibernate Search in Action already reflects most of the new features and will describe all of them in the near future.
It's been quite some time since the latest release of Hibernate Search, but since the code base has been fairly stable and bug free, we have been holding it until now. But there are some interesting features that could not wait anymore:
- transparent reindexing on all collection changes
- support of Lucene 2.3 (performance improvements and stability)
- query ResultTransformer making projections even more friendly
Finally! This one has been annoying some of you for some time now. If you use Hibernate 3.2.6, Hibernate Search will reindex the entities on collection change. Be sure to add the appropriate additional event listeners:
<event type="post-collection-recreate">
    <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
</event>
<event type="post-collection-remove">
    <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
</event>
<event type="post-collection-update">
    <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
</event>
This event listener configuration will be done transparently once Hibernate Annotations 3.3.1 is out (the code is checked in already).
Hibernate Search now runs Lucene 2.3. Hibernate Search is fully backward compatible with Lucene 2.2, but we highly recommend moving to Lucene 2.3 (included in the latest Hibernate Search distribution), as the Lucene team has made some interesting performance improvements (I know, they did it again).
Projection queries are a useful tool in some performance-critical situations, but the returned result is a List<Object[]>: needless to say, not very developer friendly.
ResultTransformer, already available in regular Hibernate Core queries to post process the results, can now be used in Hibernate Search queries.
// luceneQuery is the org.apache.lucene.search.Query built beforehand
FullTextQuery query = s.createFullTextQuery( luceneQuery, Employee.class );
query.setProjection( "id", "lastname", "department" );
// map each projected row onto an EmployeeView bean
query.setResultTransformer( new AliasToBeanResultTransformer( EmployeeView.class ) );
List<EmployeeView> results = (List<EmployeeView>) query.list();
Some additional bug fixes and enhancements have been introduced, including @IndexedEmbedded used at multiple levels, Hibernate Search filter caching that actually caches now (using the standard Lucene CachingWrapperFilter), and so on.
Max and I will be at JavaPolis next week. I don't know what Max is doing there but I will talk about Hibernate Search and JSR-303 Bean Validation, both talks on Thursday the 13th. Speaking of JSR-303, I have done a quick interview with Mark Newton on the topic: in a nutshell, it's shaping well and we hope to have a draft out in a month or so for you to review :)
Max should have some exciting news on the tooling side for Seam, JSF and Hibernate.
Speaking of Seam, Pete will be there as well for a university session on JBoss Seam.
Come by the booth, I'm sure we will have some beers for you.
The Hibernate Search team is pleased to announce version 3.0 final. Hibernate Search provides full text search (google-like) capabilities to Hibernate domain model objects. Based on Apache Lucene, Hibernate Search focuses on ease of use and ease of configuration, lowering the barrier to entry of Lucene and its integration with a domain model.
Key features include:
- Transparent index synchronization: This feature eliminates the need to manually update the index on data change. Events generated by Hibernate Core will trigger the update transparently for the application. Index updates are scoped per transaction to match the application transactional behavior.
- Seamless integration with the Hibernate and Java Persistence query model: Hibernate Search embraces both the Hibernate and Java Persistence semantic and APIs. As a result, switching from a Hibernate Query Language (HQL) query to a full text query requires only minimal changes to the application.
- Out-of-the-box asynchronous clustering mode: This mode handles clustered applications and also gracefully absorbs indexing load peaks, avoiding potential contention on online systems.
- Product extensibility: Developers can extend Hibernate Search with a series of extension points for deep index interaction customization that helps edge case applications meet their performance and architectural requirements and constraints.
Some additional noticeable features:
- query filter (similar to the Hibernate Filter feature): useful for security, temporal data, category filtering, etc., and transparently cached for the user
- join-style query: ability to query based on associated entities
- query projection: avoid database roundtrips if the relevant data is also stored in the index
- access to the result score, boost, total number of results and other Lucene metadata
- ability to manually (re)index and purge data from the index
- index sharding: sharing the same index for several classes or splitting (sharding) a given class into several indexes. It is useful for performance when the index becomes /very/ big.
- transparently optimized access to Lucene both for index update and queries
- native access to the Lucene resources
Many thanks to the community for the support and enthusiasm shown over the past year, and for helping the product mature in both feature set and stability. You can download Hibernate Search or walk through the documentation and the getting started section. Happy searching :)
Hibernate Search 3.0.0.CR1 is now out. This release is mainly the last bits of new features and polishing before the final version. The next cycle will be dedicated to bug fixes (of any bug that pops up), as well as test suite and documentation improvements.
Thanks to Hardy for the new getting started guide (this should ease the path for newcomers), and to John for hammering the last features we wanted in the GA version:
- /manual indexing/ you can disable event based indexing: useful when the
- /purge/ you can remove an entity from the index without affecting the database. This is especially useful if you take care of the indexing manually (using a timestamp method for example)
The next version should be the GA release unless some complex bugs are discovered.
Check the changelogs for a detailed change list.
Hibernate Search has a new beta out and comes with a bunch of interesting new features:
- Named filters: custom filters on query results (transparently cacheable)
- Automatic index optimization
- Access to query metadata (Score, ...)
- Support for the Java Persistence API
- Index Sharding (indexing an entity into several underlying Lucene indexes)
Based on Lucene filters, named filters provide the ability to apply custom filter restrictions to the query results. Enabled by name and parameters (very much like Hibernate Core filters), filters are cacheable to improve performance. Some noticeable use cases are security, temporal data, restriction by population, query within query results.
Hibernate Search can transparently optimize your index after a certain amount of operations (add, delete) or transactions.
The projection API has been enhanced to return query specific data like the document score (relevance) and a few other metadata.
There is now a FullTextEntityManager and FullTextQuery (extending javax.persistence.Query). No need to access entityManager.getDelegate() anymore.
In extreme cases, Lucene indexes need to be split into several physical indexes. Hibernate Search can now index a given entity to several underlying Lucene indexes.
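To make the idea of splitting one entity across several physical indexes concrete, here is a minimal, self-contained sketch. It is not Hibernate Search's sharding API: the class name ShardRouterSketch, the hash-based routing rule and the in-memory lists standing in for Lucene indexes are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only: routes entity ids to one of N in-memory "shards". */
final class ShardRouterSketch {
    // Each inner list stands in for one physical index.
    private final List<List<String>> shards = new ArrayList<>();

    ShardRouterSketch(int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be >= 1");
        }
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
    }

    /** Picks a shard from the id's hash; floorMod keeps the result non-negative. */
    int shardFor(String entityId) {
        return Math.floorMod(entityId.hashCode(), shards.size());
    }

    /** Adds the id to exactly one shard, chosen deterministically. */
    void index(String entityId) {
        shards.get(shardFor(entityId)).add(entityId);
    }

    /** A full-text query cannot know where matches live, so it visits every shard. */
    int totalDocuments() {
        int total = 0;
        for (List<String> shard : shards) {
            total += shard.size();
        }
        return total;
    }
}
```

The point of the sketch is the asymmetry: writes for a given id touch a single shard, while reads have to fan out over all of them.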
There are a few more additional features:
- Ability to index a given property in multiple different fields with different settings (without the need for a custom FieldBridge)
- Fine grained analyzers (global, per entity, per property or per field)
- Expose Lucene merge factor, max merge doc and max buffered docs
- Ships with Lucene 2.2
Thanks to John Griffin and Hardy Ferentschik for stepping up on this release. The feature set is up to what was envisioned for the final release (much more actually) and has proven very stable. We expect a short CR cycle and the GA soon after.
The most significant part, by far, of Hibernate Annotations 3.2.1 is the complete rewriting and feature expansion of Hibernate Search formerly known as Hibernate Lucene.
Hibernate Search allows you to search your domain model (google it) without the hassle and mismatches introduced by the full text technology. Indexing is done automatically, the mapping between the object model and the index documents is described through annotations, and the querying capability is integrated with the regular Hibernate querying system. Hibernate Search uses Apache Lucene underneath and lowers the barrier to entry to such a technology.
In a few words, bringing Google search capabilities to your domain model.
For most people, queries are synonymous with SQL queries, and this is indeed the case for most applications. There is a spectrum of queries, however, that are not handled by SQL (at least without proprietary extensions): free text search, proximity search, phrase search, synonyms, approximate terms, results by relevance... Full-text search engines solve these classes of problems.
Full text search involves two steps: indexing, i.e. keeping the full text index information coherent with the database information, and querying, i.e. the ability to query the indexed information in free form.
Integrating such a search capability into a system is not that easy. In most systems, a mismatch exists between the data structure used by the application core and the data structure used by the full text search. For applications using an ORM such as Hibernate, the former is mostly designed around the object model, while the latter is designed around the notion of documents containing several fields of strings. Handling this mismatch and keeping the data synchronized between both parts of the system tends to be too tedious for massive adoption.
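The indexing and querying steps can be illustrated with a tiny in-memory inverted index. This is a simplified sketch, not how Lucene is implemented: the class name TinyInvertedIndex, the lower-casing and the split on non-word characters are assumptions chosen to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: maps each token to the ids of the documents containing it. */
final class TinyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Indexing step: tokenize the text and record the document id under each token. */
    void index(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Querying step: look up a single term, ignoring case. */
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.<Integer>emptySet()
                : Collections.unmodifiableSet(hits);
    }
}
```

Note that the query returns document ids, not objects. Turning those ids back into domain objects is exactly the kind of mismatch discussed in the following paragraphs.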
Hibernate Search aims to tackle the mismatch complexity for you, and to lower the barrier to entry of full text technology such as Apache Lucene in most applications.
Hibernate Search is glue code between Hibernate and Apache Lucene. Apache Lucene is a fantastic full-text index Java library, and the de facto standard in the open source world. Hibernate Search listens to any changes made to the domain model thanks to the Hibernate event model. All modifications made to your persistent objects will be propagated to the Apache Lucene index transparently. Under the hood, Hibernate Search optimizes indexing by batching the work. The current implementation queues the work per transaction. Other pluggable implementations will be possible shortly. Finally, you can force indexing of a given set of objects, which is particularly useful when initializing the index. How indexes are organized is pretty much up to you: you can have one Directory (index) per entity type (recommended) or share the same Directory for several entities.
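Queuing index work per transaction can be pictured with the following sketch. It only mirrors the idea described above and is not Hibernate Search's implementation: the class TransactionalIndexQueueSketch and its method names are hypothetical, and plain lists stand in for the work queue and the Lucene index.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: index work is buffered and only applied when the transaction commits. */
final class TransactionalIndexQueueSketch {
    private final List<String> pending = new ArrayList<>(); // work queued in the current transaction
    private final List<String> index = new ArrayList<>();   // stands in for the Lucene index

    /** Called for each entity change; nothing reaches the index yet. */
    void onEntityChanged(String entityId) {
        pending.add(entityId);
    }

    /** On commit, the whole batch is applied to the index at once. */
    void commit() {
        index.addAll(pending);
        pending.clear();
    }

    /** On rollback, queued work is discarded so the index matches the database. */
    void rollback() {
        pending.clear();
    }

    /** Returns a copy of the ids currently in the index. */
    List<String> indexedIds() {
        return new ArrayList<>(index);
    }
}
```

Buffering until commit is what keeps a rolled-back change from ever showing up in search results.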
Indexing means translating the Java object attributes to a (potentially degraded) string representation. The bridge between properties and index fields is driven by annotation metadata and defaults to a built-in set of bridges. Flexibility is provided by the ability to use a custom bridge (very similar to the notion of /UserType/).
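As a rough sketch of what a bridge does, the example below converts an integer to a zero-padded string so that the lexicographic order of indexed terms matches numeric order. The interface StringBridgeSketch and the class PaddedIntBridgeSketch are hypothetical names for illustration, not the library's bridge contract, and the sketch assumes non-negative values only.

```java
/** Hypothetical two-way bridge contract between a property value and its indexed string. */
interface StringBridgeSketch<T> {
    String objectToString(T value);
    T stringToObject(String indexed);
}

/** Pads non-negative integers with leading zeros up to a fixed width. */
final class PaddedIntBridgeSketch implements StringBridgeSketch<Integer> {
    private final int width;

    PaddedIntBridgeSketch(int width) {
        this.width = width;
    }

    @Override
    public String objectToString(Integer value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not supported by this sketch");
        }
        String digits = Integer.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(digits).toString();
    }

    @Override
    public Integer stringToObject(String indexed) {
        // Leading zeros are accepted by Integer.valueOf, so the round trip is lossless.
        return Integer.valueOf(indexed);
    }
}
```

Without padding, the string "9" would sort after "42", which is one example of the "degraded" representation mentioned above.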
One of the mismatches is that full text queries in Apache Lucene return /Documents/ and not regular domain objects. Hibernate Search implements /org.hibernate.Query/ and gives you a unified query model regardless of the query engine (criteria, HQL, SQL, Lucene). In particular, you have access to pagination and all the query APIs like /scroll()/, /list()/, etc. All queries return managed objects (i.e. attached to a session) that you can use and modify at will like a regular Hibernate managed object (because it /is/ a regular managed object). You can decide to query on all entities or only a subset of them. Like in Hibernate, querying on a subset of entities is polymorphic.
Hibernate Search's goal is really to lower the barrier to entry of full text engine technology. Apache Lucene is often criticized by beginners for its low level API and inherent complexity. Hibernate Search makes it simple to use, removing some of the complexity, but lets you access all the power and flexibility of Apache Lucene if you need to. Hibernate Search is part of Hibernate Annotations. Check it out and download it; using it is much simpler than explaining it :-)