Red Hat

In Relation To Hibernate Search

In Relation To Hibernate Search

Hibernate Search 3.1.1 GA

Posted by    |       |    Tagged as Hibernate Search

With work on version 3.2 of Hibernate Search well underway and a range of very interesting features in the pipeline (eg programmatic configuration API, bulk indexing and dynamic boosting), we decided to provide some of the bug fixes also for the 3.1 version of Hibernate Search. Hence here is Hibernate Search 3.1.1 GA. On top of several bug fixes which are listed in the release notes we also upgraded Lucene from 2.4 to 2.4.1.

We recommend users of version 3.1 to upgrade to 3.1.1 to come into the benefits of these bug fixes.

You can download the release here.

enjoy!

I am please to announce the GA release of Hibernate Search 3.1. This release focuses on performance improvement and code robustness but also add interesting new features focused on usability:

  • An analyzer configuration model to declaratively use and compose features like phonetic approximation, n-gram approximation, search by synonyms, stop words filtering, elision correction, unaccented search and many more.
  • A lot of performance improvements at indexing time (including reduced lock contention, parallel execution).
  • A lot of performance improvements at query time (including I/O reduction both for Lucene and SQL, and better concurrency).
  • Additional new features both for indexing and querying (including support for term vector, access to scoped analyzer at query time and access to results explanation).

A more detailed overlook of the features follows

Analyzer

  • Support for declarative analyzer composition through the Solr library

Analyzers can now be declaratively composed as a tokenizer and a set of filters. This enable easy composition of the following features: phonetic approximation, n-gram approximation, search by synonyms, stop words filtering, elision correction, unaccented search and so on.

  • Support for dynamic analyzers

Allows a given entity to defined of the analyzer used at runtime. A typical use case is multi-language support where the language varies from one entity instance to an other.

Indexing

Indexing performance has been enhanced and new controls have been added

New features

  • Better control over massive manual indexing (flushToIndexes())
  • Support for term vector
  • Support for custom similarity
  • Better control over index writing (RAM consumption, non-compound file format flag, ...)
  • Better support for large index replication

Performance improvements

  • Improve contention and lock window during indexing
  • Reduce the number of index opening/closing
  • Indexing is done in parallel per directory

Querying

New useful features have been added to queries and performance has been improved.

New features

  • Expose entity-scoped and named analyzer for easy reuse at query time
  • Filter results (DocIdSet) can now be cached declaratively (default)
  • Query results Explanation is now exposed for better debugging information

Performance improvements

  • Reduces the number of database roundtrips on multi-entity searches
  • Faster Lucene queries on indexes containing a single entity type (generally the case)
  • Better performance on projected properties (no noticeable overhead compared to raw Lucene calls)
  • Reduction of I/O consumption on Lucene by reading only the necessary fields of a document (when possible)
  • Reduction of document reads (during pagination and on getResultSize() calls)
  • Faster index reopening (keeps unchanged segments opened)
  • Better index reader concurrency (use of the read only flag)

Libraries

  • Migration to Lucene 2.4 (and its performance improvements)
  • Upgrade to Hibernate Core 3.3
  • Use SLF4J as log facade

Bug fixes

  • Fix a few race conditions on multi core machines
  • Resources are properly discarded at SessionFactory.close()
  • Fix bug related to embedded entities and @Indexed (HSEARCH-142)
  • Filter instances are properly cached now
  • And more (see the change logs)

Download and all

You can download the release here. Changelogs are available on JIRA. Migrating to this version is recommended for all users (see the Migration guide).

Many thanks to everyone who contributed to the development and test of this release. Especially, many thanks to Sanne and Hardy who worked tirelessly pushing new features and enhancements and fixing bugs while John and I where finishing Hibernate Search in Action.

We have some features on the work that we could not put in 3.1, so stay tuned.

Hibernate Search 3.1 enters release candidate phase

Posted by    |       |    Tagged as Hibernate Search

Hibernate Search 3.1.0.CR1 is out. Download it here.

One of the main work was to align more closely with the new Lucene features:

  • read-only IndexReader for better concurrency at search time
  • use of DocIdSet rather than BitSet in filter implementations for greater flexibility
  • explicit use of Lucene's commit()

We have also added a few performance tricks:

  • avoid reading unnecessary fields from Lucene when possible
  • use Hibernate queries when projecting the object instance (as opposed to rely on batch size)

@DocumentId is now optional and defaults to the property marked as @Id. Scores are no longer normalized (ie no longer <= 1)

The full changelog is available here. Expect a GA in the next two weeks. Help us iron this release and provide issue reports in JIRA.

Many thanks to Hardy for literally hammering out fixes after fixes and making the release happen on time.

Hibernate Search Clustering with Terracotta

Posted by    |       |    Tagged as Hibernate Search

Hi,

I just thought that the release of Hibernate Search 3.1.0.Beta2 would be a good time to announce another clustering possibility for Hibernate Search - Terracotta clustering. Why would one use Terracotta? Well, there are several potential benefits of Terracotta clustering over the default JMS clustering currently used by Hibernate Search?

  • Updates to the index are immediately visible to all nodes in the cluster
  • You don't have the requirement of a shared file system
  • The faster RAMDirectory is used instead of the slower FSDirectory

But let's get started. You can download the code for the following example here or you can just download the binary package. At the moment the code is not yet part of the Search codebase, but probably it will at some stage.

First you will need to download and install Terracotta. I am using the 2.6.2 release. Just unpack the release into a arbitrary directory. I am using /opt/java/terracotta. Next you will the main Compass jar. You can use this jar. Place this jar into the modules directory of your terracotta installation. This solution does not rely on any Compass classes per se, but utilizes a custom RAMDirectoy implementation - org.compass.needle.terracotta.TerracottaDirectory. This is required since Lucene's RAMDirectory is not Terracotta clusterable out of the box. Let's start the terracotta server now. Switch into the bin directory of your terracotta installation and run ./start-tc-server.sh. Check the log to see whether the server started properly.

Next download and extract hsearch-demo-1.0.0-SNAPSHOT-dist.tar.gz. The dist package currently assumes that you have a mysql database running with a database hibernate and a username/password of hibernate/hibernate. You can change these settings and use a different database if you build the dist package from the source, but more to this later. The dist further assumes that you have installed Terracotta under /opt/java/terracotta. If this is not the case you can change the repository node in config/tc-config.xml. Provided that you have a running mysql database and tc-config.xml properly reflects your terracotta installation directory things should be as easy as just typing ./run.sh. The scripts will ask you whether you want to start a standalone application or a terracotta clustered one. Just press 't' to start a terracotta clustered app. You should get up a Swing JTable:

Press the index button to create an initial index. The data model is based on the former Seam sample DVD store application. Once the index is created just search for example for Tom. You should get a list of DVDs in the table. Experiment a little with the application and different queries. When you are ready start a second instance of the application by running ./run.sh again. You won't have to create the index again. In the second instance the DVDs should be searchable right away. You can also edit the title field of a DVD in one application and search for the updated title in the other. Also try closing both applications and restarting a new instance. Again DVDs should be searchable right away. The Terracotta server keeps a persistent copy of the clustered Lucene directory.

Ok, now it is time to build the application from the source. This will allow you to actually inspect the code and change things like database settings. Donwload hsearch-demo-1.0.0-SNAPSHOT-project.tar.gz and unpack the tarball. Import the maven project in your preferred IDE. To build the project you will need to define the following repositories in your settings.xml:

        <repository>
          <id>jboss</id>
          <url>http://repository.jboss.com/maven2</url>
        </repository>
        <repository>
          <id>compass-project.org</id>
          <url>http://repo.compass-project.org</url>
        </repository>

If you want to use a different database you can add/modify the profiles section in pom.xml. Also have a look at src/main/scripts/tc-config.xml and adjust any settings which differ in your setup. Once you are happy with everything just run mvn assembly:assembly to build your own version of the application.

I basically just started experimenting with this form of clustering and there are still several open questions:

  • How does it perform compared to the JMS clustering?
  • What are the limits for the RAMDirectory size?
  • How can I add failover capabilities?

I am planning to do some more extensive performance tests shortly. Stay tuned in case you are interested.

--Hardy

P.S. It would be great if someone actually tries this out and let me know if it works. As said, it's still work in progress. Any feedback is welcome :)

Hibernate Search 3.1.0.Beta2: focus on lock contention

Posted by    |       |    Tagged as Hibernate Search

Hibernate Search 3.1 beta2 is out with a significant focus on performance improvements, scalability and API clean up.

Here is the main area of work:

  • Upgrade to Lucene 2.4 which opened up a lot of optimization possibilities on the Hibernate Search side.
  • Inserts and deletes are now done in a single index opening rather than two.
  • The window of locking has been reduced a lot during writes, especially on transactions involving several entities.
  • Filter caching configuration has been simplified.
  • Expose scoped analyzer for a given class: queries can now use the same analyzer used at indexing time transparently.
  • Properly genericize the API (no more raw type used)
  • Fix a few bugs around the Solr analyzer integration and moved to Solr 1.3.
  • Fix various bugs including the long standing HSEARCH-142.

We have incorporated a lot of enhancement based on our work on the book Hibernate Search in Action and some genius performance ideas from Sanne. This version is still a beta because we still have a few optimization and enhancements in our pocket but CR1 should come out mid november-ish.

The complete list of changes can be found in jira and you can download Hibernate Search here.

Let us know what you think.

Herbstcampus Nürnberg

Posted by    |       |    Tagged as Hibernate Search

Just came back from Nürnberg where I was invited to present Hibernate Search at the Herbstcampus conference. It was the first year for this conference with the hope of making it an established yearly event. Given that already in the first year over 200 people showed up Herbstcampus might be on the right track.

While in Nürnberg I started putting together a simple Hibernate Search demo. What I wanted was just the bare bone minimum to get things running. I ended up with a simple Swing GUI with a couple of buttons and a JTable. In case you are interested check it out on the Hibernate Wiki.

On the culinary side I had the luck that this week was also the yearly Nürnberger Altstadtfest. I ended up spending a couple of hours walking around the stalls and sampling some Nürnberger Rostbratwürste and Lebkuchen. Yummi :)

Finally some pictures from Nürnberg:

--Hardy

Hibernate Search and JBoss Seam

Posted by    |       |    Tagged as Hibernate Search

I have written an article / tutorial over on Thinking in Seam regarding using Hibernate Search with JBoss Seam. You can read the article by following this link. Comments are, as ever, appreciated.

Hibernate Search 3.1.0 Beta1: .., better, faster, ...

Posted by    |       |    Tagged as Hibernate Search

It has been a long time since an Hibernate Search release but we have not been lazy. We are pleased to announce 3.1.0 Beta1 with tons of new features and enhancements. This release uses Lucene 2.3.x and works with Hibernate Core 3.3, Hibernate Annotations 3.4 and Hibernate EntityManager 3.4. Here is a list of some of the major new features and enhancements:

  • more flexible analyzer support (see below)
  • the Hibernate Search engine is no longer tied to Hibernate Core (see below)
  • performance enhancements on projections (Hibernate Search is now as fast as pure Lucene)
  • performance enhancements in the object loading algorithm (when multiple object types are requested)
  • better memory management on large index copies
  • better mass indexing approach by explicitly flushing changes to indexes via a programmatic API (deprecating the old batch_size approach)
  • better resource sharing through the shared-segments reader provider strategy
  • better and more transparent filter caching solution
  • access to more Lucene features including term position, similarity and query explanations
  • simplification of configuration (events)
  • more built in bridges

Hibernate Search let's you define analyzers declaratively and decouple tokenizer and token filters usage thanks to the Solr analyzer framework. It is now very easy to index a field for phonetic, synonym, snowball (stemming) and many more. A small dependency bug has leaked in this beta1 version. You will need to replace apache-solr-analzers.jar by a full solr distribution jar you can download at apache.org if you ant to use @AnalyzerDef on some filters.

The core engine is now abstracted form Hibernate Core thanks to the job done by Navin, our Google Summer of Code student. Hibernate Search is now the JBoss Cache full-text search engine (more on that in a later post) and is now open to support alternative data stores (including other ORMs).

We will likely post new entries to zoom on some of these features.

Hibernate Search in Action already reflects most of the new features and will describe all of them in the near future.

Many thanks to all contributors and particularly Hardy and Sanne who did a tremendous job. Go try it out here and let us know what you think on the forum.

Hibernate Search 3.0.1

Posted by    |       |    Tagged as Hibernate Search Seam

It's been quite some time since the latest release of Hibernate Search, but since the code base has been fairly stable and bug free, we have been holding it until now. But there are some interesting features that could not wait anymore:

  • transparent reindexing on all collection changes
  • support of Lucene 2.3 (performance improvements and stability)
  • query ResultTransformer making projections even more friendly

Transparent reindexing on collection change

Finally! This one has been annoying some of you for some time now. If you use Hibernate 3.2.6, Hibernate Search will reindex the entities on collection change. Be sure to add the appropriate additional event listeners

        <event type="post-collection-recreate"/>
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>
        <event type="post-collection-remove"/>
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>
        <event type="post-collection-update"/>
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>

This event listeners configuration will transparently be done when Hibernate Annotations 3.3.1 is out (the code is checked in already).

Lucene 2.3

Hibernate Search now runs Lucene 2.3. Hibernate Search is fully backward compatible with Lucene 2.2 but we highly recommend you to move to Lucene 2.3 (included in the latest Hibernate Search distribution) as some interesting performance improvements have been done by the Lucene team (I know, they did it again).

Query ResultTransformers

Projection query is a useful tool in some performance critical situations but the returned result is List<Object[]>: needless to say, not very developer friendly.

ResultTransformer, already available in regular Hibernate Core queries to post process the results, can now be used in Hibernate Search queries.

FullTextQuery query = s.createFullTextQuery( query, Employee.class );
query.setProjection("id", "lastname", "department");
query.setResultTransformer( new AliasToBeanResultTransformer(EmployeeView.class) );
List<EmployeeView> results = (List<EmployeeView>) query.list();

And a few other bug fixes

Some additional bug fixes and enhancements have been introduced, including @IndexedEmbedded used in multiple levels, Hibernate Search filter caching actually cache now with the standard Lucene CachingWrapperFilter and so on.

The release can be downloaded here, the complete changelog can be found at here.

Enjoy

Hibernate at JavaPolis

Posted by    |       |    Tagged as Bean Validation Hibernate Search

Max and I will be at JavaPolis next week. I don't know what Max is doing there but I will talk about Hibernate Search and JSR-303 Bean Validation, both talks on Thursday the 13th. Speaking of JSR-303, I have done a quick interview with Mark Newton on the topic: in a nutshell, it's shaping well and we hope to have a draft out in a month or so for you to review :)

Max should have some exciting news on the tooling side for Seam, JSF and Hibernate.

Speaking of Seam, Pete will be there as well for a university around JBoss Seam

Come by the booth, I'm sure we will have some beers for you.

back to top