In Relation To Hibernate Search

Hibernate Search Clustering with Terracotta

Tagged as Hibernate Search

Hi,

I just thought that the release of Hibernate Search 3.1.0.Beta2 would be a good time to announce another clustering possibility for Hibernate Search - Terracotta clustering. Why would one use Terracotta? Well, there are several potential benefits of Terracotta clustering over the default JMS clustering currently used by Hibernate Search:

  • Updates to the index are immediately visible to all nodes in the cluster
  • You don't have the requirement of a shared file system
  • The faster RAMDirectory is used instead of the slower FSDirectory

But let's get started. You can download the code for the following example here or you can just download the binary package. At the moment the code is not yet part of the Search codebase, but it probably will be at some stage.

First you will need to download and install Terracotta. I am using the 2.6.2 release. Just unpack the release into an arbitrary directory. I am using /opt/java/terracotta. Next you will need the main Compass jar. You can use this jar. Place this jar into the modules directory of your Terracotta installation. This solution does not rely on any Compass classes per se, but utilizes a custom RAMDirectory implementation - org.compass.needle.terracotta.TerracottaDirectory. This is required since Lucene's RAMDirectory is not Terracotta clusterable out of the box. Let's start the Terracotta server now. Switch into the bin directory of your Terracotta installation and run ./start-tc-server.sh. Check the log to see whether the server started properly.

Next download and extract hsearch-demo-1.0.0-SNAPSHOT-dist.tar.gz. The dist package currently assumes that you have a MySQL database running with a database named hibernate and a username/password of hibernate/hibernate. You can change these settings and use a different database if you build the dist package from the source, but more on this later. The dist further assumes that you have installed Terracotta under /opt/java/terracotta. If this is not the case you can change the repository node in config/tc-config.xml. Provided that you have a running MySQL database and tc-config.xml properly reflects your Terracotta installation directory, things should be as easy as typing ./run.sh. The script will ask you whether you want to start a standalone application or a Terracotta clustered one. Just press 't' to start a Terracotta clustered app. A Swing JTable should come up:

Press the index button to create an initial index. The data model is based on the former Seam sample DVD store application. Once the index is created, search for Tom, for example. You should get a list of DVDs in the table. Experiment a little with the application and different queries. When you are ready, start a second instance of the application by running ./run.sh again. You won't have to create the index again. In the second instance the DVDs should be searchable right away. You can also edit the title field of a DVD in one application and search for the updated title in the other. Also try closing both applications and starting a new instance. Again the DVDs should be searchable right away. The Terracotta server keeps a persistent copy of the clustered Lucene directory.

Ok, now it is time to build the application from the source. This will allow you to actually inspect the code and change things like database settings. Download hsearch-demo-1.0.0-SNAPSHOT-project.tar.gz and unpack the tarball. Import the Maven project in your preferred IDE. To build the project you will need to define the following repositories in your settings.xml:

        <repository>
          <id>jboss</id>
          <url>http://repository.jboss.com/maven2</url>
        </repository>
        <repository>
          <id>compass-project.org</id>
          <url>http://repo.compass-project.org</url>
        </repository>

If you want to use a different database you can add/modify the profiles section in pom.xml. Also have a look at src/main/scripts/tc-config.xml and adjust any settings which differ in your setup. Once you are happy with everything just run mvn assembly:assembly to build your own version of the application.

I basically just started experimenting with this form of clustering and there are still several open questions:

  • How does it perform compared to the JMS clustering?
  • What are the limits for the RAMDirectory size?
  • How can I add failover capabilities?

I am planning to do some more extensive performance tests shortly. Stay tuned in case you are interested.

--Hardy

P.S. It would be great if someone actually tries this out and lets me know if it works. As I said, it's still a work in progress. Any feedback is welcome :)

Hibernate Search 3.1.0.Beta2: focus on lock contention

Tagged as Hibernate Search

Hibernate Search 3.1 beta2 is out with a significant focus on performance improvements, scalability and API clean up.

Here are the main areas of work:

  • Upgrade to Lucene 2.4 which opened up a lot of optimization possibilities on the Hibernate Search side.
  • Inserts and deletes are now done in a single index opening rather than two.
  • The window of locking has been reduced a lot during writes, especially on transactions involving several entities.
  • Filter caching configuration has been simplified.
  • Expose scoped analyzer for a given class: queries can now use the same analyzer used at indexing time transparently.
  • Properly genericize the API (no more raw types used)
  • Fix a few bugs around the Solr analyzer integration and move to Solr 1.3.
  • Fix various bugs including the long standing HSEARCH-142.

We have incorporated a lot of enhancements based on our work on the book Hibernate Search in Action and some genius performance ideas from Sanne. This version is still a beta because we still have a few optimizations and enhancements in our pocket, but CR1 should come out mid-November-ish.

The complete list of changes can be found in jira and you can download Hibernate Search here.

Let us know what you think.

Herbstcampus Nürnberg

Tagged as Hibernate Search

Just came back from Nürnberg where I was invited to present Hibernate Search at the Herbstcampus conference. It was the first year for this conference, with the hope of making it an established yearly event. Given that over 200 people showed up already in the first year, Herbstcampus might be on the right track.

While in Nürnberg I started putting together a simple Hibernate Search demo. What I wanted was just the bare-bones minimum to get things running. I ended up with a simple Swing GUI with a couple of buttons and a JTable. In case you are interested, check it out on the Hibernate Wiki.

On the culinary side I had the luck that this week was also the yearly Nürnberger Altstadtfest. I ended up spending a couple of hours walking around the stalls and sampling some Nürnberger Rostbratwürste and Lebkuchen. Yummi :)

Finally some pictures from Nürnberg:

--Hardy

Hibernate Search and JBoss Seam

Tagged as Hibernate Search

I have written an article / tutorial over on Thinking in Seam regarding using Hibernate Search with JBoss Seam. You can read the article by following this link. Comments are, as ever, appreciated.

Hibernate Search 3.1.0 Beta1: .., better, faster, ...

Tagged as Hibernate Search

It has been a long time since the last Hibernate Search release but we have not been lazy. We are pleased to announce 3.1.0 Beta1 with tons of new features and enhancements. This release uses Lucene 2.3.x and works with Hibernate Core 3.3, Hibernate Annotations 3.4 and Hibernate EntityManager 3.4. Here is a list of some of the major new features and enhancements:

  • more flexible analyzer support (see below)
  • the Hibernate Search engine is no longer tied to Hibernate Core (see below)
  • performance enhancements on projections (Hibernate Search is now as fast as pure Lucene)
  • performance enhancements in the object loading algorithm (when multiple object types are requested)
  • better memory management on large index copies
  • better mass indexing approach by explicitly flushing changes to indexes via a programmatic API (deprecating the old batch_size approach)
  • better resource sharing through the shared-segments reader provider strategy
  • better and more transparent filter caching solution
  • access to more Lucene features including term position, similarity and query explanations
  • simplification of configuration (events)
  • more built in bridges

Hibernate Search lets you define analyzers declaratively and decouple tokenizer and token filter usage thanks to the Solr analyzer framework. It is now very easy to index a field for phonetic, synonym, snowball (stemming) analysis and many more. A small dependency bug has leaked into this beta1 version. You will need to replace apache-solr-analzers.jar with a full Solr distribution jar, which you can download from apache.org, if you want to use @AnalyzerDef on some filters.

The core engine is now abstracted from Hibernate Core thanks to the job done by Navin, our Google Summer of Code student. Hibernate Search is now the JBoss Cache full-text search engine (more on that in a later post) and is now open to support alternative data stores (including other ORMs).

We will likely post new entries to zoom in on some of these features.

Hibernate Search in Action already reflects most of the new features and will describe all of them in the near future.

Many thanks to all contributors and particularly Hardy and Sanne who did a tremendous job. Go try it out here and let us know what you think on the forum.

Hibernate Search 3.0.1

Tagged as Hibernate Search Seam

It's been quite some time since the latest release of Hibernate Search, but since the code base has been fairly stable and bug free, we have been holding it until now. But there are some interesting features that could not wait anymore:

  • transparent reindexing on all collection changes
  • support of Lucene 2.3 (performance improvements and stability)
  • query ResultTransformer making projections even more friendly

Transparent reindexing on collection change

Finally! This one has been annoying some of you for some time now. If you use Hibernate 3.2.6, Hibernate Search will reindex the entities on collection change. Be sure to add the appropriate additional event listeners:

        <event type="post-collection-recreate">
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>
        <event type="post-collection-remove">
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>
        <event type="post-collection-update">
            <listener class="org.hibernate.search.event.FullTextIndexCollectionEventListener"/>
        </event>

This event listener configuration will be done transparently once Hibernate Annotations 3.3.1 is out (the code is checked in already).

Lucene 2.3

Hibernate Search now runs Lucene 2.3. Hibernate Search is fully backward compatible with Lucene 2.2 but we highly recommend moving to Lucene 2.3 (included in the latest Hibernate Search distribution) as the Lucene team has made some interesting performance improvements (I know, they did it again).

Query ResultTransformers

Projection queries are a useful tool in some performance-critical situations, but the returned result is List<Object[]>: needless to say, not very developer friendly.

ResultTransformer, already available in regular Hibernate Core queries to post process the results, can now be used in Hibernate Search queries.

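// luceneQuery is an org.apache.lucene.search.Query built beforehand
FullTextQuery query = s.createFullTextQuery( luceneQuery, Employee.class );
query.setProjection( "id", "lastname", "department" );
// map each projected Object[] onto an EmployeeView bean, matching aliases to properties
query.setResultTransformer( new AliasToBeanResultTransformer( EmployeeView.class ) );
List<EmployeeView> results = (List<EmployeeView>) query.list();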
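
To make the idea behind an alias-to-bean transformation more concrete, here is a minimal plain-Java sketch of what such a transformer conceptually does with each projected row. It is not Hibernate's implementation: the class AliasToBeanSketch, its methods and the field layout of EmployeeView are assumptions made up for illustration (the post does not show EmployeeView), and the sample row is invented.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class AliasToBeanSketch {

    // Hypothetical view class; its fields mirror the projected aliases.
    public static class EmployeeView {
        public Long id;
        public String lastname;
        public String department;
    }

    // Copies each projected value into the field named by the alias at the same position.
    public static <T> T toBean(Class<T> type, String[] aliases, Object[] row) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (int i = 0; i < aliases.length; i++) {
                Field field = type.getDeclaredField(aliases[i]);
                field.setAccessible(true);
                field.set(bean, row[i]);
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map row onto " + type.getName(), e);
        }
    }

    // Applies toBean to every row of a projection result.
    public static <T> List<T> transform(Class<T> type, String[] aliases, List<Object[]> rows) {
        List<T> beans = new ArrayList<>();
        for (Object[] row : rows) {
            beans.add(toBean(type, aliases, row));
        }
        return beans;
    }

    public static void main(String[] args) {
        String[] aliases = {"id", "lastname", "department"};
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1L, "Smith", "Accounting"}); // invented sample row
        List<EmployeeView> views = transform(EmployeeView.class, aliases, rows);
        System.out.println(views.get(0).lastname + " / " + views.get(0).department);
    }
}
```

The point of the design is that the calling code works with typed EmployeeView objects instead of indexing into Object[] by position.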

And a few other bug fixes

Some additional bug fixes and enhancements have been introduced, including @IndexedEmbedded used at multiple levels, Hibernate Search filter caching that now actually caches using the standard Lucene CachingWrapperFilter, and so on.

The release can be downloaded here, the complete changelog can be found here.

Enjoy

Hibernate at JavaPolis

Tagged as Bean Validation Hibernate Search

Max and I will be at JavaPolis next week. I don't know what Max is doing there but I will talk about Hibernate Search and JSR-303 Bean Validation, both talks on Thursday the 13th. Speaking of JSR-303, I have done a quick interview with Mark Newton on the topic: in a nutshell, it's shaping up well and we hope to have a draft out in a month or so for you to review :)

Max should have some exciting news on the tooling side for Seam, JSF and Hibernate.

Speaking of Seam, Pete will be there as well for a university session on JBoss Seam.

Come by the booth, I'm sure we will have some beers for you.

Full Text search for Hibernate goes final

Tagged as Hibernate Search

The Hibernate Search team is pleased to announce version 3.0 final. Hibernate Search provides full text search (google-like) capabilities to Hibernate domain model objects. Based on Apache Lucene, Hibernate Search focuses on ease of use and ease of configuration, lowering the barrier to entry of Lucene and its integration with a domain model.

Key features include:

  • Transparent index synchronization: This feature eliminates the need to manually update the index on data change. Events generated by Hibernate Core will trigger the update transparently for the application. Index updates are scoped per transaction to match the application transactional behavior.
  • Seamless integration with the Hibernate and Java Persistence query model: Hibernate Search embraces both the Hibernate and Java Persistence semantic and APIs. As a result, switching from a Hibernate Query Language (HQL) query to a full text query requires only minimal changes to the application.
  • Out-of-the-box asynchronous clustering mode: Handles clustered applications. This out-of-the-box mode also gracefully handles indexing load peaks, avoiding potential contention on online systems.
  • Product extensibility: Developers can extend Hibernate Search with a series of extension points for deep index interaction customization that helps edge case applications meet their performance and architectural requirements and constraints.

Some additional noticeable features:

  • query filter (similar to the Hibernate Filter feature): useful for security, temporal data, category filtering, etc., and transparently cached for the user
  • join-style query: ability to query based on associated entities
  • query projection: avoid database roundtrips if the relevant data is also stored in the index
  • access to the result score, boost, total number of results and other Lucene metadata
  • ability to manually (re)index and purge data from the index
  • index sharding: sharing the same index for several classes or splitting (sharding) a given class into several indexes. It is useful for performance when the index becomes /very/ big.
  • transparently optimized access to Lucene both for index update and queries
  • native access to the Lucene resources

Many thanks to the community for the support and enthusiasm shown over the past year, and for helping the product mature in both feature set and stability. You can download Hibernate Search or walk through the documentation and the getting started section. Happy searching :)

Release Candidate for Hibernate Search 3.0.0

Tagged as Hibernate Search

Hibernate Search 3.0.0.CR1 is now out. This release is mainly the last bits of new features and polishing before the final version. The next cycle will be dedicated to bug fixes (of any bug that pops up), as well as test suite and documentation improvements.

Thanks to Hardy for the new getting started guide (this should ease the path for newcomers), and to John for hammering the last features we wanted in the GA version:

  • /manual indexing/ you can disable event based indexing: useful when the
  • /purge/ you can remove an entity from the index without affecting the database. This is especially useful if you take care of the indexing manually (using a timestamp method for example)

The next version should be the GA release unless some complex bugs are discovered.

Check the changelogs for a detailed change list.

Hibernate Search 3.0 Beta 4: new features bandwagon

Tagged as Hibernate Search

Hibernate Search has a new beta out and comes with a bunch of interesting new features:

  • Named filters: custom filters on query results (transparently cacheable)
  • Automatic index optimization
  • Access to query metadata (Score, ...)
  • Support for the Java Persistence API
  • Index Sharding (indexing an entity into several underlying Lucene indexes)

Named filters

Based on Lucene filters, named filters provide the ability to apply custom filter restrictions to the query results. Enabled by name and parameters (very much like Hibernate Core filters), filters are cacheable to improve performance. Some noticeable use cases are security, temporal data, restriction by population, query within query results.

Automatic index optimization

Hibernate Search can transparently optimize your index after a certain number of operations (add, delete) or transactions.
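
As a rough sketch of how these thresholds might be configured: the two property keys below are the optimizer settings as I recall them from the Hibernate Search reference documentation, so please verify them against the version you use. The class OptimizerSettingsSketch, its method and the limit values are made up for illustration.

```java
import java.util.Properties;

public class OptimizerSettingsSketch {

    // Property keys as recalled from the reference documentation (verify for your version).
    static final String OPERATION_LIMIT = "hibernate.search.default.optimizer.operation_limit.max";
    static final String TRANSACTION_LIMIT = "hibernate.search.default.optimizer.transaction_limit.max";

    // Builds the settings that ask for an optimize after the given number
    // of index operations or transactions, whichever is reached first.
    public static Properties optimizerSettings(int maxOperations, int maxTransactions) {
        Properties props = new Properties();
        props.setProperty(OPERATION_LIMIT, String.valueOf(maxOperations));
        props.setProperty(TRANSACTION_LIMIT, String.valueOf(maxTransactions));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only; tune them for your own index.
        Properties props = optimizerSettings(1000, 100);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

The same key/value pairs could equally be placed in hibernate.properties or in the properties section of your Hibernate configuration instead of being built in code.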

Query result metadata

The projection API has been enhanced to return query-specific data like the document score (relevance) and a few other pieces of metadata.

Support for the Java Persistence API

There is now a FullTextEntityManager and FullTextQuery (extending javax.persistence.Query). No need to access entityManager.getDelegate() anymore.

Index sharding

In extreme cases, Lucene indexes need to be split into several physical indexes. Hibernate Search can now index a given entity to several underlying Lucene indexes.
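
To illustrate the general idea of routing one entity type to several indexes, here is a minimal plain-Java sketch of hash-based shard selection. It is only one plausible routing scheme and not a copy of Hibernate Search's own sharding code; the class ShardSelectorSketch, the method shardFor and the shard count are assumptions made for the example.

```java
public class ShardSelectorSketch {

    // Picks a shard for an entity id by hashing the id's string form.
    public static int shardFor(String id, int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        // floorMod keeps the result in [0, numberOfShards) even for negative hash codes
        return Math.floorMod(id.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 3; // illustrative shard count
        for (String id : new String[] {"1", "2", "42"}) {
            System.out.println("entity id " + id + " -> index shard " + shardFor(id, shards));
        }
    }
}
```

Because the shard is derived only from the id, the same entity always lands in the same underlying index, so an update or delete can be sent to exactly one shard, while a query has to consult all of them.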

And a few more

There are a few more additional features:

  • Ability to index a given property in multiple different fields with different settings (without the need for a custom FieldBridge)
  • Fine grained analyzers (global, per entity, per property or per field)
  • Expose Lucene merge factor, max merge doc and max buffered docs
  • Ships with Lucene 2.2

Thanks to John Griffin and Hardy Ferentschik for stepping up on this release. The feature set is up to what was envisioned for the final release (much more actually) and has proven very stable. We expect a short CR cycle and the GA soon after.

Check out here for more info. The download page is here.
