
In Relation To Sanne Grinovero

Berlin Buzzwords coming soon

I'll be at Berlin Buzzwords 2012 to meet the awesome community of people interested in scalable search, NoSQL and big data in the cloud.

Infinispan Lucene Directory

Of particular interest to Hibernate Search and Infinispan Query users and contributors, I've been given the opportunity to talk about the Infinispan Lucene Directory we built as an extension to the Infinispan project: the capability to store and efficiently replicate Lucene indexes in the Infinispan grid. Of course, this Directory implementation doesn't depend on Hibernate Search or Infinispan Query, and can be used to solve the problem of reliably replicating Lucene indexes in any other application using Lucene. In fact its development was initially sponsored by Sourcesense to replicate JIRA instances, and it is now evolving in the Infinispan project as a high performance alternative to the traditional Directory implementations. For more details come to my talk, or come and talk to me at any time.

The conference

It's my first time at Berlin Buzzwords, but I've heard excellent feedback about the past editions so I'm really looking forward to it: the program is full of amazing titles, with many interesting speakers and no doubt attendees to talk with.

About other JBoss people going, you might meet Mircea Markus from the Infinispan team and Lukáš Vlček our ElasticSearch expert and the man behind our new

Looking forward to meeting you all there!

A bugfix release for Hibernate Search 4.1! Some of you have been reporting reduced performance after migrating from 3.4 or 4.0, which was not expected at all as the internals got smarter with each release. It turns out there was a quite critical bug: file handle leaks.

Big thanks to Bostjan Lah for reporting this problem and providing a nice test to let me reproduce the problem, and to Jan Slezak for all the help verifying the solution. Also Michael Heinrichs reported an issue with programmatic configuration and fixed it, awesome!

What changed compared to 4.1.0.Final?

Besides the important bugfixes not much changed, as expected for a minor release: mostly documentation clarifications, and fixes for some classloader issues which would affect you only when embedding Hibernate Search in a JBoss Module. For a detailed list see the JIRA changelogs.

We strongly recommend upgrading; see also the Migration Guide.

Hibernate Search 4.1 CR3

Tagged as Hibernate Search

Another CR for Hibernate Search 4.1 is ready! Even though we are in the candidate release phase, we introduced something more than the usual minor bugfixes, as the following improvements are too nice to delay and technically not very risky.

Depending on Hibernate 4.1

Hibernate Search was updated to work with Hibernate ORM 4.1, as it was still using 4.0.

Rethinking the JGroups integration

The JGroups Channel is the communication transport used when connecting multiple nodes in a cluster using JGroups. Before 4.1.0.CR3, Hibernate Search expected you to configure a Channel for each clustered index, but configuring multiple JGroups Channels is tedious: for example, each channel should use a different set of network ports.

The Channel is now a service shared across all indexes: every index configured to use JGroups will share the same Channel instance. This simplifies configuration, network administration and speeds up initialization.

Configuration details are described in the JGroups configuration paragraph.

If you were using JGroups before, please see the Migration Guide.

JGroups channel injection

It is now possible to have Hibernate Search use an existing JGroups Channel, injecting the instance in the configuration. This was primarily introduced for other frameworks integrating our search engine, such as CapeDwarf, so they can control the Channel lifecycle and make use of alternative initialization options. Remember however: Search installs its own message Receiver, so it's not going to share the channel with other services!


org.jgroups.JChannel channel = ...//initialize or lookup the channel directly
// the map values must be Object: the injected channel is not a String
Map<String,Object> properties = new HashMap<String,Object>();
properties.put( JGroupsChannelProvider.CHANNEL_INJECT, channel );
// properties.put( ... ) for any other options
EntityManagerFactory emf = Persistence.createEntityManagerFactory( "userPU", properties );

Plans for next...

We're also working on making the master/slave selection an automatic election process, but that's too big of a change for a CR, so consider it just a teaser for the upcoming 4.2! Of course, you can help by starting to test it today if you're willing to participate in the coding and try the bleeding edge.

New paths to indexing: do we know better?

Tagged as Hibernate Search

Hibernate Search 4.1.0.Beta2 is released, and contains a very interesting improvement: it is now possible to precisely express which paths will be indexed when using @IndexedEmbedded.

Previously, when using @IndexedEmbedded, we would walk the entity graph up to the specified depth and index all the traversed branches. All paths would be indexed to the same depth, until the specified maximum depth was reached or a smaller value for depth was encountered. In a complex model it could become difficult to control what exactly would get indexed.

On the forums Zach Kurey, who was having this problem, asked me just out of curiosity why we didn't provide an explicit paths-to-be-included option. Surely, he wrote, there must be a reason. Truth be told, there was no reason: we just hadn't thought about it.

So, if you have suggestions, don't think we know better. Get in touch! Our role is to protect the quality of the code and catalyse the experience of many clever users: we need to hear from you to keep on improving.

After a long discussion about the API and implementation details, this release makes the new @IndexedEmbedded(includePaths) feature available for everyone to use. Thanks to Zach and Davide D'Alto, as after contributing to the design they also provided the patches and tests, making this brilliant idea available to everyone.

How does it work?

In the following Indexed Entity we declare that when indexing each Person we want to index the name and surname fields, and its parents as well by using the well known @IndexedEmbedded annotation:

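@Indexed
public class Person {

   public int id;

   @Field // indexed, as described above
   public String name;

   @Field // indexed, as described above
   public String surname;

   // index only the "name" field of each parent
   @IndexedEmbedded(includePaths = { "name" })
   public Set<Person> parents;

   public Person child;

   // ...other fields omitted
}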

What's new is the includePaths attribute of the annotation, which states that we don't want to recursively index all fields of the parent Person, but only its name field.

This was a very simple example; the reference documentation contains more examples and details. In short, it provides better control over which fields will be indexed, avoiding the indexing of unnecessary objects. Of course this improves overall performance.
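To make the difference between the two strategies concrete, here is a small toy model in plain Java. It is not how Hibernate Search implements the feature: the `Type` class and the `byDepth` and `byIncludePaths` methods are hypothetical names invented for this sketch, and it only computes which field paths would be selected for the Person example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IncludePathsSketch {

    // Toy model of a mapped type: simple fields plus embedded associations.
    static final class Type {
        final List<String> fields = new ArrayList<>();
        final Map<String, Type> embedded = new LinkedHashMap<>();
    }

    // Depth-based selection: every field of every branch, down to maxDepth embeddings.
    static List<String> byDepth(Type type, String prefix, int maxDepth) {
        List<String> paths = new ArrayList<>();
        for (String field : type.fields) {
            paths.add(prefix + field);
        }
        if (maxDepth > 0) {
            for (Map.Entry<String, Type> e : type.embedded.entrySet()) {
                paths.addAll(byDepth(e.getValue(), prefix + e.getKey() + ".", maxDepth - 1));
            }
        }
        return paths;
    }

    // Path-based selection: the root's own fields, plus only the listed paths of one association.
    static List<String> byIncludePaths(Type root, String association, Set<String> includePaths) {
        List<String> paths = new ArrayList<>(root.fields);
        for (String path : includePaths) {
            paths.add(association + "." + path);
        }
        return paths;
    }

    // Models the Person example: two simple fields and a self-referencing "parents" association.
    static Type person() {
        Type person = new Type();
        person.fields.addAll(Arrays.asList("name", "surname"));
        person.embedded.put("parents", person);
        return person;
    }

    public static void main(String[] args) {
        Type person = person();
        System.out.println(byDepth(person, "", 1));
        System.out.println(byIncludePaths(person, "parents", new LinkedHashSet<>(Arrays.asList("name"))));
    }
}
```

With a depth of 1 the model selects name, surname, parents.name and parents.surname; with the include path it selects only name, surname and parents.name, which is the saving described above.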

Hibernate Search 4.1.0.Beta2 awaits you!

Of course this release contains some more bugfixes and improvements; for more details check the release notes.

Hibernate Search version 4.1.0.Beta1 was tagged; the most essential change compared to January's release 4.1.0.Alpha1 was HSEARCH-1034, made to allow Infinispan Query to use the fluent Programmatic Mapping API as already available to Hibernate users.

More changes are being developed: stay tuned for new MassIndexer improvements and some new performance-improving tricks. A fierce discussion is also going on about providing a new pragmatic way to define index mappings starting from the Query use cases.

Integrations with Infinispan

The Infinispan project released a new milestone version 5.1.1.FINAL, which is relevant to Hibernate Search users in many ways:

  • Hibernate Search can use Infinispan to distribute the index among several clustered nodes.
  • JBoss AS 7.1 will use this version as the fundamental clustering technology.
  • Hibernate OGM can map JPA entities to Infinispan instead of a database, and use Hibernate Search as query engine and replicate the indexes storing them in Infinispan.
  • Infinispan Query uses the Hibernate Search Engine component to make it possible to search across the values stored in Infinispan. All you need to do is add the dependency to infinispan-query, enable indexing in the configuration and either annotate the objects you store in the grid like you would do with Hibernate Search entities, or define the mappings using the programmatic API.

More details on Infinispan Query can be found in the Infinispan reference, but if you're familiar with Hibernate Search there's not much to learn as they share most features and configuration options as defined on the Hibernate Search reference manual.

Hibernate Search 4.1 is coming

Tagged as Hibernate Search

We tagged Hibernate Search 4.1.0.Alpha1, and artifacts are now ready to be downloaded. 4.1 is meant to mainly upgrade the core dependencies and will have a quick development cycle.

Upgraded dependencies

  • Apache Lucene 3.5
  • Infinispan 5.1
  • JGroups 3.0

To use the above versions, upgrading is required as each of the mentioned projects changed some of the APIs used by Hibernate Search. Of course Hibernate Search shields you from these changes by being fully backwards compatible.

MassIndexer performance

The MassIndexer is quick again! To be honest this is not an improvement but a bugfix for a performance regression. If you noticed a performance drop in mass indexing using 4.0.0.Final, please try again with this new release and you will see a significant improvement. While working towards 4.1 final we're going to improve its features and possibly its performance even more, finally taking advantage of the new internal design provided by 4.0.

Great contributions

Guillaume Smet identified and fixed a regression for which dirty collections would not be re-indexed when having a custom FieldBridge instead of the standard @IndexedEmbedded.

Davide D'Alto improved the algorithm identifying the elements which need to be loaded and re-indexed: it's now able to avoid some unnecessary database loads in specific use cases having complex relations, consequently also reducing the index size.

The usual links

As always distribution bundles are available from Sourceforge, or you can get it via Maven artifacts. User questions are welcome on the forums, bugs and improvements can be discussed on the mailing list or posted to JIRA directly, possibly with unit tests.

Complete details of all changes are tracked on JIRA.

After Devoxx, JBug Newcastle

Tagged as Events

Infinispan team at Devoxx

Two weeks ago we were at Devoxx, where Pete Muir, Mircea Markus and I held a three-hour workshop about using Infinispan in a real world JEE application. All our notes for the presentation are available here, including the source code used for the demo and all slides.
The instructions contain both a zip of the source code and pointers to a Git repository; if you're familiar with Git, the history contains each step from the guide so you can try to follow the workshop chapter by chapter. We hope it's clear enough for anyone not familiar with Infinispan; if not, questions and suggestions for improvements are welcome.

Hibernate OGM and Search updates

At the same conference, as members of the Hibernate OGM team, we met Greg Luck of EHCache fame and started some concrete plans to support EHCache as a data store for Hibernate OGM. If anyone wants to write a custom module for OGM, please note that we now have an experimental integration layer and Infinispan is no longer a dependency: we have instead an example implementation using a HashMap, so it should be easy to integrate with any other NoSQL database. Some interest was shown around Neo4J, MongoDB and HBase integration, but we need a volunteer to start working on it... feel free to join!

In a different area, at the same conference we met Karel Maesen of Hibernate Spatial, so stay tuned for a better integration in that area; if you're interested in geolocation you might want to have a look at the draft for integration in Hibernate Search being proposed by Emmanuel and Nicholas Helleringer at HSEARCH-923.

Next week: Arquillian at JBug Newcastle

Next week I'll be in the Newcastle office introducing Arquillian and Shrinkwrap together with Paul Robinson, the lead of the Web Service Transactions project. The talk is named Testing JEE Applications in the container using Arquillian: after an introduction on the coolest testing tools we plan to run a workshop and have everyone try it out.

The workshop is scheduled for Tuesday 13th December in the University of Newcastle, and as always discussions and questions are welcome on any JBoss technology. Full details of the event can be found here.

The OpenBlend conference in Ljubljana, Slovenia will be held on the 15th of September in the fabulous setting of the Ljubljana Castle.

Since it was incredibly complex to plan my travel to get there, I'll make it worth the effort by having two talks:

  1. Introducing Hibernate OGM: porting JPA applications to NoSQL
  2. Introduction to Byteman and The Jokre

Both projects are very young, in fact I think this is going to be the first time we reveal (1) the Jokre - a very innovative optimisation engine - and Hibernate OGM is definitely a hot topic.

I also look forward to seeing the other talks of the day and to meeting team mates such as Bela Ban from JGroups and Infinispan, Adam Warski the creator of Hibernate Envers (but presenting Torquebox & CDI), Aleš Justin the Weld lead and master of the conference, and everyone else who will be there: above all, it's always nice to hear what people do or would like to do with the tools we build, and to meet more people willing to join the open source effort.

1- please don't cheat by downloading the source code yet: it's pointless, you won't understand it. If you do, please add some comments to the code.

A much requested Hibernate Search 3.4.1 released

Tagged as Hibernate Search

While our focus has been on the exciting new improvements in Hibernate Search 4, since the last stable release, 3.4.0.Final, we have received a lot of interesting feedback from the community, including bug reports and patches.

Since some contributors have asked for a bugfix release, here comes Hibernate Search 3.4.1.Final!

What's new

  • Some tricky indexing issues with @IndexedEmbedded entities in a @ManyToOne relation fixed
  • Faceting was a new feature: several bugs were fixed
  • 3.4 introduced dirty checking of collections: a bug was solved and performance was improved even more

All details are tracked on JIRA.

A sad warning

As we now mention in the documentation too, Java 7 is not yet a recommended VM to use.

The usual links

As always distribution bundles are available from Sourceforge, or you can get it via Maven artifact. Questions can be posted on the forums, bugs can be discussed with us or posted to JIRA directly, possibly with unit tests.

Many thanks to Mathieu Perez, Nikita D, Kyrill Alyoshin, Elmer van Chastelet, Guillaume Smet and Samppa Saarela for their code analysis, tests and fixes.

Hibernate Search 4 is coming

Tagged as Hibernate Search

The release cycle of Hibernate Search 4 has begun. Alpha 1 is out. We already have many things implemented so this change is consistent and more releases will come quickly.

The goals of Hibernate Search 4 are twofold:

  • Be compatible with the new Hibernate Core 4 releases.
  • Make the necessary architecture change to reach the future goals of Hibernate Search.

In particular, this means making Hibernate Search independent from Hibernate Core and allowing more scalable cloud-tainted backends.

This release already includes lots of changes

Split between API / SPI and implementation classes

Each class is now categorized as either an API, an SPI or an implementation class.

  • APIs (in regular packages) are safe to be used in your application and we try very hard to not break these contracts
  • SPIs (in .spi packages) are classes that are used by frameworks integrating with Hibernate Search (like Infinispan's search module). These contracts are pretty stable but might change more often than APIs
  • Implementations (in .impl packages) are implementation details. Don't let your application depend on these.

If you were a good citizen and already used the API only, you should not be affected. If you were using SPI or internal classes, you will have to adjust. Check our wiki page for the migration guide.

Move to JBoss Logging and error codes

JBoss Logging has some nice features including error internationalization and error codes in messages. You will be able to Google HSEARCH00043 and see why you have such a problem.

Integration with Hibernate Core 4

Nuf said.

Move to the per index backend architecture

This will give you more flexibility on how you want your entities indexed, and gives us the possibility for additional optimizations down the road. You can use different technologies for each index, for example a Lucene backend for some indexes and an Infinispan index for others which need real-time clustering. It's also now possible to configure the performance parameters of each index separately, from the async/sync option to the number of Worker threads and the queue size in the backend executor.
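As a rough illustration of per-index settings, the sketch below builds a set of properties in which one index overrides the defaults. The key pattern `hibernate.search.[default|<indexname>].worker.*` is my reading of the reference documentation and should be checked against your version; the index name `auditlog`, the class name and the `effective` helper are hypothetical, and the helper is only a toy model of "index-specific value first, then default", not the library's own lookup code.

```java
import java.util.Properties;

public class PerIndexConfigSketch {

    static Properties settings() {
        Properties p = new Properties();
        // Defaults applied to every index.
        p.setProperty("hibernate.search.default.worker.execution", "sync");
        p.setProperty("hibernate.search.default.worker.thread_pool.size", "1");
        // Overrides for one hypothetical index named "auditlog".
        p.setProperty("hibernate.search.auditlog.worker.execution", "async");
        p.setProperty("hibernate.search.auditlog.worker.thread_pool.size", "4");
        return p;
    }

    // Toy lookup: an index-specific value wins, otherwise the "default" scope applies.
    static String effective(Properties p, String indexName, String key) {
        String specific = p.getProperty("hibernate.search." + indexName + "." + key);
        return specific != null ? specific : p.getProperty("hibernate.search.default." + key);
    }

    public static void main(String[] args) {
        Properties p = settings();
        System.out.println("auditlog: " + effective(p, "auditlog", "worker.execution"));
        System.out.println("books:    " + effective(p, "books", "worker.execution"));
    }
}
```

The same map could be passed as configuration when bootstrapping Hibernate, in the way the persistence unit properties are usually supplied.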

MassIndexer is no longer an exclusive mode

The MassIndexer no longer locks out the main backend listening for Hibernate events, so it can be started while other transactions run. Until it has finished, however, some results might be missing from the index.

New binary format of communication between remote backends

Lucene no longer guarantees the Serializable contract for Documents and Fields. This is a problem when you use a clustered model for Hibernate Search.

So we have introduced a new communication protocol in the JMS and JGroups backends so they can pass along Lucene works in a safe way. This aligns with our quest to shield you from incompatible changes Lucene may make in future versions. We also want to make it easier to upgrade a cluster of Hibernate Search nodes and let them interact without issue when possible.

So now you can use NumericField in a clustered environment, which was previously not possible as it has never been Serializable.
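The general idea of not depending on Java serialization can be sketched with the JDK alone. The example below is not the protocol used by the JMS and JGroups backends: the class name, the `FORMAT_VERSION` marker and the simple name/value layout are all assumptions made for illustration. It only shows how a sender can encode the fields it cares about explicitly, so that the receiver doesn't need the original classes to implement Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExplicitWireFormatSketch {

    static final int FORMAT_VERSION = 1; // hypothetical version marker

    // Encode field name/value pairs explicitly; nothing relies on java.io.Serializable.
    static byte[] encode(Map<String, String> fields) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(FORMAT_VERSION);
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode a message, rejecting versions this node does not understand.
    static Map<String, String> decode(byte[] message) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
            int version = in.readInt();
            if (version != FORMAT_VERSION) {
                throw new IllegalArgumentException("Unsupported format version " + version);
            }
            int count = in.readInt();
            Map<String, String> fields = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                fields.put(name, in.readUTF());
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Sanne");
        fields.put("age", "42");
        System.out.println(decode(encode(fields)));
    }
}
```

Writing a version marker first is one simple way to let nodes detect messages they cannot read, which relates to the goal of upgrading a cluster node by node.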

Aggressive on performance

The backend is now quite aggressive in write performance, enabling exclusive_index_use by default and having merged some of the performance tricks from the MassIndexer back into the standard backend to take advantage of them all the time. For example the new backend design allows us to analyse Documents in multiple threads while still guaranteeing writes happen in the proper order. This is configured with the worker.thread_pool.size property, defaulting to one, and applies even to backends configured for synchronous updates.
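The pattern of doing the expensive work in parallel while still applying results in order can be sketched with a plain ExecutorService. This is a simplified illustration and not the backend's actual code: `analyse`, `indexAll` and the class name are hypothetical, and lower-casing a string stands in for real text analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OrderedWritesSketch {

    // Stand-in for the expensive, parallelisable step (text analysis).
    static String analyse(String document) {
        return document.toLowerCase();
    }

    // Analyse documents on a pool of threads, then apply the results
    // strictly in submission order on the calling thread.
    static List<String> indexAll(List<String> documents, int threadPoolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(threadPoolSize);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (final String doc : documents) {
                Callable<String> task = () -> analyse(doc);
                pending.add(pool.submit(task));
            }
            List<String> index = new ArrayList<>();
            for (Future<String> result : pending) {
                // Blocks until this document is ready, so the write order is preserved.
                index.add(result.get());
            }
            return index;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(indexAll(Arrays.asList("First", "Second", "Third"), 3));
    }
}
```

Because the futures are consumed in the order they were submitted, a slow analysis of an early document delays later writes but never reorders them; the pool size plays the role that worker.thread_pool.size plays in the text above.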

Get the release

That's all for now: check out the release and make sure to read the Migration Guide.

Many thanks to the community and particularly Davide D'Alto for his contributions and Adam Harris, Samppa Saarela and Elmer van Chastelet for their suggestions for performance and design improvements.
