
Hibernate Search 4.4.0.Beta1 is ready for download! You can get it either from Maven repositories or from SourceForge.

Index Sharding

Sharding is a common practice among Apache Lucene users, and Hibernate Search has supported it for years. It means that we split the index storage into multiple Lucene indexes, while hiding the logical complexity. This is most commonly used to:

  • Keep individual index sizes reasonable: handy for backups and performance
  • Specialize individual indexes for different languages or terminologies (more on this below)
  • Separate master nodes to scale write throughput across multiple nodes
  • Meet legal requirements to store some data on physically independent media

Until now, however, you had to configure the number of shards in the Hibernate Search configuration, which required knowing in advance which shards your application would use.

Dynamic Sharding

With the new feature added in this 4.4.0.Beta1 release, you don't have to know in advance which shards you might need at runtime. For example, if you are sharding your entities by description language, simply storing an entity in a new language can trigger the creation of the new index infrastructure on the fly.

All details can be found in the reference documentation.

With the previous sharding feature, which we now call static sharding and which is deprecated, you might have been used to dealing with an array of indexes. Shards were identified by their position in the array. In the new model, shards are identified by a name: a simple String which maps to their IndexManager name.
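To make the difference concrete, here is a minimal sketch contrasting the two addressing models. The class and method names are hypothetical and not part of Hibernate Search, and the hash-modulo rule is only one plausible way a static strategy could pick an array position.

```java
import java.util.Locale;

// Hypothetical illustration only: not a Hibernate Search class.
class ShardAddressing {

    // Static model: a shard is a position in a fixed-size array,
    // so the number of shards must be known up front.
    // Assumption: the position is derived from a hash of the entity id.
    static int staticShardIndex(String entityId, int shardCount) {
        return Math.floorMod(entityId.hashCode(), shardCount);
    }

    // Dynamic model: a shard is a name derived from the data itself,
    // so a previously unseen value simply yields a new shard name.
    static String dynamicShardName(String language) {
        return language.toLowerCase(Locale.ROOT);
    }
}
```

With the name-based model nothing in the code depends on a shard count, which is what lets new shards appear without a configuration change.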

Implementors will need to create a ShardIdentifierProvider, which fulfills the following needs:

Discover existing shards at boot time

Since the shards are not defined in the configuration, you need to provide a list of known shards via some code. A new mechanism was set up to allow, for example, querying the database using a Hibernate Session during the initialization phase. See also the AnimalShardIdentifierProvider example implementation.

Discover new shards at runtime

The second operation that a ShardIdentifierProvider needs to provide is to watch for new shard identifiers and, when needed, notify the framework.

List the known shard identifiers

Finally, the ShardIdentifierProvider implementation will need to keep a record of known shard names. That requires a bit of concurrent code; hopefully the example will serve as inspiration.
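The three responsibilities above can be sketched with a small thread-safe registry. This is a simplified, hypothetical class: it does not implement the real ShardIdentifierProvider interface, its method names are assumptions made for illustration, and the list of existing shards is passed in rather than loaded from a database.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a shard identifier provider.
class LanguageShardRegistry {

    // Concurrent set: several indexing threads may register shards at once.
    private final Set<String> knownShards = ConcurrentHashMap.newKeySet();

    // Boot time: seed the registry with shard names discovered elsewhere
    // (assumption: the caller obtained them, for example from a database query).
    void initialize(Collection<String> existingShards) {
        knownShards.addAll(existingShards);
    }

    // Runtime: compute the shard name for an entity and record it if it is new.
    String shardFor(String language) {
        String shard = language.toLowerCase(Locale.ROOT);
        knownShards.add(shard); // no-op when the shard is already known
        return shard;
    }

    // Read-only view of every shard name seen so far.
    Set<String> allShards() {
        return Collections.unmodifiableSet(knownShards);
    }
}
```

Using a concurrent set keeps registration lock-free for callers; a real implementation would also need to decide how shard names are derived from the entity or document being indexed.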

Optionally, you can also make your implementation really smart by watching for your custom FullTextFilters being applied to queries, to narrow down which shards a query should be executed on. See more at Using filters in a sharded environment.
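The narrowing idea can be sketched as follows. This is a hypothetical helper, not the Hibernate Search API: active filters are modeled as a plain map from filter name to parameter value, and the filter name "language" is an assumption chosen for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: not a Hibernate Search class.
class QueryShardSelector {

    // Assumed name of the custom filter that carries the shard key.
    static final String LANGUAGE_FILTER = "language";

    // Returns the shards a query has to visit, given the filters enabled on it.
    static Set<String> shardsForQuery(Set<String> allShards,
                                      Map<String, String> activeFilters) {
        String language = activeFilters.get(LANGUAGE_FILTER);
        if (language != null && allShards.contains(language)) {
            // The filter pins the query to exactly one known shard.
            return Collections.singleton(language);
        }
        // No usable filter: conservatively search every known shard.
        return Collections.unmodifiableSet(new TreeSet<>(allShards));
    }
}
```

Falling back to all shards when the filter is absent or names an unknown shard is a conservative choice made for this sketch; returning an empty set for an unknown shard would be an equally defensible design.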

As usual the issue tracker is JIRA and all code is on GitHub: pull requests and feedback welcome.

For a detailed list of all changes in this release, see the release notes.

The next goal is to work towards a 4.4.0.Final release. If you can help us get there fast, then we'll finally branch towards the next major release and start the transformations needed to support Apache Lucene version 4.

13 comments:
 
28. Sep 2013, 14:07 CET | Link
Adrian

This is great news! Is there support for the mass indexer? We would like to be able to mass index on a per shard basis. In other words we would like to have a shard per customer and the ability to reindex per customer.

Thanks

 
29. Sep 2013, 12:24 CET | Link
Adrian wrote on Sep 28, 2013 08:07:
This is great news! Is there support for the mass indexer? We would like to be able to mass index on a per shard basis. In other words we would like to have a shard per customer and the ability to reindex per customer.

Hi Adrian, no we hadn't thought about that, but it seems like an excellent idea! Please file it on JIRA as a feature request.

 
02. Oct 2013, 12:31 CET | Link
Adrian

It should be there! I remember commenting on it previously, but maybe it was an actual JIRA ticket.

 
02. Oct 2013, 12:36 CET | Link
Adrian

Okay it was a comment related to https://hibernate.atlassian.net/browse/HSEARCH-499. Would this feature cover sharding? Both would be useful to me as hibernate search often goes out of sync with the database (I thought this was supposed to be impossible with transactions?).

 
02. Oct 2013, 13:43 CET | Link
... as hibernate search often goes out of sync with the database (I thought this was supposed to be impossible with transactions?).

Maybe not impossible, but it should be very unlikely, yes. I would be very interested to know more about how you get them out of sync. If you happen to find some clues, please start a thread on the forums so we can try thinking about it.

 
