Hibernate Search 4.3.0.Beta1 is now available both from Maven repositories and from SourceForge.

What's new?

  • Performance boosts for the NRT backend
  • Spatial API is getting nicer
  • Modules for deploying on JBoss improved (bugfixes)
  • Compatible with JBoss EAP 6.1

More details can be found on this JIRA filter.

Performance improvements for NRT users

We got a brand new performance test suite, so we started to play with it and spotted some interesting optimisation opportunities which had eluded us in previous tests. The near-real-time (NRT) backend was affected by some unnecessary lock contention, which in some scenarios could result in significant slowdowns.

So what kind of fix are we talking about? Let's see the performance results of the new tests on the latest Final release first:

Performance Report: Hibernate Search 4.2.0.Final

SUMMARY
    Name   : FileSystemNearRealTimeTestScenario

    Memory usage (total-free):
        before : 37MB
        after  : 40MB

TASKS
    10000x InsertBookTask                      | sum 25:24.769 | avg 00:00.152
    10000x UpdateBookRatingTask                | sum 25:01.950 | avg 00:00.150
    10000x UpdateBookTotalSoldTask             | sum 22:54.125 | avg 00:00.137
    10000x QueryBooksByAuthorTask              | sum 20:22.324 | avg 00:00.122
    10000x QueryBooksByAverageRatingTask       | sum 30:21.692 | avg 00:00.182
    10000x QueryBooksByBestRatingTask          | sum 39:56.530 | avg 00:00.239
    10000x QueryBooksByNewestPublishedTask     | sum 27:02.078 | avg 00:00.162
    10000x QueryBooksBySummaryTask             | sum 27:19.568 | avg 00:00.163
    10000x QueryBooksByTitleTask               | sum 27:49.037 | avg 00:00.166
    10000x QueryBooksByTotalSoldTask           | sum 26:01.403 | avg 00:00.156

TEST CONFIGURATION
    threads              : 10
    measured cycles      : 10000
    warmup cycles        : 100
    initial book count   : 1000000
    initial author count : 10000

Now let's see how much this improved.

Performance Report: Hibernate Search 4.3.0.Beta1

SUMMARY
    Name   : FileSystemNearRealTimeTestScenario

    Memory usage (total-free):
        before : 38MB
        after  : 40MB

TASKS
    10000x InsertBookTask                      | sum 04:53.440 | avg 00:00.029
    10000x UpdateBookRatingTask                | sum 04:32.154 | avg 00:00.027
    10000x UpdateBookTotalSoldTask             | sum 04:41.969 | avg 00:00.028
    10000x QueryBooksByAuthorTask              | sum 01:58.408 | avg 00:00.011
    10000x QueryBooksByAverageRatingTask       | sum 12:02.741 | avg 00:00.072
    10000x QueryBooksByBestRatingTask          | sum 12:26.415 | avg 00:00.074
    10000x QueryBooksByNewestPublishedTask     | sum 12:01.274 | avg 00:00.072
    10000x QueryBooksBySummaryTask             | sum 07:08.790 | avg 00:00.042
    10000x QueryBooksByTitleTask               | sum 02:03.112 | avg 00:00.012
    10000x QueryBooksByTotalSoldTask           | sum 11:54.997 | avg 00:00.071

[same configuration]
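
To make the two reports easier to compare, here is a small sketch that divides the 4.2.0.Final average by the 4.3.0.Beta1 average for each task. The numbers are copied from the "avg" columns above, converted to milliseconds; the class and method names are made up for this illustration and are not part of the performance test suite.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SpeedupReport {

    // Average time per task in milliseconds, copied from the reports above:
    // { 4.2.0.Final, 4.3.0.Beta1 }
    static final Map<String, int[]> AVG_MILLIS = new LinkedHashMap<>();
    static {
        AVG_MILLIS.put("InsertBookTask", new int[] {152, 29});
        AVG_MILLIS.put("UpdateBookRatingTask", new int[] {150, 27});
        AVG_MILLIS.put("UpdateBookTotalSoldTask", new int[] {137, 28});
        AVG_MILLIS.put("QueryBooksByAuthorTask", new int[] {122, 11});
        AVG_MILLIS.put("QueryBooksByAverageRatingTask", new int[] {182, 72});
        AVG_MILLIS.put("QueryBooksByBestRatingTask", new int[] {239, 74});
        AVG_MILLIS.put("QueryBooksByNewestPublishedTask", new int[] {162, 72});
        AVG_MILLIS.put("QueryBooksBySummaryTask", new int[] {163, 42});
        AVG_MILLIS.put("QueryBooksByTitleTask", new int[] {166, 12});
        AVG_MILLIS.put("QueryBooksByTotalSoldTask", new int[] {156, 71});
    }

    // Ratio of the old average to the new one; values above 1 mean "faster now".
    static double speedup(int beforeMillis, int afterMillis) {
        return (double) beforeMillis / afterMillis;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, int[]> entry : AVG_MILLIS.entrySet()) {
            int[] avg = entry.getValue();
            System.out.printf("%-32s %3d ms -> %3d ms  (%.1fx)%n",
                    entry.getKey(), avg[0], avg[1], speedup(avg[0], avg[1]));
        }
    }
}
```

The ratios differ quite a bit from task to task, so it is more honest to look at them individually than to quote a single overall figure.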

And here comes the traditional disclaimer: don't expect the exact same performance benefit to apply to your application. Other applications are very likely to benefit from this, but the scale will be different. This is why I am not sharing hardware details; they are not relevant. Suffice it to say that these tests were run in the same conditions, so they are comparable with each other.

We can't test every application out there, but as an educated guess I don't expect any case in which performance gets worse. Improvements are likely to be measurable for any application using the near-real-time IndexManager, and could be even better than these figures if you have higher contention (more threads), slower storage, or significantly larger indexes.
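
If you are not sure whether your application uses the near-real-time IndexManager, it is selected through the `indexmanager` configuration property. The sketch below collects the relevant settings in a plain `java.util.Properties` object; the filesystem directory provider and the index path are assumptions for illustration (the path is hypothetical), and the class name is made up. Please check the reference documentation for your version before copying it.

```java
import java.util.Properties;

public class NrtSettings {

    // Builds the Hibernate Search settings that select the near-real-time
    // IndexManager for all indexes (the "default" scope).
    static Properties nearRealTime() {
        Properties settings = new Properties();
        // Assumption: indexes are stored on the filesystem.
        settings.setProperty("hibernate.search.default.directory_provider", "filesystem");
        // Hypothetical location; replace with your own index directory.
        settings.setProperty("hibernate.search.default.indexBase", "/var/lucene/indexes");
        // This is the setting that switches to the NRT backend.
        settings.setProperty("hibernate.search.default.indexmanager", "near-real-time");
        return settings;
    }

    public static void main(String[] args) {
        // Print the settings in key=value form, e.g. to paste into a properties file.
        Properties settings = nearRealTime();
        for (String key : settings.stringPropertyNames()) {
            System.out.println(key + "=" + settings.getProperty(key));
        }
    }
}
```

The same keys can go into `hibernate.properties` or `persistence.xml`, or be passed to whatever bootstraps your SessionFactory or EntityManagerFactory.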

Thanks for this

For these exciting figures I would like to thank the whole Apache Lucene development team, who created the Near-Real-Time improvements in Lucene that we build on to provide this feature, and Tomas Hradec from the JBoss QA team, who created the performance tests which nailed the problem and allowed us to take the measurements needed for these improvements.

If anyone wants to contribute tests, even performance ones, we'll be glad to play with them and use them as a base for future improvements.

As usual, the issue tracker is JIRA and all code is on GitHub: pull requests and any kind of feedback welcome.

Stay tuned and test this soon, as the Final release will arrive very quickly! We're planning a CR (Candidate Release) next week.
