benchmark drop for PrimaryKey


benchmark drop for PrimaryKey

Michael Sokolov-4
I happened to stumble across this chart https://home.apache.org/~mikemccand/lucenebench/PKLookup.html showing a pretty drastic drop in this benchmark on 5/13. I looked at the commits between the previous run and that one and did some investigation, trying a git bisect to find the problem using the benchmarks as a test, but it proved quite difficult due to a breaking change around MemoryCodec that also required corresponding changes in the benchmark code.

In the end, I think removing MemoryCodec is what caused the drop in perf here, based on this comment in the benchmark code:

'2011-06-26'
   Switched to MemoryCodec for the primary-key 'id' field so that lookups (either for PKLookup test or for deletions during reopen in the NRT test) are fast, with no IO.  Also switched to NRTCachingDirectory for the NRT test, so that small new segments are written only in RAM.

I don't really understand the implications here beyond benchmarks, but it does seem that some essential high-performance capability may have been lost. Is there an equivalent remaining after MemoryCodec's removal that can be used for primary keys?
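For context, the primary-key lookup being measured here is just a unique-term seek on the 'id' field followed by reading its single posting. A minimal sketch of that pattern is below (the class and method names are illustrative); it works with any postings format, since Memory or FST50 only change how the terms dictionary behind seekExact() is stored:

import java.io.IOException;
import org.apache.lucene.index.*;
import org.apache.lucene.util.BytesRef;

class PKLookupSketch {
  // Returns the global docID holding the given unique id, or -1 if it is absent.
  static int lookup(IndexReader reader, String id) throws IOException {
    for (LeafReaderContext leaf : reader.leaves()) {
      Terms terms = leaf.reader().terms("id");
      if (terms == null) {
        continue; // this segment has no 'id' field
      }
      TermsEnum termsEnum = terms.iterator();
      if (termsEnum.seekExact(new BytesRef(id))) {
        PostingsEnum postings = termsEnum.postings(null, PostingsEnum.NONE);
        return leaf.docBase + postings.nextDoc(); // a primary key has exactly one posting
      }
    }
    return -1;
  }
}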

-Mike

Re: benchmark drop for PrimaryKey

Adrien Grand

Indeed, we don't have anything that performs as well anymore, but I'm not sure this is a big deal. I suspect there were not many users of that postings format: one reason is that it had no backward-compatibility support (like any codec but the default one), and another is that it used a lot of RAM. In a number of cases we try to fold the benefits of alternative codecs into the default codec. For instance, we used to have a "pulsing" postings format that could record postings in the terms dictionary in order to save one disk seek, and we ended up folding that feature into the default postings format by enabling it only on terms that have a document frequency of 1 and index_options=DOCS_ONLY, so that it is always used with primary keys. For the Memory postings format, that didn't really make sense: the way it managed to be so much faster was by loading much more information into RAM, which we don't want to do with the default codec.


Re: benchmark drop for PrimaryKey

david.w.smiley@gmail.com
Switching to "FST50" ought to bring back much of the benefit of "Memory".

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

Re: benchmark drop for PrimaryKey

Michael Sokolov-4
OK, thanks. I guess this benchmark must be run on an index large enough that it doesn't already fit entirely in RAM anyway? When I ran it locally using the vanilla benchmark instructions, I believe the generated index was quite small (wikimedium10k). At any rate, I don't have any specific use case yet; I was just thinking about some possibilities related to primary-key lookup and came across this anomaly. Perhaps at least it deserves an annotation on the benchmark graph.

Re: benchmark drop for PrimaryKey

Michael Sokolov-4
I think the benchmarks need updating after LUCENE-8461. I got them working again by replacing lucene70 with lucene80 everywhere except for the DocValues formats, and by adding the backward-codecs.jar to the benchmarks build. I'm not sure that was really the right way to go about it, though. After that I did try switching to FST50 for this PKLookup benchmark (see the diff below), but it did not recover the lost perf.

diff --git a/src/python/nightlyBench.py b/src/python/nightlyBench.py
index b42fe84..5807e49 100644
--- a/src/python/nightlyBench.py
+++ b/src/python/nightlyBench.py
@@ -699,7 +699,7 @@ def run():
-                                  idFieldPostingsFormat='Lucene50',
+                                  idFieldPostingsFormat='FST50',



Re: benchmark drop for PrimaryKey

Adrien Grand
I don't think you need an index so large that the terms dictionary doesn't fit in the OS cache to reproduce the difference, but you might indeed need a larger index. On my end I use wikimedium10M or wikimediumall (and wikibigall if I need to test phrases) most of the time, as I get more noise with smaller indices. I added an annotation; it should be picked up the next time the benchmarks run.

I also pushed a change to take into account the fact that the default codec changed. However, I did not add backward-codecs.jar to the classpath; instead, you should rebuild the index you use for benchmarking so that it uses the Lucene80 codec instead of Lucene70.


Re: benchmark drop for PrimaryKey

Michael Sokolov-4
In fact I see a pronounced effect even with the smallish (10k) index! And I should correct my earlier statement about FST50: my earlier test was flawed. I was confused about how these benchmarks work and had updated nightlyBench.py rather than my localrun.py. After correcting that and comparing FST50 with Memory, I see that it does recover the lost perf in this benchmark; across three runs it even seems to be a consistent improvement over Memory, although these results are quite noisy, so that may not be accurate.

Maybe we ought to update nightlyBench.py to use the FST50 postings format for this test? I'm not sure what the test is trying to demonstrate, though: would that be a "fair" test? At least it would be more faithful to the original version of the chart. Also, please let me know if these benchmarking discussions belong elsewhere; I see that luceneutil is not really part of the Apache project per se, but I doubt it has its own mailing list :)
