-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-8585: Index-time jump-tables for DocValues #525
Conversation
… lucene70 codec classes (temporary kludge until the correct strategy is found)
lucene/core/src/java/org/apache/lucene/codecs/lucene70/package-info.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/Lucene80DocValuesProducer.java
Outdated
Show resolved
Hide resolved
…arger changes underway)
…edDISI creation) and introduced secondary slices for jump-tables to avoid cache thrashing. The DENSE block rank structure handling is still in flux: Current implementations uses an explicit slice, but this might change depending on review discussion.
…es a slice. BaseDocValuesFormatTestCase.testSparseGCDCompression does not pass yet
… unit tests passes
lucene/test-framework/src/java/org/apache/lucene/index/BaseDocValuesFormatTestCase.java
Outdated
Show resolved
Hide resolved
…IndexedDISI was not positioned at 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tokee, I left some comments but I think it's close to being mergeable!
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/codecs/lucene80/Lucene80DocValuesProducer.java
Outdated
Show resolved
Hide resolved
lucene/core/src/test/org/apache/lucene/index/TestDocValues.java
Outdated
Show resolved
Hide resolved
lucene/test-framework/src/java/org/apache/lucene/index/BaseDocValuesFormatTestCase.java
Outdated
Show resolved
Hide resolved
lucene/test-framework/src/java/org/apache/lucene/index/BaseDocValuesFormatTestCase.java
Outdated
Show resolved
Hide resolved
I forgot to mention: precommit complains about unused imports. |
…ank-spans for DENSE blocks
…along with the slice (not very elegant)
…to provide jump-table support to empty end-blocks
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
…nd not jump-tables. TestIndexedDISI.testFewMissingDocs fails
lucene/core/src/java/org/apache/lucene/codecs/lucene80/IndexedDISI.java
Outdated
Show resolved
Hide resolved
…Powers and added unit-test for it
…on) caused a NullPointerException
…se to TestLucene80DocValuesFormat
…ble was still present even though the jump-table input was changed to start at the jump-table
…andomAccessInput reuse
|
||
// Returns the offset to the jump-table for vBPV | ||
private long writeValuesMultipleBlocks(SortedNumericDocValues values, long gcd) throws IOException { | ||
long[] offsets = new long[ArrayUtil.oversize(100, Long.BYTES)]; // 100 blocks = 1.6M values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you reduce the initial size so that testing exercises this line of code?
@@ -1216,7 +1219,7 @@ private void doTestNumericsVsStoredFields(double density, LongSupplier longs) th | |||
doc.add(dvField); | |||
|
|||
// index some docs | |||
int numDocs = atLeast(300); | |||
int numDocs = atLeast((int) (minDocs*1.172)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this 1.172 factor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that should have been explained. It is because I made the minDocs variable. Originally the test was fixed with 256 as minDocs. 256*1.172=300
, so with this trick the numDocs/minDocs
-factor stayed constant for the old tests. I have probably been overthinking this. Should I set it to something less puzzling like 1.5 or 2?
Closing pull request as this has been committed to master. |
First complete attempt of LUCENE-8585, with all jump-tables at index-time and with the not-used-anymore lucene70 codec classes moved to backward-codecs.