Core: Add SortedPosDeleteWriter #1858
rdblue merged 5 commits into apache:master from openinx:sorted-pos-delete-writer
SortedPosDeleteWriter<Record> writer = new SortedPosDeleteWriter<>(appenderFactory, fileFactory, format, null, 100);
try (SortedPosDeleteWriter<Record> closeableWriter = writer) {
  for (int index = 0; index < rowSet.size(); index += 2) {
I think if we delete them in natural order, the output will be in the correct order whether or not the delete writer sorts them. Do we want to initialize the index to 4 and decrement it, so that the test actually exercises the sorting logic?
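To illustrate the concern, here is a minimal sketch (not the PR's test) showing that ascending input cannot tell a sorting writer from one that does nothing, while descending input can. The names `ReverseOrderSortCheck`, `passThrough`, and `sorting` are hypothetical stand-ins, not Iceberg APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReverseOrderSortCheck {
  // True when every position is >= the one before it.
  static boolean isAscending(List<Long> positions) {
    for (int i = 1; i < positions.size(); i++) {
      if (positions.get(i - 1) > positions.get(i)) {
        return false;
      }
    }
    return true;
  }

  // Hypothetical broken "writer": emits positions in arrival order.
  static List<Long> passThrough(List<Long> positions) {
    return new ArrayList<>(positions);
  }

  // Hypothetical correct "writer": emits positions in sorted order.
  static List<Long> sorting(List<Long> positions) {
    List<Long> sorted = new ArrayList<>(positions);
    Collections.sort(sorted);
    return sorted;
  }

  public static void main(String[] args) {
    List<Long> ascending = Arrays.asList(0L, 2L, 4L);
    List<Long> descending = Arrays.asList(4L, 2L, 0L);
    // Ascending input hides the bug in passThrough; descending input exposes it.
    System.out.println(isAscending(passThrough(ascending)));
    System.out.println(isAscending(passThrough(descending)));
    System.out.println(isAscending(sorting(descending)));
  }
}
```

With ascending input both variants pass the ordering check, so only the descending case gives the test any power.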
public void delete(CharSequence path, long pos, T row) {
  posDeletes.compute(CharSequenceWrapper.wrap(path), (k, v) -> {
We cannot use wrapper.set here because this item is put into the map as a key. If other code paths then also call wrapper.set on the same wrapper to compare CharSequences, the map's keys will be corrupted. It's safe to create a new CharSequenceWrapper here.
You are right, I forgot that we may put the key into the map too.
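The hazard discussed in this thread can be sketched with a small stand-in. `MutableKey` below is a hypothetical class that only mimics a reusable wrapper with a `set` method and content-based equality; it is not Iceberg's `CharSequenceWrapper`. Mutating an instance after it has been stored as a map key makes the original entry unreachable, while a fresh key per insert keeps lookups working.

```java
import java.util.HashMap;
import java.util.Map;

final class ReusedKeyPitfall {
  // Hypothetical mutable wrapper; equality and hash follow the wrapped content.
  static final class MutableKey {
    private CharSequence value;

    MutableKey(CharSequence value) {
      this.value = value;
    }

    MutableKey set(CharSequence newValue) {
      this.value = newValue;
      return this;
    }

    @Override
    public boolean equals(Object other) {
      return other instanceof MutableKey
          && value.toString().equals(((MutableKey) other).value.toString());
    }

    @Override
    public int hashCode() {
      return value.toString().hashCode();
    }
  }

  // Stores the shared instance as a key, then reuses it for another path.
  static boolean lookupAfterReuse() {
    Map<MutableKey, Integer> map = new HashMap<>();
    MutableKey shared = new MutableKey("file-a");
    map.put(shared, 1);
    shared.set("file-b"); // mutates the key object already held by the map
    return map.containsKey(new MutableKey("file-a"));
  }

  // Stores a fresh key; the shared instance is used only for lookups.
  static boolean lookupWithFreshKey() {
    Map<MutableKey, Integer> map = new HashMap<>();
    map.put(new MutableKey("file-a"), 1);
    MutableKey shared = new MutableKey("file-b");
    return map.containsKey(shared.set("file-a"));
  }

  public static void main(String[] args) {
    System.out.println("after reuse: " + lookupAfterReuse());
    System.out.println("fresh key:   " + lookupWithFreshKey());
  }
}
```

Reusing one wrapper for read-only lookups avoids an allocation per call, which is presumably why the pattern is attractive; it stops being safe as soon as the instance can end up stored inside the map.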
// Write all the sorted <path, pos, row> triples.
for (CharSequence path : paths) {
  List<PosValue<T>> positions = posDeletes.get(wrapper.set(path));
  positions.sort(posValueComparator);
Nit: could probably be positions.sort(Comparator.comparingLong(PosValue::pos))
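A minimal sketch of what the nit suggests, assuming `PosValue` exposes a `pos()` accessor returning `long`. The `PosValue` class below is a hypothetical stand-in written for this example, not the class from the PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ComparingLongSketch {
  // Hypothetical stand-in for the PR's PosValue: a position plus an optional row.
  static final class PosValue<T> {
    private final long pos;
    private final T row;

    PosValue(long pos, T row) {
      this.pos = pos;
      this.row = row;
    }

    long pos() {
      return pos;
    }

    T row() {
      return row;
    }
  }

  // Builds PosValue entries from the given positions and returns them sorted by position.
  static List<Long> sortedPositions(long... input) {
    List<PosValue<String>> positions = new ArrayList<>();
    for (long pos : input) {
      positions.add(new PosValue<>(pos, "val-" + pos));
    }

    // Replaces a hand-written comparator; comparingLong avoids boxing the key.
    positions.sort(Comparator.comparingLong(PosValue::pos));

    List<Long> result = new ArrayList<>();
    for (PosValue<String> value : positions) {
      result.add(value.pos());
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(sortedPositions(4, 0, 2));
  }
}
```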
@openinx, I'm planning on reviewing these PRs over the weekend. Thanks for getting all of this done!
// The 2nd file has: <100, val-100> , <101, val-101> , ... , <199, val-199>
// The 3rd file has: <200, val-200> , <201, val-201> , ... , <299, val-299>
// The 4th file has: <300, val-300> , <301, val-301> , ... , <399, val-399>
// The 5th file has: <400, val-400> , <401, val-401> , ... , <499, val-499>
Thanks, this really helps when reading the test.
Overall, looks great! I noted a few things, but I think we should be able to get this in with just a couple fixes.
Thanks for the fixes, @openinx! I merged this.
This is a separate issue (from here) to implement a writer that writes sorted pos-deletes.
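As a rough illustration of the idea, the sketch below buffers deletes per file path and emits them ordered by path and then by position. It is a simplified assumption-laden model, not the merged implementation: `SortedPosDeleteBufferSketch`, `sortedEntries`, and the use of `String` paths are all hypothetical, and row payloads, file rolling, and real file output are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SortedPosDeleteBufferSketch {
  // Buffered positions, grouped by data file path (String keys for simplicity).
  private final Map<String, List<Long>> posDeletes = new HashMap<>();

  // Records that row `pos` of the file at `path` is deleted; arrival order is arbitrary.
  void delete(String path, long pos) {
    posDeletes.computeIfAbsent(path, key -> new ArrayList<>()).add(pos);
  }

  // Returns "path:pos" entries ordered by path, then by position within each path.
  List<String> sortedEntries() {
    List<String> paths = new ArrayList<>(posDeletes.keySet());
    Collections.sort(paths);

    List<String> entries = new ArrayList<>();
    for (String path : paths) {
      List<Long> positions = new ArrayList<>(posDeletes.get(path));
      Collections.sort(positions);
      for (long pos : positions) {
        entries.add(path + ":" + pos);
      }
    }
    return entries;
  }

  public static void main(String[] args) {
    SortedPosDeleteBufferSketch buffer = new SortedPosDeleteBufferSketch();
    buffer.delete("file-b", 3);
    buffer.delete("file-a", 5);
    buffer.delete("file-a", 1);
    System.out.println(buffer.sortedEntries());
  }
}
```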