Skip to content

Core: Add SortedPosDeleteWriter#1858

Merged
rdblue merged 5 commits intoapache:masterfrom
openinx:sorted-pos-delete-writer
Dec 7, 2020
Merged

Core: Add SortedPosDeleteWriter#1858
rdblue merged 5 commits intoapache:masterfrom
openinx:sorted-pos-delete-writer

Conversation

@openinx
Copy link
Member

@openinx openinx commented Dec 2, 2020

This is a separate issue (from here) to implement the writer to write the sorted pos-deletes.


SortedPosDeleteWriter<Record> writer = new SortedPosDeleteWriter<>(appenderFactory, fileFactory, format, null, 100);
try (SortedPosDeleteWriter<Record> closeableWriter = writer) {
for (int index = 0; index < rowSet.size(); index += 2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we delete them in natural order, sorting them or not in delete writer will result in the correct order. Do we want to initialize the index as 4 and decrement the counter to test the sorting logic?

}

public void delete(CharSequence path, long pos, T row) {
posDeletes.compute(CharSequenceWrapper.wrap(path), (k, v) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: wrapper.set?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could not use wrapper.set here because we will put this item into map and if not then other paths also use wrapper.set to compare CharSequence then the key of map will be messed up. It's safe to create a new CharSequenceWrapper here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, forgot that we may put the key into map too.

// Write all the sorted <path, pos, row> triples.
for (CharSequence path : paths) {
List<PosValue<T>> positions = posDeletes.get(wrapper.set(path));
positions.sort(posValueComparator);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could probably be positions.sort(Comparator.comparingLong(PosValue::pos))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good.

Copy link
Contributor

@yyanyy yyanyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

}

public void delete(CharSequence path, long pos, T row) {
posDeletes.compute(CharSequenceWrapper.wrap(path), (k, v) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, forgot that we may put the key into map too.

@rdblue
Copy link
Contributor

rdblue commented Dec 5, 2020

@openinx, I'm planning on reviewing these PRs over the weekend. Thanks for getting all of this done!

// The 2th file has: <100, val-100> , <101, val-101> , ... , <199, val-199>
// The 3th file has: <200, val-200> , <201, val-201> , ... , <299, val-299>
// The 4th file has: <300, val-300> , <301, val-301> , ... , <399, val-399>
// The 5th file has: <400, val-400> , <401, val-401> , ... , <499, val-499>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this really helps when reading the test.

@rdblue
Copy link
Contributor

rdblue commented Dec 6, 2020

Overall, looks great! I noted a few things, but I think we should be able to get this in with just a couple fixes.

@rdblue rdblue merged commit df1859d into apache:master Dec 7, 2020
@rdblue
Copy link
Contributor

rdblue commented Dec 7, 2020

Thanks for the fixes, @openinx! I merged this.

pvary pushed a commit to pvary/iceberg that referenced this pull request Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants