Making the IcebergWriter always use 1 thread for writing to the data file #130

sundargates · 2021-12-02T01:17:23Z

Context

We have observed that the IcebergWriterStage can produce corrupted data files when the committer worker changes. A change of committer worker results in new subscriptions to the data file stream (Observable<DataFile>) produced by the writer stage. With the new subscriptions, the hot input stream is consumed from a different thread than before, and since the data structures that Parquet manages are not thread-safe, the data files end up corrupted.

As part of this change, we ensure that all the writes to the ParquetFileWriter are made from the same thread.
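To illustrate the idea of confining writes to one thread, here is a minimal sketch using only the JDK. It is not the code from this PR: `SingleThreadWriterSketch`, its `StringBuilder` "file", and the `run` helper are hypothetical stand-ins for the real writer, and a single-thread executor is just one common way to get thread confinement.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a writer whose internal state is not thread-safe. */
class SingleThreadWriterSketch {
    // Every write is handed to this one thread, no matter which thread calls write().
    private final ExecutorService writerThread = Executors.newSingleThreadExecutor();
    // Names of the threads that actually touched the unsafe state.
    private final Set<String> writingThreads = ConcurrentHashMap.newKeySet();
    // Not thread-safe on purpose; only the writer thread mutates it.
    private final StringBuilder file = new StringBuilder();
    private int lines = 0;

    void write(String record) {
        writerThread.execute(() -> {
            writingThreads.add(Thread.currentThread().getName());
            file.append(record).append('\n');
            lines++;
        });
    }

    /** Stops accepting writes and waits for the queued ones to finish. */
    void close() {
        writerThread.shutdown();
        try {
            if (!writerThread.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writer did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /**
     * Starts `callers` threads that each submit `recordsPerCaller` records.
     * Returns {number of distinct threads that wrote, number of lines written}.
     */
    static int[] run(int callers, int recordsPerCaller) {
        SingleThreadWriterSketch writer = new SingleThreadWriterSketch();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < recordsPerCaller; j++) {
                    writer.write("caller-" + id + "-record-" + j);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        writer.close();
        return new int[] {writer.writingThreads.size(), writer.lines};
    }
}
```

Calling `SingleThreadWriterSketch.run(4, 100)` submits records from four caller threads, yet only one thread ever mutates the unsafe state, which is the property this change aims for with the ParquetFileWriter.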

Checklist

./gradlew build compiles code correctly
Added new tests where applicable
./gradlew test passes all tests
Extended README or added javadocs where applicable
Added copyright headers for new files from CONTRIBUTING.md

jeffchao · 2021-12-02T01:43:57Z

@sundargates looks good so far. I'll take a closer look tomorrow. Question: Did reproducing it via resubmitting/killing the Committer stage work? I'm wondering if my hunch that we discussed offline helped.

liuml07

Looks good to me; I'm still learning the root cause and how this is being tested.

...ctor-iceberg/src/main/java/io/mantisrx/connector/iceberg/sink/writer/IcebergWriterStage.java

liuml07 · 2021-12-02T01:33:34Z

mantis-connectors/mantis-connector-iceberg/build.gradle

+        testImplementation.extendsFrom shadow
+        all {
+            resolutionStrategy {
+                force "org.apache.parquet:parquet-hadoop:${parquetVersion}"


Curious, why do we need to force the Parquet version?

sundargates · 2021-12-04

Added reasoning for why we need to depend on this Parquet version.

liuml07 · 2021-12-02T02:18:41Z

...eberg/src/test/java/io/mantisrx/connector/iceberg/sink/writer/IcebergWriterEndToEndTest.java

+    Thread.sleep(2 * size);
+    if (failure.get() != null) {
+      throw new Exception(failure.get());
+    }


Wondering if this can be in a loop so the test can fail fast.

for (int i = 0; i < 2 * size / 100; i++) {
  Thread.sleep(100);
  if (failure.get() != null) {
    LOG.error(failure.get());
    fail(....);
  }
}
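A self-contained variant of this fail-fast idea, as a sketch only: `FailFastPollSketch` and `awaitFailure` are hypothetical names, and it assumes the test records its failure in an `AtomicReference<Throwable>` (the actual type of `failure` in the test is not shown here). It polls against a deadline instead of counting iterations, so the total wait stays bounded even if individual sleeps overshoot.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class FailFastPollSketch {
    /**
     * Checks `failure` every `pollMillis` for at most `timeoutMillis`.
     * Returns the failure as soon as one is seen, or null if none
     * appeared before the deadline.
     */
    static Throwable awaitFailure(
            AtomicReference<Throwable> failure, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() - deadline < 0) {
            Throwable seen = failure.get();
            if (seen != null) {
                return seen; // fail fast: stop waiting once a failure is recorded
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // One last check in case the failure arrived during the final sleep.
        return failure.get();
    }
}
```

In a test, the result could be used as `Throwable t = FailFastPollSketch.awaitFailure(failure, 2 * size, 100); if (t != null) { throw new Exception(t); }`, which keeps the original upper bound on the wait but returns early on failure.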

sundargates requested review from calvin681, liuml07, jeffchao and Andyz26 December 2, 2021 01:17

sundargates requested review from nickmahilani and piygoyal as code owners December 2, 2021 01:17

liuml07 approved these changes Dec 2, 2021

sundargates force-pushed the sundaram/iceberg_bug branch from bfb6e8c to 10ccc68 Compare December 4, 2021 22:25

Making the IcebergWriter always use 1 thread for writing to the data file

fe07f8b

sundargates force-pushed the sundaram/iceberg_bug branch from 5cdd30f to fe07f8b Compare December 4, 2021 23:18

sundargates merged commit 667e58c into Netflix:master Dec 5, 2021

jeffchao mentioned this pull request Dec 7, 2021

Iceberg sink race condition creates corrupted data files #129

Closed
