Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-9527. Race condition in RocksDatabase #5947

Merged
merged 24 commits into from
Jan 10, 2024

Conversation

sumitagrawl
Copy link
Contributor

What changes were proposed in this pull request?

  1. when increment counter, also assertClose() is added to ensure db is not closed over race condition
  2. added increment/decrement counter for few missing methods
  3. refactor close to make asynchronous for closeOnError(), to simplify handling increment/decrement counter to avoid forever wait. This can happen for newly added methods as public method call another public method and closeOnError if wait for reference count, it will never be "0" and wait forever. To remove this coupling, async close is done for closeOnError case.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9527

How was this patch tested?

  • existing test cases

@sumitagrawl sumitagrawl requested a review from szetszwo January 8, 2024 16:24
@SaketaChalamchala
Copy link
Contributor

@swamirishi

@adoroszlai adoroszlai requested a review from swamirishi January 8, 2024 17:06
Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumitagrawl , I believe the problem is that there are two atomic variables. Let's remove isClosed.

  • Update close()
  @Override
  public void close() {
    // wait till all access to rocks db is process to avoid crash while close
    while (!counter.compareAndSet(0, Long.MIN_VALUE)) {
      try {
        Thread.currentThread().sleep(1);
      } catch (InterruptedException e) {
        close(columnFamilies, db, descriptors, writeOptions, dbOptions);
        Thread.currentThread().interrupt();
        return;
      }
    }

    // Wait for all background work to be cancelled first. e.g. RDB compaction
    db.get().cancelAllBackgroundWork(true);

    // Then close all attached listeners
    dbOptions.listeners().forEach(listener -> listener.close());

    if (columnFamilies != null) {
      columnFamilies.values().stream().forEach(f -> f.markClosed());
    }
    // close when counter is 0, no more operation
    close(columnFamilies, db, descriptors, writeOptions, dbOptions);
  }
  • Change assertClose() to acquire():
@@ -443,16 +434,17 @@ private boolean shouldClose(RocksDBException e) {
     }
   }
   
-  private void assertClose() throws IOException {
-    if (isClosed()) {
-      throw new IOException("Rocks Database is closed");
+  private UncheckedAutoCloseable acquire() throws IOException {
+    if (counter.getAndIncrement() < 0) {
+      counter.getAndDecrement();
+      throw new IOException("Rocks Database is closed: " + this.name);
     }
+    return counter::getAndDecrement;
   }
  • The other methods becomes:
  public void put(ColumnFamily family, ByteBuffer key, ByteBuffer value) throws IOException {
    try (UncheckedAutoCloseable ignored = acquire()) {
      counter.incrementAndGet();
      db.get().put(family.getHandle(), writeOptions, key, value);
    } catch (RocksDBException e) {
      closeOnError(e, true);
      throw toIOException(this, "put " + bytes2String(key), e);
    }
  }

@sumitagrawl
Copy link
Contributor Author

@szetszwo Thanks for review, I have updated using acquire, but still can not remove isClosed variable,

  • This is to avoid multiple close being triggered, otherwise, it will wait in while loop to check and set. Similarly flow can also wait in same loop. To avoid same, need this isClosed flag.
  • To avoid lock contention, like some close keeps waiting in while loop, and other threads keeps taking counter and release continuously.

@sumitagrawl sumitagrawl requested a review from szetszwo January 9, 2024 12:34
@szetszwo
Copy link
Contributor

szetszwo commented Jan 9, 2024

... otherwise, it will wait in while loop to check and set. ...

You are right. We should check inside the while loop.

@Override
  public void close() {
    // wait till all access to rocks db is process to avoid crash while close
    while (!counter.compareAndSet(0, Long.MIN_VALUE)) {
     if (counter.get() < 0) {
       return;
     }

      try {
        Thread.currentThread().sleep(1);
      } catch (InterruptedException e) {
        close(columnFamilies, db, descriptors, writeOptions, dbOptions);
        Thread.currentThread().interrupt();
        return;
      }
    }

    // Wait for all background work to be cancelled first. e.g. RDB compaction
    db.get().cancelAllBackgroundWork(true);

    // Then close all attached listeners
    dbOptions.listeners().forEach(listener -> listener.close());

    if (columnFamilies != null) {
      columnFamilies.values().stream().forEach(f -> f.markClosed());
    }
    // close when counter is 0, no more operation
    close(columnFamilies, db, descriptors, writeOptions, dbOptions);
  }

To avoid lock contention, ...

counter is AtomicLong. Where is the lock?

@sumitagrawl
Copy link
Contributor Author

counter is AtomicLong. Where is the lock?

@szetszwo
not lock, I mean, while loop keeps waiting (starvation) to set min value, while other threads can keep inc/dec and use (like lock contention)

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

@szetszwo szetszwo merged commit 0c2e2a6 into apache:master Jan 10, 2024
34 checks passed
adoroszlai pushed a commit to adoroszlai/ozone that referenced this pull request Jan 25, 2024
swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Jun 10, 2024
(cherry picked from commit 0c2e2a6)
Change-Id: I58cf584bef32f4a5e104c5bbe5294e5ab9577ff6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants