HDDS-10184. Fix ManagedStatistics not closed properly #6055

Merged
merged 3 commits into from
Jan 25, 2024

whbing
Contributor

@whbing whbing commented Jan 22, 2024

What changes were proposed in this pull request?

When the following config is set:

   <property>
      <name>ozone.metastore.rocksdb.statistics</name>
      <value>ALL</value>
   </property> 

ManagedStatistics is not closed properly in the OM and DN:

2024-01-22 18:34:22,137 [LeakDetector-ManagedRocksObject0] WARN org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils: ManagedStatistics is not closed properly
StackTrace for unclosed instance: org.apache.hadoop.hdds.utils.db.managed.ManagedStatistics.<init>(ManagedStatistics.java:30)
org.apache.hadoop.hdds.utils.db.DBStoreBuilder.getDefaultDBOptions(DBStoreBuilder.java:418)
org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:209)
org.apache.hadoop.ozone.om.OmMetadataManagerImpl.loadDB(OmMetadataManagerImpl.java:607)
org.apache.hadoop.ozone.om.OmMetadataManagerImpl.loadDB(OmMetadataManagerImpl.java:570)
org.apache.hadoop.ozone.om.OmMetadataManagerImpl.start(OmMetadataManagerImpl.java:560)
org.apache.hadoop.ozone.om.OmMetadataManagerImpl.<init>(OmMetadataManagerImpl.java:342)
org.apache.hadoop.ozone.om.OzoneManager.instantiateServices(OzoneManager.java:803)
org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:683)
org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:768)
org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:189)
org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:86)
2024-01-22 19:51:22,414 [LeakDetector-ManagedRocksObject0] WARN org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectUtils: ManagedStatistics is not closed properly
StackTrace for unclosed instance: org.apache.hadoop.hdds.utils.db.managed.ManagedStatistics.<init>(ManagedStatistics.java:30)
org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.start(AbstractDatanodeStore.java:121)
org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.<init>(AbstractDatanodeStore.java:99)
org.apache.hadoop.ozone.container.metadata.DatanodeStoreSchemaThreeImpl.<init>(DatanodeStoreSchemaThreeImpl.java:66)
org.apache.hadoop.ozone.container.keyvalue.helpers.BlockUtils.getUncachedDatanodeStore(BlockUtils.java:85)
org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.initPerDiskDBStore(HddsVolumeUtil.java:74)
org.apache.hadoop.ozone.container.common.volume.HddsVolume.loadDbStore(HddsVolume.java:365)
org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.loadVolume(HddsVolumeUtil.java:111)
org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.lambda$loadAllHddsVolumeDbStore$0(HddsVolumeUtil.java:97)
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1618)
java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10184

How was this patch tested?

  • Existing tests
  • Started OM and DN in a test cluster; the warning no longer reproduces after the fix

@whbing
Contributor Author

whbing commented Jan 22, 2024

@adoroszlai PTAL, thank you!

Contributor

@adoroszlai adoroszlai left a comment

Thanks @whbing for reporting this issue and working on the fix.

I wonder if Statistics should be kept open until DBOptions is closed.

Contributor

@sumitagrawl sumitagrawl left a comment

@whbing Thanks for working on this.
ManagedStatistics is assigned to options, so do options close implicitly on its closer?
Even if we attach ManagedStatistics to ManagedOption so that it gets closed there, will it be closed twice: once by RocksDB and once by ManagedOption (if that is added)?

So we need to check whether we really need leak detection for ManagedStatistics.

@adoroszlai
Contributor

ManagedStatistics is assigned to options, so do options close implicitly on its closer?

@sumitagrawl each object needs to be closed explicitly

https://github.com/facebook/rocksdb/blob/ef342246dc63c54d61909a6a5a0917769d83688a/java/samples/src/main/java/RocksDBSample.java#L30-L35
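
To illustrate the point that each object needs its own explicit close, here is a minimal sketch with stand-in classes. `FakeStatistics`, `FakeOptions`, and `ExplicitCloseDemo` are hypothetical placeholders, not RocksDB or Ozone types. The sketch assumes that closing the options object does not close the statistics object attached to it.

```java
// Hypothetical stand-ins for native-backed objects; not RocksDB classes.
class FakeStatistics implements AutoCloseable {
  private boolean closed;
  @Override public void close() { closed = true; }
  boolean isClosed() { return closed; }
}

class FakeOptions implements AutoCloseable {
  private FakeStatistics statistics;
  private boolean closed;
  FakeOptions setStatistics(FakeStatistics s) { this.statistics = s; return this; }
  // Assumption for this sketch: closing options does NOT close the statistics.
  @Override public void close() { closed = true; }
  boolean isClosed() { return closed; }
}

class ExplicitCloseDemo {
  // Closes only the options; the returned statistics object is still open.
  static FakeStatistics closeOnlyOptions() {
    FakeStatistics stats = new FakeStatistics();
    try (FakeOptions options = new FakeOptions().setStatistics(stats)) {
      // ... use options ...
    }
    return stats;
  }

  // Declares both as resources, so both are closed (options first, then stats).
  static FakeStatistics closeBoth() {
    FakeStatistics stats = new FakeStatistics();
    try (FakeStatistics s = stats;
         FakeOptions options = new FakeOptions().setStatistics(s)) {
      // ... use options ...
    }
    return stats;
  }
}
```

In the first method the statistics object outlives the block unclosed, which is the situation the leak detector reports; in the second, try-with-resources releases both objects in reverse declaration order.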

@whbing
Contributor Author

whbing commented Jan 23, 2024

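  // Capture everything logged by the leak detector's logger.
  private static GenericTestUtils.LogCapturer log =
      GenericTestUtils.LogCapturer.captureLogs(ManagedRocksObjectUtils.LOG);
  // After startup, the captured output must not contain a leak warning.
  assertFalse(log.getOutput().contains("is not closed properly"));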

A simple check of the startup log: the warning is gone after closing ManagedStatistics.
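
The same kind of check can be sketched without the Ozone test utilities, using only `java.util.logging`. `LogCapture` and `LeakLogCheck` are hypothetical names for this illustration and are not the `GenericTestUtils.LogCapturer` API; the logger name is also made up.

```java
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for a log capturer; not the Ozone GenericTestUtils API.
class LogCapture extends Handler {
  private final StringBuilder output = new StringBuilder();

  static LogCapture captureLogs(Logger logger) {
    LogCapture capture = new LogCapture();
    logger.addHandler(capture);
    return capture;
  }

  @Override public synchronized void publish(LogRecord record) {
    output.append(record.getMessage()).append('\n');
  }
  @Override public void flush() { }
  @Override public void close() { }

  synchronized String getOutput() { return output.toString(); }
}

class LeakLogCheck {
  static final Logger LOG = Logger.getLogger("leak-detector-demo");

  // Returns true when no leak warning was captured while running the action.
  static boolean runsWithoutLeakWarning(Runnable action) {
    LogCapture capture = LogCapture.captureLogs(LOG);
    try {
      action.run();
      return !capture.getOutput().contains("is not closed properly");
    } finally {
      LOG.removeHandler(capture);
    }
  }
}
```

A check like this only proves that no warning was logged during the captured window; a leak reported later, after a subsequent garbage collection, would not be seen.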

Member

@aswinshakil aswinshakil left a comment

Though this fixes the issue, we are closing the ManagedStatistics object before collecting any stats. I wonder whether it records any stats if ManagedStatistics is closed here.

@whbing
Contributor Author

whbing commented Jan 24, 2024

The GC already begins to reclaim the ManagedStatistics object shortly after the instance is initialized during OM startup. This indicates that the ManagedStatistics object is not continuously referenced throughout the runtime of the OM. Perhaps we should determine exactly when its lifecycle ends.

Maybe we can trace the metric changes after closing ManagedStatistics to verify whether it still records any stats.

@adoroszlai
Contributor

I tested this for a short period, RocksDB statistics looked OK in Prometheus.

cd hadoop-ozone/dist/target/ozone-1.5.0-SNAPSHOT/compose/ozone
export COMPOSE_FILE=docker-compose.yaml:monitoring.yaml
OZONE_DATANODES=3 ./run.sh -d
docker-compose exec -T scm ozone freon ockg -n10000 -t4
open http://localhost:9090

Still, closing Statistics doesn't seem right. I think it should be kept in RDBStore (passed via DBStoreBuilder), and closed in RDBStore#close.
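
The ownership model suggested here can be sketched as follows. All class names (`StatsHandle`, `SketchStore`, `SketchStoreBuilder`) are hypothetical stand-ins, not the actual `RDBStore` or `DBStoreBuilder` code; the sketch only shows the builder creating the statistics object and the store releasing it in its own `close()`.

```java
// Hypothetical stand-ins; not the Ozone RDBStore/DBStoreBuilder classes.
class StatsHandle implements AutoCloseable {
  private boolean closed;
  @Override public void close() { closed = true; }
  boolean isClosed() { return closed; }
}

class SketchStore implements AutoCloseable {
  private final StatsHandle statistics; // null when statistics are off

  SketchStore(StatsHandle statistics) { this.statistics = statistics; }

  StatsHandle getStatistics() { return statistics; }

  @Override public void close() {
    // The store owns the handle, so it stays open for the store's whole
    // lifetime and is released together with it.
    if (statistics != null) {
      statistics.close();
    }
  }
}

class SketchStoreBuilder {
  private boolean statisticsEnabled;

  SketchStoreBuilder enableStatistics(boolean enabled) {
    this.statisticsEnabled = enabled;
    return this;
  }

  SketchStore build() {
    // Created by the builder, but ownership is handed to the store.
    StatsHandle statistics = statisticsEnabled ? new StatsHandle() : null;
    return new SketchStore(statistics);
  }
}
```

With this arrangement the statistics object remains strongly referenced while the store is in use, so it is neither closed too early nor left for a leak detector to find.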

@whbing
Contributor Author

whbing commented Jan 24, 2024

I think it should be kept in RDBStore (passed via DBStoreBuilder), and closed in RDBStore#close.

👍 @adoroszlai Thanks for the reminder. I'll add a new commit later.

@whbing
Contributor Author

whbing commented Jan 24, 2024

@adoroszlai PTAL again if you have time, thanks!

@adoroszlai adoroszlai self-requested a review January 24, 2024 15:24
Contributor

@adoroszlai adoroszlai left a comment

Thanks @whbing for updating the patch, LGTM. Two minor possible improvements noted. If you agree with them, only apply them if there are any other requests for changes. Otherwise they are fine to be included in future tasks.

Comment on lines +237 to 240
    if (statistics != null) {
      IOUtils.close(LOG, statistics);
    }
    IOUtils.close(LOG, db);
Contributor

nit: IOUtils.close accepts multiple objects to be closed, and handles null values, too.

Suggested change
-    if (statistics != null) {
-      IOUtils.close(LOG, statistics);
-    }
-    IOUtils.close(LOG, db);
+    IOUtils.close(LOG, db, statistics);
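
A null-tolerant, multi-argument close helper of the kind described in this nit could look roughly like the sketch below. `QuietCloser.closeAll` is a hypothetical helper written for illustration and is not the `IOUtils.close` implementation; in particular, it returns failures to the caller, which is an assumption of this sketch rather than the behavior of the real method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the Ozone IOUtils implementation.
class QuietCloser {
  // Closes every non-null resource in order and collects failures instead
  // of throwing, so one failing close does not skip the remaining ones.
  static List<Exception> closeAll(AutoCloseable... resources) {
    List<Exception> failures = new ArrayList<>();
    if (resources == null) {
      return failures;
    }
    for (AutoCloseable resource : resources) {
      if (resource == null) {
        continue; // null entries are simply skipped
      }
      try {
        resource.close();
      } catch (Exception e) {
        failures.add(e);
      }
    }
    return failures;
  }
}
```

Because nulls are skipped inside the helper, the caller does not need its own `if (x != null)` guard, which is what lets the two calls collapse into one.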

Comment on lines +194 to +198
    if (!rocksDbStat.equals(OZONE_METADATA_STORE_ROCKSDB_STATISTICS_OFF)) {
      statistics = new ManagedStatistics();
      statistics.setStatsLevel(StatsLevel.valueOf(rocksDbStat));
      dbOptions.setStatistics(statistics);
    }
Contributor

nit: ManagedStatistics statistics could be a local variable in build().

I don't think it makes any difference functionally, but it would be easier to follow its lifecycle. With an instance variable, the reader needs to consider what happens if the same builder is used to build multiple RDBStore instances.
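
A minimal sketch of the local-variable variant, showing what happens when one builder builds several stores. Every name here (`SketchStatsLevel`, `LevelledStats`, `BuiltStore`, `ReusableBuilder`) is a hypothetical stand-in, and the `"OFF"` string is assumed for illustration rather than taken from the Ozone constants.

```java
// Hypothetical stand-ins; names do not come from Ozone or RocksDB.
enum SketchStatsLevel { EXCEPT_DETAILED_TIMERS, ALL }

class LevelledStats {
  final SketchStatsLevel level;
  LevelledStats(SketchStatsLevel level) { this.level = level; }
}

class BuiltStore {
  final LevelledStats statistics; // null when statistics are off
  BuiltStore(LevelledStats statistics) { this.statistics = statistics; }
}

class ReusableBuilder {
  static final String STATISTICS_OFF = "OFF"; // assumed value for this sketch
  private final String statsSetting;

  ReusableBuilder(String statsSetting) { this.statsSetting = statsSetting; }

  BuiltStore build() {
    // Local variable: each build() call creates its own statistics object,
    // so reusing the builder cannot share or overwrite an earlier one.
    LevelledStats statistics = null;
    if (!STATISTICS_OFF.equals(statsSetting)) {
      statistics = new LevelledStats(SketchStatsLevel.valueOf(statsSetting));
    }
    return new BuiltStore(statistics);
  }
}
```

Calling `build()` twice on the same `ReusableBuilder` yields two stores with distinct statistics objects, so the lifecycle of each one can be read off from a single method.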

Contributor Author

Keeping statistics as an instance variable just makes it easier to pass to RDBStore with the current method structure; there is no other reason.

Member

@aswinshakil aswinshakil left a comment

Thanks @whbing for the patch. LGTM.

@whbing
Contributor Author

whbing commented Jan 25, 2024

@adoroszlai @aswinshakil @sumitagrawl Thank you for the reviews above!

Contributor

@sumitagrawl sumitagrawl left a comment

LGTM

@adoroszlai adoroszlai merged commit 9018728 into apache:master Jan 25, 2024
35 checks passed
@adoroszlai
Contributor

Thanks @whbing for the fix, @aswinshakil, @sumitagrawl for the review.

adoroszlai pushed a commit to adoroszlai/ozone that referenced this pull request Jan 25, 2024