-
Notifications
You must be signed in to change notification settings - Fork 519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-4488. Open RocksDB read only when loading containers at Datanode startup #1605
Conversation
@sodonnel Thanks the patch. Do you test the time cost when open RockDB with read only ? |
@runzhiwang We have a cluster running with some dense nodes. I put some experimental changes into the code and tested the startup time before and after. With the original code, the startup time was about 500 seconds. With HDDS-4427 plus opening RocksDB readonly (this change) the startup time dropped to about 250 seconds. Timing the average RocksDB open time - read-write it was 75ms and read only it was about 35ms. These results were consistent across several test runs. |
@sodonnel Thanks. Got it. |
@sodonnel Thanks the patch. I have merged it. |
* HDDS-3698-upgrade: (46 commits) HDDS-4468. Fix Goofys listBucket large than 1000 objects will stuck forever (apache#1595) HDDS-4417. Simplify Ozone client code with configuration object -- addendum (apache#1581) HDDS-4476. Improve the ZH translation of the HA.md in doc. (apache#1597) HDDS-4432. Update Ratis version to latest snapshot. (apache#1586) HDDS-4488. Open RocksDB read only when loading containers at Datanode startup (apache#1605) HDDS-4478. Large deletedKeyset slows down OM via listStatus. (apache#1598) HDDS-4452. findbugs.sh couldn't be executed after a full build (apache#1576) HDDS-4427. Avoid ContainerCache in ContainerReader at Datanode startup (apache#1549) HDDS-4448. Duplicate refreshPipeline in listStatus (apache#1569) HDDS-4450. Cannot run ozone if HADOOP_HOME points to Hadoop install (apache#1572) HDDS-4346.Ozone specific Trash Policy (apache#1535) HDDS-4426. SCM should create transactions using all blocks received from OM (apache#1561) HDDS-4399. Safe mode rule for piplelines should only consider open pipelines. (apache#1526) HDDS-4367. Configuration for deletion service intervals should be different for OM, SCM and datanodes (apache#1573) HDDS-4462. Add --frozen-lockfile to pnpm install to prevent ozone-recon-web/pnpm-lock.yaml from being updated automatically (apache#1589) HDDS-4082. Create ZH translation of HA.md in doc. (apache#1591) HDDS-4464. Upgrade httpclient version due to CVE-2020-13956. (apache#1590) HDDS-4467. Acceptance test fails due to new Hadoop 3 image (apache#1594) HDDS-4466. Update url in .asf.yaml to use TLP project (apache#1592) HDDS-4458. Fix Max Transaction ID value in OM. (apache#1585) ...
* HDDS-3698-upgrade: (47 commits) HDDS-4468. Fix Goofys listBucket large than 1000 objects will stuck forever (apache#1595) HDDS-4417. Simplify Ozone client code with configuration object -- addendum (apache#1581) HDDS-4476. Improve the ZH translation of the HA.md in doc. (apache#1597) HDDS-4432. Update Ratis version to latest snapshot. (apache#1586) HDDS-4488. Open RocksDB read only when loading containers at Datanode startup (apache#1605) HDDS-4478. Large deletedKeyset slows down OM via listStatus. (apache#1598) HDDS-4452. findbugs.sh couldn't be executed after a full build (apache#1576) HDDS-4427. Avoid ContainerCache in ContainerReader at Datanode startup (apache#1549) HDDS-4448. Duplicate refreshPipeline in listStatus (apache#1569) HDDS-4450. Cannot run ozone if HADOOP_HOME points to Hadoop install (apache#1572) HDDS-4346.Ozone specific Trash Policy (apache#1535) HDDS-4426. SCM should create transactions using all blocks received from OM (apache#1561) HDDS-4399. Safe mode rule for piplelines should only consider open pipelines. (apache#1526) HDDS-4367. Configuration for deletion service intervals should be different for OM, SCM and datanodes (apache#1573) HDDS-4462. Add --frozen-lockfile to pnpm install to prevent ozone-recon-web/pnpm-lock.yaml from being updated automatically (apache#1589) HDDS-4082. Create ZH translation of HA.md in doc. (apache#1591) HDDS-4464. Upgrade httpclient version due to CVE-2020-13956. (apache#1590) HDDS-4467. Acceptance test fails due to new Hadoop 3 image (apache#1594) HDDS-4466. Update url in .asf.yaml to use TLP project (apache#1592) HDDS-4458. Fix Max Transaction ID value in OM. (apache#1585) ...
What changes were proposed in this pull request?
When a datanode is started, it must read some metadata from all the Containers. Part of that metadata is stored in RocksDB, so the startup process involves opening each rocksDB and closing it again.
Testing on a dense node, with 45 high performance spinning disks and 200K containers, I saw about 75ms on average to open each RockDB. Further testing demonstrated that if we open RockDB read only, the average open time is about 35ms.
At startup time, the DBs are only read, and never written, so opening read only is fine. HDDS-4427 already ensures these opened DBs are not cached.
This change adds an "openReadOnly" flag to RDBStore and the related classes to pass it down the stack, allowing the DB to be opened read only.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4488
How was this patch tested?
Currently existing unit tests.