Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OAK-11182: Create and test general mechanism for creating MongoDocumentStore fullGC garbage for testing #1805

Open
wants to merge 86 commits into
base: trunk
Choose a base branch
from

Conversation

shodaaan
Copy link
Contributor

  • created GenerateFullGCCommand oak-run class with methods for generating garbage of type GAP_ORPHANS and EMPTY_PROPERTIES. All garbage is generated under a specific main node. Also created a cleanup method for deleting all generated garbage
  • added call to GenerateFullGCCommand to AvailableModes with specific oak-run input variable
  • created GenerateFullGCCommandTest class with unit test methods for generating and cleaning up garbage
  • made VersionGarbageCollector.fullGCMode public, so it can be used Command class

Rishabh Kumar and others added 30 commits June 13, 2024 17:59
…gment has zero length (apache#1427)

* OAK-10771 - Log if evicted segment has length zero.

* OAK-10771 - Add diskCacheSize stats

* OAK-10771 - Record cache size and change

---------

Co-authored-by: Axel Hanikel <ahanikel@adobe.com>
…e/deleteMany to make it compatible with both mongo & mongosh
…ache#1466)

* OAK-10811: (oak-search-elastic) reduce contention in IndexTracker

* OAK-10811: ElasticIndexInfoProvider uses node api to get :status info

* OAK-10811: rolled back unneeded change
* OAK-10675: add service principal support in oak-blob-cloud-azure

* OAK-10675: use user delegation key signed sas for service principal and add license to new files

* OAK-10675: some self review changes

* OAK-10675: set null blob headers as empty string

* OAK-10675: update pom to resolve package dependency

* OAK-10675: add import packages to resolve bundle

* OAK-10770: embed runtime dependencies of azure identity in oak-segment-azure

* OAK-10770: remove unused imports, refactor dependencies

* OAK-10770: add imports, refactor dependencies

* OAK-10675: fix runtime dependency issues, code review
…local names containing '{' or '}'. (apache#1285)

Fixed issue and added test case.

Co-authored-by: Julian Reschke <reschke@apache.org>
Done

---------

Co-authored-by: Julian Sedding <jsedding@apache.org>
* OAK-10780: add azure access token refresh logic

* OAK-10780: modify log statement

* OAK-10780: modify log statement

* OAK-10780: reduce token refresh interval
…anded names (apache#1488)

* OAK-10621: o.a.j.o.namepath.JcrPathParser does not accept indexed expanded names.

Fixed, added unit tests.
* OAK-10675: add service principal support in oak-blob-cloud-azure

* OAK-10675: use user delegation key signed sas for service principal and add license to new files

* OAK-10675: some self review changes

* OAK-10675: set null blob headers as empty string

* OAK-10675: update pom to resolve package dependency

* OAK-10675: add import packages to resolve bundle

* OAK-10770: embed runtime dependencies of azure identity in oak-segment-azure

* OAK-10770: remove unused imports, refactor dependencies

* OAK-10770: add imports, refactor dependencies

* OAK-10675: fix runtime dependency issues, code review

* OAK-10781: add access token refresh mechanism

* close executor

* reduce token refresh delauy to 1 minute

* retrigger build
…#1454)

Co-authored-by: Stefan Egli <stefanegli@apache.org>
Co-authored-by: Rishabh Kumar <diam@adobe.com>
Co-authored-by: Jose Cordero <corderob@adobe.com>
Co-authored-by: Daniel Iancu <iancu@adobe.com>

Full details in branch: DetailedGC/OAK-10199
…generated garbage nodes to shorter names

- made executor a class level variable and added close method for it
- renamed files to include "Garbage"
- other small suggested changes
…arbageOptionType parameter passed to the generateGarbage Command

- reverted change in VersionGarbageCollector.java
…ommand, which allow for generating gap orphans up to 15 levels deep and also allows for deleting nodes in their hierarchy up to the specified orphansLevelGap level

- separated the garbage generation code for EMPTY_PROPS / GAP_ORPHANS into 2 separate methods
- adjusted existing unit test methods in GenerateGarbageCommandTest and added new ones for the new level / gap parameters
- other small changes from the PR review
Copy link
Contributor

@stefan-egli stefan-egli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few more points.

One general question is : as this command is clearly a test command not to be used against a production environment : how could that be emphasized. For example: we could add an explicit: "are you really sure you want to generate test garbage against ...". That would make it clear that the target repository is going to be written to. Wdyt?

…der the garbage root path

- added additional unit test for cleanup of empty properties garbage
- used path utility methods where possible
- added some method descriptions where relevant
- renamed constants and updated comments and strings to remove "fullGC" from strings
@shodaaan
Copy link
Contributor Author

Added a few more points.

One general question is : as this command is clearly a test command not to be used against a production environment : how could that be emphasized. For example: we could add an explicit: "are you really sure you want to generate test garbage against ...". That would make it clear that the target repository is going to be written to. Wdyt?

I resolved all of the points, thank you for the suggestions.

Regarding the explicit check, I don't think this is necessary because oak-run is a tool that by design makes important / deep changes in the MongoDB it is run on, so users should always be careful and know the exact consequences of the commands they run.
The USAGE text clearly states what the command and associated subcommands do.
Also, since oak-run commands are designed to be used "with intent", and not "by accident", having an additional confirmation step for each garbage generation command run would just annoy the user IMHO. I didn't see this check for any other command in oak-run.

@stefan-egli
Copy link
Contributor

Regarding the explicit check, I don't think this is necessary

Fair enough.

Two unrelated additional comments:

  • the DocumentNodeStore is not properly disposed. You can see this by running the oak-run command consecutively : the second start will have to wait 2min for the lease timeout of the first non-gracefully shut down run. We should ensure a graceful disposal.
  • what might be useful is to add the base path where the garbage is generated to - such that the user immediately knows (and doesn't eg have to consult the source code)

- added documentNodeStore.dispose() calls to CreateGarbageCommand.java
@shodaaan
Copy link
Contributor Author

Regarding the explicit check, I don't think this is necessary

Fair enough.

Two unrelated additional comments:

  • the DocumentNodeStore is not properly disposed. You can see this by running the oak-run command consecutively : the second start will have to wait 2min for the lease timeout of the first non-gracefully shut down run. We should ensure a graceful disposal.
  • what might be useful is to add the base path where the garbage is generated to - such that the user immediately knows (and doesn't eg have to consult the source code)

I added documentNodeStore.dispose() calls to CreateGarbageCommand. Thank you for the suggestion

The root path where garbage is generated (/tmp/oak-run-generated-test-garbage) is specified in the USAGE string of CreateGarbageCommand, so users will know what the path is where garbage is generated.

- added disabled lease check mode setting to DocumentNodeStoreBuilder in GenerateGarbageCommand
- added some more output messages to make the functionality clearer when running oak-run
@shodaaan
Copy link
Contributor Author

shodaaan commented Nov 6, 2024

Regarding the explicit check, I don't think this is necessary

Fair enough.
Two unrelated additional comments:

  • the DocumentNodeStore is not properly disposed. You can see this by running the oak-run command consecutively : the second start will have to wait 2min for the lease timeout of the first non-gracefully shut down run. We should ensure a graceful disposal.
  • what might be useful is to add the base path where the garbage is generated to - such that the user immediately knows (and doesn't eg have to consult the source code)

I added documentNodeStore.dispose() calls to CreateGarbageCommand. Thank you for the suggestion

The root path where garbage is generated (/tmp/oak-run-generated-test-garbage) is specified in the USAGE string of CreateGarbageCommand, so users will know what the path is where garbage is generated.

I added proper disposal of DocumentNodeStore for both createGarbage and cleanupGarbage commands. Tested oak-run with repeated runs and disposal works correctly.

I also added some println messages with the base path of the generated garbage, as well as the paths of the documents deleted by the "clean" sub-command.

Copy link
Contributor

@rishabhdaim rishabhdaim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would replace the usage of generate with create

- removed Closeable implementation of CreateGarbageCommand
- some minor renames
Copy link
Contributor

@stefan-egli stefan-egli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had troubles getting the continuous mode running and I believe it's broken because of the new closer logic closing it early.

(other than that added a few detail remarks as well)

…d.java in order to cover all cases, including continuous garbage generation

- updated resource disposal in unit tests to account for changes
- other minor changes / renames
@shodaaan
Copy link
Contributor Author

Had troubles getting the continuous mode running and I believe it's broken because of the new closer logic closing it early.

(other than that added a few detail remarks as well)

Tested oak-run with all modes (including continuous generation), everything works fine now.

Copy link
Contributor

@stefan-egli stefan-egli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 lgtm, started the jenkins run, would suggest to go ahead with merge unless we find a regression (unlikely), and then we can iterate from there

@stefan-egli
Copy link
Contributor

would suggest to go ahead with merge unless we find a regression

squash merge obviously!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.