Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate why TimeBasedDirectoryCleaner counts too many deletes on MacOS #143

Closed
chrisrohr opened this issue Apr 27, 2020 · 8 comments · Fixed by #554 or #556
Closed

Investigate why TimeBasedDirectoryCleaner counts too many deletes on MacOS #143

chrisrohr opened this issue Apr 27, 2020 · 8 comments · Fixed by #554 or #556
Assignees
Labels
investigation Something that needs to be investigated before implementation can proceed

Comments

@chrisrohr
Copy link
Contributor

No description provided.

@sleberknight sleberknight added the investigation Something that needs to be investigated before implementation can proceed label Apr 27, 2020
@sleberknight
Copy link
Member

See the testScheduleCleanup_WithScheduledExecutor_UsingMultipleConcurrentCleaners_IntegrationTest which is @EnabledOnOs(OS.LINUX)

@sleberknight
Copy link
Member

To clarify, when run on macOS (tested on Catalina and Big Sur), we see more deletes than the total number of directories created.

I modified waitUntilReachExpectedDeleteCount as follows and each time I run the tests I get slightly different results.

    private void waitUntilReachExpectedDeleteCount(TimeBasedDirectoryCleaner cleaner1,
                                                   TimeBasedDirectoryCleaner cleaner2,
                                                   int expectedDeleteCount) {
        await().atMost(TEN_SECONDS).until(() -> {
            long aggregateDeleteCount = cleaner1.getDeleteCount() + cleaner2.getDeleteCount();
            if (aggregateDeleteCount > expectedDeleteCount) {
                LOG.warn("expectedDeleteCount: {} ; aggregateDeleteCount: {}", expectedDeleteCount, aggregateDeleteCount);
            }
            return aggregateDeleteCount >= expectedDeleteCount;
        });
    }

Here is output from the most recent time I ran this test:

For totalFileCount=500:

expectedDeleteCount: 500 ; aggregateDeleteCount: 511

For totalFileCount=2000:

expectedDeleteCount: 2000 ; aggregateDeleteCount: 2053

@chrisrohr
Copy link
Contributor Author

This test is in TimeBasedDirectoryCleanerConfigTest. Putting this here since I kept looking in TimeBasedDirectoryCleanerTest and couldn't find it. The error I consistently get is on line waitUntilReachExpectedDeleteCount(cleaner1, cleaner2, 300); and it timeout at 10 seconds and states that the file delete errors are ~150.

@chrisrohr
Copy link
Contributor Author

Ok, I have a theory as to why this isn't working. When I introduce a small delay in between the two scheduleCleanup calls everything works correctly. I think that there must be something on the MacOS file system that is allowing our delete call to return true when 2 threads delete at the same time for both deletes. This causes our numbers to miscount. I know this test is ensuring that multiple cleaners can run at the same time, but I'm not sure that the use case of them starting at the exact same time is something we need to work around. I will submit a PR for this to at least have the tests pass and remove the @Enable

@sleberknight
Copy link
Member

Yes, pretty sure the code works, given that we've used it in production for years now. So changing the code isn't necessary, which is what I think you are saying, only the test?

@chrisrohr
Copy link
Contributor Author

Correct, this is only a test change.

@chrisrohr
Copy link
Contributor Author

The small delay in setting up the 2 cleaners can be as low as 200ms for the test using 500 files and 400ms for the test using 2000 files. That is enough time for them both to delete but not step on each other.

@sleberknight
Copy link
Member

And only add this delay on macOS? e.g. delayIfMacOs() and using "Mac OS X".equals(System.getProperty("os.name")) (or just using one of the many, many, many constants in SystemUtils in commons-lang3, e.g. SystemUtils.IS_OS_MAC

chrisrohr added a commit that referenced this issue Apr 21, 2021
…urrentCleaners_IntegrationTest on all platforms

The concurrent cleaners need to be created with a small buffer in between.  See #143 for details

Fixes #143
sleberknight pushed a commit that referenced this issue Apr 22, 2021
…urrentCleaners_IntegrationTest on all platforms (#554)

* Re-enable testScheduleCleanup_WithScheduledExecutor_UsingMultipleConcurrentCleaners_IntegrationTest on all platforms
* The concurrent cleaners need to be created with a small buffer in between on macOS.  See #143 for details

Fixes #143
sleberknight added a commit that referenced this issue May 4, 2021
Change the concurrent test in TimeBasedDirectoryCleanerConfigTest so
that it uses '>=' instead of '==' on macOS when waiting for number of
deletions and when making assertions about the total delete count. This
accommodates (though certainly doesn't explain) the behavior we see on
macOS in which there are more successful deletions than there are files
created. On Linux (CentOS as well as Ubuntu) we have never seen that
behavior, so for those OSes we retain the strict equality check.

Relates to #143 (which is already closed)
sleberknight added a commit that referenced this issue May 4, 2021
* Refactor TimeBasedDirectoryCleanerConfigTest for macOS vs Linux

Change the concurrent test in TimeBasedDirectoryCleanerConfigTest so
that it uses '>=' instead of '==' on macOS when waiting for number of
deletions and when making assertions about the total delete count. This
accommodates (though certainly doesn't explain) the behavior we see on
macOS in which there are more successful deletions than there are files
created. On Linux (CentOS as well as Ubuntu) we have never seen that
behavior, so for those OSes we retain the strict equality check.

Relates to #143 (which is already closed)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
investigation Something that needs to be investigated before implementation can proceed
Projects
None yet
2 participants