Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional content checking to ResourceWatcher #79423

Merged
merged 3 commits into from
Oct 20, 2021

Conversation

tvernum
Copy link
Contributor

@tvernum tvernum commented Oct 19, 2021

In some cloud environments, there may be frequent synchronization of
configuration files from the orchestration layer to the ES container.

This can trigger frequent, unnecessary reloading of files.

Previously, code that used the ResourceWatcherService / FileWatcher
would need to detect "no-op" file changes itself. With the addition of
this content checking support, it can be handled efficiently by the
Resource Watcher Service.

In some cloud environments, there may be frequent synchronization of
configuration files from the orchestration layer to the ES container.

This can trigger frequent, unnecessary reloading of files.

Previously, code that used the ResourceWatcherService / FileWatcher
would need to detect "no-op" file changes itself. With the addition of
this content checking support, it can be handled efficiently by the
Resource Watcher Service.
@tvernum tvernum added >enhancement :Core/Infra/Core Core issues without another label v8.0.0 v7.16.0 labels Oct 19, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Oct 19, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@ywangd ywangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This might be a good tradeoff in the situation when there are many more false postives (watcher gets notified when file does not actually change). But it adds an additional file read for true postives. Also the downstream consumer still needs to check for false postives though much less likely to happen. Overall since false postives are much more likely to happen and true postives are relatively rare, it's an improvement.

}
byte[] prevDigest = this.digest;
try (var in = Files.newInputStream(path)) {
this.digest = MessageDigests.digest(in, MessageDigests.md5());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have usages for md5 in other places as well. I am always confused by whether this is FIPS compliant. But BCFIPS provides md5 even in approved only mode. I'd appreciate if you could elaborate a bit more on this choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MD5 is an edge case.
It's available under FIPS but technically only approved for use in TLS not general crypto.
We could use SHA-1 here instead, but since we're using the hash as a checksum, rather than for crypto, I don't think it's really necessary - we really want a message-digest not a secure-hash

@@ -144,6 +160,39 @@ public void checkAndNotify() throws IOException {

}

private boolean fileHasChanged(long prevLength, long prevLastModified) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate like method also mutates the digest field. Would it be better to mutate digest in the same place as length and lastModified and let this method be pure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was attempting to keep it as cheap as possible and avoid calculating a digest until we knew we needed it, but the consensus seems to be that it's better to be consistent, so I'll calculate the digest as soon as we know we have a file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, not as soon as we have a file - we want to know that the length or last-modified data have changed first.

- Calculate hash on first file read
- Calculate hash everytime file size changes
- Additional tests
@tvernum tvernum merged commit f7454b8 into elastic:master Oct 20, 2021
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Oct 20, 2021
In some cloud environments, there may be frequent synchronization of
configuration files from the orchestration layer to the ES container.

This can trigger frequent, unnecessary reloading of files.

Previously, code that used the ResourceWatcherService / FileWatcher
would need to detect "no-op" file changes itself. With the addition of
this content checking support, it can be handled efficiently by the
Resource Watcher Service.

Backport of: elastic#79423
elasticsearchmachine pushed a commit that referenced this pull request Oct 20, 2021
* Add optional content checking to ResourceWatcher

In some cloud environments, there may be frequent synchronization of
configuration files from the orchestration layer to the ES container.

This can trigger frequent, unnecessary reloading of files.

Previously, code that used the ResourceWatcherService / FileWatcher
would need to detect "no-op" file changes itself. With the addition of
this content checking support, it can be handled efficiently by the
Resource Watcher Service.

Backport of: #79423

* Deal with modification dates on JDK8

JDK only returns lastModified FileType at seconds granularity
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants