-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add optional content checking to ResourceWatcher #79423
Conversation
In some cloud environments, there may be frequent synchronization of configuration files from the orchestration layer to the ES container. This can trigger frequent, unnecessary reloading of files. Previously, code that used the ResourceWatcherService / FileWatcher would need to detect "no-op" file changes itself. With the addition of this content checking support, it can be handled efficiently by the Resource Watcher Service.
Pinging @elastic/es-core-infra (Team:Core/Infra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
server/src/main/java/org/elasticsearch/common/hash/MessageDigests.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/watcher/FileWatcher.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/common/hash/MessageDigestsTests.java
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/watcher/FileWatcherTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This might be a good tradeoff in the situation when there are many more false postives (watcher gets notified when file does not actually change). But it adds an additional file read for true postives. Also the downstream consumer still needs to check for false postives though much less likely to happen. Overall since false postives are much more likely to happen and true postives are relatively rare, it's an improvement.
server/src/main/java/org/elasticsearch/watcher/FileWatcher.java
Outdated
Show resolved
Hide resolved
} | ||
byte[] prevDigest = this.digest; | ||
try (var in = Files.newInputStream(path)) { | ||
this.digest = MessageDigests.digest(in, MessageDigests.md5()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have usages for md5
in other places as well. I am always confused by whether this is FIPS compliant. But BCFIPS provides md5 even in approved only mode. I'd appreciate if you could elaborate a bit more on this choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MD5 is an edge case.
It's available under FIPS but technically only approved for use in TLS not general crypto.
We could use SHA-1 here instead, but since we're using the hash as a checksum, rather than for crypto, I don't think it's really necessary - we really want a message-digest not a secure-hash
@@ -144,6 +160,39 @@ public void checkAndNotify() throws IOException { | |||
|
|||
} | |||
|
|||
private boolean fileHasChanged(long prevLength, long prevLastModified) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate like method also mutates the digest
field. Would it be better to mutate digest
in the same place as length
and lastModified
and let this method be pure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was attempting to keep it as cheap as possible and avoid calculating a digest until we knew we needed it, but the consensus seems to be that it's better to be consistent, so I'll calculate the digest as soon as we know we have a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, not as soon as we have a file - we want to know that the length or last-modified data have changed first.
- Calculate hash on first file read - Calculate hash everytime file size changes - Additional tests
In some cloud environments, there may be frequent synchronization of configuration files from the orchestration layer to the ES container. This can trigger frequent, unnecessary reloading of files. Previously, code that used the ResourceWatcherService / FileWatcher would need to detect "no-op" file changes itself. With the addition of this content checking support, it can be handled efficiently by the Resource Watcher Service. Backport of: elastic#79423
* Add optional content checking to ResourceWatcher In some cloud environments, there may be frequent synchronization of configuration files from the orchestration layer to the ES container. This can trigger frequent, unnecessary reloading of files. Previously, code that used the ResourceWatcherService / FileWatcher would need to detect "no-op" file changes itself. With the addition of this content checking support, it can be handled efficiently by the Resource Watcher Service. Backport of: #79423 * Deal with modification dates on JDK8 JDK only returns lastModified FileType at seconds granularity
In some cloud environments, there may be frequent synchronization of
configuration files from the orchestration layer to the ES container.
This can trigger frequent, unnecessary reloading of files.
Previously, code that used the ResourceWatcherService / FileWatcher
would need to detect "no-op" file changes itself. With the addition of
this content checking support, it can be handled efficiently by the
Resource Watcher Service.