-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Expose getLastCommittedSegmentInfos in Engine #97978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose getLastCommittedSegmentInfos in Engine #97978
Conversation
server/src/main/java/org/elasticsearch/index/engine/Engine.java
Outdated
Show resolved
Hide resolved
| * Returns the file sizes for the current commit | ||
| */ | ||
| public List<SegmentFileSize> getLastCommittedSegmentFileSizes() { | ||
| SegmentInfos segmentInfos = getLastCommittedSegmentInfos(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are getLastCommittedSegmentInfos and SegmentInfos objects thread-safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but it is used as it is.
The only place getLastCommittedSegmentInfos() is currently used is in Engine.commitStats() where it is used in the constructor of CommitStats. The underlying reference is volatile in InternalEngine and final in ReadOnlyEngine.
In SegmentInfos, the segment list is modified in applyMergeChanges, add, clear and remove.
SegmentCommitInfo is thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect (but haven't confirmed for sure) that all modifications to SegmentInfos#segments happen before it is committed. If so, the value returned from getLastCommittedSegmentInfos() is effectively immutable 👍
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
| /** | ||
| * Returns the file sizes for the current commit | ||
| */ | ||
| public Map<String, Long> getLastCommittedSegmentFileSizes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to expose all the individual files of the commit? Could we instead have long getLastCommitSizeInBytes() that serves the narrow purpose of our needs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
| total += commitInfo.sizeInBytes(); | ||
| } catch (IOException err) { | ||
| logger.warn( | ||
| "Failed to read file size for shard: [{}], id: [{}], err: [{}]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have an unexpected failure, should we emit a value at all? I'm wondering if a partial value + log message could result in botched billing that gets overlooked, vs a complete stoppage in metrics would get alarmed on quickly (and more likely to be caught in testing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a failure to read a segment file. I'm wondering if we should indicate partial failures in this method and report those upstream?
pgomulka
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I suspect that since SegmentInfos are effectively immutable you changed the initial approach and made getLastCommittedSegmentInfos method public?
let's update the PR description to reflect that
As
Engine.getLastCommittedSegmentInfos()is effectively immutable, it is acceptable to expose.