Contributor

@stu-elastic stu-elastic commented Jul 26, 2023

As the SegmentInfos returned by Engine.getLastCommittedSegmentInfos() is effectively immutable, it is acceptable to expose the method.

@stu-elastic stu-elastic added the WIP and :Core/Infra/Core (Core issues without another label) labels Jul 26, 2023
     * Returns the file sizes for the current commit
     */
    public List<SegmentFileSize> getLastCommittedSegmentFileSizes() {
        SegmentInfos segmentInfos = getLastCommittedSegmentInfos();
Member

Are getLastCommittedSegmentInfos and SegmentInfos objects thread-safe?

Contributor Author

No, but it is already used this way.

Currently, getLastCommittedSegmentInfos() is only called from Engine.commitStats(), which passes the result to the constructor of CommitStats. The underlying reference is volatile in InternalEngine and final in ReadOnlyEngine.

In SegmentInfos, the segment list is modified in applyMergeChanges, add, clear and remove.

SegmentCommitInfo is thread-safe.

Contributor

I would expect (but haven't confirmed for sure) that all modifications to SegmentInfos#segments happen before it is committed. If so, the value returned from getLastCommittedSegmentInfos() is effectively immutable 👍
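To make the argument in this thread concrete, here is a minimal sketch of publishing an effectively immutable snapshot through a volatile reference. It does not use Lucene or Elasticsearch classes: `CommitSnapshot` and `SketchEngine` are hypothetical stand-ins, and unlike the real SegmentInfos the snapshot here enforces immutability with an unmodifiable copy rather than relying on convention.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a committed SegmentInfos: populated once, never mutated afterwards.
final class CommitSnapshot {
    private final List<String> segmentNames;

    CommitSnapshot(List<String> segmentNames) {
        // Defensive copy, so later changes to the caller's list cannot leak into the snapshot.
        this.segmentNames = Collections.unmodifiableList(new ArrayList<>(segmentNames));
    }

    List<String> segmentNames() {
        return segmentNames;
    }
}

// Hypothetical stand-in for an engine that swaps in a new snapshot on every commit.
final class SketchEngine {
    // volatile: a reader observes either the previous or the new snapshot, fully constructed.
    private volatile CommitSnapshot lastCommitted = new CommitSnapshot(List.of());

    void commit(List<String> pendingSegmentNames) {
        // All modifications happen on the pending list before the snapshot is published.
        lastCommitted = new CommitSnapshot(pendingSegmentNames);
    }

    CommitSnapshot getLastCommitted() {
        return lastCommitted;
    }
}
```

A reader that holds on to a snapshot keeps seeing the commit it fetched, even if another thread commits in the meantime. This is the property the thread relies on, under the unconfirmed assumption that all changes to the segment list happen before the commit is published.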

@stu-elastic stu-elastic removed the WIP label Jul 26, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra (Meta label for core/infra team) label Jul 26, 2023
    /**
     * Returns the file sizes for the current commit
     */
    public Map<String, Long> getLastCommittedSegmentFileSizes() {
Member

Is there a reason we need to expose all the individual files of the commit? Could we instead have long getLastCommitSizeInBytes() that serves the narrow purpose of our needs?

Contributor Author

Changed.
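For illustration, a narrow accessor of the kind suggested above might look roughly like the sketch below. It is not the code from this PR: `SizedSegment` is a hypothetical stand-in for Lucene's SegmentCommitInfo (whose `sizeInBytes()` can throw IOException), and `lastCommitSizeInBytes` is an assumed name modelled on the reviewer's suggestion.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical stand-in for a per-segment object that can report its size on disk.
interface SizedSegment {
    long sizeInBytes() throws IOException;
}

final class CommitSizeSketch {
    // Narrow accessor: callers get a single number instead of a per-file breakdown.
    static long lastCommitSizeInBytes(List<SizedSegment> segments) throws IOException {
        long total = 0;
        for (SizedSegment segment : segments) {
            // Any read failure propagates to the caller; no partial total is returned.
            total += segment.sizeInBytes();
        }
        return total;
    }
}
```

Returning a single long keeps the per-file details inside the engine, so the exposed surface stays as small as the caller's needs.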

        total += commitInfo.sizeInBytes();
    } catch (IOException err) {
        logger.warn(
            "Failed to read file size for shard: [{}], id: [{}], err: [{}]",
Member

If we have an unexpected failure, should we emit a value at all? I'm wondering if a partial value plus a log message could result in botched billing that gets overlooked, whereas a complete stoppage in metrics would trigger an alarm quickly (and would be more likely to be caught in testing).

Contributor Author

This would be a failure to read a segment file. I'm wondering if we should indicate partial failures in this method and report those upstream?
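One way to report partial failures upstream, sketched under assumptions: the method returns both the total it managed to read and the number of segments it could not read, so the caller decides whether to emit a metric at all. `ReadableSegment`, `CommitSizeResult` and `PartialSizeSketch` are hypothetical names for illustration only and are not part of this PR.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical stand-in for a segment whose size lookup may fail with an IOException.
interface ReadableSegment {
    long sizeInBytes() throws IOException;
}

// Carries the bytes that were read together with a count of segments that could not be read.
final class CommitSizeResult {
    final long totalBytes;
    final int failedSegments;

    CommitSizeResult(long totalBytes, int failedSegments) {
        this.totalBytes = totalBytes;
        this.failedSegments = failedSegments;
    }

    boolean isComplete() {
        return failedSegments == 0;
    }
}

final class PartialSizeSketch {
    static CommitSizeResult measure(List<ReadableSegment> segments) {
        long total = 0;
        int failed = 0;
        for (ReadableSegment segment : segments) {
            try {
                total += segment.sizeInBytes();
            } catch (IOException err) {
                // Count the failure instead of hiding it behind a log line.
                failed++;
            }
        }
        return new CommitSizeResult(total, failed);
    }
}
```

A caller concerned about billing accuracy could skip publishing whenever `isComplete()` is false, which gives the "stoppage in metrics" behaviour raised in the review while still making the failure visible to upstream code.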

@stu-elastic stu-elastic changed the title from "Expose segment file sizes in Engine" to "Expose getLastCommittedSegmentInfos in Engine" Jul 27, 2023
Contributor

@pgomulka pgomulka left a comment

LGTM
I suspect that, since SegmentInfos is effectively immutable, you changed the initial approach and made the getLastCommittedSegmentInfos method public?
Let's update the PR description to reflect that.

@stu-elastic stu-elastic merged commit a2d4799 into elastic:main Jul 28, 2023

Labels

:Core/Infra/Core (Core issues without another label), >non-issue, Team:Core/Infra (Meta label for core/infra team), v8.10.0
