Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(outputs): Clarify buffer limits behavior and fix spec wording #15999

Merged
merged 2 commits into from
Oct 16, 2024

Conversation

Hr0bar
Copy link
Contributor

@Hr0bar Hr0bar commented Oct 9, 2024

Summary

As discussed in #15908 , clarifying the behavior of output buffer behavior when it fills up in general, fixing incorrect specs wording and adding related useful note to docs of outpus.azure_monitor.

Hopefully the wording is clear enough?

Also just noticed in the specs: "Telegraf will not make any attempt to limit the size on disk taken by these files beyond cleaning up WAL files for metrics that have successfully been flushed to their output source."

Perhaps the last word should be destination instead of source ?

Checklist

  • No AI generated code was used in this PR

Related issues

#15908

@telegraf-tiger telegraf-tiger bot added the docs Issues related to Telegraf documentation and configuration descriptions label Oct 9, 2024
@DStrand1 DStrand1 self-assigned this Oct 9, 2024
Copy link
Member

@DStrand1 DStrand1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This should clarify things much better.

Also just noticed in the specs: "Telegraf will not make any attempt to limit the size on disk taken by these files beyond cleaning up WAL files for metrics that have successfully been flushed to their output source."

Perhaps the last word should be destination instead of source ?

I agree changing to "destination" here would be clearer wording, feel free to make that change as well!

@Hr0bar
Copy link
Contributor Author

Hr0bar commented Oct 10, 2024

There is also "Tracking metrics will be accepted either on a successful write to the output source like currently, or on write to the WAL file."

And Im not so sure anymore, never used tracking metrics, but seems to be write to a buffer WAL file only, perhaps what is meant is that the metric is only handed over to the output plugin (in which case output source would make sense probably), to be written to output destination later (as in external system, file, stdout...).

I would rather leave changing "source" to "destination" in this spec to someone more knowledgeable about Telegraf internals. Perhaps the first occurrence is correct and should be "output source" and only the previously discussed second occurrence needs changing - as "cleaning up WAL files for metrics that have successfully been flushed" seems to indicate write to an external system/file/stdout, rather than only handing the metric over to output plugin.

@DStrand1
Copy link
Member

There is also "Tracking metrics will be accepted either on a successful write to the output source like currently, or on write to the WAL file."

And Im not so sure anymore, never used tracking metrics, but seems to be write to a buffer WAL file only, perhaps what is meant is that the metric is only handed over to the output plugin (in which case output source would make sense probably), to be written to output destination later (as in external system, file, stdout...).

This line is actually more incorrect now, it should say something like "Tracking metrics will be accepted on a successful write to the output destination". Because of this I think using destination over source makes more sense, as there is no middle distinction here between "handed to the output plugin" and "written to the output destination," metrics are accepted and cleaned up from the WAL file only if the output plugin successfully delivers the metric

@Hr0bar
Copy link
Contributor Author

Hr0bar commented Oct 11, 2024

Okay, Ive updated it, makes sense. The previous wording almost got me thinking there is some "in between" state where the metric is "in the output plugin", but not sent to the actual output destination yet. Glad to hear that is not the case.

But we should verify the tracking metrics being "accepted" with the WAL buffer - Im new to the WAL on disk buffer strategy, but I see it is persisted across telegraf instances/restarts, are we sure tracking metrics are "accepted" only when they are flushed from the WAL to the actual output destination ?

For example, I see this comment in the on disk buffer test code:

// Expected to drop the 4th metric, as tracking metrics from
// previous instances  are dropped when the wal file is reopened.

@telegraf-tiger
Copy link
Contributor

@DStrand1
Copy link
Member

... I see it is persisted across telegraf instances/restarts, are we sure tracking metrics are "accepted" only when they are flushed from the WAL to the actual output destination ?

Tracking data is only relevant on the current instance of telegraf, the contract is that if a tracking metric is not accepted, the input plugin that generated it will not assume that metric is "done," and will send it again if asked for more metrics. However on a new instance of telegraf, this tracking data has no way to inform the input, so these metrics have to be removed from the buffer and be reacquired again from the input plugin with new tracking data.

@Hr0bar
Copy link
Contributor Author

Hr0bar commented Oct 16, 2024

Makes sense, the changes should be good then

Copy link
Member

@DStrand1 DStrand1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this!

@DStrand1 DStrand1 assigned srebhan and unassigned DStrand1 Oct 16, 2024
@DStrand1 DStrand1 added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Oct 16, 2024
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Hr0bar!

@srebhan srebhan added the plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins label Oct 16, 2024
@srebhan srebhan changed the title docs: Clarify buffer limits behavior, fix specs wording docs(outputs): Clarify buffer limits behavior and fix spec wording Oct 16, 2024
@srebhan srebhan merged commit c0bea1b into influxdata:master Oct 16, 2024
29 checks passed
@github-actions github-actions bot added this to the v1.32.2 milestone Oct 16, 2024
srebhan pushed a commit that referenced this pull request Oct 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Issues related to Telegraf documentation and configuration descriptions plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants