Make o365audit input cancellable #21647

Merged
merged 2 commits into from
Oct 9, 2020

Conversation

adriansr
Contributor

@adriansr adriansr commented Oct 7, 2020

What does this PR do?

  • Updates the o365input to perform a cancellable wait when an error causes it to restart.
  • Also uses the configured error_retry_interval for the delay between restarts instead of a hardcoded 5m.

Why is it important?

Using time.Sleep prevents Filebeat from terminating until the timeout has elapsed.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Relates #21258

PR elastic#21258 introduced a restart mechanism for o365input so that it didn't
stop working once a fatal error was found. This updates the restart delay to
use a cancellation-context-aware method so that the input doesn't block
Filebeat termination.
@adriansr adriansr requested review from urso and a team October 7, 2020 15:58
@botelastic botelastic bot added the needs_team label Oct 7, 2020
@adriansr adriansr added the needs_backport and Team:SIEM labels and removed the needs_team label Oct 7, 2020
@elasticmachine
Collaborator

Pinging @elastic/siem (Team:SIEM)

@adriansr adriansr changed the title Make o365audit input cancelable Make o365audit input cancellable Oct 7, 2020
- ctx.Logger.Infof("Restarting in %v", failureRetryInterval)
- time.Sleep(failureRetryInterval)
+ ctx.Logger.Infof("Restarting in %v", inp.config.API.ErrorRetryInterval)
+ timed.Wait(ctx.Cancelation, inp.config.API.ErrorRetryInterval)

Note: inputs might have recoverable and non-recoverable errors. The input must return only non-recoverable errors. The latter case is quite unlikely for most inputs, I guess. If we see this pattern more often, we might want to unify it by providing some helpers, an input manager wrapper, or simply a 'setting' on the cursor.InputManager to always rerun on error.

@urso

urso commented Oct 7, 2020

Change LGTM.

+1 for making the interval configurable. Does the poller error on every internal intermediate error? If not, how about naming it restart_interval?

@adriansr
Contributor Author

adriansr commented Oct 9, 2020

No, it only errors for authentication errors. Originally this was a fatal error so that the Beat wouldn't start with a bad configuration. Now it retries, because the auth server can occasionally be down and there's no way to tell a transient error apart from a permanent authentication error.

@adriansr adriansr merged commit 1abe97b into elastic:master Oct 9, 2020
adriansr added a commit to adriansr/beats that referenced this pull request Oct 9, 2020
(cherry picked from commit 1abe97b)
adriansr added a commit to adriansr/beats that referenced this pull request Oct 9, 2020
(cherry picked from commit 1abe97b)
@adriansr adriansr added the v7.10.0 label and removed the needs_backport label Oct 9, 2020
adriansr added a commit to adriansr/beats that referenced this pull request Oct 9, 2020
(cherry picked from commit 1abe97b)
@adriansr adriansr added the v7.9.3 label Oct 9, 2020
v1v added a commit to v1v/beats that referenced this pull request Oct 13, 2020
* upstream/master: (127 commits)
  Update obs app links (elastic#21682)
  fix: update fleet test suite name (elastic#21738)
  Remove dot from file.extension value in Auditbeat FIM (elastic#21644)
  Fix leaks with metadata processors (elastic#16349)
  Add istiod metricset (elastic#21519)
  [Ingest Manager] Change Sync/Close call order (elastic#21735)
  [Ingest Manager] Syncing unpacked files (elastic#21706)
  Fix concurrent map read and write in socket dataset (elastic#21690)
  Fix conditional coding to remove seccomp info from Winlogbeat (elastic#21652)
  [Elastic Agent] Fix issue where inputs without processors defined would panic (elastic#21628)
  Add configuration of filestream input (elastic#21565)
  libbeat/logp: introduce Logger.WithOptions (elastic#21671)
  Make o365audit input cancellable (elastic#21647)
  fix: remove extra curly brace in script (elastic#21692)
  [Winlogbeat] Remove brittle configuration validation from wineventlog (elastic#21593)
  Fix function that parses from/to/contact headers (elastic#21672)
  [CI] Support Windows-2016 in pipeline 2.0 (elastic#21337)
  Skip publisher flaky tests (elastic#21657)
  backport: add 7.10 branch (elastic#21635)
  [CI: Packaging] fix: push ubi8 images too (elastic#21621)
  ...
adriansr added a commit that referenced this pull request Oct 13, 2020
(cherry picked from commit 1abe97b)
adriansr added a commit that referenced this pull request Oct 13, 2020
(cherry picked from commit 1abe97b)
adriansr added a commit that referenced this pull request Oct 13, 2020
(cherry picked from commit 1abe97b)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
(cherry picked from commit f2ab428)