Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VAULT-24013: Audit regression attempting to recover from panic #25605

Merged
merged 12 commits into from
Feb 26, 2024

Conversation

peteski22
Copy link

@peteski22 peteski22 commented Feb 23, 2024

This PR fixes a regression in recovering Vault when audit related code panics. It appeared with the introduction of the go-eventlogger.

The changes involve continuing to handle the metrics updates in the audit broker, but making an entry formatter node responsible for handling panics that occur during formatting of the audit entry.

Addresses: #16462

@peteski22 peteski22 added core/audit hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed labels Feb 23, 2024
Copy link

github-actions bot commented Feb 23, 2024

CI Results:
All Go tests succeeded! ✅

@peteski22 peteski22 marked this pull request as ready for review February 23, 2024 16:40
@peteski22 peteski22 added this to the 1.16.0 milestone Feb 23, 2024
Copy link

github-actions bot commented Feb 23, 2024

Build Results:
All builds succeeded! ✅

Copy link
Contributor

@VioletHynes VioletHynes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! I'm not the biggest expert in this area but from what I know, it looks like it fixes the bug, and I like the addition of the regression test.

cfg, err := NewFormatterConfig()
require.NoError(t, err)
ss := newStaticSalt(t)
formatter, err := NewEntryFormatter("juan", cfg, ss, hclog.NewNullLogger())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why juan? It's just dummy data, but there might be opportunity to use something more descriptive

audit/entry_formatter_test.go Outdated Show resolved Hide resolved
// Handle panics
defer func() {
r := recover()
if r == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nit, but the following seems to be a much more common pattern, so it might make sense for us to align to that one

if r := recover(); r != nil {
   //logging, assigning erroring, etc
}

Copy link
Contributor

@kubawi kubawi Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because r is used below in the call to the logger (on line 125).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems to be the case in the other ones I've seen, like this one in audit_broker.go, though maybe I'm totally misunderstanding what you mean :P :

		if r := recover(); r != nil {
			a.logger.Error("panic during logging", "request_path", in.Request.Path, "error", r, "stacktrace", string(debug.Stack()))
			retErr = multierror.Append(retErr, fmt.Errorf("panic generating audit log"))
		}

Copy link
Contributor

@kubawi kubawi Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope that's my bad. I totally missed that your original suggestion involved handling logging and errors inside the if statement. Soz for the noise :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also on me for not posting an example in the initial comment, so no worries at all :D

Copy link
Contributor

@kubawi kubawi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@peteski22 peteski22 added the bug Used to indicate a potential bug label Feb 23, 2024
@peteski22 peteski22 merged commit 67c1634 into main Feb 26, 2024
86 checks passed
@peteski22 peteski22 deleted the peteski22/VAULT-24013/audit/panic branch February 26, 2024 10:33
peteski22 pushed a commit that referenced this pull request Feb 26, 2024
peteski22 pushed a commit that referenced this pull request Feb 26, 2024
* PR based on #25605

* changelog

* fix corehelpers
kiannaquach pushed a commit that referenced this pull request Feb 26, 2024
* Add Logger to BackendConfig

* EntryFormatter use logger and recover panics

* Added TODO to consider

* Add 'name' to entry formatter

* Add test for the panic

* Fix NoopAudit with update params

* emit counter metric even when 0

* Fix vault package tests

* changelog

* Remove old comment during test writing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Used to indicate a potential bug core/audit hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants