Conditional log level for api key read #1946

Merged
merged 12 commits into from
Oct 6, 2022
32 changes: 28 additions & 4 deletions internal/pkg/api/handleAck.go
@@ -20,6 +20,7 @@ import (
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"

"github.com/elastic/fleet-server/v7/internal/pkg/apikey"
"github.com/elastic/fleet-server/v7/internal/pkg/bulk"
"github.com/elastic/fleet-server/v7/internal/pkg/cache"
"github.com/elastic/fleet-server/v7/internal/pkg/config"
@@ -364,10 +365,18 @@ func (ack *AckT) updateAPIKey(ctx context.Context,
	if apiKeyID != "" {
		res, err := ack.bulk.APIKeyRead(ctx, apiKeyID, true)
		if err != nil {
			zlog.Error().
				Err(err).
				Str(LogAPIKeyID, apiKeyID).
				Msg("Failed to read API Key roles")
			if ack.isAPIKeyReadError(ctx, zlog, agentID, err) {
Member

A possible concern here: for each agent, for each error, and for each output in the policy,

for _, output := range agent.Outputs {

this could now make an additional request to Elasticsearch (waiting on bulk flushes). Is this a correct understanding?

Contributor Author

Yes, that's a good point.
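One way to picture the concern raised in this thread is the minimal sketch below. It is not what the PR does: `lazyActive` and `countLookups` are hypothetical helpers, and the closure stands in for the agent lookup that would hit Elasticsearch. The sketch assumes every output's API key read fails, and compares a lookup per failure against a lookup memoized for the duration of one ack.

```go
package main

import "fmt"

// lazyActive is a hypothetical helper: it wraps an agent lookup so the
// underlying call runs at most once, and later calls reuse the result.
// It is not safe for concurrent use; this sketch is single-goroutine.
func lazyActive(lookup func() (bool, error)) func() (bool, error) {
	var (
		done   bool
		active bool
		err    error
	)
	return func() (bool, error) {
		if !done {
			active, err = lookup()
			done = true
		}
		return active, err
	}
}

// countLookups simulates one ack where the API key read fails for every
// output, and returns how many agent lookups were actually performed.
func countLookups(outputs []string, cached bool) int {
	calls := 0
	// Stand-in for the agent lookup; in the PR this would be a request
	// to Elasticsearch.
	lookup := func() (bool, error) {
		calls++
		return false, nil
	}
	if cached {
		lookup = lazyActive(lookup)
	}
	for range outputs {
		// Assume the read failed for this output, so the agent is checked.
		_, _ = lookup()
	}
	return calls
}

func main() {
	outputs := []string{"default", "monitoring", "remote"}
	fmt.Println("lookups without caching:", countLookups(outputs, false))
	fmt.Println("lookups with caching:", countLookups(outputs, true))
}
```

With three failing outputs, the uncached variant performs one lookup per output while the memoized variant performs one in total. Whether a per-ack cache is appropriate depends on how stale an agent's active flag is allowed to be.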

				zlog.Error().
					Err(err).
					Str(LogAPIKeyID, apiKeyID).
					Msg("Failed to read API Key roles")
			} else {
				// not an error, race when API key was invalidated before acking
				zlog.Info().
					Err(err).
					Str(LogAPIKeyID, apiKeyID).
					Msg("Failed to read invalidated API Key roles")
			}
		} else {
			clean, removedRolesCount, err := cleanRoles(res.RoleDescriptors)
			if err != nil {
@@ -513,6 +522,21 @@ func (ack *AckT) handleUpgrade(ctx context.Context, zlog zerolog.Logger, agent *
	return nil
}

func (ack *AckT) isAPIKeyReadError(ctx context.Context, zlog zerolog.Logger, agentID string, err error) bool {
	if !errors.Is(err, apikey.ErrAPIKeyNotFound) {
		return false
	}
	agent, err := dl.FindAgent(ctx, ack.bulk, dl.QueryAgentByID, dl.FieldID, agentID)
	if err != nil {
		zlog.Warn().
Member

Why just warn? Isn't it a real error?

Contributor Author

Yes, we can treat it as an error.

			Err(err).
			Msg("failed to find agent by ID")
		return true
	}

	return agent.Active // it is a valid error in case agent is active (was not invalidated)
}
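
The decision made by `isAPIKeyReadError` can be exercised in isolation with a small standalone sketch. It mirrors the branches of the function as shown in this diff, using only the standard library. `ErrAPIKeyNotFound`, `errTimeout`, `agentLookup`, and `level` are stand-ins introduced for illustration, not the fleet-server types.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAPIKeyNotFound is a stand-in for apikey.ErrAPIKeyNotFound.
var ErrAPIKeyNotFound = errors.New("api key not found")

// errTimeout is a hypothetical unrelated failure, used for comparison.
var errTimeout = errors.New("timeout")

// agentLookup is a stand-in for dl.FindAgent: it reports whether the
// agent is active, or an error if the lookup itself failed.
type agentLookup func(agentID string) (active bool, err error)

// isAPIKeyReadError mirrors the branches of the function in the diff.
func isAPIKeyReadError(readErr error, agentID string, find agentLookup) bool {
	if !errors.Is(readErr, ErrAPIKeyNotFound) {
		return false
	}
	active, err := find(agentID)
	if err != nil {
		// The lookup failed, so the read failure is kept as an error.
		return true
	}
	// An active agent should still have a valid key.
	return active
}

// level maps the result to the log level chosen by the caller in the diff.
func level(isErr bool) string {
	if isErr {
		return "error"
	}
	return "info"
}

func main() {
	active := func(string) (bool, error) { return true, nil }
	inactive := func(string) (bool, error) { return false, nil }
	// errors.Is also matches a wrapped not-found error.
	notFound := fmt.Errorf("read roles: %w", ErrAPIKeyNotFound)

	fmt.Println("not found, active agent:  ", level(isAPIKeyReadError(notFound, "a1", active)))
	fmt.Println("not found, inactive agent:", level(isAPIKeyReadError(notFound, "a1", inactive)))
	fmt.Println("other error:              ", level(isAPIKeyReadError(errTimeout, "a1", active)))
}
```

Note that, following the first branch as written in the diff, any error other than not-found returns `false`, so the caller logs it at info level.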

// Generate an update script that validates that the policy_id
// has not changed underneath us by an upstream process (Kibana or otherwise).
// We have a race condition where a user could have assigned a new policy to