Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add graceful handling of malformed persisted service/check files. #3821

Merged
merged 2 commits into from
Jan 22, 2018

Conversation

kyhavlov
Copy link
Contributor

Previously a change was made to make the file writing atomic,
but that wasn't enough to cover something like an OS crash so we
needed something here to handle the situation more gracefully.

Fixes #1221.

Previously a change was made to make the file writing atomic,
but that wasn't enough to cover something like an OS crash so we
needed something here to handle the situation more gracefully.

Fixes #1221.
@kyhavlov kyhavlov requested a review from slackpad January 19, 2018 22:10
agent/agent.go Outdated
// it for convenience and log about it.
if len(buf) == 0 {
a.logger.Printf("[WARN] Removing leftover empty service file %q", file)
if err := os.Remove(file); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should attempt to delete the file, just warn and skip it. Leaving it around might help operators see what's going on by looking at timestamps, etc.

agent/agent.go Outdated
// Decode the check
var p persistedCheck
if err := json.Unmarshal(buf, &p); err != nil {
return fmt.Errorf("Failed decoding check file %q: %s", file, err)
a.logger.Printf("[WARN] Failed decoding check file %q: %s", file, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was torn on this one, but since these are internal to Consul then we ought to skip these if we skip empty files, since OS crashes could cause this, too. I'd change all the logging in here to ERR level though.

Copy link
Contributor

@slackpad slackpad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kyhavlov kyhavlov merged commit cde1e7c into master Jan 22, 2018
@kyhavlov kyhavlov deleted the persist-file-handling branch January 22, 2018 20:31
@kirankumar1991
Copy link

I have a question regarding this change. I could see from the diff that we're ignoring the check and service files, if they're malformed. In that case when consul starts, how does it know about the ignored checks and services? Is there an expectation that the services & checks should be registered with consul again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants