-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove hardcoded 300s and add error messages when data collection is missing #10142
Conversation
Remove default version qualifier and rename the environment variable to set it from `BEAT_VERSION_QUALIFIER` to `VERSION_QUALIFIER` this will align with other parts of the stack. **Tested with filebeat.** ``` ❯ ./filebeat version [08:39:01] filebeat version 7.0.0 (amd64), libbeat 7.0.0 [0a0c267 built 2018-11-19 13:38:15 +0000 UTC] ``` **Without the patch** ``` ❯ ./filebeat version [08:40:07] filebeat version 7.0.0-alpha1 (amd64), libbeat 7.0.0-alpha1 [b007837 built 2018-11-19 13:39:59 +0000 UTC] ``` Fixes: #8384
jenkins, test this |
@@ -208,12 +222,18 @@ func createCloudWatchEvents(getMetricDataOutput *cloudwatch.GetMetricDataOutput, | |||
mapOfMetricSetFieldResults[metricKey[0]] = fmt.Sprint(output.Values[0]) | |||
} | |||
|
|||
if len(mapOfMetricSetFieldResults) <= 3 { | |||
errMsg := "Missing Cloudwatch data for instance " + instanceID + ". Please recheck the period setting in config." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we log this every time or only once? Also I wonder on what level we should log it as it's bascially an expected error. For a new instances it's expected to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm logging this every time for each instance right now because this should only happen on new instance or when the period is set to a number that's not multiple of 300s/60s. But yes you are right, log level probably should be warning than error hmm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try to away Warning log level as it's not clear if things need action or not. I would suggest Info for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good to know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor comments / questions.
machineType, err := instanceOutput.InstanceType.MarshalValue() | ||
if err != nil { | ||
err = errors.Wrap(err, "instance.InstanceType.MarshalValue failed") | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To have this cleaner and more testable we createCloudWatchEvents should probably return and error and logging happens outside if needed. But let's get this PR in and we can still clean it up later.
err = errors.Wrap(err, "Error trying to apply schema in AWS EC2 metricbeat module.") | ||
logp.Error(err) | ||
err = errors.Wrap(err, "Error trying to apply schema schemaRootFields in AWS EC2 metricbeat module.") | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested with 2 EC2 instances. Works as expected.
In this PR, I added checkInstanceStatus function to check if the EC2 instance state is "running" and if instance launch time is earlier than 10 minutes ago. When a EC2 instance first started, it takes some time to fully start running and then it also take some time for Cloudwatch to start collecting EC2 monitoring metrics. This PR also addressed some comments from #9257