-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ml_job metricset to Elasticsearch module #7196
Conversation
jobPath = "/_xpack/ml/anomaly_detectors/_all/_stats" | ||
) | ||
|
||
type MetricSet struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type MetricSet should have comment or be unexported
@@ -0,0 +1,46 @@ | |||
package ml_job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use an underscore in package name
@@ -0,0 +1,50 @@ | |||
package ml_job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use an underscore in package name
This PR depends on the release of ES 6.3. |
cb784cc
to
b3bb590
Compare
"time_field":"timestamp", | ||
"time_format": "epoch_ms" | ||
} | ||
}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation? And/or move this to a file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will use the file that I already created for the system tests ...
@@ -48,7 +52,14 @@ var metricSets = []string{ | |||
func TestFetch(t *testing.T) { | |||
compose.EnsureUp(t, "elasticsearch") | |||
|
|||
err := createIndex(getEnvHost() + ":" + getEnvPort()) | |||
host := getEnvHost() + ":" + getEnvPort() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
net.JoinHostPort()
?
r.Error(err) | ||
return []error{err} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return error also if jobsData.Jobs
is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say in case it's empty no error should be returned.
try: | ||
es.transport.perform_request('POST', "/_xpack/license/start_trial?acknowledge=true") | ||
except: | ||
print "Trial already enabled" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could print the error instead, in case it is caused by other reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if response["count"] > 0: | ||
return | ||
|
||
file = os.path.join(self.beat_path, "module", "elasticsearch", "ml_job", "_meta", "test", "test_job.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use this file also for the integration test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
"total_partition_field_count": 0, | ||
"bucket_allocation_failures_count": 0, | ||
"memory_status": "ok", | ||
"log_time": 1525847933184 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to collect more metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This PR mainly creates the foundation.
|
||
// Not master, no event sent | ||
if !isMaster { | ||
logp.Debug("elasticsearch", "Trying to fetch index summary stats from a none master node.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling: "none master" should be "non-master"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, "index summary stats" should probably be "machine learning job stats".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both fixed.
|
||
content, err := m.HTTP.FetchContent() | ||
if err != nil { | ||
r.Error(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When X-Pack isn't installed in ES, this if
condition becomes true because a 400 error code is returned by the call to the /_xpack/ml/anomaly_detectors/_all/_stats
API. This then results in error documents being indexed in the metricbeat index that look like this:
{
"@timestamp":"2018-07-05T16:35:40.258Z",
"metricset":{
"rtt":6264,
"namespace":"elasticsearch.ml.job",
"name":"ml_job",
"module":"elasticsearch",
"host":"localhost:9200"
},
"error":{
"message":"HTTP error 400 in ml_job: 400 Bad Request"
},
"beat":{
"version":"7.0.0-alpha1",
"name":"Shaunaks-MBP-2",
"hostname":"Shaunaks-MBP-2"
},
"host":{
"name":"Shaunaks-MBP-2"
}
}
Just want to confirm that this is the behavior we want? Another option could be to log an error (but not index anything) if this API returns 400?
[EDIT] Obviously, whatever approach we take here should be consistent with all X-Pack APIs/metricsets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, could we somehow make use of the xpack
config setting here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone enables the ml_job metricset and X-Pack is not enabled I think an error document should be returned. What we probably should improve is the error message above and have something inside that it is not enabled.
I think our xpack
config is not related to this.
"state": c.Str("state"), | ||
"data_counts": c.Dict("data_counts", s.Schema{ | ||
"processed_record_count": c.Int("processed_record_count"), | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth including invalid_date_count
as well. From https://www.elastic.co/guide/en/elasticsearch/reference/6.2/ml-jobstats.html#ml-datacounts:
invalid_date_count
(long) The number of records with either a missing date field or a date that could not be parsed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. In general I think we will have to add more fields but I also would like to involve the ML team here in a follow up. I don't think we should send all the stats but only the ones that are most relevant which is a tricky question to answer.
@@ -0,0 +1,3 @@ | |||
=== elasticsearch ml MetricSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this say ml_job
instead of ml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
} | ||
|
||
// createIndex creates and elasticsearch index in case it does not exit yet | ||
func enableLicense(host string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Name this enableTrialLicense
just to be more specific about what it actually does?
Jobs []map[string]interface{} `json:"jobs"` | ||
} | ||
|
||
func eventsMapping(r mb.ReporterV2, content []byte) []error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the return value is being ignored anyway so my following comment my be moot:
Looking at other metricsets in the elasticsearch module, it looks like their eventsMapping
functions return error
instead of []error
. Further, to build this error
return object they use var errs multierror.Errors
to collect multiple errors and then return errs.Err()
at the end of the function. For consistency maybe the eventsMapping
function here should do the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, need to change it. This changed recently with the schema change from Jaime. Will adapt it accordingly.
This adds a basic ml_job metricset to the Elasticsearch module. The testing was enhanced to enabled to also test non x-pack Basic features by enabling the trial license as part of the test run.
@@ -191,7 +180,7 @@ func createMLJob(host string) error { | |||
return nil | |||
} | |||
|
|||
req, err := http.NewRequest("PUT", "http://"+host+jobURL, strings.NewReader(mlJob)) | |||
req, err := http.NewRequest("PUT", "http://"+host+jobURL, bytes.NewReader(mlJob)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more of question for my learning, not a review suggestion: why is bytes.NewReader
more appropriate here than strings.NewReader
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main difference is that one takes a []byte
array and the other on a string
as param. ReadFile above returns a []byte
array and previously it was a string (mlJob var).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
(I asked a question but it's incidental to the review).
This adds a basic ml_job metricset to the Elasticsearch module.
The testing was enhanced to enabled to also test non x-pack Basic features by enabling the trial license as part of the test run.