-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add raw mysql fields to the mysql event #3001
Conversation
+1 for I guess it would be nicer to have this in it's own metricset. |
@tsg I did some tests with key names in elasticsearch and also spaces and lots of special characters just work. That means also for MongoDB we can use the original keys. In case one of the services uses dots we then have to decide if these should be replaced by _ to prevent potential conflicts. I would prefer to have the "raw" fields inside the same metricset as it is like an extension of the metricset itself and only one config option is need to enable the additional fields. Overtime I think more and more fields will be moving from the raw to the structured document. Implementation wise I would probably make |
b997208
to
d7237c0
Compare
Adding both the curated and raw fields seems like it creates a lot of duplicated data. As a user I think I would only want to store one copy of the data. Maybe we could remove fields that are in the schema from the raw fields. This means the module can send data for fields that are unknown a priori, and we can still provide an optimized experience for the fields that are known (like normalizing storage units to bytes so they render nicely in Kibana or using scaled_floats in mappings for percentages). +1 for lower-casing raw fields |
Agree that we should not duplicate fields. It would be nice if we could use apply.Schema() to remove the fields which were actually found so we would only iterate over the existing fields afterwards. |
I was trying to think through a likely user scenario:
To me it sounds like we should not care about duplication but rather allow to cherry pick raw metrics. I'm not sure how to do that from the usability point of view, especially how to find out which metrics are available. |
I think part of the joy of Metricbeat is that it's easy so I'm with you on the sadness of that user scenario! I think this usage pattern might not be the most common so maybe a phased approach here would be good. Maybe we do 1 today, 2 later?
|
@tsg I think your use case will describe it pretty well. Our "standard" metricset should cover 90-95% of the uses cases. More experience users will look for additional fields. Based on the unhappiness in point 4 I hope 5 happens: Open Github issue or PR to add the missing metric. The hard part will be where do we say "no" and not add it to the base metricset? If someone only wants to add one metric I would hope we can solve this with filters:
Otherwise a whitelist approach as proposed by @brandonmensing could also be interesting. But both approaches should probably only be added in a second step when we see how / if people are going to use it. BTW: This reminds me of the first metricbeat implementation where there was a base, extended and full metricset instead of the filters. |
2f7537d
to
c6ee4d5
Compare
@@ -57,6 +57,7 @@ type MetricSet interface { | |||
Host() string // Host returns a hostname or other module specific value | |||
// that identifies a specific host or service instance from which to collect | |||
// metrics. | |||
Raw() bool // Raw defines if raw metrics should be sent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to avoid adding new methods to this interface were possible. In this case, when you need this information it is available in the MetricSet by m.Module().Config().Raw
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, changed.
@@ -111,6 +118,7 @@ type ModuleConfig struct { | |||
MetricSets []string `config:"metricsets" validate:"required"` | |||
Enabled bool `config:"enabled"` | |||
Filters processors.PluginConfig `config:"filters"` | |||
Raw bool `config:"raw"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to also add a godoc comment explaining the meaning of this field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
_, exists := rawData["Threads_cached"] | ||
assert.False(t, exists) | ||
|
||
// Check random oder raw field if available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Check a raw field if it is available
@@ -111,6 +118,7 @@ type ModuleConfig struct { | |||
MetricSets []string `config:"metricsets" validate:"required"` | |||
Enabled bool `config:"enabled"` | |||
Filters processors.PluginConfig `config:"filters"` | |||
Raw bool `config:"raw"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to also add a godoc comment explaining the meaning of this field.
assert.True(t, schema.HasKey("test_b")) | ||
assert.True(t, schema.HasKey("test_a")) | ||
assert.False(t, schema.HasKey("nooo")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think HasKey
should return true for both the "source" keys and the "output" keys. Here's the behavior I expected which returns true only for the "source" keys.
func TestHasKey(t *testing.T) {
schema := Schema{
"test": Conv{Key: "Test", Func: nop},
"test_obj": Object{
"test_a": Conv{Key: "TestA", Func: nop},
"test_b": Conv{Key: "TestB", Func: nop},
},
}
assert.True(t, schema.HasKey("Test"))
assert.True(t, schema.HasKey("TestA"))
assert.True(t, schema.HasKey("TestB"))
assert.False(t, schema.HasKey("test"))
assert.False(t, schema.HasKey("test_obj"))
assert.False(t, schema.HasKey("test_a"))
assert.False(t, schema.HasKey("test_b"))
assert.False(t, schema.HasKey("other"))
}
which would mean that hasKey()
would be:
func hasKey(key string, mappers map[string]Mapper) bool {
for _, mapper := range mappers {
if mapper.HasKey(key) {
return true
}
}
return false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
obviously you are right -> wrong test leads to wrong implementation 👎
c6ee4d5
to
0ef4a3f
Compare
@andrewkroh New version pushed. The current config option is generic. Even though I currently only documented it in the status metricset as this is the only place available. As soon as we add it to other metricsets, we should have a general documentation. |
@@ -32,6 +32,7 @@ type MetricSet struct { | |||
mb.BaseMetricSet | |||
dsn string | |||
db *sql.DB | |||
raw bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
@@ -81,6 +81,7 @@ type BaseMetricSet struct { | |||
name string | |||
module Module | |||
host string | |||
raw bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
Each metricset can add raw fields under the `raw` namespace in addition to the existing fields. The raw fields should contain the raw data coming the service itself. Fields are not changed in any way. As a first example, raw fields were added to mysql status. Remove unused raw variables
71cda3d
to
c02a9a9
Compare
This will include all mysql stats into the event under
mysql.stats.raw
. All keys will change to lower case.Discussions:
raw
,extended
_
by.
?a_b
anda_b_c
has a value (happens in mysql)Potential improvements:
TODO: