[Suricata] Update fields and paths #8550
Conversation
```diff
           , "ignore_missing": true
           }
         }
       , { "geoip":
           { "field": "suricata.eve.src_ip"
-          , "target_field": "ecs.source.geo"
+          , "target_field": "suricata.eve.src_geo"
```
ECS doesn't yet have a `source.geo` or a `destination.geo`, but it will. So for master, where we don't have the `source` field conflict, we should put the geo data under those fields.
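A minimal sketch of what that could look like on master, assuming `source.geo` once ECS defines it (the field name is the thread's assumption, not yet part of ECS at the time of this discussion):

```json
{ "geoip":
    { "field": "suricata.eve.src_ip"
    , "target_field": "source.geo"
    , "ignore_missing": true
    }
}
```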
Do you already have a plan for dealing with the ECS `source` conflict with the Filebeat `source` field for 6.x? Perhaps we could just rename all uses of the ECS `source` field to `source_ecs` when we backport.
My idea was to have this branch directly mergeable into 6.x to speed things up, so no `source` usage yet. I thought it made more sense not to create an intermediate field (i.e. `source_ecs`) just to have it replaced with `source` soon. So I'm leaving this under `suricata.eve` for now, and will keep those fields for 7.x too, while also adding `source`. Does that make sense?
I follow your thinking. My idea with `source_ecs` is that this would be the only difference between the module in 6.x vs 7.x. My hope is that this would make forward upgrades easier for early adopters of the module. We could even warn users in advance that we will definitely be breaking the `source_ecs` field in 7.x by renaming it to `source`.

We can't guarantee that we won't break other fields, but if users build out any security content (dashboards, alerts) it will be easier for them to update it. I guess this will even make it easier for us with the dashboards we create for 6.x.
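If we went that route, the 7.x migration could in principle be a single ingest `rename` at the top of the pipeline. This is only a sketch, untested in the thread, and assumes the whole `source_ecs` object can be renamed in one step:

```json
{ "rename":
    { "field": "source_ecs"
    , "target_field": "source"
    , "ignore_missing": true
    }
}
```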
```diff
-          { "field": "ecs.user_agent.os_minor"
-          , "target_field": "ecs.user_agent.os.minor"
+          { "field": "user_agent.os_minor"
+          , "target_field": "user_agent.os.minor"
           , "ignore_missing": true
           }
         }
       , { "geoip":
           { "field": "suricata.eve.src_ip"
```
It looks like we should "promote" this up to `source.ip`.

Same answer as above: I was refraining from using `source`.
Force-pushed from 91cb8f1 to 8c4c4c0.
It's looking good overall, but I have a few questions I've added here and there.
```json
, { "set":
    { "field": "http.request.method"
    , "value": "{{suricata.eve.http.http_method}}"
```
The `set` operations I had done initially were on fields that were always present. How does it behave for missing fields? Does it create an empty `http.request.method` field? If that's the case, we must find a way that doesn't create empty fields, because the mappings you just added are for `http` events, but you still have `dns`, `tls`, `ftp`, `smtp` and a whole bunch more to map. We can't end up with all of them defined (and empty) on each event :-)
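For what it's worth, later Elasticsearch versions (6.5+) added a per-processor `if` condition, which would sidestep the empty-field problem directly. A sketch, assuming that option is available in the target ES version:

```json
, { "set":
    { "if": "ctx.suricata?.eve?.http?.http_method != null"
    , "field": "http.request.method"
    , "value": "{{suricata.eve.http.http_method}}"
    }
  }
```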
All of these mappings remind me that we should likely add a new processor that can do mass renames/copies with light type coercion (e.g. string to int or float, mostly). I could have sworn I had opened an issue for this, because this is very painful at the moment.

I wonder whether Beats or Ingest Node would be a better place for this. Perhaps Ingest Node? @adriansr @andrewkroh @tsg What do you think?
(I'm having some GitHub issues with my messages disappearing / showing up later; I hope you can follow the discussion.)
```json
{ "lang": "painless"
, "source": "if (ctx.suricata.eve.src_ip != null) { if (ctx.source_ecs == null) ctx.source_ecs = new HashMap(); ctx.source_ecs.ip = ctx.suricata.eve.src_ip; }"
}
}
```
@webmat what do you think of this way to conditionally set a field only when the source field exists? I'm not familiar with ingest pipelines or Painless.
Also, it looks like we have the option (not tested) of using the `convert` processor: set `type` to the original type and use `ignore_missing` and `target_field`. It's hacky, but slightly cleaner than the Painless above.
There's a lie hidden in the name "Painless". I cannot imagine myself doing this for 50+ fields, and that's just Suricata...

I haven't looked at the `convert` processor; I'll go have a look.

But I do think we need a more declarative and compact way to do this. Imagine this was an Ingest Node plugin:
```json
{ "normalize":
  [ { "source": "suricata.eve.event_type", "target": "event.type", "error_missing": true },
    { "source": "suricata.eve.http.status", "target": "http.response.status_code", "conversion": "integer" },
    { ... }
  ]
},
```
By default `error_missing = false`, so the long list of fields we may have to convert would not need to specify it. But we could still force an error to be added to the event when really critical fields are missing, as I'm showing in the first entry.

This example is loosely based on a plugin I proposed adding to Logstash (elastic/logstash#9768) for Suricata, back when I started it as an LS module. I didn't keep pushing it when we moved Suricata to Beats, so it's still in the idea phase for LS.
Yeah, `convert` could work, and it's not that hacky. But I feel that having one processor definition per field is pretty heavy. Plus the default is not to ignore missing fields, which makes the pipeline definition heavier still. And since it's specifically tailored to convert, the `type` field is mandatory, when in this case I'd say 80%+ of the fields are straight renames that should remain strings 😂

But I'd say let's go this way for now; it's the simplest approach, I think. The important part is not to end up with tons of empty fields :-)
```json
{ "convert":
  { "field": "suricata.eve.event_type",
    "target_field": "event.type",
    "type": "string"
  }
},
{ "convert":
  { "field": "suricata.eve.http.status",
    "target_field": "http.response.status_code",
    "type": "integer",
    "ignore_missing": true
  }
},
...
```
Bad news, conversion doesn't seem to work:

```
Error loading pipeline: Error loading pipeline for fileset suricata/eve: couldn't load pipeline: couldn't load json. Error: 400 Bad Request:
{"error":{"root_cause":[{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"suppressed":[{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [text] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}}]}],"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"status":400}
```
My mistake, I was using Elasticsearch types when it expects JSON types. It works.
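In other words, `type` must be one of the `convert` processor's own values (`integer`, `long`, `float`, `double`, `string`, `boolean`, `auto`), not Elasticsearch mapping types like `keyword`, `text`, or `ip`. A corrected sketch for one of the failing fields, with field names taken from the thread:

```json
{ "convert":
    { "field": "suricata.eve.src_ip"
    , "target_field": "source_ecs.ip"
    , "type": "string"
    , "ignore_missing": true
    }
}
```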
```json
{ "lang": "painless"
, "source": "void createAndSet(def ctx, List components, def value) { int n = components.getLength(); try { def map = ctx; for (int i = 0; i < n - 1; i++) { map = map.computeIfAbsent(components.get(i), x -> new HashMap()); } map[components.get(n-1)] = value; } catch (Exception e) { } } def getValue(def ctx, List components) { int n = components.getLength(); try { def map = ctx; for (int i = 0; i < n - 1; i++) { map = map.get(components.get(i)); if (map == null) return null; } return map[components.get(n-1)]; } catch (Exception e) { return null; } } List split(String path) { List res = new ArrayList(); int pos = 0; int next = -1; while ( (next=path.indexOf('.',pos))>=0) { if (pos < next - 1) { res.add(path.substring(pos, next)); } pos = next + 1; } if (pos < path.length()) { res.add(path.substring(pos)); } return res; } void copyIfPresent(def ctx, String source, String dest) { def val = getValue(ctx, split(source)); if (val != null) { createAndSet(ctx, split(dest), val); } } int n = params.list.getLength(); for (int i=0; i<n; i+=2) { copyIfPresent(ctx, params.list.get(i), params.list.get(i+1)); }"
, "params": {
    "list": [
```
This ugly Painless does the job of creating an ECS copy of a field only if it exists in the event. Too bad it can't be formatted or documented properly, but at least I've been able to parametrize the field names.

@webmat WDYT?
Now that I see that `convert` works, I guess `convert` is clearer than this ugly script?
Alternatively, `rename` on the Beats side could be used.
I'd go for Ingest Node `convert`. If we want to keep the original Suricata object around, we can't use Beats `rename`.
LGTM

@webmat @andrewkroh I've pushed some fixes, please have a look.
Some of the fields being copied over to the ECS field names are not being used everywhere. Examples: both the IP and port of the source and destination in both of the saved searches, where we're still using the `suricata.eve.*` fields.

Some well-defined ECS fields are not being copied to their ECS location from the original Suricata object. However, I think that's OK for now: the massive amount of copying required to move most fields will be easier once we have a better way than the current field-by-field approach with the `convert` processor. So we'll need to revisit anyway.

Ingestion and dashboards all look fine to me otherwise. So all in all I'm good with this as is, since we'll need to revisit it later. I'd rather use the ECS IP & port fields in both saved searches, but since we need to revisit anyway, perhaps it doesn't matter; you can fix those now or keep them for later, and I'm good with both approaches.

I've left a few comments here and there, but since I'm not sure of the team's usual approach, I'm not marking them as blockers for this PR.
```json
{"field": "suricata.eve.http.http_user_agent"
,"target_field": "user_agent.original"
,"type": "string"
,"ignore_missing": true
```
The previous version was more compact (I hadn't noticed the params initially), but the single line of Painless was inscrutable. I much prefer this, even though it's more verbose; it's really hard to make a programming error in this JSON :-)

Plus, when we get a massive rename/coerce processor in Ingest Node, we can make all of this more compact again.
```diff
@@ -6,6 +6,54 @@
       - name: event_type
         type: keyword

+      - name: vars
```
I think trying to define the booleans under `vars` is a losing game. Isn't this super arbitrary, depending on the kinds of messages we're getting? And actually, are they detected as booleans without this, or are they interpreted as strings? Does the JSON contain `"true"` or `true`?
Yes, they are `true` (not a string).

These field definitions come from a generator, so it outputs all the fields it finds in the provided eve.json.
OK, I'm happy you didn't have to write them all by hand, then :-)

This doesn't change the fact that I get the sense these fields are very message-specific. If we didn't happen to generate messages containing other fields from our two respective computers, then we're missing an unknown number of fields here. It's in that sense that I'd rather not declare them at all.

But perhaps this is just my OCD. Perhaps nobody will notice and this is fine.
```diff
       - name: software_version
         type: keyword

+      - name: stats
```
I'd be inclined to skip all of the integers under `stats`. They're already numeric in the JSON logs, and will therefore be autodetected as `long`.

Do we have a policy of trying to define as many fields as possible in modules (even when it's basically a no-op)? I understand that if the defaults change this could cause problems, but this seems like a lot of overhead for something that doesn't change the outcome.
My understanding is that if they are removed from the template, the fields won't be available in Kibana, e.g. for dashboards?
That's incorrect. This just declares these field types explicitly; they will all end up in ES the same whether this is here or not, and Kibana will let you use them in dashboards etc. regardless. If they are in an Elasticsearch index, you can use them in Kibana; there's no other dependency.
As we just discussed, we'll keep `stats` and get rid of `vars`. Keeping `stats` around ensures the Kibana index pattern is pre-populated with the numeric stats, which could eventually be used in visualizations we ship with the module. So it makes sense to pre-populate those via the index template.

Ingest pipeline updated accordingly.
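To illustrate why keeping `stats` matters, here is a rough sketch of the index-template mapping fragment those fields.yml declarations translate to. The specific counter names (`uptime`, `kernel_packets`, `kernel_drops`) are illustrative examples of Suricata stats counters, not taken from this PR:

```json
{ "suricata":
    { "properties":
        { "eve":
            { "properties":
                { "stats":
                    { "properties":
                        { "uptime": { "type": "long" }
                        , "capture":
                            { "properties":
                                { "kernel_packets": { "type": "long" }
                                , "kernel_drops": { "type": "long" }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```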
These come from a somewhat boring eve.json
Some painless magic
Force-pushed from 3db862b to 60c6532.
Force-pushed from 03e97f6 to dd0d4e9.
This PR updates the suricata module to: