-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setup: Add default pipeline to templates #14001
Conversation
"set": common.MapStr{ | ||
"field": "event.ingested", | ||
"value": "{{_ingest.timestamp}}", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool. I wonder if event.ingested
should be in ECS. CC @MikePaquette @webmat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already on it: elastic/ecs#582 :-)
version65, _ := common.NewVersion("6.5.0") | ||
if config.DefaultPipelineEnabled && !ver.LessThan(version65) { | ||
indexSettings.Put("default_pipeline", config.DefaultPipelineName) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pipeline itself requires the Geoip plugin, which is only bundled into ES since 6.7. Might be worth checking for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense - changing to 6.7.0
. I confirmed it works out of the box with that Elasticsearch version.
Nice work @cwurm! I can't think of anything that might get broken due to this, so I'm supportive to have it enabled by default unless someone else thinks of something that breaks. |
Haven't reviewed in detail, but I love this! BTW we've been tracking ideas of what can be done, with events that are ECS formatted. Check out this issue elastic/ecs#181. You don't have to change anything here, but just in case it sparks more ideas of things general enough to include here eventually :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just commenting on a few operational aspects. I didn't review the code.
Will beat setup --index-management
load this pipeline?
The documentation for installing the index template when using a non-ES output will need to be updated to include how to install the pipeline. We might need an export default-pipeline
sub-command to get a copy of the pipeline JSON.
libbeat/template/config.go
Outdated
@@ -48,5 +53,8 @@ func DefaultConfig() TemplateConfig { | |||
Enabled: true, | |||
Fields: "", | |||
Order: 1, | |||
|
|||
DefaultPipelineEnabled: true, | |||
DefaultPipelineName: "beats_default_pipeline", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pipeline name should have version number in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I like the idea of using the beat's name rather than "beats".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pipeline name should have version number in it.
Good catch, definitely. We'll need it when running multiple versions of Beats.
And I like the idea of using the beat's name rather than "beats".
👍 - now the default pipeline for all Beats would be the same, but in the future this might well diverge.
Yes
Good point, let me implement |
I've implemented Would like to do more testing on this, and need to add it in some places in the docs, but I think the code is ready to take a look at. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is a lovely change! I have only very minor comments on what's here, but please add some unit tests for the libbeat/template
changes :-)
libbeat/template/pipeline.go
Outdated
|
||
// loadIngestPipeline loads an ingest pipeline into Elasticsearch, | ||
// overwriting an existing pipeline if it exists. | ||
// If you wish to not overwrite an existing pipeline then use ingestPipelinExists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: ingestPipelineExists
libbeat/template/pipeline.go
Outdated
return false, nil | ||
} | ||
|
||
status, body, _ := l.client.Request("GET", esIngestPipelinePath+name, "", nil, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please either include the error result here as above, or add a comment why it is skipped in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I included the error
9055be8
to
7aba709
Compare
@faec Thanks, I've addressed your comments and added Let me know what you think. :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Approved
Now that we have a If we change to What do others think? @urso @tsg @andrewkroh @faec? |
If we make it a Other than that, |
7aba709
to
5160f91
Compare
5160f91
to
986f3ac
Compare
986f3ac
to
9ad7b8a
Compare
@tsg @urso @andrewkroh @faec I've updated this to set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good to me. The final_pipeline
concept is very useful. I didn't scrutinize the code, but the pipeline looks good.
|
||
"github.com/elastic/beats/libbeat/common" | ||
"github.com/elastic/beats/libbeat/common/cfgwarn" | ||
commonP "github.com/elastic/beats/libbeat/common/pipeline" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Package names should be lowercase.
APM Server currently has a mechanism to load a set of default pipelines to ES on startup and applying them by default on event ingestion. This current approach does not use index pipelines, therefore it is definitely interesting to investigate how to best deprecate and move over to this new handling. I can look into that in more detail. The pipelines partially overlap with the default pipelines introduced in this PR, but APM Server additionally createa a user-agent pipeline. Not necessarily a requirement for this PR, but have you thought about how a beat could extend the |
if v[idx], err = fixYAMLMaps(value); err != nil { | ||
return nil, err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a default case erroring out, just in case.
@@ -102,6 +103,39 @@ func (l *ESLoader) Load(config TemplateConfig, info beat.Info, fields []byte, mi | |||
return nil | |||
} | |||
|
|||
// LoadFinalPipeline loads the final pipeline. | |||
func (l *ESLoader) LoadFinalPipeline(config TemplateConfig, info beat.Info) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are missing a 'good' package for loading pipelines. Maybe we can introduce it to the idxmgmt package instead of template? The idxmgmt package configures lifecycle.name
and lifecycle.rollover_alias
when creating the TemplateConfig.
@simitt Would this idea fit apm-server needs? If so, do you think we can proceed here, and create a follow up issue to make the final pipeline more extensible? Depending on the capabilities of the pipeline we might need a version check, plus the pipeline contents. One option to make it somewhat extensible would be to pass the default final pipeline to
All setup should rely on this type being passed where it is required. For pipeline handling in general we might want to introduce a package named |
@urso your suggestion generally sounds reasonable to me. Regarding moving forward with this PR - it is introducing some breaking changes to exported interfaces in the |
Hi! We're labeling this issue as |
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
1 similar comment
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
Hi! We're labeling this issue as |
Hi! |
Adds the
default_pipeline
index setting (ES docs) to the templates loaded onsetup
.Enabled by default, Libbeat first loads the ingest pipeline, then adds it to the template and loads the template. The built-in ingest pipeline contains processors for GeoIP and ASN lookups (and renaming to ECS fields) and adds an ingest timestamp. These are especially relevant for the SIEM app, where several widgets (network map, top IP tables, IP details overview) depend on GeoIP and ASN data being present. The ingest timestamp will help with detecting events that arrive much later than they occur or with a wrong timestamp (more details in elastic/ecs#582).
This feature is governed by four settings:
setup.template.default_pipeline.enabled
- if it is enabled at all (defaulttrue
)setup.template.default_pipeline.overwrite
- if an existing pipeline with the same name should be overwrittensetup.template.default_pipeline.name
- allows specifying a custom pipeline name (the default isbeats_default_pipeline
)setup.template.default_pipeline.file
- instead of using the built-in default pipeline users can also load a custom pipeline from a file. I've re-used the unmarshalling logic from filesets so both JSON and YAML files can be used.A few other notes:
beats_default_pipeline
pipeline for all Beats we could also do Beats-specific pipelines (packetbeat_default_pipeline
,auditbeat_default_pipeline
, etc.). I don't have a strong opinion.