Elasticsearch bulk request enhancements #5811

Merged on Jan 9, 2018 (7 commits)


@urso urso commented Dec 5, 2017

Requires: #5810

  • Log error and drop event if no index name can be computed (ES API
    returns error if index is empty)
  • Set _id if event.Meta["id"] is set
  • Use index action if no id is set and create if id is set.
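
The behaviour described in the list above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `bulkMeta`, `buildBulkAction`, and `errNoIndex` are hypothetical names, and the metadata fields are reduced to the ones mentioned here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// bulkMeta is a hypothetical, reduced stand-in for the per-event
// metadata line of an Elasticsearch bulk request.
type bulkMeta struct {
	Index    string `json:"_index"`
	ID       string `json:"_id,omitempty"`
	Pipeline string `json:"pipeline,omitempty"`
}

// errNoIndex is a hypothetical error for events without an index name.
var errNoIndex = errors.New("no index name computed for event")

// buildBulkAction returns the JSON action line for one event.
// It fails if the index is empty, uses "create" when an ID is set,
// and falls back to "index" otherwise.
func buildBulkAction(index, pipeline, id string) ([]byte, error) {
	if index == "" {
		return nil, errNoIndex
	}
	action := "index"
	if id != "" {
		action = "create"
	}
	meta := bulkMeta{Index: index, ID: id, Pipeline: pipeline}
	return json.Marshal(map[string]bulkMeta{action: meta})
}

func main() {
	for _, id := range []string{"", "abc-1"} {
		line, err := buildBulkAction("filebeat-2018.01.05", "", id)
		if err != nil {
			fmt.Println("dropping event:", err)
			continue
		}
		fmt.Println(string(line))
	}
	// An empty index name is reported as an error so the caller can
	// log it and drop the event.
	if _, err := buildBulkAction("", "", "abc-1"); err != nil {
		fmt.Println("dropping event:", err)
	}
}
```

Returning an error instead of silently emitting an empty index lets the caller decide to log and drop the event, which matches the first bullet.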

@urso urso added the in progress label Dec 5, 2017
@urso urso force-pushed the enh/elasticsearch-document-id branch 2 times, most recently from eecce76 to 5bcc5c3 Compare December 6, 2017 02:51
@urso urso added the review label and removed the in progress label Dec 7, 2017
@ruflin ruflin left a comment

That is a really nice addition. I left a few minor comments.

- meta := createEventBulkMeta(index, pipeline, event)
+ meta, err := createEventBulkMeta(index, pipeline, event)
+ if err != nil {
+     logp.Err("Failed to encode event meta dat: %s", err)

Contributor

Nit s/dat/data

Pipeline string `json:"pipeline" struct:"pipeline"`
var id interface{}
if m := event.Meta; m != nil {
    id = m["id"]

Contributor

Similar comment to the other PR: should we really allow an interface{} here, or make sure this is converted to a string? Do we know the key "id" exists in m?

Author

Can _id be a number instead of a string?

Contributor

I'm assuming that _id on the ES side is a keyword, so if 12 is put in, it also ends up as "12". Even if that is not the case, I think we should enforce string values on the Beats side.

Author

+1
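
One way to enforce the string-only ID agreed on in this thread is a comma-ok type assertion, which also covers the case where the "id" key is missing or the map is nil. The sketch below is illustrative only: `eventID` is a hypothetical helper, and the metadata is modelled as a plain `map[string]interface{}`.

```go
package main

import "fmt"

// eventID returns the document ID stored under meta["id"].
// The second return value is false if the map is nil, the key is
// missing, the value is not a string, or the string is empty.
func eventID(meta map[string]interface{}) (string, bool) {
	// Indexing a nil map yields a nil interface value, and the
	// comma-ok assertion on it reports false instead of panicking.
	s, ok := meta["id"].(string)
	return s, ok && s != ""
}

func main() {
	cases := []map[string]interface{}{
		{"id": "abc-1"}, // valid string ID
		{"id": 12},      // number: rejected, not silently converted
		{},              // key missing
		nil,             // no metadata at all
	}
	for _, m := range cases {
		id, ok := eventID(m)
		fmt.Printf("id=%q ok=%v\n", id, ok)
	}
}
```

Rejecting non-string values, rather than formatting them, keeps the conversion decision with whoever sets the ID.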


ruflin commented Jan 4, 2018

@urso Needs a rebase.

urso added 6 commits January 5, 2018 12:01
- Log error and drop event if no index name can be computed (ES API
  returns error if index is empty)
- Set `_id` if event.Meta["id"] is set
- Use `index` action if no `id` is set and `create` if `id` is set.
@urso urso force-pushed the enh/elasticsearch-document-id branch from 933da88 to 93aaea6 Compare January 5, 2018 11:04
@@ -22,6 +22,13 @@ var (
errNoTimestamp = errors.New("value is no timestamp")
)

func (e *Event) SetID(id string) {

exported method Event.SetID should have comment or be unexported
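
The lint warning above is usually resolved by a doc comment that starts with the method name. A minimal sketch follows, assuming a simplified `Event` type whose `Meta` field is a plain map; the real Beats type and the merged method body may differ.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the Beats event type.
type Event struct {
	Meta   map[string]interface{}
	Fields map[string]interface{}
}

// SetID stores id under Meta["id"] so that outputs can use it as the
// document ID. The Meta map is created on first use.
func (e *Event) SetID(id string) {
	if e.Meta == nil {
		e.Meta = map[string]interface{}{}
	}
	e.Meta["id"] = id
}

func main() {
	var e Event
	e.SetID("abc-1")
	fmt.Println(e.Meta["id"])
}
```

Taking a `string` parameter means the type is checked at the call site, so the output does not need to guess how to convert other values.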


urso commented Jan 5, 2018

@ruflin rebased and ensured id is a string.

@ruflin ruflin merged commit bc667ca into elastic:master Jan 9, 2018
@urso urso deleted the enh/elasticsearch-document-id branch February 19, 2019 18:44
urso pushed a commit to urso/beats that referenced this pull request Sep 18, 2019
Add support to configure a key for setting the document ID in
the harvester JSON settings. The ID will be stored in the event's
Meta["id"] for the output to pick up. With elastic#5811, the elasticsearch
output uses the Meta["id"] field to set the document's ID (it uses
op_type="create" to count duplicate inserts of the same ID). For other
output types, the document ID will be forwarded via `@metadata.id`.
urso pushed a commit that referenced this pull request Sep 19, 2019
* Add support to set the document ID in the filebeat json reader

Add support to configure a key for setting the document ID in
the harvester JSON settings. The ID will be stored in the event's
Meta["id"] for the output to pick up. With #5811, the elasticsearch
output uses the Meta["id"] field to set the document's ID (it uses
op_type="create" to count duplicate inserts of the same ID). For other
output types, the document ID will be forwarded via `@metadata.id`.