Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add target for decode_json_fields #3169

Merged

Conversation

martinscholz83
Copy link
Contributor

Fixes #3134

This change add the output from decode_json_field to json field instead of message.

@elasticsearch-release
Copy link

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

1 similar comment
@elasticmachine
Copy link
Collaborator

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

@martinscholz83
Copy link
Contributor Author

This implementation added the output into the json field and doesn't remove the original field. Question is if we should remove the original field?

@andrewkroh
Copy link
Member

andrewkroh commented Dec 12, 2016

This PR would break some of the use cases below. I recommend keeping the current behavior but adding an optional target configuration setting to control the destination. With that, I think all the use cases below would be covered.

There is no need to remove the original field in this processor because that could be done using using the drop_fields processor.

Use cases:

  • Decoding JSON in JSON
    • Recipe: Use decode_json_fields with max_depth > 1. The target field is the source field in order to replace the strings with proper JSON objects.
    • Status: Working now.
  • Decoding structure logs from an app running in Docker
    • Recipe: Use Filebeat JSON decoder + decode_json_fields. The JSON decoder works on the wrapper added by Docker then decode_json_fields works on the structured log event from the app.
    • Status: Working now.
    • Note: If we had target this could be done without the Filebeat JSON decoder using the recipe above.
  • Decoding JSON events containing multiline data
    • Recipe: Use Filebeat JSON decoder + Filebeat multiline. Docker puts each log line into a JSON object so if the application writes multline line events (like a stacktrace) then Filebeat must decode multiple JSON events and aggregate their messages.
    • Status: Working now.
  • Decoding multiline JSON (pretty)
  • Decoding arbitrary JSON fields
    • Recipe: Use decode_json_fields.
    • Status: Working but with limitations. You can only replace the string field with an object. You cannot write the decoded data to a different target field. To do that you must be able to set the target field.

@tsg
Copy link
Contributor

tsg commented Dec 12, 2016

I'm +1 on adding a "target" option.

@martinscholz83
Copy link
Contributor Author

Are there names which the target field can not be named to. For example target: "source", target; offset. What i mean is should we compare the target field against a greylist.

@@ -17,18 +17,21 @@ type decodeJSONFields struct {
fields []string
maxDepth int
processArray bool
target string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I want to target the root of the event? How would I configure this?

I think if you make it a *string then you can determine the difference between target being set but empty and target not present at all.

Copy link
Contributor Author

@martinscholz83 martinscholz83 Dec 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that tip 👍 Still have to learn a lot when coming from OO. There is one more question. Like you say if we want to add to root it now looks like

{
  "": {
    "test": {
      "test": "test"
    }
  },
  "@timestamp": "2016-12-14T08:20:16.157Z",
  "beat": {
    "hostname": "4201halwsd00001",
    "name": "4201halwsd00001",
    "version": "6.0.0-alpha1"
  },
  "input_type": "log",
  "message": "{\n    \"test\": \"test\",\n    \"test\": {\n        \"test\": \"test\"\n    }\n}",
  "offset": 73,
  "source": "input.json",
  "type": "log"
}

It looks right? Or should we filter for that empty string and print depth one

{ 
  "test": {
     "test": "test"   
},
  "@timestamp": "2016-12-14T08:20:16.157Z",
  "beat": {
    "hostname": "4201halwsd00001",
    "name": "4201halwsd00001",

@@ -111,6 +111,36 @@ func TestValidJSONDepthTwo(t *testing.T) {

}

func TestTargetOption(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding a test case. 👍

@karmi
Copy link

karmi commented Dec 14, 2016

Hi @maddin2016, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in yout Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@martinscholz83 martinscholz83 force-pushed the add-target-decode-json-field branch from 54cc2ce to 7568437 Compare December 14, 2016 08:24
@@ -75,7 +77,12 @@ func (f decodeJSONFields) Run(event common.MapStr) (common.MapStr, error) {
continue
}

_, err = event.Put(field, output)
if f.target != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be three cases to handle here. When *f.target == "" you can't use event.Put, instead you need to add the fields from output to event. Looking at the json decoding in Filebeat, the default behavior is to not overwrite keys unless overwrite_key is set to true. So I would follow this model to be consistent.

@martinscholz83 martinscholz83 force-pushed the add-target-decode-json-field branch from 01748de to 45ecddb Compare December 15, 2016 07:16
@martinscholz83
Copy link
Contributor Author

@andrewkroh, how should we check if overwrite_keys is true or false. Should we add a separate config attribute for decode_json_fields?

@martinscholz83 martinscholz83 force-pushed the add-target-decode-json-field branch from be86754 to e5b4813 Compare December 22, 2016 11:59
@andrewkroh
Copy link
Member

Yeah, decode_json_fields would need its own boolean overwrite_keys config option.

if len(*f.target) > 0 {
_, err = event.Put(*f.target, output)
} else {
switch t := output.(type) {
Copy link
Member

@andrewkroh andrewkroh Dec 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic looks good. But now the complexity of the Run method is a bit high with all of the nested if/for/switch statements. WDYT about breaking this into smaller functions?

Copy link
Contributor Author

@martinscholz83 martinscholz83 Dec 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of adding a global helper function for that switch part. Because now we duplicate code. For example, helpers.jsonhelper.OverwriteJsonKeys?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense if there is duplication. 👍

@martinscholz83
Copy link
Contributor Author

Hi @andrewkroh, do you have the chance to read my last commit. My idea was to create helper libraries under libbeat. For this case e.g

function OverwriteJsonKeys(event common.Mapstr, keys map[string]interface{})
{
...
}

@andrewkroh andrewkroh merged commit 3180831 into elastic:master Jan 3, 2017
@andrewkroh
Copy link
Member

@maddin2016 Thanks a lot! I'm merging this now so that it can be included in the next release. Please open a new PR for the refactoring work.

@martinscholz83 martinscholz83 mentioned this pull request Jan 4, 2017
@monicasarbu monicasarbu changed the title Add target for decoded_json_field Add target for decode_json_field Jan 9, 2017
@monicasarbu monicasarbu changed the title Add target for decode_json_field Add target for decode_json_fields Jan 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants