Add tag "truncated" to "log.flags" if incoming line is longer than configured limit #7991
I like the idea of having this flag. Could we put it under
I forgot to select "request changes" to make sure it does not get merged before we have discussed the naming of the field.
I am fine with
filebeat/_meta/fields.common.yml
Outdated
@@ -108,6 +108,11 @@
description: >
Logging level.

- name: log.truncated
Default type is keyword. I assume this needs to be integer/long?
I like that it's not just a bool but a value that tells how much was deleted.
@@ -48,6 +49,7 @@ type Reader struct {
separator []byte
last []byte
numLines int
truncated int
Oh, I didn't think about multiline being able to capture truncated bytes as well. Good catch 👍
When truncating, the multiline reader and the limit reader must add the number of truncated bytes to the current value (in case an earlier phase has already truncated the contents).
This makes me wonder if we really want to count the bytes, or rather just want to add a tag to the event.
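To make the accumulation concern concrete, here is a minimal sketch of two truncating stages that share one counter. The `Message` type, its `Truncated` field, and the `limit` helper are hypothetical stand-ins for illustration, not the reader types from the pull request.

```go
package main

import "fmt"

// Message is a simplified, hypothetical stand-in for the reader message.
type Message struct {
	Content   []byte
	Truncated int // total number of bytes dropped so far
}

// limit cuts msg.Content down to maxBytes and adds the number of dropped
// bytes to whatever an earlier stage has already recorded.
func limit(msg Message, maxBytes int) Message {
	if len(msg.Content) > maxBytes {
		msg.Truncated += len(msg.Content) - maxBytes
		msg.Content = msg.Content[:maxBytes]
	}
	return msg
}

func main() {
	msg := Message{Content: []byte("0123456789")}
	msg = limit(msg, 8) // an earlier stage (e.g. multiline) drops 2 bytes
	msg = limit(msg, 5) // a later stage (e.g. limit) drops 3 more
	fmt.Printf("content=%s truncated=%d\n", msg.Content, msg.Truncated)
}
```

Every stage has to remember to use `+=` rather than `=`, which is the extra bookkeeping that a plain tag would avoid.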
filebeat/reader/readfile/limit.go
Outdated
"log": common.MapStr{
"truncated": diff,
},
})
What if the content is already truncated (multiline is applied before the limit reader)? We should add up the counters, no?
The implementation of counting the truncated bytes is correct with the current fixed pipeline. However, it is not future-proof. When the reader pipeline refactoring is done, and the pipeline is flexible, more checking is required. I would not add the overhead of getting a value from a
Thus, I changed the field value to
@kvch @ruflin Not sure about the naming: now that we changed it to a boolean, why do we use a field instead of defining and inserting a tag? Please note, this is a more general question: do we prefer boolean fields or tags?
Having the length of truncated bytes would have been a nice feature, but I agree that it can make the implementation more complex, so having a simple bool could be sufficient. For the naming I would also like to push for
For the tags: Definitely an interesting idea, and I think Logstash uses tags for some things. At the same time I quite like specific fields that "describe" what happened, and tags are more for users to add their own "free flow" meta information.
},
})
}

The mlr.truncate > 0 branch does, in the worst case, 4 allocations on every run.
I would define the common.MapStr in a package variable that I would reuse.
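A minimal sketch of the package-variable idea, using plain `map[string]interface{}` instead of `common.MapStr` so it stays self-contained. The names `truncatedFields` and `update` are hypothetical, and `update` is only an assumed deep-merge helper, not the libbeat implementation.

```go
package main

import "fmt"

// truncatedFields is built once at package initialization and reused for
// every truncated message. It must be treated as read-only.
var truncatedFields = map[string]interface{}{
	"log": map[string]interface{}{
		"truncated": true,
	},
}

// update is a hypothetical deep-merge helper: it copies values from src into
// dst and merges nested maps instead of aliasing them.
func update(dst, src map[string]interface{}) {
	for k, v := range src {
		srcMap, ok := v.(map[string]interface{})
		if !ok {
			dst[k] = v
			continue
		}
		dstMap, ok := dst[k].(map[string]interface{})
		if !ok {
			dstMap = map[string]interface{}{}
			dst[k] = dstMap
		}
		update(dstMap, srcMap)
	}
}

func main() {
	fields := map[string]interface{}{"source": "/var/log/app.log"}
	update(fields, truncatedFields)
	fmt.Println(fields)
}
```

This only avoids rebuilding the literal on every call; the merge itself may still allocate a nested map per event, and sharing the variable is safe only as long as nothing writes to it.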
filebeat/reader/readfile/limit.go
Outdated
"truncated": true,
},
})

Same comment as above regarding allocations.
@kvch really minor comments, good catch on multiline!
I will name this field/tag
702fa8c to b16cff8
filebeat/reader/message.go
Outdated
@@ -59,3 +59,15 @@ func (msg *Message) AddFields(fields common.MapStr) {
}
msg.Fields.Update(fields)
}

func (msg *Message) AddTagsWithKey(key string, tags []string) error {
exported method Message.AddTagsWithKey should have comment or be unexported
receiver name msg should be consistent with previous receiver name m for Message
filebeat/reader/message.go
Outdated
// AddTagsWithKey adds tags to the message with an arbitrary key.
// If the field does not exist, it is created.
func (msg *Message) AddTagsWithKey(key string, tags []string) error {
receiver name msg should be consistent with previous receiver name m for Message
4706510 to a01440c
Code LGTM. I have added a remark about a variadic function; I think this would be a better API considering the usage we will make of it.
@ruflin are you OK with log.status? I think this field could have the dissect_parse_failure
filebeat/reader/message.go
Outdated
}

// AddTagsWithKey adds tags to the message with an arbitrary key.
// If the field does not exist, it is created.
Maybe make AddTagsWithKey take variadic arguments for tags?
The common use case is probably to add a single tag.
AddTagsWithKey(key string, tags ...string) error
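As a rough illustration of the variadic suggestion, the sketch below shows how both the single-tag and the slice case would look at the call site. The `Message` type here is a simplified stand-in that only handles flat keys; it is not the code from the pull request.

```go
package main

import "fmt"

// Message is a simplified, hypothetical stand-in for the reader message type.
type Message struct {
	Fields map[string][]string
}

// AddTagsWithKey is an illustrative variadic variant. It appends the given
// tags to the list stored under key, creating the list if needed.
func (m *Message) AddTagsWithKey(key string, tags ...string) error {
	if m.Fields == nil {
		m.Fields = map[string][]string{}
	}
	m.Fields[key] = append(m.Fields[key], tags...)
	return nil
}

func main() {
	m := &Message{}

	// Common case: a single tag, no slice literal needed.
	_ = m.AddTagsWithKey("log.flags", "truncated")

	// An existing slice can still be passed by expanding it with "...".
	more := []string{"first", "second"}
	_ = m.AddTagsWithKey("log.flags", more...)

	fmt.Println(m.Fields["log.flags"])
}
```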
We should name it in the code the same as it is in the event. If we stay with status, this should be AddStatusWithKey. Alternatively, as proposed above, we could call it flags, so it would be AddFlagsWithKey. Or if we stick with tags here, it should probably be log.tags?
Would AddFlagWithKey(key string, flag string) error be enough here? It seems all the usage we have below only adds 1 flag.
I like
Please add AddTagsWithKey to the developer docs. Everything MapStr related is very much used/required by all beats (including community beats).
filebeat/_meta/fields.common.yml
Outdated
@@ -108,6 +108,10 @@
description: >
Logging level.

- name: log.status
What about calling it log.flags? I somehow expect a different content for status, like unprocessed or processed. flags is inspired by your field description, and I kind of like it because we flagged a certain event.
An alternative would be log.tags?
if maxBytesReached && maxLinesReached {
if space < 0 || space > len(m.Content) {
truncated = space - len(m.Content)
What happens if space < 0 here?
Nice catch. I think it was a leftover. This value of truncated was never used, because it was only added if len(m.Content) was bigger than space. However, in this branch space equals len(m.Content), so the truncated value here was unnecessary.
}

for _, message := range messages {
fmt.Println(message)
leftover?
func TestMultilineAfterTruncated(t *testing.T) {
pattern := match.MustCompile(`^[ ]`) // next line is indented a space
maxLines := 2
testMultilineTruncated(t,
Could you also add a test where the truncated flag should be missing?
Done.
}

found := false
switch flags := statusFlags.(type) {
Why can it be both types? Same question for the other test.
I added it to mimic the testing of the old AddTags. But as the input is never an interface, I am removing it.
maxBytes int
}{
{"long-long-line", 5},
{"long-long-line", 3},
Could you add a third param here, truncated: bool, and then also add a test that is not truncated and checks that the flag does not exist?
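One possible shape for the extended test table, sketched as a small runnable program rather than the actual test file. The `truncate` helper is a hypothetical stand-in for the limit reader, and the third case with `maxBytes` 20 is an assumed example of an input that stays untouched.

```go
package main

import "fmt"

// truncate is a hypothetical stand-in for the limit reader: it cuts line to
// at most maxBytes and reports whether anything was dropped.
func truncate(line string, maxBytes int) (string, bool) {
	if len(line) > maxBytes {
		return line[:maxBytes], true
	}
	return line, false
}

func main() {
	cases := []struct {
		input     string
		maxBytes  int
		truncated bool // expected presence of the "truncated" flag
	}{
		{"long-long-line", 5, true},
		{"long-long-line", 3, true},
		{"long-long-line", 20, false}, // fits within the limit, no flag expected
	}

	for _, c := range cases {
		out, truncated := truncate(c.input, c.maxBytes)
		status := "ok"
		if truncated != c.truncated {
			status = "MISMATCH"
		}
		fmt.Printf("%s: %q (max %d) -> %q, truncated=%t\n",
			status, c.input, c.maxBytes, out, truncated)
	}
}
```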
// AddTagsWithKey appends a tag to the key field of ms. If the field does not
// exist then it will be created. If the field exists and is not a []string
// then an error will be returned. It does not deduplicate the list.
func AddTagsWithKey(ms MapStr, key string, tags []string) error {
Could you add a test for the AddTagsWithKey method?
Done.
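For readers without the diff at hand, the documented behaviour (create the field if missing, append otherwise, return an error if the existing value is not a []string) can be sketched as follows. This is a simplified re-implementation under stated assumptions: it resolves dotted keys by walking nested maps itself and does not reproduce the mapFind helper or the rest of the libbeat code.

```go
package main

import (
	"fmt"
	"strings"
)

// MapStr mirrors the shape of the map type discussed in the review.
type MapStr map[string]interface{}

// AddTagsWithKey is a simplified sketch: it appends tags to the list stored
// under the dotted key, creating intermediate maps and the list as needed.
func AddTagsWithKey(ms MapStr, key string, tags []string) error {
	parts := strings.Split(key, ".")
	cur := ms
	for _, p := range parts[:len(parts)-1] {
		next, ok := cur[p]
		if !ok {
			sub := MapStr{}
			cur[p] = sub
			cur = sub
			continue
		}
		sub, ok := next.(MapStr)
		if !ok {
			return fmt.Errorf("expected map at %q, found %T", p, next)
		}
		cur = sub
	}

	last := parts[len(parts)-1]
	old, ok := cur[last]
	if !ok {
		// Copy so the event does not alias the caller's slice.
		cur[last] = append([]string(nil), tags...)
		return nil
	}
	existing, ok := old.([]string)
	if !ok {
		return fmt.Errorf("expected []string at %q, found %T", key, old)
	}
	cur[last] = append(existing, tags...) // no deduplication
	return nil
}

func main() {
	event := MapStr{"message": "test line"}
	if err := AddTagsWithKey(event, "log.flags", []string{"truncated"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(event["log"])
}
```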
// AddTagsWithKey appends a tag to the key field of ms. If the field does not
// exist then it will be created. If the field exists and is not a []string
// then an error will be returned. It does not deduplicate the list.
func AddTagsWithKey(ms MapStr, key string, tags []string) error {
Note: I would keep the naming here as Tags even though I would rename everything in the filebeat code above to Flags.
Agreed. Done.
if !exists {
ms[TagsKey] = tags

k, subMap, oldTags, present, err := mapFind(key, ms, true)
I wonder if this has a performance impact on the old AddTags method.
No. Keys without dots can be processed in the first loop on a fast path in mapFind. If the key exists as is, it's simply returned at the beginning.
	// Fast path, key is present as is.
	if v, exists := data[key]; exists {
		return key, data, v, true, nil
	}
If not, the only additional cost is one IndexRune, which looks for a dot in the key. But currently old tags don't have dots in their keys. Thus, the function returns at the second possible point without doing anything "expensive", e.g. creating additional submaps.
	idx := strings.IndexRune(key, '.')
	if idx < 0 {
		return key, data, nil, false, nil
	}
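The two early returns quoted above can be tried out in isolation with a small read-only lookup. The `find` function below is a hypothetical sketch built around the same two checks; it is not mapFind itself and never creates submaps.

```go
package main

import (
	"fmt"
	"strings"
)

// find is a hypothetical read-only lookup that follows the two early exits
// described above and descends into nested maps only for dotted keys.
func find(key string, data map[string]interface{}) (interface{}, bool) {
	for {
		// Fast path: key is present as is.
		if v, ok := data[key]; ok {
			return v, true
		}

		// No dot left in the key: nothing more to try, return right away.
		idx := strings.IndexRune(key, '.')
		if idx < 0 {
			return nil, false
		}

		// Slower path: descend one level using the part before the dot.
		sub, ok := data[key[:idx]].(map[string]interface{})
		if !ok {
			return nil, false
		}
		data, key = sub, key[idx+1:]
	}
}

func main() {
	event := map[string]interface{}{
		"tags": []string{"prod"},
		"log":  map[string]interface{}{"flags": []string{"truncated"}},
	}
	for _, key := range []string{"tags", "missing", "log.flags"} {
		v, ok := find(key, event)
		fmt.Printf("%s -> found=%t value=%v\n", key, ok, v)
	}
}
```

A key without dots, such as the one used by the old AddTags, costs one map lookup when present and one extra `IndexRune` call when absent.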
filebeat/reader/message.go
Outdated
m.Fields.Update(fields)
}

// AddTagsWithKey adds tags to the message with an arbitrary key.
comment on exported method Message.AddFlagsWithKey should be of the form "AddFlagsWithKey ..."
LGTM. Please make sure to update the PR description.
…nfigured limit (elastic#7991)
A new field is added to store the flags of an event named "log.flags". If a message is truncated, "truncated" flag is added to the list.
Example event with "truncated" flag:
{
  "@timestamp": "2018-08-16T13:00:46.759Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "host": {
    "name": "sleipnir"
  },
  "source": "/home/n/test.log",
  "offset": 33,
  "log": {
    "flags": [
      "truncated"
    ]
  },
  "message": "test line",
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "beat": {
    "hostname": "sleipnir",
    "version": "7.0.0-alpha1",
    "name": "sleipnir"
  }
}
Closes elastic#7022
(cherry picked from commit 0884236)
…ing line is longer than configured limit (#8165)
* Add tag "truncated" to "log.flags" if incoming line is longer than configured limit (#7991)
A new field is added to store the flags of an event named "log.flags". If a message is truncated, "truncated" flag is added to the list.
Example event with "truncated" flag:
{
  "@timestamp": "2018-08-16T13:00:46.759Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "host": {
    "name": "sleipnir"
  },
  "source": "/home/n/test.log",
  "offset": 33,
  "log": {
    "flags": [
      "truncated"
    ]
  },
  "message": "test line",
  "prospector": {
    "type": "log"
  },
  "input": {
    "type": "log"
  },
  "beat": {
    "hostname": "sleipnir",
    "version": "7.0.0-alpha1",
    "name": "sleipnir"
  }
}
Closes #7022
(cherry picked from commit 0884236)
* fix changelog && rebase
A new field is added to store the flags of an event named "log.flags". If a message is truncated, "truncated" flag is added to the list.
Example event with "truncated" flag:
Blocks #7997
Closes #7022