-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collapse multiline logs based on a start line. #2971
Conversation
Summary: This is a very simple approach based on grafana#1380 to provide multiline or block log entries in promtail. A `multiline` stage is added to pipelines. This stages matches a start line. Once a start line is matched all following lines are appended to an entry and "dropped". Once a new start line is matched the former block of multilines is send. This approach has two downside because log entires are not sent until a new start line is matched. 1. Lines can linger for a long time. The multiline stage should flush out lines if now new start line is matched in a certain time frame. However, the current pipeline interface cannot actively push entries. So a time based flushing would require a bigger refactoring. 2. If the observed system crashes the last log lines are not sent. Thus important information might be lost.
"github.com/stretchr/testify/require" | ||
ww "github.com/weaveworks/common/server" | ||
) | ||
func Test_multilineStage_Process(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure I did not cover all the edge cases.
nextStart := "START line 2" | ||
stage.Process(model.LabelSet{}, map[string]interface{}{}, ptrFromTime(time.Now()), &nextStart) | ||
|
||
require.Equal(t, "START line 1\nnot a start line", nextStart) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably the confusing part. A new line is pushed but we get the older multiline blog. Ideally a pipeline stage would be able to push out lines without a call. This could be done with channels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we are processing inputs from channels and push changes out to channels the multiline stage could add a simple timeout to flush a block:
struct Entry type {
labels model.LabelSet,
extracted map[string]interface{},
t *time.Time,
entry *string
}
// Process implements Stage
func (m *multilineStage) Process(entries <- chan Entry) <- chan Entry {
out := make(chan Entry)
currentMultiline := make(bytes.Buffer)
loop { // This should be in a goroutine
select {
case next := <-entries:
if isNewStart(entry) {
// Flush
out <- NewEntry(currentMultiline.String())
currentMultiline.reset()
}
currentMultiline.WriteString(entry.entry)
case <-time.After(timeout * time.Second):
// Flush
out <- currentMultiline.String()
currentMultiline.reset()
}
}
return out
}
Process
is not a fitting name any more since it is called once for all incoming entries. Anyhow, I think it illustrates how the flush time out becomes much simpler.
Codecov Report
@@ Coverage Diff @@
## master #2971 +/- ##
==========================================
- Coverage 61.84% 61.71% -0.14%
==========================================
Files 182 183 +1
Lines 14870 14906 +36
==========================================
+ Hits 9197 9199 +2
- Misses 4824 4857 +33
- Partials 849 850 +1
|
Superseded by cyriltovena#7 |
…transaction is valid (grafana#2971) * make a copy of KVs coming from boltdb which are only valid until the transaction is valid Signed-off-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com> * update changelog Signed-off-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com>
Summary:
This is a very simple approach based on #1380 to provide multiline
or block log entries in promtail.
A
multiline
stage is added to pipelines. This stages matches a startline. Once a start line is matched all following lines are appended
to an entry and "dropped". Once a new start line is matched the former
block of multilines is send.
This approach has two downside because log entires are not sent until a
new start line is matched.
Lines can linger for a long time. The multiline stage should flush out
lines if now new start line is matched in a certain time frame.
However, the current pipeline interface cannot actively push entries.
So a time based flushing would require a bigger refactoring.
If the observed system crashes the last log lines are not sent. Thus
important information might be lost.