promtail: initialize extracted map with initial labels
pracucci committed Oct 10, 2019
1 parent bf4530a commit bd63a49
Showing 4 changed files with 44 additions and 7 deletions.
18 changes: 13 additions & 5 deletions docs/clients/promtail/pipelines.md
@@ -24,7 +24,7 @@ stages:
Typical pipelines will start with a parsing stage (such as a
[regex](./stages/regex.md) or [json](./stages/json.md) stage) to extract data
from the log line. Then, a series of action stages will be present to do
-something with that extract data. The most common action stage will be a
+something with that extracted data. The most common action stage will be a
[labels](./stages/labels.md) stage to turn extracted data into a label.

A common stage will also be the [match](./stages/match.md) stage to selectively
@@ -153,30 +153,38 @@ scrape_configs:
The following sections further describe the types that are accessible to each
stage (although not all may be used):
-##### Label Set
+#### Label Set
The current set of labels for the log line. Initialized to be the set of labels
that were scraped along with the log line. The label set is only modified by an
action stage, but filtering stages read from it.
The final label set will be index by Loki and can be used for queries.
-##### Extracted Map
+#### Extracted Map
A collection of key-value pairs extracted during a parsing stage. Subsequent
stages operate on the extracted map, either transforming them or taking action
with them. At the end of a pipeline, the extracted map is discarded; for a
parsing stage to be useful, it must always be paired with at least one action
stage.
-##### Log Timestamp
+The extracted map is initialized with the same set of initial labels that were
+scraped along with the log line. This initial data allows for taking action on
+the values of labels inside pipeline stages that only manipulate the extracted
+map. For example, log entries tailed from files have the label `filename` whose
+value is the file path that was tailed. When a pipeline executes for that log
+entry, the initial extracted map would contain `filename` using the same value
+as the label.

+#### Log Timestamp

The current timestamp for the log line. Action stages can modify this value.
If left unset, it defaults to the time when the log was scraped.

The final value for the timestamp is sent to Loki.

-##### Log Line
+#### Log Line

The current log line, represented as text. Initialized to be the text that
Promtail scraped. Action stages can modify this value.
7 changes: 7 additions & 0 deletions pkg/logentry/stages/pipeline.go
@@ -77,6 +77,13 @@ func NewPipeline(logger log.Logger, stgs PipelineStages, jobName *string, regist
// Process implements Stage allowing a pipeline stage to also be an entire pipeline
func (p *Pipeline) Process(labels model.LabelSet, extracted map[string]interface{}, ts *time.Time, entry *string) {
	start := time.Now()
+
+	// Initialize the extracted map with the initial labels (ie. "filename"),
+	// so that stages can operate on initial labels too
+	for labelName, labelValue := range labels {
+		extracted[string(labelName)] = string(labelValue)
+	}
+
	for i, stage := range p.stages {
		if Debug {
			level.Debug(p.logger).Log("msg", "processing pipeline", "stage", i, "name", stage.Name(), "labels", labels, "time", ts, "entry", entry)
22 changes: 22 additions & 0 deletions pkg/logentry/stages/pipeline_test.go
@@ -29,11 +29,15 @@ pipeline_stages:
- docker:
- regex:
expression: "^(?P<ip>\\S+) (?P<identd>\\S+) (?P<user>\\S+) \\[(?P<timestamp>[\\w:/]+\\s[+\\-]\\d{4})\\] \"(?P<action>\\S+)\\s?(?P<path>\\S+)?\\s?(?P<protocol>\\S+)?\" (?P<status>\\d{3}|-) (?P<size>\\d+|-)\\s?\"?(?P<referer>[^\"]*)\"?\\s?\"?(?P<useragent>[^\"]*)?\"?$"
+- regex:
+source: filename
+expression: "(?P<service>[^\\/]+)\\.log"
- timestamp:
source: timestamp
format: "02/Jan/2006:15:04:05 -0700"
- labels:
action:
+service:
status_code: "status"
`

@@ -106,6 +110,24 @@ func TestPipeline_MultiStage(t *testing.T) {
				"nomatch": "true",
			},
		},
+		"should initialize the extracted map with the initial labels": {
+			rawTestLine,
+			processedTestLine,
+			time.Now(),
+			time.Date(2000, 01, 25, 14, 00, 01, 0, est),
+			map[model.LabelName]model.LabelValue{
+				"match":    "true",
+				"filename": "/var/log/nginx/frontend.log",
+			},
+			map[model.LabelName]model.LabelValue{
+				"filename":    "/var/log/nginx/frontend.log",
+				"match":       "true",
+				"stream":      "stderr",
+				"service":     "frontend",
+				"action":      "GET",
+				"status_code": "200",
+			},
+		},
	}

	for tName, tt := range tests {
4 changes: 2 additions & 2 deletions pkg/logentry/stages/regex.go
@@ -85,7 +85,7 @@ func parseRegexConfig(config interface{}) (*RegexConfig, error) {
// Process implements Stage
func (r *regexStage) Process(labels model.LabelSet, extracted map[string]interface{}, t *time.Time, entry *string) {
	// If a source key is provided, the regex stage should process it
-	// from the exctracted map, otherwise should fallback to the entry
+	// from the extracted map, otherwise should fallback to the entry
	input := entry

	if r.cfg.Source != nil {
@@ -117,7 +117,7 @@ func (r *regexStage) Process(labels model.LabelSet, extracted map[string]interfa
	match := r.expression.FindStringSubmatch(*input)
	if match == nil {
		if Debug {
-			level.Debug(r.logger).Log("msg", "regex did not match")
+			level.Debug(r.logger).Log("msg", "regex did not match", "input", *input, "regex", r.expression)
		}
		return
	}