Loki fails to create labels containing the symbol \uf04a #3038

Closed
patrickhuy opened this issue Dec 4, 2020 · 9 comments

@patrickhuy
Contributor

patrickhuy commented Dec 4, 2020

Describe the bug
When a log message contains the symbol "\uf04a" and Loki attempts to create a label containing that symbol, the operation fails.

To Reproduce
Steps to reproduce the behavior:

  1. Start Loki 2.0
  2. Insert the log message { "msg": "lala \uf04a" } into Loki (using Promtail)
  3. Query the message and parse it as JSON: {app="mytestapp"} |= "lala" | json
  4. Observe the error parse error: error unquoting string \\\"\\\\\\\"lala \\\\\\\\uf04a\\\\\\\", namespace=\\\\\\\"\\\": invalid syntax\\n\" in the loki logs.

Expected behavior
The label msg should be created with the value "lala " followed by the symbol \uf04a.

Environment:

  • Infrastructure: Kubernetes with loki-stack deployed via helm
  • Deployment tool: helm

Screenshots, Promtail config, or terminal output
Loki logs contain the message

level=warn ts=2020-12-04T14:06:11.4519821Z caller=logging.go:71 traceID=3d9ac584e40b3cb msg="GET /loki/api/v1/query_range?direction=BACKWARD&limit=1000&regexp=&query=%7Bjob%3D%22default%2Ftestjson%22%7D%20%7C%3D%20%22field%22%20%7C%20json&start=1607087171000000000&end=1607090772000000000&step=1 (500) 1.8084ms Response: \"err while creating labelset for {container=\\\"busybox\\\", controller_uid=\\\"e89d7ae2-d7b4-435b-8129-a85d16894e3e\\\", field=\\\"2\\\", filename=\\\"/var/log/pods/default_testjson-txxdv_81d4bd68-39d5-49ba-b7cf-73135da0dd33/busybox/0.log\\\", job=\\\"default/testjson\\\", job_name=\\\"testjson\\\", msg=\\\"lala \\\\uf04a\\\", namespace=\\\"default\\\", pod=\\\"testjson-txxdv\\\", stream=\\\"stdout\\\"}: 1:238: parse error: error unquoting string \\\"\\\\\\\"lala \\\\\\\\uf04a\\\\\\\", namespace=\\\\\\\"\\\": invalid syntax\\n\" ws: false; Accept: application/json, text/plain, */*; Accept-Encoding: gzip, deflate; Accept-Language: en-US,en;q=0.5; Content-Type: application/json; Dnt: 1; User-Agent: Grafana/6.7.0; X-Forwarded-For: 127.0.0.1, 127.0.0.1; X-Grafana-Org-Id: 1; "
@stale

stale bot commented Jan 9, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Jan 9, 2021
@patrickhuy
Contributor Author

Can something be done here? @cyriltovena

@stale stale bot removed the stale A stale issue or PR that will automatically be closed. label Jan 11, 2021
@stale

stale bot commented Feb 13, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Feb 13, 2021
@marlic7

marlic7 commented Feb 14, 2021

The same problem occurs if the message contains "\x00".

Minimal message inserted into Loki with this error: {"message":"some string\u0000"}

@stale stale bot removed the stale A stale issue or PR that will automatically be closed. label Feb 14, 2021
@cyriltovena
Contributor

@dannykopping is looking into this.

@dannykopping
Contributor

@patrickhuy this looks like a bug in the Prometheus parser which we use to parse our labels; I'll hopefully find a fix for this soon and submit it for review. I've managed to isolate this issue in a failing test against HEAD of Prometheus.

In the meantime, a workaround for you could be to use a feature merged in recently (#3280) which allows for the selective extraction of labels using the JSON pipeline; you could extract only the labels you care about, provided none of them contains this quoted UTF-8 value.

The problem, as far as I can tell, relates to this code:

func (ls Labels) String() string {
	var b bytes.Buffer
	b.WriteByte('{')
	for i, l := range ls {
		if i > 0 {
			b.WriteByte(',')
			b.WriteByte(' ')
		}
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(strconv.Quote(l.Value))  // <----- this
	}
	b.WriteByte('}')
	return b.String()
}

When your value is run through strconv.Quote, the symbol gets transformed into the escape sequence \uf04a. This triggers a bug in the Prometheus label parser, which thinks that the backslash in that value is escaping a string, which it is not.
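
A minimal Go sketch of the quoting step on its own, using only the standard library. quoteRoundTrip is a hypothetical helper written for this illustration and does not exist in Prometheus or Loki. It quotes a value the same way the String method above does, then unquotes it with strconv.Unquote.

```go
package main

import (
	"fmt"
	"strconv"
)

// quoteRoundTrip is a hypothetical helper: it quotes a label value with
// strconv.Quote (as Labels.String does) and unquotes it again with
// strconv.Unquote from the standard library.
func quoteRoundTrip(value string) (quoted, restored string, err error) {
	quoted = strconv.Quote(value)
	restored, err = strconv.Unquote(quoted)
	return quoted, restored, err
}

func main() {
	// The two values reported in this issue.
	for _, v := range []string{"lala \uf04a", "some string\x00"} {
		quoted, restored, err := quoteRoundTrip(v)
		// Print the quoted form, whether the round trip restored the
		// original value, and any unquoting error.
		fmt.Println(quoted, restored == v, err)
	}
}
```

In this sketch the standard library restores both values without error, which fits the diagnosis that the quoted output itself is valid and the fault lies on the parsing side.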

@dannykopping
Contributor

@patrickhuy this has been merged in Prometheus, but it may take a few days for the changes to be pulled into Loki due to some conflicts in other dependencies.

@owen-d
Member

owen-d commented May 6, 2021

Should be fixed now. We've revendored Prometheus after the upstream change 🎉

@owen-d owen-d closed this as completed May 6, 2021