Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve query sanitization to prevent a password leak in the logs #6421

Merged
merged 1 commit into from
Apr 22, 2016

Conversation

jsternberg
Copy link
Contributor

Sanitizing is now done through pattern matching rather than parsing the
query and replacing the password in the query. This prevents
accidentally redacting the wrong part of a query and revealing what the
password is through association.

Fixes #3883.

@jsternberg jsternberg added this to the 0.13.0 milestone Apr 19, 2016
// This function works on the raw query and attempts to retain the original input
// as much as possible.
func Sanitize(query string) string {
sanitizeSetPassword := regexp.MustCompile(`(?i)password\s+for[^=]*=\s+(["']?[^\s"]+["']?)`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a little strange to compile the regexp on every call to Sanitize. Any reason for not pulling it up to a package-level, unexported var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be. Let me move it. This was just easier while I was writing the code.

@jsternberg jsternberg force-pushed the js-3883-sanitize-query branch 3 times, most recently from 23e9e02 to a0af340 Compare April 19, 2016 20:11
@gunnaraasen
Copy link
Member

I like the regex approach, but will running two regexes against every query add significant overhead?

@@ -257,23 +257,17 @@ func (h *Handler) serveQuery(w http.ResponseWriter, r *http.Request, user *meta.
p := influxql.NewParser(strings.NewReader(qp))
db := q.Get("db")

// Sanitize the request query params so it doesn't show up in the response logger.
// Do this before anything else so a parsing error doesn't leak passwords.
sanitize(r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since query.Statements is no longer used, I think sanitize can be called in the response logger.

https://github.com/influxdata/influxdb/blob/master/services/httpd/response_logger.go#L94

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to run the sanitizer for non-query requests. The response logger doesn't seem to be limited to just serveQuery.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response logger feels like a more natural fit for this code than the query handler. Other requests won't have a q query parameter but it could also be skipped with an if r.URL.EscapedPath() == "/query" check.

@jsternberg
Copy link
Contributor Author

I'm not sure. The golang regex library is based off of re2 rather than perl regular expressions so it's generally much more efficient.

The advantage of this approach is that the query doesn't have to parse correctly. I made mistakes while testing and did this:

CREATE USER "admin" WITH PASSWORD "admin"

This wouldn't get redacted because it resulted in a parse error and the password just got printed to the log.

@jsternberg
Copy link
Contributor Author

Added a benchmark of the Sanitize function.

PASS
BenchmarkSanitize-8       200000             11789 ns/op            1360 B/op         11 allocs/op

@gunnaraasen
Copy link
Member

gunnaraasen commented Apr 20, 2016

I like the regex approach and that it covers mistyped queries. Just wanted to mention the potential performance impact since the regexes are run on every query and the number of queries containing a password is very small.

I ran some quick tests of 10,000 SELECT queries on an empty database using influx_stress and there was no noticeable difference in response times.

@jsternberg
Copy link
Contributor Author

I tried a simple test using influx_stress and got this:

# Before this change
Total Requests: 2000
        Success: 2000
        Fail: 0
Average Response Time: 76.85802ms
Points Per Second: 437187

Total Queries: 250
Average Query Response Time: 23.803755ms

# With this change
Total Requests: 2000
        Success: 2000
        Fail: 0
Average Response Time: 80.204933ms
Points Per Second: 427885

Total Queries: 250
Average Query Response Time: 26.553812ms

It seems like the average query response time gets slowed down by about 3ms. @jwilder thoughts?

@jwilder
Copy link
Contributor

jwilder commented Apr 22, 2016

LGTM 👍

@e-dard
Copy link
Contributor

e-dard commented Apr 22, 2016

Go's regex package is very slow, is increasing the mean query latency by 12% acceptable?

Sanitizing is now done through pattern matching rather than parsing the
query and replacing the password in the query. This prevents
accidentally redacting the wrong part of a query and revealing what the
password is through association.

Fixes #3883.
@jwilder
Copy link
Contributor

jwilder commented Apr 22, 2016

@e-dard I notice that much variation w/ influx_stress stats it reports all the time... even without changes to the code.

@jsternberg
Copy link
Contributor Author

I'll merge this then and we'll update sanitize if it turns out to be bad for performance.

@jsternberg jsternberg merged commit d55fb1b into master Apr 22, 2016
@jsternberg jsternberg deleted the js-3883-sanitize-query branch April 22, 2016 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants