Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(httpext): remove user credentials from metric tags #1132

Merged
merged 3 commits into from
Sep 26, 2019

Conversation

imiric
Copy link
Contributor

@imiric imiric commented Aug 29, 2019

This fixes the credential leak in HTTP metric tags.

Closes #1103

@imiric imiric requested review from mstoykov, na-- and cuonglm August 29, 2019 15:47
lib/netext/httpext/request.go Outdated Show resolved Hide resolved
lib/netext/httpext/request_test.go Outdated Show resolved Hide resolved
lib/netext/httpext/request_test.go Outdated Show resolved Hide resolved
js/modules/k6/http/request_test.go Outdated Show resolved Hide resolved
js/modules/k6/http/request_test.go Outdated Show resolved Hide resolved
cuonglm
cuonglm previously approved these changes Aug 29, 2019
lib/netext/httpext/request.go Outdated Show resolved Hide resolved
@codecov-io
Copy link

codecov-io commented Aug 29, 2019

Codecov Report

Merging #1132 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1132      +/-   ##
==========================================
+ Coverage   73.59%   73.65%   +0.05%     
==========================================
  Files         145      144       -1     
  Lines       10575    10544      -31     
==========================================
- Hits         7783     7766      -17     
+ Misses       2333     2321      -12     
+ Partials      459      457       -2
Impacted Files Coverage Δ
lib/netext/httpext/request.go 97.29% <100%> (+0.25%) ⬆️
lib/netext/httpext/transport.go 96.2% <100%> (+0.04%) ⬆️
lib/timeout_error.go 0% <0%> (-75%) ⬇️
cmd/root.go 34% <0%> (-2.37%) ⬇️
stats/cloud/collector.go 70.38% <0%> (-0.63%) ⬇️
lib/netext/httpext/error_codes.go 97.05% <0%> (-0.05%) ⬇️
cmd/configdir_go113.go
lib/testutils/httpmultibin/httpmultibin.go 93.93% <0%> (+1.4%) ⬆️
js/runner.go 84.73% <0%> (+2.63%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 19c6796...ed6e0e7. Read the comment docs.

lib/netext/httpext/request.go Outdated Show resolved Hide resolved
@imiric imiric force-pushed the fix/1103-credentials-leak-metrics branch 4 times, most recently from 5bf47fe to 599112f Compare August 30, 2019 15:59
}
clean := strings.Replace(u.String(), old, new, 1)
// Replace back any literal `${}` placeholders escaped by `u.String()`
return strings.Replace(clean, `$%7B%7D`, `${}`, -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a bit of a nasty hack 😄 In general, I slightly dislike that we're having to parse the previously parsed URL here and do a bunch of shifty string operations to sanitize it. This shouldn't noticeably affect performance, even when we do it twice for each URL in the script, bit it just seems like a waste, especially given the fact that most URLs probably won't have credentials...

Why are you reluctant to mess with the NewURL() constructor https://github.com/loadimpact/k6/blob/d385166a821a8cd98a4ab2c158b3982d87ee61a5/lib/netext/httpext/request.go#L59-L62
It seems like it would be relatively easy to just add another SanitizedURL property or something like that, which can just be the same string as the URL if there's no authentication, and sanitized from the already parsed URL if there is. In my head that won't be more difficult to trace than the current implementation, so I don't think it's a case of premature optimization, but I might be wrong 😄 @mstoykov, thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am of the opinion that given the amount of strings that the above code will generate I am pretty sure it will have performance impact if we do it for ALL urls. IMHO we should at least check if we need to do anything at all before we even parse it back from string to URL ... given that we already have it as a URL.
Also given that .. I am pretty sure it would be better if we just copy the URL , change the user and password and string it back ...
And ... isn't Name supposed to stay if it's set by the user and if it's not it's the same as the URL so we don't need to do the work twice ?
As a whole I think that maybe taking a step back and working directly with the URLs will be better than what you are currently trying to do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you reluctant to mess with the NewURL() constructor

I guess I was trying to introduce the least amount of changes while I'm getting familiar, but you're both right that it would be more efficient to create a clean URL in the constructor once and avoid parsing it every time the tags are set.

I'll refactor this again, thanks for the feedback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@na-- @mstoykov Take a look at the latest commit 6ab3603. I suppose it's cleaner and more efficient, but I don't like that httpext.URL now wraps 4 representations of potentially the same URL. It's a bit messy to me, but I'm not familiar enough with the codebase to make a concrete suggestion for fixing it.

@na-- na-- added this to the v0.26.0 milestone Sep 4, 2019
@imiric imiric force-pushed the fix/1103-credentials-leak-metrics branch from 599112f to 6ab3603 Compare September 5, 2019 11:49
newURL := URL{u: u, Name: name, URL: urlString}
newURL.CleanURL = newURL.Clean()
if urlString == name {
newURL.Name = newURL.CleanURL
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handles this use case.

@imiric imiric requested review from na-- and cuonglm September 11, 2019 07:50
cuonglm
cuonglm previously approved these changes Sep 11, 2019
@mstoykov
Copy link
Contributor

mstoykov commented Sep 24, 2019

Again I did some benchmarking:

$ go test -v -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/loadimpact/k6/b
BenchmarkA-4     3000000               546 ns/op             144 B/op          7 allocs/op
BenchmarkB-4     5000000               318 ns/op              56 B/op          3 allocs/op
PASS
ok      github.com/loadimpact/k6/b      4.188s

Here B is my implementation where ... I set URL.User to USER@PASS and than let URL.String() does what it does and revert the change.
Unfortunately * gets escaped but given the performance difference and that this will be a hot path in tests with credentials ... I would argue it's worth it.
I haven't tested if this is threadsafe but it looks like it ;) though I would recommend testing it ;)
edit; update the benchmark to use parallel execution and ran with -race so atleast this parts of the code are fine ;)
More benchmark runs:

BenchmarkA-4     5000000               361 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     5000000               321 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               363 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               372 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               379 ns/op             144 B/op          7 allocs/op
BenchmarkB-4    10000000               200 ns/op              56 B/op          3 allocs/op
BenchmarkB-4     5000000               205 ns/op              56 B/op          3 allocs/op
BenchmarkB-4    10000000               175 ns/op              56 B/op          3 allocs/op
BenchmarkB-4    10000000               216 ns/op              56 B/op          3 allocs/op
BenchmarkB-4     5000000               231 ns/op              56 B/op          3 allocs/op

na--
na-- previously approved these changes Sep 24, 2019
@imiric
Copy link
Contributor Author

imiric commented Sep 24, 2019

Thanks for the benchmark @mstoykov. It does make sense to consider setting User directly, that didn't occur to me. I'll see if we can do something about "*" being replaced, otherwise it might make sense to use a generic "user:pass" instead, because of the performance gains.

@imiric
Copy link
Contributor Author

imiric commented Sep 25, 2019

@mstoykov Unfortunately, the implementation can't be that simple. We can use - instead of * for masking symbols, so that's not a problem, but if we use URL.String() directly our ${} placeholders also get escaped:

        Error:          Not equal:
                        expected: "https://----:----@example.com/${}/${}"
                        actual  : "https://----:----@example.com/$%7B%7D/$%7B%7D"
        Test:           TestURL/Clean/https://user:pass@example.com/${}/${}

And dealing with that issue also gets messy and expensive:

	if u.u != nil && u.u.User != nil {
		tmpU := *u.u
		tmpU.User = maskUserInfo
		tmpU.Path = ""
		q := ""
		if tmpU.RawQuery != "" {
			q = "?" + tmpU.RawQuery
			tmpU.RawQuery = ""
		}
		out = fmt.Sprintf("%s%s%s", tmpU.String(), u.u.Path, q)
	}

There might be a more performant way of doing the above, but on my machine it's only a marginal improvement over the original stringified approach (BenchmarkC):

BenchmarkA-12           10000000               198 ns/op             144 B/op          7 allocs/op
BenchmarkB-12           20000000                97.5 ns/op            56 B/op          3 allocs/op
BenchmarkC-12           10000000               172 ns/op             104 B/op          5 allocs/op

... and more things could go wrong in this version, since it's re-assembling the URL.

I'm not sure. Let me know if you have any suggestions.

@mstoykov
Copy link
Contributor

@imiric I published a new version which is even faster ;) :

BenchmarkA-4     5000000               348 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               443 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               464 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     3000000               551 ns/op             144 B/op          7 allocs/op
BenchmarkA-4     5000000               359 ns/op             144 B/op          7 allocs/op
BenchmarkB-4    10000000               248 ns/op              56 B/op          3 allocs/op
BenchmarkB-4     3000000               369 ns/op              56 B/op          3 allocs/op
BenchmarkB-4    10000000               327 ns/op              56 B/op          3 allocs/op
BenchmarkB-4     5000000               229 ns/op              56 B/op          3 allocs/op
BenchmarkB-4    10000000               251 ns/op              56 B/op          3 allocs/op
BenchmarkC-4    20000000                74.4 ns/op            32 B/op          1 allocs/op
BenchmarkC-4    20000000                57.8 ns/op            32 B/op          1 allocs/op
BenchmarkC-4    20000000                56.7 ns/op            32 B/op          1 allocs/op
BenchmarkC-4    30000000                61.8 ns/op            32 B/op          1 allocs/op
BenchmarkC-4    30000000                71.8 ns/op            32 B/op          1 allocs/op

@imiric
Copy link
Contributor Author

imiric commented Sep 25, 2019

@mstoykov That's great, but we can't rely on strings.Index(out, "@"), as "@" can technically be unescaped elsewhere.

I'm all for improving the performance here, but I'm not sure we can escape the penalty if credentials are defined and we want to do this safely.

@na-- @cuonglm Thoughts?

@mstoykov
Copy link
Contributor

@imiric but the first one will be the one we care about ;) (also not sure if you can have an unescaped one but am too lazy to go and read the RFC to check, but it doesn't matter) Also I happen to like my code ...more ;)

@cuonglm
Copy link
Contributor

cuonglm commented Sep 25, 2019

@mstoykov That's great, but we can't rely on strings.Index(out, "@"), as "@" can technically be unescaped elsewhere.

I'm all for improving the performance here, but I'm not sure we can escape the penalty if credentials are defined and we want to do this safely.

@na-- @cuonglm Thoughts?

I would agree with @mstoykov, strings.Index(out, "@") is good enough. We should go with the most simple and fastest. If the url input has @ escaped, then it's probably not a normal user 😄

Ivan Mirić added 2 commits September 25, 2019 17:33
This version performs ~8x faster with ~4x less memory usage than the previous version.
Thanks @mstoykov!

See #1132 (comment)
@imiric imiric dismissed stale reviews from na-- and cuonglm via ed6e0e7 September 25, 2019 15:44
@imiric imiric force-pushed the fix/1103-credentials-leak-metrics branch from 6ab3603 to ed6e0e7 Compare September 25, 2019 15:44
@imiric imiric requested review from cuonglm and na-- September 25, 2019 15:59
@imiric imiric merged commit 731e55b into master Sep 26, 2019
@imiric imiric deleted the fix/1103-credentials-leak-metrics branch September 26, 2019 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Credentials leak in HTTP metric tags
6 participants