-
-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use concurrency to check mangled passwords #8
base: master
Are you sure you want to change the base?
Conversation
pws = []struct { | ||
pw string | ||
expected error | ||
rating uint | ||
}{ | ||
// include values from pass in tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of including these values here (again)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a reference to a password known to be valid is useful for benchmarks because they never exit earlier and need to go though every check we perform.
We could search the pws
slice for a valid password when we need one, or make a convention that the first password of the slice is valid. I thought being explicit would make things more clear. I.e. pass.valid vs pws[0].pw
The values of the pass
struct are then inserted into the pws
slice to make sure they remain valid. If a future change caused pass.valid
to be invalid, tests would fail, indicating that benchmark numbers are not trustworthy.
if dist := smetrics.WagnerFischer(word, revpw, 1, 1, 1); dist <= v.options.MinDist { | ||
return &DictionaryError{ErrMangledDictionary, word, dist} | ||
select { | ||
case queue <- struct{}{}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I fully understand the purpose of this queue. Couldn't we just keep launching a couple of worker threads in a pool and feed them with all the words from the wordlist? I think it may turn out a bit more readable. Let me know if you already tried that but this queue simply outperforms it 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried some approaches, e.g. using x/sync/errgroup
, and this was the simplest way I found to make it concurrency bounded with the ability to cancel on the first error found.
The queue
is just a semaphore that keeps us from spawning more than njobs
goroutines simultaneously.
The benchmark should run with the same input each time in order to obtain a meaningful average. Let's also benchmark the cases where hash verification is enabled and use a valid password as input since they cover more checks. CheckHIBP is intentionally left out from this benchmark as network factors could fudge the results.
Indexing only happens once per validator. Let's benchmark indexing independently so we can ignore its duration in other benchmarks.
In a future commit we will improve this function by converting into go routines. Introducing the benchmark now is convenient to provide a baseline for comparison.
10c7756
to
f5855f9
Compare
Sorry @rafasc, I have never seen any notification that you responded to my comments 😞 ...and so I completely missed your latest changes. |
This is part of an attempt to improve crunchy's performance.
This PR focuses on improving the mangled password checks by using concurrency to speed up the search.
Hashing will be next