Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use concurrency to check mangled passwords #8

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

rafasc
Copy link
Contributor

@rafasc rafasc commented May 5, 2020

This is part of an attempt to improve crunchy's performance.

This PR focuses on improving the mangled password checks by using concurrency to speed up the search.

$ benchstat before after 
name                   old time/op  new time/op  delta
ValidatePassword-8      41.3s ± 1%   29.7s ± 1%  -28.12%  (p=0.000 n=8+8)
FoundInDictionaries-8   11.8s ± 1%    4.4s ± 1%  -62.24%  (p=0.000 n=8+8)

Hashing will be next

@coveralls
Copy link

Coverage Status

Coverage increased (+0.3%) to 93.902% when pulling 10c7756 on rafasc:ra/goroutine-mangled-check into 98c5f9f on muesli:master.

pws = []struct {
pw string
expected error
rating uint
}{
// include values from pass in tests
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of including these values here (again)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a reference to a password known to be valid is useful for benchmarks because they never exit earlier and need to go though every check we perform.

We could search the pws slice for a valid password when we need one, or make a convention that the first password of the slice is valid. I thought being explicit would make things more clear. I.e. pass.valid vs pws[0].pw

The values of the pass struct are then inserted into the pws slice to make sure they remain valid. If a future change caused pass.valid to be invalid, tests would fail, indicating that benchmark numbers are not trustworthy.

if dist := smetrics.WagnerFischer(word, revpw, 1, 1, 1); dist <= v.options.MinDist {
return &DictionaryError{ErrMangledDictionary, word, dist}
select {
case queue <- struct{}{}:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand the purpose of this queue. Couldn't we just keep launching a couple of worker threads in a pool and feed them with all the words from the wordlist? I think it may turn out a bit more readable. Let me know if you already tried that but this queue simply outperforms it 😃

Copy link
Contributor Author

@rafasc rafasc May 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried some approaches, e.g. using x/sync/errgroup, and this was the simplest way I found to make it concurrency bounded with the ability to cancel on the first error found.

The queue is just a semaphore that keeps us from spawning more than njobs goroutines simultaneously.

rafasc added 4 commits May 16, 2020 20:45
The benchmark should run with the same input each time in order to
obtain a meaningful average.

Let's also benchmark the cases where hash verification is enabled and
use a valid password as input since they cover more checks.

CheckHIBP is intentionally left out from this benchmark as network
factors could fudge the results.
Indexing only happens once per validator. Let's benchmark indexing
independently so we can ignore its duration in other benchmarks.
In a future commit we will improve this function by converting into go
routines.

Introducing the benchmark now is convenient to provide a baseline for
comparison.
@rafasc rafasc force-pushed the ra/goroutine-mangled-check branch from 10c7756 to f5855f9 Compare May 16, 2020 19:48
@muesli
Copy link
Owner

muesli commented Oct 21, 2020

Sorry @rafasc, I have never seen any notification that you responded to my comments 😞 ...and so I completely missed your latest changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants