Add extremely common word sequences? #63

Open
softwarecreations opened this issue Jul 13, 2021 · 5 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

softwarecreations commented Jul 13, 2021

TLDR:
123456 is pretty much the most common password in the world, and it has essentially no entropy because it is an obvious sequence.
zxcvbn-ts falls on its face with onetwothreefourfivesix, rating it as maximum strength.
Let's fix that?


Just an idea, not sure if this is commonly done with passwords.
But just as 123456789, 987654321, abcdefg, etc. are seen as completely lacking entropy, what about the following?

Months
januaryfebruarymarch
julyjunemay

Written numbers
onetwothree
nineeightseven

Seasons
springsummerautumn
winterspringsummer

Bible chapters
genesisexoduswhatever etc

Sizes
smallmediumlarge
largemediumsmall

Greek whatever
alphabeta etc

Phonetic alphabet
alphabravocharliedelta
tangosierraromeo

zxcvbn-ts currently rates all this sort of junk as a strong password. In some cases you need to add an extra word, but normally 3-4 words are enough and it thinks you're golden, when you've basically got no entropy if you're using any of the above.

Obviously there's an endless number of common sequences people could put into a password, like listing the characters of a popular TV series.

But I figured the categories I wrote above should be standard because, regardless of a person's preferences or personality, they'll deal with (or be familiar with) most, if not all, of the above, with the possible exception of the bible chapter names.

@softwarecreations softwarecreations changed the title Common word standard sequences Add extremely common word sequences? Jul 13, 2021
@MrWook MrWook added the enhancement New feature or request label Jul 13, 2021
Collaborator

MrWook commented Jul 22, 2021

Hey, thanks for the suggestion.
I like the idea. The problem is that this would be a combination of the dictionary matcher and the sequence matcher.
Basically, you need a dictionary of those sequences for every language. For example, like this:

{
  "numbers": [
    "one",
    "two",
    "three"
...
  ],
  "seasons": [
    "spring",
    "summer",
    "autumn",
    "winter"
  ]
...
}

This would mean you first need to use the dictionary matcher to identify all those different words, and then use some kind of sequence matcher to go through those matches and check whether they appear in a row, in list order.
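The two-step idea above can be sketched in a few lines. This is only a rough illustration, not zxcvbn-ts code: the `SEQUENCES` table, the `cyclic` flag, the `SequenceMatch` shape, and `findWordSequences` are all hypothetical names made up for this example, and the word lists are a tiny English-only sample. It does not use the zxcvbn-ts matcher API or its dictionaries.

```typescript
// Hypothetical ordered word lists; a real matcher would load these per language.
interface WordList {
  words: string[];
  cyclic: boolean; // assumption: seasons wrap around (winter -> spring), numbers do not
}

const SEQUENCES: Record<string, WordList> = {
  numbers: {
    words: ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"],
    cyclic: false,
  },
  seasons: { words: ["spring", "summer", "autumn", "winter"], cyclic: true },
};

// Hypothetical match shape, loosely modelled on "start index, end index, token".
interface SequenceMatch {
  list: string; // name of the word list that matched
  i: number; // start index in the password (inclusive)
  j: number; // end index in the password (inclusive)
  token: string; // matched (lowercased) substring
  words: string[]; // the individual words, in the order found
  ascending: boolean; // true if the words follow list order, false if reversed
}

function findWordSequences(password: string, minWords = 3): SequenceMatch[] {
  const lower = password.toLowerCase();
  const matches: SequenceMatch[] = [];

  for (const [list, { words, cyclic }] of Object.entries(SEQUENCES)) {
    for (const step of [1, -1]) {
      for (let start = 0; start < lower.length; start++) {
        for (let first = 0; first < words.length; first++) {
          // Walk the list from `first` in direction `step`,
          // consuming the password from position `start`.
          const found: string[] = [];
          let pos = start;
          let idx = first;
          while (true) {
            if (cyclic) idx = (idx + words.length) % words.length;
            const word = words[idx];
            if (word === undefined || !lower.startsWith(word, pos)) break;
            found.push(word);
            pos += word.length;
            idx += step;
          }
          if (found.length >= minWords) {
            matches.push({
              list,
              i: start,
              j: pos - 1,
              token: lower.slice(start, pos),
              words: found,
              ascending: step === 1,
            });
          }
        }
      }
    }
  }
  return matches;
}
```

Under these assumptions, `findWordSequences("onetwothreefourfivesix")` includes a `numbers` match spanning the whole password. The sketch also reports overlapping sub-runs (for example the run starting at `two`); choosing between overlapping matches, and deciding how few guesses such a run should be worth, is left open here because the thread has not settled on scoring.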

I like the general idea of this, but I don't see the solution right now. If you have an idea, feel free to open a PR or create your own package. Custom matchers have been possible since 1.0.0-beta-0, but I think it would be easier to add it to the repo so it can reuse the dictionary, and to add a custom DictionarySequence matcher.

Author

softwarecreations commented Jul 22, 2021 via email


modest commented Aug 3, 2021

I had a similar, broader idea:

Since Dropbox kicked off this project, there have been some public leaks of unhashed password lists that should be game-changing data sources for a project like this. Instead of assuming that passwords use common words in the same frequency as written text ("you, to, it, that, ..."), we can rank them based on their actual usage in passwords.

Based on actual leaked password lists, we can improve entropy scoring based on (1) the popularity of the password structure (set of patterns; e.g. (word)(number)(symbol) > (symbol)(word)(symbol)) and (2) the rank/weight of each particular pattern within those sets (e.g. onetwothreefour > correcthorsebatterystaple). That first exercise – determining the entropy of the password structure itself – was waived by the original project due to lack of data.
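As a rough sketch of point (1), structure popularity could be estimated by reducing each password in a list to its structure and counting how often each structure occurs. Everything here is an assumption for illustration: `structureOf` and `rankStructures` are hypothetical helpers, and a "word" is crudely approximated as any run of ASCII letters rather than a dictionary match. A real implementation would take its patterns from the matchers and its counts from an actual password corpus.

```typescript
type TokenClass = "word" | "number" | "symbol";

// Crude character classes: ASCII letters, ASCII digits, everything else.
function classify(ch: string): TokenClass {
  if (/[a-z]/i.test(ch)) return "word";
  if (/[0-9]/.test(ch)) return "number";
  return "symbol";
}

// Collapse a password into its structure, e.g. "password1!" -> "(word)(number)(symbol)".
function structureOf(password: string): string {
  const parts: TokenClass[] = [];
  for (let k = 0; k < password.length; k++) {
    const cls = classify(password.charAt(k));
    // Only start a new part when the character class changes.
    if (parts[parts.length - 1] !== cls) parts.push(cls);
  }
  return parts.map((p) => `(${p})`).join("");
}

// Count structures over a password list and return rank 1 for the most common one.
function rankStructures(passwords: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const pw of passwords) {
    const s = structureOf(pw);
    counts.set(s, (counts.get(s) ?? 0) + 1);
  }
  const ordered = Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
  const ranks = new Map<string, number>();
  ordered.forEach(([s], index) => ranks.set(s, index + 1));
  return ranks;
}
```

The resulting rank could then feed into the guess estimate in the same way a dictionary word's rank does today, so that a very common structure costs an attacker fewer guesses than a rare one. How exactly to weight it is an open question.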

Of course, this exercise is the same as improving the efficiency of a password cracker. But that was essentially the point of zxcvbn to begin with – to help password strength meters "catch up" to password cracking libraries.

(I understand that this fork is focused on cleanup, tech debt, and other higher priority things :) Hopefully it is flattering and not annoying that the suggestions are coming here now.)

@MrWook MrWook added the help wanted Extra attention is needed label Aug 4, 2021
Collaborator

MrWook commented Aug 4, 2021

@modest this fork isn't just a cleanup. I wanted to revive the project and the idea behind it because I think those password policies are just plain stupid.
I would love to see more matchers and contributions. That means your idea could be an extended version of the password dictionary, as a separate matcher with a new password list.
Feel free to open a separate issue for your idea, and if you have the time you can even make a PR :)

Contributor

Tostino commented Aug 13, 2021

@modest and @MrWook, I did a little bit of thinking on this today, and I agree keeping those as separate matchers (or at least different match passes) seems like the right way to go.
As said, using leaked password dictionaries and ranking by frequency is one attack vector that should have a set of scores associated with it (what we do today), and word matching by frequency is a totally different attack vector that needs to be scored an entirely different way to work properly.

I'd be interested in implementing this in Nbvcxz as well if there seems to be a consensus in how the algorithm should work, and appropriate scoring values.

4 participants