Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove bias when generating secrets #92

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

pepve
Copy link

@pepve pepve commented Sep 7, 2017

Hi, thanks so much for this project! It's really helping me out. I looked over the code and found something that might be improved.

The way secrets were generated not all outcomes were equally likely. Here's wat happened:

  • A number of random bytes were generated.
  • Each byte was mapped to one of 83 "ascii" values (or 61 without symbols).
  • (Somehow the lowercase letter "j" was left out, or it would have been 84 and 62)
  • Mapping was done with floor(byte-value / 255 * (set-length - 1)), this caused heavy bias against the last item in the set. It should be floor(byte-value / 256 * set-length). Even then there's some bias in the mapping, because the bits just don't fit, set indices 0, 21, 42, and 63 are 33.3% more likely to occur.
  • That string was then encoded as hex or base32, but since only 83 (or 61) byte values were possible for each position in the source string, out of every 10 possible hex/base32 strings, only about three would be generated. That's a pretty heavy bias.

So now I do this:

  • Generate a number of random bytes,
  • For the hex and base32 outputs, just encode those bytes as hex and base32.
  • For the ascii output, use the (slightly improved) mapping function.
  • Now hex and base32 are purely unbiased, and ascii still has some bias in it, so I think it should be deprecated, or moved to base64.
  • One caveat, you can no longer save the ascii output on one occassion, translate it to base32 or hex yourself, and reuse it at a later occassion. Because now the base32 and hex outputs just have more random bits stuffed in them, which are discarded for the ascii output.

I hope this is all clear! Thanks in advance for looking at this.

@coveralls
Copy link

coveralls commented Sep 7, 2017

Coverage Status

Coverage remained the same at 100.0% when pulling 9aee69d on pepve:unbiased-secrets into e63b936 on speakeasyjs:master.

@jakelee8
Copy link
Collaborator

@markbao How likely will this change break downstream code?

@pepve
Copy link
Author

pepve commented Sep 11, 2017

I can't speak to closed source uses of Speakeasy, but I could only find one usage with possibly breaking code, in an unforked, unstarred, unwatched repository: https://github.com/pefish/common-express/blob/f3e44569b8eeb6430b7c7856a71ef213050056e4/utils/GoogleAuthenticator.js. I used this search: https://github.com/search?utf8=%E2%9C%93&q=speakeasy+generateSecret+ascii&type=Code.

@Panoplos
Copy link

Any reason this isn't merged, yet?

@awwong1
Copy link

awwong1 commented May 10, 2018

@Panoplos perhaps this code is no longer maintained?
@markbao

index.js Outdated
};

function encodeASCII (bytes, symbols) {
var set = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXTZabcdefghijklmnopqrstuvwxyz';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T appears twice. This is almost certainly a mistake (that existed before your PR). Replace second T with Y.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added the j. :shipit: 💯

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I replaced the second T with a Y!

@chrisdl
Copy link

chrisdl commented Jul 12, 2018

on an unrelated note, I stumbled across this myself earlier today and was super upset. Good job @pepve finding and fixing this!

index.js Outdated
};

function encodeASCII (bytes, symbols) {
var set = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXTZabcdefghijklmnopqrstuvwxyz';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mlogan
Copy link

mlogan commented May 8, 2020

@pepve FYI I merged this code into https://github.com/mlogan/squeakeasy and published the fork as a new npm module called squeakeasy, since this one appears to be truly abandoned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants