Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bn.powermod is painfully slow #172

Open
ItalyPaleAle opened this issue Apr 30, 2014 · 26 comments
Open

bn.powermod is painfully slow #172

ItalyPaleAle opened this issue Apr 30, 2014 · 26 comments

Comments

@ItalyPaleAle
Copy link

We are implementing a full SRP system for browser-based authentication.

On the server side, we are using "mozilla/node-srp", which uses "justmoon/node-bignum" for dealing with big integers. Speed is excellent, but that is not surprising given that node-bignum uses native code.

On the client side, however, we wanted to use SJCL, as we are already using this library for other tasks in the same project.
Turns out that SJCL has already some methods implemented for SRP, and we were writing the missing ones.

In particular, we had problems with calculating A = g^a. This is the code:

var group = sjcl.keyexchange.srp.knownGroup(2048)
// Get random number a
var secret1 = sjcl.random.randomWords(8, 10) // 32 bytes key
// Compute srpA = g^a
var x = sjcl.bn.fromBits(secret1)
var srpA = group.g.powermod(x, group.N)

The call to powermod above takes ~7s on Chrome 34, on OSX Mavericks on a 2010 MacBook Air, 2GHz Core i7. In certain cases, the web page inside Chrome crashes, forcing me to reload it.
On Safari, same machine, the browser hangs completely.

Changing the group from 2048 to 1024 improves the situation very slightly (~5s), but not enough.

Is there anything that could be done to improve performances of the big integer part of SJCL?

@ItalyPaleAle
Copy link
Author

Apparently, it's the whole Big Number implementation of SJCL that's painfully slow. Here's a JSPerf: http://jsperf.com/jsbnvsleemonvssjclbiginteger/5

@Bren2010
Copy link
Contributor

Can I ask what your use-case is...?

If you're using browser-based crypto, the connection should be over TLS. However, if don't want the user's password to be on or sent to the server at all ever, you could hash their password before it's sent. Hashing is certainly fast.

@ItalyPaleAle
Copy link
Author

Thank you for your interest.
We are developing a solution that implements end-to-end cryptography. In this case, it is unnecessary and potentially dangerous for the server to know the password (which is also the decryption key). For this reason, we were looking at SRP.
Of course, we could always hash the password on the client and then re-hash it in the server before storing in the database (using SHA-256 on the client and salted PBKDF2 on the server; on the client would need to use SHA-2 because it doesn't have the salt).
We feel that SRP could be safer, however, as it's designed to be completely "zero knowledge".
(Aside from this, we are still using TLS everywhere on the web application)

Anyway, whatever the use case is, I believe SJCL needs to improve a lot on the Big Integer side.
As you can see from the JSPerf above, on my system JSBN achieves 122 ops/s with the test code (testing different operations with big integers), Leemon around 32 and SJCL 0.71 (0.6% of the best).
(There are some differences among engines. For example, in the "powmod" method we found that Leemon performs better than JSBN with IE. In all cases, however, SJCL achieves insanely bad performances)

We altered the test code above to use JSBN to do the powermod function (and only for that), and the whole system is now really smooth (feels like instantaneous). This is compared with SJCL that took approximately 5-7s.
Of course, having to ship two different libraries is not ideal for us, but if we want to use SRP we have no other options.

@Bren2010
Copy link
Contributor

What I've done before is, when a client is encrypting data for themselves that they just want the server to store, they provide a hash of their password as authentication and use their raw password as the key for sjcl.encrypt. If you consider a hash a random oracle, then the server learns nothing about the user's password.

Would that work instead?

@ItalyPaleAle
Copy link
Author

Yes, that is more or less what I described above (maybe with too many words :) ), with some changes.

Before sending the password, the client hashes it with SHA-256.
The hash is sent over TLS to the server.
The server hashes again the password with PBKDF2 (with a salt unique for each user) and compares this hash with the one stored in the database.

The notes are:

  1. We need to hash the password twice. If we only hashed the password on the client and the server stored it as is, then the hash becomes somehow a "clear password". Meaning: an attacker who stole the hash could successfully start a session and access users' profiles and some non-encrypted data. The same attacker would not be able to see the encrypted data (as he stole the hash and not the password), but he would still obtain some (less-critical) data about the user.
  2. On the client, we need to use SHA-256 to hash the password. Indeed, if we used salted algorithms such as PBKDF2, the client authenticating would need to make a "pre-request" to obtain his/her own salt. (An alternative could be using PBKDF2 on the client with the same salt for all users, but that would be less effective).

Indeed, I suppose this is what we are going to do, given that SRP is not implementable with SJCL (and bundling another big integer library would be an overkill).

However, aside from what we are going to do, I still feel that SJCL needs to improve its BN component. Competitors are doing much better, and I'm sure there's plenty of room for improvement.

@Nilos
Copy link
Collaborator

Nilos commented Sep 14, 2014

Actually only powermod is really slow.
Everything else is only half the speed of jsbn (and double the speed of Leemon Baird)
See http://jsperf.com/jsbnvsleemonvssjclbiginteger/11

@heavensrevenge
Copy link

Glad you noticed my jsperf test showing only powermod was the problem @Nilos :)

@Nilos
Copy link
Collaborator

Nilos commented Oct 4, 2014

As far as I remember sjcl's biginteger library was optimised for elliptic curve operations and as such is slow for powermod.
Anyhow, if there is an obvious speed improvement for the powermod function, I would be happy to merge (or even code it).
It would be also awesome to merge back your srp so we can provide a full srp system to sjcl-users.

@ItalyPaleAle
Copy link
Author

Nilos, eventually we had to use a different library for SRP. I'm afraid SRP will not be usable with SJCL until the speed of powermod is increased by a big factor. I'm not being overly dramatic: powermod was crashing Chrome.

@danielstockton
Copy link

@EgoAleSum What library did you use in the end?

@ItalyPaleAle
Copy link
Author

JSBN for this part only

@Bren2010
Copy link
Contributor

@EgoAleSum Does pull request #208 give you times on par with what you're looking for?

@heavensrevenge
Copy link

@Bren2010 The new code is ~30x faster but it still doesn't seem to be on par with the others (~6.5x slower)
http://jsperf.com/jsbn-powermod is where I tested your code to compare and http://jsperf.com/jsbnvsleemonvssjclbiginteger/13 is my origional.

@ItalyPaleAle
Copy link
Author

Thank you @heavensrevenge for the JSPerf!
I've added http://jsperf.com/jsbn-powermod/2 with odd generators (see #208), and there powermod() (non-Montgomery) achieves basically the same speed as the montgomery one.

@Bren2010 the patch is terrific and the speed improvement is incredible. It's still far behind but seems a bit more usable now.

@Bren2010
Copy link
Contributor

@EgoAleSum powermod() is seeing that the modulus is odd and calling montpowermod() instead of continuing. That's the reason the benchmarks are the same. The parity of the generator shouldn't matter...

@heavensrevenge @EgoAleSum I did some cleaning and implemented sliding window exponentiation. It's still not as fast as JSBN, but it's better than it was before. (I think it's as fast as I can get it at the moment.)

Thank you both!!

@heavensrevenge
Copy link

not bad @Bren2010, your latest changes double the speed of the original code.
If need be you could always take example or tricks from JSBN but very good work thus far.

@ItalyPaleAle
Copy link
Author

Confirm that. From 0.58 ops/s on my laptop to ~25 ops/second. Still lags behind JSBN (135), but it's already 2 times or so Laemon Baird. It's certainly usable now!

Great work @Bren2010 , and it's improved even more!

@danielstockton
Copy link

👍 nice work

@heavensrevenge
Copy link

@federicobond
Copy link
Contributor

This is an old thread, but I wanted to chime in because I recently had to implement a system similar to the one described by @Bren2010 and @EgoAleSum, where the same password is used for both authentication and client-side encryption via sjcl.encrypt. It is certainly not optimal, but it provides a lot of convenience for end users.

While they are both right that hashing only client-side would effectively make the system equivalent to one that stores passwords in the clear, the solution they propose (hashing the password and sending that to the server) ignores the reason why we use key derivation functions like PBKDF2 instead of simple hashes. On modern hardware, it is not uncommon to be able to perform millions or even billions of hashes per second, so using the SHA-256 of the password for authentication is not much different in terms of security that using the raw password. The service provider could easily break plenty of passwords and access the encrypted payloads. The point of key derivation functions is that they are slow, and they must be because the password space is usually quite reduced, compared to hashing whole documents for example.

The proper solution is to also perform key derivation client-side on the authentication password (obviously with a different salt than the one used in the sjcl.encrypt payload). An alternative to obtaining the user salt by the server or using a fixed one is deterministically generating it using something else we know about the user. The only other piece of data we have available at every authentication endpoint is the username, so we can hash it to obtain the user salt. Salts do not need to be private, but they need to be different for every user (so that attackers cannot crack all passwords at the same time), so there is nothing wrong with generating them deterministically.

@ItalyPaleAle
Copy link
Author

@federicobond this issue was originally opened because we were trying to implement a system based on SRP for authentication, which completely removes the need to send the password to the server: http://srp.stanford.edu/ndss.html

@federicobond
Copy link
Contributor

Yes, I agree that SRP is probably a better solution. I left the comment to warn people of the pitfall I mentioned when implementing systems which share authentication and encryption passwords.

@Nilos
Copy link
Collaborator

Nilos commented May 30, 2016

From my point of view the most desirable system is one that combines SRP and a "key" derivation function because even though an SRP system does not store password equivalent data, one can still perform a brute force attack with reasonable speed on the data, whereas every "key" derivation function has a hardening factor, and one guess on a derivation function takes a long (parameterized) time compared to one guess against an SRP based system.

Combining SRP and derivation should be easy just do (pseudocode!):
password2 = derive(password, salt)
srp(password2)

@Nilos
Copy link
Collaborator

Nilos commented May 30, 2016

Still a fast bn.powermod would be desirable!

@ItalyPaleAle
Copy link
Author

@Nilos not just desirable, but if you consider the results from my test above (crashing a browser) is necessary. Otherwise SRP simply can't be done

@Nilos
Copy link
Collaborator

Nilos commented May 30, 2016

@EgoAleSum I am always happy to merge a PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants