Decrease limit for KDF execution time in test #781
Conversation
That's interesting. This test checks that the KDF we use inside Themis for passphrase-based encryption and decryption is slow enough. A KDF's security correlates directly with its number of rounds, and therefore with processing time, which is platform-dependent.
If this test fails for you, it means that the KDF is computed "too fast", meaning we could add rounds to make it slower and more secure.
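For readers unfamiliar with the test in question, here is a minimal sketch of what such a timing assertion looks like, assuming the wasm-themis passphrase API (`SecureCellSeal.withPassphrase`); the 200 ms threshold and test wording are illustrative, not the actual test code:

```js
const assert = require('assert');
const themis = require('wasm-themis');
// (wasm-themis requires awaiting its initialization promise before use,
// omitted here for brevity.)

it('KDF makes encryption slow enough', function () {
    const cell = themis.SecureCellSeal.withPassphrase('secret passphrase');
    const message = new Uint8Array([1, 2, 3, 4]);

    const start = Date.now();
    cell.encrypt(message);
    const elapsed = Date.now() - start;

    // The assertion is inverted compared to usual performance tests:
    // the test fails if the passphrase KDF completes *too fast*.
    assert(elapsed >= 200, `KDF took only ${elapsed} ms`);
});
```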
(Funny aside: sometimes this test fails with a timeout error on GitHub Actions because the KDF took longer to compute than the test was willing to wait.)
I don't think that we'll change the KDF's number of rounds anytime soon.
But out of curiosity, can you describe your laptop? Is it a new Apple MacBook M1?
Hm... No other platform seems to do timing tests on KDF usage. But indeed, it is platform-dependent. I don't really remember a hard motivation for adding this test specifically. I'm not against removing it altogether if it causes issues: flaky tests are one of the worst kinds of tests. My vote is to drop the test altogether, with the next preference being to merge this PR if removing the test is unacceptable. One reason I can envision for a test like this is to serve as an indicator for when the KDF iteration count should be increased. Now that I've gotten around to git blame, here's what it says:
So... uh... technical progress seems to have happened. Taking benchmarks on CI is arguably a methodologically wrong approach: CI runners have unstable performance, so any absolute numbers you derive by executing stuff there are mostly moot. If anything, this benchmark should probably ensure that the KDF path is slower than the non-KDF path by a factor of N, or something like that. A relative benchmark would be more robust.

But again, the test does not seem to serve a useful purpose. Themis does not have a policy on bumping the KDF iteration count, so I don't really know what we're measuring here. For now, the iteration count is managed on an arbitrary basis. And if we're updating this parameter basically whenever we feel like it, then I guess we shouldn't be lying to ourselves by trying to make it look like a scientific decision with this test.

If we are going to be serious about it, we should have proper benchmarks for the Wasm runtime and base the decision to bump the KDF iteration count on that. There are benchmarks for native code; I guess they could be compiled to Wasm with some elbow grease. However, that's certainly out of scope for this PR, along with defining a policy on KDF parameters.
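To make the relative-benchmark idea above concrete, here is a hedged sketch; the API names assume wasm-themis (`SecureCellSeal.withKey`, `SymmetricKey`), and the factor of 10 is an arbitrary placeholder for N:

```js
const assert = require('assert');
const themis = require('wasm-themis');

// Average wall-clock time of `fn` over several runs, in milliseconds.
function averageMs(fn, runs = 20) {
    const start = Date.now();
    for (let i = 0; i < runs; i++) {
        fn();
    }
    return (Date.now() - start) / runs;
}

it('passphrase KDF is slower than raw key usage by a factor of N', function () {
    const message = new Uint8Array(64);
    const kdfCell = themis.SecureCellSeal.withPassphrase('secret passphrase');
    const keyCell = themis.SecureCellSeal.withKey(new themis.SymmetricKey());

    const kdfTime = averageMs(() => kdfCell.encrypt(message));
    const keyTime = averageMs(() => keyCell.encrypt(message));

    // Relative comparison: robust on slow and fast machines alike.
    assert(kdfTime > 10 * keyTime,
        `expected KDF path to dominate: ${kdfTime} ms vs ${keyTime} ms`);
});
```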
P.S. Sure, it's silvery and rectangular. Also, I call preconceived notions and desktop-user bias on you! /s
That's why I made the effort to open this issue :) I share your view!
Just a not-so-old ThinkPad T490 clocking up to 4.6 GHz. So it is faster than a CI runner would be.
I agree that this is the wrong place for timing tests; in general, tests which involve timing should not exist. Is there already infrastructure for benchmarking? I think benchmarking should be integrated into the whole develop-test-release cycle if it is to have a meaningful impact.
If we do not have a replacement for measuring KDF duration, then I would say we keep this test as a future reminder that we need a replacement. If we remove it, we lose the possibility to use git blame on a failing test ;)
There are some tools for benchmarking Themis Core code in
So it exists but not in a form that can be immediately used for WebAssembly. My current mental model is
Not something that one can immediately act upon :) And then there are various ways to run WebAssembly: Node, the Web, WASI...
We should not remove the test that the KDF is slow enough, because that is expected behavior. But we can parameterize our tests and allow overriding the values for different test environments. We can take the value from an environment variable or config, with a default value (for our environment and backward compatibility).
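One possible shape for that parameterization, as a sketch; the variable name `THEMIS_KDF_TEST_MS` is made up for illustration:

```js
const assert = require('assert');
const themis = require('wasm-themis');

// Hypothetical environment variable; the default preserves current behavior.
const KDF_TIME_LIMIT_MS = parseInt(process.env.THEMIS_KDF_TEST_MS || '200', 10);

it('KDF is slow enough', function () {
    const cell = themis.SecureCellSeal.withPassphrase('secret passphrase');
    const start = Date.now();
    cell.encrypt(new Uint8Array([1, 2, 3, 4]));
    const elapsed = Date.now() - start;
    assert(elapsed >= KDF_TIME_LIMIT_MS, `KDF took only ${elapsed} ms`);
});
```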
The problem is that it is quite impossible to choose a value, as this is hardware-dependent. Timing in tests is always a bad idea: I just randomly halved the value of 200 ms to get the test working. We can use a test runner for benchmarking, though. Like if you run ... What do you think about splitting the current tests into two test suites? (See the sketch below.)
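For what the split might look like with mocha (which the wasm-themis tests appear to use): tag the timing suite and filter it with `--grep`. The `@benchmark` tag below is made up for illustration:

```js
describe('SecureCell', function () {
    // Correctness tests: always run, including on CI.
    it('decrypts what it encrypted', function () { /* ... */ });
});

describe('SecureCell benchmarks @benchmark', function () {
    // Timing-sensitive tests: opt-in, skipped on CI.
    it('KDF is slow enough', function () { /* ... */ });
});
```

CI could then run `mocha --grep @benchmark --invert` to exclude the timing suite, while developers run `mocha --grep @benchmark` to execute it explicitly.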
Correct but insecure :) I agree with @Lagovas: I think we should parameterize this test and use the default value for tests on CI, while preserving the ability to override the default for testing in other environments. When this test starts failing everywhere, we should think about increasing the KDF rounds count across Themis. Right now we use PBKDF2 with 200,000 rounds, while NIST recommends 10,000 to 100,000. For some tests (like fuzzing) we decreased the number of rounds to 10. Also, a note: right now we're discussing this test under the assumption that it takes too little time because of the KDF rounds count, while the real reason might be different.
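As a side illustration of the rounds/latency relationship being discussed (using Node's built-in PBKDF2, not Themis itself), latency grows roughly linearly with the iteration count:

```js
const crypto = require('crypto');

for (const iterations of [10, 10000, 100000, 200000]) {
    const start = process.hrtime.bigint();
    crypto.pbkdf2Sync('passphrase', 'salt', iterations, 32, 'sha256');
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${iterations} iterations: ${ms.toFixed(1)} ms`);
}
```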
I'd like to agree on this with @maxammann:
The absolute time it takes to process data depends on hardware and workload. Give me a time limit and I can tweak CPU allocation to the process so that it exceeds the limit. Give me the code and I can game the time limit so that the computation stays within it. Talking about benchmarks without taking the environment into consideration is fairly meaningless.

The only reason for 200 ms being there is that it's the magic number that makes CI pass in whatever environment is there. If we take that as the baseline (i.e., Themis is fine if CI is green), then there is no need to change this number. For one, changing that number (the lower time bound for a test, adjusted in this PR) does not make Secure Cell more or less secure.

If anything, we should start questioning whether there is a need to change a different number, the iteration count, because apparently Secure Cell is not slow enough for at least one person's machine out there. But for that we first need to define how much is "enough". If an attacker were to compute PBKDF2 on an ASIC or GPU, then even our current choice of iteration count might turn out to be woefully inadequate (as is probably the choice of PBKDF2 in the first place, if you include ASICs in the threat model). But bumping the default iteration count to 10,000,000 would probably make users more angry than grateful.

If we're going to have this discussion, then I think it should be moved somewhere else. GitHub has this new Discussions thingie. Maybe give it a try?
Now, addressing the feedback...
Which of those will be running on CI? I guess only the "tests" (the ones which verify correctness) would be run on CI, as part of every PR, while the "benchmarks" might be run by default when a developer runs the test suite, or could be completely opt-in. (Proper benchmarks could take a non-negligible amount of time to collect data.) @vixentael and @Lagovas,
If you ask me, it's already parameterized enough: there's a number in the tests that you can adjust if the test fails for you. Exporting it into an environment variable does not really make things easier for a developer who is running those tests locally. It's not something that you'd need to tweak between test runs.
I'd like to point out that WasmThemis is probably the only environment where such a test exists. I don't remember anything like it elsewhere.
Regarding the essence of this PR,
- I find that CI runs fine with this change applied.
- I find that CI runs fine without this change applied.
- I find that for CI this change does not bring much benefit.
- I find that for CI this change does not cause much harm.
- I find that this change does not entail much maintenance burden.
- I find that this change makes life better for at least one person by making the test suite fail less often for them.
So my disposition here is to merge.
As for the other questions raised here, I'd like to discuss them elsewhere, not in some random PR.
You're all welcome to raise more 👍
Executing `emmake make test_wasm` failed for me locally with emsdk 2.0.13. This PR should be more of a small discussion about whether this test really makes sense if it is platform-dependent.