feat: check iterator length in from_base_*e methods to prevent consuming large iters #286

Conversation
I'd consider this intentional. Leading zeros are supported by the base converter as they make mathematical sense. If you feed it an infinite string of zeros, then I'd expect it to take infinite time. That's IMO not a bug but a consequence of the inversion of control created by Rust's iterators. If you give it an iterator that panics, it will panic. If you give it an iterator that launches rockets, it will launch rockets. We can look into optimizing the leading zero case. Handling them should be O(n) time, O(1) space.
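For context, a minimal stand-alone sketch of that O(n) time, O(1) space leading-zero handling, using plain Rust with `u64` digits rather than ruint's internals (names here are illustrative only). Note that it still consumes an all-zero infinite iterator forever, since skipping zeros does not bound the iteration:

```rust
// Illustrative only (not ruint's code): skip leading zeros in O(1) space by
// advancing the iterator until the first significant digit, then hand the
// remainder to the usual accumulation loop.
fn skip_leading_zeros<I>(digits: I) -> impl Iterator<Item = u64>
where
    I: IntoIterator<Item = u64>,
{
    digits.into_iter().skip_while(|&d| d == 0)
}

fn main() {
    // "000123" in base 10: the accumulator only ever sees 1, 2, 3.
    let significant: Vec<u64> = skip_leading_zeros([0, 0, 0, 1, 2, 3]).collect();
    assert_eq!(significant, vec![1, 2, 3]);
}
```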
It makes sense from a mathematical perspective. On the other hand, we can infer that it is never a sane user's intention to trigger this behavior. We also know that after the maximum digit count is reached, any further non-zero digit is guaranteed to cause an overflow. It seems like we can/should use this knowledge to prevent or handle invalid use of the API.
There are sane use cases where this would be useful, as it's pretty common to store numbers with an excessive amount of leading zeros as a way of padding. For example, Solidity's ABI likes to store a U160 padded up to U256. It would be nice if we could parse the U160 directly from the whole hex string and have it handle overflow correctly. The alternative would be an intermediate U256 and try_into, or messing with substrings and forgetting to check that the lead is indeed zero. I'm proposing an alternate implementation of from_base_*e.
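To make that workaround concrete with standard-library stand-ins only (here `u128` plays the role of U256 and `u64` the role of U160; this is not ruint code): parse the padded hex into the wider type, then narrow it with an overflow check.

```rust
// Standard-library stand-ins only: u128 for U256, u64 for U160.
// Today's workaround: parse the zero-padded hex into the wider type, then
// narrow it with an overflow check.
fn parse_padded_hex(s: &str) -> Result<u64, &'static str> {
    let wide = u128::from_str_radix(s, 16).map_err(|_| "invalid hex")?;
    u64::try_from(wide).map_err(|_| "does not fit in the narrow type")
}

fn main() {
    // 16 meaningful hex digits padded with 16 leading zeros.
    assert_eq!(
        parse_padded_hex("0000000000000000ffffffffffffffff"),
        Ok(u64::MAX)
    );
    // A non-zero digit in the padding is correctly reported as overflow.
    assert!(parse_padded_hex("0000000000000001ffffffffffffffff").is_err());
}
```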
The sane use cases are bounded at (generously) 100 extra digits, and there's no sane use case for infinitely looping. It's never good behavior for a library to hang indefinitely on valid input. Maybe this is a place to use
Can you give me an example of a well-written Rust library that puts similar bounds on user-provided iterators? For example:

```rust
fn main() {
    println!("{:?}", core::iter::repeat(0_u64).min());
}
```

And here it doesn't even need to hang, because it could short-circuit as soon as it sees 0. Generally I don't consider this infinite looping in the library. This is entirely on the user of the API, and the expectation should be that iterators will be consumed in full unless noted otherwise. This is not Haskell with lazy execution. I do appreciate making APIs idiot-proof, but would not go as far as to limit power users in what they can do.
Closing as there's no pressing need and I don't have a good plan anyway :)
Draft PR for discussion purposes. `max_digits` is known to be bugged right now; not a high priority to fix unless we're confident this PR will go in.

Motivation
Closes #279.

Currently both `from_base_be` and `from_base_le` accept more digits than can fit in the target `Uint`. This leads to extra iteration/recursion. If all extra digits are 0, there is unbounded extra iteration/recursion, which results in degenerate cases such as hanging indefinitely on an unbounded iterator of zeros.
Solution

Compute `max_digits` and bound the `from_base_*e` logic to operating on that many digits. Overflow if more digits are supplied.

However, this is a breaking change, as `from_base_be` is used in `from_str_radix` and `from_str`, and currently permits non-minimal string encodings.
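A rough sketch of this shape, with `u64` standing in for `Uint` and helper names invented for illustration (this is not the PR's actual diff): first compute the largest digit count the target could ever need, then error as soon as the iterator supplies more.

```rust
// Sketch only: u64 stands in for Uint, and the helper names are invented
// for illustration; this is not the PR's actual code.

/// Smallest digit count d such that every u64 value fits in d base-`base` digits.
fn max_digits_u64(base: u64) -> usize {
    assert!(base >= 2, "base must be at least 2");
    let mut digits = 1;
    let mut power = base; // base^digits, while it stays representable
    while let Some(next) = power.checked_mul(base) {
        digits += 1;
        power = next;
    }
    // Here base^digits <= u64::MAX < base^(digits + 1), so any u64 value
    // has at most digits + 1 base-`base` digits.
    digits + 1
}

fn from_base_be_bounded<I>(base: u64, digits: I) -> Result<u64, &'static str>
where
    I: IntoIterator<Item = u64>,
{
    let limit = max_digits_u64(base);
    let mut value: u64 = 0;
    for (i, digit) in digits.into_iter().enumerate() {
        if digit >= base {
            return Err("digit out of range for base");
        }
        // Bail out once more digits than the target could possibly hold have
        // been supplied, instead of spinning on an endless run of zeros.
        // (This is the breaking part: zero-padded inputs now error too.)
        if i >= limit {
            return Err("too many digits for target type");
        }
        value = value
            .checked_mul(base)
            .and_then(|v| v.checked_add(digit))
            .ok_or("digits overflow target type")?;
    }
    Ok(value)
}

fn main() {
    // Plain overflow checking still applies within the digit budget.
    assert_eq!(from_base_be_bounded(16, [0, 0, 15, 15]), Ok(255));
    // Previously this would spin forever; now it errors after `limit` digits.
    assert!(from_base_be_bounded(16, core::iter::repeat(0)).is_err());
}
```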
Open Question

Can we both bound the digit count and keep accepting zero-padded (non-minimal) encodings?
Possible solutions:

- Check `size_hint` on the iterators, and error if no upper bound is established (see the sketch below)
- Permit `N` extra 0s in addition to the meaningful digits, and error if more than `N`

These would still be breaking, but only in much rarer cases, and we would probably be comfortable doing that as a patch release.
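For the `size_hint` option, a minimal illustration of the check (not ruint's API; note that `size_hint` is advisory, so a missing upper bound means "unknown or unbounded" rather than strictly "infinite"):

```rust
// Illustrative check for the size_hint option; not ruint's API. An upper
// bound of None is reported by iterators like core::iter::repeat, but it can
// also just mean "unknown", so this is a coarse filter, not a guarantee.
fn has_finite_upper_bound<I: Iterator>(iter: &I) -> bool {
    iter.size_hint().1.is_some()
}

fn main() {
    // Arrays report an exact upper bound.
    assert!(has_finite_upper_bound(&[1u64, 2, 3].into_iter()));
    // core::iter::repeat reports no upper bound at all.
    assert!(!has_finite_upper_bound(&core::iter::repeat(0u64)));
}
```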
PR Checklist