
Decoding BigUInt causes unbound allocation #839

Open
ggwpez opened this issue Jun 19, 2024 · 3 comments
ggwpez commented Jun 19, 2024

Decoding a BigUint internally decodes a Vec<u8> here:

```rust
Ok(BigUint::from_bytes_le(&Vec::<u8>::deserialize_with_mode(
```

which in turn calls Vec::with_capacity with the decoded length:

```rust
let mut values = Vec::with_capacity(len);
```

This can cause a panic if that value is unreasonably large.

Possible test vectors:

```rust
#[test]
fn test_biguint_deserialize_bad_data1() {
    let data = [0xe9, 0xc7, 0x55, 0x82, 0x52, 0x38, 0xc1, 0x09];
    // Panics: memory allocation of 702904943871641577 bytes failed
    let _ = BigUint::deserialize_compressed(&data[..]);
}

#[test]
fn test_biguint_deserialize_bad_data2() {
    let data = [0xe9, 0xc7, 0x55, 0x82, 0x52, 0x38, 0xc1, 0x90];
    // Panics: capacity overflow
    let _ = BigUint::deserialize_compressed(&data[..]);
}
```
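For context (my addition, not part of the original report): the decoder reads the first 8 bytes of the input as a little-endian u64 length prefix before allocating, which explains both panic messages. A minimal standalone sketch, assuming that wire layout:

```rust
fn main() {
    // First vector: the length prefix decodes to ~7.0e17, so
    // Vec::with_capacity asks the allocator for that many bytes and aborts.
    let len1 = u64::from_le_bytes([0xe9, 0xc7, 0x55, 0x82, 0x52, 0x38, 0xc1, 0x09]);
    assert_eq!(len1, 702_904_943_871_641_577);

    // Second vector: the prefix exceeds isize::MAX (on 64-bit targets),
    // which Vec reports as "capacity overflow" before even calling the allocator.
    let len2 = u64::from_le_bytes([0xe9, 0xc7, 0x55, 0x82, 0x52, 0x38, 0xc1, 0x90]);
    assert!(len2 > isize::MAX as u64);
}
```

Note that the first value matches the byte count in the panic message exactly.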

Honestly, I am not sure this is even efficiently fixable. I have the same issue in another generic decoding library, where we are currently trying to add a special DecodeWithMemLimit mechanism that limits the total allocations of a decode call.
The trivial approach of just checking len does not work, since types can be recursive: nested vectors will inevitably overflow any artificial per-length restriction.
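To illustrate why a per-length cap fails for recursive types (my example, not from the thread): for a nested type like Vec<Vec<u8>>, each individual length can pass a hypothetical cap, yet the attacker-controlled total is their product:

```rust
fn main() {
    const CAP: u64 = 1 << 16; // hypothetical per-length limit

    // The outer and inner lengths each pass the cap individually...
    let outer = CAP;
    let inner = CAP;

    // ...but a decoder honoring every prefix would still try to
    // allocate 2^32 bytes (4 GiB) in total.
    assert_eq!(outer * inner, 1 << 32);
}
```

Deeper nesting multiplies the total further, which is why only a shared budget over the whole decode call can bound it.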

Not sure what to do about untrusted input in production software; I would probably isolate the decoding into a separate process.
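One hypothetical shape for such a shared-budget guard (the name `BudgetedReader` and the API are mine, not an existing arkworks or DecodeWithMemLimit API): wrap the input reader so that every claimed length is charged against one fixed byte budget before anything is allocated. Because nested values all draw from the same budget, recursion cannot multiply the total allocation:

```rust
use std::io::{self, Read};

/// Hypothetical reader wrapper: tracks a remaining byte budget and
/// refuses any length prefix that would exceed it.
struct BudgetedReader<R> {
    inner: R,
    budget: usize,
}

impl<R: Read> BudgetedReader<R> {
    fn new(inner: R, budget: usize) -> Self {
        Self { inner, budget }
    }

    /// Read a little-endian u64 length prefix and charge it against
    /// the budget before the caller allocates anything.
    fn read_len(&mut self) -> io::Result<usize> {
        let mut buf = [0u8; 8];
        self.inner.read_exact(&mut buf)?;
        let len: usize = u64::from_le_bytes(buf)
            .try_into()
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidData, "length overflow"))?;
        self.budget = self
            .budget
            .checked_sub(len)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "allocation budget exceeded"))?;
        Ok(len)
    }
}

fn main() -> io::Result<()> {
    // The first malformed vector claims ~7e17 bytes; with a 1 MiB
    // budget it is rejected instead of aborting the process.
    let data = [0xe9, 0xc7, 0x55, 0x82, 0x52, 0x38, 0xc1, 0x09];
    let mut r = BudgetedReader::new(&data[..], 1 << 20);
    assert!(r.read_len().is_err());

    // A sane prefix (length 3) passes and decrements the budget.
    let data = [3, 0, 0, 0, 0, 0, 0, 0, b'a', b'b', b'c'];
    let mut r = BudgetedReader::new(&data[..], 1 << 20);
    let len = r.read_len()?;
    let mut v = vec![0u8; len];
    r.inner.read_exact(&mut v)?;
    assert_eq!(&v, b"abc");
    Ok(())
}
```

This is only a sketch of the idea; threading such a budget through a generic trait like CanonicalDeserialize is the hard part the comment alludes to.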

burdges commented Jun 20, 2024

Is BigUInt: CanonicalDeserialize used anywhere? It would mess up serialization lengths if so. And BigUInt looks largely internal: https://github.com/search?q=repo%3Aarkworks-rs%2Falgebra%20BigUInt&type=code

Did you find this by fuzzing code we envision deploying? If unused, we could maybe deactivate BigUInt: CanonicalDeserialize behind some feature flag?

Ideally, arkworks could migrate to crypto_bigint since that's more careful code, but that's likely a big task. And crypto_bigint::UInt: CanonicalDeserialize changes the dependency tree slightly.

burdges commented Jun 20, 2024

I think most code uses BigInt via Fp, including afaik all scalars and points. I'd expect the story goes: you'd want const generics to make a BigInt slightly bigger, so instead BigUInt uses a Vec. The naming is odd though, so maybe some other history lies behind BigUInt, and maybe BigUInt could be converted to work like BigInt.

Afaik, all those usages should be ephemeral, meaning BigUInt should not appear in wire formats, so simply disabling BigUInt: CanonicalDeserialize should work fine, but I cannot test right now.

ggwpez commented Jun 20, 2024

> Did you find this by fuzzing code we envision deploying? If unused, we could maybe deactivate BigUInt: CanonicalDeserialize behind some feature flag?

I did not see it exposed directly in the types that I wanted to use, but I also did not check everything. Vec and VecDeque have the same issue, but are probably also not exposed, I guess?
For now it's not a real issue for me, but something for developers to be aware of.
