Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-canonical values when deserializing large integers from 32 bits arch on a 64 bits arch #148

Closed
bclement-ocp opened this issue Dec 1, 2023 · 5 comments · Fixed by #149

Comments

@bclement-ocp
Copy link

It is known that deserializing on a 32 bits machine a value in the range [2^31 .. 2^63[ that was serialized on a 64 bits machine will fail, because the value is not representable as an OCaml integer on 32 bits machines.

On the other hand, deserializing on a 64 bits machine a value in that range that was serialized on a 32 bits machine (including js_of_ocaml) will create a non-canonical value with semantics-breaking behavior (for instance, Z.(equal (x * one) x) does not hold).

Not sure if there is anything that can be done about it (I don't think deserialization code can fail), but it might be worth documenting.

@xavierleroy
Copy link
Contributor

Well spotted, thank you for reporting this corner case.

It might be possible to fail at deserialization time in this case, by reporting an intentionally wrong size, but the failure message will be obscure ("input_value: incorrect length of serialized custom block").

Ideally, the custom block would be converted to an unboxed integer during deserialization, but this cannot be implemented with the current deserialization API.

@bclement-ocp
Copy link
Author

It might be possible to fail at deserialization time in this case, by reporting an intentionally wrong size, but the failure message will be obscure ("input_value: incorrect length of serialized custom block").

I didn't think about that — I'd say an obscure error message during deserialization is still better than silently incorrect behavior later in the program due to unexpected semantics.

Ideally, the custom block would be converted to an unboxed integer during deserialization, but this cannot be implemented with the current deserialization API.

This was my initial thought as well, but the deserialization API clearly was not built with arch-dependent representations in mind (you probably know this better than me).

@pascal-cuoq
Copy link
Collaborator

For what it's worth, I think that the case of an integer that was allocated on a 32-bit machine when it was serialized and is then unserialized on a 64-bit machine may already be supported in unmarshall_z.ml, because of this line:

let n = Z.of_bits (Bytes.to_string str) in

unmarshal_z.ml is a file written to add support for Zarith integers to unmarshal.ml, an unofficial deserializer written in OCaml for values serialized with OCaml's serializer.

@xavierleroy
Copy link
Contributor

For the record: it's perfectly possible to fail cleanly from a custom deserializer, using caml_deserialize_error :-)

@bclement-ocp
Copy link
Author

For the record: it's perfectly possible to fail cleanly from a custom deserializer, using caml_deserialize_error :-)

Ah! I failed to properly read the manual, it seems ;) I was misled by this paragraph:

Note: the finalize, compare, hash, serialize and deserialize functions attached to custom block descriptors must never access the OCaml runtime. Within these functions, do not call any of the OCaml allocation functions, and do not perform a callback into OCaml code. Do not use CAMLparam to register the parameters to these functions, and do not use CAMLreturn to return the result. Do not raise exceptions, do not remove global roots, etc.

I opened ocaml/ocaml#12819 to mention caml_deserialize_error there.

xavierleroy added a commit that referenced this issue Jan 3, 2024
This can happen with numbers marshaled on a 32-bit platform (including from JS-of-OCaml) and unmarshaled on a 64-bit platform.

Fixes: #148
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants