Promote the header byte jump table to a constant #4
Comments
I'm quite far from being a Rust expert, so this could be really far off, but is there a reason why this wouldn't work in the global scope?
Warning: I didn't even try to compile this -- modified from the examples here: https://www.joshmcguigan.com/blog/array-initialization-rust/
In order to use a function call to initialize a static (global) value, that function must be a You can read a nice summary of why initializing statics is complicated here. It's a few years old, but still holds up -- the only major change since then has been the introduction of
I took a stab at rewriting those functions to be
Oh? Is there a
Looks like some experimental features regarding
Using This should unblock not only the header byte jump table, but also some fun optimizations for encoding binary scalars.
Rust 1.46 landed yesterday. After some experimentation, I've concluded that this is now possible but very painful. Some remaining limitations are easily worked around (Can't use a Several helper functions would need to be rewritten to avoid producing intermediate values that get dropped.
These values are simple We might be able to get around this by inlining the source for these functions (or writing them as macros), but it makes the code a lot less readable and probably isn't worth it at this point. I'll check in again in 6 weeks when Rust 1.47 lands. :)
The `create_header_byte_jump_table` function in `binary/header.rs` will parse each possible byte value from 0 to 255 as though it were the one-octet type descriptor found at the beginning of all binary Ion values. The output from parsing each byte is stored in a newly allocated `Vec` that can be used as a jump table to avoid re-calculating the meaning of the same byte values repeatedly.

This is sub-optimal:

- It requires a `Vec` to be allocated when a fixed-size array would do.
- Each caller builds its own `Vec` despite the fact that its contents will always be identical.

Ideally, we would store a fixed-length array as a static constant in the `header` module. Unlike C, however, the Rust compiler tries very hard to prevent programs from mutating global state. This means that initializing a static array is currently somewhat difficult to do without sacrificing either speed or safety.

Here are some options I've explored:
safe + slow

The `lazy_static` crate provides a macro that allows you to declare and initialize a global constant. It works by wrapping your constant in a custom smart pointer type -- the first time the pointer is dereferenced, `lazy_static` will call a user-defined function to calculate the value of the constant. Every dereference that occurs from then on will receive the already-calculated value.

This is easy to implement, but profiling with `valgrind` has shown that the indirection introduced by the smart pointer adds an unacceptable amount of overhead in terms of CPU time. Dereferencing the jump table each time you begin parsing a value in a stream with millions of values adds up quickly.

`thread_local` copies of the jump table suffer from even worse indirection overhead, rendering them a non-starter.

If/when we add an `IonSystem` concept, we may be able to have `IonReader`s share a reference to a shared jump table living in the `IonSystem`, but this may not be worth the complexity.

unsafe + fast
Initializing a fixed-size array (i.e. `[T; N]`, not a `Vec`) is currently surprisingly challenging for non-trivial types. The best methods available only work for types that implement the `Copy` trait, or for types that implement `Default` where the array is at most 32 entries long.

We could choose to skirt this problem with `unsafe` code, but this has its own problems:

- It is `unsafe` code, so we would need to scrutinize it very carefully.
- Rust is moving away from `mem::uninitialized` (currently deprecated in nightly) and towards `mem::MaybeUninit` (currently unstable).

safe + fast + not possible yet
Rust is incrementally adding support for `const fn`s, functions whose outputs can be calculated entirely at compile time. This is perfect for our use case -- all of the inputs are statically known (`u8` values from 0 to 255) and processing them requires no information from the outside world. With this feature, we'll be able to simply declare `HEADER_JUMP_TABLE` as a static whose initializer is a function call.

The compiler will calculate the definition of `HEADER_JUMP_TABLE` at build time and then all interested entities can refer to it freely -- no wasted memory, no additional overhead from indirection.

At the time of this writing, `const fn`s have severe restrictions on which language features may be used. Functionality like `match` statements, the `?` operator, calling non-`const` functions, looping, etc. are planned but not yet supported. These restrictions are so limiting that I have not been able to write an array initialization `const fn`.
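For illustration, here is a minimal sketch of what an array-initializing `const fn` can look like on a toolchain that permits `while` loops inside `const fn` (Rust 1.46 or later, per the comments in this thread). The `Header` struct, `parse_header`, and `header_for` are hypothetical stand-ins, not the project's actual types or helpers. They only split the byte into its high and low nibbles, which is how a binary Ion type descriptor encodes its type code and length code. The real parsing logic is richer and may not be expressible this simply.

```rust
/// Hypothetical, simplified stand-in for the parsed type descriptor.
/// It is `Copy`, so it can be used in an array repeat expression.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Header {
    pub type_code: u8,   // high nibble of the type descriptor octet
    pub length_code: u8, // low nibble of the type descriptor octet
}

/// Hypothetical parser: only splits the byte into its two nibbles.
const fn parse_header(byte: u8) -> Header {
    Header {
        type_code: byte >> 4,
        length_code: byte & 0x0F,
    }
}

/// Builds the whole table at compile time. A `while` loop is used
/// because `for` loops are not permitted in `const fn`.
const fn create_header_byte_jump_table() -> [Header; 256] {
    // Placeholder entries; every slot is overwritten below.
    let mut table = [Header { type_code: 0, length_code: 0 }; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = parse_header(i as u8);
        i += 1;
    }
    table
}

/// Evaluated by the compiler: no allocation and no lazy
/// initialization check at runtime.
pub static HEADER_JUMP_TABLE: [Header; 256] = create_header_byte_jump_table();

/// Hypothetical lookup helper: a plain array index.
pub fn header_for(byte: u8) -> Header {
    HEADER_JUMP_TABLE[byte as usize]
}
```

Because the table is an ordinary `static` array, a lookup is a single indexed load, with none of the smart pointer indirection described under "safe + slow".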