Skip to content

Bounding const generic bitmasks #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
workingjubilee opened this issue Dec 8, 2020 · 6 comments
Open

Bounding const generic bitmasks #50

workingjubilee opened this issue Dec 8, 2020 · 6 comments

Comments

@workingjubilee
Copy link
Member

You can do this just fine:

#[repr(simd)]
pub struct SimdU8<const LANES: usize>([u8; LANES]);

But to make a bitmask you need to do this:

#[repr(simd)]
pub struct SimdBitMask<const LANES: usize>([u8; (LANES + 7) / 8]) where [(); (LANES + 7) / 8]: Sized;

@calebzulawski in (https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd/topic/Mask.20API/near/219272371)

As a reminder, because frankly I have lost track of this in a conversation at least once, a "bitmask" is a mask type that uses 1 bit per vector element (like a bitset) so that a single tightly-packed number defines a mask for an entire vector. Bitmasks can be more efficient, but in order to make an efficient bitmask, you need to know the size of the bitmasks you're supporting ahead of time. This is something we would prefer to not say, given that current architectures that are actually in use on commodity hardware (AVX512) already support 64b bitmasks, and future implementations of RISCV-V also will use bitmasks with pretty interestingly widths†, as I currently understand it.

This problem infects our attempt to design an "opaque", arch-agnostic mask type because we have to impose the same bound at that site as well. That makes it quite unpretty and non-ergonomic, even though we have some clear bounds that you would think we could prove to the compiler pretty effectively.

"Full masks" are comparatively simple... they're the size of a vector's elements, and so, are the size of a vector register. Which is a bit convenient, really. Caleb mentioned It might be nice to say that a mask uses an entire vector register and so is Many bits wide regardless, in terms of layout, and hope the optimizer handles it intelligently.

† hypothetically, unless I misunderstand the spec, it requires a mask register to fit inside VLEN, the size of an individual vector register, but that means it could allow a VLEN-sized mask to mask byte-wide structures over a "vector register group" using that 1 vector register... or 128 elements! @programmerjake am I reading this correctly?

@programmerjake
Copy link
Member

† hypothetically, unless I misunderstand the spec, it requires a mask register to fit inside VLEN, the size of an individual vector register, but that means it could allow a VLEN-sized mask to mask byte-wide structures over a "vector register group" using that 1 vector register... or 128 elements! @programmerjake am I reading this correctly?

I didn't check the latest spec, but the earlier version I read allows you to have vector registers with much more than 1000 elements (something like 16k or 64k elements iirc), of course that depends on the hardware supporting something that big, I'd expect most implementations to support less than 64 elements per vector register for the near future.

@programmerjake
Copy link
Member

so, no, you can't cram a bitmask for that in a u128 :)

@workingjubilee
Copy link
Member Author

Well, VLEN doesn't have a specified upper bound here, either, I just said 128 because it was the specced minimum size for an actual vector register, which clearly can plausibly go to at least 512 even on a simpler RISC-y architecture given that such has happened for ARM chips as well.

@workingjubilee
Copy link
Member Author

In the spirit of "something that works, now" currently we're using a u64 since we don't have as much practical need for something more clever in 2021. As this is not necessarily the solution we will want in 2031, this is likely not to be what we do permanently, and we are considering carving out an alternative feature gate for mask types.

@programmerjake
Copy link
Member

@programmerjake
Copy link
Member

Code for idea:

pub const fn mask_bytes(n: usize) -> usize {
    ((n + 7) / 8).next_power_of_two()
}

macro_rules! make_masks {
    ($($n:literal,)*) => {
        $(
            unsafe impl MaskHelper for [(); $n] {
                type Underlying = [u8; mask_bytes($n)];
            }
        )*
    };
}

make_masks![
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
    26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
    50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
    74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
    98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
    117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
    136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154,
    155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173,
    174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192,
    193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211,
    212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230,
    231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249,
    250, 251, 252, 253, 254, 255, 256,
];

pub unsafe trait MaskHelper {
    /// for `Self = [(); N]`:
    /// `Underlying` must be `[u8; mask_bytes(N)]`
    type Underlying: Copy + 'static;
}

pub struct Mask<const BITS: usize>(<[(); BITS] as MaskHelper>::Underlying)
where
    [(); BITS]: MaskHelper;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants