Depend on packed_simd #565
The other solution is to use
AppVeyor fails because of a network error.
Yeah, that was the type that was returned from .le() and friends. I was curious about it, but it worked, so I didn't worry too much about it. Glad the names have been fixed though.
This was per the RFC. If you need shift operations with other types, let me know why, and I'll try to push for them. I considered them a convenience, but I couldn't come up with any reasons beyond this for them. If more people found them convenient, maybe we can get them back.
So, the However, 512-bit wide vector types are not in the RFC, and you should treat them as "super unstable". I wanted to put them behind a feature flag in My recommendation with respect of the Also, I expect to publish Also, if there are any features that you need in
Regarding the We'll also use things like These are at least the requirements that I'm aware of.
Just
I have to admit I know next to nothing about simd, so this explains my confusion 😄. Good change.
Would you recommend we remove our 'support' for them for now (it is not much more work than changing a couple of macro calls)?
Thank you! Don't let this issue hurry you, I just happened to have some time today to investigate.
Interesting. May I ask why a mask is being used here? Or put differently, if the code needs to add
So all the vertical vector comparisons return masks of an appropriate size, and these can all be used with select, have reductions, etc. Independently of which names and sizes we end up giving to the 512-bit wide masks, all these things will still need to work properly. So I'd say that you can count on this not changing, with the caveat that we don't know when, if ever, we are going to stabilize the 512-bit wide vector types.
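For readers who, like some of us in this thread, are new to SIMD: here is a rough scalar model of what "a comparison returns a mask, and the mask works with select and reductions" means. The `Lanes` and `Mask` aliases and the helper functions are hypothetical stand-ins written with plain arrays; they are not the `packed_simd` types or signatures.

```rust
// Scalar model of a 4-lane vector and its mask.
// Hypothetical stand-ins for illustration, not packed_simd types.
type Lanes = [f32; 4];
type Mask = [bool; 4];

/// Lane-wise `a >= b`: one boolean per lane.
fn ge_mask(a: Lanes, b: Lanes) -> Mask {
    let mut m = [false; 4];
    for i in 0..4 {
        m[i] = a[i] >= b[i];
    }
    m
}

/// Takes `if_true[i]` where the mask lane is set, `if_false[i]` otherwise.
fn select(mask: Mask, if_true: Lanes, if_false: Lanes) -> Lanes {
    let mut out = if_false;
    for i in 0..4 {
        if mask[i] {
            out[i] = if_true[i];
        }
    }
    out
}

/// Mask reductions: is any lane set, or is no lane set?
fn any(mask: Mask) -> bool {
    mask.iter().any(|&lane| lane)
}

fn none(mask: Mask) -> bool {
    !any(mask)
}
```

In this model, `select(ge_mask(a, b), a, b)` computes a lane-wise maximum, and `none(mask)` is the kind of reduction a loop exit test can use.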
It appears to me that If it isn't, you can always use the non-portable AVX-512 intrinsics in
Here's the code that calls the

    loop {
        let mask = (scale * max_rand + low).ge_mask(high);
        if mask.none() {
            break;
        }
        scale = scale.decrease_masked(mask);
    }

(Actual code here) where I.e. the set of lanes that we want to decrease is not constant, but rather calculated at runtime. I guess we could do something like
Casting a mask seems to produce a value where all bits are set, or all bits are clear. So I don't think endianness would be a problem. Of course, if casting behavior could vary between platforms, that would indeed be a problem.
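To make the casting idea concrete, here is a minimal single-lane sketch of how an all-ones / all-zeros mask value could drive a masked decrement. It assumes, as observed above, that a `true` lane casts to an integer with every bit set. Both function names are hypothetical, and this is not a claim about how the real `decrease_masked` is written.

```rust
/// One mask lane cast to an integer: every bit set for `true`,
/// every bit clear for `false`. (Hypothetical helper.)
fn mask_lane_to_bits(lane: bool) -> u32 {
    if lane { u32::MAX } else { 0 }
}

/// Steps a positive, finite `x` down to the next smaller float when the
/// lane is set, and leaves it untouched otherwise.
/// Adding 0xFFFF_FFFF with wrap-around subtracts one from the bit pattern;
/// adding 0 is a no-op.
fn decrease_masked_lane(x: f32, lane: bool) -> f32 {
    f32::from_bits(x.to_bits().wrapping_add(mask_lane_to_bits(lane)))
}
```

The point of the trick is that no branch or select is needed: the mask itself is the operand.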
Gotcha, that makes sense. So casting is probably faster than the
This is the case in general (e.g. see the tests in https://github.com/gnzlbg/packed_simd/blob/master/tests/endianness.rs#L133), but for masks this should not matter because all bytes within a lane are either all set or all cleared, so their value won't change with the order.
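A quick way to convince yourself of the byte-order argument, using only the standard library's `u32::swap_bytes` on a single lane value. The helper name is made up for this sketch.

```rust
/// True when reversing the byte order of a 32-bit lane leaves its value
/// unchanged. (Hypothetical helper for illustration.)
fn byte_order_invariant(lane: u32) -> bool {
    lane.swap_bytes() == lane
}
```

An all-set lane (`u32::MAX`) and an all-clear lane (`0`) both pass this check, while an ordinary integer such as `1` does not, which is why endianness matters for general casts but not for mask lanes.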
@alexcrichton can you re-start the AppVeyor builds please? The service seems to be user-centric rather than project-centric so I don't have authorisation to.
Ah sorry, now it says "OK Pull request #565 is non-mergeable." I've been meaning to switch this though to rust-lang-libs, I can try to do that soon!
(or you can set it up under your own user if you'd like)
@pitdicker rebase please
The CI is broken again... rust-lang/rust#52535 landed, and removes `std::simd` in favor of `packed_simd`. It doesn't yet have a release on crates.io, but that probably doesn't take long.

I ran into three issues:
- `u32` as argument; easy clean-up on our side.
- `m1x*` types anymore, but I suppose that was a typo in our code.
- `From` implementation to cast integers to floats of the same size. I hope https://github.com/gnzlbg/packed_simd/pull/31 is acceptable, or that there will be some other solution.

So this PR doesn't build yet, but I'll make it anyway to show what's going on.