-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rsdict simd feature fails to build with Rust nightly 1.78 #41
Comments
Removing the "simd" feature doesn't hurt performance that much, so disable it for now: without rsdict "simd" feature
with simd feature
without simd feature
|
Thanks for the quick analysis and fix! It looks like Option 1, "Disable the SIMD feature for now, but that would decrease the speed," is the way to go to get things compiling again. That would close this issue. Would adding a separate new feature issue around SIMD to be addressed later be reasonable? Would some future performance improvement be expected at a larger scale of data, use of GPUs, or other SIMD accelerators? Or is the sense that SIMD does not do much algorithmically for HDT? |
@donpellegrino thanks for the input! I disabled the SIMD feature in commit 225ff27 for now but I will keep this issue open because I suspect that rsdict will become updated in the near future as I already got it to compile with the new standard library SIMD feature and I expect that someone more knowledgeable about SIMD than me will be able to fix the errors with reasonable effort. As for the potential impact of a general SIMD feature, that is hard for me to say. The benchmarks above lead me to believe that it does not do much and is best handled by using optimized external libraries like rsdict for rank and select, but I'm open to be positively surprised by a pull request by someone with SIMD expertise :-) |
Update: @Cydhra was kind enough to fix the issues in the refactoring, so once @sujayakar merges sujayakar/rsdict#10, we could reenable the SIMD feature of rsdict. |
I mean, since performance is a non-issue here, this likely won't matter, but there are several alternatives to rsdict which beat its performance even with I think For concrete micro-benchmarks, refer to https://github.com/Cydhra/vers |
We already use sucds so if we could refactor to that this would reduce dependency count by one and if it faster then even better. |
I got rid of the rsdict dependency by switching to sucds following one of the options suggested by @Cydhra. With the iai benchmark (10k triples very small dataset):
My suspicion is that the performance decrease in some of the tests is due to the cumbersome loading process like this:
The loop calling push bits, which further applies some masking because it supports different numbers of bits, is probably much less efficient then just copying the usize blocks as was possible with rsdict. |
Criterion benchmarksThe benchmarks show drastic performance gains, I will recheck it with https://github.com/KonradHoeffner/hdt_benchmark to make sure.
Queries persondata_en.hdt: 86MB, ~10M triples. rsdict
sucds
|
Oh wow, how is that even possible?
Edit: The benchmarks don't come with their resource :( |
Yeah I'm always skeptical with such large changes, so I will try to run it again using the hdt_benchmark suite.
The reason they are not in the repo is that I didn't want to waste disk space for the people behind crates.io, maybe I find a better solution in the future. P.S.: Tried to attach it here but GitHub allows 25 MB max. |
I have similar results with vers like with sucds. A few percent gained in some benchmarks, a few lost in others. I also tried a method that just moves the data into the bitvec instead of copying the bits in a loop, but that didn't seem to make a difference, so I wouldn't worry about it. |
@Cydhra: Thanks for investigating this! Did you try the iai or the criterion benchmark? I'm not sure why they show so much difference, maybe I should change the IAI benchmark to use a larger file as well, as this is what I'm optimizing for. |
@Cydhra I created some comparative benchmark curves at https://github.com/KonradHoeffner/rdf_benchmark/blob/master/hdt_rs_current_vs_previous_versions.ipynb. |
Criterion. |
sorry for not responding earlier -- I've been busy with work and haven't had much time for side projects. nice work benchmarking btw, and |
Thank you! And no problem, I fully understand :-) |
The rsdict dependency "simd" feature depends on packed_simd, which is no longer available in nightly 1.78.
See sujayakar/rsdict#9 and rust-lang/packed_simd#359.
I tried to refactor rsdict but wasn't successful yet, see sujayakar/rsdict#10.
Options
Update: Option 1 was chosen but this issue is kept open in case of a future rsdict update that allows reenabling it.
The text was updated successfully, but these errors were encountered: