Porting progress #1
First, RapidFuzz is a beautiful library with cutting-edge algorithms, and I admire the work a lot. Second, I am working on an in-dataframe data analysis tool (a Polars dataframe plugin). For strings, fuzzy matching and similarity metrics are crucial. Right now I rely on strsim and some of my own code for that, which has mediocre performance, so I would be really happy to see RapidFuzz come to Rust. I have done some benchmarks comparing my Rust implementation with RapidFuzz (via a Python UDF), and you can see the results here: abstractqqq/polars_ds_extension#17. Those are some interesting numbers. Is there any way I can support this project? Thank you.
A couple of comments to the issue you linked:
A couple of hours ago I uploaded a first in-progress version of the library to cargo. It includes most of the basic implementations, which is pretty much what you are after. To get the best out of it you should:
Things still missing in the port are:
Overall I am really surprised how fast the port has gone so far, especially considering I hadn't touched the language before. I would say I have ported roughly 2/3 of the code volume. As for how the project can be supported: right now I especially need people who have experience in Rust and can help with code review. In particular, any suggestions for improving the public interface would be very useful, so I can reduce breaking changes in the future.
@abstractqqq I have now pretty much finalized the API for Rust. This changes all function signatures, so your project will need updates as well. Let me know if you run into any issues with this. I do not expect any more signature changes in the near future, unless something about it is fundamentally broken.
Thank you. I will update once I am back.
I updated to the latest version of rapidfuzz and everything is working great!
Has there been any progress on this recently?
I didn't really get around to this over the past year, but I should be able to take this up again over the next few weeks.
Hi @maxbachmann, there are still many issues with that PR, but since I won't have time to iron everything out and implement everything "properly", I thought I'd still create a draft PR in the hope that this work is at least somewhat useful to you.
This issue tracks missing components in the Rust port:
The goal for the first release is to have all of the basic + cached distances implemented. Edit operations and SIMD are more niche and so not required for a first release.
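For readers unfamiliar with the basic/cached split, here is a minimal sketch of the idea, not the actual rapidfuzz API. The names `levenshtein`, `levenshtein_chars` and `CachedLevenshtein` are hypothetical and chosen only for illustration. A basic distance redoes all preparation on every call, while a cached one prepares the query string once and reuses it against many candidates. This sketch only caches the decoded characters; a real implementation would likely cache more, for example precomputed bit masks for a bit-parallel algorithm.

```rust
/// Basic Levenshtein distance: both strings are decoded on every call.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    levenshtein_chars(&a, &b)
}

/// Classic two-row dynamic programming over already decoded characters.
fn levenshtein_chars(a: &[char], b: &[char]) -> usize {
    // prev[j] is the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0usize; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        cur[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Hypothetical "cached" comparator: the query is decoded once in `new`
/// and reused for every candidate passed to `distance`.
pub struct CachedLevenshtein {
    query: Vec<char>,
}

impl CachedLevenshtein {
    pub fn new(query: &str) -> Self {
        Self { query: query.chars().collect() }
    }

    pub fn distance(&self, candidate: &str) -> usize {
        let candidate: Vec<char> = candidate.chars().collect();
        levenshtein_chars(&self.query, &candidate)
    }
}
```

The cached form fits the dataframe use case discussed in this thread: build one `CachedLevenshtein` for the query and call `distance` once per row of a string column, instead of re-preparing the query for every row.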