lexical and fast-float might soon not be needed. #1010

Closed
ghuls opened this issue Jul 20, 2021 · 6 comments

@ghuls
Collaborator

ghuls commented Jul 20, 2021

Describe your feature request

lexical and fast-float might soon no longer be needed, as a fast-float-like algorithm has been merged into the standard library.

rust-lang/rust#86761
https://www.reddit.com/r/rust/comments/omelz4/making_rust_float_parsing_fast_libcore_edition/
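For context, here is a minimal sketch of float parsing that relies on the standard library alone, with no external crate. `parse_column` is a hypothetical helper made up for illustration; it is not taken from this project or from any of the crates mentioned here.

```rust
// Parse a column of text fields into floats using only `str::parse`.
// `parse_column` is a hypothetical helper, not an API of lexical or fast-float.
fn parse_column(fields: &[&str]) -> Vec<Option<f64>> {
    fields
        .iter()
        // `str::parse::<f64>` returns Err on malformed input; map that to None.
        .map(|s| s.trim().parse::<f64>().ok())
        .collect()
}

fn main() {
    let parsed = parse_column(&["1.5", " 2e3 ", "abc", "-0.25"]);
    println!("{:?}", parsed);
}
```

Malformed fields become `None` rather than aborting the whole column, which is one plausible choice for tabular data; a caller could equally propagate the error.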

@ritchie46
Member

I saw! 🙂

We still need lexical for the integer parsing, but fast-float is out.

@Alexhuszagh

Alexhuszagh commented Jul 21, 2021

> I saw! 🙂
>
> We still need lexical for the integer parsing, but fast-float is out.

I'll be working on making lexical lighter for integer parsing (as in, using workspaces for each different component), so compile times should decrease. Ideally, I'll bring those improvements back into Rust core as well.

@ritchie46
Member

> I'll be working on making lexical lighter for integer parsing (as in, using workspaces for each different component), so compile times should decrease. Ideally, I'll bring those improvements back into Rust core as well.

Very nice. Great work on the float parsing. 🚀

@Alexhuszagh

Just an FYI: the optimized versions of the new integer and float parsers have been implemented as of v0.8, and the API is identical. However, it does require a fairly recent Rust compiler (1.51.0), because it depends on const generics. I'm also trying to integrate the further improvements back into Rust core.
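To illustrate why const generics raise the minimum compiler version, here is a small sketch of a parser whose radix is fixed at compile time. This is not lexical's API: `parse_unsigned` and its `RADIX` parameter are hypothetical names that only demonstrate the language feature stabilized in Rust 1.51.

```rust
// Hypothetical sketch: the radix is a const generic, so each radix gets its
// own monomorphized function. This is NOT lexical's API.
// RADIX must be between 2 and 36, because `char::to_digit` panics otherwise.
fn parse_unsigned<const RADIX: u32>(text: &str) -> Option<u64> {
    if text.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for c in text.chars() {
        // `char::to_digit` returns None for characters outside the radix.
        let digit = c.to_digit(RADIX)? as u64;
        // Checked arithmetic turns overflow into None instead of wrapping.
        value = value.checked_mul(RADIX as u64)?.checked_add(digit)?;
    }
    Some(value)
}

fn main() {
    println!("{:?}", parse_unsigned::<10>("12345"));
    println!("{:?}", parse_unsigned::<16>("ff"));
}
```

Because the radix is known at compile time, the compiler can specialize the multiplication and digit checks per radix, which is the general kind of benefit const generics offer a parsing library.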

@ritchie46
Member

> Just an FYI: the optimized versions of the new integer and float parsers have been implemented as of v0.8, and the API is identical. However, it does require a fairly recent Rust compiler (1.51.0), because it depends on const generics. I'm also trying to integrate the further improvements back into Rust core.

Great! I will update! Thank you for your great work on lexical.
I don't know the ins and outs, so I have a few questions:

  • How does lexical float parsing now compare to fast-float?
  • Is the lexical float parsing algorithm the one that gets embedded in Rust std?

@Alexhuszagh

Alexhuszagh commented Sep 6, 2021

> > Just an FYI: the optimized versions of the new integer and float parsers have been implemented as of v0.8, and the API is identical. However, it does require a fairly recent Rust compiler (1.51.0), because it depends on const generics. I'm also trying to integrate the further improvements back into Rust core.
>
> Great! I will update! Thank you for your great work on lexical.
> I don't know the ins and outs, so I have a few questions:
>
> * How does lexical float parsing now compare to `fast-float`?

They're practically identical, except in rare cases. The extensive benchmarks compared to Rust std can be found here, and the results for Rust std are practically identical to those for fast-float-rust.

> * Is the lexical float parsing algorithm the one that gets embedded in Rust std?

Yes, it is, except for the slow algorithm. In fact, I have opened issues (in core and fast-float-rust) and will be writing PRs (and an upstream PR for the reference C++ implementation). The slow algorithm only comes into play in very rare cases, where it can make quite a difference in performance, but it does affect a few real-world datasets (as shown below, in mesh).

So short term, they're similar, but lexical has a few optimizations the others don't have. Long-term, they will ideally be identical, because I want everyone to benefit. If you want a detailed explanation of what lexical does currently that the other two don't do right now, read the issue I've opened in core. Hopefully, this will be integrated shortly.

Detailed Benchmarks

The most relevant result is this, which benchmarks against a few real-world datasets:
[Figure: Real Benchmark]

Canada is practically identical, while for earth and, most importantly, mesh, lexical is faster. The datasets can be found here:

The biggest difference is in near-halfway cases; otherwise, the performance is nearly identical. The near-halfway cases with differing digit counts are as follows. Note that for contrived, the performance of halfway and moderate would be identical between fast-float-rust and lexical; core is slightly slower only because of less inlining. This has no impact on any real-world dataset, however.

[Figure: Contrived Data]
[Figure: Large Data]
[Figure: Denormal Data]

These are obviously contrived cases, but meant to demonstrate worst-case scenarios and performance of specific algorithms.
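As a concrete illustration of what a near-halfway case is, the sketch below uses only the standard library parser. The value 2^53 + 1 sits exactly between two adjacent `f64` values, so the parser has to inspect every remaining digit to decide which way to round. The inputs are chosen for illustration and are not taken from the benchmark datasets; `parse` is a hypothetical helper.

```rust
// 2^53 + 1 lies exactly halfway between two adjacent f64 values
// (2^53 and 2^53 + 2), so a correctly rounding parser must round to even.
// `parse` is a hypothetical convenience wrapper around `str::parse`.
fn parse(text: &str) -> f64 {
    text.parse::<f64>().expect("valid float literal")
}

fn main() {
    let below = 9007199254740992.0_f64; // 2^53
    let above = 9007199254740994.0_f64; // 2^53 + 2

    // Exactly halfway: the tie goes to the even significand, which is 2^53.
    assert_eq!(parse("9007199254740993"), below);
    // A single nonzero digit far to the right breaks the tie upward.
    assert_eq!(parse("9007199254740993.00000000000000000001"), above);
    // Slightly below halfway rounds down.
    assert_eq!(parse("9007199254740992.99999999999999999999"), below);
}
```

Inputs like the last two are where a parser cannot stop after the first 19 or so significant digits, which is why they stress the slow path rather than the fast one.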
