Skip to content

Consider readding unsafe slicing in the parser #82

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zbraniecki opened this issue Jan 24, 2019 · 9 comments
Closed

Consider readding unsafe slicing in the parser #82

zbraniecki opened this issue Jan 24, 2019 · 9 comments
Labels
crate:fluent-syntax Issues related to fluent-syntax crate enhancement help wanted We need help making decisions or writing PRs for this.
Milestone

Comments

@zbraniecki
Copy link
Collaborator

Thanks for removing the unsafe block here, at least for now. Would you mind filing a follow-up for discussing re-adding it, please? I'd like to look into it a bit more, given that we're likely protected from invalid UTF8 input by the sheer fact of accepting String into FluentResource::try_new.

Originally posted by @stasm in #76

@zbraniecki
Copy link
Collaborator Author

This change would give us ~20% performance win:

Benchmarking parse/"simple": Collecting 100 samples in estimated 5.0453 s (2                                                                            parse/"simple"          time:   [17.874 us 18.020 us 18.193 us]
                        change: [-20.946% -19.166% -17.259%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
Benchmarking parse/"menubar": Collecting 100 samples in estimated 5.6131 s (                                                                            parse/"menubar"         time:   [155.88 us 157.36 us 159.22 us]
                        change: [-21.585% -19.878% -18.093%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Gnuplot not found, disabling plotting

@zbraniecki zbraniecki added enhancement crate:fluent-syntax Issues related to fluent-syntax crate help wanted We need help making decisions or writing PRs for this. labels Apr 17, 2019
@zbraniecki zbraniecki added this to the 0.7 milestone Apr 17, 2019
@zbraniecki
Copy link
Collaborator Author

Using latest Rust nightly and the PR that I just put against master, the perf impact is as following:

parse/"simple": -15%
parse/"preferences": -12.8%
parse/"menubar": -15.3%

@zbraniecki
Copy link
Collaborator Author

@cyplo - on twitter you indicated you might be able to help us with fuzzing [0]. I may file a separate issue for the parser fuzzing prior to 1.0, but this is a particular aspect of it that I'd like to tackle now.

We have a parser that in theory should only cut at known character bounds. In result, we should be able to slice for free using unchecked, and we know that it brings sizable perf wins, but I don't know to verify the risk here.

Is that something you may know how to help us decide on?

[0] https://twitter.com/zbraniecki/status/1096383281929560064

@cyplo
Copy link

cyplo commented Apr 19, 2019

Hi, sure thing @zbraniecki ! Would it be possible to jump on a call to discuss this ? - you can DM me on Twitter

@bjorn3
Copy link

bjorn3 commented Apr 19, 2019

get_slice should be unsafe. The start or end might not be at the start or end of a character, which makes the str invalid. While the parser probably doesn't currently violate that invariant, it could in the future. Making get_slice unsafe makes it clearer to its callers that it has an invariant which must not be violated. This will reduce the likelihood for future changes to violate the invariant.

@zbraniecki
Copy link
Collaborator Author

zbraniecki commented Apr 19, 2019

I honestly struggle to see any possibility of the future development of Fluent Syntax which would lead to us allowing for AST nodes to start/end on anything but start/end of characters.

@zbraniecki zbraniecki modified the milestones: 0.7, 0.8 Jul 26, 2019
@zbraniecki
Copy link
Collaborator Author

Moving to 0.8, since it doesn't affect the API

@zbraniecki zbraniecki modified the milestones: 0.10, 0.11 Dec 17, 2019
@zbraniecki zbraniecki modified the milestones: 0.12, 0.11 Feb 12, 2020
@zbraniecki zbraniecki modified the milestones: 0.12, 0.13 Sep 18, 2020
@zbraniecki
Copy link
Collaborator Author

This has been fixed with Slice in AST.

@cyplo
Copy link

cyplo commented Sep 22, 2020

@zbraniecki let me know if you're still interested in looking into fuzzing this together :) github@cyplo.net is a good starting point or @cyplo on pretty much anywhere

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crate:fluent-syntax Issues related to fluent-syntax crate enhancement help wanted We need help making decisions or writing PRs for this.
Projects
None yet
Development

No branches or pull requests

3 participants