-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(input): Add span support #61
Conversation
There are two main crates that provide location support for nom - `nom-locate`: wide adoption for long time - `pori`: more recent, only really being used by one application While `pori` is less mature, I think it iterates quite well on `nom-locate`; so let's break this down. How to track location - `pori`: tracks how much `input` has advanced - `nom-locate`: same as above but also tracks the line number - leave initial input unchanged and only change begin/end indices, doing unchecked slicing of the input to get the current input - keep initial and current input and use offset - keep start pointer from initial and current input For now, I'm going with (initial, input) because it is relatively easy to implement and the bookkeeping costs are paid when doing location lookups. The downside is it takes up more space, especially if `Located` isn't the inner-most wrapper type. We could reduce this somewhat by using `<I as IntoOutput>::Output` for `initial`. If we make more assumptions on inner types, we could just capture the start pointer and do math with that when needed. The problem with nom-locate tracking line numbers is it pessimizes performance. Knowing the line number is more of a UI concern and should be deferred until there (if the UI will even ask for it). How to capture a span - `pori` combinators that returns `(usize, usize)` - `nom-locate` use the wrapper type - This makes any `recognize`, `take`, etc capture a span which is convenient I was originally attracted to `nom-locate`s approach but the downside is that this makes these types more awkward to work with when **not** capturing a span. I decided to instead use `Parser::span` and `Parser::with_span` combinators, similar to `recognize` and `with_recognized`. `pori`s combinators returned `(usize, usize)`, meaning (start, length) which could be confused with (start, end) with "end" being inclusive or not. To make this clearer, I used `Range<usize>`. Now how it works is embedded in the type system. While the approach in this commit is a new design, it iterates on previous designs with an effort to simplify them down to their most basic parts. I feel fairly confident that this is something we can commit API stability to.
Pull Request Test Coverage Report for Build 3789359607
💛 - Coveralls |
|
I just implemented pori style Located and it seems be 30% slower than what is implemented (#63) I also tested |
#64 shows that nom-locate is even slower. |
I suspect the only way to capture back some of the performance is if we switch from the input being returned to it being a |
Interesting, pom takes the approach of Parser inpuits being a |
There are two main crates that provide location support for nom
nom-locate
: wide adoption for long timepori
: more recent, only really being used by one applicationWhile
pori
is less mature, I think it iterates quite well onnom-locate
; so let's break this down.How to track location
pori
: tracks how muchinput
has advancednom-locate
: same as above but also tracks the line numberpom
whose input is the Input + locationFor now, I'm going with (initial, input) because it is relatively easy to implement and the bookkeeping costs are paid when doing location lookups. The downside is it takes up more space, especially if
Located
isn't the inner-most wrapper type. We could reduce this somewhat by using<I as IntoOutput>::Output
forinitial
. If we make more assumptions on inner types, we could just capture the start pointer and do math with that when needed.The problem with nom-locate tracking line numbers is it pessimizes performance. Knowing the line number is more of a UI concern and should be deferred until there (if the UI will even ask for it).
How to capture a span
pori
combinators that returns(usize, usize)
nom-locate
use the wrapper typerecognize
,take
, etc capture a span which is convenientI was originally attracted to
nom-locate
s approach but the downside is that this makes these types more awkward to work with when not capturing a span. I decided to instead useParser::span
andParser::with_span
combinators, similar torecognize
andwith_recognized
.pori
s combinators returned(usize, usize)
, meaning (start, length) which could be confused with (start, end) with "end" being inclusive or not. To make this clearer, I usedRange<usize>
. Now how it works is embedded in the type system.While the approach in this commit is a new design, it iterates on previous designs with an effort to simplify them down to their most basic parts. I feel fairly confident that this is something we can commit API stability to.