Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate yaml-rust2 #2001

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Evaluate yaml-rust2 #2001

wants to merge 1 commit into from

Conversation

jneem
Copy link
Member

@jneem jneem commented Jul 18, 2024

I had a go at adding positions to yaml, using the yaml-rust2 crate. It sort of works (without any error handling yet), but I also have some concerns:

  • the rust yaml ecosystem is kind of in flux right now. yaml-rust2 seems like the most active yaml library at the moment, but for how long?
  • only the low-level API supports positions, so I had to implement my own MarkedEventReceiver to construct the objects from the parser events. There's a marked_yaml crate, but it's pretty new and not much used yet.
  • the marks aren't always in quite the right position. See here; also, I haven't figured out how to easily and correctly find the position of a string unless it's in the "plain" style.

One alternative, as we already discussed, would be to wrap a mature C library.

@github-actions github-actions bot temporarily deployed to pull request July 18, 2024 08:38 Inactive
@yannham
Copy link
Member

yannham commented Jul 18, 2024

Argh, we should have synced before. In fact I asked about this feature on an issue thread (Ethiraric/yaml-rust2#27), and this was implemented in https://github.com/saphyr-rs/saphyr, which is from the same author but supposed to get more innovations (and potentially breaking changes) while yaml-rust2 would just get maintenance. So I think something very similar to what you implemented can already be done with saphyr, I just haven't had the time to look into it.

At the end of the corresponding PR, the author mentions that indeed the markers sent by their parser are off (see saphyr-rs/saphyr#6 (comment)), and it's a bug of the parser.

We can either use saphyr already, if the imprecision is small, it's already better than what existed before. Or we can even contribute upstream to fix the parser; I suppose the saphyr maintainer now already knows about us and would be eager to merge this change, given they were eager to merge the marker change.

@jneem
Copy link
Member Author

jneem commented Jul 18, 2024

No worries, I would have checked first but I found myself with some unexpected free time today...

I will have a look at saphyr and see if I can figure out the position bug.

@jneem
Copy link
Member Author

jneem commented Jul 29, 2024

I had a quick look at libfyaml today. Their scanner already has support for recording the end of a token (which is basically the thing I'm trying to add to saphyr-parser). But their high-level API doesn't keep track of spans, so in addition to wrapping the C API we'd need to write the thing that consumes events and produces a tree (i.e. basically what's in this PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants