-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Spanned toml deserialization, add some validation/tests #230
Conversation
I have been struggling in a mental labyrinth trying to get the Spanned stuff to work with all the weird parsing hacks we already had for like the last week, and ended up obsessively working on it this sunday just to get it out of my head, so I'll take off some other day this week to make up. |
start: 0, | ||
end: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use better values to indicate that the span isn't specified, so we can handle that case better? We could e.g. use usize::MAX
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've found miette can be a bit brittle in degenerate source/span situations (panics on the empty string being a NamedSource), so (0,0)
is safer. In practice we only really set spans when we're about to writeback, in which case their values don't matter (but this may change in the future).
src/storage.rs
Outdated
let (line, col) = error.line_col().unwrap_or((0, 0)); | ||
TomlParseError { | ||
source_code: audit_source.clone(), | ||
span: SourceOffset::from_location(&string, line + 1, col + 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC generally column isn't 1-indexed, unlike line. Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These offsets were determined by empirical checking, everyone's got weird opinions it seems.
src/storage.rs
Outdated
where | ||
T: for<'a> Deserialize<'a>, | ||
{ | ||
let mut reader = BufReader::new(reader); | ||
let mut string = String::new(); | ||
reader.read_to_string(&mut string)?; | ||
let toml = toml_edit::de::from_str(&string).map_err(|error| TomlParseError { error })?; | ||
Ok(toml) | ||
let audit_source = Arc::new(NamedSource::new(file_name, string.clone())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't love that we're cloning the string here, when we really only need one copy, as deserialization doesn't require us to actually own the payload.
Unfortunately it looks like miette throws out the type information as soon as you call NamedSource::new
so it would be quite the hassle to get back out a &str
for the full source file. Could we perhaps rework this like this:
let mut string = String::new();
reader.read_to_string(&mut string)?;
let result = toml::de::from_str(&string);
let source_code = Arc::new(NamedSource::new(file_name, string.clone()));
match result {
Ok(toml) => Ok((source_code, toml)),
Err(error) => {
let (line, col) = error.line_col().unwrap_or((0, 0));
Err(TomlParseError {
source_code,
span: SourceOffset::from_location(&string, line + 1, col),
error,
})
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow, you seem to still be cloning the string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, well I meant to remove the .clone()
, but forgot to. I also totally missed that you need the string to do SourceOffset::from_location
. Something like this should work IIRC (though it has some duplication)
let mut string = String::new();
reader.read_to_string(&mut string)?;
let result = toml::de::from_str(&string);
match result {
Ok(toml) => Ok((Arc::new(NamedSource::new(file_name, string)), toml)),
Err(error) => {
let (line, col) = error.line_col().unwrap_or((0, 0));
Err(TomlParseError {
span: SourceOffset::from_location(&string, line + 1, col + 1),
source_code: Arc::new(NamedSource::new(file_name, string)),
error,
})
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I iterated on this and now we don't clone the string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah nice, same impl :)
Spanning is done with the Spanned<T> hack from toml-rs. A bunch of serialization code had to be rejigged to play nice with that hack. In particular things like flatten and untagged_enum make it freak out, so now we deserialize all the fields and then pick which variant it is. Store now records the source files it imported, and we now check that criteria names are valid everywhere. New tests allow us to quickly check different inputs parse/fail-parse/fail-validate.
Spanning is done with the Spanned hack from toml-rs. A bunch of serialization code had to be rejigged
to play nice with that hack. In particular things like flatten and untagged_enum make it freak out, so
now we deserialize all the fields and then pick which variant it is.
Store now records the source files it imported, and we now check that criteria names are valid everywhere.
New tests allow us to quickly check different inputs parse/fail-parse/fail-validate.