Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add minimal support for internally tagged and untagged enums #451

Merged
merged 4 commits into from
Jul 17, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add minimal support for `#[serde(flatten)]` with roundtripping through RON maps ([#455](https://github.com/ron-rs/ron/pull/455))
- Breaking: Bump `bitflags` dependency to 2.0, changes `serde` impls of `Extensions` ([#443](https://github.com/ron-rs/ron/pull/443))
- Bump MSRV to 1.64.0 and bump dependency: `indexmap` to 2.0 ([#459](https://github.com/ron-rs/ron/pull/459))
- Add minimal roundtripping support for `#[serde(tag = "tag")]`, `#[serde(tag = "tag", content = "content")]`, and `#[serde(untagged)]` enums ([#451](https://github.com/ron-rs/ron/pull/451))

## [0.8.0] - 2022-08-17

Expand Down
14 changes: 9 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -104,13 +104,17 @@ Note the following advantages of RON over JSON:

## Limitations

RON is not designed to be a fully self-describing format (unlike JSON) and is thus not guaranteed to work when [`deserialize_any`](https://docs.rs/serde/latest/serde/trait.Deserializer.html#tymethod.deserialize_any) is used instead of its typed alternatives. In particular, the following Serde attributes are not yet supported:
- `#[serde(tag = "type")]`, i.e. internally tagged enums
- `#[serde(untagged)]`, i.e. untagged enums
RON is not designed to be a fully self-describing format (unlike JSON) and is thus not guaranteed to work when [`deserialize_any`](https://docs.rs/serde/latest/serde/trait.Deserializer.html#tymethod.deserialize_any) is used instead of its typed alternatives. In particular, the following Serde attributes only have limited support:

Furthermore, `#[serde(flatten)]` only has limited support and relies on a small hack [^serde-flatten-hack]. Specifically, flattened structs are only serialised as maps and deserialised from maps. However, this limited implementation supports full roundtripping.
- `#[serde(tag = "tag")]`, i.e. internally tagged enums [^serde-enum-hack]
- `#[serde(tag = "tag", content = "content")]`, i.e. adjacently tagged enums [^serde-enum-hack]
- `#[serde(untagged)]`, i.e. untagged enums [^serde-enum-hack]
- `#[serde(flatten)]`, i.e. flattening of structs into maps [^serde-flatten-hack]

[^serde-flatten-hack]: Deserialising a flattened struct from a map requires that the struct's [`Visitor::expecting`](https://docs.rs/serde/latest/serde/de/trait.Visitor.html#tymethod.expecting) implementation formats a string starting with `"struct "`. This is the case for automatically-derived [`Deserialize`](https://docs.rs/serde/latest/serde/de/trait.Deserialize.html) impls on structs.
While data structures with any of these attributes should roundtrip through RON, their textual representation may not always match your expectation. For instance, flattened structs are only serialised as maps and deserialised from maps.

[^serde-enum-hack]: Deserialising an internally, adjacently, or un-tagged enum requires detecting `serde`'s internal `serde::__private::de::content::Content` content type so that RON can describe the deserialised data structure in serde's internal JSON-like format. This detection only works for the automatically-derived [`Deserialize`](https://docs.rs/serde/latest/serde/de/trait.Deserialize.html) impls on enums. See [#451](https://github.com/ron-rs/ron/pull/451) for more details.
[^serde-flatten-hack]: Deserialising a flattened struct from a map requires that the struct's [`Visitor::expecting`](https://docs.rs/serde/latest/serde/de/trait.Visitor.html#tymethod.expecting) implementation formats a string starting with `"struct "`. This is the case for automatically-derived [`Deserialize`](https://docs.rs/serde/latest/serde/de/trait.Deserialize.html) impls on structs. See [#455](https://github.com/ron-rs/ron/pull/455) for more details.

## RON syntax overview

Expand Down
89 changes: 74 additions & 15 deletions src/de/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ use crate::{
error::{Result, SpannedResult},
extensions::Extensions,
options::Options,
parse::{AnyNum, Bytes, ParsedStr, BASE64_ENGINE},
parse::{AnyNum, Bytes, ParsedStr, StructType, BASE64_ENGINE},
};

mod id;
Expand All @@ -33,6 +33,7 @@ mod value;
pub struct Deserializer<'de> {
bytes: Bytes<'de>,
newtype_variant: bool,
serde_content_newtype: bool,
last_identifier: Option<&'de str>,
recursion_limit: Option<usize>,
}
Expand All @@ -56,6 +57,7 @@ impl<'de> Deserializer<'de> {
let mut deserializer = Deserializer {
bytes: Bytes::new(input)?,
newtype_variant: false,
serde_content_newtype: false,
last_identifier: None,
recursion_limit: options.recursion_limit,
};
Expand Down Expand Up @@ -140,25 +142,49 @@ impl<'de> Deserializer<'de> {
/// struct and deserializes it accordingly.
///
/// This method assumes there is no identifier left.
fn handle_any_struct<V>(&mut self, visitor: V) -> Result<V::Value>
fn handle_any_struct<V>(&mut self, visitor: V, ident: Option<&str>) -> Result<V::Value>
where
V: Visitor<'de>,
{
// Create a working copy
let mut bytes = self.bytes;
// HACK: switch to JSON enum semantics for JSON content
// Robust impl blocked on https://github.com/serde-rs/serde/pull/2420
let is_serde_content =
std::any::type_name::<V::Value>() == "serde::__private::de::content::Content";

if bytes.consume("(") {
bytes.skip_ws()?;
let old_serde_content_newtype = self.serde_content_newtype;
self.serde_content_newtype = false;

if bytes.check_tuple_struct()? {
// first argument is technically incorrect, but ignored anyway
self.deserialize_tuple(0, visitor)
} else {
match (self.bytes.check_struct_type()?, ident) {
(StructType::Unit, Some(ident)) if is_serde_content => {
// serde's Content type needs the ident for unit variants
visitor.visit_str(ident)
}
(StructType::Unit, _) => visitor.visit_unit(),
(_, Some(ident)) if is_serde_content => {
// serde's Content type uses a singleton map encoding for enums
visitor.visit_map(SerdeEnumContent {
de: self,
ident: Some(ident),
})
}
(StructType::Named, _) => {
// giving no name results in worse errors but is necessary here
self.handle_struct_after_name("", visitor)
}
} else {
visitor.visit_unit()
(StructType::NewtypeOrTuple, _) if old_serde_content_newtype => {
// deserialize a newtype struct or variant
self.bytes.consume("(");
self.bytes.skip_ws()?;
let result = self.deserialize_any(visitor);
self.bytes.skip_ws()?;
self.bytes.consume(")");

result
}
(StructType::Tuple | StructType::NewtypeOrTuple, _) => {
// first argument is technically incorrect, but ignored anyway
self.deserialize_tuple(0, visitor)
}
}
}

Expand Down Expand Up @@ -243,14 +269,14 @@ impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
// `identifier` does not change state if it fails
let ident = self.bytes.identifier().ok();

if ident.is_some() {
if let Some(ident) = ident {
self.bytes.skip_ws()?;

return self.handle_any_struct(visitor);
return self.handle_any_struct(visitor, Some(std::str::from_utf8(ident)?));
}

match self.bytes.peek_or_eof()? {
b'(' => self.handle_any_struct(visitor),
b'(' => self.handle_any_struct(visitor, None),
b'[' => self.deserialize_seq(visitor),
b'{' => self.deserialize_map(visitor),
b'0'..=b'9' | b'+' | b'-' => {
Expand Down Expand Up @@ -925,3 +951,36 @@ fn struct_error_name(error: Error, name: Option<&str>) -> Error {
e => e,
}
}

struct SerdeEnumContent<'a, 'de: 'a> {
de: &'a mut Deserializer<'de>,
ident: Option<&'a str>,
}

impl<'de, 'a> de::MapAccess<'de> for SerdeEnumContent<'a, 'de> {
type Error = Error;

fn next_key_seed<K>(&mut self, seed: K) -> Result<Option<K::Value>>
where
K: DeserializeSeed<'de>,
{
self.ident
.take()
.map(|ident| seed.deserialize(serde::de::value::StrDeserializer::new(ident)))
.transpose()
}

fn next_value_seed<V>(&mut self, seed: V) -> Result<V::Value>
where
V: DeserializeSeed<'de>,
{
self.de.bytes.skip_ws()?;

let old_serde_content_newtype = self.de.serde_content_newtype;
self.de.serde_content_newtype = true;
let result = seed.deserialize(&mut *self.de);
self.de.serde_content_newtype = old_serde_content_newtype;

result
}
}
73 changes: 65 additions & 8 deletions src/parse.rs
Original file line number Diff line number Diff line change
Expand Up @@ -427,17 +427,67 @@ impl<'a> Bytes<'a> {
.map_or(false, |&b| is_ident_other_char(b))
}

/// Should only be used on a working copy
pub fn check_tuple_struct(mut self) -> Result<bool> {
if self.identifier().is_err() {
// if there's no field ident, this is a tuple struct
return Ok(true);
pub fn check_struct_type(&mut self) -> Result<StructType> {
fn check_struct_type_inner(bytes: &mut Bytes) -> Result<StructType> {
if !bytes.consume("(") {
return Ok(StructType::Unit);
}

bytes.skip_ws()?;

if bytes.identifier().is_ok() {
bytes.skip_ws()?;

match bytes.peek() {
// Definitely a struct with named fields
Some(b':') => return Ok(StructType::Named),
// Definitely a tuple struct with fields
Some(b',') => return Ok(StructType::Tuple),
// Either a newtype or a tuple struct
Some(b')') => return Ok(StructType::NewtypeOrTuple),
// Something else, let's investigate further
_ => (),
};
}

let mut braces = 1;
let mut comma = false;

// Skip ahead to see if the value is followed by a comma
while braces > 0 {
// Skip spurious braces in comments and strings
bytes.skip_ws()?;
let _ = bytes.string();

let c = bytes.eat_byte()?;
if c == b'(' || c == b'[' || c == b'{' {
braces += 1;
} else if c == b')' || c == b']' || c == b'}' {
braces -= 1;
} else if c == b',' && braces == 1 {
comma = true;
break;
}
}

if comma {
Ok(StructType::Tuple)
} else {
Ok(StructType::NewtypeOrTuple)
}
}

self.skip_ws()?;
// Create a temporary working copy
let mut bytes = *self;

// if there is no colon after the ident, this can only be a unit struct
self.eat_byte().map(|c| c != b':')
let result = check_struct_type_inner(&mut bytes);

if result.is_err() {
// Adjust the error span to fit the working copy
*self = bytes;
}

result
}

/// Only returns true if the char after `ident` cannot belong
Expand Down Expand Up @@ -996,6 +1046,13 @@ pub enum ParsedStr<'a> {
Slice(&'a str),
}

pub enum StructType {
NewtypeOrTuple,
Tuple,
Named,
Unit,
}

#[cfg(test)]
mod tests {
use super::*;
Expand Down
2 changes: 1 addition & 1 deletion tests/117_untagged_tuple_variant.rs
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ fn test_ebkalderon_case() {
flags: [
"--enable-thing",
"--enable-other-thing",
If("some-conditional", ["--enable-third-thing"]),
juntyr marked this conversation as resolved.
Show resolved Hide resolved
("some-conditional", ["--enable-third-thing"]),
]
)
"#;
Expand Down
Binary file modified tests/307_stack_overflow.rs
Binary file not shown.
Loading