Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to deserialize enums from SeqAccessDeserializer #2445

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

Mingun
Copy link
Contributor

@Mingun Mingun commented May 7, 2023

Motivation of this PR is following: when I rebase #1922 I noticed, that test, introduced in the first commit of #2303 failed after the change. That failure appeared after changing

// seq: A where A: SeqAccess
let rest = SeqAccessDeserializer::new(seq);
let content = try!(Content::deserialize(rest));
let deserializer = ContentDeserializer::<A::Error>::new(content);
// use `deserializer`

to

// seq: A where A: SeqAccess
let deserializer = SeqAccessDeserializer::new(seq);
// use `deserializer`

i. e. after removing intermediate bufferisation step. Because code for the MapAccessDeserializer is the same and it does not suffer from such problem, I checked the difference and realized, that MapAccessDeserializer has a special handling fo the Deserializer::deserialize_enum hint:

serde/serde/src/de/value.rs

Lines 1474 to 1484 in 25381be

fn deserialize_enum<V>(
self,
_name: &str,
_variants: &'static [&'static str],
visitor: V,
) -> Result<V::Value, Self::Error>
where
V: de::Visitor<'de>,
{
visitor.visit_enum(self)
}

which is missing for the SeqAccessDeserializer.

Because SeqAccessDeserializer is used in such context when deserialize_enum could be requested from it, I think, it should provide the special implementation for that.

This implementation is following: enum is represented as:

  • two elements, where the first element is a tag and the second is a content
  • one element: a special case for unit variants, where the second element for content can be missed

Additionally, this PR:

  • extends testcases to cover all possible enum variants (Unit, Newtype, Tuple and Struct)
  • extends testcases to cover negative scenarios (wrong tag, empty map / seq, missing content for seq)
  • changes the error message for the empty map for MapAccessDeserializer from invalid type: map, expected enum to invalid length: 0, expected enum, because actually map is expected, but it should not be empty
  • document behavior of Deserializer::deserialize_enum for MapAccessDeserializer and SeqAccessDeserializer

@dtolnay dtolnay changed the title Add ability to deserialize enums from SeqAccessDeserialzier Add ability to deserialize enums from SeqAccessDeserializer Jul 26, 2023
Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I am not convinced that this is the behavior that would be expected by users. As far as I can tell the motivation explained in the PR description is to make the following:

let deserializer = SeqAccessDeserializer::new(seq);

behave more like the following:

let rest = SeqAccessDeserializer::new(seq);
let content = try!(Content::deserialize(rest));
let deserializer = ContentDeserializer::<A::Error>::new(content);

But I tried that code, and ContentDeserializer also does not deserialize enums from sequences. The current SeqAccessDeserializer is already consistent with this.

use serde::de::value::SeqAccessDeserializer;
use serde::de::{Deserialize, Deserializer, SeqAccess, Visitor};
use serde_derive::Deserialize;
use std::fmt;

#[derive(Deserialize, Debug)]
enum Enum {
    Variant(usize),
}

struct MyVisitor;

impl<'de> Visitor<'de> for MyVisitor {
    type Value = Enum;
    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("repro")
    }
    fn visit_seq<A>(self, seq: A) -> Result<Self::Value, A::Error>
    where
        A: SeqAccess<'de>,
    {
        let rest = SeqAccessDeserializer::new(seq);
        let content = serde::__private::de::Content::deserialize(rest).unwrap();
        let deserializer = serde::__private::de::ContentDeserializer::<A::Error>::new(content);
        Enum::deserialize(deserializer)
    }
}

fn main() {
    let mut de = serde_json::Deserializer::from_str(r#" ["Variant", 1] "#);
    let e = de.deserialize_any(MyVisitor).unwrap();
    println!("{:?}", e);
}

I'm left unclear about what the motivation for this PR is.

@Mingun
Copy link
Contributor Author

Mingun commented Aug 4, 2024

I cannot remember exactly, why I decided to make SeqAccessDeserializer behaves like MapAccessDeserializer when working with enums. The whole situation with test from #2303 looked strange for me but I remember that I got deal with it. I need to restore the whole chain of reasoning, this can take some time.

@Mingun
Copy link
Contributor Author

Mingun commented Aug 30, 2024

I remembered the chain of reasoning. To make sure #1922 doesn't break existing behavior, I decided to add the missing tests. One of the tests is the deserialization of ExternallyTagged enums that are inside of InternallyTagged enums:

#[derive(Deserialize)]
enum ExternallyTagged {
  Unit,
}

#[derive(Deserialize)]
#[serde(tag = "tag")]
enum InternallyTagged {
  StructWithEnum { f: ExternallyTagged },
}

Internally tagged enums can be represented either as maps or as sequences:

[
  { "tag": "StructWithEnum", "f": "Unit" },
  ["StructWithEnum", "Unit"],
]

You can make sure that on playground.

Currently internally tagged enums always use bufferisation and ExternallyTagged enum will be deserialized from ContentDeserializer:

serde/serde_derive/src/de.rs

Lines 1399 to 1406 in 3aca38d

let (__tag, __content) = _serde::Deserializer::deserialize_any(
__deserializer,
_serde::__private::de::TaggedContentVisitor::<__Field>::new(#tag, #expecting))?;
let __deserializer = _serde::__private::de::ContentDeserializer::<__D::Error>::new(__content);
match __tag {
#(#variant_arms)*
}

After optimisation in #1922 in the optimized case ExternallyTagged will be deserialized either from MapAccessDeserializer or SeqAccessDeserializer, because intermediate bufferisation step will be skipped here:

fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
where
S: SeqAccess<'de>,
{
let tag = match tri!(seq.next_element()) {
Some(tag) => tag,
None => {
return Err(de::Error::missing_field(self.tag_name));
}
};
let rest = de::value::SeqAccessDeserializer::new(seq);
Ok((tag, tri!(Content::deserialize(rest))))
}
fn visit_map<M>(self, mut map: M) -> Result<Self::Value, M::Error>
where
M: MapAccess<'de>,
{
let mut tag = None;
let mut vec = Vec::<(Content, Content)>::with_capacity(size_hint::cautious::<(
Content,
Content,
)>(map.size_hint()));
while let Some(k) = tri!(map.next_key_seed(TagOrContentVisitor::new(self.tag_name))) {
match k {
TagOrContent::Tag => {
if tag.is_some() {
return Err(de::Error::duplicate_field(self.tag_name));
}
tag = Some(tri!(map.next_value()));
}
TagOrContent::Content(k) => {
let v = tri!(map.next_value());
vec.push((k, v));
}
}
}
match tag {
None => Err(de::Error::missing_field(self.tag_name)),
Some(tag) => Ok((tag, Content::Map(vec))),
}
}

Without changes in this PR the test from playground failed for #1922 (I've put it to serde's test_suite/regression/issue2445.rs for check):

---- regression::issue2445::seq stdout ----
thread 'regression::issue2445::seq' panicked at test_suite\tests\regression\issue2445.rs:28:71:
called `Result::unwrap()` on an `Err` value: Error("invalid type: sequence, expected internally tagged enum InternallyTagged", line: 1, column: 0)

Then I realized that deserialization tests of externally tagged from MapAccessDeserializer are incomplete and completely missing for SeqAccessDeserializer. So I've added them. Because both deserializers used in the same circumstances it looks reasonable that they should behave the same.

(review this commit with "ignore whitespace changes" option on)
The new message is more logical, because actually map *is* expected,
but it should contain at least one entry
failures (6):
    access_to_enum::seq::empty_seq
    access_to_enum::seq::newtype
    access_to_enum::seq::struct_
    access_to_enum::seq::tuple
    access_to_enum::seq::unit
    access_to_enum::seq::wrong_tag
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants