Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed bugs with different enum formats, closes #3 #4

Merged
merged 16 commits into from
May 14, 2021

Conversation

awestlake87
Copy link
Contributor

It looks like the problem mainly came from the deserialize_any function. Apparently, serde does some intermediate buffering of the fields in internally-tagged and untagged enums before completely deserializing them. Because of this, the deserialize_any field needs to be able to differentiate between integers, floats, arrays, and maps.

I changed this function to call the visit_i64 function if the JsValue is a safe integer. The serde ContentDeserializer would not cast a f64 into an i64 during deserialization, so I prioritized the i64 if it meets the criteria of a safe integer. Serde will later cast this back into a float if necessary.

Additionally, in order to get the content of those enum variants working, I had to add support for maps and sequences in deserialize_any.

In the tests, I removed the variants that were incompatible with internally-tagged enums and added special cases in assert_json for untagged enum variants. These variants are not necessarily objects, so they did not play well with JSON::stringify (at least for me).

@awestlake87 awestlake87 changed the title Fixed bugs with different enum formats (issue #3) Fixed bugs with different enum formats, closes #3 Oct 21, 2019
Copy link

@swfsql swfsql left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this pr. I haven't tested it, but looks good. And I reafirm that serde redirects into deserialise_any for tagged enums (also for flattened structures if I'm not mistaken)

tests/serde.rs Show resolved Hide resolved
tests/serde.rs Outdated Show resolved Hide resolved
@@ -236,9 +236,17 @@ impl<'de> de::Deserializer<'de> for Deserializer {
} else if let Some(v) = self.value.as_bool() {
visitor.visit_bool(v)
} else if let Some(v) = self.value.as_f64() {
visitor.visit_f64(v)
if js_sys::Number::is_safe_integer(&self.value) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these conversions covered by new tests somehow?

Generally I'd be wary of extending deserialize_any with casts, and prefer that users specify explicit destination types, unless there is a compelling reason to use dynamic heuristics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this gets into the weeds of how serde handles internally tagged enums. Basically what happens is serde buffers the variant into an intermediate Content structure. When it encounters a JS number, it stores it as an f64, which ends up breaking deserialization for integer types when it converts this intermediate structure to the one we actually want to deserialize.

The same is true for sequences and objects, which is why I added checks for the Array and Object types.

In the test cases, I test one with type B as a float, and one with type B as an int. I believe this also covers the Object type, since the struct has to be deserialized as a map first, but I'm not certain. We will probably want to add a test for a sequence type (I guess I only verified that this worked in my project).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ughh this is unfortunate, but makes sense...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back to this - can you please make sure to cover these branches by tests (outside of internally tagged enums)? I want to make sure that assumptions about representation of new types are clearly defined for future regressions. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running into a few problems with this. I added two variants to the test_enum! macro, making it look like this:

macro_rules! test_enum {
    ($(# $attr:tt)* $name:ident) => {{
        #[derive(Debug, PartialEq, Serialize, Deserialize)]
        $(# $attr)*
        enum $name<A, B> where A: Debug + Hash + Eq {
            Unit,
            Newtype(A),
            Tuple(A, B),
            Struct { a: A, b: B },
            Sequence(Vec<A>),
            Map(HashMap<A, B>),
        }

        test_via_json($name::Unit::<String, i32>);
        test_via_json($name::Newtype::<_, i32>("newtype content".to_string()));
        test_via_json($name::Tuple("tuple content".to_string(), 42));
        test_via_json($name::Struct {
            a: "struct content".to_string(),
            b: 42,
        });
        test_via_json($name::<i32, i32>::Sequence(vec![52, 1, -124, 23, -65]));
        test_via_json($name::<String, i32>::Map(
            vec![("a".to_string(), 12), ("abc".to_string(), -1161), ("b".to_string(), 64)].into_iter().collect()
        ));
    }};
}

This should at least touch on some sequences and maps in non-internally tagged enums. However, I ran into the following error with this setup:

failures:

---- serde::enums output ----
    error output:
        panicked at 'assertion failed: `(left == right)`
          left: `"{\"Map\":{}}"`,
         right: `"{\"Map\":{\"abc\":-1161,\"b\":64,\"a\":12}}"`', tests/serde.rs:37:9
        
        Stack:
        
        Error
            at module.exports.__wbg_new_59cb74e423758ede (serde-wasm-bindgen/target/wasm32-unknown-unknown/wbg-tmp/wasm-bindgen-test.js:524:21)
            at console_error_panic_hook::Error::new::h274f015c600dfb3e (wasm-function[6169]:0x2018c8)
            at console_error_panic_hook::hook_impl::h3ca87fb44cb9eea6 (wasm-function[1239]:0x151671)
            at console_error_panic_hook::hook::h2b7999185aaf7486 (wasm-function[6797]:0x20b038)
            at core::ops::function::Fn::call::hc246242b419002f9 (wasm-function[6193]:0x201f6b)
            at std::panicking::rust_panic_with_hook::h8540df5fb5cc4da7 (wasm-function[1847]:0x181b27)
            at std::panicking::continue_panic_fmt::h3082507e3bf1767e (wasm-function[3830]:0x1cf7a9)
            at std::panicking::begin_panic_fmt::hd9925b25492d4f84 (wasm-function[6026]:0x1ff2bf)
            at serde::assert_json::h2af830e0b2abbb33 (wasm-function[186]:0x6ba78)
            at serde::test_via_json::hd30427c9dd2fd6e6 (wasm-function[1698]:0x1777a1)
        
        
    
    JS exception that was thrown:
        RuntimeError: unreachable
            at __rust_start_panic (wasm-function[7985]:0x2160dc)
            at rust_panic (wasm-function[7342]:0x211ef4)
            at std::panicking::rust_panic_with_hook::h8540df5fb5cc4da7 (wasm-function[1847]:0x181b4e)
            at std::panicking::continue_panic_fmt::h3082507e3bf1767e (wasm-function[3830]:0x1cf7a9)
            at std::panicking::begin_panic_fmt::hd9925b25492d4f84 (wasm-function[6026]:0x1ff2bf)
            at serde::assert_json::h2af830e0b2abbb33 (wasm-function[186]:0x6ba78)
            at serde::test_via_json::hd30427c9dd2fd6e6 (wasm-function[1698]:0x1777a1)
            at serde::enums::h8c39b7b5568167b3 (wasm-function[89]:0x1eb44)
            at core::ops::function::FnOnce::call_once::hf429f4f99c997298 (wasm-function[7207]:0x21078e)
            at wasm_bindgen_test::__rt::Context::execute_sync::{{closure}}::hc76b1a4e656b2ee3 (wasm-function[5241]:0x1f0cde)

failures:

    serde::enums

test result: FAILED. 10 passed; 1 failed; 0 ignored

Not sure why the HashMap is being skipped over here, but it appears to be another bug (unless I'm missing something). Note that serializing structs works perfectly fine, but not HashMap, which I believe works fine for serde_json.

Another thing I saw was due to the way assert_json checks the serialization results. When I add another test case to test_enum! for sequences of floats:

test_via_json(InternallyTagged::<f32, ()>::Sequence(vec![12.4, 11.0, -124.5, 232.2, -65.56]));

I get this error due to floating points not being serialized exactly the same by serde_json and js_sys::JSON:

---- serde::enums output ----
    error output:
        panicked at 'assertion failed: `(left == right)`
          left: `"{\"Sequence\":[12.399999618530273,11,-124.5,232.1999969482422,-65.55999755859375]}"`,
         right: `"{\"Sequence\":[12.4,11.0,-124.5,232.2,-65.56]}"`', tests/serde.rs:37:9
...

This indicates that we need a more robust method of asserting that values are serialized and deserialized properly in order to provide more comprehensive coverage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So after looking into df09c8b, I'm a little unclear as to why iterables get excluded there. I agree that we should have a serialization switch for Map vs Object, but if we don't at the moment, then isn't it ok just to allow deserialization from iterables?

It might not be the best policy, but it wouldn't be unreasonable until a serde(with = ...) switch is implemented or serde-rs/serde#1183 gets resolved in my opinion. At the very least, it might save someone a gotcha if they just naively replace their serde_json code with serde_wasm_bindgen.

I suspended that changeset and ran this test and it seemed to do fine:

assert_eq!(
    InternallyTagged::Map(
        vec![
            ("a".to_string(), 12), 
            ("abc".to_string(), -1161), 
            ("b".to_string(), 64)
        ].into_iter().collect()
    ), 
    from_value::<InternallyTagged<String, i32>>(
        js_sys::eval(r"
            var m = new Map(); 
            m.set('tag', 'Map'); 
            m.set('a', 12);
            m.set('abc', -1161);
            m.set('b', 64);
            m
        ").unwrap()
    ).unwrap()
);

I'll upload the deserialization test and the README change so you can look at it, though.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hesitation with allowing such conversions that is that someone will most likely start relying on that behaviour, and removing it afterwards in a favour of "the right way" will be a breaking change.

I'd rather be as restrictive as possible and throw an explicit error, until someone comes up with a use-case where this is important, rather than silently allow some conversions through and deal with the challenge of deprecating them later.

Copy link
Contributor Author

@awestlake87 awestlake87 May 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to consider is that excluding Iterables also affects untagged enums since they go through deserialize_any as well. So this test fails with df09c8b as well:

    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    #[serde(untagged)]
    enum Untagged<A, B>
    where
        A: Ord,
    {
        Unit,
        Sequence { seq: Vec<A> },
        Map(BTreeMap<A, B>),
    }
    assert_eq!(
        Untagged::Map(
            vec![
                ("a".to_string(), 12), 
                ("abc".to_string(), -1161), 
                ("b".to_string(), 64)
            ].into_iter().collect()
        ), 
        from_value::<Untagged<String, i32>>(
            js_sys::eval(r"
                var m = new Map(); 
                m.set('a', 12);
                m.set('abc', -1161);
                m.set('b', 64);
                m
            ").unwrap()
        ).unwrap()
    );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof and basically maps at any level in the variant. This also fails:

    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    #[serde(tag = "tag")]
    enum InternallyTagged<A, B>
    where
        A: Ord,
    {
        Unit,
        Struct { a: A, b: B },
        Sequence { seq: Vec<A> },
        Map(BTreeMap<A, B>),
        DeeperMap {
            map: BTreeMap<A, B>,
        }
    }

    assert_eq!(
        InternallyTagged::DeeperMap {
            map: vec![
                ("a".to_string(), 12), 
                ("abc".to_string(), -1161), 
                ("b".to_string(), 64)
            ].into_iter().collect()
        }, 
        from_value::<InternallyTagged<String, i32>>(
            js_sys::eval(r"
                var m = new Map(); 
                m.set('a', 12);
                m.set('abc', -1161);
                m.set('b', 64);
            
                ({ 'tag': 'DeeperMap', 'map': m })
            ").unwrap()
        ).unwrap()
    );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I uploaded 793f42b with the changes to tests and an update to the README. I think this should cover what we talked about before I brought up df09c8b, but let me know if you think we should add some more info or tests.

@attente
Copy link

attente commented Nov 24, 2020

Hi, is there any chance of this getting merged and released on crates.io? It would be great to have this fixed!

@egasimus
Copy link

Hi folks, what's the status on this?

Copy link
Owner

@RReverser RReverser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've realised that as much as I wanted to try and do some final clean-ups here, this gets too tricky due to Serde's handling of certain types, and I won't find time to figure it out further.

The tests are looking good, and I trust your implementation, so I'm just doing to merge this and release as a new version on crates.io, and if there are more edge cases we could handle, we'll hopefully get further issues for them and address separately.

Sorry for blocking everyone for so long.

@RReverser RReverser merged commit 476359a into RReverser:master May 14, 2021
@RReverser
Copy link
Owner

RReverser commented May 14, 2021

Published as 0.2.0. Could probably be 0.1.x, but the complexity of some changes made me uncomfortable about potentially breaking existing code. Thanks again @awestlake87!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants