Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libserialize: encoding non string key map to json silently produces invalid json instead of failing #19490

Merged
merged 4 commits into from
Jan 19, 2015

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Dec 3, 2014

importing object type string key maps is still supported
writing them should be explicit, and can be done as follows

let some_tree_map : TreeMap<String, Json> = ...;
Json::Object(some_tree_map).to_writer(&mut writer);

related to #8335, #9028, #9142

@lifthrasiir
Copy link
Contributor

This seems too "ugly" for normal uses. Can we do better? For example, we may optionally implement (or insert something like if TypeId::of::<K>() == TypeId::of::<String>(), which rustc will optimize out) an object version of maps, then fall back to this code.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 3, 2014

i'd prefer keeping it explicit.
if you want to manually create json structures, do it with the Json objects.
if you want serialization, have as little surprises as possible.
if you want deserialization, allow reading everything that is legal and matches what you expect to read

@lifthrasiir
Copy link
Contributor

The autoserialization is commonly used for types that are not aware of serialization (but could have been made aware of it). Adding Json::Object to such types would be annoying.

(I have made it clear that this specific issue is subjective, as "ugly" is quoted, but I don't think that making the output format overly unconventional is good in general. If that were a concern we should have used some other format specific to us.)

@alexcrichton
Copy link
Member

I agree with @lifthrasiir that this seems like the wrong default. This means that any users encoding a HashMap are encoding into an array of arrays, right?

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 6, 2014

yes, which is the only sensible way for non-string (or not trivially convertable to string) keys.
another way would be to serialize the key to json, escape it, and use that as the key.
a third way would be to differentiate between string/numeric keys and all other keys. which would probably cause a lot of confusion (wrap a string in a newtype and you get a different serialization)

@alexcrichton
Copy link
Member

I think that not serializing maps as maps is such a gotcha that serialize::json would not be very useful in the world at large. I personally think it's OK for serialization to just fail for non-string keys as that's simply just a limitation of the json format. All serializers shouldn't necessarily strive to be able to encode all Rust types in existence I believe.

@sfackler
Copy link
Member

sfackler commented Dec 8, 2014

I agree with @alexcrichton.

I've felt a bit weird with the JSON module's use of Serializable and Deserializable for this reason. It seems more "correct" for those traits to only be used with things that can truly serialize and deserialize anything with an implementation, and we'd have separate traits for JSON. That might end up being way to inconvenient, but it's a bit of a bummer that you don't find out until runtime that some type can't be JSON serialized.

@sfackler
Copy link
Member

sfackler commented Dec 8, 2014

We also serialize enums with fields in a way that feels a bit shoehorned in.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 8, 2014

another alternative would be to output invalid json (map key is a json value, not a string), and to require the json serializer to be created with a flag explicitly allowing that.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 8, 2014

go-lang: http://golang.org/pkg/encoding/json/#Marshal (has flag to convert key to json and escape as string)

python: https://docs.python.org/2/library/json.html (convert to string in python-specific format)

Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one.

.net: http://james.newtonking.com/json/help/index.html?topic=html/SerializationGuide.htm
ToString or custom TypeConverter for converting to string

other languages can't get their head around one or another way either. I agree now with @alexcrichton, an error in serialization is the right way to go. i'll update this PR.

@oli-obk oli-obk changed the title libserialize: allow non string key map import/export libserialize: non string key map export to json fails instead of silently producing invalid json Dec 8, 2014
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 16, 2014

I have tried several schemes:

  1. check if the format is string (starts and ends with a quote) or if it is integer (no quotes, but can be converted to integer), else panic. This is rather hacky and requires unsafe and transmute (as it currently already does)
  2. make emit_map_elt_key generic over a type parameter that requires map keys to implement a trait EncodableAsMapKey<S : Encoder<E>, E> and Encodable. This way there's no function parameter, and Encoders can specify which Types can be map keys by implementing that trait for them. Does not help about the unsafe and transmute, as quotes still need to be added sometimes. Checks at compile-time.
  3. check type through TypeId (+ pass type instead of function to emit_map_elt_key). Forces a limited set of hardcoded types allowed as keys (types, not traits, implementing your own string type won't work).

A few more possible ways to do these checks

  1. create emit_map_elt_key_* for all the types that exist as emit_*
  2. wait for generic specialization and overwrite the Encodable impls for collections to check stuff for json encoder
  3. create emit_map_elt_key_num, emit_map_elt_key_string and emit_map_elt_key_object
  4. add trait parameters to Encoder for choosing a trait that all map keys must implement (checks at compile time)
  5. add trait parameters to Encoder for choosing a trait that all Encodables must implement. (checks at runtime)

@alexcrichton
Copy link
Member

One possible solution would be to set a flag when emitting a string and resetting it in all other emissions, and then after you emit a key if the flag is set you know you emitted a string and otherwise you know that it wasn't a key.

The other direction is also fine where you set a flag when emitting a key and then when you emit a type elsewhere it'll check the flag to make sure it's valid

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from afd14a6 to a8929fa Compare December 16, 2014 16:40
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 16, 2014

That sounds reasonable.
I actually implemented both. when emitting a map key the flag is set, when a string, char or number is written, the flag is checked and number formatting is adjusted and the flag is changed. then the mapkey-function checks the flag again and either panics or clears the flag.

panicking -> error reporting will be done in another PR (after Encoder+PrettyEncoder merge)

@alexcrichton
Copy link
Member

Awesome, looks great! Could you actually go a head an add a variant to the error enum returned to reflect this new condition? (as opposed to waiting to a further PR)

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from a8929fa to b1f8a41 Compare December 18, 2014 10:07
@oli-obk oli-obk changed the title libserialize: non string key map export to json fails instead of silently producing invalid json libserialize: encoding non string key map to json silently produces invalid json instead of failing Dec 18, 2014
@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from c85de97 to fa718dd Compare December 18, 2014 14:32
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 18, 2014

r? @alexcrichton
not the prettiest way to do this, but converting errors only works nicely through the try! macro

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch 2 times, most recently from aef867a to bc0236f Compare December 18, 2014 15:46
@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from bc0236f to 66426be Compare December 19, 2014 12:04
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 19, 2014

rebased and fixed

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from 66426be to d3e903b Compare December 23, 2014 11:20
@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 23, 2014

fixed unit test, rebased and fixed rebase fallout

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch 3 times, most recently from 135da92 to 81173a3 Compare January 9, 2015 10:16
@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 9, 2015

phew, good thing it took me so long to get this ready... it doesn't work!
if a single sub-element of the map key is a valid map key, the flag is set and the map key is accepted.
what i could do is blacklist all invalid elements, but any changes in the future to the Encoder or Decoder trait might mess that up again.

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch 2 times, most recently from b3fb183 to c0490ae Compare January 9, 2015 15:16
@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 9, 2015

@alexcrichton it is fixed and tested now, i actually messed up the unit test before, that's why the mistake wasn't caught by it.
make check-stage1 ran through, make check is running right now
edit: make check finished

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 9, 2015

should i squash the fix?

@alexcrichton
Copy link
Member

Sure!

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from c0490ae to b262ab5 Compare January 10, 2015 09:15
@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 10, 2015

done

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 11, 2015

@alexcrichton those failures don't seem to be mine, should i rebase or sth?

@oli-obk oli-obk force-pushed the json_non_string_key_maps branch from b262ab5 to a320149 Compare January 19, 2015 13:22
@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 19, 2015

rebased

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 19, 2015

the only fix required was changing a 0..n to a ..n

@alexcrichton
Copy link
Member

@bors: r+ a320149

@bors
Copy link
Contributor

bors commented Jan 19, 2015

⌛ Testing commit a320149 with merge 7f8c687...

bors added a commit that referenced this pull request Jan 19, 2015
importing object type string key maps is still supported
writing them should be explicit, and can be done as follows

```rust
let some_tree_map : TreeMap<String, Json> = ...;
Json::Object(some_tree_map).to_writer(&mut writer);
```

related to #8335, #9028, #9142
@bors bors merged commit a320149 into rust-lang:master Jan 19, 2015
@oli-obk oli-obk deleted the json_non_string_key_maps branch January 23, 2015 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants