Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement JSON::Any and YAML::Any without recursive aliases #5183

Merged
merged 1 commit into from
Jan 15, 2018
Merged

Implement JSON::Any and YAML::Any without recursive aliases #5183

merged 1 commit into from
Jan 15, 2018

Conversation

asterite
Copy link
Member

Related to #5155

(this PR isn't finished, but it's working well)

So, I managed to implement JSON.parse and YAML.parse without recursive aliases.

JSON.parse still returns JSON::Any, but this is a struct whose raw value contains all possible types such as Nil, String or Array(JSON::Any), so an array of these same structs - that's how we achieve recursion without recursive aliases (instead of it being Array(JSON::Type), which doesn't exist anymore).

Some things I noticed when doing this:

  1. In YAML a Hash (mapping) can have keys of any YAML type, not just strings. With recursive aliases, the key is simply a YAML::Type, and because it's a union it can be an Int64 or a String, for example. Without recursive aliases the key is a YAML::Any, which isn't a union. So, when you index a Hash(YAML::Any, YAML::Any) with a String, that won't work out of the box, unless we make comparisons between String and YAML::Any work both ways, and their hash values are the same. This is just a matter of reopening these types and overloading ==, but it still a bit surprising. In any case, it was a bit surprising before because YAML::Any==(String) would work well but String==(YAML::Any) wouldn't, so that was necessary even with recursive aliases.

  2. The overall API ends up being nicer and more consistent in my opinion.

For example, #3158 won't exist. For example this works just fine now:

require "json"

json = JSON.parse("[{\"key\": \"value\"}]")
array = json.as_a
first = array[0]
hash = first.as_h
value = hash["key"].as_s
p value # => "value"

Trying to do the same with the existing code we get:

require "json"

json = JSON.parse("[{\"key\": \"value\"}]")
array = json.as_a
first = array[0]
hash = first.as_h # undefined method 'as_h' for Array(JSON::Type)

Super inconsistent. At that point we have to do JSON::Any.new(first).as_h or first.as(Hash). Why in one place we can use as_a and in another place we can use .as(Hash).

So maybe we just get rid of JSON::Any and use the recursive definition? We'd have to cast with as every time we want to use a value, but at least it would be consistent.

Unfortunately, that will work, but if we want, for example, to include a property in JSON.mapping that can be of any JSON type (so basically, just parse it to the recursive type) we can't do it with a recursive alias because recursive aliases can't have methods. We need to provide a new(JSON::PullParser) method and that's impossible to do. Or introduce a type that has this method (basically JSON::Any) but it's just simpler to have one type and not two.

I still think that doing it this way is the best way:

  • It's consistent
  • Less typing with as_a, as_h, etc. instead of .as(Array), .as(Hash)
  • Less redundant features in the language (less work for the core team, less things to learn, less possible bugs, etc.)

I also think that removing recursive aliases will make it much easier to implement generic aliases because we don't have to deal with recursive generic aliases.

In any case, I don't think this will get merged before 0.24.0, and even after I'll have to discuss it with the team at Manas.

@ysbaddaden
Copy link
Contributor

I'm a little surprised to see the following changes in specs, isn't Any supposed to avoid explicit casts for simple/quick parsing?

data[0]      => data.as_a[0]
data["key"]  => data.as_h["key"]
data.each {} => data.as_a.each {}

I do not dispute the whole change. Removing recursive aliases is probably a good thing.

I'm not of fan Any anyway; I prefer raw with ugly/explicit casts or using PullParser directly. For example I'd probably have generators for parsing JSON based on a json-schema in Rails-like controllers (with proper validation). Yet, for quick prototypes or simple JSON I understand that Any with transparent accessors is nice sugar.

@asterite
Copy link
Member Author

Oh, yes, I forgot to mention that. I decided to remove those, and also remove the include Enumerable from them, because why just provide [] methods to easily traverse the data structure? Why just each? What if we have an array and we want to reverse it? Or do permutations? Or determine if a hash has a key? Providing just a few methods might be more confusing than providing none. It's not that hard to do to_a and then end up with an Array for which you already know all the methods, one which is Enumerable and works well, etc.

Another thing, JSON::Any was Enumerable and it worked for hashes, but it would yield two values instead of a single value with a tuple. So if you'd do each_with_index |(k, v), i| that wouldn't work. So instead of introducing inconsistencies I prefer to remove them and stick to plain old good APIs.

We could add those convenience methods, though, I'm just not sure it's worth it.

@RX14
Copy link
Contributor

RX14 commented Oct 26, 2017

@asterite because traversing arrays/hashes to get to the part of the data you want, then casting and doing advanced stuff with the result is the 95% usecase. And that's what Any is for: a quick hack to get to some data. If that quick hask is hard to use, we might as well go all the way and force everyone to use JSON.mapping/Array.from_json/Hash.from_json properly. I wouldn't mind that. I personally don't use Any, but I can see that it might be useful for playing around without going through the hassle of mapping the data correctly.

@asterite
Copy link
Member Author

@RX14 If that's the case (it's probably is), we can add those methods back (though I wouldn't make Any an Enumerable) but still implement it without recursive aliases because it seems to be more consistent.

@RX14
Copy link
Contributor

RX14 commented Oct 26, 2017

Loosing enumerable is fine. I still don't see the point to removing recursive aliases. There seems to be no point other than simplifying the compiler, which I don't think is ever a strong enough reason to remove a useful feature itself.

@asterite
Copy link
Member Author

What about #3158 ?

I suggest removing recursive aliases because they are almost exactly the same as implementing them with a struct with an instance variable that accomplishes the recursion. It's not just simplifying the compiler for my benefit, it's simplifying the user experience because they don't have to learn a new complex concept.

@RX14
Copy link
Contributor

RX14 commented Oct 26, 2017

I don't see how #3158 relates to recursive aliases being bad. Nobody in that thread has problems with recursive aliases, just the api of the Any types not offering the option of mapping collections contents to Any.

@asterite
Copy link
Member Author

Exactly. But to offer that kind of API we need to do map, which is just slow and ugly. Using the plain old struct approach gives us a nice and efficient API for free. Recursive aliases just make things worse in this case.

@RX14
Copy link
Contributor

RX14 commented Oct 26, 2017

That's an argument for implementing JSON::Any by parsing into JSON::Any directly instead of JSON::Type, not for removing recursive aliases for other usecases, or for removing the old recursive JSON::Type.

@asterite
Copy link
Member Author

@RX14 You probably read this already, but I'd like to share this comment about Nim: https://news.ycombinator.com/item?id=15577922

To be honest, I read the Nim manual some time ago but was baffled with it, as there's so many things you can do, and so many ways to do it, that whenever I want to start doing something it's hard. That probably also means there are more efficient ways to do some things than in Crystal, but it comes at the cost of confusing the user. For experts, it's fine, though.

That's why Go, despite not being liked because it lacks generics, map, select, etc., is used a lot and still liked and defended by many (I like it too, though I never used it for anything): it's simple, with very few ways to do a same thing, but you can still get stuff done. The cost is probably duplicating a bit of code here and there, not sure how often that happens. Maybe it's a reasonable tradeoff.

Once again, I'm glad we removed aliases in methods: I would sometimes like to have these aliases back like in Ruby for greater flexibility (Array#first and Array#take, for instance), but it's probably a bad idea because it adds more ways to do the same things, something which is usually a complaint, and goes against simplicity.

Regarding recursive aliases, or any feature, I always like to imagine what if we never had that feature in the first place. Imagine when me and @waj tried to implement JSON, instead of thinking "we need a recursive type, let's add recursive aliases to the language" we had said: "we need a recursive type, we can implement this with a struct and an instance variable". Recursive aliases would have never existed then.

The question to ask is: would the API or user experience had been dramatically different? I don't think so. In fact, the user experience is probably slightly better regarding casting and consistency. It might be a bit inconvenient to do case v = json.raw instead of just case json, but that's it.

So, without recursive aliases there are less ways to do things with more or less the same outcome. When a user is faced with this problem, it's easy: you use a struct/class with an instance variable that does the recursion (or just use record, as shown above, it's actually almost the same syntax as a recursive alias!). There's no need to think "Hm, should I use a recursive alias and maybe then provide a nice wrapper on top of it? Or should I use another approach?".

Anyway, many ways to say the same thing 😊

@RX14
Copy link
Contributor

RX14 commented Oct 29, 2017

@asterite thank you for your comment, I think you've managed to convince me. That also begs the question, what other features should be simplified or removed? When adding new features, how will we maintain this ethos? Is this kind of simplicity already a core tenet of crystal or does removing recursive aliases represent a shift in the goals of the language (and is that a good thing)?

I don't know the answer to these questions and I think anyone does, but I think that the general design principles of crystal need to be distilled and codified.

So that's why I think we should make the Any types not use recursive aliases, as that's something we can do right now and is a sensible change to make whether recursive aliases stay or not. But I don't think we should remove recursive aliases right now, or even remove JSON::Type in its current form, until we have a bit more discussion.

@lbguilherme
Copy link
Contributor

This change about JSON and YAML is good and should be merged, regardless of the fate of recursive aliases.

JSON::Any#[] for int and string should exist because it is surely most of the usecase of JSON/YAML: navigating until you extract a single value. If the person want to do anything more, they can use .as_a/.as_h as needed. (Removing Enumerable is good)

@hanyuone
Copy link

Bump: any updates?

@asterite
Copy link
Member Author

We can go forward with this. But if we do, we should remove recursive aliases from the language. That's the whole point of this PR, to prove that they aren't a necessary feature. Should we?

@luislavena
Copy link
Contributor

But if we do, we should remove recursive aliases from the language. That's the whole point of this PR, to prove that they aren't a necessary feature. Should we?

@asterite as other have voiced, I think this change should be merged and the fate of recursive aliases be dealt separately. I think this is the necessary and first step in prove that: if Crystal's own codebase don't use recursive aliases, then it makes sense to remove it.

There are some UX points in #5155 that I think will be resolved by more usage of this feature.

Cheers.

@asterite
Copy link
Member Author

@luislavena Thanks for the feedback! Sounds good. Someone (one or two) should approve this and then we can merge this.

@asterite asterite added this to the Next milestone Jan 15, 2018
@asterite asterite merged commit 597ccac into crystal-lang:master Jan 15, 2018
@oprypin
Copy link
Member

oprypin commented Feb 14, 2018

With the example "How to recursively change Hash values and save each change to new Hash"
I am sold on this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants