This repository was archived by the owner on Oct 7, 2020. It is now read-only.

Consider switching back to aeson-generated json serialization? #109

Closed
mgsloan opened this issue Nov 28, 2015 · 14 comments

@mgsloan
Collaborator

mgsloan commented Nov 28, 2015

#107 caused me to consider code generation for other editors to be an advantage of generated JSON instances. Once more editor code is written, it'd be good not to have sweeping changes to the protocol, so we ought to be sure the current approach is best. Despite the consensus in #32, it seems worthwhile to revisit the decision.

Pros

  • Allows for generating serialization code for editors (somewhat related to [RFC] Ide generation and Guidance Proposition #107). Code generators for aeson's default serialization would generally be a very useful thing to have.
  • Reduces boilerplate
  • If consistent type / field naming is used in Haskell, we also get consistent JSON naming.

Cons

  • Potentially uglier JSON

Solvable Cons

  • Easier to accidentally change the protocol
    • Solution: Generate a file every compilation, possibly by pprinting all of the types, sorted by name. Stuff that isn't relevant to serialization, such as which instances are derived, would get set to some default ([]).

      The file would be part of the repo and would store a protocol version number. When this file differs from the HEAD version of the file, it'll force a version bump. Some differences may be benign. In this case, you can force the version number back to what it was, and commit the new version of the file.

      The file could also store an aeson version number. While this usually shouldn't make a difference to the protocol, it would be a good idea to make sure that aeson didn't change any serialization details when upgrading it.

  • A generated FromJSON instance can't accept old formats.
    • Solution: Keep the generated ToJSON instance, and add an explicit FromJSON instance. This way, code generation will still work.
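As a sketch of that last solution, keeping aeson's derived ToJSON while hand-writing a FromJSON that also accepts an older wire format might look like this (the SessionConfig type, its fields, and the "old format" are hypothetical, not taken from the hie codebase):

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Control.Applicative ((<|>))
import Data.Aeson
import qualified Data.Text as T
import GHC.Generics (Generic)

-- Hypothetical type for illustration; not taken from the hie codebase.
data SessionConfig = SessionConfig
  { scRootDir :: FilePath
  , scGhcOpts :: [String]
  } deriving (Show, Generic)

-- Serialization stays generated, so external code generators can rely on it.
instance ToJSON SessionConfig

-- Parsing is written by hand so it can also accept an older wire format
-- (here: a bare string that used to mean "root dir only").
instance FromJSON SessionConfig where
  parseJSON v = newFormat v <|> oldFormat v
    where
      newFormat = withObject "SessionConfig" $ \o ->
        SessionConfig <$> o .: "scRootDir" <*> o .: "scGhcOpts"
      oldFormat = withText "SessionConfig" $ \t ->
        pure (SessionConfig (T.unpack t) [])
```

Because the ToJSON side is still the stock generic derivation, an external code generator that understands aeson's default encoding keeps working.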
@alanz
Collaborator

alanz commented Nov 28, 2015

For the code generation, there are two cases

  • We use Haskell, in which case we can use whatever JSON instances there are, regardless of their origin
  • We don't use Haskell, in which case some specific process needs to happen to adapt to the hie JSON. Admittedly, if there is a known scheme for generating the JSON, this can be a lot easier.

For the second case, it may be useful to get hie to be able to spit out a http://swagger.io/ description, or use http://json-schema.org/ (as previously suggested).
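A json-schema.org style description could even be assembled by hand with aeson's own Value type; a minimal sketch, for a purely hypothetical request shape (field names are illustrative only):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson

-- Hand-assembled json-schema.org style description for a hypothetical
-- request type; the "file" and "lines" fields are stand-ins.
reformatRequestSchema :: Value
reformatRequestSchema = object
  [ "type"       .= String "object"
  , "properties" .= object
      [ "file"  .= object [ "type" .= String "string" ]
      , "lines" .= object
          [ "type"  .= String "array"
          , "items" .= object [ "type" .= String "integer" ]
          ]
      ]
  , "required"   .= [ String "file" ]
  ]
```

A swagger.io description would follow the same pattern, just with the extra endpoint metadata that format requires.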

@gracjan
Contributor

gracjan commented Nov 28, 2015

The shape of the protocol should be the source for the code, not the other way around.

Rationale: the interface is and will remain the most stable part, because changing it requires simultaneous changes to multiple editors. This is hard and therefore will not happen often.

If you want to generate something from something else then generate Haskell from JSON schema, not the other way around.

@alanz
Collaborator

alanz commented Nov 28, 2015

@gracjan, I think we are in agreement on the protocol format being the driver for the code, but the question is how to manage its stability. @mgsloan has proposed a way of detecting changes so that there can be a version bump, indicating changes.

I think the unstated question is whether we want "beautiful" JSON or whether the straight aeson instances are ok.

And this to some extent comes down to the experience that the IDE plugin writers will have, and I do not have much insight into that. Any comments?

@mgsloan
Collaborator Author

mgsloan commented Nov 28, 2015

Generating Haskell from JSON schema is a reasonable solution! It'd be interesting to see how well that works out.

One of the reasons I like the idea of generating the schema from Haskell datatypes is that this way you can use more specific types (and so use newtypes that have additional checks, etc.). I imagine that it's tricky to have a "schema --> Haskell" generator which supports the full gamut of possible datatypes, but maybe that isn't necessary.
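The "newtypes with additional checks" point can be made concrete; a small sketch (the LineNumber name and its invariant are hypothetical, chosen only to illustrate the pattern):

```haskell
import Data.Aeson
import Data.Aeson.Types (Parser)

-- Hypothetical newtype; the name and the invariant are illustrative only.
newtype LineNumber = LineNumber Int
  deriving (Show, Eq)

-- The extra check lives in the FromJSON instance, so malformed input is
-- rejected at the protocol boundary instead of deep inside a command.
instance FromJSON LineNumber where
  parseJSON v = do
    n <- parseJSON v :: Parser Int
    if n >= 1
      then pure (LineNumber n)
      else fail ("line numbers start at 1, got " ++ show n)

instance ToJSON LineNumber where
  toJSON (LineNumber n) = toJSON n
```

A "schema --> Haskell" generator would typically see only "integer" here and could not recover the invariant, which is the asymmetry being described.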

@cocreature
Collaborator

cocreature commented Nov 28, 2015

As someone who has been working on IDE bindings, I don't particularly care how they look. We need documentation and examples either way (I hope I get to the automatic example generation part soon), so I don't think having JSON that looks a bit unconventional is actually a problem.

@alanz
Collaborator

alanz commented Nov 28, 2015

Ok, so it seems that the important things are

  • Given that the wire protocol is machine to machine, it does not have to be "pretty"
  • It is better to allow more precise typing on the Haskell side
  • Explicit versioning of the wire protocol is vital


@JPMoresmau
Contributor

I don't think it's a good idea. Having automatic serialization will always leak some of our Haskell into the protocol. I saw this first-hand with the parameters serialization, where we ended up with objects called rp and op, which meant nothing to somebody who doesn't know the Haskell code.

I don't buy the argument that the protocol can be ugly because it's machine to machine. An IDE writer will consider integrating HIE and will look at the protocol. If it makes no sense or is too ugly, he's going to go "WTF is that mess" and move on to better things.

Of course we need to have explicit versioning and solid tests, but we should also leave ourselves the possibility of changing the way the Haskell is written without changing the JSON protocol, so we'll end up with a mix of automatic and hand-written JSON code anyway.
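For what it's worth, aeson does let generated instances diverge from the internal Haskell names via Options. A hedged sketch (the RequiredParams record and its rp prefix are stand-ins for the kind of internal naming being described, not the actual hie types):

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
import Data.Char (toLower)
import GHC.Generics (Generic)

-- Hypothetical record; the "rp" prefix stands in for the kind of internal
-- Haskell naming that leaked into the wire format.
data RequiredParams = RequiredParams
  { rpFile  :: FilePath
  , rpPoint :: (Int, Int)
  } deriving (Show, Generic)

-- Drop the two-character prefix and lowercase the first remaining letter,
-- so the JSON says "file" and "point" instead of "rpFile" and "rpPoint".
dropPrefix :: Int -> String -> String
dropPrefix n s = case drop n s of
  (c:cs) -> toLower c : cs
  []     -> []

jsonOpts :: Options
jsonOpts = defaultOptions { fieldLabelModifier = dropPrefix 2 }

instance ToJSON RequiredParams where
  toJSON = genericToJSON jsonOpts

instance FromJSON RequiredParams where
  parseJSON = genericParseJSON jsonOpts
```

This mitigates the naming leak, though it does not address the deeper point that automatic derivation couples the wire format to the shape of the Haskell types.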

@mgsloan
Collaborator Author

mgsloan commented Nov 30, 2015

Ok, based on this and IRC discussion it seems reasonable to leave it the way it is.

It would be ideal to specify the protocol datatypes as few times as possible, and generate as much code as possible. However, this ideal requires an implementation of such code generation, which does not yet exist. So, it seems pragmatic to continue with the manually written JSON instances.

Sorry for the distraction, but perhaps this will be an interesting alternative to revisit in the future.

@mgsloan mgsloan closed this as completed Nov 30, 2015
@rvion
Collaborator

rvion commented Nov 30, 2015

just for the record (not to reopen the issue)
@mgsloan I was sharing your preference.
Also, I found several more good arguments in favor of code generation since my last comments. I'm keeping them for now. I'll try to find some time to write them down in a well-written post soon.

@alanz
Collaborator

alanz commented Nov 30, 2015

I think there are two separate issues

  • how the json serialisation is generated
  • whether an IDE integration has code generated for it

The solution to the one should not affect the other, and it is up to the specific IDE integrator to decide what is the best way to tackle the task.

@rvion
Collaborator

rvion commented Nov 30, 2015

Regarding code generation:

Again, I really understand that we can't force everyone to buy into the "code-generated all the way" paradigm. I don't want to pollute the discussion, so I'll write a blog post discussing JSON serialisation / code generation soon.

Regarding your sentence:

The solution to the one should not affect the other

I do not completely agree. I think it just depends on the architecture you want.
The two could have to affect each other for several practical reasons.


Some related thoughts I had yesterday while looking at the code

ping @mgsloan

⚠️ (I'm not making any strong proposal here, I'm just trying to broaden the discussion and give other views of the problem)

Imagine if we were:

  • Hiding all references to other tools (ghc-mod, etc)
  • Having types for each high-level command (AutoComplete, Reformat, RenameModule, etc.)
  • Having types for each kind of possible params (FileOnlyContext, ProjectAndFileContext, etc.)
  • Having types for each possible response (PossibleCompletion string description url location, Type ...)
  • Having some closed type family mapping commands to the data they expect as input
  • Having some closed type family mapping commands to the data they answer as output
  • Having default serialisation for all params and responses
  • ...

then

  • With data being defined in separate files, names could be nice without any prefix, and that would be great.
  • About versioning, we could even imagine having the version as a type suffix: AutocompleteV1, CompletionResponseV2, etc. => not so ugly, and full backward compatibility, for free. With closed type families, we'd have full type safety about all requests being properly handled.
  • About "not meaningful" names: in such a scenario, names would all be meaningful: no 'ghcmod' or 'applyrefact'.
  • All actions would be easily integrable: autocompletion would autocomplete language pragmas or function names depending on arguments. When autocompleting a pragma, we could have a nice description and a link to the GHC manual. When autocompleting a function, we could have a link to some hoogle server spawned by hie.

In such a scenario, code generation would make more sense in many ways. Also, IDE integrators wouldn't have to pick between several available linting engines or type-info engines, but would benefit from us having done all the selection and unification work.

(really, I'm just trying to be helpful and give new ideas before the project matures too much) ⛵
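The closed-type-family mapping from commands to their input and output types could be sketched like this (all names are hypothetical, echoing the examples in the list above):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, KindSignatures #-}

import Data.Kind (Type)

-- Hypothetical command and payload names, only to illustrate the shape.
data Command = AutoComplete | Reformat | RenameModule

data FileOnlyContext       = FileOnlyContext FilePath
data ProjectAndFileContext = ProjectAndFileContext FilePath FilePath
data PossibleCompletion    = PossibleCompletion String String

-- Closed type families pin down, per command, what goes in and what
-- comes out; an unhandled command is a compile-time type error.
type family Input (c :: Command) :: Type where
  Input 'AutoComplete = FileOnlyContext
  Input 'Reformat     = FileOnlyContext
  Input 'RenameModule = ProjectAndFileContext

type family Output (c :: Command) :: Type where
  Output 'AutoComplete = [PossibleCompletion]
  Output 'Reformat     = String
  Output 'RenameModule = ()
```

A dispatcher indexed by `c :: Command` would then be forced by the type checker to produce exactly the `Output c` that corresponds to its `Input c`.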

@alanz
Collaborator

alanz commented Nov 30, 2015

It all sounds great, but means that there is a global namespace of commands, which means the task of a plugin-writer gets harder, and it is harder for people to experiment with private plugins which may then grow up to be full-featured ones in future.

I do agree with

  • having types for params, which is currently captured in the context definitions, except with the ability to add extra params.
  • having types for responses. What you are describing is what I have been calling semantic types. They are things that make sense in an IDE environment.

I am not sure about the other mapping suggestions though, unless they can be done in an extensible way, allowing local plugins.

@rvion
Collaborator

rvion commented Nov 30, 2015

I share your concern, and I'm still thinking about all the alternatives.

Here are some partial answers to your points.

To mitigate some of the cons, we could imagine:

  • keeping namespaces for experimentation and tool-specific commands (as it is now). The default namespace (without a tool name) would be reserved for the main commands aggregated and curated by the hie team. A basic IDE writer would only have to deal with the default nameless namespace to have a decent IDE, and could still include specific commands if they want (stuff like: AddLicenseHeaderOnAllFiles).
  • having a GenericJsonRequest and a GenericJsonResponse data type (non-semantic) allowing people to easily experiment with new commands / new return types.
  • with symbols and closed type families, I guess we could even ensure all commands have different names at compile time, almost for free.

To restate the cons:

Extensibility may really be a little bit more difficult, and we will need more consensus about canonical types and commands.
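The GenericJsonRequest / GenericJsonResponse escape hatch could be as small as the following sketch (hypothetical types and field names; the payload is deliberately untyped):

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
import GHC.Generics (Generic)

-- Hypothetical escape hatch: an untyped command that coexists with the
-- semantic, typed ones, so new commands can be prototyped without first
-- agreeing on canonical types.
data GenericJsonRequest = GenericJsonRequest
  { gjrCommand :: String
  , gjrPayload :: Value   -- arbitrary JSON, interpreted by the plugin itself
  } deriving (Show, Generic)

instance ToJSON GenericJsonRequest
instance FromJSON GenericJsonRequest

newtype GenericJsonResponse = GenericJsonResponse
  { gjrResult :: Value    -- arbitrary JSON returned by the experimental command
  } deriving (Show, Generic)

instance ToJSON GenericJsonResponse
instance FromJSON GenericJsonResponse
```

Once an experimental command stabilises, its Value payload can be promoted to a proper semantic type and an entry in the closed type families.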

@alanz
Collaborator

alanz commented Nov 30, 2015

Perhaps we should do both, over time.

I think the current approach will allow us to identify the semantic types of interest, as well as the command set we need.

Once we have a handle on this and the way it works, we can see about putting a type-safe version into place as outlined by @rvion.

They can possibly run concurrently, or have a translation layer at various points.


@alanz alanz added this to the prehistory milestone Feb 2, 2019