Single constructor data types and JSON serialization #75

Closed
boisgera opened this issue Aug 19, 2020 · 4 comments · Fixed by #76

Comments

@boisgera

Hi everyone!

Until recently I worked under the assumption that a pandoc-types data type with a single constructor (say Format) had its type erased from the JSON representation: instead of {"t": type, "c": content}, the representation was simply content.
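To make the two shapes concrete, here is a minimal Python sketch. The helper names `tagged` and `erased` are made up for illustration and belong to no library; the RawInline value reflects the usual pandoc JSON, where the Format argument shows up as a bare string.

```python
import json


def tagged(type_name, content):
    """Constructor-tagged form: {"t": type, "c": content}."""
    return {"t": type_name, "c": content}


def erased(content):
    """Type-erased form: the content stands for itself."""
    return content


# RawInline is tagged, while its Format argument is erased to a plain string.
raw_inline = tagged("RawInline", [erased("html"), "<br>"])
print(json.dumps(raw_inline))
```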

I think that (maybe with the exception of Meta?) this assumption was valid until the recent changes to the document model. Now, AFAICT, some types with a single constructor have their types erased and some don't. I thought for a moment that the difference was that some were declared with the newtype keyword (type erasure) and some with the data keyword (no type erasure), which would make sense (if I understand correctly the difference between the two keywords in Haskell), but this second hypothesis doesn't hold either.

Could anyone explain to me whether there is a simple rule, based on the definitions of the pandoc types, that says whether the type of the data will be erased in the JSON representation?

AFAICT, Format (a newtype) has its type erased in JSON, but RowSpan (also a newtype) has its type serialized. Cell (declared with data) also has its type serialized. Unfortunately, I don't know enough Haskell to pinpoint which parts of the code explain the difference between these cases...
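As a stopgap on the consumer side, a reader can accept both encodings. The sketch below assumes that the tagged form of RowSpan follows the same {"t": ..., "c": ...} shape as other constructors (for instance {"t": "RowSpan", "c": 2}); `unwrap` is a hypothetical helper, not an existing API.

```python
def unwrap(value, type_name):
    """Return the content of `value`, with or without a type tag.

    Assumption: a tagged value is a dict of the form
    {"t": type_name, "c": content}; anything else is treated as
    already type-erased content.
    """
    if isinstance(value, dict) and value.get("t") == type_name and "c" in value:
        return value["c"]
    return value


print(unwrap({"t": "RowSpan", "c": 2}, "RowSpan"))
print(unwrap(2, "RowSpan"))
```

This only works when the erased content cannot itself be mistaken for a tagged object, which holds for integers and strings but not in general.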

The context: I have developed a Python library (https://github.com/boisgera/pandoc) that reads the pandoc-types data model (for as many versions of pandoc as possible) to automatically reproduce the equivalent hierarchy of classes in Python, so that JSON data can be exchanged with the available pandoc executable to work with a pandoc document representation in Python. The target audience is people (first and foremost: me 😉) who need to analyze and transform a document with a nice AST and are fluent in Python but not so much in Haskell (or in Lua). To continue to do that, I need to be able to automatically infer the JSON serialization rule for each data type from the output of :browse Text.Pandoc.Definition in ghci. This is why a simple and mechanical rule would help!
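The kind of mechanical rule I have in mind could look like the sketch below. It encodes my original hypothesis (exactly one constructor means the type is erased), which is an assumption and not confirmed behaviour; it also assumes each declaration fits on one line, which the real :browse output does not guarantee. Both function names are hypothetical.

```python
def constructor_count(declaration):
    """Count the constructors in a one-line data/newtype declaration."""
    # Everything after the first "=" is the list of alternatives.
    _, _, rhs = declaration.partition("=")
    return len([alt for alt in rhs.split("|") if alt.strip()])


def is_type_erased(declaration):
    """Hypothesised rule: a single constructor means no {"t": ...} tag."""
    return constructor_count(declaration) == 1


print(is_type_erased("newtype Format = Format Text"))
print(is_type_erased(
    "data Alignment = AlignLeft | AlignRight | AlignCenter | AlignDefault"
))
```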

Cheers,

SB

@jgm
Owner

jgm commented Aug 21, 2020

I think we should be as consistent as possible here. Maybe @despresc can comment on whether there was a reason for the different behavior in the case of RowSpan and Cell. If not, we should change this.

@despresc
Contributor

No, there was no particular reason that I can remember. I think I just missed that newtype-defined types were serialized without their type information when I was looking at the other instance definitions.

@despresc
Contributor

despresc commented Aug 29, 2020

Or other single-constructor data types, for that matter. I think I just copied what Inline and Block did for serialization. That means that TableHead, Caption, and similar types should all be changed too.

@jgm
Owner

jgm commented Aug 30, 2020

This will mean another version bump in pandoc-types, but I think it's probably worth making these changes so that we have a consistent JSON serialization scheme.
