Standard MIME content-type #19
Comments
I'd rather prefer In addition to the Media Type, a registered structured suffix may be interesting. In my eyes even more useful, to create media types like See also: @wardi have you considered filing a registration for a json-lines Media Type and structured suffix at IANA? |
There is an IETF RFC 7464 for JSON Text Sequences that uses mime type: It allows prefixing each JSON record with <RS> control character and requires ending each JSON record with <LF>. |
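For illustration, here is a minimal Python sketch of the framing described above: each record is prefixed with the RS control character (0x1E) and terminated with LF. The function names `encode_json_seq` and `decode_json_seq` are hypothetical, and the decoder is deliberately naive: it raises on a malformed record instead of recovering from truncation the way RFC 7464 discusses.

```python
import json

RS = "\x1e"  # ASCII Record Separator, the prefix used by JSON Text Sequences
LF = "\n"    # every record ends with a line feed


def encode_json_seq(records):
    """Serialize an iterable of values as one JSON text sequence string."""
    # json.dumps emits no raw newlines by default, so each record stays on one line.
    return "".join(RS + json.dumps(record) + LF for record in records)


def decode_json_seq(text):
    """Parse a JSON text sequence into a list of values.

    Assumption: input is well formed. A malformed chunk raises
    json.JSONDecodeError rather than being skipped.
    """
    values = []
    for chunk in text.split(RS):
        if chunk.strip():  # ignore the empty chunk before the first RS
            values.append(json.loads(chunk))  # trailing LF is plain whitespace
    return values
```

Dropping the RS prefix from the encoder yields plain newline-delimited output, which is the main difference from JSON Lines being discussed in this thread.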
This seems like a duplicate of #9. The whole purpose of the |
The lack of a definitive IANA Media Type for JSON Lines causes some difficulty for those of us using the format. In the interest of pushing the issue, I took the liberty of starting a conversation: Perhaps someone here would like to join that thread? Disclaimer: I am in no way affiliated with the IANA/IETF. I am merely interested in using the format, correctly. |
@whlavina the response from Tim Bray was the most helpful, and it looks like nothing has happened since then. I'll copy the interesting bit here for reference
|
I am linking the relevant RFC for proposing a new MIME type for standardisation: https://www.rfc-editor.org/rfc/rfc6838.html I propose working on adding the mime type Among the two ways they list to get it added to the standards tree:
I think the second one is the most relevant, which leads to https://www.rfc-editor.org/rfc/rfc5226 |
Hi @sp4ce, good to see that someone is leading the way to an actual RFC! I've noticed that AWS is (apparently) using JSON Lines for one of their products. I haven't seen a description of the actual output to know whether or not it is compatible with JSON Lines. In any case they are using the mime type |
AWS claims it's compatible with JSON Lines: the documentation links to the JSON Lines homepage https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataExport.Output.html |
There's an [ongoing discussion](wardi/jsonlines#19) about what the MIME type for [JSONL](https://jsonlines.org/) files should be. Making it `application/jsonl` leads to the file being downloaded according to my testing, which prevents browsers from opening them in a new window and parsing them as JSON, which fixes btcpayserver#5488.
If there's still interest in doing this, I would recommend an informational-track internet-draft (I-D) to describe the jsonlines specification, with an IANA considerations section registering the media type. The idea is that drafts work towards RFCs, which in turn work towards standards: a long evolutionary track from internet-draft to RFC, and potentially to internet standard.

IETF wants to deal with immutable and permanently available documents, so you will likely need to represent the encoding and parsing requirements authoritatively within the I-D itself, using IETF nomenclature. There are plenty of references for this, and the JSON Text Sequences RFC is likely an excellent example.

I suspect there will be feedback that some areas are not needed. For example, your UTF-8 encoding rule does not have much left to it once you reference the JSON RFC. That RFC already mandates UTF-8 for everything other than closed ecosystems. At that point, you have to decide whether the application "advice" that they might want to escape the string to work on ASCII transports becomes something you might want to represent as an application note on the jsonlines site, and a discussion you have with the IETF more broadly. After all, it would also affect JSON and json sequence data over such transports. Conversely, you may want to be quite a bit more specific for the sake of interoperability, such as whether applications MUST be able to consume |
What's wrong with what ndjson is trying to implement? Their current standard is |
The The lack of an immutable standard (like a RFC with a number) means that ndjson three years from now may make changes along lines like these for robustness, but implementations do not have a clear way to explain what they are compatible with. There are plenty of commercial products which use vendor and x-prefixed media types, and which do not attempt to define fixed/robust/interoperable behavior. It is a matter of what this project is going for, which is why my first words were "If there's still interest in doing this". In terms of ramifications, most SDOs (standard defining organizations) won't touch dependencies which do not have these and other formalisms, and may use things like publication in another SDO (like IETF) as a sign of that. That means ndjson/jsonlines may be used in public facing API, but a large category of interoperable standards work either wouldn't touch it, or will standardize their own similar effort. |
Well that's the problem, it might happen, at some point in the future. Given the usage of JSON lines in various commercial products, we're suggesting we do that formalisation now - or at least start the process very soon! |
I'd love to see this. So do we copy-paste JSON-SEQ https://datatracker.ietf.org/doc/html/rfc7464 without the "ASCII Record Separator (0x1E)"? JSON-SEQ discusses detecting truncated records and continuing a fair bit, all of that could be removed in a new RFC.
Rule 3 in https://jsonlines.org/ mentions that a compliant parser will be able to consume Lines of only whitespace are already invalid by rule 2 in https://jsonlines.org/ , but again it doesn't hurt to make this clear. To be specific let's say that any line that doesn't parse as valid JSON should be treated as an invalid record but still counts as a record for the purpose of numbering the lines. |
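A minimal Python sketch of that proposal, assuming records are separated by `\n` and that a final trailing newline does not start a new record. The function name `parse_json_lines` and the `(line_number, value, error)` tuple shape are hypothetical, chosen only for illustration. Note that Python's `json.loads` is slightly more lenient than strict JSON (it accepts `NaN`, for example), so this is not a conformance checker.

```python
import json


def parse_json_lines(text):
    """Yield (line_number, value, error) for every line in the input.

    A line that fails to parse as JSON, including an empty or
    whitespace-only line, is reported as an invalid record but still
    consumes a line number, so later records keep their numbering.
    """
    lines = text.split("\n")
    # A terminating newline after the last record does not open a new record.
    if lines and lines[-1] == "":
        lines.pop()
    for number, line in enumerate(lines, start=1):
        try:
            # A trailing "\r" from "\r\n" input is whitespace to the JSON parser.
            yield number, json.loads(line), None
        except json.JSONDecodeError as exc:
            yield number, None, str(exc)
```

A caller can then decide per record whether to skip, log, or abort, which leaves the policy question raised in the replies open rather than hard-coding it in the parser.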
Should it count as a record? The whole point of something called JSON Lines is that it stores lines of a well-defined format called JSON, not arbitrary character sequences. Depending on the nature of the malformed data in a line, it might as well invalidate every line after it and flood the logs with parsing-error noise when the offender is a single line (a whole file). |
I think RFCs are copyrighted so to copy paste you would need permission of the original author |
I'm glad to see continued discussion and forward movement. It's interesting to see that YAML just recently (this month) gained IANA media type registration... 22 years after the format was first created. If YAML can do it, JSON Lines can, too! If there's any need for help with the process, maybe we could ask the folks who pushed the YAML RFC? |
Here are the guidelines on how to write an Internet Draft |
@whlavina You folks are welcome to come join the HTTPAPI mailing list https://datatracker.ietf.org/wg/httpapi/about/ and we can chat about a path to registering this media type. This is where the YAML media type registration RFC was created, and we are working towards the OpenAPI one as well. There is ongoing discussion about allowing media type registrations to happen in the standards tree without necessarily going through the process of writing an RFC for the format. https://www.ietf.org/archive/id/draft-ietf-mediaman-standards-tree-00.html That said, this format might be simple enough that an RFC would be straightforward. |
As of last month, that (expired) draft is replaced by https://www.ietf.org/archive/id/draft-ietf-mediaman-standards-tree-01.html |
What do you think about adding a new HTTP content type for jsonlines data?
What about application/jsonl?