&quot; escaped double quotes in JSON/CSV #1769
Comments
The main reason seems to be that any input is HTML encoded before being written to MongoDB. I think it is pretty obvious that a simple string (unless explicitly marked as HTML input/output, and we don't have those in OFF) should not be HTML encoded in internal storage in the first place. However, changing this might not be trivial: we provide a full MongoDB dump for anyone to download, and changing how HTML entities are encoded could be seen as a breaking change. At the same time, we shouldn't require anyone who just wants to use the JSON/CSV output to decode HTML entities before displaying a string value.

Which approach do you prefer, @stephanegigandet, @CharlesNepote, @teolemon, @openfoodfacts/openfoodfacts-server? From a standards and portability perspective, I'd prefer not HTML encoding anything before writing it to the database(s) and letting the output format decide the correct method of escaping values, but that approach does have compatibility issues with current consumers of the data. Did I miss anything?
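To make the "let the output format decide" idea concrete, here is a minimal Python sketch (ProductOpener itself is not written in Python, so this only illustrates the principle). It assumes the value is stored raw, and the product name and helper names (`to_json`, `to_csv`, `to_html`) are hypothetical.

```python
import csv
import html
import io
import json

# Hypothetical product name, stored WITHOUT HTML encoding.
raw_name = 'Chips "Los" 20%'


def to_json(value: str) -> str:
    """JSON output: the serializer escapes the quote as \\"."""
    return json.dumps({"product_name": value})


def to_csv(value: str) -> str:
    """CSV output: the writer quotes the field and doubles the quote."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(["product_name", value])
    return buffer.getvalue()


def to_html(value: str) -> str:
    """HTML output: entities are applied only at display time."""
    return html.escape(value, quote=True)


json_out = to_json(raw_name)  # {"product_name": "Chips \"Los\" 20%"}
csv_out = to_csv(raw_name)    # product_name,"Chips ""Los"" 20%"
html_out = to_html(raw_name)  # Chips &quot;Los&quot; 20%
```

Each serializer applies its own escaping rules, so the stored string never needs to know which format it will end up in.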
Regardless of how we want to store the data, one option for keeping compatibility with existing API consumers would be to introduce a parameter or HTTP header to the API. For example, we might have …

One more thing to keep in mind: if we did decide not to encode the values before storing them, we should review the ProductOpener source to ensure that strings are HTML encoded before they get displayed.
Related to that, the ingredient parser does not understand the quote entities, so ingredient lists like "20% &quot;Los" will create an ingredient tag 20-quot...
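A rough illustration of how an entity can leak into a tag, and how decoding first avoids it. This is a sketch only: `naive_tag` is a hypothetical slug function, not the actual ProductOpener ingredient parser, and the input string is an assumed example.

```python
import html
import re


def naive_tag(text: str) -> str:
    """Hypothetical tag normalizer: lowercase, then collapse every run of
    non-alphanumeric characters into a single hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


stored = "20% &quot;Los"  # assumed value as stored, with the quote HTML encoded

# The entity name survives normalization and pollutes the tag.
broken_tag = naive_tag(stored)

# Decoding entities before normalizing yields the intended tag.
fixed_tag = naive_tag(html.unescape(stored))
```

With these inputs, `broken_tag` is `20-quot-los` while `fixed_tag` is `20-los`.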
About the idea of changing the API with
On the https://world.openfoodfacts.org/data page there is a section dealing with "Mailing list for data, API and exports": why not use it? This section should also be placed higher up in the document. It would also be useful to publish more information, such as a changelog for major API and database changes.
Summary:
If a field (e.g. product_name or brands) contains a value with ", then the JSON output and CSV output for that product contain &quot; instead of the correct escaped form of " for each format.

Steps to reproduce:
… " in the name.

Expected behavior:
In the JSON output, the " is escaped as \".
In the CSV file, the " is escaped as "".

Observed behavior:
In the CSV file and in the JSON output, the " is escaped as &quot;.

Someone who uses the JSON API or parses the CSV output should not need to parse fields/properties as HTML.