And it is not just the character encoding that might be different. I've seen many variations of CSV around (e.g. how they treat commas, newlines and escape characters in values). There is no real standard... and even if there were, not everyone would implement it properly.
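To make the variation concrete, here is a minimal sketch using Python's standard `csv` module. The sample rows and the `parse` helper are made up for illustration. The same two-column record is written once with double-quoted values and once with backslash-escaped commas, and each form only parses correctly when the reader is told which convention is in use.

```python
import csv
import io

# The same logical data in two CSV conventions (sample data, invented here).
quoted = 'name,address\r\n"Smith, J.","1 High St, Leeds"\r\n'
escaped = 'name,address\r\nSmith\\, J.,1 High St\\, Leeds\r\n'


def parse(text, **fmt):
    """Parse CSV text with the given csv formatting parameters."""
    return list(csv.reader(io.StringIO(text, newline=''), **fmt))


# Values containing commas are wrapped in double quotes (csv's default).
rows_quoted = parse(quoted)

# Commas inside values are escaped with a backslash, no quoting at all.
rows_escaped = parse(escaped, quoting=csv.QUOTE_NONE, escapechar='\\')

# Reading the backslash variant with default settings silently splits
# each value at the embedded comma instead of raising an error.
rows_wrong = parse(escaped)

print(rows_quoted[1])
print(rows_escaped[1])
print(rows_wrong[1])
```

The failure mode in the last call is the awkward part: nothing raises, the row just comes back with the wrong number of fields, so a generic snippet cannot easily tell that it picked the wrong convention.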
It feels like we need a general framework where different snippets can be assigned to different data sets, based on an expandable set of rules and metadata.
Currently, we (want to) have two snippets: download CSV and download anything. But the further we go, the more variants we'll have to deal with (e.g. download UTF-8 CSV, download CSV that puts values with commas in double quotes, download CSV that uses backslashes to escape commas).
At one extreme, the rules need only find one snippet for a type of file. At the other extreme, there might need to be a custom snippet that is only used for one particular dataset. In between, a single snippet is used with all CSV from a particular publisher, but a different snippet is used for other publishers. That is, the metadata the rules need might already be available, or in the worst case there needs to be a "use this particular snippet" metadata property.
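A rough sketch of how such rules might look, purely as an illustration: every name here (`Rule`, `RULES`, `choose_snippets`, the snippet identifiers, and the `format` / `publisher` / `encoding` / `snippet` metadata keys) is hypothetical and not taken from the existing code. Rules are listed from most to least specific, a per-dataset `snippet` property overrides everything, and "download anything" is always the last candidate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    snippet: str                          # hypothetical snippet identifier
    matches: Callable[[Dict[str, str]], bool]


# Ordered from most specific to least specific (all values are examples).
RULES: List[Rule] = [
    Rule("csv_backslash_escaped",
         lambda m: m.get("format") == "csv"
         and m.get("publisher") == "example-publisher"),
    Rule("csv_utf8",
         lambda m: m.get("format") == "csv"
         and m.get("encoding") == "utf-8"),
    Rule("csv_generic", lambda m: m.get("format") == "csv"),
]

FALLBACK = "download_anything"


def choose_snippets(meta: Dict[str, str]) -> List[str]:
    """Return candidate snippets for a dataset, best match first."""
    candidates: List[str] = []
    # Worst case: the dataset names its snippet explicitly.
    override = meta.get("snippet")
    if override:
        candidates.append(override)
    for rule in RULES:
        if rule.matches(meta) and rule.snippet not in candidates:
            candidates.append(rule.snippet)
    # The generic snippet is always available as the last resort.
    if FALLBACK not in candidates:
        candidates.append(FALLBACK)
    return candidates


print(choose_snippets({"format": "csv", "publisher": "example-publisher"}))
print(choose_snippets({"format": "pdf"}))
```

Returning an ordered list rather than a single snippet keeps the first entry usable as a default while leaving the less specific matches available as alternatives.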
Maintaining this will be a lot of work, so maybe we should let users contribute. Or at least let them tell us when a snippet no longer works for a particular dataset and/or vote it down. Maybe they can be given a pop-up menu of possible snippets, with a default already chosen and other options that might work -- with the "download anything" snippet as the option of last resort. Sounds like a code sharing project/feature in its own right!
Knowing the encoding will enable us to write more dynamic snippets where we can visualise the data better.
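As a small illustration of what a snippet could do with that information, the sketch below decodes downloaded bytes using a declared encoding when one is available and otherwise tries a couple of common guesses. The function name `decode_csv_bytes` and the choice of fallback encodings are assumptions made for this example, not anything the project has settled on.

```python
from typing import Optional, Tuple


def decode_csv_bytes(raw: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode raw file bytes, returning (text, encoding actually used)."""
    # Trust the dataset metadata first, then fall back to common guesses.
    candidates = [declared] if declared else []
    candidates += ["utf-8-sig", "cp1252"]
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or an encoding name Python does not know.
            continue
    # latin-1 maps every byte to a character, so this never fails,
    # although the result may be wrong for other encodings.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_csv_bytes("café,1\r\n".encode("cp1252"))
print(used, repr(text))
```

Without a declared encoding the fallback can only guess, and a wrong guess shows up as garbled characters rather than an error, which is why having the encoding in the metadata matters.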