-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CVS delimiter detection is wrong on importing csv language files #2853
Comments
How is it not working correctly? What issue are you trying to solve? I don't see how switching from This line is trying to find the delimiter at the end of the first column. It says "match any character one or more times, a slash, followed by either 0 or 1 speechmarks, and then any other next character". It's the any other next character that would be the discovered delimiter. The only place I could see this not work would be if there were escaped speechmarks in the first column - which should never happen given that the first column is always a JSON property path and a _id. Is it that the match doesn't work when the first column isn't wrapped with speechmarks? Update: It was that there were slashes in the second column and the regex as defined is greedy |
The regex is matching any character.
So it can match on any place after the first char In my case it detected "p" as delimiter. The regex is still not save to use but better. Ideally it will be removed or a library is used therefore. |
Can you post the first line of your csv? I want to find something more fitting. |
I can't post they complete line because it's a private project. But it's like that:
|
Yea that makes sense. |
Yea, your suggestion of limiting the delimiters seems like the only sensible option. This article suggests pipe and space should be included and we can probably drop the rest of the regex just matching the delimited list directly. These main use-cases would all work fine (no marks, marks, marks+html):
|
The best is to use this https://www.npmjs.com/package/detect-csv or this https://www.npmjs.com/package/csv-string#detectinput--string--string With the regex their may still be a problem. Detection should also only be done if not defined via command line. |
I completely disagree about using a library to do this, both of these libraries detect delimiters in ways that are almost identical to a regex detect-csv, csv-string and they are both designed for generic data. In contrast, we know what the data looks like on the first line of all of these files and we don't need the overhead of another library. To have detection only be done if not defined at the command line, change this line to default |
Subject of the issue/enhancement/features
The csv delimiter detection introduced in #2737 is not working correctly.
The introduced regex can't work correctly:
It probably should look like:
Reference:
adapt_framework/grunt/helpers/Translate.js
Line 297 in 3f25a4d
The text was updated successfully, but these errors were encountered: