UTF-8 BOM Magic Bytes are read as part of the column header #196

Closed
bashdx opened this issue Mar 28, 2021 · 0 comments

  • Operating System: Win 10, 19401
  • Node Version: 12.19.0
  • NPM Version: 6.14.8
  • csv-parser Version: 3.0.0

Expected Behavior

When reading a text file saved as UTF-8 with BOM, the magic bytes "EF BB BF" should not become part of the column name.

Actual Behavior

The magic bytes "EF BB BF" become part of the first header, so the property for that column cannot be looked up by its visible name. For example, if the column name is "Date(UTC)":

obj["Date(UTC)"] returns undefined.

Fetching the key via Object.getOwnPropertyNames and putting the string into a Node Buffer gives:

<Buffer ef bb bf 44 61 74 65 28 55 54 43 29>

vs. expected

<Buffer 44 61 74 65 28 55 54 43 29>
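The mismatch above can be sketched without csv-parser at all, using only Node built-ins. The snippet assumes the parser hands back the header exactly as decoded from the file bytes; the `row` object and its value are made up for illustration.

```javascript
// The UTF-8 BOM bytes EF BB BF decode to the single code point U+FEFF,
// which Buffer#toString('utf8') keeps in the resulting string.
const bomBytes = Buffer.from([0xef, 0xbb, 0xbf]);
const headerBytes = Buffer.concat([bomBytes, Buffer.from('Date(UTC)', 'utf8')]);
const key = headerBytes.toString('utf8');

// Hypothetical parsed row whose first key still carries the BOM.
const row = { [key]: '2021-03-28' };

console.log(Buffer.from(key));          // bytes of the actual key
console.log(Buffer.from('Date(UTC)'));  // bytes of the expected key
console.log(row['Date(UTC)']);          // lookup by the visible name
console.log(row[key]);                  // lookup by the key as parsed
```

The two keys print identically in most terminals because U+FEFF is invisible, which is why comparing the raw bytes is the quickest way to spot the problem.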

How Do We Reproduce?

Create a CSV file whose first column header sits at the very beginning of the visible file contents, and save it as UTF-8 with BOM, e.g.:

Column1;Column2;Column3
Value1-1;Value1-2;Value1-3
Value2-1;Value2-2;Value2-3