Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cell values with underscore surrounded by numbers not able to be read (ex: 123_4567) #1219

Closed
HunterWard opened this issue May 1, 2022 · 3 comments
Labels
confirmed This issue can be reproduced

Comments

@HunterWard
Copy link

Description
I'm trying to read in data from an .xlsx and it seems like excelize has a problem reading in cell value data when the data in the cell is in the format of "[NUMBERS]_[NUMBERS]". I've not been able to find a fix.

Steps to reproduce the issue:
The data I'm working with is sensitive so I made a dummy .xlsx which I also attached:
sample2.xlsx (A column has no special formatting)
image
I made the A column show a few different examples of underscores in the cells.

Here is my code:
image

Describe the results you received:
image
You can see that it is detecting and reading in the A column just fine except for where there is an underscore surrounded by numbers in the second row (A2). The results are the same when using GetCellValue().

In the debugger it is just an empty string "". Here are second and third rows from debugger:
image

Describe the results you expected:
I expect a cell with "[NUMBERS_NUMBERS]" (ex 1888_257) to be read into the []string that GetRows()/GetCellValue() returns.

Output of go version:

go version go1.18.1 windows/amd64

Excelize version or commit ID:

v2.6.0

Environment details (OS, Microsoft Excel™ version, physical, etc.):
I'm on windows working specifically with .xlsx format files.

@HunterWard
Copy link
Author

Small update:

As a work around I was trying to use the RawCellValue: true option but for some reason that wasn't working either. In the debugger it looks like the options I set are not being passed to the parsing functions. Might be another bug?

This is how I'm passing options:
f, err := excelize.OpenFile("sample2.xlsx", excelize.Options{RawCellValue: true})

I replaced line 66 in cell.go from:
val, err := c.getValueFrom(f, f.sharedStringsReader(), parseOptions(opts...).RawCellValue)
to
val, err := c.getValueFrom(f, f.sharedStringsReader(), f.options.RawCellValue)

and line 132 in rows.go from:
rows.rawCellValue, rows.sst = parseOptions(opts...).RawCellValue, rows.f.sharedStringsReader()
to
rows.rawCellValue, rows.sst = rows.f.options.RawCellValue, rows.f.sharedStringsReader()

I don't know of any unintentional side effects of this change but it works for getting those underscore values to be read:
[ 2016.1.1 6/16/16 Test Subject This is a sample description some notes John Doe Gift of So and So 2016.1 orig]
to
[1888_257 2016.1.1 6/16/16 Test Subject This is a sample description some notes John Doe Gift of So and So 2016.1 orig]

I'm not familiar enough with the codebase yet to understand what may be going wrong with trying to parse the original values but I'll keep looking.

@HunterWard
Copy link
Author

HunterWard commented May 2, 2022

So it looks like textHandler() in numfmt.go is the last function where my cells value is present, then the function returns an empty string which ends up being the final value. I put in a check at the end:
image
And now returns the correct value and my final output is correct. Thoughts on this change?

Edit:
I added a test case to numfmt_test.go and everything passes

@xuri xuri added confirmed This issue can be reproduced in progress Working in progress labels May 2, 2022
@xuri xuri closed this as completed in eed431e May 2, 2022
@xuri
Copy link
Member

xuri commented May 2, 2022

Thanks for your issue, I have fixed it. Please upgrade to the master branch code, and this patch will be released in the next version.

@xuri xuri removed the in progress Working in progress label May 2, 2022
xuri added a commit to carbin-gun/excelize that referenced this issue Oct 9, 2022
…formance, and 1904 date system support

- Fix incorrect cell data types casting results when number formatting
- Support set cell value on 1904 date system enabled, ref qax-os#1212
- Improve performance for set sheet row and the merging cells, fix performance impact when resolving qax-os#1129
xuri added a commit to JDavidVR/excelize that referenced this issue Jul 11, 2023
…formance, and 1904 date system support

- Fix incorrect cell data types casting results when number formatting
- Support set cell value on 1904 date system enabled, ref qax-os#1212
- Improve performance for set sheet row and the merging cells, fix performance impact when resolving qax-os#1129
jenbonzhang pushed a commit to jenbonzhang/excelize that referenced this issue Oct 22, 2023
…formance, and 1904 date system support

- Fix incorrect cell data types casting results when number formatting
- Support set cell value on 1904 date system enabled, ref qax-os#1212
- Improve performance for set sheet row and the merging cells, fix performance impact when resolving qax-os#1129
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed This issue can be reproduced
Projects
None yet
Development

No branches or pull requests

2 participants