Cannot handle special symbols #220

Closed
durgeksh opened this issue Apr 2, 2024 · 9 comments · Fixed by #223
Labels
question Further information is requested

Comments

@durgeksh

durgeksh commented Apr 2, 2024

Hi team,
I am running into an issue when reading an Excel sheet through Polars. It fails with: calamine cell error: #VALUE!
The sheet in question cannot be read through the standard API.

How can such special symbols be handled through fastexcel?

Thank you.

I am using fastexcel==0.10.2.

lukapeschke added the question label Apr 3, 2024
@lukapeschke
Collaborator

Hi @durgeksh, could you please provide the entire stack trace, and maybe a file that reproduces the issue? Thanks!

@durgeksh
Author

durgeksh commented Apr 5, 2024

Traceback (most recent call last):
  File "/Users/neo/Desktop/workspace/pocs/polarsdemo.py", line 84, in <module>
    df = pl.read_excel("sample_data.xlsx", engine='calamine')
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/_utils/deprecation.py", line 134, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/_utils/deprecation.py", line 134, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/io/spreadsheet/functions.py", line 253, in read_excel
    return _read_spreadsheet(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/io/spreadsheet/functions.py", line 475, in _read_spreadsheet
    parsed_sheets = {
                    ^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/io/spreadsheet/functions.py", line 476, in <dictcomp>
    name: reader_fn(
          ^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/polars/io/spreadsheet/functions.py", line 821, in _read_spreadsheet_calamine
    ws = parser.load_sheet_by_name(sheet_name, **read_options)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/neo/Desktop/workspace/pocs/.venv/lib/python3.11/site-packages/fastexcel/__init__.py", line 184, in load_sheet_by_name
    self._reader.load_sheet(
_fastexcel.CalamineCellError: calamine cell error: #VALUE!
Context:
    0: could not determine dtype for column Amount


Process finished with exit code 1

Sample file:
sample_data.xlsx

lukapeschke added a commit that referenced this issue Apr 5, 2024
closes #220

Signed-off-by: Luka Peschke <luka.peschke@toucantoco.com>
PrettyWood pushed a commit that referenced this issue Apr 5, 2024
closes #220

Signed-off-by: Luka Peschke <luka.peschke@toucantoco.com>
@durgeksh
Author

durgeksh commented Apr 5, 2024

Thank you @lukapeschke for fixing this so fast.

@lukapeschke
Collaborator

@durgeksh you're welcome, thank you for the sample file!

@durgeksh
Author

durgeksh commented Apr 8, 2024

@lukapeschke Could we have an option to parse these special symbols as strings and keep them in the sheet, please? Right now the symbol is removed and replaced with null.

Thank you.

@lukapeschke
Collaborator

@durgeksh could you please create a separate issue for that? I'll mark it as a feature request.

@durgeksh
Author

durgeksh commented Apr 8, 2024

Yes, sure. Thank you.

@PrettyWood
Member

@durgeksh It's not that simple: if the symbol is treated as a string, it conflicts with the rest of the column, which would have another type.
So it requires either casting everything in the column to string or some kind of union type on the column.
Anyway, it's definitely a feature request!
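
To illustrate the trade-off described above, here is a minimal sketch in plain Python. It is not fastexcel's implementation: the helper names errors_to_null and column_as_strings, the EXCEL_ERRORS set, and the sample Amount values are all hypothetical and only show why a single error cell forces a choice between a numeric column with nulls and an all-string column.

```python
# Hypothetical illustration only; fastexcel's real dtype logic lives in Rust.

# Error literals that an Excel cell can hold instead of a value.
EXCEL_ERRORS = {"#VALUE!", "#DIV/0!", "#N/A", "#REF!", "#NAME?", "#NUM!", "#NULL!"}


def errors_to_null(column):
    """Option 1: keep the numeric dtype and replace error cells with None."""
    return [None if cell in EXCEL_ERRORS else cell for cell in column]


def column_as_strings(column):
    """Option 2: keep the error text, at the cost of casting every cell to str."""
    return [None if cell is None else str(cell) for cell in column]


# A made-up "Amount" column containing one error cell.
amount = [10.5, "#VALUE!", 3.0]

print(errors_to_null(amount))     # [10.5, None, 3.0]
print(column_as_strings(amount))  # ['10.5', '#VALUE!', '3.0']
```

With option 1 the column stays numeric but the symbol is lost, which matches the null behaviour reported in this thread. With option 2 the symbol survives, but every numeric value in the column becomes a string.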

@durgeksh
Author

@PrettyWood Yes, in that case the safe typecast would be string for any column that contains special symbols.
