BUG: how to convert .xls to .xlsx because Pandas failed to open .xls files #58470
Comments
Thanks for the report. You checked the box
This is the result: INSTALLED VERSIONS commit : 7c48ff4 pandas : 1.2.5
You are still using pandas 1.2.5, which I don't think is supported anymore. Could you try upgrading to the latest version of pandas and see if that fixes your problem?
How can I upgrade? Requirement already satisfied: pandas in /opt/conda/miniconda3/lib/python3.8/site-packages (2.0.3)
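The two version numbers in this thread disagree (1.2.5 from pd.show_versions(), 2.0.3 from pip). One possible explanation, which the thread does not confirm, is that the running session still has the old version loaded or uses a different environment. A minimal sketch for checking what the current interpreter sees, using only the standard library; the helper name describe_install is hypothetical:

```python
import sys
from importlib import metadata


def describe_install(package="pandas"):
    """Report the running interpreter and the version of `package` installed for it."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed for this interpreter
    return {"interpreter": sys.executable, "package": package, "version": version}


if __name__ == "__main__":
    print(describe_install("pandas"))
```

Note that this reads the installed package metadata. Inside an already running notebook, pd.__version__ shows the version that was imported, so the kernel has to be restarted after an upgrade before the new version is picked up.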
It seems like you are using conda. I am not familiar with that, but I think you can run
@Chirawat3987 - can you try doing
pandas will try to automatically detect the Excel file format, but this is failing for your XLS file (it does not match the XLS signatures that the file should start with). But you can specify which engine to use, and pandas will succeed if the engine can open it.
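As a rough illustration of specifying the engine explicitly, the sketch below tries a list of engines in order and returns the first result. The helper read_with_engines and its reader parameter are hypothetical and not part of pandas. The engine names "xlrd" and "calamine" are real read_excel options, but "calamine" needs pandas 2.2 or later plus the python-calamine package.

```python
def read_with_engines(path, engines=("xlrd", "calamine"), reader=None):
    """Try each engine in turn; return (engine, result) for the first that works."""
    if reader is None:
        # Imported lazily so the helper can also be exercised with a stub reader.
        import pandas as pd
        reader = pd.read_excel

    errors = {}
    for engine in engines:
        try:
            return engine, reader(path, engine=engine)
        except Exception as exc:  # each engine raises its own exception types
            errors[engine] = f"{type(exc).__name__}: {exc}"

    # Every engine failed: report all the individual errors together.
    raise ValueError(f"No engine could read {path!r}: {errors}")


# Possible usage (file name taken from the report; writing .xlsx needs openpyxl):
# engine, df = read_with_engines("ZARR2001.xls")
# df.to_excel("ZARR2001.xlsx", index=False)
```

If every engine fails, as turned out to be the case in this thread, the collected error messages at least show what each reader objected to.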
This is the error message, please help.
xlrd is not able to read the XLS file - there is nothing that pandas can do. You can also try the
Ah - but python-calamine was only added recently. You will need to upgrade pandas to try it out, as @Aloqeely has advised.
Then I ran the command: df = pd.read_excel('D:/test/ZARR2001.xls', engine='xlrd')
I upgraded pandas to the latest version, ran the command again, and got this error message:
Are you sure it's actually an xls file and not just a text file with a .xls extension? Try opening it in a text editor.
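The same check can be done programmatically by looking at the first bytes of the file. This is a simplified sketch, not pandas' own detection logic (pandas also accepts some older BIFF signatures that are omitted here). The function names sniff and sniff_file are hypothetical. The two byte signatures are the standard OLE2 compound-file and ZIP local-file headers.

```python
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container used by binary .xls
ZIP_SIGNATURE = b"PK\x03\x04"  # .xlsx and .ods files are ZIP archives


def sniff(data):
    """Classify the leading bytes of a file as 'ole2', 'zip', 'markup' or 'unknown'."""
    if data.startswith(OLE2_SIGNATURE):
        return "ole2"
    if data.startswith(ZIP_SIGNATURE):
        return "zip"
    # Exported "xls" files are sometimes HTML or XML text in disguise.
    head = data[:512].lstrip().lower()
    if head.startswith((b"<html", b"<!doctype", b"<?xml", b"<table")):
        return "markup"
    return "unknown"


def sniff_file(path):
    """Read the first 512 bytes of `path` and classify them with sniff()."""
    with open(path, "rb") as handle:
        return sniff(handle.read(512))


# Possible usage (file name taken from the report):
# print(sniff_file("ZARR2001.xls"))
```

A result of "markup" or "unknown" would suggest that the file is not a real Excel workbook, in which case a text-oriented reader such as pd.read_csv or pd.read_html may be a better starting point than read_excel.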
Thanks for the responses @Chirawat3987. pandas uses other packages to open Excel files (e.g. xlrd and calamine). Neither can open your file - so this is not a pandas issue. Closing.
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
The error that I get from pandas is:
ValueError Traceback (most recent call last)
Cell In[40], line 3
1 import pandas as pd
----> 3 df = pd.read_excel('ZARR2001.xls')
4 df.to_excel('ZARR2001.xlsx')
File /opt/conda/miniconda3/lib/python3.8/site-packages/pandas/util/_decorators.py:299, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
294 msg = (
295 f"Starting with Pandas version {version} all arguments of "
296 f"{func.__name__}{arguments} will be keyword-only"
297 )
298 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 299 return func(*args, **kwargs)
File /opt/conda/miniconda3/lib/python3.8/site-packages/pandas/io/excel/_base.py:336, in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, storage_options)
334 if not isinstance(io, ExcelFile):
335 should_close = True
--> 336 io = ExcelFile(io, storage_options=storage_options, engine=engine)
337 elif engine and engine != io.engine:
338 raise ValueError(
339 "Engine should not be specified when passing "
340 "an ExcelFile - ExcelFile already has the engine set"
341 )
File /opt/conda/miniconda3/lib/python3.8/site-packages/pandas/io/excel/_base.py:1080, in ExcelFile.__init__(self, path_or_buffer, engine, storage_options)
1078 ext = "xls"
1079 else:
-> 1080 ext = inspect_excel_format(
1081 content=path_or_buffer, storage_options=storage_options
1082 )
1084 if ext == "ods":
1085 engine = "odf"
File /opt/conda/miniconda3/lib/python3.8/site-packages/pandas/io/excel/_base.py:974, in inspect_excel_format(path, content, storage_options)
972 return "xls"
973 elif not peek.startswith(ZIP_SIGNATURE):
--> 974 raise ValueError("File is not a recognized excel file")
976 # ZipFile typing is overly-strict
977 # python/typeshed#4212
978 zf = zipfile.ZipFile(stream) # type: ignore[arg-type]
ValueError: File is not a recognized excel file
Expected Behavior
pls help
Installed Versions
Replace this line with the output of pd.show_versions()