The "file" source is not able to open with pandas CDS files previously downloaded #180
Comments
Dear @malmans2, thank you for reporting this issue. The method to save the results from CDS to a given file is:
collection_id = "insitu-observations-gruan-reference-network"
request = {
"format": "csv-lev.zip",
"year": "2006",
"month": "05",
"variable": ["air_temperature", "altitude"],
"day": ["21", "22"],
}
data_cds = earthkit.data.from_source("cds", collection_id, **request)
data_cds.to_pandas() # OK
data_cds.save("my_cds_data.zip")
The |
Got it, thanks. What about the last method in my comment?
import cdsapi
import earthkit.data
client = cdsapi.Client()
data_cdsapi = earthkit.data.from_source(
"file", client.retrieve(collection_id, request).download()
)
data_cdsapi.to_pandas()  # ParserError
I.e., |
You should be able to read the previously downloaded CDS data as a "file" source. The problem is that pandas'
import cdsapi
import earthkit.data
client = cdsapi.Client()
data_cdsapi = earthkit.data.from_source(
"file", client.retrieve(collection_id, request).download()
)
df = data_cdsapi.to_pandas(pandas_read_csv_kwargs={"comment": "#"})
Now, it is a good question if |
Understood. Thanks for the clarification. |
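For anyone landing here later, the pandas behaviour behind the workaround can be reproduced without any CDS access. This is only a minimal sketch: the CSV text below is made up for illustration (it is not real GRUAN data), and it assumes the downloaded file starts with "#"-prefixed lines, which is what the `comment="#"` workaround suggests.

```python
import io

import pandas as pd

# Hypothetical CSV content: "#" lines followed by a two-column table.
# The comment lines and values are invented for this illustration.
csv_text = (
    "# hypothetical provenance line\n"
    "# hypothetical licence line\n"
    "# hypothetical citation line\n"
    "air_temperature,altitude\n"
    "250.1,10000\n"
    "249.8,10100\n"
)

# Default read: the first "#" line is taken as a one-column header,
# so the later two-field rows no longer fit and parsing fails.
try:
    pd.read_csv(io.StringIO(csv_text))
    default_read_failed = False
except pd.errors.ParserError:
    default_read_failed = True

# With comment="#", fully commented lines are skipped and the real
# header row is used for the column names.
df = pd.read_csv(io.StringIO(csv_text), comment="#")
print(default_read_failed)
print(df)
```

The same keyword is what `pandas_read_csv_kwargs={"comment": "#"}` forwards to pandas in the snippet above.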
I am reopening this, as I think there are some issues we can address here to improve consistency across sources. There may be some differences, but I think we can do better than the current implementation. Further details, the current default file source:
cds_source:
ecmwf_api source:
|
What happened?
I'm able to open CDS files downloaded using from_source("cds", ...), but I'm not able to open them if they've been previously downloaded (i.e., using from_source("file", ...)).
In this specific case, when using from_source("cds", ...), it looks like additional arguments are passed to the libraries used under the hood to read data (pandas, comment="#").
Is there a way to open a local file previously downloaded from the CDS exactly as from_source("cds", ...) would do?
What are the steps to reproduce the bug?
Version
0.3.1
Platform (OS and architecture)
Darwin MacBook-Pro-3.local 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul 5 22:21:56 PDT 2023; root:xnu-8796.141.3~6/RELEASE_X86_64 x86_64
Relevant log output
Accompanying data
No response
Organisation
B-Open / CADS-EQC