Conversation

@diogomatoschaves
Contributor

@diogomatoschaves diogomatoschaves commented Feb 27, 2025

Comment on lines 45 to 60
def load_coordinates(file_path):
    """Loads the coordinates from a file."""
    try:
        return pd.read_excel(file_path)  # Try reading as Excel
    except ValueError:
        pass

    try:
        return pd.read_csv(file_path)  # Try reading as CSV with commas
    except ValueError:
        pass

    df = pd.read_csv(file_path, delimiter=";")  # Try reading CSV with semicolons
    if len(df.columns) == 1:
        return None
    return df
Collaborator

Since the file_path will contain the extension, isn't it better to just infer the file type from the extension instead of using trial and error?

Member

Indeed, this becomes a problem when read_excel fails for a reason other than the file not being an Excel file (e.g. because openpyxl is not installed). So it is better to choose between read_excel and read_csv based on the extension.

Contributor Author

The idea was that sometimes a file can have a missing extension while still being valid, but I see your points 🙌

I have changed the implementation to check based on the file extension, and also improved the error handling.
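
For readers following the thread, here is a minimal sketch of what extension-based dispatch could look like. It is not the code that was merged in this PR: the helper name `detect_format`, the extension sets, and the error message are assumptions made for illustration. It relies on `pd.read_csv(..., sep=None, engine="python")`, which asks pandas to sniff the delimiter, so comma- and semicolon-separated files are both covered without a second attempt.

```python
from pathlib import Path

# Hypothetical extension sets, chosen for this sketch only.
EXCEL_EXTENSIONS = {".xls", ".xlsx"}
CSV_EXTENSIONS = {".csv"}


def detect_format(file_path):
    """Return "excel" or "csv" based on the file extension (case-insensitive)."""
    suffix = Path(file_path).suffix.lower()
    if suffix in EXCEL_EXTENSIONS:
        return "excel"
    if suffix in CSV_EXTENSIONS:
        return "csv"
    # Fail loudly instead of guessing, so the caller sees the real cause.
    raise ValueError(
        f"Unsupported file extension {suffix!r} for {file_path!r}; "
        "expected one of .xls, .xlsx, .csv"
    )


def load_coordinates(file_path):
    """Load coordinates with the reader that matches the file extension."""
    # Imported here so detect_format can be used without pandas installed.
    import pandas as pd

    if detect_format(file_path) == "excel":
        # Errors such as a missing openpyxl now propagate unchanged.
        return pd.read_excel(file_path)
    # sep=None with the Python engine lets pandas detect the delimiter.
    return pd.read_csv(file_path, sep=None, engine="python")
```

With this shape, a missing Excel dependency or a corrupt file surfaces as its own exception rather than being mistaken for "not an Excel file". The trade-off raised above still applies: a valid file without an extension is rejected.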

@iuryt
Collaborator

iuryt commented Feb 27, 2025

Thanks for contributing!

Member

@erikvansebille erikvansebille left a comment

Thanks for this PR, very good work on the unit tests in particular! But I think the logic to decide between read_excel and read_csv should be made more robust.

@diogomatoschaves force-pushed the support-csv branch 2 times, most recently from 9f88573 to 720036e (February 27, 2025 11:27)
@diogomatoschaves
Contributor Author

Hi @erikvansebille, I made some changes as requested—thanks again for your feedback! Let me know if you have any further comments on this updated version.

@VeckoTheGecko
Collaborator

Looks good. Thanks for the contribution, @diogomatoschaves!

@VeckoTheGecko VeckoTheGecko merged commit 25879bd into Parcels-code:main Mar 28, 2025
10 checks passed

Development

Successfully merging this pull request may close these issues.

Support CSV as an input file format

4 participants