-
Notifications
You must be signed in to change notification settings - Fork 10
Add support for csv files as input #152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for csv files as input #152
Conversation
9e17162 to
76c6a81
Compare
src/virtualship/utils.py
Outdated
| def load_coordinates(file_path): | ||
| """Loads the coordinates from a file.""" | ||
| try: | ||
| return pd.read_excel(file_path) # Try reading as Excel | ||
| except ValueError: | ||
| pass | ||
|
|
||
| try: | ||
| return pd.read_csv(file_path) # Try reading as CSV with commas | ||
| except ValueError: | ||
| pass | ||
|
|
||
| df = pd.read_csv(file_path, delimiter=";") # Try reading CSV with colons | ||
| if len(df.columns) == 1: | ||
| return None | ||
| return df |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the file_path will contain the extension, isn't is better to just infer the file type by the extension instead of try and error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this will now be a problem when the read_excel fails for another reason than the file not being an xls-file (e.g. because openpyxl is not installed). So better to choose between read_excel and read_csv based on extension indeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was that sometimes a file can have a missing extension while still being valid, but I see your points 🙌
I have changed the implementation to check based on the file extension, and also improved the error handling.
|
Thanks for contributing! |
erikvansebille
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR, very good work on the unit tests in particular! But I think the logic to decide between open_excel and open_csv should be made more robust
src/virtualship/utils.py
Outdated
| def load_coordinates(file_path): | ||
| """Loads the coordinates from a file.""" | ||
| try: | ||
| return pd.read_excel(file_path) # Try reading as Excel | ||
| except ValueError: | ||
| pass | ||
|
|
||
| try: | ||
| return pd.read_csv(file_path) # Try reading as CSV with commas | ||
| except ValueError: | ||
| pass | ||
|
|
||
| df = pd.read_csv(file_path, delimiter=";") # Try reading CSV with colons | ||
| if len(df.columns) == 1: | ||
| return None | ||
| return df |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this will now be a problem when the read_excel fails for another reason than the file not being an xls-file (e.g. because openpyxl is not installed). So better to choose between read_excel and read_csv based on extension indeed
9f88573 to
720036e
Compare
15044d9 to
5e65979
Compare
for more information, see https://pre-commit.ci
|
Hi @erikvansebille, I made some changes as requested—thanks again for your feedback! Let me know if you have any further comments on this updated version. |
|
Looks good. Thanks for the contribution @diogomatoschaves ! |
Uh oh!
There was an error while loading. Please reload this page.