-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] make single problem loaders for equal length problems return numpy arrays #109
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two suggestions:
- Change default in load_UCR_UEA_dataset to numpy3d
- Remove support for numpy2d
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
its in the cards, I think I will change the name to load_tsc_problem, but it needs to also load unequal problems so was going to delay until we remove DataFrames all together
see my other comments |
Issues with 2d numpy and transformers should be moved to different PR
I had no idea that the single problem loaders were used so much in testing (other than correctness testing), this is surely not a good idea, will raise an issue |
well, that tweak to kill time on a saturday morning became a monster! |
I would give you a badge of honour! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some default types in _data_io.py
do not seem to match. Might be worth checking? Found some at least.
thanks for the review @patrickzib , think I have addressed all issues now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well done. thank you.
change the return_type from None to numpy3d and remove the very long comments describing possible data structures in unnecessary detail. Part of #42
edit: sorry this touches so many files, I had no idea that the single problem loaders were used so much in testing (other than correctness testing). I think unless testing actual output, tests should use randomly generated data from tests/_testing package. I've had a look through a fair few tests on this epic trawl, and there seems to be a lot of testing for testing sake in obscure code such as benchmarking. There is testing same thing on very similar datasets, using larger datasets unnecessarily etc. I have raised an issue #118