[DataPipe] Ensure all DataPipes Meet Testing Requirements #106
Comments
This is awesome. One nit: serializable should be the same as picklable IMO.
Related to this issue: * #106 [ghstack-poisoned]
…Pipes" Related to this issue: * #106 Differential Revision: [D34625707](https://our.internmc.facebook.com/intern/diff/D34625707) [ghstack-poisoned]
@NivekT
We can require each DataPipe to introduce a simple usage example graph for this purpose.
When we have time, we might need to go over our DataPipes again to identify any missing tests, since a few DataPipes were implemented recently. Besides, for future reference, we might need to improve our testing framework to something similar to
Agreed that the
🚀 Feature
We have many tests for existing DataPipes (both in PyTorch Core and TorchData). However, over time, they have become less organized. Moreover, as the testing requirements expand, older DataPipes may not have tests to cover the newly added requirements.
This issue aims to track the status of tests for all DataPipes.
Motivation
We want to ensure that test coverage for all DataPipes is complete, to reduce bugs and unexpected behavior.
Alternative
We should also create some testing templates for IterDataPipe and MapDataPipe that can be widely applied.
IterDataPipe
Tracker
X - Done
NA - Not Applicable
Blank - Not Done/Unclear
Test definitions:
Functional - unit test to ensure that the DataPipe works properly with various input arguments
Reset - DataPipe can be reset/restarted after being read
__len__ - the __len__ method is implemented whenever possible (or explicitly not implemented)
Serializable - DataPipe is serializable
Graph (future) - can be traversed as part of a DataPipe graph
Snapshot (future) - can be saved/loaded as a checkpoint/snapshot
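As a rough illustration of what a reusable template for the first four definitions could look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `RangePipe` is a hypothetical stand-in for an iterable-style DataPipe (it is not a real PyTorch or TorchData class), the `check_*` helper names are invented, and "serializable" is taken to mean "survives a pickle round trip", as suggested in the comments.

```python
import pickle


class RangePipe:
    """Hypothetical stand-in for an iterable-style DataPipe (not a real library class)."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Returning a fresh iterator on every call is what makes a reset possible.
        return iter(range(self.n))

    def __len__(self):
        return self.n


def check_functional(make_pipe, cases):
    """Functional: each (args, expected) pair must yield exactly the expected items."""
    for args, expected in cases:
        assert list(make_pipe(*args)) == expected


def check_reset(pipe, n_partial=2):
    """Reset: after a partial read, iterating again starts from the beginning."""
    full = list(pipe)
    it = iter(pipe)
    for _ in range(min(n_partial, len(full))):
        next(it)  # consume a few items, then abandon this iterator
    assert list(pipe) == full


def check_len(pipe):
    """__len__: either matches the number of items, or is explicitly unavailable."""
    try:
        n = len(pipe)
    except TypeError:
        return None  # length is explicitly not implemented
    assert n == len(list(pipe))
    return n


def check_serializable(pipe):
    """Serializable (assumed to mean picklable): a round trip yields the same items."""
    restored = pickle.loads(pickle.dumps(pipe))
    assert list(restored) == list(pipe)
    return restored
```

A test for a concrete DataPipe could then call these helpers with its own constructor arguments, for example `check_functional(RangePipe, [((0,), []), ((3,), [0, 1, 2])])` followed by `check_reset(RangePipe(5))`. Real DataPipes may need extra care that this sketch ignores, such as non-deterministic ordering or inputs that cannot be pickled.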
MapDataPipe
Tracker
X - Done
NA - Not Applicable
Blank - Not Done/Unclear
cc: @ejguan @VitalyFedyunin @NivekT