-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let Data
and HeteroData
implement FeatureStore
#4807
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks pretty great. Thank you! Let me know if you need further help for the DynamicInheritance
issue in Batch
.
On a different note: It looks like Data
and FeatureStore
both implement, e.g., __getitem__
. Currently, this means that we do not really take advantage of the View
stuff in FeatureStore
as a result. Might be good to add this as a TODO somewhere. WDYT?
Codecov Report
@@ Coverage Diff @@
## master #4807 +/- ##
==========================================
+ Coverage 82.62% 82.64% +0.01%
==========================================
Files 325 325
Lines 17373 17427 +54
==========================================
+ Hits 14355 14403 +48
- Misses 3018 3024 +6
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment, LGTM.
r"""Obtains a feature tensor from node storage.""" | ||
# Retrieve tensor and index accordingly: | ||
tensor = getattr(self[attr.group_name], attr.attr_name) | ||
if tensor is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it requires we set index
why do we even need to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe another way to solve this if its confusing is to have a different value for UNSET
which indicates its going to index 'all'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment! Not sure what you mean by needing to check index
; this is because TensorAttr
just requires that its attributes are set (they can be set to None). The current way to index all would be None
, which I think is acceptable; although I'm happy to define a custom value for the UNSET
enum to indicate all
indexing in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great. Thank you! One thing that would be good to confirm is whether pickling of data
objects between PyG 2.0.4 and PyG master still works, e.g., running an example in PyG 2.0.4 and re-run with pre-processed dataset in master. Otherwise, we have to list this as a breaking change.
@rusty1s I tried testing this explicitly by constructing a |
This PR lets
Data
andHeteroData
implement the feature store abstraction. In particular, it definesput_tensor
,get_tensor
, andremove_tensor
methods on both classes, and adds basic tests for these functionalities.