DynUnet get_feature_maps() issue when using distributed training #1564
Comments
I recommend making the change below to fix this issue:

It's slightly different from the original code, which always returns a list of … @wyli @rijobro @ericspod, what do you guys think? Thanks.
This will revert the behaviour of #1393. I suppose this is fine as long as we make the documentation clear, so that future users aren't confused.
Hi @rijobro, I think most of the previous problems are due to the … Thanks.
Hi @Nic-Ma @rijobro, actually at the beginning, in val or infer modes, it did only return the output tensor. However, returning different types is not permitted by TorchScript, so starting from this version, list-based results are returned.
So it seems that we need to be consistent with what we return, regardless of train mode, etc., is that correct? If so, what if we return:

```python
if self.deep_supervision:
    return {"data": self.output_block(out), "feature_map": self.heads[1 : self.deep_supr_num + 1]}
else:
    return self.output_block(out)
```

In this fashion the output is always consistent (since …). By returning a dictionary, hopefully it will be clearer to users what we are returning (since users were confused by the list previously returned).
Thanks, this looks good to me as well; we could have the default value …
Please have a try with the dict return first; I am afraid it can't fix the TorchScript issue...
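To illustrate the constraint being discussed, here is a minimal plain-Python sketch (not the actual MONAI code, and no real TorchScript compilation): the dict proposal still leaves the two branches with different return types, which is the same kind of inconsistency TorchScript rejects, whereas always returning a list gives a single static return type.

```python
def forward_inconsistent(x, deep_supervision):
    # Mimics the dict proposal: a dict in one mode, the raw output in the other.
    # TorchScript would reject this shape, since the two branches have
    # different static return types (Dict[str, Tensor] vs Tensor in the real network).
    if deep_supervision:
        return {"data": x, "feature_map": [x, x]}
    return x

def forward_consistent(x, deep_supervision):
    # Always return one list-shaped result, as the thread settles on.
    if deep_supervision:
        return [x, x, x]
    return [x]

# Count the distinct return types across modes for each variant:
types_inconsistent = {type(forward_inconsistent(1.0, f)) for f in (True, False)}
types_consistent = {type(forward_consistent(1.0, f)) for f in (True, False)}
print(len(types_inconsistent), len(types_consistent))  # 2 1
```

Only the second variant has a single return type across both modes, which is what TorchScript's static typing requires.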
If the default inferrer requires a tensor, then presumably you could wrap that function:

```python
def default_dict_inferrer(input, key, *args, **kwargs):
    return default_inferrer(input[key], *args, **kwargs)
```
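A hedged usage sketch of that wrapper idea (the `default_inferrer` stub below is illustrative, not MONAI's actual implementation): the wrapper pulls the tensor out of the dict under a given key and delegates to the plain inferrer.

```python
# Illustrative stub: a real default inferrer would run the network on the input.
def default_inferrer(inputs, network, *args, **kwargs):
    return network(inputs, *args, **kwargs)

def default_dict_inferrer(inputs, key, network, *args, **kwargs):
    # Unwrap the value stored under `key` before delegating to the plain inferrer.
    return default_inferrer(inputs[key], network, *args, **kwargs)

# With a dict-shaped model output and a toy identity "network":
result = default_dict_inferrer({"data": [1, 2, 3]}, "data", lambda x: x)
print(result)  # [1, 2, 3]
```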
Hi @rijobro, thanks for the advice, but returning this kind of dict doesn't fix the TorchScript issue, as @Nic-Ma mentioned. I suggest we still return a list here and add the corresponding docstrings. To help users use this network, I will also add/update tutorials. Let me submit a PR first for you to review.
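As a sketch of how a caller might consume such a list-based output for deep supervision (names and the weighting scheme are illustrative assumptions, not MONAI's API): the first element is treated as the main prediction and the rest as intermediate heads, each contributing a down-weighted loss term.

```python
def compute_deep_supervision_loss(outputs, target, loss_fn, weights=None):
    """Combine losses over the main output and the deep-supervision heads.

    `outputs` is assumed to be a list: outputs[0] is the final prediction,
    outputs[1:] are the intermediate feature-map predictions.
    """
    if weights is None:
        # Hypothetical scheme: halve the weight at each deeper head (1, 0.5, 0.25, ...).
        weights = [0.5 ** i for i in range(len(outputs))]
    total = 0.0
    for w, out in zip(weights, outputs):
        total += w * loss_fn(out, target)
    return total

# Toy usage with scalar "predictions" and an absolute-error loss:
loss = compute_deep_supervision_loss([1.0, 0.8], 1.0, lambda o, t: abs(o - t))
# loss == 1.0 * 0.0 + 0.5 * 0.2 == 0.1 (up to float rounding)
```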
Describe the bug

If we use DistributedDataParallel to wrap the network, calling the get_feature_maps function will raise the following error: …

It is common to return multiple variables from a network's forward function, not only in this case but also in other situations, such as calculating triplet loss in metric-learning-based tasks. Therefore, I think we should reconsider what to return in the forward function of DynUnet.
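A minimal plain-Python sketch of the failure mode (DistributedDataParallel is simulated here by a generic wrapper; all names are illustrative): the wrapper only delegates the forward call, so custom methods such as get_feature_maps are not found on the wrapped model and must be reached via the inner module, which motivates returning everything from forward instead.

```python
class Wrapper:
    """Stand-in for DistributedDataParallel: delegates only the forward call."""
    def __init__(self, module):
        self.module = module

    def __call__(self, *args, **kwargs):
        return self.module.forward(*args, **kwargs)

class TinyNet:
    """Toy network that caches feature maps, like DynUNet's get_feature_maps."""
    def forward(self, x):
        self.feature_maps = [x * 2]
        return x * 2

    def get_feature_maps(self):
        return self.feature_maps

net = Wrapper(TinyNet())
net(3)
# net.get_feature_maps()       -> AttributeError: the wrapper has no such method.
# net.module.get_feature_maps() works, but is awkward, which is why the thread
# proposes returning the feature maps from forward() instead.
print(hasattr(net, "get_feature_maps"), net.module.get_feature_maps())  # False [6]
```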
Let me think about it and submit a PR, then we can discuss it later.
@Nic-Ma @wyli @rijobro