CocoDetection dataset incompatible with Faster R-CNN model #2720
Comments
Hi, thanks for opening this issue. Your assessment is correct: the current implementation of CocoDetection is not compatible with the format that the training code expects. See vision/references/detection/coco_utils.py, lines 209 to 220 at 898802f, where we add an extra image_id to the targets. Apart from this, see vision/references/detection/coco_utils.py line 231 at 898802f and lines 50 to 103 at 898802f.
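In rough outline, the wrapper there does something like the following (a simplified paraphrase of the referenced code, not a verbatim copy of what is at 898802f):

import torchvision

class CocoDetection(torchvision.datasets.CocoDetection):
    # Simplified paraphrase of the wrapper in references/detection/coco_utils.py
    # (not the verbatim code at 898802f).
    def __init__(self, img_folder, ann_file, transforms):
        super().__init__(img_folder, ann_file)
        self._transforms = transforms

    def __getitem__(self, idx):
        img, target = super().__getitem__(idx)
        image_id = self.ids[idx]
        # The extra image_id is injected here; a later transform
        # (ConvertCocoPolysToMask, the lines 50 to 103 referenced above) turns
        # the raw annotation list into boxes / labels / masks / area / iscrowd tensors.
        target = dict(image_id=image_id, annotations=target)
        if self._transforms is not None:
            img, target = self._transforms(img, target)
        return img, target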
There are a few reasons why this is the case:
Overall, my thinking is that the return type of a dataset is tightly coupled with the training / evaluation script for a particular model. Unfortunately there is now a mismatch between the two, and I'm not sure how to fix it without a fairly drastic BC-breaking change (short of introducing another dataset class for the new COCO-style dataset). I'd be glad to hear your thoughts on this.
Thanks for your quick response! Your explanation makes sense, and the additional code that you referenced is very helpful. I think you'd agree that the intuitive/natural behavior would be for the output of CocoDetection to match what the object detection training code currently expects. I agree that making the change would break backward compatibility. Might the current behavior potentially be deprecated and later changed so that the two line up? Transforming the target is a valid workaround, so it's not a big problem, but it would be unexpected for any new users.
We could potentially change the output format (and have a deprecation cycle over a few releases to let users fix their code), but I'm always wary of breaking changes, as there are many tutorials and a lot of code out there that use the current behavior. We have been very careful with breaking changes like this one in the past, so this is something that deserves further discussion. cc @dongreenberg, let's discuss this at our next meeting.
Thanks for your consideration. At the very least, if someone is confused like I was, they can find this issue!
Hi, I have this problem too. Did you manage to transform the target? How did you do that?
Hi, @rikkudo, this was a while ago, so I can't guarantee this code, but I ended up doing something like this.
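Something along these lines (a reconstructed sketch rather than the exact code; the class name CocoDetectionForReferences is just a placeholder):

import torch
from torchvision.datasets import CocoDetection

class CocoDetectionForReferences(CocoDetection):
    # Returns targets in the dict-of-tensors format expected by
    # references/detection (boxes, labels, image_id, area, iscrowd).
    def __getitem__(self, idx):
        img, anns = super().__getitem__(idx)
        image_id = self.ids[idx]
        boxes, labels, areas, iscrowd = [], [], [], []
        for ann in anns:
            # COCO boxes are [x, y, width, height]; the model expects [x1, y1, x2, y2].
            x, y, w, h = ann["bbox"]
            boxes.append([x, y, x + w, y + h])
            labels.append(ann["category_id"])
            areas.append(ann["area"])
            iscrowd.append(ann.get("iscrowd", 0))
        target = {
            "boxes": torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4),
            "labels": torch.as_tensor(labels, dtype=torch.int64),
            "image_id": torch.tensor([image_id]),
            "area": torch.as_tensor(areas, dtype=torch.float32),
            "iscrowd": torch.as_tensor(iscrowd, dtype=torch.int64),
        }
        return img, target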
Thank you for the help, really appreciate it!
🐛 Bug
Before I report my issue, I'd like to say that the TorchVision Object Detection Finetuning tutorial is excellent! I've found the code to be easy to work with, but the tutorial made it even more accessible -- it got me training on my custom dataset in just a few hours (including making my own Docker image with GPU support).
The CocoDetection dataset appears to be incompatible with the Faster R-CNN model. The TorchVision Object Detection Finetuning tutorial specifies the format datasets must follow to be compatible with the Mask R-CNN model: a dataset's __getitem__ method should output an image and a target dict with fields boxes, labels, area, etc. The CocoDetection dataset instead returns the raw COCO annotations as the target, which does not match that specification.
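To make the mismatch concrete, here is a sketch of the two target shapes (values are illustrative only):

import torch

# Target returned by torchvision.datasets.CocoDetection:
# a list of raw COCO annotation dicts, one per object in the image.
coco_style_target = [
    {"bbox": [10.0, 20.0, 30.0, 40.0],  # [x, y, width, height]
     "category_id": 3, "area": 1200.0, "iscrowd": 0, "image_id": 42, "id": 7},
]

# Target expected by the tutorial / detection references:
# a single dict of tensors per image.
expected_target = {
    "boxes": torch.tensor([[10.0, 20.0, 40.0, 60.0]]),  # [x1, y1, x2, y2]
    "labels": torch.tensor([3]),
    "image_id": torch.tensor([42]),
    "area": torch.tensor([1200.0]),
    "iscrowd": torch.tensor([0]),
}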
The evaluation code included with the tutorial (engine.evaluate) also appears to be incompatible with the built-in CocoDetection dataset format.
To Reproduce
Steps to reproduce the behavior:
Follow the steps in the TorchVision Object Detection Finetuning tutorial, substituting a dataset with COCO annotations and using torchvision.datasets.CocoDetection as the dataset class instead of the custom dataset class defined in the tutorial. I hit an error within the train_one_epoch function in engine.py. The error message is below.
File "references/detection/engine.py", line 28, in
targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
AttributeError: 'list' object has no attribute 'items'
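For reference, a minimal script along these lines reproduces the failure (paths, device, and hyperparameters are placeholders; engine and utils come from references/detection):

import torch
import torchvision
from torchvision.datasets import CocoDetection
from torchvision.models.detection import fasterrcnn_resnet50_fpn
import utils                         # references/detection/utils.py
from engine import train_one_epoch   # references/detection/engine.py

# Placeholder paths to a COCO-style dataset.
dataset = CocoDetection("path/to/images", "path/to/annotations.json",
                        transform=torchvision.transforms.ToTensor())
loader = torch.utils.data.DataLoader(dataset, batch_size=2, shuffle=True,
                                     collate_fn=utils.collate_fn)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = fasterrcnn_resnet50_fpn(pretrained=True).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# Fails at the quoted line: each target is a list of COCO annotation dicts,
# not a dict of tensors, so t.items() raises AttributeError.
train_one_epoch(model, optimizer, loader, device, epoch=0, print_freq=10)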
Expected behavior
I expected training and evaluation to run successfully when using torchvision.datasets.CocoDetection. I was able to run training and evaluation by making my own custom COCO dataset class and manipulating the target output to match the specified format.
Environment
Collecting environment information...
PyTorch version: 1.6.0+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro T2000
Nvidia driver version: 430.64
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.6.0+cu101
[pip3] torchvision==0.7.0+cu101
[conda] Could not collect
Additional context
None
cc @pmeier