Error when I try to resume training with instance segmentation in google colab #403
Comments
👋 Hello @alanacc92, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:
If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix. If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response. We try to respond to all issues as promptly as possible. Thank you for your patience!
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help. For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed! Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
@alanacc92 hello! Thank you for providing detailed information about the error you're encountering while trying to resume training with instance segmentation using Google Colab. From the log you've posted, it seems like the input shape for one of the loss computations is invalid. Here are some steps that might resolve your issue:
If the issue persists, please provide the full stack trace or any additional details on the GitHub issue tracker. This will help in diagnosing the root cause more effectively. For further help with troubleshooting, you might want to refer to the documentation available at https://docs.ultralytics.com/hub, which covers common issues and best practices for using the Ultralytics HUB. Also, remember that you can always reach out to the broader community and the Ultralytics team on the GitHub repository for assistance. They can provide valuable insights and suggestions based on their experience.
I ran into this problem too. In my case it turned out I was using the wrong trainer: my task was detection, but I was using a segmentation trainer. Once I switched to the correct trainer, the error went away. The underlying cause is that invalid input is not checked. Your problem may have a different cause, but checking your config and YAML file is a good idea.
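One quick way to act on that suggestion is to look at the label files themselves. The sketch below is only a heuristic and is not part of Ultralytics; the helper name `infer_label_task` is made up for illustration. It relies on the YOLO label format, where a detection row has 5 values (class, x_center, y_center, width, height) and a segmentation row has a class id followed by at least three polygon points.

```python
def infer_label_task(label_lines):
    """Guess 'detect' or 'segment' from YOLO-format label lines.

    Hypothetical helper for illustration only (not an Ultralytics API).
    Returns 'detect', 'segment', 'mixed', 'invalid', or 'empty'.
    """
    kinds = set()
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) == 5:
            # class x_center y_center width height
            kinds.add("detect")
        elif len(parts) >= 7 and len(parts) % 2 == 1:
            # class followed by x/y pairs for at least 3 polygon points
            kinds.add("segment")
        else:
            kinds.add("invalid")
    if not kinds:
        return "empty"
    if len(kinds) > 1:
        return "mixed"
    return kinds.pop()


print(infer_label_task(["0 0.5 0.5 0.2 0.3"]))            # detect
print(infer_label_task(["1 0.1 0.1 0.9 0.1 0.5 0.9"]))    # segment
```

If the result does not match the task you are training for, that mismatch is worth ruling out before digging further. To check a real dataset, pass the lines of a few `.txt` files from the `labels` folder.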
Hello @wenruoxu, Thank you for sharing your experience and insights! It's great to hear that you were able to resolve the issue by ensuring you used the correct trainer for your task. Indeed, using the appropriate configuration and YAML files is crucial for successful training. For anyone encountering similar issues, here are a few additional steps to consider:
Here is a small code snippet to help you ensure that you are using the correct task and trainer:
from ultralytics import YOLO
# Load the model
model = YOLO('path/to/your/model.pt')
# Ensure the correct task is set
model.task = 'detect'  # or 'segment', 'classify', etc.
# Train the model
model.train(data='path/to/your/data.yaml', epochs=100)
If you continue to experience issues, please provide more details on the GitHub issue tracker, and the community or Ultralytics team will be happy to assist you further. Thank you for your contribution to the discussion, and happy training! 🚀
Search before asking
HUB Component
Training
Bug
Ultralytics HUB: New authentication successful ✅
⚠️ Unable to automatically guess model task, assuming 'task=detect'. Explicitly define task for your model, i.e. 'task=detect', 'segment', 'classify', or 'pose'.
Ultralytics HUB: View model at https://hub.ultralytics.com/models/2TaEEEn6ncyUFLxCdWJ6 🚀
Downloading https://storage.googleapis.com/ultralytics-hub.appspot.com/users/eFUHfcU7kgPqOknxAqN6z2bDjBw2/models/2TaEEEn6ncyUFLxCdWJ6/epoch-31.pt to 'epoch-31.pt'...
100%|██████████| 90.5M/90.5M [00:04<00:00, 20.6MB/s]
WARNING
Ultralytics YOLOv8.0.183 🚀 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=segment, mode=train, model=epoch-31.pt, data=https://storage.googleapis.com/ultralytics-hub.appspot.com/users/eFUHfcU7kgPqOknxAqN6z2bDjBw2/datasets/MV6QbQSZVe2iAAtRpDFD/Cars Test2.v3i.yolov8+severe ML.zip, epochs=100, patience=15, batch=17, imgsz=640, save=True, save_period=-1, cache=ram, device=, workers=8, project=None, name=None, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, stream_buffer=False, line_width=None, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.0, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, tracker=botsort.yaml, save_dir=runs/segment/train
Downloading https://storage.googleapis.com/ultralytics-hub.appspot.com/users/eFUHfcU7kgPqOknxAqN6z2bDjBw2/datasets/MV6QbQSZVe2iAAtRpDFD/Cars Test2.v3i.yolov8+severe ML.zip to 'Cars Test2.v3i.yolov8+severe ML.zip'...
100%|██████████| 773M/773M [00:36<00:00, 22.1MB/s]
Unzipping Cars Test2.v3i.yolov8+severe ML.zip to /content/datasets/Cars Test2.v3i.yolov8+severe ML...: 100%|██████████| 47345/47345 [00:12<00:00, 3751.65file/s]
Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf'...
100%|██████████| 755k/755k [00:00<00:00, 18.1MB/s]
TensorBoard: Start with 'tensorboard --logdir runs/segment/train', view at http://localhost:6006/
0 -1 1 928 ultralytics.nn.modules.conv.Conv [3, 32, 3, 2]
1 -1 1 18560 ultralytics.nn.modules.conv.Conv [32, 64, 3, 2]
2 -1 1 29056 ultralytics.nn.modules.block.C2f [64, 64, 1, True]
3 -1 1 73984 ultralytics.nn.modules.conv.Conv [64, 128, 3, 2]
4 -1 2 197632 ultralytics.nn.modules.block.C2f [128, 128, 2, True]
5 -1 1 295424 ultralytics.nn.modules.conv.Conv [128, 256, 3, 2]
6 -1 2 788480 ultralytics.nn.modules.block.C2f [256, 256, 2, True]
7 -1 1 1180672 ultralytics.nn.modules.conv.Conv [256, 512, 3, 2]
8 -1 1 1838080 ultralytics.nn.modules.block.C2f [512, 512, 1, True]
9 -1 1 656896 ultralytics.nn.modules.block.SPPF [512, 512, 5]
10 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
11 [-1, 6] 1 0 ultralytics.nn.modules.conv.Concat [1]
12 -1 1 591360 ultralytics.nn.modules.block.C2f [768, 256, 1]
13 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
14 [-1, 4] 1 0 ultralytics.nn.modules.conv.Concat [1]
15 -1 1 148224 ultralytics.nn.modules.block.C2f [384, 128, 1]
16 -1 1 147712 ultralytics.nn.modules.conv.Conv [128, 128, 3, 2]
17 [-1, 12] 1 0 ultralytics.nn.modules.conv.Concat [1]
18 -1 1 493056 ultralytics.nn.modules.block.C2f [384, 256, 1]
19 -1 1 590336 ultralytics.nn.modules.conv.Conv [256, 256, 3, 2]
20 [-1, 9] 1 0 ultralytics.nn.modules.conv.Concat [1]
21 -1 1 1969152 ultralytics.nn.modules.block.C2f [768, 512, 1]
22 [15, 18, 21] 1 2771705 ultralytics.nn.modules.head.Segment [3, 32, 128, [128, 256, 512]]
YOLOv8s-seg summary: 261 layers, 11791257 parameters, 11791241 gradients
Transferred 417/417 items from pretrained weights⚠️
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
Downloading https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt to 'yolov8n.pt'...
100%|██████████| 6.23M/6.23M [00:00<00:00, 76.3MB/s]
AMP: checks passed ✅
train: Scanning /content/datasets/Cars Test2.v3i.yolov8+severe ML/train/labels... 21118 images, 214 backgrounds, 0 corrupt: 100%|██████████| 21118/21118 [00:14<00:00, 1443.36it/s]
train: New cache created: /content/datasets/Cars Test2.v3i.yolov8+severe ML/train/labels.cache
train: 27.5GB RAM required to cache images with 50% safety margin but only 8.2/12.7GB available, not caching images
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
val: Scanning /content/datasets/Cars Test2.v3i.yolov8+severe ML/valid/labels... 2502 images, 21 backgrounds, 0 corrupt: 100%|██████████| 2502/2502 [00:03<00:00, 819.01it/s]
val: New cache created: /content/datasets/Cars Test2.v3i.yolov8+severe ML/valid/labels.cache
val: Caching images (2.2GB ram): 100%|██████████| 2502/2502 [00:11<00:00, 227.16it/s]
Plotting labels to runs/segment/train/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: SGD(lr=0.01, momentum=0.9) with parameter groups 66 weight(decay=0.0), 77 weight(decay=0.00053125), 76 bias(decay=0.0)
Resuming training from epoch-31.pt from epoch 33 to 100 total epochs
Ultralytics HUB: View model at https://hub.ultralytics.com/models/2TaEEEn6ncyUFLxCdWJ6 🚀
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to runs/segment/train
Starting training for 100 epochs...
0%| | 0/1243 [00:00<?, ?it/s]
RuntimeError Traceback (most recent call last)
in <cell line: 5>()
3 model = YOLO('https://hub.ultralytics.com/models/2TaEEEn6ncyUFLxCdWJ6')
4
----> 5 model.train()
7 frames
/usr/local/lib/python3.10/dist-packages/ultralytics/utils/loss.py in <listcomp>(.0)
159 loss = torch.zeros(3, device=self.device) # box, cls, dfl
160 feats = preds[1] if isinstance(preds, tuple) else preds
--> 161 pred_distri, pred_scores = torch.cat([xi.view(feats[0].shape[0], self.no, -1) for xi in feats], 2).split(
162 (self.reg_max * 4, self.nc), 1)
163
RuntimeError: shape '[32, 67, -1]' is invalid for input of size 268800
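The numbers in this error message can be checked by hand. The sketch below is one reading of them, not a confirmed diagnosis. It assumes `reg_max = 16` and strides of 8, 16 and 32 (the usual YOLOv8 values, which are not stated in this log), and it takes `nc = 3` and 32 mask coefficients from the `Segment` line of the model summary.

```python
from math import prod

# Values copied from the error message.
requested_shape = (32, 67)   # the fixed part of '[32, 67, -1]'
input_size = 268800          # elements in the tensor being reshaped

# Values read from the model summary line: Segment [3, 32, 128, ...]
nc = 3                       # number of classes
nm = 32                      # number of mask coefficients

# Assumptions (usual YOLOv8 defaults, not stated in the log).
reg_max = 16                 # DFL bins per box side
strides = (8, 16, 32)        # detection strides
imgsz = 640                  # from the trainer arguments in the log


def fits(numel, fixed_dims):
    """True if `numel` elements can be viewed as (*fixed_dims, -1)."""
    return numel % prod(fixed_dims) == 0


no = nc + 4 * reg_max                               # outputs per anchor: 3 + 64
anchors = sum((imgsz // s) ** 2 for s in strides)   # 80*80 + 40*40 + 20*20

print(no)                                  # 67
print(anchors)                             # 8400
print(fits(input_size, requested_shape))   # False: 268800 is not a multiple of 32 * 67
print(input_size == nm * anchors)          # True: 32 * 8400 == 268800
```

If these assumptions hold, the tensor being reshaped has exactly the size of one image's mask coefficients (32 x 8400). That would be consistent with the detection loss receiving the segmentation head's output, which is the kind of task and trainer mismatch described in the comments.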
Environment
Ultralytics HUB Version
v0.1.24
Client User Agent
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/117.0
Operating System
Win32
Browser Window Size
1920 x 927
Server Timestamp
1695274659
Minimal Reproducible Example
Setup in google colab
%pip install ultralytics # install
from ultralytics import YOLO, checks, hub
checks() # checks
Login:
hub.login('xxxxxx')
model = YOLO('https://hub.ultralytics.com/models/2TaEEEn6ncyUFLxCdWJ6')
model.train()
Additional
No response