Guidance regarding transfer learning with COCO to add additional classes #980
This is very easy: you simply append your new dataset, labelled with class 80 onward, to `coco.yaml` (see lines 12 to 13 at commit 5a9c5c1).
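For illustration, here is a hedged sketch of what the appended section of `coco.yaml` could look like (the custom class names and count are placeholders, not from this thread):

```yaml
# coco.yaml (sketch) -- COCO's 80 names followed by your new classes
nc: 82  # 80 COCO classes + 2 custom classes
names: ['person', 'bicycle', 'car', ..., 'toothbrush',  # the 80 COCO names, elided here
        'my_new_class_a', 'my_new_class_b']             # new classes at indices 80 and 81
```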
Thank you for the answer. It will then need to train on the whole dataset again, with the 80 already-trained COCO classes plus 1 new class.
Exactly, and retraining the dataset is not desired as it can take weeks. Here is what I have tried: I created a dataset based on the new class that also contained COCO objects. Then I appended the new class to the COCO classes and made sure all labels were fine. Then I froze all layers except the Detect head to fine-tune it to the new class, using the following: https://gist.github.com/shayanalibhatti/7955bc7a09cbfcd67843f7159d1d8924 But I am not seeing any good results on the new class or the COCO classes.
The time savings are already built in by starting from pretrained weights. If you freeze layers you'll naturally see some additional speedup, though at the cost of reduced mAP.
Can you please share any link or source that can help us understand it better?
@daynial132 for best results follow the guidance provided above. For a pretrained-speedup comparison, simply see the training results shown in the colab notebook. For freezing layers (not recommended), see ultralytics/yolov3#106
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hey, did you find a solution to this? To not retrain the entire model, but rather just add a few labels to the existing model?
Can you share with me how you do it? I'm an ML newbie, and I really haven't found a way to fine-tune the model so that it recognizes the COCO classes and a newly added class. @shayanalibhatti
I want to fine-tune the model so that it recognizes the COCO classes and 2 newly added classes. I edited the config file to classes=82, but I only train with the 2 newly added classes. Can I do this? @glenn-jocher
@BahadirgK the response you copied already explains how to do this.
Is it possible to train YOLOv5 on COCO with its 80 classes, and then export the weights to be used on a dataset with fewer classes?
@mkieffer1107 yes, this is the default workflow.
Thanks for the reply! I'm new to ML in general. Do I just export the weights and then choose them using flags when running train.py on a new dataset?
@mkieffer1107 see Train Custom Data tutorial for guidance: YOLOv5 Tutorials
Good luck 🍀 and let us know if you have any other questions!
Hey, did you find out how to perform transfer learning for combining COCO weights and custom weights?
Hey, were you able to find any solution for this? I wanted to add 5 classes for currency notes along with the 80 COCO classes.
Hey, thanks for the quick reply. However, if I train on Colab, won't it take too long to be trained again on the COCO dataset as well as my own custom dataset?
I am still trying to figure this out: do we have to train on a new dataset along with training on the COCO dataset? Or can we use the yolov5.pt for COCO?
@kashishnaqvi101 I guess it's a bit late, but I am currently working through the same problem and figured it out going through several discussion threads. Hope this helps other researchers as well.

Yes, you will need to train the model with both the COCO dataset and your custom dataset if your custom data has new classes not in COCO. @glenn-jocher explained how to do that above: #980 (comment). Remember your new classes should be labeled starting from 80, as COCO already has labels mapped from 0-79. Freezing layers can potentially be done; however, it reduces mAP, although it takes less training time.

Pre-trained weights from yolov5s.pt will only come in handy when your custom dataset contains labels from COCO and no new class labels. To do so, you just need to create a customdata.yaml file with a link to your custom dataset, and remember to list out all 80 classes from COCO. To get the best results for custom data with new classes, train with COCO + custom dataset with the updated label mapping.
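To make the relabeling concrete, here is a small, hedged sketch (the directory path and helper name are hypothetical, not part of this thread) that shifts the class index in YOLO-format label files by 80, so a custom class 0 becomes class 80:

```python
from pathlib import Path

OFFSET = 80  # COCO occupies class indices 0-79

def shift_label_file(path: Path, offset: int = OFFSET) -> None:
    """Shift the class index (first column) of every YOLO-format label line."""
    lines = []
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        cls, *coords = line.split()
        lines.append(" ".join([str(int(cls) + offset)] + coords))
    path.write_text("\n".join(lines) + "\n")

# Example: shift every label file in a (hypothetical) custom labels folder
for label_file in Path("custom/labels").glob("*.txt"):
    shift_label_file(label_file)
```

Run this once on a copy of your custom labels before merging them with the COCO labels; running it twice would shift the indices again.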
@Utsabab excellent summary, thanks for sharing your findings! Just to add a bit: when combining COCO with a custom dataset that includes new classes, your custom class labels should indeed start from 80 onwards. Using pre-trained weights like `yolov5s.pt` will still speed up convergence even with the expanded class list.

For the training, make sure you update your dataset `.yaml` file accordingly:

```yaml
# example dataset.yaml
train: /path/to/your/train/images
val: /path/to/your/val/images

# number of classes (80 COCO + num of your custom classes)
nc: 85

# class names (COCO classes followed by your custom classes)
names: [0 .. 79, 'custom1', 'custom2', 'custom3', 'custom4', 'custom5']  # replace "0 .. 79" with the 80 COCO class names
```

Be sure to adjust the path to your train/val images and the number of custom classes (`nc`) to match your setup.
@glenn-jocher Thanks for the explanation. Just one more clarifying question: I was assuming I will have to add the COCO dataset path to train and val in dataset.yaml. Will pre-trained weights from yolov5s.pt still be applied if the path to the COCO dataset is not included in the .yaml file? Thank you!
@Utsabab yes, you've got it right! 😊 Including the COCO dataset paths along with your custom dataset paths in the `dataset.yaml` is what keeps the model training on the COCO classes as well as your new ones.

To clarify: even if you don't include the COCO dataset path in your `.yaml` file, the pre-trained weights from `yolov5s.pt` are still loaded and applied as the starting point. The model will then simply be fine-tuned on your custom data only, which means it can gradually forget the COCO classes it is no longer trained on.

Remember, the use of pre-trained weights aims to leverage learned features to improve training efficiency and effectiveness on your task. Happy to help if you have more questions!
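As a side note on how the pre-trained weights get applied at all when `nc` changes: YOLOv5's loader keeps only checkpoint tensors whose names and shapes match the new model (a helper along the lines of `intersect_dicts` in the repo), so mismatched head layers are simply re-initialized. A generic, hedged sketch of that idea, using toy models rather than YOLOv5 itself:

```python
import torch
import torch.nn as nn

def intersect_state_dicts(ckpt_sd, model_sd):
    """Keep only checkpoint tensors whose key and shape match the target model."""
    return {k: v for k, v in ckpt_sd.items()
            if k in model_sd and v.shape == model_sd[k].shape}

# Toy example: same backbone, different-sized head (as when nc changes)
old = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 80))   # "pretrained", 80 outputs
new = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 85))   # new model, 85 outputs

matched = intersect_state_dicts(old.state_dict(), new.state_dict())
new.load_state_dict(matched, strict=False)  # backbone transferred, head re-initialized
print(len(matched))  # 2: only the backbone's '0.weight' and '0.bias' transfer
```

The `strict=False` flag is what allows loading a partial state dict without complaining about the missing head tensors.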
@glenn-jocher I have 14 classes in my custom dataset, of which 9 are new and 5 are in the COCO labels. Is there a way I can use pre-trained weights for the 5 COCO classes only, without including the other labels from COCO in the custom yaml file? How would the example dataset.yaml look? I am getting a low recall score and hope to increase it by only predicting objects from my custom dataset. I'd appreciate your help.
@Utsabab, great question! To leverage pre-trained weights for the 5 COCO classes while introducing your 9 new ones, simply structure your dataset.yaml as follows, keeping all 14 of your custom classes in the list, including those overlapping with COCO. Recall may indeed improve due to more focused training on relevant classes. Here's how your `dataset.yaml` could look:

```yaml
train: /path/to/your/train/images
val: /path/to/your/val/images

# 14 classes: 5 from COCO + 9 new ones
nc: 14

# Class names: ensure COCO overlapping classes are correctly named
names: ['COCO_class1', 'COCO_class2', 'COCO_class3', 'COCO_class4', 'COCO_class5',
        'new_class1', 'new_class2', 'new_class3', 'new_class4', 'new_class5',
        'new_class6', 'new_class7', 'new_class8', 'new_class9']
```

This setup instructs the model to learn from the pre-trained weights for any common COCO features while focusing exclusively on detecting your defined 14 classes. Just ensure that the COCO classes in your dataset share the exact names they have in COCO for maximal benefit from the transfer learning. Hope this helps, and happy training! 🚀
@glenn-jocher I retrained my model after eliminating under-represented object labels from the data, and the mAP and recall values improved. I am currently working on accessing the feature maps of each image during training. I found resources to identify and visualize feature maps during inference, but not during training. In addition, I am trying to figure out a way to use those feature maps as an additional input and dissociate relations between feature maps during training. Any insights would assist me. Thank you!
@Utsabab hey there! 🌟 Glad to hear about the improvement in your mAP and recall values after refining your dataset.

For accessing and visualizing feature maps during training in YOLOv5, you can modify the model's forward function to return feature maps of interest alongside the usual predictions. Here's a simplistic example of how you might approach it:

```python
class Model(nn.Module):
    def __init__(self, ...):
        super(Model, self).__init__()
        # your model architecture here

    def forward(self, x, feature_visualize=False):
        # pass x through the layers to produce 'predictions' as usual
        # let's say you want to visualize features from layer 'some_layer'
        if feature_visualize:
            features = self.some_layer(x)
            return predictions, features
        return predictions
```

And during training, you could grab those feature maps with something like this:

```python
predictions, features = model(images, feature_visualize=True)
```

Remember, this is quite simplified, and integrating it into YOLOv5 or any other complex model might require a more nuanced approach, especially when handling multiple feature layers and integrating this additional input meaningfully for your specific use case.

Using those feature maps as an additional input and dissociating relations between them during training could get pretty specific depending on what you're aiming to achieve. You might experiment with concatenating these features at different stages of your model or using them as part of attention mechanisms, ensuring you adjust the training logic to leverage this effectively.

Good luck with your exploration, and I hope this gives you a solid starting point!
@glenn-jocher Thank you for the directions. I have been able to access feature maps from the SPPF layer inside the _forward_once function of the BaseModel class in yolo.py. However, I want the tensors for only the last epoch. I can access epochs and image paths in train.py, but I can't figure out the connection between yolo.py and train.py, and therefore can't figure out how to access the SPPF layer tensors from only the last epoch. Any guidance would be helpful.
@Utsabab I'm thrilled to hear you've made progress! 😊 To access the SPPF layer's feature maps only in the last epoch, a straightforward approach is to pass the current epoch down into the model's forward call. Here's a quick sketch of how you could implement this:

```python
# Inside the training loop of train.py
for epoch in range(start_epoch, epochs):
    model.train()
    for i, (imgs, targets, paths, _) in enumerate(dataloader):
        ...
        predictions, _ = model(imgs.to(device), epoch=epoch, max_epoch=epochs - 1)
```

```python
# Inside your model definition
def forward(self, x, epoch=None, max_epoch=None, ...):
    # Your forward logic
    if epoch is not None and epoch == max_epoch:
        # Logic to process or store feature maps from the SPPF layer
        features = self.sppf_layer(x)
        # Do something with features
    ...
```

Please ensure you adjust the logic to fit your exact needs, especially how you plan to process or store the feature maps. This example is quite simplified to illustrate the concept. Best of luck with your further experiments! 🚀
@glenn-jocher, just wanted to drop a quick thanks for your help! Sorting out that epoch parameter hiccup really pushed me forward, and I've been making some solid strides since then. One issue I am dealing with now: back when I was training with batch size 1, I could easily snag each image path and its SPPF feature maps. With the batch size bumped up to 128, I'm hitting a snag. I looped through paths to access the image files in each batch, but I'm struggling to do the same for the SPPF feature maps. Any pointers would be awesome!
@Utsabab, thrilled to hear you're making progress! 🎉 For handling SPPF feature maps with larger batch sizes, remember that the feature maps will be batched just like your input images. Essentially, your feature maps tensor will have a shape like `[batch_size, channels, height, width]`.

To iterate over each image's SPPF feature maps in the batch, you can simply use a loop. Here's a quick example:

```python
# Assuming 'features' is your batched SPPF feature maps tensor
for i in range(features.size(0)):  # looping through the batch dimension
    feature_map = features[i]  # the SPPF feature map for the ith image in the batch
    # you can now process each feature_map as needed
```

This way, you can access and process the SPPF feature maps for each image individually, even in larger batches. Keep in mind that processing these one by one in Python might be slower, and leveraging batch operations where possible is generally preferable for efficiency. Keep up the great work! 👍
@glenn-jocher It worked like a charm. Thanks a bunch!
@Utsabab, you're welcome! 😊 Happy to hear it worked well for you. If you have any more questions or need further assistance as you continue with your project, feel free to reach out. Happy coding! 🚀
I would like to use the first 6 classes from COCO and train additional classes (or not) from a custom dataset. How can I achieve this?
Hi @nbandaru1h, Great question! To use the first 6 classes from COCO and potentially add additional classes from your custom dataset, you can follow these steps:

1. Create a `dataset.yaml` that lists the 6 COCO classes you want (indices 0-5), followed by any custom classes, and set `nc` to the total.
2. Label your custom data so the class indices match that list.
3. Train starting from pre-trained weights (e.g. `--weights yolov5s.pt`), including COCO images for those 6 classes in your training data so the model keeps learning them.

This setup ensures that the model leverages the pre-trained weights for the COCO classes and learns any additional classes from your custom dataset. If you encounter any issues or have further questions, feel free to ask! Happy training! 🚀
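A hedged sketch of such a `dataset.yaml` (the six names below are the first six COCO classes; the custom names and paths are placeholders):

```yaml
train: /path/to/your/train/images
val: /path/to/your/val/images

nc: 8  # e.g. 6 COCO classes + 2 custom classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',  # first 6 COCO classes
        'custom1', 'custom2']                                         # your new classes
```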
How would you potentially do this if you were to freeze layers? How would you execute that with darknet?
To freeze layers in YOLOv5, you can use the `--freeze` argument:

```shell
python train.py --img 640 --epochs 50 --data /path/to/your/dataset.yaml --weights yolov5s.pt --freeze 10
```

For darknet, you would need to modify the `.cfg` file directly to stop updates to the layers you want frozen.
So in the case of yolov3/v4, how many layers should ideally be frozen? Would it be the last few yolo layers or more?
For YOLOv3/v4, it's common to freeze the backbone layers and fine-tune the head layers. Typically, you might freeze the first few layers that capture general features. The exact number can vary based on your dataset and goals. Experiment to find what works best for your specific case.
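As a rough PyTorch sketch of what "freeze the first N layers" means in practice (the model below is a toy stand-in, not YOLO; `freeze_first_n` is a hypothetical helper, roughly what YOLOv5's `--freeze N` does internally):

```python
import torch.nn as nn

def freeze_first_n(model: nn.Module, n: int) -> None:
    """Disable gradient updates for the first n child modules (the 'backbone')."""
    for i, module in enumerate(model.children()):
        if i < n:
            for p in module.parameters():
                p.requires_grad = False

# Toy stand-in: 3 'backbone' layers + 1 'head' layer
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 2))
freeze_first_n(model, 3)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the head's parameters remain trainable
```

Frozen parameters receive no gradients, so the optimizer leaves them at their pretrained values; only the unfrozen head adapts to the new classes.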
Hi, I have a question regarding transfer learning. I want to use COCO pre-trained weights and add another class of my own. For this, I added the class, made sure the .yaml file contains that class, and did the labeling accordingly so that the newly added class has index 80, as 0-79 are for COCO. In total there are 81 classes.
Now the issue is how to do transfer learning such that the previously learned COCO representations are retained. In the following link Mr. Jocher strongly advised not to freeze layers with YOLOv3, and hence the --transfer argument was removed.
ultralytics/yolov3#106 (comment)
But if layers are not frozen, then wouldn't learning the new class override the previously learned COCO weights? I have used the pre-trained yolov5x.pt weights, with training data containing only the new class, but the results are not good: it does not recognize COCO's person class. It seems that the model in this case just learns the new class and forgets its previous learning.
Please advise if anyone has done transfer learning successfully with YOLOv5, i.e. fine-tuned the model so that it recognizes the COCO classes and their newly added class. If you can tell how you did it, that would be a great help.