Guidance regarding transfer learning with COCO to add additional classes #980

Closed
shayanalibhatti opened this issue Sep 16, 2020 · 42 comments
Labels
Stale Stale and schedule for closing soon

Comments

@shayanalibhatti

Hi, I have a question regarding transfer learning. I want to use COCO pretrained weights and add another class of my own. For this, I added the class, made sure the .yaml file contains that class, and labeled the data so that the newly added class has index 80, since indices 0-79 belong to COCO. In total there are 81 classes.

Now the issue is how to do transfer learning so that the previously learned COCO representations are retained. In the following link, Mr. Jocher strongly advised against freezing layers with YOLOv3, and hence the --transfer argument was removed.
ultralytics/yolov3#106 (comment)

But if layers are not frozen, wouldn't learning the new class overwrite the weights learned on COCO? I have used the pretrained yolov5x.pt weights with training data containing only the new class, but the results are not good: the model does not recognize COCO's person class. It seems that in this case the model just learns the new class and forgets its previous learning.

Please guide if anyone has done transfer learning successfully with YOLOv5, that is, fine-tuned the model so that it recognizes both the COCO classes and the newly added class. If you can tell how you did it, that would be a great help.

@glenn-jocher
Member

This is very easy. You simply append your new dataset, labelled with class 80 onward, to coco.yaml train:, and train normally:
python train.py --data coco.yaml

yolov5/data/coco.yaml

Lines 12 to 13 in 5a9c5c1

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco/train2017.txt # 118287 images
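
If the custom labels were originally numbered from 0, they have to be shifted before the dataset is appended to COCO. Here is a minimal sketch, assuming YOLO-format label files (one `class x_center y_center width height` row per object). The helper name `offset_class_ids` is hypothetical and not part of YOLOv5:

```python
def offset_class_ids(label_text: str, offset: int = 80) -> str:
    """Shift the class index of every YOLO-format label row by `offset`."""
    shifted = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:  # ignore blank lines
            continue
        parts[0] = str(int(parts[0]) + offset)
        shifted.append(" ".join(parts))
    return "\n".join(shifted)


# Two objects of custom classes 0 and 1 become classes 80 and 81.
example = "0 0.5 0.5 0.2 0.3\n1 0.1 0.2 0.05 0.05"
print(offset_class_ids(example))
```

In practice you would read each label .txt file, pass its contents through the helper, and write the result back.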

@daynial132

daynial132 commented Sep 22, 2020

Thank you for the answer.

That would require training on the whole dataset again, with the 80 COCO classes that are already trained plus 1 new class.
It will take a lot of time as well as a lot of computation.
Is there a way to add just 1 class, without training all 80 + 1 classes again?

@shayanalibhatti
Author

shayanalibhatti commented Sep 22, 2020

Exactly, retraining on the whole dataset is not desirable as it can take weeks. Here is what I have tried. I created a dataset for the new class whose images also contain COCO objects. Then I appended the new class to the COCO classes and made sure all labels are fine.

Then I froze all layers except the Detect head, to fine-tune it to the new class, using the following:

https://gist.github.com/shayanalibhatti/7955bc7a09cbfcd67843f7159d1d8924

But I am not seeing any good results, either on the new class or on the COCO classes.

@glenn-jocher
Member

The time savings are already built in by starting from pretrained weights. If you freeze layers you'll naturally see some additional speedup, though at the cost of reduced mAP.

@daynial132

Can you please share any link or source that can help us understand it better?

@glenn-jocher
Member

@daynial132 for best results follow guidance provided above. For pretrained speedup comparison, simply see the training results shown in the colab notebook.

For freezing layers (not recommended), see ultralytics/yolov3#106

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Oct 24, 2020
@Dhirajdgandhi

Hey,

Did you find a solution to this? To not retrain the entire model, but rather just add a few labels to the existing model?

@glenn-jocher
Member

glenn-jocher commented Nov 16, 2020

@Vanh1112

Vanh1112 commented Nov 2, 2021

Can you share how you did it? I'm an ML newbie, and I really haven't found a way to fine-tune the model so that it recognizes the COCO classes and a newly added class. @shayanalibhatti

@Vanh1112

Vanh1112 commented Nov 2, 2021

I want to fine-tune the model so that it recognizes the COCO classes and 2 newly added classes. I edited the config file to classes=82, but I only train with the 2 newly added classes. Can I do this? @glenn-jocher

@BahadirgK

BahadirgK commented Dec 23, 2021

This is very easy. You simply append your new dataset, labelled with class 80 onward, to coco.yaml train:, and train normally: python train.py --data coco.yaml

yolov5/data/coco.yaml

Lines 12 to 13 in 5a9c5c1

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../coco/train2017.txt # 118287 images

Hi. I want to retrain my model and add classes on top of the 80 COCO classes. Is it enough to download the COCO images and then copy my custom images and labels into that folder? Is there an easy way to retrain and add classes? Thanks for your help.
@glenn-jocher

@glenn-jocher
Member

@BahadirgK the response you copied already explains how to do this

@mkieffer1107

Is it possible to train YOLOv5 on COCO with its 80 classes, and then export the weights to be used on a dataset with fewer classes?

@glenn-jocher
Member

@mkieffer1107 yes, this is the default workflow.

@mkieffer1107

Thanks for the reply! I’m new to ml in general. Do I just export the weights and then choose them using flags when running train.py on a new dataset?

@glenn-jocher
Member

glenn-jocher commented Dec 31, 2021

@mkieffer1107 see Train Custom Data tutorial for guidance:

YOLOv5 Tutorials

Good luck 🍀 and let us know if you have any other questions!

@kashishnaqvi101

Hey, did you find out how to perform transfer learning by combining COCO weights and custom weights?

@kashishnaqvi101

Hey, were you able to find any solution for this? I wanted to add 5 classes for currency notes along with the 80 coco classes.

@kashishnaqvi101

@kashishnaqvi101 see https://community.ultralytics.com/t/how-to-combine-weights-to-detect-from-multiple-datasets/38/13

Hey, thanks for the quick reply, however if I train on Colab, won't it take too long to be trained again on coco dataset as well as my own custom dataset?

@kashishnaqvi101

@kashishnaqvi101 see https://community.ultralytics.com/t/how-to-combine-weights-to-detect-from-multiple-datasets/38/13

I am still trying to figure this out. Do we have to train on the new dataset together with the COCO dataset, or can we just use the yolov5.pt weights for COCO?

@Utsabab

Utsabab commented Mar 21, 2024

@kashishnaqvi101 I guess it's a bit late. I am currently working through the same problem and figured it out going through several discussion threads. Hope this helps other researchers as well.

Yes, you will need to train the model on both the COCO dataset and your custom dataset if your custom data has new classes that are not in COCO. @glenn-jocher explained how to do that above: #980 (comment). Remember that your new classes should be labeled starting from 80, as COCO already maps labels 0-79. Freezing layers can potentially be done; it takes less training time, but it reduces mAP.

Pre-trained weights from yolov5s.pt will only come in handy on their own when your custom dataset contains labels from COCO and no new class labels. In that case, you just need to create a customdata.yaml file with a link to your custom dataset, and remember to list all 80 classes from COCO.

To get the best results for custom data with new classes, train with COCO + custom dataset with updated label mapping.

@glenn-jocher
Member

@Utsabab excellent summary, thanks for sharing your findings! Just to add a bit, when combining COCO with a custom dataset that includes new classes, indeed your custom class labels should start from 80 onwards. Using pre-trained weights like yolov5s.pt helps in leveraging learned features to expedite the training process for your combined dataset.

For the training, make sure you update your dataset .yaml file to include both the COCO and new classes, adjusting the nc (number of classes) field accordingly. Here's a quick snippet on how your .yaml might look:

# example dataset.yaml
train: /path/to/your/train/images
val: /path/to/your/val/images

# number of classes (80 COCO + num of your custom classes)
nc: 85  

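# class names (COCO classes followed by your custom classes)
# NOTE: `0 .. 79` is a placeholder; list the 80 COCO class names explicitly, in their original order
names: [0 .. 79, 'custom1', 'custom2', 'custom3', 'custom4', 'custom5']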

Be sure to adjust the path to your train/val images and the number of custom classes (nc) as well as their names (names) to fit your specific case. Happy training! 🚀
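
To avoid typing the `names` list by hand, it can be generated. The sketch below is only an illustration: `build_names` is a hypothetical helper, and the COCO name list is written from memory in the order I believe YOLOv5's coco.yaml uses, so please check it against your own copy of that file before relying on it.

```python
# The 80 COCO class names, index 0 to 79 (verify against your coco.yaml).
COCO_NAMES = [
    "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck",
    "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench",
    "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra",
    "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
    "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove",
    "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
    "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange",
    "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
    "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse",
    "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink",
    "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
    "hair drier", "toothbrush",
]


def build_names(custom_names):
    """Return (nc, names) with the custom classes appended after the COCO classes."""
    overlap = set(COCO_NAMES) & set(custom_names)
    if overlap:
        raise ValueError(f"custom names already exist in COCO: {sorted(overlap)}")
    names = COCO_NAMES + list(custom_names)
    return len(names), names


nc, names = build_names(["custom1", "custom2", "custom3", "custom4", "custom5"])
print(f"nc: {nc}")
print(f"names: {names}")
```

The printed `names` line is a Python list literal with single-quoted strings, which can be pasted into the .yaml as a flow sequence.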

@Utsabab

Utsabab commented Mar 22, 2024

@glenn-jocher Thanks for the explanation. Just one more clarifying question. I was assuming I would have to add the COCO dataset paths to train and val in dataset.yaml.

# example dataset.yaml
train: 
- /path/to/your/train/images
- /path/to/coco/train/images
val: 
- /path/to/your/val/images
- /path/to/coco/val/images

# number of classes (80 COCO + num of your custom classes)
nc: 85  

# class names (COCO classes followed by your custom classes)
names: [0 .. 79, 'custom1', 'custom2', 'custom3', 'custom4', 'custom5']

Will pre-trained weights from YOLOv5s.pt still be applied if the path to the COCO dataset is not included in the .yaml file?

Thank you!

@glenn-jocher
Member

@Utsabab yes, you've got it right! 😊 Including the COCO dataset paths along with your custom dataset paths in the .yaml file, as you've shown, allows you to train on both datasets simultaneously, which is especially useful when adding new classes to the COCO dataset.

To clarify: even if you don't include the COCO dataset path in your .yaml, the pre-trained weights from yolov5s.pt (or any other variant you choose) will still provide a beneficial starting point due to their training on the COCO dataset. These weights help in faster convergence and potentially better performance, especially for features common between COCO and your custom classes. Keep in mind, though, that training only on images of the new classes tends to make the model forget the original COCO classes, as reported at the start of this thread, so include the COCO paths if the 80 COCO classes must still be detected.

Remember, the use of pre-trained weights aims to leverage learned features to improve training efficiency and effectiveness on your task.

Happy to help if you have more questions!

@Utsabab

Utsabab commented Mar 26, 2024

@glenn-jocher I have 14 classes in my custom dataset, of which 9 are new and 5 are in the COCO labels. Is there a way I can use pre-trained weights for the 5 COCO classes only, without including the other COCO labels in the custom yaml file? What would the example dataset.yaml look like? I am getting a low recall score and hope to increase it by only predicting objects from my custom dataset. I'd appreciate your help.

@glenn-jocher
Member

@Utsabab, great question! To leverage pre-trained weights for the 5 COCO classes while introducing your 9 new ones, simply structure your dataset.yaml as follows, keeping all 14 of your custom classes in the list, including those overlapping with COCO. Recall may indeed improve due to a more focused training on relevant classes.

Here's how your dataset.yaml might look:

train: /path/to/your/train/images
val: /path/to/your/val/images

# 14 classes: 5 from COCO + 9 new ones
nc: 14  

# Class names: ensure COCO overlapping classes are correctly named
names: ['COCO_class1', 'COCO_class2', 'COCO_class3', 'COCO_class4', 'COCO_class5', 'new_class1', 'new_class2', 'new_class3', 'new_class4', 'new_class5', 'new_class6', 'new_class7', 'new_class8', 'new_class9']

This setup initializes the model from the pre-trained weights, so features learned on COCO carry over, while training focuses exclusively on detecting your defined 14 classes. The class names themselves are only labels; what matters is that the class indices in your label files follow the order of the names list.

Hope this helps, and happy training! 🚀
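
If some of your label files still use the original COCO indices for the overlapping classes, they need to be renumbered to match the new 14-class list. A minimal sketch, assuming YOLO-format rows; `remap_labels` and the example mapping are hypothetical and only illustrate the idea:

```python
def remap_labels(label_text, id_map):
    """Rewrite class ids using `id_map`; rows whose class is not in the map are dropped."""
    kept = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:  # ignore blank lines
            continue
        old_id = int(parts[0])
        if old_id in id_map:
            kept.append(" ".join([str(id_map[old_id])] + parts[1:]))
    return "\n".join(kept)


# Hypothetical mapping: COCO person (0) -> 0, car (2) -> 1, dog (16) -> 2.
id_map = {0: 0, 2: 1, 16: 2}
coco_rows = "0 0.5 0.5 0.2 0.4\n7 0.3 0.3 0.1 0.1\n16 0.7 0.6 0.2 0.2"
print(remap_labels(coco_rows, id_map))
```

Here the truck row (COCO class 7) is dropped because it is not part of the new class list.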

@Utsabab

Utsabab commented Apr 8, 2024

@glenn-jocher I retrained my model after eliminating under-represented object labels from the data, and the mAP and recall values improved. I am currently working on accessing the feature maps of each image during training. I found resources to identify and visualize feature maps during inference, but not during training. In addition, I am trying to figure out a way to use those feature maps as an additional input and dissociate the relation between feature maps during training. Any insights would help. Thank you!

@glenn-jocher
Member

@Utsabab hey there! 🌟 Glad to hear about the improvement in your mAP and recall values after refining your dataset. For accessing and visualizing feature maps during training in YOLOv5, you can modify the model's forward function to return feature maps of interest, alongside the usual predictions.

Here's a simplistic example on how you might approach it:

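import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # placeholder layers; substitute your real model architecture here
        self.some_layer = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.head = nn.Conv2d(16, 8, kernel_size=1)

    def forward(self, x, feature_visualize=False):
        # pass x through the layers
        # 'some_layer' stands for the layer whose features you want to visualize
        features = self.some_layer(x)
        predictions = self.head(features)
        if feature_visualize:
            return predictions, features
        return predictions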

And during training, you could grab those feature maps with something like this:

predictions, features = model(images, feature_visualize=True)

Remember, this is quite simplified and integrating it into YOLOv5 or any other complex models might require a more nuanced approach, especially with handling multiple feature layers and integrating this additional input meaningfully for your specific use case.
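
An alternative that leaves the forward function untouched is a PyTorch forward hook, registered with `register_forward_hook` on the layer of interest. The sketch below uses a toy `nn.Sequential` as a stand-in for a detector; the layer choice and the `captured` dictionary are assumptions for illustration, not YOLOv5 internals:

```python
import torch
import torch.nn as nn

# Toy stand-in for a detector; the layers are placeholders.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),  # index 0: layer to inspect
    nn.ReLU(),
    nn.Conv2d(8, 4, kernel_size=1),             # index 2: stand-in "head"
)

captured = {}


def save_features(module, inputs, output):
    # Detach so the stored copy does not keep the autograd graph alive.
    captured["features"] = output.detach()


handle = model[0].register_forward_hook(save_features)

images = torch.randn(2, 3, 16, 16)
predictions = model(images)  # the hook fires during this call
handle.remove()              # unregister the hook once it is no longer needed

print(predictions.shape, captured["features"].shape)
```

The training loop and the model's return values stay as they are; the hook only copies the intermediate output on the side.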

Using those feature maps as an additional input and dissociating relations between them during training could get pretty specific depending on what you're aiming to achieve. You might experiment with concatenating these features at different stages of your model or using them as part of attention mechanisms, ensuring you adjust the training logic to leverage this effectively.

Good luck with your exploration, and I hope this gives you a solid starting point!

@Utsabab

Utsabab commented Apr 21, 2024

@glenn-jocher Thank you for the directions. I have been able to access feature maps from SPPF layers inside the _forward_once function of BaseModel class in yolo.py. However, I want the tensors for only the last epoch. I can access epochs and img paths in train.py, however, I can't figure out connection between yolo.py and train.py, therefore, can't seem to figure how to access only tensors from SPPF layer from last epoch. Any guidance would be helpful.

@glenn-jocher
Member

@Utsabab I'm thrilled to hear you've made progress! 😊 To access SPPF layer's feature maps only from the last epoch, a straightforward approach is to modify train.py to pass a flag or epoch information to your model indicating the current epoch. Then, within your model, you can conditionally store or process these feature maps based on this information.

Here's a quick sketch on how you could implement this:

  1. Modify train.py: Pass the current epoch number to your model's forward method.
# Inside the training loop of train.py
for epoch in range(start_epoch, epochs):
    model.train()
    for i, (imgs, targets, paths, _) in enumerate(dataloader):
        ...
        predictions, _ = model(imgs.to(device), epoch=epoch, max_epoch=epochs - 1)
  2. Adjust your model's forward method: Accept the epoch parameters and implement logic to handle the last epoch differently.
# Inside your model definition
def forward(self, x, epoch=None, max_epoch=None, **kwargs):
    # Your forward logic
    if epoch is not None and epoch == max_epoch:
        # Logic to process or store feature maps from the SPPF layer
        # ('sppf_layer' is a placeholder name for wherever your SPPF module lives)
        features = self.sppf_layer(x)
        # Do something with features
    ...

Please ensure you adjust the logic to fit your exact needs, especially how you plan to process or store the feature maps. This example is quite simplified to illustrate the concept. Best of luck with your further experiments! 🚀
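
The same "last epoch only" gating can also be kept outside the model. The sketch below is a plain-Python illustration: `LastEpochRecorder` is a hypothetical class whose call signature `(module, inputs, output)` matches what PyTorch passes to a forward hook, and the inner loop only simulates the forward pass with strings.

```python
class LastEpochRecorder:
    """Hook-style callable that stores outputs only while `enabled` is True."""

    def __init__(self):
        self.enabled = False
        self.outputs = []

    def __call__(self, module, inputs, output):
        if self.enabled:
            self.outputs.append(output)


recorder = LastEpochRecorder()
epochs = 3
for epoch in range(epochs):
    # Switch recording on only for the final epoch.
    recorder.enabled = (epoch == epochs - 1)
    for batch in ("batch0", "batch1"):
        # Stand-in for the forward pass: in a real run PyTorch would call the
        # recorder itself after `layer.register_forward_hook(recorder)`.
        recorder(None, None, f"features(epoch={epoch}, {batch})")

print(recorder.outputs)
```

In a real training run you would probably store `output.detach().cpu()` instead of the raw output, so the saved tensors do not hold on to GPU memory or the autograd graph.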

@Utsabab

Utsabab commented Apr 24, 2024

@glenn-jocher, just wanted to drop a quick thanks for your help! Sorting out that epoch parameter hiccup really pushed me forward. I've been making some solid strides since then. One issue I am dealing with now is that back when I was training on batch size 1, I could easily snag each image path and its SPPF feature maps. With batch size bumped up to 128, I'm hitting a snag. I looped through paths to access image files in each batch, but struggling to do the same for the SPPF feature maps. Any pointers would be awesome!

@glenn-jocher
Member

@Utsabab, thrilled to hear you're making progress! 🎉 For handling SPPF feature maps with larger batch sizes, remember that the feature maps will be batched just like your input images. Essentially, your feature maps tensor will have a shape like [batch size, channels, height, width].

To iterate over each image's SPPF feature maps in the batch, you can simply use a loop. Here's a quick example:

# Assuming 'features' is your batched SPPF feature maps tensor
for i in range(features.size(0)):  # Looping through the batch dimension
    feature_map = features[i]  # This is the SPPF feature map for the ith image in the batch
    # You can now process each feature_map as needed

This way, you can access and process the SPPF feature maps for each image individually, even in larger batches. Keep in mind that processing these one by one in Python might be slower, and leveraging batch operations where possible is generally preferable for efficiency.
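
To keep each feature map tied to its image path, the two can be zipped along the batch dimension. This is a small sketch with nested lists standing in for the tensor; it assumes `paths` and `features` come from the same batch in the same order, and `pair_paths_with_features` is a hypothetical helper. A PyTorch tensor behaves the same way here, since `len()` and iteration both run over dimension 0.

```python
def pair_paths_with_features(paths, features):
    """Return a list of (path, feature_map) pairs, one per image in the batch."""
    if len(paths) != len(features):
        raise ValueError("batch size mismatch between paths and features")
    return list(zip(paths, features))


paths = ["img_a.jpg", "img_b.jpg"]
features = [[[0.1, 0.2]], [[0.3, 0.4]]]  # nested lists with shape [2, 1, 2]
for path, fmap in pair_paths_with_features(paths, features):
    print(path, fmap)
```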

Keep up the great work! 👍

@Utsabab

Utsabab commented Apr 25, 2024

@glenn-jocher It worked like a charm. Thanks a bunch!

@glenn-jocher
Member

@Utsabab, you're welcome! 😊 Happy to hear it worked well for you. If you have any more questions or need further assistance as you continue with your project, feel free to reach out. Happy coding! 🚀

@nbandaru1h

I would like to use the first 6 classes from coco and train additional classes (or not) from custom dataset. How can I achieve this?

@glenn-jocher
Member

Hi @nbandaru1h,

Great question! To use the first 6 classes from COCO and potentially add additional classes from your custom dataset, you can follow these steps:

  1. Prepare Your Dataset:

    • Ensure your custom dataset is labeled in the YOLO format.
    • Create a dataset.yaml file that includes paths to your training and validation images, and lists all the classes you want to train on (both COCO and custom).
  2. Modify the dataset.yaml:

    • Include the first 6 COCO classes and any additional classes from your custom dataset. Here's an example:
    train: /path/to/your/train/images
    val: /path/to/your/val/images
    
    # Number of classes: 6 COCO classes + your custom classes (2 in this example)
    nc: 8
    
    # Class names: first 6 COCO classes followed by your custom classes
    names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'custom_class1', 'custom_class2']
  3. Training:

    • Use the pre-trained COCO weights and specify your custom dataset.yaml file during training. The pre-trained weights give the model a head start on the first 6 COCO classes while it learns the new classes, provided your training images contain labeled examples of every class you list.
    python train.py --img 640 --epochs 50 --data /path/to/your/dataset.yaml --weights yolov5s.pt

With this setup, the model leverages the pre-trained weights and learns each class that is labeled in your dataset. If you encounter any issues or have further questions, feel free to ask!

Happy training! 🚀
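
Before training on a reduced class list, it can help to check that no label row refers to a class outside the new range. A minimal sketch, assuming YOLO detection labels with exactly 5 fields per row; `find_invalid_rows` is a hypothetical helper, not a YOLOv5 function:

```python
def find_invalid_rows(label_text, nc):
    """Return (line_number, row) pairs that are malformed or have a class id outside [0, nc)."""
    bad = []
    for number, line in enumerate(label_text.splitlines(), start=1):
        parts = line.split()
        if not parts:  # ignore blank lines
            continue
        ok = len(parts) == 5 and parts[0].isdigit() and int(parts[0]) < nc
        if not ok:
            bad.append((number, line))
    return bad


# Row 2 uses class 8, which does not exist when nc is 8; row 3 is missing a field.
labels = "0 0.5 0.5 0.2 0.4\n8 0.3 0.3 0.1 0.1\n5 0.2 0.2 0.1"
print(find_invalid_rows(labels, nc=8))
```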

@nbandaru1h

How would you potentially do this if you were to freeze layers? How to execute that with darknet?

@glenn-jocher
Member

To freeze layers in YOLOv5, you can use the --freeze argument during training. For example, to freeze the first 10 layers, you can run:

python train.py --img 640 --epochs 50 --data /path/to/your/dataset.yaml --weights yolov5s.pt --freeze 10

For darknet, you would need to modify the .cfg file to set stopbackward=1 for the layers you want to freeze. If you need further assistance, please refer to the YOLOv5 documentation.
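
What freezing does can be sketched in plain PyTorch: parameters belonging to the first N layers get `requires_grad = False`, so the optimizer no longer updates them. This is a toy illustration on an `nn.Sequential`, not YOLOv5's own code, and `freeze_first_layers` is a hypothetical helper:

```python
import torch.nn as nn


def freeze_first_layers(model: nn.Sequential, n: int) -> list:
    """Turn off gradients for every parameter in the first `n` child modules."""
    prefixes = tuple(f"{i}." for i in range(n))
    frozen = []
    for name, param in model.named_parameters():
        if name.startswith(prefixes):
            param.requires_grad = False
            frozen.append(name)
    return frozen


# Toy network: indices 0-2 play the role of a backbone, index 4 the head.
model = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2)
)
frozen = freeze_first_layers(model, 3)
print(frozen)
```

After this call only the parameters of the last layer still receive gradient updates.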

@nbandaru1h

So in the case of yolov3/v4, how many layers should ideally be frozen? Would it be the last few yolo layers or more?

@glenn-jocher
Member

For YOLOv3/v4, it's common to freeze the backbone layers and fine-tune the head layers. Typically, you might freeze the first few layers that capture general features. The exact number can vary based on your dataset and goals. Experiment to find what works best for your specific case.
