Any plan for Knowledge Distillation? #1762
Hello @hzhuangdy, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you. If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements: Python 3.8 or later with all requirements.txt dependencies installed, including $ pip install -r requirements.txt

Environments: YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status: If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.
@hzhuangdy yes, this is a very interesting concept. I've used this myself to autolabel data with a trained YOLOv5x model in order to teach a smaller YOLOv5s model, and it works very well. These steps are manually possible at the moment, but hopefully in the future we can move towards a more automated pipeline that supports this sort of workflow.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@glenn-jocher Did you train the smaller model on soft or hard labels?
@RobinBram not sure I understand. All models are trained identically, with commands to reproduce training displayed in the README here:
@glenn-jocher Since the topic was knowledge distillation, and from all the research I have read, the best way is to use the soft labels of the teacher rather than the hard labels, so that the student learns to reason and generalize in the same way as the teacher. I'm not sure what the soft labels are in the object detection case; I suppose including the teacher's confidences for the different boxes in the loss function might work. Knowledge distillation is also possible without any additional data, just using a weighted loss function over the ground truth and the teacher's soft labels to train the student. I think I will try both in the master's thesis I'm currently working on.
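The weighted loss described above is the classic soft/hard distillation objective from the distillation literature. A minimal per-sample sketch in plain NumPy (function names, the temperature `T`, and the T² scaling are assumptions from that literature, not anything implemented in YOLOv5):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T produces softer distributions
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label, T=3.0, alpha=0.5):
    # Hard-label term: cross-entropy against the ground-truth class
    p_student = softmax(student_logits)
    hard_loss = -np.log(p_student[hard_label] + 1e-12)
    # Soft-label term: cross-entropy against the teacher's softened distribution,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures
    p_teacher_T = softmax(teacher_logits, T)
    p_student_T = softmax(student_logits, T)
    soft_loss = -(p_teacher_T * np.log(p_student_T + 1e-12)).sum() * T * T
    # Weighted combination of ground truth and teacher soft labels
    return alpha * hard_loss + (1 - alpha) * soft_loss
```

With `alpha=1.0` this reduces to ordinary cross-entropy on the ground truth; with `alpha=0.0` the student trains purely on the teacher's soft labels, which is the no-extra-data setting mentioned above.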
@RobinBram ah I see, soft labels are provided by a teacher model rather than a human labeler. Yes, we have some of the tools for this, but not the entire chain.

You can autolabel any dataset by running it through test.py (or detect.py) with --save-txt, which will generate YOLO-format labels for all the detections, and you can also include the confidences in the label as a 6th column if you also pass --save-conf to test.py. I've used this for example to label all of COCO test set (40k images) with YOLOv5x, and then add them to COCO train set (120k images), to train new models on the merged dataset (160k images). Training on two datasets at the same time is very easy with YOLOv5, you just pass them both in your data.yaml as a list:

The result of the 160k experiment was that smaller models like YOLOv5s achieved better results, but YOLOv5x itself did not improve, since it's the same size as the teacher model. We also don't have code in place to exploit label confidences yet though, so in my experiment above I only labelled high-confidence objects, i.e. --conf 0.9. If you'd like to help contribute that would be great!
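The merged-dataset setup described above can be sketched as a data.yaml fragment; the paths below are placeholders, not the actual files from the experiment:

```yaml
# Hypothetical data.yaml for training on a human-labelled set plus an
# autolabelled set produced with test.py --save-txt (paths are illustrative)
train:
  - ../coco/train2017.txt       # original human-labelled train set
  - ../coco/autolabels.txt      # YOLOv5x autolabels of additional images
val: ../coco/val2017.txt
nc: 80
names: [person, bicycle, car]   # truncated for illustration
```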
@RobinBram these are the commands you should look at. --save-hybrid is a very advanced feature that appends model predictions to existing (probably human) labels in your dataset (if any). NMS is run on the combined set for each image, with the a priori/human labels assigned confidences of 1.0 before NMS. (Lines 295 to 297 in cd8ed35)
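The --save-hybrid behavior described above (human labels assigned confidence 1.0, then NMS over the union with the model's predictions) can be sketched roughly as follows. This is an illustrative reimplementation with made-up function names, not YOLOv5's actual code:

```python
def iou(a, b):
    # Intersection-over-union of two boxes in [x1, y1, x2, y2] format
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def hybrid_nms(human_boxes, pred_boxes, pred_confs, iou_thres=0.5):
    # Human labels get confidence 1.0, so they always outrank overlapping predictions
    boxes = list(human_boxes) + list(pred_boxes)
    confs = [1.0] * len(human_boxes) + list(pred_confs)
    order = sorted(range(len(boxes)), key=lambda i: -confs[i])
    keep = []
    for i in order:
        # Greedy NMS: keep a box only if it doesn't overlap an already-kept box
        if all(iou(boxes[i], boxes[j]) < iou_thres for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep], [confs[i] for i in keep]
```

The effect is that a model prediction survives only where no human label already covers that object, which is what makes the hybrid labels safe to train on.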
If anyone is looking at this thread, everything is migrated into
🚀 Feature
Use a teacher model to train a student model, which is lighter than the teacher model
It is a promising method for model simplification with little to no decrease in accuracy
Motivation
To make a small model more efficient
Pitch
Alternatives
Additional context