Thank you for your paper and project. Before training, I qualitatively tested a batch of data with pre-trained YOLO-series models (v5 through v11) and with models such as DINO, Co-DETR, Florence-2 (both pre-trained weights and weights fine-tuned on my data), YOLO-World, etc. The statistics showed that Co-DETR and DINO performed best, roughly matching the YOLOv7 model (37,622,682 parameters) I had trained separately on my own data.
I then tried to reproduce the published results on COCO and found they were close to yours. Afterwards, I converted my training set to COCO format (a minimal sketch of the layout I target is attached after the list below). The data come from millions of images collected by urban surveillance cameras in real environments and cover four categories: faces, human bodies, vehicles, and non-motorized vehicles; after cleaning and analysis I selected 1.1 million images as the training set. I then trained DINO with the ResNet-50 and Swin-L-384-22k backbones. During training, and when comparing against the YOLO series, I observed the following:
1. On my own dataset, DINO's results were not as good as YOLOv10-X (31,662,584 parameters); this held for both the validation and test sets.
2. DINO with the larger Swin-L backbone (217,163,332 parameters) is more accurate than DINO-ResNet-50 (46,604,048 parameters) on the human-body and non-motorized-vehicle categories, but lower on the other metrics. In my experiments with the same data on the YOLO series and some anchor-free models, a much larger parameter count (especially a gap of this size) should improve the metrics, so I am not sure what causes this. Due to equipment limitations, the two runs were trained on different hardware, which naturally gave different initial random states. In addition, DINO uses non-deterministic operators, so the results could not be fixed exactly (the seeding setup I use to limit this is sketched after the list); in a reproduction run of dn-detr-resnet50 for 12 epochs I did get a higher result, but unfortunately only the images were saved and I forgot to change the storage path during later testing. My training set also uses a special definition of the human body for riders of non-motorized vehicles (the annotation involves both the human body and the non-motorized vehicle), and I suspect this may have contributed to the drop in those metrics.
3. Since training cannot be fully reproduced, is your reported result the average of multiple training runs? In some earlier projects I found that randomness had a significant impact on results for simple datasets (few categories but many scenarios) and for large datasets; there the statistics could vary by about ±2% around the baseline.
4. During training the run is unstable (batch size = 2 per card) and is often interrupted with the error `torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: -9) local_rank: 3 (pid: 1064263) of binary: /opt/conda/bin/python`. My follow-up investigation suggests it may be related to the CPU/host side (the data-loader settings I am trying as a workaround are sketched after the list). Have you encountered similar problems?
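
For reference, this is roughly the COCO-format layout I convert my annotations into (mentioned above). `samples` and the box-tuple layout are placeholders for my own annotation pipeline, not anything from your repository; only the COCO field structure itself is the point.

```python
import json

# Category ids for my four classes (ids are my own convention).
CATEGORIES = [
    {"id": 1, "name": "face"},
    {"id": 2, "name": "human_body"},
    {"id": 3, "name": "vehicle"},
    {"id": 4, "name": "non_motorized_vehicle"},
]

def to_coco(samples, out_path):
    """samples: iterable of dicts with file_name, width, height and boxes
    as (x, y, w, h, category_id) tuples -- placeholder for my own loader."""
    images, annotations = [], []
    ann_id = 1
    for img_id, sample in enumerate(samples, start=1):
        images.append({
            "id": img_id,
            "file_name": sample["file_name"],
            "width": sample["width"],
            "height": sample["height"],
        })
        for x, y, w, h, cat in sample["boxes"]:
            annotations.append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat,
                "bbox": [x, y, w, h],  # COCO uses [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    with open(out_path, "w") as f:
        json.dump({"images": images, "annotations": annotations,
                   "categories": CATEGORIES}, f)
```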
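
On point 2: this is the best-effort seeding setup I use to reduce run-to-run variance. The seed value and flags are just my assumptions about what matters; since some ops in the DETR-style pipeline have no deterministic CUDA kernel, it narrows but does not remove the randomness.

```python
import os
import random

import numpy as np
import torch

def set_deterministic(seed: int = 42):
    """Best-effort determinism; ops without a deterministic CUDA kernel
    are only warned about (warn_only=True), not replaced."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Required by some cuBLAS ops when deterministic algorithms are enforced.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.use_deterministic_algorithms(True, warn_only=True)
```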
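
On point 4: exitcode -9 is a SIGKILL, which in my experience usually points to the host OOM killer rather than a CUDA error, so I am currently trying to reduce host-memory pressure from the data pipeline roughly as below; `train_dataset` and `collate_fn` stand in for the ones in my config.

```python
from torch.utils.data import DataLoader

# Assumed workaround for the exitcode -9 crashes: keep the data pipeline's
# host-memory footprint small. train_dataset / collate_fn are placeholders.
train_loader = DataLoader(
    train_dataset,
    batch_size=2,
    shuffle=True,
    num_workers=2,            # fewer workers -> fewer decoded images held in RAM
    pin_memory=False,         # pinned buffers also count against host memory
    persistent_workers=False,
    prefetch_factor=2,        # only valid when num_workers > 0
    collate_fn=collate_fn,
)
```

After a crash I also check `dmesg -T | grep -i "out of memory"` on the host to confirm whether the kernel OOM killer fired.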
Looking forward to your reply. (Sorry for asking so many questions, and thank you.)