The goal of this competition is to accurately identify starfish in real-time by building an object detection model trained on underwater videos of coral reefs.
My teammate and I started early and contributed a few notebooks and discussion threads, but as the deadline for the overlapping Sartorius - Cell Instance Segmentation competition approached, we had to shift our focus there. We properly got started with this competition after the 31st of December. It took some time to digest everything that was happening [I'm still reading the solutions, and some code]. This was a very successful competition [as was Sartorius], with a total of 2,026 teams and 61,174 submissions. People shared their ideas and code generously, which made it a great start for a beginner like me, because I got to learn a lot. With the help of the community our team finished in the top 3% of the LB with a silver medal [we were expecting bronze]. Here I will discuss my experiments. I created a lot of NBs in this competition and have tried to compile the most important ones here. You will find some forked NBs (with modifications) and some independent NBs. If I missed something, please create an issue and ask there.
img_pred_seq45518.mp4
See more here
- Most of the ideas were proposed on the discussion forums.
- Once I got to yolov5, it was clear that yolov5 was the way to go. But there was conflicting advice on which variant to use, yolov5s6 or yolov5m6, because different people were getting better results on each.
I first started with the two-stage detector FasterRCNN. I tried different backbones and hyper-parameters with different augmentation techniques [geometric, color, and combined]. I tried ResNet101, ResNet50, MobileNet, EfficientNet-B3, and Swin Transformer. Check out this amazing repository by @mrinath, timmFasterRcnn; it helped with the EfficientNet backbone via timm.
Next I started with a 3-fold yolov5s6 setup, using a video-based split. I was using this repository, with minor changes over the ultralytics yolov5 code to track the F2 score. My analysis suggested that video_id 2 would most likely give the best F2, because it has more data and more variance in the data. I tried different hyper-parameters and different training image resolutions, and tried ensembling the folds after training each one. I did the same with yolov5m6. I found that Adam was working better than SGD. I also did some experiments with custom augmentation using albumentations.
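A video-based split can be done with `GroupKFold`, grouping frames by `video_id` so no video leaks across folds. A sketch with a tiny hypothetical frame table (the real `train.csv` has these columns plus annotations):

```python
import pandas as pd
from sklearn.model_selection import GroupKFold

# Hypothetical stand-in for the competition's frame-level table
df = pd.DataFrame({
    "video_id":    [0, 0, 1, 1, 2, 2],
    "video_frame": [0, 1, 0, 1, 0, 1],
})

# 3 folds == 3 videos, so each fold holds out exactly one whole video
gkf = GroupKFold(n_splits=3)
for fold, (_, val_idx) in enumerate(gkf.split(df, groups=df["video_id"])):
    df.loc[val_idx, "fold"] = fold
```

Each fold's validation set then contains only frames from a single video, which is what makes the per-video CV numbers (like video_id 2's higher F2) comparable.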
After seeing some discussion on yolov5 model freezing, I decided to try that; for this, the best splitting was a sequence-based group fold. For more, check out the ultralytics docs. I trained both yolov5s6 and yolov5m6 with an image size of around 3000.
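In yolov5 this is exposed as the `--freeze N` flag on `train.py`, which just sets `requires_grad = False` on the first N top-level modules. A toy sketch of that mechanism (a stand-in `nn.Sequential`, not yolov5's actual code):

```python
import torch.nn as nn

def freeze_layers(model: nn.Module, n: int) -> None:
    # Freeze the first n top-level children, as yolov5's --freeze does by index
    for i, child in enumerate(model.children()):
        if i < n:
            for p in child.parameters():
                p.requires_grad = False

# Toy stand-in for a detector: a "backbone" followed by a "head"
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
freeze_layers(net, 1)  # freeze only the first module
```

Freezing the backbone this way trades a little accuracy for much faster fine-tuning, which matters at image sizes around 3000.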
Alongside yolov5, tracking did a good job of increasing the CV/LB, so I tried that too. I used norfair tracking. I saw some discussions on different trackers to use, like Deep SORT, but ended up using norfair as it was giving decent results and I did not have much time.
As a postprocessing technique I also applied a classifier to the predicted bounding boxes, which helped. I tried different models: a plain CNN, DenseNet121, ResNet [50, 101], EfficientNet [B3], and an ensemble of them. I used this code for the ensembling; it is also included in this repo. Our demo pipeline looks like this:
```mermaid
graph TD;
A(Competition Data)--> B(video split vid_0);
A(Competition Data)--> C(video split vid_1);
A(Competition Data)--> D(video split vid_2);
B(video split vid_0)-->E(Train yolov5s6 img-3584);
C(video split vid_1)-->M(Train yolov5s6 img-3584);
D(video split vid_2)-->N(Train yolov5s6 img-3584);
E(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
M(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
N(Train yolov5s6 img-3584)--> K(TTA w/ inf imgsize-6200);
K(TTA w/ inf imgsize-6200)--eg conf:0.30, thr:0.50--> G(WBF);
G(WBF)--> F(Classification);
F(Classification)--> H(norfair Tracking);
H(norfair Tracking)-->final
```
- Learning to Sea: Underwater img Enhancement + EDA [public Kaggle, 200+ upvotes]
- Competition metric implementation clone, with some fixes: reef-model-cv-check
- StarFish-V3 [yolov5 ensemble + tracking]
- StarFish-V3 [yolov5 + tracking]
- yolov5 inference | Leon-V5-infer 2.0
Inference NB: https://www.kaggle.com/soumya9977/learning-to-torch-fasterrcnn-infer
Experiment log FasterRCNN:
Version | model | file used | link | CV/LB |
---|---|---|---|---|
v9 | fasterRCNN resnet50,90/10,e12,bs8,SGD,cnf0.15,i480 | fasterrcnn_resnet50_fpn-e10.pt | NB | 0.461/0.285 |
v12 | fasterRCNN ................... same as above | fasterrcnn_resnet50_fpn-e9.pt | NB | 0.461/~0.285 |
v10 | fasterRCNN ................... same as above | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.459/0.285 |
v13 | fasterRCNN ................... same as above | fasterrcnn_resnet50_fpn-e8.pt | NB | 0.460/0.288 |
v11 | fasterRCNN ................... same as above | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.457/0.291 |
v16 | fasterRCNN resnet50,90/10,e12,bs8,SGD,cnf0.15,i480,geo aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.467/0.274 |
v17 | fasterRCNN resnet50,90/10,e20,bs8,SGD,cnf0.15,i480,color aug | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.382 |
v18 | fasterRCNN resnet50,90/10,e20,bs8,SGD,cnf0.15,i480,color aug | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.407/0.291 |
v20 | FasterRCNN[train]-color+geo aug-480p-SGD-90:10-e20,multi conf,new train loop,bs8 | fasterrcnn_resnet50_fpn-e11.pt | NB | 0.338/? |
v21 | FasterRCNN[train]-color+geo aug-480p-SGD-90:10-e20,multi conf,new train loop,bs8 | fasterrcnn_resnet50_fpn-e20.pt | NB | 0.338/0.184 |
v22 | fasterRCNN resnet50,90/10,e20,bs8,SGD,cnf0.15,i480,color aug [inf imgSize2400] | fasterrcnn_resnet50_fpn-e6.pt | NB | 0.407/0.00 [problem in the code] |
v23 | fasterRCNN resnet50,90/10,e16,bs8,AdamW,cnf0.15,i480,color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
v24 | fasterRCNN resnet50,90/10,e16,bs8,AdamW,cnf0.1,i480,color aug [save_multy: future_resume] | fasterrcnn_resnet50_fpn-e7.pt | NB | 0.382/? |
Experiment log YOLOV5:
version | config | iou & conf | img_size[train/test] | epoch used | CV/LB |
---|---|---|---|---|---|
starfish-v17 | [tracking,tta] 1/5 fold,yolos6 ,3000,e11,bs2 | 0.4,0.28 | 1920 x 3 | best.pt | 0.81/0.571 |
starfish-v16 | [tracking,tta] Good Moon model yolos6 | "" | 6400 | f2_sub.pt | ?/0.647 |
starfish-v15 | [tracking,tta] Good Moon model yolos6 | "" | 1920 x 3 | f2_sub.pt | ?/0.641 |
Leon-V5 - v3 | [no tracking, tta] Good Moon model yolos6 | "" | 10000 | f2_sub.pt | ?/0.432 |
Leon-V5 - v4 | [no tracking, tta] Good Moon model yolos6 | 0.4,0.20 | 10000 | f2_sub.pt | ?/0.424 |
Leon-V5 - v1 | [no tracking, tta] Good Moon model yolos6 | 0.50,0.30 | 6400 | f2_sub.pt | ?/0.665 |
Leon-V5 - v2 | [no tracking, tta] Good Moon model yolos6 | 0.4,0.28 | 6400 | f2_sub.pt | ?/0.665 |
starfish-v13 | [tracking,tta] 1/5 fold,yolov5s5 ,3000,e11,bs2 | 0.4,0.28 | 1920 x 3 | best.pt | 0.76/0.588 |
starfish-v12 | [tracking,tta] 1/5 fold,yolov5s5 ,3000,e11,bs2 | 0.4,0.15 | 1920 x 3 | best.pt | 0.76/0.580 |
starfish-v07 | [tracking,tta] 1/5 fold,yolov5s5 ,3000,e11,bs2 | 0.4,0.28 | 1920 x 3 | best.pt | 0.76/0.588 |
Experiment log YOLOV5:
version | config | epoch for sub | cv/lb |
---|---|---|---|
v4 | CONF= 0.28, IOU= 0.40, sheep's model | fold2 best | ?/0.616 |
v5 | yolov5s5:albu[frcnn],imgsize=3600,bs=2,e11,CONF= 0.28, IOU= 0.40 | best.pt | ?/0.552 |
v7 | yolov5s5:albu[frcnn],imgsize=3600,bs=2,e11,CONF= 0.28, IOU= 0.40 [SAME as β] | epoch6.pt | 0.73871/0.552 |
v8 | yolov5s5:ammarnassanalhajali yolov5 | best.pt | ?/? |
v10 | yolov5s5:[BASE MODEL] imgsize=3600,bs=2,e11,CONF= 0.28, IOU= 0.40 | best.pt | 0.76../0.588 |
v12 | yolov5s5:[BASE MODEL] imgsize=3600,bs=2,e11,CONF= 0.15, IOU= 0.40 | best.pt | 0.76../0.580 |
v12 | yolov5s5:[BASE MODEL] imgsize=3600,bs=2,e11,CONF= 0.15, IOU= 0.40 | epoch7.pt | 0.76../? |
v15 | yolov5s6:[Good Moon Model] CONF= 0.28, IOU= 0.40, img size = 1980x2 | f2_sub2.pt | 0.76../? |
v16 | yolov5s6:[Good Moon Model] CONF= 0.28, IOU= 0.40, img size = 6400 | f2_sub2.pt | 0.76../? |
v16 | yolov5s6:[yolov5s6] imgsize=3600,bs=2,e11,CONF= 0.28, IOU= 0.40 | best.pt | 0.81../? |
v18 | yolov5s6:[Good Moon Model] CONF= 0.30, IOU= 0.50, img size = 6400 | f2_sub2.pt | 0.76../? |
v21 | yolov5s6: Vid based split, vid:2, CONF= 0.30, IOU= 0.50, inf img size = 6400, train 3584, 6th epoch | best f2 epoch | 0.89/0.620 |
v23 | yolov5s6: Vid based split, vid:1, CONF= 0.30, IOU= 0.50, inf img size = 6400, train 3584, 7th epoch | best f2 epoch | 0.72/0.610 |
v32 | yolov5s6:[Good Moon Model] CONF=0.30,IOU= 0.50, img size = 6400 | '../input/yolov5s6/yolov5s6_sub9.pt' | ?/0.680 |
v33 | yolov5m6: 3584img, adam,e6,bs2,vid:2 | last.pt | 0.87/0.558[p] |
v34 | yolov5m6: 3k img, e6,bs2, vid:2 | epoch5.pt | 0.869/0.558[p] |
v35 | yolov5s6 3584img, video_fold vid2, copypaste:0.5, e10 | epoch7.pt | 0.88/0.625[p] |
v36 | yolov5m6 resume training, 3584img, video_fold vid2, e11 | epoch9.pt | 0.88/0.623[p] |
Experiment log YOLOV5:
- yolov5m6try1-epoch5 = 0.600
- yolov5m6try1-epoch3 = 0.562
- yolov5s6-vid_id:0 = 0.555 [yolov5s6]
- Write an Optuna script for tuning the inference hyper-params of yolov5
- Add the remaining NBs and writeups.
- Turn the code into python scripts.
- Add model prediction video
- Don't participate in multiple competitions with overlapping timelines [one at a time]
- Try to explore more fields [NLP, audio, tabular]
- Use better ways to track experiments [WandB/Google Sheets]
- In the middle of the competition, a LB-score-boosting trick [increasing the inference image size to 3x, 4x ... 10x] got shared, and that changed the momentum of the competition. It suddenly turned into a GPU war: people with more resources and compute were getting high LB scores. This really affected me because I did not have any good compute except Kaggle and Colab, and I felt like giving up. But through that I learned two life lessons:
- keep patience
- either you go all the way or you don't go anywhere.
Afterward I understood that anyone can achieve anything if they have the patience to keep working irrespective of the outcome. As for the second point, I came to believe that if you stop a process in the middle, it gives you nothing but regret, because you started with a motivation, right? I used to look back at the posters I wrote during the competition, the timelines I made of different experimentation ideas, the progress I had made; these things motivated me to keep going all the way. So my advice is to create some posters/TODOs/learning blogs while you are in the process. They will keep you motivated throughout the journey, and when you feel like giving up, look at those posters/blogs and ask yourself: if I am going to give up after coming this far, why did I even start? Those blogs/posters will remind you of your reason for starting.