Investigate recent advances for next model backbone (2023/2024) #7
Found an excellent resource for model implementations at https://github.com/leondgarse/keras_cv_attention_models#recognition-models, which should accelerate trying out new models.

I've been catching up on the advances of the past two or so years. While ConvNeXtV2 is intriguing due to its model size, I think a better approach for this use case is to focus not only on model size but also on the pretraining. In particular, the DINO/DINOv2 and CLIP-related pretraining approaches look especially helpful because of their robustness to distribution shift: not only is the training material much closer to our target distribution, but the resulting models are generally much stronger.

To test this theory, I tried a DINOv2 fine-tune (new dense head only, plus gradual weight updates on the last 15 layers) and got excellent results, better than I had seen from my more half-hearted attempts with e.g. Inception and ResNet variants. The only challenge is that the smallest model available weighs in at a whopping 47.23 GFLOPs (vs. 0.72 GFLOPs for, say, EfficientNetV1-B0). Somewhat surprisingly, I was able to convert it to TensorFlow.js successfully, but it was slow, on the order of several seconds per image prediction. Still, a useful experiment to demonstrate the effectiveness of a stronger model.

The dataset has also grown somewhat, so it's not quite apples to apples, but notice the improvement in the DET and ROC curves.

I'm going to check out an EVA02-based model next, as it is CLIP-based, but it still weighs in at 4.72 GFLOPs, so in theory it will be a few times slower than the current model.
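Roughly, the fine-tune setup looked like this (a minimal PyTorch sketch using the torch.hub DINOv2 entry points; the model variant, head size, class count, and learning rates are illustrative, and "gradual weight change" is interpreted here as layer-wise learning rates over the last 15 transformer blocks, not necessarily the exact recipe used):

```python
import torch
import torch.nn as nn

# Load a DINOv2 backbone via torch.hub (ViT-L/14 shown: 24 blocks,
# 1024-d CLS embedding). The variant choice here is illustrative.
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")

# New dense head on top of the CLS embedding; 2 classes is illustrative.
head = nn.Linear(1024, 2)
model = nn.Sequential(backbone, head)

# Freeze the whole backbone, then re-enable the last 15 transformer blocks.
for p in backbone.parameters():
    p.requires_grad = False
for block in backbone.blocks[-15:]:
    for p in block.parameters():
        p.requires_grad = True

# "Gradual weight change": deeper blocks get progressively larger learning
# rates, so early layers barely move while the new head moves the fastest.
param_groups = [{"params": head.parameters(), "lr": 1e-3}]
for i, block in enumerate(backbone.blocks[-15:]):
    param_groups.append({"params": block.parameters(), "lr": 1e-5 * (i + 1)})
optimizer = torch.optim.AdamW(param_groups)
```

The layer-wise learning-rate ramp is one common way to realize gradual unfreezing; an alternative is to literally unfreeze blocks on a schedule over the course of training.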
GCViT XTiny: a somewhat bigger model, and it shows in the performance. SQRXR 136: [results image]
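For reference, pulling this backbone from keras_cv_attention_models is essentially a one-liner. A minimal sketch (input shape and class count are illustrative; my understanding is the library skips mismatched classifier weights when num_classes differs from the pretrained head):

```python
from keras_cv_attention_models import gcvit

# GCViT XTiny with ImageNet-pretrained weights; the classifier head is
# rebuilt for the task's class count (2 here is illustrative).
model = gcvit.GCViT_XTiny(
    input_shape=(224, 224, 3),
    num_classes=2,
    pretrained="imagenet",
)
model.summary()
```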
I've been trying this out ... and I'm not sure it's fast enough or good enough to become the next top model yet. I might need to keep searching.
facebookresearch/ConvNeXt-V2#3
https://github.com/edwardyehuang/iSeg/tree/master/backbones