Image classification is a fundamental task in computer vision with many practical applications. However, achieving high accuracy on datasets such as CIFAR-10 and CIFAR-100 can be challenging, particularly with limited computational resources. In this study, we evaluate Multi-Task Learning (MTL) on two related image classification tasks: fine-grained class and super-class classification. Specifically, we investigate the efficiency of MTL with the Vision Transformer (ViT) architecture on these datasets, comparing it against the Single-Task Learning (STL) approach and traditional convolution-based models. Our results show that MTL with the ViT architecture outperforms the STL approach on both classification tasks, achieving higher accuracy with fewer parameters. We also find that ViT and ResNet-152 perform similarly on these tasks, highlighting the potential of ViT for MTL scenarios. These findings have important implications for building efficient and effective image classification models, particularly when multiple classification tasks must be performed simultaneously.
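The hard-parameter-sharing setup described above can be sketched as a single shared backbone feeding two task-specific heads, one per label granularity. The sketch below is illustrative only, not the repository's actual model: it uses a toy linear trunk in numpy in place of the ViT encoder, and the hidden/input sizes are made-up assumptions. The fine/coarse head widths (100 classes, 20 super-classes) follow CIFAR-100's label structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hard parameter sharing: one shared trunk (here a single linear layer
# standing in for the ViT encoder) feeds two task-specific heads.
D_IN, D_HID = 32, 16                  # toy sizes (assumptions, not from the repo)
N_CLASSES, N_SUPERCLASSES = 100, 20   # CIFAR-100 fine / coarse label counts

W_shared = rng.normal(size=(D_IN, D_HID))            # shared parameters
W_class = rng.normal(size=(D_HID, N_CLASSES))        # class head
W_super = rng.normal(size=(D_HID, N_SUPERCLASSES))   # super-class head

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    """Shared representation, then one prediction per task."""
    h = np.maximum(x @ W_shared, 0.0)  # shared trunk output (ReLU)
    return softmax(h @ W_class), softmax(h @ W_super)

x = rng.normal(size=(4, D_IN))  # a batch of 4 flattened "images"
p_class, p_super = forward(x)
print(p_class.shape, p_super.shape)  # (4, 100) (4, 20)
```

In this scheme both task losses backpropagate through `W_shared`, which is what makes the parameter count smaller than training two separate single-task models.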
Multitask-Learning (hard-parameter sharing) with Vision Transformers on Cifar10 & Cifar100
mnguyen0226/multitask_learning_vit