Welcome to Vision-KAN! We are exploring the exciting possibility of replacing the MLP in Vision Transformers with KAN (Kolmogorov-Arnold Networks). Due to GPU resource constraints, this project may experience delays, but we'll post any new developments here! 📅✨
To install the package, simply run:

```bash
pip install VisionKAN
```
Here's a quick example to get you started:

```python
from VisionKAN import create_model, train_one_epoch, evaluate

# Build a DeiT-tiny backbone whose MLP blocks are replaced by KAN layers
KAN_model = create_model(
    model_name='deit_tiny_patch16_224_KAN',
    pretrained=False,      # train from scratch
    hdim_kan=192,          # hidden dimension of the KAN layers
    num_classes=100,       # e.g. CIFAR-100
    drop_rate=0.0,         # dropout rate
    drop_path_rate=0.05,   # stochastic depth rate
    img_size=224,
    batch_size=144,
)
```
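Once the model is built, a quick forward pass is an easy sanity check. The sketch below assumes `KAN_model` behaves like a standard PyTorch `nn.Module` (as DeiT models do); `train_one_epoch` and `evaluate` presumably mirror DeiT's training-engine helpers.

```python
import torch

# Smoke test (assumption: the model is a standard torch.nn.Module).
# Shapes follow img_size=224 and num_classes=100 from the example above.
dummy = torch.randn(1, 3, 224, 224)  # one random RGB image
with torch.no_grad():
    logits = KAN_model(dummy)
print(logits.shape)  # expected: torch.Size([1, 100])
```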
Dataset | MLP Hidden Dim | Model | Date | Epochs | Top-1 (%) | Top-5 (%) | Checkpoint |
---|---|---|---|---|---|---|---|
ImageNet-1k | 768 | DeiT-tiny (baseline) | - | 300 | 72.2 | 91.1 | - |
CIFAR-100 | 192 | DeiT-tiny (baseline) | 2024.5.25 | 300 (stopped) | 84.94 | 96.53 | Checkpoint |
CIFAR-100 | 384 | DeiT-small (baseline) | 2024.5.25 | 300 (stopped) | 86.49 | 96.17 | Checkpoint |
CIFAR-100 | 768 | DeiT-base (baseline) | 2024.5.25 | 300 (stopped) | 86.54 | 96.16 | Checkpoint |
Dataset | KAN Hidden Dim | Model | Date | Epochs | Top-1 (%) | Top-5 (%) | Checkpoint |
---|---|---|---|---|---|---|---|
ImageNet-1k | 20 | Vision-KAN | 2024.5.16 | 37 (stopped) | 36.34 | 61.48 | - |
ImageNet-1k | 192 | Vision-KAN | 2024.5.25 | 346 (stopped) | 64.87 | 86.14 | Checkpoint |
ImageNet-1k | 768 | Vision-KAN | 2024.6.2 | 154 (training) | 62.90 | 85.03 | - |
CIFAR-100 | 192 | Vision-KAN | 2024.5.25 | 300 (stopped) | 73.17 | 93.307 | Checkpoint |
CIFAR-100 | 384 | Vision-KAN | 2024.5.25 | 300 (stopped) | 78.69 | 94.73 | Checkpoint |
CIFAR-100 | 768 | Vision-KAN | 2024.5.29 | 300 (stopped) | 79.82 | 95.42 | Checkpoint |
- 5.7.2024: Released the current Vision-KAN code! 🚀 We used efficient KAN to replace the MLP layer in the Transformer block (see the sketch after this changelog) and are pre-training the Tiny model on ImageNet-1k. Updates will be reflected in the tables above.
- 5.14.2024: The model is starting to converge! We're using [192, 20, 192] for the input, hidden, and output dimensions.
- 5.15.2024: Switched from efficient KAN to faster KAN, doubling the training speed! 🚀
- 5.16.2024: Convergence appears to be bottlenecked; considering raising the KAN hidden dimension from 20 to 192.
- 5.22.2024: Fixed the timm version dependency issue and cleaned up the code! 🧹
- 5.24.2024: The loss decline is slowing; we are nearing final results! 🔍
- 5.25.2024: The model with a KAN hidden dimension of 192 is approaching convergence! 🎉 Released the best checkpoint of Vision-KAN.
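To make the replacement concrete, here is an illustrative sketch (not the repository's exact code) of swapping a Transformer block's MLP for a KAN. It assumes the efficient-KAN package's `KAN` class, whose constructor takes a list of layer widths; the [192, 20, 192] widths match the configuration noted in the changelog.

```python
import torch
import torch.nn as nn
from efficient_kan import KAN  # assumed import path for efficient KAN

class KANMixer(nn.Module):
    """Illustrative drop-in replacement for a Transformer block's MLP."""

    def __init__(self, dim=192, hidden_dim=20):
        super().__init__()
        # Layer widths [in, hidden, out] mirror the [192, 20, 192] setup
        self.kan = KAN([dim, hidden_dim, dim])

    def forward(self, x):
        # KAN layers operate on 2D input, so flatten (batch, tokens) first
        B, N, C = x.shape
        return self.kan(x.reshape(B * N, C)).reshape(B, N, C)

# Usage: token embeddings at DeiT-tiny size (embedding dim 192)
tokens = torch.randn(2, 197, 192)  # (batch, 196 patches + CLS token, dim)
out = KANMixer()(tokens)           # output shape matches input: (2, 197, 192)
```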
We used DeiT as the baseline for Vision-KAN development. Huge thanks to Meta and MIT for their incredible work! 🙌
If you use our work, please cite:

```bibtex
@misc{VisionKAN2024,
  author       = {Ziwen Chen and Gundavarapu and WU DI},
  title        = {Vision-KAN: Exploring the Possibility of KAN Replacing MLP in Vision Transformer},
  year         = {2024},
  howpublished = {\url{https://github.com/chenziwenhaoshuai/Vision-KAN.git}},
}
```