We provide ImageNet-1K training commands here. Please check INSTALL.md for installation instructions first.

Taking MogaNet-T as an example, you can use the following command to run this experiment on a single machine (8 GPUs):
```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_tiny --input_size 224 --drop_path 0.1 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.04 \
--aa rand-m7-mstd0.5-inc1 --crop_pct 0.9 --mixup 0.1 \
--amp --native_amp \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
- Here, the effective batch size = `--nproc_per_node` * `--batch_size`. In the example above, the effective batch size is 8 * 128 = 1024. Running on one machine, you can reduce `--batch_size` and use `--amp` to avoid OOM issues while keeping the total batch size unchanged.
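The arithmetic above can be sketched as follows. The helper `per_gpu_batch` is purely illustrative (it is not part of `train.py`); it shows how to pick a `--batch_size` that preserves the effective batch size of 1024 for a given GPU count:

```python
# Effective batch size = nproc_per_node * per-GPU batch size (single machine).
def per_gpu_batch(total_batch, nproc):
    """Per-GPU batch size that keeps the effective batch size at total_batch."""
    assert total_batch % nproc == 0, "total batch must divide evenly across GPUs"
    return total_batch // nproc

total = 8 * 128                   # 1024, as in the command above
print(per_gpu_batch(total, 8))    # 128 per GPU on 8 GPUs
print(per_gpu_batch(total, 4))    # 256 per GPU if only 4 GPUs are available
```

Note that per-GPU memory, not the effective batch size, is what `--amp` and a smaller `--batch_size` help with; gradient accumulation or a learning-rate adjustment may be needed if the effective batch size must change.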
To train other MogaNet variants, change `--model` and `--drop_path` accordingly. Single-machine commands for each variant are given below:
### MogaNet-XT

Single-machine (8 GPUs) with the input size of 224:

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_xtiny --input_size 224 --drop_path 0.05 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.03 \
--aa rand-m7-mstd0.5-inc1 --crop_pct 0.9 --mixup 0.1 \
--amp --native_amp \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
### MogaNet-Tiny

Single-machine (8 GPUs) with the input size of 224:

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_tiny --input_size 224 --drop_path 0.1 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.04 \
--aa rand-m7-mstd0.5-inc1 --crop_pct 0.9 --mixup 0.1 \
--amp --native_amp \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
Single-machine (8 GPUs) with the input size of 256:

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_tiny --input_size 256 --drop_path 0.1 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.04 \
--aa rand-m7-mstd0.5-inc1 --crop_pct 0.9 --mixup 0.1 \
--amp --native_amp \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
### MogaNet-Small

Single-machine (8 GPUs) with the input size of 224 and EMA (you can evaluate it without EMA):

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_small --input_size 224 --drop_path 0.1 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.05 \
--crop_pct 0.9 \
--model_ema --model_ema_decay 0.9999 \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
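As a rough sketch of what `--model_ema --model_ema_decay 0.9999` does: after every optimizer step, a shadow copy of the weights is updated as `ema = decay * ema + (1 - decay) * param`. The snippet below is illustrative only (the actual script uses a timm-style EMA helper over model tensors, not Python floats):

```python
# Illustrative EMA update with decay 0.9999, as used by --model_ema_decay.
def ema_update(ema_params, params, decay=0.9999):
    """One EMA step over a list of parameters (here, plain floats)."""
    return [decay * e + (1 - decay) * p for e, p in zip(ema_params, params)]

ema = [0.0]
for _ in range(3):               # three optimizer steps, weight fixed at 1.0
    ema = ema_update(ema, [1.0])
print(ema[0])                    # 1 - 0.9999**3, i.e. the EMA moves very slowly
```

With a decay this close to 1, the EMA weights average over thousands of recent steps, which is why the EMA checkpoint is typically evaluated separately from the raw weights.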
### MogaNet-Base

Single-machine (8 GPUs) with the input size of 224 and EMA:

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_base --input_size 224 --drop_path 0.2 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.05 \
--crop_pct 0.9 \
--model_ema --model_ema_decay 0.9999 \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```
### MogaNet-Large

Single-machine (8 GPUs) with the input size of 224 and EMA:

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
--model moganet_large --input_size 224 --drop_path 0.3 \
--epochs 300 --batch_size 128 --lr 1e-3 --weight_decay 0.05 \
--crop_pct 0.9 \
--model_ema --model_ema_decay 0.9999 \
--data_dir /path/to/imagenet-1k \
--experiment /path/to/save_results
```