PyTorch Image Classification

Simple image classification for a custom dataset based on PyTorch Lightning & timm. You can train a classification model by simply preparing directories of images.

*This single-file (train.py) repository was created for a friend with ease of use as the priority, so it may not be suitable for exhaustive experimentation. However, it provides the basic functionality and should be easy to use and modify.

Docker Environment

docker-compose build
docker-compose run --rm dev

ref. https://docs.docker.com/compose/gpu-support/

Or install directly with pip:

(*The libraries are installed directly into your environment.)

pip install -r docker/requirements.txt

For details, see docker-compose.yaml, Dockerfile, and requirements.txt.

Data Preparation

Custom Dataset

Dataset preparation is simple. Create one directory per class, named after the class, and store the corresponding images in it, as shown below. (The loader uses torchvision's ImageFolder class internally.)

{dataset name}/
├── train/
│   ├── {class1}/
│   ├── {class2}/
│   ├── ...
└── val/
    ├── {class1}/
    ├── {class2}/
    ├── ...
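
Because ImageFolder derives labels from the subdirectory names (sorted, then numbered from 0), it can help to check the layout before training. The following is a minimal, stdlib-only sketch of such a check; the helper names `class_to_idx` and `check_dataset_root` are hypothetical and are not part of train.py.

```python
import os


def class_to_idx(split_dir):
    """Map class directory names to indices the way ImageFolder does:
    sorted subdirectory names, numbered from 0."""
    classes = sorted(e.name for e in os.scandir(split_dir) if e.is_dir())
    return {name: i for i, name in enumerate(classes)}


def check_dataset_root(root):
    """Return the shared class mapping, or raise if train/ and val/ disagree."""
    train = class_to_idx(os.path.join(root, "train"))
    val = class_to_idx(os.path.join(root, "val"))
    if not train:
        raise ValueError(f"no class directories found in {root}/train")
    if train != val:
        # Symmetric difference: classes present in only one of the splits.
        diff = sorted(set(train) ^ set(val))
        raise ValueError(f"train/ and val/ class directories differ: {diff}")
    return train
```

Calling `check_dataset_root("cifar10")` on a correctly prepared dataset would return the class-to-index mapping. A class that appears in only one split would otherwise be assigned shifted label indices in that split.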

Sample Dataset

For reference, I have prepared a script to download torchvision datasets.

torchvision provides these datasets as Dataset classes, but since the purpose of this repository is to train on your own dataset, the script saves them as JPEG images so that they follow the same directory layout as a custom dataset.

python scripts/download_and_generate_jpeg_dataset.py -d cifar10
usage: download_and_generate_jpeg_dataset.py [-h] --dataset_name DATASET_NAME
                                             [--outdir OUTDIR]

Script for generating dataset.

optional arguments:
  -h, --help            show this help message and exit
  --dataset_name DATASET_NAME, -d DATASET_NAME
                        Dataset name to generate. (mnist or cifar10)
  --outdir OUTDIR, -o OUTDIR
                        Output directory. (default: dataset name)

The script produces the following directory structure (when outdir is not specified):

cifar10/
├── raw/
│   ├── cifar-10-batches-py
│   └── cifar-10-python.tar.gz
├── train/
│   ├── airplane/
│   ├── automobile/
│   ├── bird/
│   ├── cat/
│   ├── deer/
│   ├── dog/
│   ├── frog/
│   ├── horse/
│   ├── ship/
│   └── truck/
└── val/
    ├── airplane/
    ├── automobile/
    ├── bird/
    ├── cat/
    ├── deer/
    ├── dog/
    ├── frog/
    ├── horse/
    ├── ship/
    └── truck/
  • raw/: raw files downloaded by torchvision (the contents depend on the dataset)

Run

Training

The implementation is kept simple, with everything in a single file (train.py).

Specify the dataset root directory containing the train and val directories.

python train.py -d cifar10

Detailed settings can be given on the command line (code link):

You can use most of the models in timm by passing the model name directly to --model-name.

usage: train.py [-h] --dataset DATASET [--outdir OUTDIR]
                [--model-name MODEL_NAME] [--img-size IMG_SIZE]
                [--epochs EPOCHS] [--save-interval SAVE_INTERVAL]
                [--batch-size BATCH_SIZE] [--num-workers NUM_WORKERS]
                [--gpu-ids GPU_IDS [GPU_IDS ...] | --n-gpu N_GPU]
                [--seed SEED]

Train classifier.

optional arguments:
  -h, --help            show this help message and exit
  --dataset DATASET, -d DATASET
                        Root directory of dataset
  --outdir OUTDIR, -o OUTDIR
                        Output directory
  --model-name MODEL_NAME, -m MODEL_NAME
                        Model name (timm)
  --img-size IMG_SIZE, -i IMG_SIZE
                        Input size of image
  --epochs EPOCHS, -e EPOCHS
                        Number of training epochs
  --save-interval SAVE_INTERVAL, -s SAVE_INTERVAL
                        Save interval (epoch)
  --batch-size BATCH_SIZE, -b BATCH_SIZE
                        Batch size
  --num-workers NUM_WORKERS, -w NUM_WORKERS
                        Number of workers
  --gpu-ids GPU_IDS [GPU_IDS ...]
                        GPU IDs to use
  --n-gpu N_GPU         Number of GPUs
  --seed SEED           Seed

Solver settings (code link):

OPT = 'adam'  # adam, sgd
WEIGHT_DECAY = 0.0001
MOMENTUM = 0.9  # only when OPT is sgd
BASE_LR = 0.001
LR_SCHEDULER = 'step'  # step, multistep, reduce_on_plateau
LR_DECAY_RATE = 0.1
LR_STEP_SIZE = 5  # only when LR_SCHEDULER is step
LR_STEP_MILESTONES = [10, 15]  # only when LR_SCHEDULER is multistep
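
To make the scheduler settings concrete, the sketch below computes the learning rate per epoch in plain Python. It assumes that 'step' and 'multistep' correspond to PyTorch's StepLR and MultiStepLR with gamma set to LR_DECAY_RATE, which this README does not confirm; the function names `step_lr` and `multistep_lr` are hypothetical.

```python
from bisect import bisect_right

BASE_LR = 0.001
LR_DECAY_RATE = 0.1
LR_STEP_SIZE = 5
LR_STEP_MILESTONES = [10, 15]


def step_lr(epoch):
    """StepLR-style schedule: multiply by the decay rate every LR_STEP_SIZE epochs."""
    return BASE_LR * LR_DECAY_RATE ** (epoch // LR_STEP_SIZE)


def multistep_lr(epoch):
    """MultiStepLR-style schedule: multiply by the decay rate at each milestone reached."""
    return BASE_LR * LR_DECAY_RATE ** bisect_right(LR_STEP_MILESTONES, epoch)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 15):
        print(epoch, step_lr(epoch), multistep_lr(epoch))
```

Under these assumptions, the default 'step' schedule keeps 0.001 for epochs 0 to 4 and divides the rate by 10 every 5 epochs after that, whereas 'multistep' keeps 0.001 until epoch 10 and decays again at epoch 15.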

Transform settings (code link):

We use torchvision transforms because they are easy to use with the ImageFolder dataset.

        if is_train:
            self.transform = transforms.Compose([
                transforms.RandomHorizontalFlip(p=0.5),
                transforms.Resize(img_size),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
            ])
        else:
            self.transform = transforms.Compose([
                transforms.Resize(img_size),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
            ])
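
The mean and std values passed to Normalize are the commonly used ImageNet channel statistics. As a worked illustration of what ToTensor followed by Normalize does to a single RGB pixel, here is a small plain-Python sketch; `normalize_pixel` is a hypothetical helper for explanation only and is not used by train.py.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Scale 8-bit values to [0, 1] (as ToTensor does for 8-bit RGB images),
    then apply the per-channel (x - mean) / std step of Normalize."""
    return [((v / 255.0) - m) / s for v, m, s in zip(rgb, MEAN, STD)]


if __name__ == "__main__":
    print(normalize_pixel([0, 0, 0]))        # black pixel
    print(normalize_pixel([255, 255, 255]))  # white pixel
```

With these statistics, the normalized values fall roughly between -2.1 and 2.6 rather than between 0 and 1.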

TensorBoard logging

Training is logged with TensorBoard by default.

tensorboard --logdir ./results
