Getting Started


Please refer to Installation to set up the environment first, and prepare the flower102 dataset by following the instructions in the Quick Start.

1. Training and Evaluation on CPU or Single GPU

If training and evaluation are performed on a CPU or a single GPU, it is recommended to use tools/train.py and tools/eval.py. For training and evaluation in a multi-GPU environment on Linux, please refer to 2. Training and evaluation on Linux+GPU.

1.1 Model training

After preparing the configuration file, the training process can be started as follows.

python tools/train.py \
    -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o pretrained_model="" \
    -o use_gpu=False

Here, -c specifies the path of the configuration file and -o specifies the parameters to be modified or added. -o pretrained_model="" means no pre-trained model is used, and -o use_gpu=False means training runs on the CPU. If you want to use the GPU for training, set use_gpu to True.

Of course, you can also directly modify the configuration file to update the configuration. For specific configuration parameters, please refer to the Configuration Document.
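
For example, the same fields can be set in the YAML configuration file itself. The snippet below is only a sketch: the exact keys and layout depend on your PaddleClas version, but ARCHITECTURE.name, pretrained_model and use_gpu are the parameters referenced throughout this document.

# sketch of a PaddleClas YAML config (keys assumed from this document)
ARCHITECTURE:
    name: 'MobileNetV3_large_x1_0'   # model name
pretrained_model: ""                 # empty string: do not load pre-trained weights
use_gpu: False                       # set to True to train on GPU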

  • The output log examples are as follows:

    • If mixup or cutmix is used in training, top-1 and top-k (k is 5 by default) accuracy will not be printed in the training log:
    ...
    epoch:0  , train step:20   , loss: 4.53660, lr: 0.003750, batch_cost: 1.23101 s, reader_cost: 0.74311 s, ips: 25.99489 images/sec, eta: 0:12:43
    ...
    END epoch:1   valid top1: 0.01569, top5: 0.06863, loss: 4.61747,  batch_cost: 0.26155 s, reader_cost: 0.16952 s, batch_cost_sum: 10.72348 s, ips: 76.46772 images/sec.
    ...
    
    • If mixup or cutmix is not used during training, then in addition to the above information, top-1 and top-k (k is 5 by default) accuracy will also be printed in the log:
    ...
    epoch:0  , train step:30  , top1: 0.06250, top5: 0.09375, loss: 4.62766, lr: 0.003728, batch_cost: 0.64089 s, reader_cost: 0.18857 s, ips: 49.93080 images/sec, eta: 0:06:18
    ...
    END epoch:0   train top1: 0.01310, top5: 0.04738, loss: 4.65124,  batch_cost: 0.64089 s, reader_cost: 0.18857 s, batch_cost_sum: 13.45863 s, ips: 49.93080 images/sec.
    ...
    

During training, you can monitor the loss in real time through VisualDL; see VisualDL for details.
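
As a sketch (assuming the VisualDL logs of your run are written under ./output; replace --logdir with the actual log directory), start the VisualDL service and open the printed address in a browser:

# launch the VisualDL web service on port 8040
visualdl --logdir ./output --port 8040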

1.2 Model finetuning

After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.

python tools/train.py \
    -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o pretrained_model="./pretrained/MobileNetV3_large_x1_0_pretrained" \
    -o use_gpu=True

Here, -o pretrained_model sets the path from which the pretrained weights are loaded. Replace it with the path to your own pretrained weights, or modify the path directly in the configuration file.

We also provide many pre-trained models trained on the ImageNet-1k dataset. For the model list and download addresses, please refer to the model library overview.

1.3 Resume Training

If the training process is terminated for some reason, you can load the checkpoints to resume training.

python tools/train.py \
    -c configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o checkpoints="./output/MobileNetV3_large_x1_0/5/ppcls" \
    -o last_epoch=5 \
    -o use_gpu=True

The configuration file does not need to be modified. You only need to add the checkpoints parameter during training, which specifies the path of the checkpoints. The model weights, learning rate, optimizer state and other information will be loaded from this path.

Note:

  • The parameter -o last_epoch=5 records the last finished training epoch as 5, so this run starts counting epochs from 6. The parameter defaults to -1, which means epoch numbering starts from 0.

  • The -o checkpoints parameter does not need to include the checkpoint file suffix. The training command above generates checkpoints as shown below. To continue training from epoch 5, just set checkpoints to ./output/MobileNetV3_large_x1_0/5/ppcls, and PaddleClas will automatically append the pdopt and pdparams suffixes.

    output/
    └── MobileNetV3_large_x1_0
        ├── 0
        │   ├── ppcls.pdopt
        │   └── ppcls.pdparams
        ├── 1
        │   ├── ppcls.pdopt
        │   └── ppcls.pdparams
        .
        .
        .

1.4 Model evaluation

The model evaluation process can be started as follows.

python tools/eval.py \
    -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o pretrained_model="./output/MobileNetV3_large_x1_0/best_model/ppcls" \
    -o load_static_weights=False

The above command will use ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml as the configuration file to evaluate the model ./output/MobileNetV3_large_x1_0/best_model/ppcls. You can also configure the evaluation by changing the parameters in the configuration file, or update the configuration with the -o parameter, as shown above.

Some of the configurable evaluation parameters are described as follows:

  • ARCHITECTURE.name: Model name
  • pretrained_model: The path of the model file to be evaluated
  • load_static_weights: Whether the model to be evaluated is a static graph model

Note: If the model is a dynamic graph (dygraph) model, you only need to specify the prefix of the model file when loading it, not the suffix, as in 1.3 Resume Training.
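
For example, all three parameters above can be overridden on the command line with -o instead of editing the configuration file. A sketch reusing the paths from the command above:

python tools/eval.py \
    -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o ARCHITECTURE.name="MobileNetV3_large_x1_0" \
    -o pretrained_model="./output/MobileNetV3_large_x1_0/best_model/ppcls" \
    -o load_static_weights=False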

2. Training and evaluation on Linux+GPU

If you want to run PaddleClas on Linux with GPUs, it is highly recommended to use paddle.distributed.launch to start the training script (tools/train.py) and the evaluation script (tools/eval.py), which makes it easier to run in a multi-GPU environment.

2.1 Model training

After preparing the configuration file, the training process can be started as follows. paddle.distributed.launch specifies which GPU cards to use through selected_gpus:

export CUDA_VISIBLE_DEVICES=0,1,2,3

python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml

The configuration can be updated by adding the -o parameter.

python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
        -o pretrained_model="" \
        -o use_gpu=True

The format of the output log is the same as above; see 1.1 Model training for details.

2.2 Model finetuning

After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.

export CUDA_VISIBLE_DEVICES=0,1,2,3

python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
        -o pretrained_model="./pretrained/MobileNetV3_large_x1_0_pretrained"

Here, pretrained_model sets the path from which the pretrained weights are loaded. Replace it with the path to your own pretrained weights, or modify the path directly in the configuration file.

The Quick Start contains many examples of model finetuning. You can refer to that tutorial to finetune the model on a specific dataset.

2.3 Resume Training

If the training process is terminated for some reason, you can load the checkpoints to resume training.

export CUDA_VISIBLE_DEVICES=0,1,2,3

python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
        -o checkpoints="./output/MobileNetV3_large_x1_0/5/ppcls" \
        -o last_epoch=5 \
        -o use_gpu=True

The configuration file does not need to be modified. You only need to add the checkpoints parameter during training, which specifies the path of the checkpoints. The model weights, learning rate, optimizer state and other information will be loaded from this path. For the last_epoch parameter, please refer to 1.3 Resume Training for details.

2.4 Model evaluation

The model evaluation process can be started as follows.

python tools/eval.py \
    -c ./configs/quick_start/MobileNetV3_large_x1_0_finetune.yaml \
    -o pretrained_model="./output/MobileNetV3_large_x1_0/best_model/ppcls" \
    -o load_static_weights=False

For parameter descriptions, see 1.4 Model evaluation.

3. Use the pre-trained model to predict

After training is completed, you can run prediction with the trained model, as follows:

python tools/infer/infer.py \
    -i image path \
    --model MobileNetV3_large_x1_0 \
    --pretrained_model "./output/MobileNetV3_large_x1_0/best_model/ppcls" \
    --use_gpu True \
    --load_static_weights False

Among them:

  • image_file (-i): The path of the image file to be predicted, such as ./test.jpeg;
  • model: Model name, such as MobileNetV3_large_x1_0;
  • pretrained_model: Weight file path, such as ./pretrained/MobileNetV3_large_x1_0_pretrained/;
  • use_gpu: Whether to use the GPU, defaults to True;
  • load_static_weights: Whether the pre-trained model to be loaded was obtained from static-graph training, defaults to False;
  • resize_short: The length to which the shorter side of the image is proportionally scaled, defaults to 256;
  • resize: The side length of the square center-cropped from the resized image, defaults to 224;
  • pre_label_image: Whether to pre-label the image data, defaults to False;
  • pre_label_out_idr: The output path of the pre-labeled image data. When pre_label_image=True, subfolders will be generated under this path; each subfolder represents a category and stores all the images that the model predicts to belong to that category.

Note: If you want to use Transformer series models, such as DeiT_***_384, ViT_***_384, etc., please pay attention to the input size of the model and set resize_short=384, resize=384.
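
For example, a sketch for a 384-input Transformer model (the model name placeholder follows the document's own convention and must be replaced with a real model name; the flag spelling is assumed to match the parameter list above):

python tools/infer/infer.py \
    -i image path \
    --model ViT_***_384 \
    --pretrained_model "your ViT pretrained model path" \
    --resize_short 384 \
    --resize 384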

For more detailed information, please refer to infer.py.

4. Use the inference model to predict

PaddlePaddle supports inference using its prediction engine, which is introduced next.

First, export the inference model using tools/export_model.py.

python tools/export_model.py \
    --model MobileNetV3_large_x1_0 \
    --pretrained_model ./output/MobileNetV3_large_x1_0/best_model/ppcls \
    --output_path ./inference \
    --class_dim 1000

Here, --model specifies the model name, --pretrained_model specifies the model file path (the path does not need to include the file suffix), --output_path specifies the storage path of the converted model, and --class_dim is the number of classes of the model, which defaults to 1000.

Note:

  1. If --output_path=./inference, three files will be generated in the folder inference: inference.pdmodel, inference.pdiparams and inference.pdiparams.info.
  2. You can specify the shape of the model's input image with the parameter --img_size; the default is 224, meaning the input image shape is 224x224. If you want to use Transformer series models, such as DeiT_***_384, ViT_***_384, you need to set --img_size=384 (a sketch follows this list).
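
For example, a sketch of exporting a 384-input Transformer model (replace the placeholder model name and weight path with real ones):

python tools/export_model.py \
    --model ViT_***_384 \
    --pretrained_model "your ViT pretrained model path" \
    --output_path ./inference \
    --img_size=384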

The above command will generate the model structure file (inference.pdmodel) and the model weight file (inference.pdiparams), and then the inference engine can be used for inference:

python tools/infer/predict.py \
    --image_file image path \
    --model_file "./inference/inference.pdmodel" \
    --params_file "./inference/inference.pdiparams" \
    --use_gpu=True \
    --use_tensorrt=False

Among them:

  • image_file: The path of the image file to be predicted, such as ./test.jpeg;
  • model_file: Model file path, such as ./MobileNetV3_large_x1_0/inference.pdmodel;
  • params_file: Weight file path, such as ./MobileNetV3_large_x1_0/inference.pdiparams;
  • use_tensorrt: Whether to use TensorRT, defaults to True;
  • use_gpu: Whether to use the GPU, defaults to True;
  • enable_mkldnn: Whether to use MKL-DNN, defaults to False. When both use_gpu and enable_mkldnn are set to True, the GPU is used and enable_mkldnn is ignored;
  • resize_short: The length to which the shorter side of the image is proportionally scaled, defaults to 256;
  • resize: The side length of the square center-cropped from the resized image, defaults to 224;
  • enable_calc_topk: Whether to calculate the top-k accuracy of the prediction, defaults to False. Top-k accuracy is printed when set to True;
  • gt_label_path: The file containing image names and ground-truth labels, used when enable_calc_topk is True to obtain the image list and labels (a sketch of this file follows the list).
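
A sketch of such a label file (the exact format is an assumption; check what your PaddleClas version expects): each line holds an image name and its integer label, separated by a space.

images/0001.jpeg 0
images/0002.jpeg 34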

Note: If you want to use Transformer series models, such as DeiT_***_384, ViT_***_384, etc., please pay attention to the input size of the model and set resize_short=384, resize=384.

If you want to evaluate the inference speed of the model, it is recommended to use predict.py with TensorRT enabled for acceleration.
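
A sketch reusing the exported model from above with TensorRT turned on (this assumes your PaddlePaddle build includes TensorRT support):

python tools/infer/predict.py \
    --image_file image path \
    --model_file "./inference/inference.pdmodel" \
    --params_file "./inference/inference.pdiparams" \
    --use_gpu=True \
    --use_tensorrt=True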