# Fine-tune with a Custom Dataset

This document introduces the process of fine-tuning a pre-trained model from MindCV on a custom dataset, together with the implementation of fine-tuning techniques such as reading the dataset online, setting learning rates for specific layers, and freezing part of the parameters. The main code is in ./examples/finetune/finetune.py; you can modify it based on this tutorial as needed.

Next, we will use the FGVC-Aircraft dataset as an example to show how to fine-tune the pre-trained MobileNetV3-Small model. [Fine-Grained Visual Classification of Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) is a commonly used fine-grained image classification benchmark dataset, which contains 10,000 aircraft images from 100 different aircraft types (a.k.a. variants), i.e., 100 images per type.

Referring to [Stanford University CS231n](https://cs231n.github.io/transfer-learning/#tf), **fine-tuning all the parameters**, **freezing the feature network**, and **setting learning rates for specific layers** are commonly used fine-tuning skills. The first uses the pre-trained weights to initialize the parameters of the target model and then updates all of them on the new dataset; it is usually time-consuming but reaches high accuracy. Freezing the feature network comes in two forms. Freezing all of it (linear probing) uses the pre-trained model as a pure feature extractor and updates only the parameters of the fully connected layer, which takes little time but yields lower accuracy. Freezing part of it typically freezes the shallow layers, which learn only basic image features, and updates the deep layers together with the fully connected layer. Setting learning rates for specific layers is similar but more fine-grained: it specifies the learning rate each chosen layer uses during training. The last two techniques are sketched in the code below.
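
The snippet below sketches freezing the feature network and setting per-layer learning rates with MindSpore and MindCV. It is a minimal illustration rather than the exact finetune.py implementation: the model name `mobilenet_v3_small_100` and the assumption that the classification head's parameter names contain `classifier` may differ in your MindCV version.

```python
import mindspore.nn as nn
from mindcv.models import create_model

# Both helpers assume the MindCV model name "mobilenet_v3_small_100" and that
# the classification head's parameter names contain "classifier"; adjust both
# for your model and MindCV version.

def build_linear_probe(num_classes=100):
    """Freeze the whole feature network and update only the classifier."""
    net = create_model("mobilenet_v3_small_100", num_classes=num_classes, pretrained=True)
    for p in net.get_parameters():
        if "classifier" not in p.name:
            p.requires_grad = False  # frozen params drop out of trainable_params()
    opt = nn.Momentum(net.trainable_params(), learning_rate=1e-3, momentum=0.9)
    return net, opt

def build_grouped_lr(num_classes=100):
    """Set learning rates for specific layers via optimizer parameter groups."""
    net = create_model("mobilenet_v3_small_100", num_classes=num_classes, pretrained=True)
    backbone = [p for p in net.trainable_params() if "classifier" not in p.name]
    head = [p for p in net.trainable_params() if "classifier" in p.name]
    # The pre-trained backbone gets a small lr; the re-initialized head a larger one.
    opt = nn.Momentum([{"params": backbone, "lr": 1e-4},
                       {"params": head, "lr": 1e-3}],
                      learning_rate=1e-4, momentum=0.9)
    return net, opt
```
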
For the hyper-parameters used in fine-tuning, you can refer to the configuration files in ./configs that were used for pre-training on the ImageNet-1k dataset. Note the following for fine-tuning:

- <font color=DarkRed>Set the hyper-parameter `pretrained` to `True`</font> to automatically download and load the pre-trained weights. Because `num_classes` is not the default 1000, the parameters of the classification layer are automatically removed during loading. If you want to load a checkpoint file from a local directory instead, set `pretrained` to `False`, set `ckpt_path`, and manually delete the classifier parameters beforehand (see the sketch after the command below).
- <font color=DarkRed>Set `num_classes` to the number of labels</font> of the custom dataset (e.g. 100 for the Aircraft dataset here).
- <font color=DarkRed>Reduce `batch_size` and `epoch_size`</font> according to the size of the custom dataset.
- <font color=DarkRed>Reduce the learning rate `lr`</font>: the pre-trained weights already contain a lot of information for identifying images, and a large learning rate would destroy too much of it. It is recommended to start from at most one tenth of the pre-training learning rate, or 0.0001, and adjust from there.

These parameters can be modified in the configuration file or appended to the shell command as shown below. The training results can be viewed in the file ./ckpt/results.txt.

```bash
python ./examples/finetune/finetune.py --config=./configs/mobilenetv3/mobilenet_v3_small_ascend.yaml --data_dir=./aircraft/data --num_classes=100 --pretrained=True ...
```
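
If you take the local-checkpoint route mentioned above, one way to delete the classifier parameters before loading is sketched below; the checkpoint path and the `classifier` key names are assumptions for this sketch, so inspect your checkpoint's keys first.

```python
import mindspore as ms
from mindcv.models import create_model

# Build the network with the new class count; pretrained=False avoids downloading.
network = create_model("mobilenet_v3_small_100", num_classes=100, pretrained=False)

# Load the local checkpoint and drop the classifier weights, whose shapes no
# longer match the 100-class head ("./ckpt/mobilenet_v3_small.ckpt" and the
# "classifier" naming are assumptions for this sketch).
param_dict = ms.load_checkpoint("./ckpt/mobilenet_v3_small.ckpt")
param_dict = {k: v for k, v in param_dict.items() if "classifier" not in k}
ms.load_param_into_net(network, param_dict)
```
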
When fine-tuning MobileNetV3-Small on the Aircraft dataset, this tutorial mainly made the following changes to the hyper-parameters: