[Feature] Support CPU training #7016

AronLin · 2022-01-17T10:03:35Z

This PR adds support for CPU training.

By default the model is placed on a CUDA device. This PR changes that behavior so that the model is placed on the CPU when no CUDA devices are available.
This PR is based on open-mmlab/mmcv#1621 and should be merged after it.

With this change, we can use the CPU to train and debug models, and to test models with a batch size >= 2. Before running the program, run `export CUDA_VISIBLE_DEVICES=-1` to hide the GPUs.
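As a rough illustration of the fallback described above (not the code from this PR), the sketch below hides the GPUs through `CUDA_VISIBLE_DEVICES` and then picks a device string. `pick_device` is a hypothetical helper introduced only for this example, and treating a missing PyTorch install as "no CUDA" is an assumption of the sketch.

```python
import os

# Hide all GPUs from CUDA-aware libraries. Set this before the framework
# initializes CUDA, i.e. before importing it in this process.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def pick_device(cuda_available: bool) -> str:
    """Hypothetical helper: fall back to CPU when no CUDA device is visible."""
    return "cuda" if cuda_available else "cpu"


try:
    import torch

    cuda_available = torch.cuda.is_available()
except ImportError:
    # Assumption for this sketch: without PyTorch, treat CUDA as unavailable.
    cuda_available = False

device = pick_device(cuda_available)
print(f"training on: {device}")
```

Setting the variable from Python is equivalent to the `export` shown above only if it happens before CUDA is initialized; exporting it in the shell avoids that ordering concern.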

codecov · 2022-01-17T10:54:15Z

Codecov Report

Merging #7016 (a7d7a1d) into dev (ff9bc39) will increase coverage by 0.03%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##              dev    #7016      +/-   ##
==========================================
+ Coverage   62.35%   62.39%   +0.03%     
==========================================
  Files         327      329       +2     
  Lines       26129    26176      +47     
  Branches     4424     4432       +8     
==========================================
+ Hits        16293    16332      +39     
- Misses       8969     8975       +6     
- Partials      867      869       +2

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `62.37% <0.00%> (+0.03%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| mmdet/apis/train.py | `15.66% <0.00%> (ø)` | |
| mmdet/datasets/pipelines/formating.py | `0.00% <0.00%> (-67.77%)` | ⬇️ |
| mmdet/core/bbox/samplers/random_sampler.py | `75.00% <0.00%> (-5.56%)` | ⬇️ |
| mmdet/core/visualization/image.py | `79.20% <0.00%> (-0.80%)` | ⬇️ |
| mmdet/apis/test.py | `13.39% <0.00%> (-0.13%)` | ⬇️ |
| mmdet/apis/inference.py | `38.53% <0.00%> (ø)` | |
| mmdet/datasets/deepfashion.py | `100.00% <0.00%> (ø)` | |
| mmdet/core/visualization/__init__.py | `100.00% <0.00%> (ø)` | |
| mmdet/datasets/pipelines/formatting.py | `67.76% <0.00%> (ø)` | |
| mmdet/core/visualization/palette.py | `93.75% <0.00%> (ø)` | |
... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ff9bc39...a7d7a1d.

tools/train.py

tools/test.py

AronLin · 2022-01-17T13:40:20Z

The fix for the non-distributed multi-GPU training bug has been moved to #7019.

docs/en/1_exist_data_model.md

docs/zh_cn/1_exist_data_model.md

gaotongxiao · 2022-01-26T01:45:31Z

docs/en/1_exist_data_model.md

+**Note**:
+
+We do not recommend users to use CPU for training because it is too slow. We support this feature to allow users to debug on machines without GPU for convenience.


Should have used the "note" wrapper

```{note} ```

EasonQYS · 2022-01-29T06:42:14Z

[question] If I want to use mmdet/apis/train.py, how should I understand `device_ids=cfg.gpu_ids`?
What should cfg.gpu_ids be equal to?
As far as I know, cfg.gpu_ids should be a sequence of non-negative integers such as [0,] or [0,1,]; it does not accept [-1,] or -1.

* Modify docs * Support CPU training * Modify docs * Modify Chinese docs

AronLin requested review from ZwwWayne and BIGWangYuDong January 17, 2022 10:03

BIGWangYuDong reviewed Jan 17, 2022

tools/train.py (outdated, resolved)

tools/test.py (outdated, resolved)

BIGWangYuDong requested a review from hhaAndroid January 17, 2022 11:06

Modify docs

c5ecbb2

AronLin force-pushed the cpu branch from 1395c07 to c5ecbb2 Compare January 17, 2022 13:18

Support CPU training

27a8077

ZwwWayne reviewed Jan 19, 2022

docs/en/1_exist_data_model.md (outdated, resolved)

Modify docs

7a65e56

AronLin requested review from ZwwWayne and BIGWangYuDong January 19, 2022 12:34

ZwwWayne reviewed Jan 19, 2022

docs/zh_cn/1_exist_data_model.md (outdated, resolved)

Modify Chinese docs

a7d7a1d

AronLin requested a review from ZwwWayne January 19, 2022 12:41

ZwwWayne merged commit 794a87c into open-mmlab:dev Jan 19, 2022

fangyixiao18 mentioned this pull request Jan 25, 2022

[Feature] support cpu training open-mmlab/mmselfsup#188

Merged

6 tasks

HIT-cwh mentioned this pull request Jan 25, 2022

[Feature] Support CPU training open-mmlab/mmrazor#62

Merged

gaotongxiao reviewed Jan 26, 2022

gaotongxiao mentioned this pull request Jan 26, 2022

[Feature] Support CPU training/testing open-mmlab/mmocr#752

Merged

GT9505 mentioned this pull request Jan 26, 2022

Support CPU training open-mmlab/mmtracking#404

Merged

ly015 mentioned this pull request Jan 26, 2022

[Feature] Support CPU train/test open-mmlab/mmpose#1157

Merged

6 tasks

mzr1996 mentioned this pull request Jan 26, 2022

[Enhance] New-style CPU training and inference. open-mmlab/mmpretrain#674

Merged

6 tasks

plyfager mentioned this pull request Jan 26, 2022

[Feature] Support CPU training open-mmlab/mmgeneration#238

Merged

MengzhangLI mentioned this pull request Jan 27, 2022

[Enhance] New-style CPU training and inference. open-mmlab/mmsegmentation#1251

Merged

ckkelvinchan mentioned this pull request Jan 27, 2022

[Feature] Support CPU training open-mmlab/mmagic#720

Merged

linyq17 mentioned this pull request Jan 27, 2022

[Enhance] Formatting non distributed training and inference and Supporting CPU training. open-mmlab/mmfewshot#42

Merged

MeowZheng mentioned this pull request Jan 28, 2022

[Enhancement]Support CPU Train/Inference open-mmlab/mmflow#86

Merged

shinya7y mentioned this pull request Jan 29, 2022

[feature]增加cpu可以支持的运行 add a **cpu version** of training-api #7097

Closed

AronLin deleted the cpu branch February 14, 2022 21:29

chhluo pushed a commit to chhluo/mmdetection that referenced this pull request Feb 21, 2022

[Feature] Support CPU training (open-mmlab#7016)

974a6d5

* Modify docs * Support CPU training * Modify docs * Modify Chinese docs

hhaAndroid mentioned this pull request Mar 9, 2022

I want to debug my code for train in cpu.What can I should do? #7346

Closed

RangiLyu mentioned this pull request Apr 22, 2022

feat(mlu): Support PyTorch backend on MLU. #7578

Merged

ZwwWayne pushed a commit that referenced this pull request Jul 18, 2022

[Feature] Support CPU training (#7016)

1cc147f

* Modify docs * Support CPU training * Modify docs * Modify Chinese docs

ZwwWayne pushed a commit to ZwwWayne/mmdetection that referenced this pull request Jul 19, 2022

[Feature] Support CPU training (open-mmlab#7016)

f88e484

* Modify docs * Support CPU training * Modify docs * Modify Chinese docs
