auto set cpu env when mkldnn or mklml enabled #5671

Merged: 5 commits merged into PaddlePaddle:develop from tensor-tang:autocpu, Nov 16, 2017

Conversation

@tensor-tang (Contributor) commented Nov 15, 2017

related #5280

When MKL-DNN or MKLML is enabled:
auto set OMP_DYNAMIC and KMP_AFFINITY according to the hyper-threading (HT) status;
auto set OMP_NUM_THREADS and MKL_NUM_THREADS according to the total number of available processors and trainer_count.

  • V1 API
  • V2 API
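
A rough sketch of the computation this PR automates (illustrative only: the real implementation lives in the V1 and V2 API code paths, applies values only when the user has not already exported them, and the exact mapping of HT status to affinity values is inferred from the diff excerpts quoted below):

import multiprocessing

def compute_cpu_env(trainer_count=1, ht_enabled=False):
    # Affinity and dynamic-thread policy follow the hyper-threading status.
    if ht_enabled:
        env = {"OMP_DYNAMIC": "true",
               "KMP_AFFINITY": "granularity=fine,compact,1,0"}
    else:
        env = {"OMP_DYNAMIC": "FALSE",
               "KMP_AFFINITY": "granularity=fine,compact,0,0"}
    # Threads per trainer: logical processors divided by trainer_count,
    # clamped to at least 1 so an oversized trainer_count cannot yield 0.
    threads = max(1, multiprocessing.cpu_count() // trainer_count)
    env["OMP_NUM_THREADS"] = str(threads)
    env["MKL_NUM_THREADS"] = str(threads)
    return env

print(compute_cpu_env(trainer_count=2, ht_enabled=True))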

@tensor-tang tensor-tang changed the title auto set cpu env when mkldnn or mklml enabled for V1 API auto set cpu env when mkldnn or mklml enabled Nov 15, 2017
@tensor-tang tensor-tang requested a review from luotao1 November 15, 2017 12:54
@luotao1 (Contributor) left a comment:


I pulled this PR down and ran the run_mkldnn.sh script locally.
Then I did echo ${OMP_NUM_THREADS}, and the printed value was empty. May I ask why that is?

processors = int(processors.read())
trainers = kwargs.get('trainer_count', 1)
threads = processors / trainers
threads = '1' if threads < 1 else str(threads)
luotao1 (Contributor):

  • Can trainer_count ever be missing?
  • threads < 1 would mean processors is 0? Can that case actually happen?

tensor-tang (Contributor, author):

  • The case of a missing trainer_count is already handled here: kwargs.get('trainer_count', 1) defaults it to 1.

  • If a user sets trainer_count to 8 without knowing the system only has 4 processors, the computed threads would be 0, which is wrong. So a minimum of 1 is enforced to guard against bad user input (see the small worked example below).
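
A small worked example of that clamping (made-up values; // is used so the arithmetic matches Python 2's integer / on Python 3 as well):

processors = 4        # suppose the machine exposes 4 logical processors
trainers = 8          # user mistakenly sets trainer_count=8
threads = processors // trainers                  # 4 // 8 == 0
threads = '1' if threads < 1 else str(threads)    # clamped to '1'
print(threads)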

fi
if [ -z "$OMP_DYNAMIC" ]; then
export OMP_DYNAMIC="FALSE"
fi
luotao1 (Contributor):

Could lines 54-58 be written as below, without the guard? The V2 API side just sets them directly:

export KMP_AFFINITY="granularity=fine,compact,0,0"
export OMP_DYNAMIC="FALSE"

Same for lines 61-65.

tensor-tang (Contributor, author):

There is a reason for the guard: if the values were hardcoded, values set in the outer environment could not be passed through.
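
For illustration, the same only-set-when-unset idea expressed in Python (a sketch, not code from this PR):

import os

# setdefault leaves an externally exported value untouched and only fills
# in the default when the variable is absent; an unconditional export in
# the script would clobber whatever the user set outside.
os.environ.setdefault("OMP_DYNAMIC", "FALSE")
os.environ.setdefault("KMP_AFFINITY", "granularity=fine,compact,0,0")
print(os.environ["OMP_DYNAMIC"])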

@@ -14,8 +12,6 @@ function train() {
elif [ $4 == "False" ]; then
luotao1 (Contributor):

When using MKL-DNN, are the two variables below also set this way? From the V1 and V2 settings it looks like they have been unified.

export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

tensor-tang (Contributor, author), Nov 16, 2017:

When using MKL-DNN, setting these is now handled internally in a unified way, but the value is not 1.

luotao1 (Contributor):

Where is this setting implemented?

tensor-tang (Contributor, author):

unset OMP_NUM_THREADS MKL_NUM_THREADS
export OMP_DYNAMIC="FALSE"
export KMP_AFFINITY="granularity=fine,compact,0,0"
unset OMP_NUM_THREADS MKL_NUM_THREADS OMP_DYNAMIC KMP_AFFINITY
topology=$1
layer_num=$2
bs=$3
luotao1 (Contributor):

Judging from the run_mkldnn.sh script, when using MKL-DNN it is enough to leave trainer_count at 1? Shouldn't we tell users this as well?

tensor-tang (Contributor, author):

We can tell users that with MKL-DNN there is no need to set trainer_count; the default of 1 is fine.

luotao1 (Contributor):

Then is this going into the documentation, or into the code?

tensor-tang (Contributor, author):

It has not been added to the code yet, but we can add a reminder for users.

@tensor-tang (Contributor, author) left a comment:

Yes, it is empty: the variable's lifetime is limited to the paddle run itself, so after the run finishes the value can no longer be printed; it only shows a value if printed while the run is in progress.

If you want to see the value, you can add an echo inside the paddle launch script. But ordinary users probably don't need to see it, so I didn't add an echo.
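
This is plain environment-variable scoping: values set inside the launch script (or via os.environ in the Python process) are visible only to that process and its children, not to the shell you return to afterwards. A minimal illustration (not PaddlePaddle code):

import os
import subprocess

child_env = dict(os.environ, OMP_NUM_THREADS="4")   # set only for the child
subprocess.run(["sh", "-c", "echo child sees: $OMP_NUM_THREADS"], env=child_env)

# Back in the parent the variable was never exported, so this prints nothing.
print("parent sees:", os.environ.get("OMP_NUM_THREADS", ""))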

set_env("KMP_AFFINITY", "granularity=fine,compact,0,0")
else:
set_env("OMP_DYNAMIC", "true")
set_env("KMP_AFFINITY", "granularity=fine,compact,1,0")
luotao1 (Contributor):

So the values are not hardcoded in the V1 API, but are hardcoded in the V2 API?

tensor-tang (Contributor, author):

They are not hardcoded in V2 either; a set_env helper is defined for this.
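
The set_env helper itself is not quoted in this thread; based on the discussion, it presumably looks something like the sketch below (an assumption, not the PR's verbatim code):

import os

def set_env(key, value):
    # Apply the computed default only when the user has not already
    # exported the variable, so external overrides still take effect.
    if key not in os.environ:
        os.environ[key] = value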

@luotao1 (Contributor) commented Nov 16, 2017

> If you want to see the value, you can add an echo inside the paddle launch script. But ordinary users probably don't need to see it, so I didn't add an echo.

Could you add a print function for this? Leaving it commented out is fine.

@tensor-tang (Contributor, author):
Sure. I'll add a print of the complete resulting configuration; let's put it in the V1 API for now.

@tensor-tang (Contributor, author):
Done.

Just remove the comment before compiling.
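
A rough idea of the kind of commented-out diagnostic being discussed (hypothetical helper name; the actual print was added to the V1 API code, which is not quoted in this thread):

import os

def dump_cpu_env():
    # Debug helper: print the auto-configured values; left commented out
    # by default and re-enabled before compiling when needed.
    for key in ("OMP_DYNAMIC", "KMP_AFFINITY",
                "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        print("%s=%s" % (key, os.environ.get(key, "")))

# dump_cpu_env()   # uncomment to inspect the configuration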

@luotao1 (Contributor) left a comment:

LGTM. In the next PR, remind users that trainer_count defaults to 1 when MKLDNN is used.

@luotao1 luotao1 merged commit 6cf7f1e into PaddlePaddle:develop Nov 16, 2017
@tensor-tang tensor-tang deleted the autocpu branch November 16, 2017 05:28