auto set cpu env when mkldnn or mklml enabled #5671
I pulled this PR and ran the run_mkldnn.sh script locally. Afterwards, echo ${OMP_NUM_THREADS} printed an empty value. What is the reason?
processors = int(processors.read())
trainers = kwargs.get('trainer_count', 1)
threads = processors / trainers
threads = '1' if threads < 1 else str(threads)
- Can trainer_count ever be missing?
- Does threads < 1 mean that processors is 0? Can that actually happen?
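- The case where trainer_count is missing is already handled: kwargs.get('trainer_count', 1) defaults to 1.
- If a user sets trainer_count to 8 without realizing that the system has only 4 processors, the computed threads would be 0, which is wrong. The minimum of 1 guards against that kind of bad input.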
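The clamping described in this reply can be sketched as below. This is a minimal illustration rather than the PR's code: `threads_per_trainer` is a hypothetical helper name, and it uses floor division so that the result stays an integer on Python 3 as well.

```python
def threads_per_trainer(processors, trainer_count=1):
    """Return the per-trainer thread count as a string, never below '1'."""
    # Floor division keeps the result an integer: 4 processors // 8 trainers == 0.
    threads = processors // trainer_count
    # Clamp to 1 so that an oversized trainer_count cannot yield zero threads.
    return str(max(1, threads))
```

With 4 processors and trainer_count=8 the helper returns '1'; with 8 processors and 2 trainers it returns '4'.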
fi
if [ -z "$OMP_DYNAMIC" ]; then
  export OMP_DYNAMIC="FALSE"
fi
Could lines 54-58 be written as below, without the check? The v2 API sets these values directly:
export KMP_AFFINITY="granularity=fine,compact,0,0"
export OMP_DYNAMIC="FALSE"
The same applies to lines 61-65.
The check is there for a reason: if the values were hard-coded, environment variables set from outside could no longer be passed in.
@@ -14,8 +12,6 @@ function train() {
elif [ $4 == "False" ]; then
When MKL-DNN is used, are the following two variables also set this way? Judging from the v1 and v2 settings, they appear to have been unified.
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
When MKL-DNN is used, this setting is now handled internally in one place, but the value is not 1.
Where is this setting implemented?
unset OMP_NUM_THREADS MKL_NUM_THREADS
export OMP_DYNAMIC="FALSE"
export KMP_AFFINITY="granularity=fine,compact,0,0"
unset OMP_NUM_THREADS MKL_NUM_THREADS OMP_DYNAMIC KMP_AFFINITY
topology=$1
layer_num=$2
bs=$3
Judging from the run_mkldnn.sh script, is it enough to set trainer_count to 1 when using MKL-DNN? Should users be told about this as well?
We can tell users that with MKL-DNN there is no need to set trainer_count; the default of 1 is enough.
Will this go into the documentation, or into the code?
It has not been added to the code yet, but a reminder for users could be added.
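Yes, ${OMP_NUM_THREADS} is empty after the script finishes. The variable only lives for the duration of the paddle run, so no value can be printed once it has ended; it has a value only if it is printed while paddle is running.
If you want to see the value, add an echo to the paddle script. Ordinary users probably do not need to see it, so I did not add an echo.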
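The lifetime issue is general process behavior: a variable set inside a child process never flows back to the shell or process that launched it. A minimal sketch follows, assuming Python 3.7+ and using the made-up variable name `DEMO_OMP_NUM_THREADS` as a stand-in so the demo does not touch a real OpenMP setting.

```python
import os
import subprocess
import sys

VAR = "DEMO_OMP_NUM_THREADS"  # hypothetical name, stands in for OMP_NUM_THREADS


def child_sets_variable():
    """Run a child process that sets VAR, then report what each side sees."""
    child_code = (
        "import os; "
        f"os.environ['{VAR}'] = '4'; "
        f"print(os.environ['{VAR}'])"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    seen_in_child = result.stdout.strip()
    # The parent's environment is untouched by whatever the child did.
    seen_in_parent = os.environ.get(VAR)
    return seen_in_child, seen_in_parent
```

The child reports the value it set, while the parent still sees nothing, which matches the empty echo observed after run_mkldnn.sh returns.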
    set_env("KMP_AFFINITY", "granularity=fine,compact,0,0")
else:
    set_env("OMP_DYNAMIC", "true")
    set_env("KMP_AFFINITY", "granularity=fine,compact,1,0")
So the v1 API does not hard-code the values, but the v2 API does?
The v2 API does not hard-code them either; it defines a set_env function.
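The body of set_env is not shown in this thread. The sketch below is one plausible reading of "not hard-coded", assuming the function only fills in a variable that the user has not already exported; the behavior is an assumption, not the PR's confirmed implementation.

```python
import os


def set_env(key, value):
    """Set key to value only if the environment does not already define it."""
    # Assumed behavior: a value exported by the user always wins.
    if os.environ.get(key) is None:
        os.environ[key] = value
```

Under this assumption, a user who exports KMP_AFFINITY before starting training keeps that value, which mirrors the `[ -z "$OMP_DYNAMIC" ]` guard in the v1 shell script.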
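Could you add a function that prints the values? It can be left commented out.
Sure. I will add output of the complete configuration and put it in the v1 API for now.
Done. Just remove the comment before compiling.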
LGTM. In the next PR, remind users that trainer_count defaults to 1 when MKLDNN is used.
related #5280
When MKL-DNN or MKLML is enabled:
- auto set OMP_DYNAMIC and KMP_AFFINITY according to HT status.
- auto set OMP_NUM_THREADS and MKL_NUM_THREADS according to total available processors and trainer_count.
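The first point can be sketched as a lookup from hyper-threading status to default values. The value pairs come from the diff fragments quoted in the review; which pair belongs to HT on versus HT off, the "false" default for OMP_DYNAMIC in the first branch, and the helper name `affinity_defaults` are assumptions. How the PR detects HT is not shown here, so the sketch takes it as a parameter.

```python
def affinity_defaults(hyperthreading_enabled):
    """Return default OMP_DYNAMIC / KMP_AFFINITY values for the given HT status."""
    if hyperthreading_enabled:
        # Assumption: the "true" / compact,1,0 pair is the HT-on branch.
        return {
            "OMP_DYNAMIC": "true",
            "KMP_AFFINITY": "granularity=fine,compact,1,0",
        }
    # Assumption: the "false" / compact,0,0 pair is the HT-off branch.
    return {
        "OMP_DYNAMIC": "false",
        "KMP_AFFINITY": "granularity=fine,compact,0,0",
    }
```

Returning a dictionary keeps the choice of values separate from the act of writing them into the environment, so a caller can still skip any variable the user has already set.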