Plan separation compile #9920
Supports releasing large objects in a separate thread.
Do we need to implement a new class here? Wouldn't calling the global thread pool directly work just as well?
Same question. AsyncDeallocateContext doesn't seem to add much here. What would go wrong if we used the global thread pool directly?
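The point here is to temporarily start one thread that releases large objects asynchronously (their destruction has a noticeable cost), so the main thread is spared the time spent in those destructors.
While this thread is releasing objects, the main thread may be driving the global thread pool to do communication, so the pool is not a good fit for reuse here.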
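A minimal sketch of the idea described in this comment, not the PR's actual AsyncDeallocateContext: ownership of a large object is moved onto a dedicated thread so that its destructor runs off the main thread. The class name `AsyncReleaser` and its methods are hypothetical, and the sketch assumes C++14.

```cpp
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper for illustration only; not OneFlow's AsyncDeallocateContext.
class AsyncReleaser {
 public:
  AsyncReleaser() = default;
  AsyncReleaser(const AsyncReleaser&) = delete;
  AsyncReleaser& operator=(const AsyncReleaser&) = delete;
  // Join on destruction so a pending release never outlives the releaser.
  ~AsyncReleaser() { Wait(); }

  // Takes ownership of `obj` and destroys it on a dedicated thread,
  // so the caller does not pay for the destructor.
  template<typename T>
  void Release(std::unique_ptr<T> obj) {
    Wait();  // at most one release in flight in this simple sketch
    worker_ = std::thread([p = std::move(obj)]() mutable { p.reset(); });
  }

  // Blocks until the pending release, if any, has finished.
  void Wait() {
    if (worker_.joinable()) { worker_.join(); }
  }

 private:
  std::thread worker_;
};
```

Because the worker is a plain `std::thread` rather than a task submitted to a shared pool, the release does not compete with whatever the pool is doing at the time, which is the concern raised above.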
Under what circumstances does the global thread pool do communication?
// Separation plan compilation is done by:
// a. Master broadcasts the job (or logical graph) to all workers, so that all ranks use the same job.
// b. Master compiles the BoxingTaskGraph and broadcasts it to all workers. The BoxingTaskGraph
// needs to be built on the master rank.
// c. Each rank compiles its related task nodes with RankCompiler. RankCompiler compiles with the
// BoxingTaskGraph and the job.
// d. Master builds the CollectiveBoxingPlan and then broadcasts it to all the workers.
For example, the data synchronization operations in a and d call MultiThreadLoop.
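The ordering of steps a to d can be sketched as a single-process simulation. Everything below is a hypothetical stand-in: `Job`, `BoxingTaskGraph`, `CollectiveBoxingPlan`, `RankPlan`, `BroadcastFromMaster`, and `CompileSeparately` are illustrative names, not OneFlow types or APIs, and the "broadcast" is just a copy per rank rather than real communication.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are OneFlow types.
struct Job { std::string name; };
struct BoxingTaskGraph { std::string built_from; };
struct CollectiveBoxingPlan { std::string built_from; };
struct RankPlan {
  int rank;
  std::string job_name;
  std::string boxing_source;
  std::string collective_source;
};

// Simulated broadcast: every rank ends up holding a copy of the master's value.
template<typename T>
std::vector<T> BroadcastFromMaster(const T& master_value, int world_size) {
  return std::vector<T>(static_cast<std::size_t>(world_size), master_value);
}

std::vector<RankPlan> CompileSeparately(const Job& master_job, int world_size) {
  // a. All ranks receive the same job.
  std::vector<Job> jobs = BroadcastFromMaster(master_job, world_size);
  // b. Only the master builds the boxing task graph, then broadcasts it.
  BoxingTaskGraph master_graph{master_job.name};
  std::vector<BoxingTaskGraph> graphs = BroadcastFromMaster(master_graph, world_size);
  // c. Each rank compiles only its own part from its copies of the job and the graph.
  std::vector<RankPlan> plans;
  for (int rank = 0; rank < world_size; ++rank) {
    plans.push_back(RankPlan{rank, jobs[rank].name, graphs[rank].built_from, ""});
  }
  // d. Only the master builds the collective boxing plan, then broadcasts it.
  CollectiveBoxingPlan master_collective{master_job.name};
  std::vector<CollectiveBoxingPlan> collectives =
      BroadcastFromMaster(master_collective, world_size);
  for (int rank = 0; rank < world_size; ++rank) {
    plans[rank].collective_source = collectives[rank].built_from;
  }
  return plans;
}
```

In this simulation steps a, b, and d each contain a broadcast, which mirrors the points in the thread where communication takes place.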
Doesn't b also trigger data synchronization and MultiThreadLoop?
It does too. All communication uses MultiThreadLoop.
The standalone multi-threaded release can go into its own PR (a follow-up optimization PR).
done
The valid "mode" strings should be listed here.
done
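Since rank_per_iter and rank_per_thread are modes meant for intermediate debugging, this should eventually become a boolean environment variable,
something like ENABLE_ONEFLOW_LAZY_COMPILE_PER_RANK, that turns the feature on or off.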
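A minimal sketch of what reading such a boolean switch could look like. The helpers `ParseBoolString` and `GetBoolFromEnv` are hypothetical and are not OneFlow's own environment-variable utilities, and the set of accepted spellings is an assumption. The variable name is the one floated in the comment above, not a confirmed final name.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: maps common spellings to a bool and falls back to
// `default_value` for anything unrecognized.
inline bool ParseBoolString(std::string value, bool default_value) {
  for (char& c : value) {
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  }
  if (value == "1" || value == "true" || value == "on") { return true; }
  if (value == "0" || value == "false" || value == "off") { return false; }
  return default_value;
}

// Hypothetical helper: reads `name` from the environment and returns
// `default_value` when the variable is unset.
inline bool GetBoolFromEnv(const char* name, bool default_value) {
  const char* raw = std::getenv(name);
  return raw == nullptr ? default_value : ParseBoolString(raw, default_value);
}

// Example use with the name suggested in the review (an assumption, off by default).
inline bool LazyCompilePerRankEnabled() {
  return GetBoolFromEnv("ENABLE_ONEFLOW_LAZY_COMPILE_PER_RANK", false);
}
```

A single on/off switch leaves nothing to validate beyond the spelling of the value, whereas a string-valued mode needs its set of legal values documented and checked.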