[Opt] Make irpass::replace_all_usages_with bottom-up #1789

Merged: 3 commits into taichi-dev:master, Aug 29, 2020

xumingkuan
Collaborator

@xumingkuan xumingkuan commented Aug 27, 2020

Related issue = close #1785

Stmt::parent is sometimes unreliable, especially in the offload pass. lower_ast also causes the IR printer to crash sometimes, which makes it harder to debug...

In the async engine, some callers of replace_all_usages_with need to traverse from the root instead of from old_stmt->parent. So I pass root = nullptr only in Stmt::replace_with, and replace_all_usages_with works bottom-up only when root == nullptr.
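
The diff itself is not part of this conversation, so as a rough illustration only: one way a `root == nullptr` fallback could work is to climb from `old_stmt` through its enclosing blocks and rewrite usages starting at the outermost block found. The sketch below uses a toy IR; `Stmt`, `Block`, `append`, `add_body`, `outermost_block`, and `replace_all_usages` are hypothetical stand-ins, not Taichi's actual classes or signatures.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy IR for illustration only; none of these names are Taichi's real API.
struct Block;

struct Stmt {
  int id = 0;
  std::vector<Stmt *> operands;  // statements this one reads from
  Block *parent = nullptr;       // enclosing block
  std::unique_ptr<Block> body;   // non-null for container statements (if, loop)
};

struct Block {
  Stmt *parent_stmt = nullptr;   // container statement that owns this block
  std::vector<std::unique_ptr<Stmt>> statements;
};

// Append a new statement to a block and set its parent pointer.
Stmt *append(Block &block, int id, std::vector<Stmt *> operands = {}) {
  auto stmt = std::make_unique<Stmt>();
  stmt->id = id;
  stmt->operands = std::move(operands);
  stmt->parent = &block;
  block.statements.push_back(std::move(stmt));
  return block.statements.back().get();
}

// Give a container statement an empty nested block and return it.
Block *add_body(Stmt &container) {
  container.body = std::make_unique<Block>();
  container.body->parent_stmt = &container;
  return container.body.get();
}

// Climb parent pointers from a statement to the outermost enclosing block.
Block *outermost_block(Stmt *stmt) {
  assert(stmt != nullptr && stmt->parent != nullptr);
  Block *block = stmt->parent;
  while (block->parent_stmt != nullptr && block->parent_stmt->parent != nullptr)
    block = block->parent_stmt->parent;
  return block;
}

// Rewrite operands equal to old_stmt in a block and all nested blocks.
// Returns the number of operands rewritten.
int replace_in_block(Block &block, Stmt *old_stmt, Stmt *new_stmt) {
  int replaced = 0;
  for (auto &stmt : block.statements) {
    for (auto &operand : stmt->operands) {
      if (operand == old_stmt) {
        operand = new_stmt;
        ++replaced;
      }
    }
    if (stmt->body)
      replaced += replace_in_block(*stmt->body, old_stmt, new_stmt);
  }
  return replaced;
}

// root != nullptr: traverse from the given root.
// root == nullptr: derive the starting block bottom-up from old_stmt.
int replace_all_usages(Block *root, Stmt *old_stmt, Stmt *new_stmt) {
  if (root == nullptr)
    root = outermost_block(old_stmt);
  return replace_in_block(*root, old_stmt, new_stmt);
}
```

With an explicit root the traversal starts there, which is what the async engine callers mentioned above would use; with `nullptr` it starts from wherever `old_stmt` lives. Note that this fallback is only as trustworthy as the parent pointers it climbs.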

@xumingkuan
Collaborator Author

test fuse_dense -v -t 1: crashing.
Remove all tests except test_fuse_dense_x2y2z from test_fuse_dense.py: crashing.
Add ti.init(print_ir=True) in test_fuse_dense_x2y2z: passing.
Add ti.init() in test_fuse_dense_x2y2z: passing.
Log:

......
[W 08/27/20 23:11:13.428] [verify.cpp:taichi::lang::irpass::analysis::verify@112] IR root is not a Block. Skipping verification.
Windows fatal exception: access violation

Thread 0x00002134 (most recent call first):
  File "C:\Users\xmk\Desktop\taichi\python\taichi\lang\impl.py", line 244 in sync
  File "C:\Users\xmk\Desktop\taichi\python\taichi\lang\__init__.py", line 788 in sync
  File "C:\Users\xmk\Desktop\taichi\tests\python\fuse_test_template.py", line 51 in template_fuse_dense_x2y2z
  File "C:\Users\xmk\Desktop\taichi\tests\python\test_fuse_dense.py", line 8 in test_fuse_dense_x2y2z
  File "C:\Users\xmk\Desktop\taichi\python\taichi\lang\__init__.py", line 733 in wrapped
  File "D:\Anaconda3\lib\site-packages\_pytest\python.py", line 180 in pytest_pyfunc_call
  File "D:\Anaconda3\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 81 in <lambda>
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 87 in _hookexec
  File "D:\Anaconda3\lib\site-packages\pluggy\hooks.py", line 289 in __call__
  File "D:\Anaconda3\lib\site-packages\_pytest\python.py", line 1567 in runtest
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 153 in pytest_runtest_call
  File "D:\Anaconda3\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 81 in <lambda>
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 87 in _hookexec
  File "D:\Anaconda3\lib\site-packages\pluggy\hooks.py", line 289 in __call__
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 247 in <lambda>
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 294 in from_call
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 247 in call_runtest_hook
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 207 in call_and_report
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 117 in runtestprotocol
  File "D:\Anaconda3\lib\site-packages\_pytest\runner.py", line 100 in pytest_runtest_protocol
  File "D:\Anaconda3\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 81 in <lambda>
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 87 in _hookexec
  File "D:\Anaconda3\lib\site-packages\pluggy\hooks.py", line 289 in __call__
  File "D:\Anaconda3\lib\site-packages\_pytest\main.py", line 321 in pytest_runtestloop
  File "D:\Anaconda3\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 81 in <lambda>
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 87 in _hookexec
  File "D:\Anaconda3\lib\site-packages\pluggy\hooks.py", line 289 in __call__
  File "D:\Anaconda3\lib\site-packages\_pytest\main.py", line 296 in _main
  File "D:\Anaconda3\lib\site-packages\_pytest\main.py", line 240 in wrap_session
  File "D:\Anaconda3\lib\site-packages\_pytest\main.py", line 289 in pytest_cmdline_main
  File "D:\Anaconda3\lib\site-packages\pluggy\callers.py", line 187 in _multicall
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 81 in <lambda>
  File "D:\Anaconda3\lib\site-packages\pluggy\manager.py", line 87 in _hookexec
  File "D:\Anaconda3\lib\site-packages\pluggy\hooks.py", line 289 in __call__
  File "D:\Anaconda3\lib\site-packages\_pytest\config\__init__.py", line 158 in main
  File "C:\Users\xmk\Desktop\taichi\python\taichi\main.py", line 745 in _test_python
  File "C:\Users\xmk\Desktop\taichi\python\taichi\main.py", line 923 in test
  File "C:\Users\xmk\Desktop\taichi\python\taichi\main.py", line 88 in __call__
  File "C:\Users\xmk\Desktop\taichi\python\taichi\main.py", line 27 in wrapper
  File "C:\Users\xmk\Desktop\taichi\python\taichi\main.py", line 1062 in main
  File "C:\Users\xmk\Desktop\taichi\python\taichi\__main__.py", line 2 in <module>
  File "D:\Anaconda3\lib\runpy.py", line 85 in _run_code
  File "D:\Anaconda3\lib\runpy.py", line 193 in _run_module_as_main
[E 08/27/20 23:11:13.519] Received signal 11 (SIGSEGV)


***********************************
* Taichi Compiler Stack Traceback *
***********************************
0x7ffa901d667a: taichi::print_traceback(line 293 in C:\Users\xmk\Desktop\taichi\taichi\system\traceback.cpp) in taichi_core.pyd
0x7ffa90278e27: taichi::Logger::error(line 115 in C:\Users\xmk\Desktop\taichi\taichi\util\logging.cpp) in taichi_core.pyd
0x7ffa9027e8f2: taichi::signal_handler(line 163 in C:\Users\xmk\Desktop\taichi\taichi\util\logging.cpp) in taichi_core.pyd
0x7ffae4bfc053: seh_filter_exe in ucrtbase.dll
0x7ffae4bd823a: _intrinsic_setjmpex in ucrtbase.dll
0x7ffae4bcd040: _C_specific_handler in ucrtbase.dll
0x7ffae7c211cf: _chkstk in ntdll.dll
0x7ffae7bea209: RtlRaiseException in ntdll.dll
0x7ffae7c1fe3e: KiUserExceptionDispatcher in ntdll.dll
0x7ffa9006e674: llvm::GetElementPtrInst::Create(line 924 in D:\LLVM10\include\llvm\IR\Instructions.h) in taichi_core.pyd
0x7ffa9006fd14: llvm::IRBuilder<llvm::ConstantFolder,llvm::IRBuilderDefaultInserter>::CreateGEP(line 1810 in D:\LLVM10\include\llvm\IR\IRBuilder.h) in taichi_core.pyd
0x7ffa90085a20: taichi::lang::CodeGenLLVM::visit(line 1459 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa9008279b: taichi::lang::CodeGenLLVM::visit(line 125 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa90084cc0: taichi::lang::CodeGenLLVM::visit(line 616 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa9008279b: taichi::lang::CodeGenLLVM::visit(line 125 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa902eecfd: taichi::lang::CodeGenLLVMCPU::create_offload_range_for(line 45 in C:\Users\xmk\Desktop\taichi\taichi\backends\cpu\codegen_cpu.cpp) in taichi_core.pyd
0x7ffa902ef77a: taichi::lang::CodeGenLLVMCPU::visit(line 72 in C:\Users\xmk\Desktop\taichi\taichi\backends\cpu\codegen_cpu.cpp) in taichi_core.pyd
0x7ffa90079f5f: taichi::lang::CodeGenLLVM::emit_to_module(line 1686 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa9007a6f9: taichi::lang::CodeGenLLVM::gen(line 1691 in C:\Users\xmk\Desktop\taichi\taichi\codegen\codegen_llvm.cpp) in taichi_core.pyd
0x7ffa902ee9a3: taichi::lang::CodeGenCPU::codegen(line 125 in C:\Users\xmk\Desktop\taichi\taichi\backends\cpu\codegen_cpu.cpp) in taichi_core.pyd
0x7ffa90120127: <lambda_5c4cefeffead66ceac8071fde85d1dec>::operator()(line 186 in C:\Users\xmk\Desktop\taichi\taichi\program\async_engine.cpp) in taichi_core.pyd
0x7ffa9012447d: taichi::lang::ParallelExecutor::worker_loop(line 122 in C:\Users\xmk\Desktop\taichi\taichi\program\async_engine.cpp) in taichi_core.pyd
0x7ffa9011d341: std::thread::_Invoke<std::tuple<<lambda_34b52dd46d7c4d7e73074040a48de1ec> >,0>(line 44 in D:\Program Files\Microsoft Visual Studio\2019\VC\Tools\MSVC\14.27.29110\include\thread) in taichi_core.pyd
0x7ffae4bb0e82: beginthreadex in ucrtbase.dll
0x7ffae74e7bd4: BaseThreadInitThunk in KERNEL32.DLL
0x7ffae7bece51: RtlUserThreadStart in ntdll.dll

@xumingkuan
Collaborator Author

Without print_ir=True, the IR after Simplified IV is:

kernel {
$0 = offloaded range_for(0, 16777216) grid_dim=0 block_dim=1024
body {
  <i32 x1> $250 = const [0]
  <i32 x1> $24 = const [4]
  <i32 x1> $17 = const [1]
  <i32 x1> $2 = loop $0 index 0
  <i32 x1> $8 = const [10485760]
  <i32 x1> $9 = cmp_lt $2 $8
  $13 : if $9 {
    <gen*x1> $211 = get root
    <gen*x1> $213 = [S0root][root]::lookup($211, $250) activate = false
    <gen*x1> $214 = get child [S0root->S1dense] $213
    <gen*x1> $217 = [S1dense][dense]::lookup($214, $2) activate = false
    <i32*x1> $218 = get child [S1dense->S2place_i32] $217
    <i32 x1> $16 = global load $218
    <i32 x1> $18 = add $16 $17
    <gen*x1> $224 = get child [S0root->S3dense] $213
    <gen*x1> $227 = [S3dense][dense]::lookup($224, $2) activate = false
    <i32*x1> $228 = get child [S3dense->S4place_i32] $227
    <i32*x1> $20 : global store [$228 <- $18]
    <i32 x1> $21 = loop $188 index 0  <---------------------- !!!
    <gen*x1> $237 = [S3dense][dense]::lookup($224, $21) activate = false
    <i32*x1> $238 = get child [S3dense->S4place_i32] $237
    <i32 x1> $23 = global load $238
    <i32 x1> $25 = add $23 $24
    <gen*x1> $244 = get child [S0root->S5dense] $213
    <gen*x1> $247 = [S5dense][dense]::lookup($244, $21) activate = false
    <i32*x1> $248 = get child [S5dense->S6place_i32] $247
    <i32*x1> $27 : global store [$248 <- $25]
  }
}

With print_ir=True, the IR is:

kernel {
  $0 = offloaded range_for(0, 16777216) grid_dim=0 block_dim=1024
  body {
    <i32 x1> $1 = const [0]
    <i32 x1> $2 = const [4]
    <i32 x1> $3 = loop $0 index 0
    <i32 x1> $4 = const [10485760]
    <i32 x1> $5 = cmp_lt $3 $4
    $6 : if $5 {
      <gen*x1> $7 = get root
      <gen*x1> $8 = [S0root][root]::lookup($7, $1) activate = false
      <gen*x1> $9 = get child [S0root->S3dense] $8
      <gen*x1> $10 = [S3dense][dense]::lookup($9, $3) activate = false
      <i32*x1> $11 = get child [S3dense->S4place_i32] $10
      <i32 x1> $12 = global load $11
      <i32 x1> $13 = add $12 $2
      <gen*x1> $14 = get child [S0root->S5dense] $8
      <gen*x1> $15 = [S5dense][dense]::lookup($14, $3) activate = false
      <i32*x1> $16 = get child [S5dense->S6place_i32] $15
      <i32 x1> $17 : global store [$16 <- $13]
    }
  }
}
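
In the first dump, the flagged statement `$21` takes its loop index from `$188`, an id that is defined nowhere in that kernel, while the second dump contains no such reference. A minimal sketch of a check that would report this kind of dangling operand follows. It uses a hypothetical toy `Node` type rather than Taichi's IR classes, and `find_dangling` and its helpers are assumptions made for illustration.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy IR node: its id, the ids it reads, and nested statements.
struct Node {
  int id;
  std::vector<int> operands;
  std::vector<Node> body;
};

// Record the id of every statement reachable from node.
void collect_ids(const Node &node, std::set<int> &defined) {
  defined.insert(node.id);
  for (const Node &child : node.body)
    collect_ids(child, defined);
}

// Append (user id, missing operand id) for each operand with no definition.
void collect_dangling(const Node &node, const std::set<int> &defined,
                      std::vector<std::pair<int, int>> &out) {
  for (int operand : node.operands) {
    if (defined.count(operand) == 0)
      out.emplace_back(node.id, operand);
  }
  for (const Node &child : node.body)
    collect_dangling(child, defined, out);
}

// Return all references to ids that are not defined anywhere under root.
std::vector<std::pair<int, int>> find_dangling(const Node &root) {
  std::set<int> defined;
  collect_ids(root, defined);
  std::vector<std::pair<int, int>> out;
  collect_dangling(root, defined, out);
  return out;
}
```

Fed a tree shaped like the first dump (`$0` containing `$2 = loop $0 index 0` and an `if` whose body holds `$21 = loop $188 index 0`), this reports the pair `(21, 188)`. It only checks that a definition exists, not that it dominates the use.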

@archibate
Collaborator

archibate commented Aug 28, 2020

> Add ti.init() in test_fuse_dense_x2y2z: passing.

What about ti.init(async_mode=True)?
ti.init() will reset async_mode to False and of course it will pass the test.

@xumingkuan
Collaborator Author

> > Add ti.init() in test_fuse_dense_x2y2z: passing.
>
> What about ti.init(async_mode=True)?
> ti.init() will reset async_mode to False and of course it will pass the test.

Nice catch! With ti.init(async_mode=True): crashing.

@codecov

codecov bot commented Aug 29, 2020

Codecov Report

Merging #1789 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1789   +/-   ##
=======================================
  Coverage   43.51%   43.51%           
=======================================
  Files          44       44           
  Lines        6023     6023           
  Branches     1078     1078           
=======================================
  Hits         2621     2621           
  Misses       3248     3248           
  Partials      154      154           

Member

@yuanming-hu yuanming-hu left a comment

Thanks! LGTM.

@yuanming-hu yuanming-hu merged commit fa4b2da into taichi-dev:master Aug 29, 2020
@yuanming-hu yuanming-hu mentioned this pull request Sep 1, 2020
@xumingkuan xumingkuan deleted the replace-with branch September 8, 2020 07:34

Successfully merging this pull request may close these issues.

[Opt] CFG optimization scales superlinearly
3 participants