Fusion executor #162

Merged Jul 19, 2020
36 commits
dcbd8e4
Working towards a new fusion execution engine.
csarofeen Jul 9, 2020
27c57cd
fixing func_name string
jjsjann123 Jul 9, 2020
70c3862
Worked through bugs, simple gemm now working.
csarofeen Jul 10, 2020
0c7f207
Merge branch '20_7_6_devel' of https://www.github.com/csarofeen/pytor…
csarofeen Jul 10, 2020
112ff1a
Fix bad merge conflict resolution.
csarofeen Jul 10, 2020
90f09c8
Merge branch '20_7_6_devel' into FusionExecutor
csarofeen Jul 10, 2020
3c46d89
Change arguments to use unique_ptr.
csarofeen Jul 10, 2020
0246ee3
Clang tidy.
csarofeen Jul 10, 2020
9711b82
Prefix executor related files.
csarofeen Jul 10, 2020
a2d03c2
Fix build after moving files.
csarofeen Jul 10, 2020
bb32428
Try to support thread inference for broadcast.
csarofeen Jul 10, 2020
a4043c2
Split out launch parameters.
csarofeen Jul 10, 2020
e68584d
Move kernel argument holder to live with kernel arguments.
csarofeen Jul 10, 2020
d2277f9
Move argument validation and nvrtc compile to a utility file.
csarofeen Jul 10, 2020
07552e7
Some more minor tweaks/cleanup.
csarofeen Jul 11, 2020
48ef19b
Add launch constraints to manually set thread dims when they can't be…
csarofeen Jul 11, 2020
094136b
Add NamedScalar support to expression evaluator. Improve a few error …
csarofeen Jul 12, 2020
616475e
Basic support to split on a symbolic value we can make an input, or a…
csarofeen Jul 12, 2020
fbc5579
Fix tests.
csarofeen Jul 13, 2020
54422a4
Rework allocation nodes a bit to prepare for using them with grid red…
csarofeen Jul 13, 2020
bd6ae1f
Add a grid reduction IR node.
csarofeen Jul 14, 2020
83ff4bb
Restructure lower_index so we can return multiple exprs from each exp…
csarofeen Jul 15, 2020
6374594
Enable reduction buffers, enable reductions in fusion executor. Conve…
csarofeen Jul 16, 2020
80d6de8
Maybe broadcast support.
csarofeen Jul 16, 2020
a803c08
Try to get broadcast in the right spot this time.
csarofeen Jul 16, 2020
c8a332d
quick fix to disable broadcast hack for integration
jjsjann123 Jul 17, 2020
c34e966
Add automatic output allocation, change every other test to use it.
csarofeen Jul 17, 2020
99da19d
tlemo comments
csarofeen Jul 17, 2020
bd0a2c1
update broadcast tests
jjsjann123 Jul 17, 2020
1b68ff9
hacky switch to FusionExecutor
jjsjann123 Jul 17, 2020
2592f1b
Fix implicit broadcast indexing.
csarofeen Jul 17, 2020
6b34df7
Clang.
csarofeen Jul 17, 2020
484451f
fixing legacy fuser with missing static shape info
jjsjann123 Jul 17, 2020
bec3aff
Address PR comments.
csarofeen Jul 19, 2020
c440423
Merge remote-tracking branch 'origin/20_7_6_devel' into FusionExecutor
csarofeen Jul 19, 2020
12bd4e4
Clang tidy.
csarofeen Jul 19, 2020
14 changes: 9 additions & 5 deletions caffe2/CMakeLists.txt
@@ -451,8 +451,11 @@ if(NOT INTERN_BUILD_MOBILE OR NOT BUILD_CAFFE2_MOBILE)
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/compute_at.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/dispatch.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/expr_evaluator.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/executor.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/executor_kernel_arg.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/executor_launch_params.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/executor_utils.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/fusion.cpp
- ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/scheduler.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/graph_fuser.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/index_compute.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/ir_base_nodes.cpp
@@ -463,26 +466,27 @@ if(NOT INTERN_BUILD_MOBILE OR NOT BUILD_CAFFE2_MOBILE)
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/iter_visitor.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/kernel.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/kernel_cache.cpp
- ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/manager.cpp
- ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/shape_inference.cpp
- ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/mutator.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_index.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_loops.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_thread_predicate.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_unroll.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_utils.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower_validation.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/lower2device.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/manager.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/mutator.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/parser.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/partition.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/predicate_compute.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/register_interface.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/scheduler.cpp
+ ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/shape_inference.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/tensor_view.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/transform_iter.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/transform_replay.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/transform_rfactor.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/type.cpp
  ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/utils.cpp
- ${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/register_interface.cpp
  ${TORCH_SRC_DIR}/csrc/jit/tensorexpr/cuda_codegen.cpp
  )
  add_library(caffe2_nvrtc SHARED ${ATen_NVRTC_STUB_SRCS})