add sycl KI for InferenceBuilder by using syclomatic #21
delock merged 12 commits into delock:gma/xpu_upstream from
baodii commented Oct 18, 2023
- Like hipify, we only support JIT mode.
- Only inference kernels are supported for now.
- Delete the other OpBuilders in op_builder/xpu/__init__.py and accelerator/xpu_accelerator.py.
- Add rule.YAML, pre_process.sh, and post_process.sh to adjust the source code and the generated code.
- We don't change the original CUDA code; it is copied to the build folder.
- SYCL code is generated in the third-party folder.
- As with hipify, SYCL code is generated but not compiled when DeepSpeed is installed.
enable jit_load for sycl kernels
@delock @CaoZhongZ @rogerxfeng8 please review
    return False

    cuda_okay = True
    if not self.is_rocm_pytorch() and not self.is_sycl_enabled() and torch.cuda.is_available():
Why are there changes for CUDA and ROCm?
    from deepspeed.ops.op_builder.builder import OpBuilder, TORCH_MAJOR, TORCH_MINOR

    class SYCLOpBuilder(OpBuilder):
Shouldn't there be two kinds of builder, SYCLOpBuilder and SYCLAutoOpBuilder, where SYCLAutoOpBuilder handles the ops migrated with SYCLomatic?
Added SYCLAutoOpBuilder below.
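The two-builder split raised in this thread could look roughly like the following. This is a minimal sketch, not the PR's code: `OpBuilder` here is a simplified stand-in for the DeepSpeed base class, and the `needs_migration` method and the `name` value are hypothetical, introduced only for illustration.

```python
class OpBuilder:
    """Simplified stand-in for deepspeed.ops.op_builder.builder.OpBuilder."""

    def __init__(self, name):
        self.name = name

    def sources(self):
        return []


class SYCLOpBuilder(OpBuilder):
    """Builder for ops whose SYCL sources are written by hand."""

    def needs_migration(self):  # hypothetical helper, not a DeepSpeed API
        return False


class SYCLAutoOpBuilder(SYCLOpBuilder):
    """Builder for ops whose SYCL sources are generated from CUDA by SYCLomatic."""

    def needs_migration(self):
        return True


class InferenceBuilder(SYCLAutoOpBuilder):
    """Inference kernels go through the automatic migration path."""

    def __init__(self):
        # The name is illustrative only
        super().__init__(name="transformer_inference")

    def sources(self):
        # Path taken from the file reviewed later in this thread
        return ["csrc/transformer/inference/csrc/pt_binding.cpp"]
```

With this layout, hand-written SYCL ops subclass `SYCLOpBuilder` directly, and only ops that rely on migrated CUDA code pay for the extra generation step.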
accelerator/xpu_accelerator.py
Outdated
    return FusedAdamBuilder
    from deepspeed.ops.op_builder.xpu import CPUAdagradBuilder, CPUAdamBuilder, FusedAdamBuilder, AsyncIOBuilder, InferenceBuilder

    if class_name == "InferenceBuilder":
This change seems to turn off CPUAdagradBuilder, CPUAdamBuilder, FusedAdamBuilder, AsyncIOBuilder. Is this intended?
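One way to add a builder without switching off the existing ones is a name-to-class lookup. The sketch below is an assumption about how that could be structured, not the accelerator's actual implementation; the empty classes stand in for the real builders imported from `deepspeed.ops.op_builder.xpu`, and returning `None` for unknown names is a choice made here for illustration.

```python
# Empty stand-ins for the real builder classes
class CPUAdagradBuilder: pass
class CPUAdamBuilder: pass
class FusedAdamBuilder: pass
class AsyncIOBuilder: pass
class InferenceBuilder: pass


# Every supported builder stays registered; adding one does not remove the others
_BUILDERS = {
    cls.__name__: cls
    for cls in (CPUAdagradBuilder, CPUAdamBuilder, FusedAdamBuilder,
                AsyncIOBuilder, InferenceBuilder)
}


def get_op_builder(class_name):
    """Return the builder class registered under class_name, or None if unknown."""
    return _BUILDERS.get(class_name)
```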
.gitignore
Outdated
    # Build + installation data
    build/
    third-party/
Is third-party/ used to store generated sycl kernels? What will the directory structure be under third-party?
op_builder/xpu/post_process.sh
Outdated
    find ./deepspeed/third-party/ -type f -exec sed -i "s/at::kCUDA/at::kXPU/g" {} +

    # fix pt_binding.cpp torch::from_blob 4 inputs pattern
    patch ./deepspeed/third-party/csrc/transformer/inference/csrc/pt_binding.cpp << 'DIFF___'
What error is this intended to fix? Why can't it be fixed in the source code directory? Is it temporary, or will it persist? What do we need to do when pt_binding.cpp changes?
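For reference, the `find ... sed -i` substitution in the script under review amounts to a recursive text replacement over the generated tree. The sketch below expresses the same step in Python; `replace_in_tree` is a hypothetical helper, and a plain string replacement is used because the pattern `at::kCUDA` contains no regex metacharacters.

```python
import os
import tempfile


def replace_in_tree(root, old, new):
    """Replace every occurrence of `old` with `new` in all files under `root`.

    Returns the number of files that were modified.
    """
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "r", encoding="utf-8") as f:
                text = f.read()
            if old in text:
                with open(path, "w", encoding="utf-8") as f:
                    f.write(text.replace(old, new))
                changed += 1
    return changed


# Demonstration on a throwaway directory rather than the real third-party folder
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "pt_binding.cpp")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("auto opts = at::TensorOptions().device(at::kCUDA);\n")
    n_changed = replace_in_tree(tmp, "at::kCUDA", "at::kXPU")
    with open(sample, encoding="utf-8") as f:
        migrated = f.read()
```

A rewrite like this only touches generated files, so it has to be re-run after every migration; a fix made in the source directory would not.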
    return dpcpp_ext

    def sycl_extension(self):
    if self.is_sycl_enabled():
This function is very long. Can we extract two smaller functions, one for the include paths and one for the sources?
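The refactoring suggested here could be shaped roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the rule that `.cu` files become `.dp.cpp` while other file names stay unchanged is a simplification chosen for illustration, not a statement of how the PR maps file names.

```python
import os


def sycl_include_paths(out_root, include_dirs):
    """Return the include directories relocated under the generated-code root."""
    return [os.path.join(out_root, d) for d in include_dirs]


def sycl_sources(out_root, cuda_sources):
    """Return the generated SYCL path for each CUDA source.

    Assumption: `.cu` files are renamed to `.dp.cpp`; other names are kept.
    """
    result = []
    for src in cuda_sources:
        base, ext = os.path.splitext(src)
        name = base + ".dp.cpp" if ext == ".cu" else src
        result.append(os.path.join(out_root, name))
    return result


def sycl_extension_inputs(out_root, cuda_sources, include_dirs):
    """Combine both helpers; the long function shrinks to this call site."""
    return {
        "sources": sycl_sources(out_root, cuda_sources),
        "include_dirs": sycl_include_paths(out_root, include_dirs),
    }
```

Each helper can then be tested on its own, without running the migration or constructing an extension object.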
op_builder/xpu/builder.py
Outdated
    trans_cmd = c2s_cmd + cuda_inc_flag + extra_args + in_root + out_root + cuda_source
    print("**** processing ", f'{trans_cmd}')
    p = subprocess.Popen(f'{trans_cmd}', stdout=subprocess.PIPE, shell=True)
    # processes_running.append(p)
Please remove code that is not needed.
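Beyond dropping the commented-out line, the command invocation itself could be tightened. The sketch below is one possible shape, not the PR's code: it builds the command as an argument list and uses `subprocess.run` so that no shell parsing is involved and a failed migration raises an error. The helper names are hypothetical, and the exact `c2s` options the PR passes are not shown in this thread, so only `--in-root` and `--out-root` appear, with everything else passed through `extra_args`.

```python
import subprocess
import sys


def build_c2s_command(cuda_source, in_root, out_root, extra_args=()):
    """Assemble the migration command as an argument list (no shell string)."""
    return ["c2s", f"--in-root={in_root}", f"--out-root={out_root}",
            *extra_args, cuda_source]


def run_command(args):
    """Run `args`, wait for it to finish, and return its stdout as text.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# Demonstrated with the Python interpreter so the sketch runs without SYCLomatic
output = run_command([sys.executable, "-c", "print('migrated')"])
```

Passing a list avoids quoting problems when paths contain spaces, and `check=True` surfaces a failing migration immediately instead of leaving an unchecked `Popen` handle.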
    find ./build/csrc -type f -exec sed -i "s/torch::from_blob/at::from_blob/g" {} +

    # fix inference_context.h to make it could be migrate
    patch ./build/csrc/transformer/inference/includes/inference_context.h << 'DIFF___'
Why can't we change the source code directly?
Added the changes to the source code.
@baodii comments added. Please also fix the format errors by turning on pre-commit in your environment.
Some license checks in the csrc/xpu folder do not pass.
@delock I have fixed these issues. Please review. |