Skip to content

add sycl KI for InferenceBuilder by using syclomatic#21

Merged
delock merged 12 commits intodelock:gma/xpu_upstreamfrom
baodii:baodi/syclomatic
Oct 30, 2023
Merged

add sycl KI for InferenceBuilder by using syclomatic#21
delock merged 12 commits intodelock:gma/xpu_upstreamfrom
baodii:baodi/syclomatic

Conversation

@baodii
Copy link

@baodii baodii commented Oct 18, 2023

  • as hipify, we only support JIT mode
  • we only support inference kernels now
  • delete other OpBuilders in op_builder/xpu/init.py and accelerator/xpu_accelerator.py
  • add rule.YAML, pre_process.sh and post_process.sh to help us justify source code and generated code
  • we don't change original cuda code. we will copy original cuda code to build folder
  • sycl code will be generated in third-party folder
  • sycl code will be generated but not compiled when install deepspeed as hipify

@baodii
Copy link
Author

baodii commented Oct 18, 2023

@delock @CaoZhongZ @rogerxfeng8 please review

return False

cuda_okay = True
if not self.is_rocm_pytorch() and not self.is_sycl_enabled() and torch.cuda.is_available():
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why there is changes for cuda and rocm?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already deleted.

from deepspeed.ops.op_builder.builder import OpBuilder, TORCH_MAJOR, TORCH_MINOR


class SYCLOpBuilder(OpBuilder):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't there be two kinds of builder, SYCLOpBuilder and SYCLAutoOpBuilder, and SYCLAutoOpBuilder works for ops with syclomatic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't there be two kinds of builder, SYCLOpBuilder and SYCLAutoOpBuilder, and SYCLAutoOpBuilder works for ops with syclomatic?

add SYCLAutoOpBuilder below.

return FusedAdamBuilder
from deepspeed.ops.op_builder.xpu import CPUAdagradBuilder, CPUAdamBuilder, FusedAdamBuilder, AsyncIOBuilder, InferenceBuilder

if class_name == "InferenceBuilder":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to turn off CPUAdagradBuilder, CPUAdamBuilder, FusedAdamBuilder, AsyncIOBuilder. Is this intended?

.gitignore Outdated

# Build + installation data
build/
third-party/
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is third-party/ used to store generated sycl kernels? What will the directory structure be under third-party?

find ./deepspeed/third-party/ -type f -exec sed -i "s/at::kCUDA/at::kXPU/g" {} +

# fix pt_binding.cpp torch::from_blob 4 inputs pattern
patch ./deepspeed/third-party/csrc/transformer/inference/csrc/pt_binding.cpp << 'DIFF___'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the error intend to fix? Why can't it be fixed in the source code directory? Is it temporary or will persist? What do we need to do when pt_binding.cpp get changed?

return dpcpp_ext

def sycl_extension(self):
if self.is_sycl_enabled():
Copy link
Owner

@delock delock Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is very long. Can we extract two smaller functions? one for include, one for source.

trans_cmd = c2s_cmd + cuda_inc_flag + extra_args + in_root + out_root + cuda_source
print("**** processing ", f'{trans_cmd}')
p = subprocess.Popen(f'{trans_cmd}', stdout=subprocess.PIPE, shell=True)
# processes_running.append(p)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove code that is not needed.

find ./build/csrc -type f -exec sed -i "s/torch::from_blob/at::from_blob/g" {} +

# fix inference_context.h to make it could be migrate
patch ./build/csrc/transformer/inference/includes/inference_context.h << 'DIFF___'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't change source code directly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't change source code directly?

add changes in source codes.

@delock
Copy link
Owner

delock commented Oct 20, 2023

@baodii comments added. Please also fix format error by turning on pre-commit in your environment.

@baodii
Copy link
Author

baodii commented Oct 23, 2023

@baodii comments added. Please also fix format error by turning on pre-commit in your environment.

some license check in csrc/xpu folder not pass.

@baodii
Copy link
Author

baodii commented Oct 24, 2023

@delock I have fixed these issues. Please review.

@delock delock merged commit c9ef1ef into delock:gma/xpu_upstream Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants