Closed
Description
I get the following error when building main on ROCm with ops enabled.
Bug:
cd DeepSpeed
DS_BUILD_FUSED_ADAM=1 pip install .
Processing /myworkspace/DeepSpeed
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [21 lines of output]
[2024-10-04 18:00:35,186] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-10-04 18:00:35,821] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/myworkspace/DeepSpeed/setup.py", line 200, in <module>
ext_modules.append(builder.builder())
File "/myworkspace/DeepSpeed/op_builder/builder.py", line 711, in builder
compile_args['cxx'].append('-DROCM_WAVEFRONT_SIZE=%s' % self.get_rocm_wavefront_size())
File "/myworkspace/DeepSpeed/op_builder/builder.py", line 276, in get_rocm_wavefront_size
result = subprocess.check_output(rocm_wavefront_size_cmd)
File "/opt/conda/envs/py_3.9/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/opt/conda/envs/py_3.9/lib/python3.9/subprocess.py", line 505, in run
with Popen(*popenargs, **kwargs) as process:
File "/opt/conda/envs/py_3.9/lib/python3.9/subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/opt/conda/envs/py_3.9/lib/python3.9/subprocess.py", line 1837, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: "/opt/rocm/bin/rocminfo | grep -Eo -m1 'Wavefront Size:[[:space:]]+[0-9]+' | grep -Eo '[0-9]+'"
DS_BUILD_OPS=0
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
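This is caused by the removal of "shell=True" in the security fix 659f6be. Without a shell, subprocess.check_output treats the entire pipeline string as the name of a single executable, which does not exist, hence the FileNotFoundError above.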
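The failure mode can be reproduced without ROCm. The sketch below is illustrative only: the pipeline string and the helper name `run_without_shell` are hypothetical stand-ins for the rocminfo command in the traceback, and the behavior shown assumes a POSIX system.

```python
import subprocess

# Hypothetical pipeline string standing in for the rocminfo command above.
PIPELINE = "echo 64 | grep -Eo '[0-9]+'"


def run_without_shell(command):
    """Run a command string without shell=True and report what happens.

    Returns the name of the exception raised, or None if the call succeeded.
    """
    try:
        # Without a shell, the whole string is looked up as one program name.
        subprocess.check_output(command)
    except FileNotFoundError as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(run_without_shell(PIPELINE))
```

Because no shell interprets the `|` characters, the operating system searches for a program literally named after the whole string and fails before anything is executed.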
Resolution:
I propose updating the script to use subprocess.run instead of subprocess.check_output:
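@staticmethod
def get_rocm_wavefront_size():
    # Return the cached value if it has already been determined
    if OpBuilder._rocm_wavefront_size:
        return OpBuilder._rocm_wavefront_size
    rocm_info = Path("/opt/rocm/bin/rocminfo")
    if not rocm_info.is_file():
        rocm_info = Path("rocminfo")
    try:
        # Run rocminfo on its own. A "|" placed in an argument list is passed
        # to rocminfo as a literal argument and does not create a pipe, so the
        # filtering previously done by grep is done in Python instead.
        # check=True is needed for CalledProcessError to be raised at all;
        # text=True returns str rather than bytes.
        result = subprocess.run([str(rocm_info)], capture_output=True, text=True, check=True)
        # Requires "import re" at the top of builder.py
        match = re.search(r"Wavefront Size:\s+([0-9]+)", result.stdout)
        rocm_wavefront_size = match.group(1) if match else "32"
    except (subprocess.CalledProcessError, FileNotFoundError):
        # rocminfo failed or is not installed; fall back to the default
        rocm_wavefront_size = "32"
    OpBuilder._rocm_wavefront_size = rocm_wavefront_size
    return OpBuilder._rocm_wavefront_size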
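One shell-free way to extract the value is to parse the rocminfo output in Python rather than piping it through grep. The following is a minimal sketch under stated assumptions: `parse_wavefront_size` is a hypothetical helper, not part of DeepSpeed, and `SAMPLE` is an illustrative line written for this example, not captured rocminfo output. The "32" default mirrors the fallback in the proposal above.

```python
import re


def parse_wavefront_size(rocminfo_output, default="32"):
    """Return the first wavefront size found in the text, as a string.

    Equivalent in spirit to:
      grep -Eo -m1 'Wavefront Size:[[:space:]]+[0-9]+' | grep -Eo '[0-9]+'
    Falls back to `default` when no match is found.
    """
    match = re.search(r"Wavefront Size:\s+([0-9]+)", rocminfo_output)
    return match.group(1) if match else default


# Illustrative input only (assumed format, not real captured output).
SAMPLE = "  Wavefront Size:          64(0x40)\n"

if __name__ == "__main__":
    print(parse_wavefront_size(SAMPLE))
    print(parse_wavefront_size("no wavefront line here"))
```

Keeping the parsing in a pure function makes it testable on machines that do not have ROCm installed.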