[Fix] refactor copysign #60968

Merged
merged 1 commit into from
Jan 19, 2024

@cocoshe cocoshe commented Jan 19, 2024

PR types

Bug fixes

PR changes

OPs

Description

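The previously merged copysign PR was recently found to break Paddle's internal packaging: the multi-architecture package build fails, possibly because of its heavy use of templates.

Reproducing the problem:

Docker image: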

registry.baidubce.com/paddlepaddle/paddle_manylinux_devel:cuda12.0-cudnn8.9-trt8.6-gcc12.2

Build script:

export LD_LIBRARY_PATH=/opt/_internal/cpython-3.10.0/lib/:${LD_LIBRARY_PATH}
export PATH=/opt/_internal/cpython-3.10.0/bin/:${PATH}

cmake .. -DCMAKE_BUILD_TYPE=Release -DWITH_GPU=ON -DWITH_CUDNN_DSO=OFF -DWITH_SHARED_PHI=ON -DWITH_TENSORRT=ON -DWITH_ROCM=OFF -DWITH_CINN=OFF -DWITH_DISTRIBUTE=ON -DWITH_MKL=ON -DWITH_AVX=ON -DCUDA_ARCH_NAME=Manual -DNEW_RELEASE_PYPI=OFF -DNEW_RELEASE_ALL=OFF -DNEW_RELEASE_JIT=OFF -DWITH_PYTHON=ON -DCUDNN_ROOT=/usr/ -DWITH_TESTING=OFF -DWITH_COVERAGE=OFF -DWITH_INCREMENTAL_COVERAGE=OFF -DCMAKE_MODULE_PATH=/opt/rocm/hip/cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DWITH_INFERENCE_API_TEST=OFF -DINFERENCE_DEMO_INSTALL_DIR=/root/.cache/inference_demo -DPY_VERSION=3.10 -DCMAKE_INSTALL_PREFIX=/paddle/build -DWITH_PSCORE=ON -DWITH_PSLIB=OFF -DWITH_GLOO=ON -DLITE_GIT_TAG=release/v2.10 -DWITH_XPU=OFF -DWITH_IPU=OFF -DXPU_SDK_ROOT= -DWITH_LITE=OFF -DWITH_XPU_BKCL=OFF -DWITH_ARM=OFF -DWITH_STRIP=ON -DON_INFER=ON -DWITH_HETERPS=OFF -DWITH_GPU_GRAPH=OFF -DCUDA_ARCH_BIN="70 75 80 86 90" -DWITH_RECORD_BUILDTIME=OFF -DWITH_UNITY_BUILD=OFF -DWITH_ONNXRUNTIME=OFF -DWITH_CUDNN_FRONTEND=OFF

Error message before the change:
6a1df3c94ea944c457aff0d1221cf31

The .o file compiled from the original CUDA implementation is fairly large:
image

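The likely cause is that the original implementation defined its own copy of the copysign_func template in both the forward and the backward code. copysign is now merged into the elementwise family, and the copysign_func template and the copysign-related Functors are moved under funcs so that the forward and backward implementations share them.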
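The sharing pattern described above can be sketched roughly as follows. This is a minimal host-side C++ illustration, not the code from this PR: the namespace `sketch`, the exact functor signatures, and `self_check` are assumptions made for the example, and the real kernels run through Paddle's elementwise machinery on the GPU.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch: one shared helper, reused by forward and backward.
namespace sketch {

// Single shared definition: magnitude of a, sign of b.
template <typename T>
inline T copysign_func(T a, T b) {
  return std::copysign(a, b);
}

// Forward functor: out = copysign(x, y).
template <typename T>
struct CopySignFunctor {
  T operator()(T x, T y) const { return copysign_func(x, y); }
};

// Backward functor for x: the derivative is +1 or -1 depending on
// whether the sign was flipped, and is taken as 0 at x == 0.
template <typename T>
struct CopySignGradXFunctor {
  T operator()(T x, T y, T dout) const {
    if (x == static_cast<T>(0)) return static_cast<T>(0);
    return dout * (copysign_func(x, y) / x);
  }
};

// Backward functor for y: y only contributes its sign, so its gradient is 0.
template <typename T>
struct CopySignGradYFunctor {
  T operator()(T, T, T) const { return static_cast<T>(0); }
};

// Small usage example; both passes go through the same copysign_func.
inline void self_check() {
  assert(CopySignFunctor<float>()(2.0f, -5.0f) == -2.0f);
  assert(CopySignGradXFunctor<float>()(2.0f, -5.0f, 1.0f) == -1.0f);
}

}  // namespace sketch
```

Because the forward and backward functors include one header-level helper instead of each carrying its own copy, there is a single definition of the template to maintain across the forward and backward translation units.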

Intermediate build artifacts for the elementwise ops:

  • Before merging copysign in:
    image

  • After merging copysign in:
    image

The forward cu.o grows by 5.2M and the backward cu.o by 4.0M; the build now passes with the image and cmake options above.

paddle-bot bot commented Jan 19, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Jan 19, 2024

@zoooo0820 zoooo0820 left a comment

LGTM

@jeff41404 jeff41404 left a comment

LGTM

@jeff41404 jeff41404 merged commit 99717e9 into PaddlePaddle:develop Jan 19, 2024
29 checks passed
@cocoshe cocoshe deleted the copysign_coco_dev branch January 21, 2024 10:54
4 participants