
sync branches: develop and infrt #39509

Merged: 59 commits, Feb 16, 2022
9d4d0c3
[Pten] Adjust the Empty dev_api (#39143)
zyfncg Feb 9, 2022
2a5d858
Fix code conflict of empty dev_api (#39430)
zyfncg Feb 10, 2022
63d2333
[PluggableDevice] custom kernel supports multi cpp_dtype registering …
Aganlengzi Feb 10, 2022
c7c1db3
[PTen] Add standard kernel suffix set (#39404)
chenwhql Feb 10, 2022
14ed2f5
[pten] update isnan registration (#39419)
zhiqiu Feb 10, 2022
e8ac7fc
[bf16] add bf16 kernel: dropout & reshape & slice (#39395)
zhangbo9674 Feb 10, 2022
59c7aea
[bf16] add bf16 kernel: squeeze & unsqueeze & stack (#39402)
zhangbo9674 Feb 10, 2022
224bc51
Modify the unsqueeze dimension of input data in conv1d NCL And NLC fo…
Zjq9409 Feb 10, 2022
7b70b79
[Pten] Refactor C++ API code-gen (#39408)
zyfncg Feb 10, 2022
32d79bb
Refactored Python-C Attributes Parsing Functions (#39328)
jim19930609 Feb 10, 2022
c47d672
Add _get_parameter method to Lamb optimizer (#39416)
sneaxiy Feb 10, 2022
52d6b30
mkldnn layout issue fix (#39422)
b3602sss Feb 10, 2022
8b58862
fix compile error on jetson (#39441)
jiweibo Feb 10, 2022
e2ad433
move Masked select to pten (#39193)
phlrain Feb 10, 2022
238f3c8
[PaddlePaddle Hackathon] 31. Add Java frontend for Paddle Inference (…
chenyanlann Feb 10, 2022
f7a3389
fix check error of ResetHolder (#39439)
zyfncg Feb 10, 2022
43f84d0
Added python-c code generation for final state Eager Dygraph (#39233)
jim19930609 Feb 10, 2022
29d3160
change dtype of pooling mask to 'int32' for Paddle2ONNX (#39314)
weisy11 Feb 10, 2022
35b03e1
share MemOptVarInfos of external variables into cinn_launch subgraph …
CtfGo Feb 10, 2022
2b8b16d
[NPU] add reduce_min (#39019)
windstamp Feb 10, 2022
383de29
[MLU] add mlu kernel for accuracy op (#39337)
fwenguang Feb 10, 2022
1252f4b
[Dy2St]Handle `a, b = paddle.shape(x)` in Static Analysis (#39245)
0x45f Feb 10, 2022
7d6096f
[Pten] Auto-Generate InterMeta register (#39436)
zyfncg Feb 11, 2022
bf30503
Support different dtypes of inputs for elementwise ops (#38859)
zhangting2020 Feb 11, 2022
f38c2e5
Add profiler node tree implementation (#39316)
rainyfly Feb 11, 2022
8803f6b
add print pten kernel tool (#39371)
shangzhizhou Feb 11, 2022
7392578
[new-exec] set type of op-kernel op by place (#39458)
zhiqiu Feb 11, 2022
7e52bea
Add log for executor (#39459)
liutiexing Feb 11, 2022
1c44d3e
[Paddle Inference] support ernie quant model with interleaved (#39424)
Wangzheee Feb 11, 2022
22c67d1
Unify ps development - python (#39431)
ziyoujiyi Feb 11, 2022
667bd96
[PTen] Move grad GetExpectedPtenKernelArgs into pten (#39418)
chenwhql Feb 11, 2022
be8ab0e
fix compilation warning on mac (#39438)
Feb 11, 2022
72ad280
get build time (#39368)
lelelelelez Feb 11, 2022
c86765e
fix prelu trt convert (#39389)
JZZ-NOTE Feb 11, 2022
a117497
Optimize bilinear interpolation forward (#39243)
AshburnLee Feb 11, 2022
2ea15fc
Optimize performance of softmax_bwd when axis!=-1 (#38609)
ZzSean Feb 11, 2022
d763a91
[PTen] Remove pten core's dependency on fluid xxx_info.h (#39401)
chenwhql Feb 11, 2022
d25a7f9
[Pten] move operators/math/math_function_* to pten/kernels/func (#39300)
Feb 11, 2022
702bce5
[MLU] add pool2d and pool2d_grad mlu kernel (#39453)
fwenguang Feb 11, 2022
89aa8b1
[MLU]support c_gen_cncl_id_op run on MLU device (#39336)
kangna-qi Feb 11, 2022
1e6047f
[bf16] add bf16 kernel: transpose & unbind (#39457)
zhangbo9674 Feb 11, 2022
02f0670
uniform_random op for mlu (#39450)
joeqiao12 Feb 11, 2022
2db25f0
[MLU] add pool2d pytest (#39454)
fwenguang Feb 11, 2022
52bbaae
Added shape (U)INT8/BF16/FP32 oneDNN kernel (#36033)
jakpiase Feb 11, 2022
575fa0f
move memcpy.h into cc file (#39469)
chenwhql Feb 11, 2022
69793a2
Add TensorRT inspector into Paddle-TRT (#38362)
leo0519 Feb 11, 2022
739da6c
Fix add profiler node tree implementation cmake error (#39474)
rainyfly Feb 11, 2022
bdeb479
unify naming style (#39481)
chenwhql Feb 12, 2022
74a150f
[Pten] Generate Wrapped InferMeta by Yaml (#39482)
zyfncg Feb 13, 2022
ec8a0c1
Adjusted python-level trace_op to accomodate final state Eager Dygrap…
jim19930609 Feb 14, 2022
9722994
Fixed get_tensor method for EagerTensor (#39414)
jim19930609 Feb 14, 2022
db11357
[Approver Update] update check approver of qili93, test=document_fix …
qili93 Feb 14, 2022
1b9e679
[MLU] add mlu kernel for c_broadcast op (#39470)
maxhuiy Feb 14, 2022
9ba3f42
update xpu test build script and fix get_test_cover_info, *test=kunlu…
tangzhiyi11 Feb 14, 2022
d12c363
fix gather_nd, *test=kunlun (#39283)
tangzhiyi11 Feb 14, 2022
d0df563
[pten] add split kernel (#39060)
MingMingShangTian Feb 14, 2022
e07420b
new way of test cases, *test=kunlun (#39444)
helen88 Feb 14, 2022
ddb1e23
[PTen] Add HasAttr for ArgumentMappingContext (#39464)
chenwhql Feb 14, 2022
55da934
[ROCm] fix missing dcu kernel in operator.cmake, test=develop (#39480)
qili93 Feb 14, 2022
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -9,6 +9,7 @@ paddle/pten/api/lib/api.cc
paddle/pten/api/backward/backward_api.h
paddle/pten/api/lib/backward_api.cc
paddle/pten/include/*
paddle/pten/infermeta/generated.*
paddle/pten/extension.h
paddle/fluid/eager/api/generated/*

1 change: 1 addition & 0 deletions AUTHORS.md
@@ -83,3 +83,4 @@
| jeng1220 | Bai-Cheng(Ryan) Jeng (NVIDIA) |
| mingxu1067 | Ming Huang (NVIDIA) |
| zlsh80826 | Reese Wang (NVIDIA) |
| leo0519 | Leo Chen (NVIDIA) |
10 changes: 9 additions & 1 deletion CMakeLists.txt
@@ -242,6 +242,15 @@ option(NEW_RELEASE_CUBIN   "PaddlePaddle next-level release strategy for pypi cu
option(NEW_RELEASE_JIT "PaddlePaddle next-level release strategy for backup jit package" OFF)
option(WITH_ASCEND_INT64 "Compile with int64 kernel for ascend NPU" OFF)
option(WITH_POCKETFFT "Compile with pocketfft support" ON)
option(WITH_RECORD_BUILDTIME "Compile PaddlePaddle with record all targets build time" OFF)

if(WITH_RECORD_BUILDTIME)
set_property(GLOBAL PROPERTY RULE_LAUNCH_COMPILE "${CMAKE_CURRENT_SOURCE_DIR}/tools/get_build_time.sh")
set_property(GLOBAL PROPERTY RULE_LAUNCH_LINK "${CMAKE_CURRENT_SOURCE_DIR}/tools/get_build_time.sh")
else()
include(ccache) # set ccache for compilation; ccache cannot be used when WITH_RECORD_BUILDTIME=ON
endif()
unset(WITH_RECORD_BUILDTIME CACHE)

# PY_VERSION
if(NOT PY_VERSION)
@@ -382,7 +391,6 @@ if(WITH_PROFILER)
add_definitions(-DWITH_GPERFTOOLS)
endif()

include(ccache) # set ccache for compilation
include(util) # set unittest and link libs
include(version) # set PADDLE_VERSION
include(coveralls) # set code coverage
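The new `WITH_RECORD_BUILDTIME` option works by installing `tools/get_build_time.sh` as CMake's global `RULE_LAUNCH_COMPILE`/`RULE_LAUNCH_LINK` launcher, so every compile and link command is prefixed by the script and can be timed. The real launcher is a shell script; the following is only an illustrative Python sketch of the pattern (the function and log-file names are assumptions, not Paddle's):

```python
import subprocess
import sys
import time

def timed_launch(argv, log_path="build_time.log"):
    """Run the real compile/link command and append its wall time to a log.

    Illustrative analogue of tools/get_build_time.sh: a RULE_LAUNCH_COMPILE
    launcher receives the full compiler command line as its arguments,
    must forward it unchanged, and must propagate the exit code.
    """
    start = time.time()
    result = subprocess.run(argv)          # execute the wrapped command
    elapsed = time.time() - start
    with open(log_path, "a") as log:       # one tab-separated record per command
        log.write("%.3f\t%s\n" % (elapsed, " ".join(argv)))
    return result.returncode

# Example: wrap an arbitrary command the way CMake would wrap the compiler.
rc = timed_launch([sys.executable, "-c", "pass"])
```

Because the launcher intercepts every rule, it is mutually exclusive with `ccache` in the build above, which is why the `else()` branch only includes ccache when the option is off.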
1 change: 0 additions & 1 deletion cmake/cuda.cmake
@@ -2,7 +2,6 @@ if(NOT WITH_GPU)
return()
endif()


if(WITH_NV_JETSON)
add_definitions(-DWITH_NV_JETSON)
set(paddle_known_gpu_archs "53 62 72")
11 changes: 11 additions & 0 deletions cmake/operators.cmake
@@ -335,6 +335,17 @@ function(op_library TARGET)
endif()
endforeach()

# pybind USE_OP_DEVICE_KERNEL for ROCm
list (APPEND hip_srcs ${hip_cc_srcs})
# message("hip_srcs ${hip_srcs}")
foreach(hip_src ${hip_srcs})
set(op_name "")
find_register(${hip_src} "REGISTER_OP_CUDA_KERNEL" op_name)
if(NOT ${op_name} EQUAL "")
file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(${op_name}, CUDA);\n")
set(pybind_flag 1)
endif()
endforeach()

# pybind USE_OP_DEVICE_KERNEL for CUDNN/MIOPEN
list(APPEND cudnn_cu_srcs ${cudnn_cu_cc_srcs})
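The loop added to `operators.cmake` scans each HIP source for a `REGISTER_OP_CUDA_KERNEL(op_name, ...)` registration and, when one is found, appends a matching `USE_OP_DEVICE_KERNEL(op_name, CUDA);` line to the generated pybind file. A hedged Python sketch of that scan-and-emit step (function name and the in-memory `sources` mapping are illustrative; the real logic lives in CMake's `find_register`):

```python
import re

def emit_use_op_device_kernel(sources, device="CUDA"):
    """Scan source texts for REGISTER_OP_CUDA_KERNEL(op_name, ...) and
    emit pybind USE_OP_DEVICE_KERNEL lines, mirroring the cmake
    find_register / file(APPEND ...) loop. `sources` maps file name
    to file contents; this is a sketch, not Paddle's implementation."""
    pattern = re.compile(r"REGISTER_OP_CUDA_KERNEL\(\s*([A-Za-z0-9_]+)\s*,")
    lines = []
    for name, text in sources.items():
        match = pattern.search(text)
        if match:  # analogous to: if op_name was found in the file
            lines.append("USE_OP_DEVICE_KERNEL(%s, %s);" % (match.group(1), device))
    return lines
```

The emitted lines are what lets the Python bindings pull in the device kernels that were registered only in `.hip` sources, which is the missing-DCU-kernel fix referenced by commit 55da934 (#39480).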
@@ -35,12 +35,12 @@ limitations under the License. */
#include "paddle/fluid/framework/variable.h"
#include "paddle/fluid/framework/variable_helper.h"
#include "paddle/fluid/operators/math/blas.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/operators/math/selected_rows_functor.h"
#include "paddle/fluid/platform/device_context.h"
#include "paddle/fluid/platform/enforce.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/fluid/string/split.h"
#include "paddle/pten/kernels/funcs/math_function.h"

#include "paddle/fluid/distributed/ps/service/ps_client.h"

@@ -180,7 +180,7 @@ inline void MergeVars(const std::string &var_name,

// set output tensor to 0.
paddle::platform::CPUDeviceContext cpu_ctx;
paddle::operators::math::SetConstant<paddle::platform::CPUDeviceContext, T>
pten::funcs::SetConstant<paddle::platform::CPUDeviceContext, T>
constant_functor;
constant_functor(cpu_ctx, out_t, static_cast<T>(0));
// sum all vars to out
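This hunk only swaps `paddle::operators::math::SetConstant` for the relocated `pten::funcs::SetConstant`; the behavior of `MergeVars` is unchanged: zero the output tensor, then accumulate every input var into it. A minimal Python sketch of that reduction, with plain lists standing in for tensors (names and the equal-shape assumption are illustrative):

```python
def merge_vars(var_list):
    """Sketch of the MergeVars reduction: the output is first set to a
    constant 0 (the SetConstant functor), then every input tensor is
    summed into it element-wise. Assumes all inputs share one shape."""
    assert var_list, "need at least one input var"
    size = len(var_list[0])
    out = [0.0] * size              # SetConstant(ctx, out_t, 0)
    for var in var_list:            # sum all vars into out
        assert len(var) == size
        for i, value in enumerate(var):
            out[i] += value
    return out
```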
@@ -38,9 +38,10 @@
#include "paddle/fluid/distributed/ps/service/ps_service/service.h"
#include "paddle/fluid/distributed/ps/service/sendrecv.pb.h"
#include "paddle/fluid/framework/program_desc.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/fluid/string/printf.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace paddle {
namespace distributed {
class GraphPyService {
3 changes: 1 addition & 2 deletions paddle/fluid/distributed/test/brpc_service_dense_sgd_test.cc
@@ -21,8 +21,8 @@ limitations under the License. */
#include "paddle/fluid/distributed/ps/service/brpc_ps_server.h"
#include "paddle/fluid/framework/program_desc.h"
#include "paddle/fluid/framework/scope.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace paddle {
namespace distributed {
@@ -42,7 +42,6 @@ class DenseTensor;
namespace framework = paddle::framework;
namespace platform = paddle::platform;
namespace operators = paddle::operators;
namespace math = paddle::operators::math;
namespace memory = paddle::memory;
namespace distributed = paddle::distributed;

@@ -22,8 +22,8 @@ limitations under the License. */
#include "paddle/fluid/distributed/ps/service/brpc_ps_server.h"
#include "paddle/fluid/distributed/ps/service/env.h"
#include "paddle/fluid/framework/program_desc.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace paddle {
namespace distributed {
@@ -43,7 +43,6 @@ class DenseTensor;
namespace framework = paddle::framework;
namespace platform = paddle::platform;
namespace operators = paddle::operators;
namespace math = paddle::operators::math;
namespace memory = paddle::memory;
namespace distributed = paddle::distributed;

9 changes: 4 additions & 5 deletions paddle/fluid/distributed/test/brpc_utils_test.cc
@@ -17,7 +17,7 @@ limitations under the License. */
#include "gtest/gtest.h"

#include "paddle/fluid/distributed/ps/service/brpc_utils.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace paddle {
namespace framework {
@@ -28,7 +28,6 @@ class Variable;
namespace framework = paddle::framework;
namespace platform = paddle::platform;
namespace operators = paddle::operators;
namespace math = paddle::operators::math;
namespace memory = paddle::memory;
namespace distributed = paddle::distributed;

@@ -42,7 +41,7 @@ void CreateVarsOnScope(framework::Scope* scope, platform::Place* place,
lod1.push_back(framework::Vector<size_t>({1, 3, 8}));
tensor1->set_lod(lod1);
tensor1->mutable_data<float>(*place);
math::set_constant(ctx, tensor1, 31.9);
pten::funcs::set_constant(ctx, tensor1, 31.9);

// var 2
framework::Variable* var2 = scope->Var("x2");
@@ -52,7 +51,7 @@
lod2.push_back(framework::Vector<size_t>({1, 1}));
tensor2->set_lod(lod2);
tensor2->mutable_data<int>(*place);
math::set_constant(ctx, tensor2, 100);
pten::funcs::set_constant(ctx, tensor2, 100);

// var 3
framework::Variable* var3 = scope->Var("x3");
@@ -62,7 +61,7 @@
auto* rows = slr->mutable_rows();
tensor3->Resize(framework::make_ddim({564, 128}));
tensor3->mutable_data<float>(*place);
math::set_constant(ctx, tensor3, 32.7);
pten::funcs::set_constant(ctx, tensor3, 32.7);
for (int i = 0; i < 564; ++i) rows->push_back(i);
}

3 changes: 1 addition & 2 deletions paddle/fluid/distributed/test/graph_node_split_test.cc
@@ -36,14 +36,13 @@ limitations under the License. */
#include "paddle/fluid/framework/scope.h"
#include "paddle/fluid/framework/tensor_util.h"
#include "paddle/fluid/framework/variable.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/fluid/string/printf.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace framework = paddle::framework;
namespace platform = paddle::platform;
namespace operators = paddle::operators;
namespace math = paddle::operators::math;
namespace memory = paddle::memory;
namespace distributed = paddle::distributed;

3 changes: 1 addition & 2 deletions paddle/fluid/distributed/test/graph_node_test.cc
@@ -36,14 +36,13 @@ limitations under the License. */
#include "paddle/fluid/framework/scope.h"
#include "paddle/fluid/framework/tensor_util.h"
#include "paddle/fluid/framework/variable.h"
#include "paddle/fluid/operators/math/math_function.h"
#include "paddle/fluid/platform/place.h"
#include "paddle/fluid/string/printf.h"
#include "paddle/pten/kernels/funcs/math_function.h"

namespace framework = paddle::framework;
namespace platform = paddle::platform;
namespace operators = paddle::operators;
namespace math = paddle::operators::math;
namespace memory = paddle::memory;
namespace distributed = paddle::distributed;

@@ -24,3 +24,13 @@ add_custom_target(eager_final_state_codegen
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_nodes_h_path} ${nodes_h_path}
VERBATIM
)

set(tmp_python_c_output_path "${PADDLE_SOURCE_DIR}/paddle/fluid/pybind/tmp_eager_final_state_op_function_impl.h")
set(python_c_output_path "${PADDLE_SOURCE_DIR}/paddle/fluid/pybind/eager_final_state_op_function_impl.h")
add_custom_target(eager_final_state_python_c_codegen
COMMAND "${PYTHON_EXECUTABLE}" "${PADDLE_SOURCE_DIR}/paddle/fluid/eager/auto_code_generator/final_state_generator/python_c_gen.py"
"--api_yaml_path=${api_yaml_path}"
"--output_path=${tmp_python_c_output_path}"
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${tmp_python_c_output_path} ${python_c_output_path}
VERBATIM
)
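The new `eager_final_state_python_c_codegen` target follows the same pattern as the node codegen above it: generate into a `tmp_` header, then `cmake -E copy_if_different` it over the real header, so an unchanged generated file keeps its timestamp and does not trigger downstream recompiles. A hedged Python sketch of that copy step (function name is illustrative; the real tool is CMake's):

```python
import filecmp
import os
import shutil

def copy_if_different(tmp_path, out_path):
    """Mimic `cmake -E copy_if_different`: overwrite out_path only when
    the freshly generated tmp file actually differs, so identical output
    leaves out_path (and its mtime) untouched. Illustrative sketch."""
    if os.path.exists(out_path) and filecmp.cmp(tmp_path, out_path, shallow=False):
        return False       # identical contents: skip the copy
    shutil.copyfile(tmp_path, out_path)
    return True            # contents differed (or out_path was new)
```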