update #5

AnnaTrainingG · 2021-05-25T02:10:37Z

PR types

PR changes

Describe

* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596) (#32610)
* [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
* remove packages in __all__
* create new public api level paddle.callbacks;paddle.hub;paddle.utils.unique_name

Co-authored-by: Zhong Hui <zhonghui.net@gmail.com>

) Remove np Deprecation Warning since `np.bool` is alias of `bool`

The warning report from test:

```
2021-04-30 15:29:32 /workspace/Paddle/build/python/paddle/fluid/framework.py:689: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
2021-04-30 15:29:32 Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
2021-04-30 15:29:32 elif dtype == np.bool:
2021-04-30 15:29:32 /workspace/Paddle/build/python/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
2021-04-30 15:29:32 return (isinstance(seq, collections.Sequence) and
2021-04-30 15:29:32 /workspace/Paddle/build/python/paddle/fluid/tests/unittests/test_cond.py:99: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
```
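Both deprecations in the warning report have direct replacements that the warning text itself names. The following is a minimal standard-library sketch of the `collections.abc` case; the helper name `is_sequence` is hypothetical and is not the code in `paddle/fluid/layers/utils.py`. The `np.bool` case is analogous: per the warning, use the builtin `bool`, or `np.bool_` when the NumPy scalar type is specifically wanted.

```python
import collections.abc


def is_sequence(seq):
    """Return True for list-like containers, but not for plain strings.

    Hypothetical helper for illustration only; it is not Paddle's code.
    """
    # The top-level alias collections.Sequence is deprecated (and no longer
    # available in Python 3.10+); collections.abc.Sequence is the supported name.
    return isinstance(seq, collections.abc.Sequence) and not isinstance(seq, str)


if __name__ == "__main__":
    print(is_sequence([1, 2, 3]))
    print(is_sequence("abc"))
```

Checking against `collections.abc.Sequence` keeps the same behaviour as the deprecated spelling while avoiding the warning.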

…33055) * graph engine demo * upload unsaved changes * fix dependency error * fix shard_num problem * py client * remove lock and graph-type * add load direct graph * add load direct graph * add load direct graph * batch random_sample * batch_sample_k * fix num_nodes size * batch brpc * batch brpc * add test * add test * add load_nodes; change add_node function * change sample return type to pair * resolve conflict * resolved conflict * resolved conflict * separate server and client * merge pair type * fix * resolved conflict * fixed segment fault; high-level VLOG for load edges and load nodes * random_sample return 0 * rm useless loop * test:load edge * fix ret -1 * test: rm sample * rm sample * random_sample return future * random_sample return int * test fake node * fixed here * memory leak * remove test code * fix return problem * add common_graph_table * random sample node &test & change data-structure from linkedList to vector * add common_graph_table * sample with srand * add node_types * optimize nodes sample * recover test * random sample * destruct weighted sampler * GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * WeightedGraphEdgeBlob to GraphEdgeBlob * pybind sample nodes api * pull nodes with step * fixed pull_graph_list bug; add test for pull_graph_list by step * add graph table;name * add graph table;name * add pybind * add pybind * add FeatureNode * add FeatureNode * add FeatureNode Serialize * add FeatureNode Serialize * get_feat_node * avoid local rpc * fix get_node_feat * fix get_node_feat * remove log * get_node_feat return py:bytes * merge develop with graph_engine * fix threadpool.h head * fix * fix typo * resolve conflict * fix conflict * recover lost content * fix pybind of FeatureNode * recover cmake * recover tools * resolve conflict * resolve linking problem * code style * change test_server port * fix code problems * remove shard_num config * remove redundent threads * optimize start server * remove logs * fix code problems by 
reviewers' suggestions * move graph files into a folder * code style change * remove graph operations from base table * optimize get_feat function of graph engine * fix long long count problem * remove redandunt graph files * remove unused shell * recover dropout_op_pass.h * fix potential stack overflow when request number is too large & node add & node clear & node remove Co-authored-by: Huang Zhengjie <270018958@qq.com> Co-authored-by: Weiyue Su <weiyue.su@gmail.com> Co-authored-by: suweiyue <suweiyue@baidu.com> Co-authored-by: luobin06 <luobin06@baidu.com> Co-authored-by: liweibin02 <liweibin02@baidu.com> Co-authored-by: tangwei12 <tangwei12@baidu.com>

* Add elementwise_sub_mkldnn_op without grad * Add test to static_mode_white_list * Refactor code, change license years * Remove invalid grad implementation * Fix element_wise_sub_op test * Fix CI Approval error * Remove unnecessary EltwiseSubMKLDNNGradKernel class * Fix CI Approval 2 * Fix CI Approval 3 * Fix CI Approval Attempt #4 * Fix CI Approve Attempt #5 * Fix CI Approval Attempt #6 * Fix CI Approval Attemt #7 * Change test names containing add to sub * Fix old tests testing add instead of sub * Copy grad implementation from elementwise_add_mkldnn * CI test fix attempt * Revert "CI test fix attempt" This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900. * Fix CI attempt 2 * Fix elementwise_sub tests, temporary mkldnn broadcast test disable * Add working implementation of elementwise_sub grad * Fix build errors caused by pull * Fix format error * Fix format error 2 * Disable elementwise_sub_mkldnn test on GPU * Apply fix for paddle.fluid import * Revert changes of test_elementwise_sub and Fix mkldnn test * Revert "Apply fix for paddle.fluid import" This reverts commit fc3b122. * fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (PaddlePaddle#35862) * Add changes suggested by reviewers * Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI * Remove check_dygraph=False to satisify CI Approval Co-authored-by: zhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>

…t=allcases (PaddlePaddle#38632) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes too pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues

…lePaddle#39085) * updates callers, test=develop * updates tensor, test=develop * fixes errors, test=develop * remove some dtypes, test=develop * fix errors in the base storage modification, test=develop * fixes a bug, test=develop * fixes the bugs in push the whole, test=develop * updates, test=develop * update * update, test=develop * fixes the mac-py3 CI, test=develop * remove the storage impl, test=develop * updates some codes, test=develop * update, test=develop * updates pten allocation, test=develop

…ddle#39236) * Added selected_rows and rw_lock to pten * Renamed the unit test target to fix CI * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid * Remove rw_lock.h,rw_lock_test.cc in fluid * Use pten::RWLock and pten::AutoRDLock, fix CI * Use pten::SelectedRows * Use pten::SelectedRows * Fix to pass NPU CI * Selected_Rows inherits from TensorBase * Use pten::SelectedRows, to pass NPU CI * To fix NPU CI * To fix NPU CI again * Use paddle/pten/core/enforce and polish code * Use pten::DataType instead of using proto_type * Move part of data_type to pten * Polish Code

…sed to paddle.grad() (PaddlePaddle#41198) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues

…rd run (PaddlePaddle#41306) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run * Fixed minor issues * Fixed issue with backward graph construction logic * Fixed implementation issues with backward graph reconstruction * Fixed unittest issue * Fixed issues

…ePaddle#41387) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run * Fixed minor issues * Fixed issue with backward graph construction logic * Fixed implementation issues with backward graph reconstruction * Fixed unittest issue * Fixed issues * [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul * Fixed issues with phi kernel * Added triple grad test case * Fixed minor issue

chenwhql and others added 30 commits May 7, 2021 11:33

add timeout for queue.get (#32747)

db5eac2

model_benchmark (#32600)

7468253

add other 15 activation ops (#32622)

b2160e7

fix distro (#32771)

816afb9

Fix compile error on jetson platform (#32748)

8ce6b39

* fix compile error on jetson platform

fix distro in manylinux (#32784)

3753416

Add raw program meta optimizer (#32597)

c1c18b0

* add raw program, test=develop

[Paddle-TRT]fix trt-converter-fc_op (#32671)

62d848d

* [Paddle-TRT]fix fc_op
* [Paddle-TRT]fix fc_op
* [Paddle-TRT]fix fc_op
* test_trt_subgraph_pass.py
* fix elementwise_op
* fix elementwise_op
* fix elementwise_op
* fix elementwise_op.cc
* op_teller.cc

bugfix: parallel_executor for xpu should use BindThreadedSSAGraphExecutor (#32792)

e8e4a9c

[NPU] refine update_loss_scaling npu kernel (#32580)

4628b6f

* refine update_loss_scaling npu kernel
* add mutable_data
* change Zerolike op to MemcpyAsync
* delete useless code
* add found_inf_vec
* add memcpy if not finite
* fix unittest

add c_identity op npu (#32787)

c8affff

* add c_identity_op_npu

【heterps】support cuda11 for heterps; add profiler in oneps (#32640)

beab956

* add trainprofiler for heterps in oneps; test=develop
* add set_use_ps_gpu; test=develop

Dynamic amp support sync_batch_norm op (#32770)

23ab01e

make check_op_desc.py support python3 (#32807)

92adece

fix paddle_build bug (#32813)

fd9a236

Revert "make check_op_desc.py support python3 (#32807)" (#32818)

5fc734c

This reverts commit 92adece.

update unittest for uint8 problem (#32790)

e357cfd

fix npu compile error (#32820)

5aa8faa

[pslib] pslib with cmake (#32800)

fbbc339

* pslib with cmake
* heter util
* vlog
* heter server test
* add dtor
* cmake

Support different data type between input and output (#32823)

3419de5

modify en_doco of spectral norm test=document_fix (#32812)

1eb59ef

fix ce bug in label value, test=develop

400eb9d

fix ci bug

e2c293f

fix ci coverage

6cd96c1

improve efficiency

ef7e5fc

fix ci coverage bug

e1ea895

add weight data to unit test

9495211

add ignore_index for test case

9cdf6bd

zhiqiu and others added 17 commits May 21, 2021 13:48

paddle.to_tensor supports LoDTensor (#33027)

a85eddd

fix model_benchmark ci (#33035)

0e5d832

* fix model_benchmark ci
* fix model_benchmark ci

optimize softmax with cross entropy hard label (#32290)

7be6191

* optimize softmax with cross entropy hard label
* label ignore_index cleaning

add method for enhance pass,test=develop (#33004)

79ed717

replace complex64/128 with complex template in cast Op (#33019)

79d918d

* replace complex in set tensor from and to numpy
* replace complex template in cast op

Added oneDNN matmul grad BF16/FP32 kernel (#32968)

e2a3a6f

* added support for most matmul cases
* added more functionality
* full functionality of matmul op, fp32 only
* added bf16 tests and functionality
* added formatting
* changes after review
* minor change
* added reviewers suggestions

refine conv2d doc (#33045)

a6dc68b

Support OutType template argument in elementwise_broadcast branch (#33060)

d6aea4a

open launch ps test=develop (#33044)

d0d5586

[oneDNN] bump up oneDNN to 2.2.2 (#32685)

b8e4ec7

* bump up oneDNN to 2.2.2 (should reduce perf drops of mobilenet)
* more recent oneDNN 2.2.2 (some more bugfixes)

enhance unittest for yolo_box (#33070)

99a11e3

Revert "fix model_benchmark ci (#33035)" (#33080)

6ad5ece

This reverts commit 0e5d832.

[HybridParallel]Fix pipeline in dygraph (#33007)

4920c47

* fix pipeline
* fix mp pp dp
* fix utest of hybrid parallel
* add utest for tuple

Add a new high performance framework for reduce ops (#32697)

88b43b5

Added scale op FP32/BF16 FWD/BWD kernels (#32975)

86ea8dc

modify Ops to complex template (#33041)

5fa44c3

* modify conj, real, imag OP to complex template
* replace with complex template to dot Op
* replace with complex template to Abs Op
* add support for complex64 and complex128

AnnaTrainingG merged commit 8c8717f into AnnaTrainingG:develop May 25, 2021

AnnaTrainingG pushed a commit that referenced this pull request Jun 9, 2022

Merge pull request #5 from PaddlePaddle/develop

a1d92b7

Update USERNAME/paddle

AnnaTrainingG pushed a commit that referenced this pull request Sep 19, 2022

Merge pull request #5 from LielinJiang/benchmark

c56dbd8

for Benchmark test

AnnaTrainingG pushed a commit that referenced this pull request Dec 6, 2023

fix loop cond for num_split=1 (#5)

5ff4bbf
