merge #5

esythan · 2021-09-26T06:24:59Z

PR types

PR changes

Describe

…d by allocator。 (#35427) * Add empty_cache api to release idle gpu memory hold by allocator,test=develop * Add empty_cache api to release idle gpu memory hold by allocator,test=develop * Add empty_cache api to release idle gpu memory hold by allocator,test=develop * Fix test coverage problem for empty_cache * delete redundant check for empty_cache * fix the problem of empty_cache's doc * delete the nvidia-smi comment in doc of empty_cache, test=document_fix

* Add solutions to PyLayer which is unsupported in DataParallel * modify note format for parallel.py * modify docs of dataparallel * add docs of dp with pylayer * modify docs format * modify example format * change example of dp with pylayer * add unittest for dp with pylayer * modify ut * merge latest codes * update * modify for CI-Coverage * modify text-indent

* add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller * add_reshape_teller

* add_skip_layernorm * add_skip_layernorm * add_skip_layernorm * add_skip_layernorm * add_skip_layernorm * add_skip_layernorm * add_skiplayernorm_teller * add_skip_layernorm * add_skip_layernorm_teller * add_skip_layernorm_teller

* Add linalg.solve op, test=develop * Fix a bug caused by accidental deletion * updated description and fix a bug: missing a comma * Add linalg.solve op, test=develop * updated solve op backward logic * updated solve op backward logic again * Add linalg.solve Op, test=develop * Updated and modified to fit CI requirements * Fix a bug * 1)Add more test cases; 2)Fix a wrong usage in reduces operation; 3)Remove redundant code * Remove redundant comments * 1)Removed redundant code; 2)Updated to enhance code robustness * Removed redundant code * Updated API documents

* Add elementwise_sub_mkldnn_op without grad * Add test to static_mode_white_list * Refactor code, change license years * Remove invalid grad implementation * Fix element_wise_sub_op test * Fix CI Approval error * Remove unnecessary EltwiseSubMKLDNNGradKernel class * Fix CI Approval 2 * Fix CI Approval 3 * Fix CI Approval Attempt #4 * Fix CI Approve Attempt #5 * Fix CI Approval Attempt #6 * Fix CI Approval Attempt #7 * Change test names containing add to sub * Fix old tests testing add instead of sub * Copy grad implementation from elementwise_add_mkldnn * CI test fix attempt * Revert "CI test fix attempt" This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900. * Fix CI attempt 2 * Fix elementwise_sub tests, temporary mkldnn broadcast test disable * Add working implementation of elementwise_sub grad * Fix build errors caused by pull * Fix format error * Fix format error 2 * Disable elementwise_sub_mkldnn test on GPU * Apply fix for paddle.fluid import * Revert changes of test_elementwise_sub and Fix mkldnn test * Revert "Apply fix for paddle.fluid import" This reverts commit fc3b122. * fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (#35862) * Add changes suggested by reviewers * Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI * Remove check_dygraph=False to satisfy CI Approval Co-authored-by: zhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>

* Add New Op: gumbel_softmax * Add New Op: gumbel_softmax * Add New Op: gumbel_softmax (amend) * add __main__ function in unit test * fix bugs when test in windows ci * update en docs * delete relative error in unit test * delete relative error in unit test * set hard=True in unit test * Support fix seed in Python for test

* update fft api path (PaddlePaddle#36219) * update fft api path * add sample code for ihfft2 Co-authored-by: chenfeiyu <chenfeiyu@baidu.com> * fix fft axis (PaddlePaddle#36321) fix: `-1` is used when fft's axis is `0` * use unified external error message for cufft api (PaddlePaddle#36114) * fft: modify sample code result (PaddlePaddle#36325) * dynamic load mkl as a fft backend when it is available and requested (PaddlePaddle#36414) * add rocm support for fft api (PaddlePaddle#36415) * move signal apis * move fft and signal API path (#2) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos in signal.py (#3) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * disable Cache when CUFFT_VERSION >= 10200 (#4) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * Add LRUCache for fft plans * add LRUCache for cufft and hipfft (#5) * move signal apis * move fft.py and signal.py to paddle/, fix typos * fix relative imports from fft.py and signal.py * fix typos * WIP: add cache * delete move constructor and operator= for CuFFTHandle and FFTConfig * remove log from CuFFTHandle and FFTConfig * add lrucache for fft rocm backend * disable LRUCache when CUFFT_VERSION >= 10200 * disable copy and move for hipFFTHandle; format code Co-authored-by: Xiaoxu Chen <chenxx_id@163.com> * remove debug message of cufftHandler * roll_op: support Tensor as input for shifts (PaddlePaddle#36727) * fix fftshift/ifftshift on static mode * update roll_op version * add more test cases for fftshift/ifftshift Co-authored-by: zhiboniu <31800336+zhiboniu@users.noreply.github.com> Co-authored-by: chenfeiyu <chenfeiyu@baidu.com> Co-authored-by: LJQ❤️ <33169170+lijiaqi0612@users.noreply.github.com>

…t=allcases (PaddlePaddle#38632) * Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes to pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues

…lePaddle#39085) * updates callers, test=develop * updates tensor, test=develop * fixes errors, test=develop * remove some dtypes, test=develop * fix errors in the base storage modification, test=develop * fixes a bug, test=develop * fixes the bugs in push the whole, test=develop * updates, test=develop * update * update, test=develop * fixes the mac-py3 CI, test=develop * remove the storage impl, test=develop * updates some codes, test=develop * update, test=develop * updates pten allocation, test=develop

…ddle#39236) * Added selected_rows and rw_lock to pten * Renamed the unit test target to fix CI * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid * Remove rw_lock.h,rw_lock_test.cc in fluid * Use pten::RWLock and pten::AutoRDLock, fix CI * Use pten::SelectedRows * Use pten::SelectedRows * Fix to pass NPU CI * Selected_Rows inherits from TensorBase * Use pten::SelectedRows, to pass NPU CI * To fix NPU CI * To fix NPU CI again * Use paddle/pten/core/enforce and polish code * Use pten::DataType instead of using proto_type * Move part of data_type to pten * Polish Code

…sed to paddle.grad() (PaddlePaddle#41198) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues

…rd run (PaddlePaddle#41306) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run * Fixed minor issues * Fixed issue with backward graph construction logic * Fixed implementation issues with backward graph reconstruction * Fixed unittest issue * Fixed issues

…ePaddle#41387) * [Refactor] refactored eager_gen.py PR #2 * [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes * Fixed minor issue * Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition * Fixed issues * Supported higher-order grad node generation * [DoubleGrad PR #4] Supported higher-order GradNode generation * [DoubleGrad #4] Bug Fixes to Double Grad Node Generation * Fixed yaml typo * Fixed yaml typo * fixed minor issues * [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad() * Fixed minor issue * Fixed CI-Inference issue * Fixed CI-inference issues * [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run * Fixed minor issues * Fixed issue with backward graph construction logic * Fixed implementation issues with backward graph reconstruction * Fixed unittest issue * Fixed issues * [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul * Fixed issues with phi kernel * Added triple grad test case * Fixed minor issue

XieYunshen and others added 30 commits September 13, 2021 19:59

Change uts to nightly mode (#35541)

2b0f9b5

* Change uts to nightly mode
* remove test_trt_pool_op from parallel_UT_rule.py,test=document_fix

fix bug for added ut check (#35539)

7b0c206

fix launch util trainer rank function; test=develop (#35610)

a6ac4e8

fix bug, test=document_fix (#35697)

97a73e1

slice_op support bool tensor. (#35586)

f5e430c

[Inference] Add tuned trt_dynamic_shape mode. (#34806)

7c96efe

Update git clone model-benchmark CI (#35643)

174535b

Update git clone model-benchmark CI

experimental feature: error status, test=develop (#35624)

657a8c8

Fix RawProgramOptimizer bug (#35704)

0f74188

* fix raw optimizer gm
* update
* update ut

fix random seed to avoid UT random failed (#35699)

cb7d859

add paddle.Tensor api fill_(inplace), zero_(inplace) (#33829)

efeec79

add fill_ backward

trt ut add serialize and deserialize (#35645)

bf983c2

[dy2stat_error] add revise suggestion for two error cases (#35648)

0b8664e

* dy2stat_error: add revise suggestion for two error cases
* fix test_error
* fix review

Integrate StandaloneExecutor in Static.Executor Interface with FLAGS…

4bc0853

…_USE_STANDALONE_EXECUTOR (#35628)

* Integrate StandaloneExecutor in Static.Executor Interface with FLAGS_USE_STANDALONE_EXECUTOR
* Enhance unittest and clean code in StandaloneExecutor
* polish unittest

[NPU] fix npu pr, test=develop (#35709)

f0661a1

* [NPU] fix npu pr

fix conv op check (#35693)

e46ffaf

[hybrid performance] Optimize Pipeline Scheduler (#35680)

04fdb10

fix GradientClipByGlobalNorm in hybrid parallel (#35691)

598d32d

Add activation test (#35489)

bda154d

add test (#35710)

ccf5b80

add test (#35568)

5e153bf

add test (#35311)

627bd88

[Paddle Inference]Add shuffle channel TRT converter unittest. (#35228)

ce220c2

* add_shuffle_channel
* add_shuffle_channel
* add_shuffle_teller
* add_shuffle_teller
* add_shuffle_channel
* add_shuffle_channel

[Paddle Inference]Add softmax op TRT converter unittest. (#35263)

e93228e

* add_softmax_teller * add_softmax_teller * add_softmax_teller * add_softmax_telller * add_softmax_teller

[Paddle Inference]Add BN op TRT converter unittest (#35527)

39bc7ea

* add_bn_ * add_bn_teller * add_bn_teller * add_bn_teller * add_bn_teller

[Paddle Inference]Add stack op TRT converter unittest (#35531)

fdd069f

* add_stack_teller * add_stack_teller * add_stack_teller * add_stack_teller

veyron95 and others added 21 commits September 24, 2021 12:21

update lite branch (#36010)

17adcf6

fix pad tuple (#35985)

0c0817c

* fix pad tuple
* fix format

concat api support empty tensor. (#35845)

eb28a36

fix undefined var in test_batch_sampler. test=develop (#35924)

4f42e5d

[oneDNN] candidate fix to #34554 (#35884)

485b387

* - candidate fix
* - More fixes to #34554
* - another inconsistent fix to key
* - Removed unneeded line
* - matching the cache behaviour to other ops

add pool2d convert test (#35923)

82f255d

* add pool2d convert test
* modify error
* modify error
* modify error
* modify error
* modify error
* modify error

add update (#36017)

1691dc7

add gradient kernel of det op and slogdet op (#36013)

b91e8ee

* add gradient kernel of det op and slogdet op
* fix CI APPROVAL problem

temporarily fix the performance drop of recurrent op (#36052)

372a1a7

[icafe-31094] Add function comments and instructions to the Primitive…

bc0df48

… API (#35743)

Fix model_benchmark CI (#36047)

ac72f97

fix dygraph grad to support high differential (#36059)

e123b87

add doc for two softmax fuse api, test=document_fix (#35943)

9792255

Fixed errors in the example code (#36041)

d70e45d

CPU forward calculation replaces Eigen with Lapack;Modify linalg expo…

7ff226f

…sure rules (#35916)

Fix FPE of label smooth op (#35861)

628ff34

Add a check for multiplex op (#34972)

b430f6a

auto read all public envs from flags_map in paddle_gtest_main (#36057)

3fabc80

esythan merged commit e1f0559 into esythan:develop Sep 26, 2021
