Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

develop #2

Merged
merged 41 commits into from
Aug 23, 2021
Merged

develop #2

merged 41 commits into from
Aug 23, 2021

Conversation

esythan
Copy link
Owner

@esythan esythan commented Aug 23, 2021

PR types

PR changes

Describe

Aurelius84 and others added 30 commits August 19, 2021 09:46
…34922)

* add device_context

* add gtest for device_event_gpu

* Remvoe duplicate DeviceType

* push for test

* add unittest

* fix macros

* fix MSVC using usage
* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* fix error

* notest;test=gpu-inference

* notest;test=gpu-inference

* notest;test=gpu-inference

* test=gpu-inference
* fix batch_norm and instance norm when input is []
* add slim resnet50 quant model in pr-ci-inference

* enable resnet50_quant multi_thread4_trt_int8_bz1

* remove LOG(FATAL)
* add npu sin op

* [NPU] Support npu kernel for sin op

* modify support npu kernel for sin op

* modify support npu kernel for sin op

* modify nou sin op

* modify npu sin op

* add sin op npu
* Add run function log

* test=document_fix
* add (N,C,*) input support for GroupNorm

* --amend
* [NPU] Support npu op where and where grad

* fix use const_cast

* delete a test
* add depthwise_conv2d npu

* add some tests

* Delete test_unique_op_npu.py

* delete trans input
* add trainer desc config to distributed strategy

* code style modified

* data_feed set lod
* use spin lock in auto growth allocator, test=develop

* use pthread spin lock, test=develop

* use lock guard, test=develop

* use malloc spin lock, test=develop

* use lock_guard, test=develop
* [NPU] Support npu kernel for pad3d op

* fix for comment of zhouwei25

* fix some bugs according to qili93's comments

* add support and test for paddings in input

* delete VLOG used for debug
* add rmsprop npu

* add argsort npu

* add argsort npu

* modify according to review

* modify sharedatawith according to review

* modify reshape according to review

* rm dygraph=false
* Add cuda device count api

* update coda format

* fix unittest error

* update code format

* update comment
* adamw support cuda

* adamw support cuda
pangyoki and others added 11 commits August 23, 2021 14:40
* Support getitem by Bool index

* delete some debug info of bool index

* support the case that the shape of bool index is different from indexed tensor
Refactor the organization of layer_norm cuda impl so that it can be reused in fused attention op.

    Extract the layer_norm cuda impl form layer_norm_op.cu to layer_norm_kernel.cu.h.
    Define fused/attention_layer_norm.h, which can be used in fused attention op in next PR.
* enable infer_ut on windows

* remove lib calculation & time

* unset http_proxy when download bos file on windows
* - disabled interpolate onednn

* - compilation fix

* - draft of batch_norm cache disabling

* - fixes to UT
@esythan esythan merged commit 69c797a into esythan:develop Aug 23, 2021
esythan pushed a commit that referenced this pull request Sep 30, 2021
esythan pushed a commit that referenced this pull request Nov 23, 2021
* update fft api path (PaddlePaddle#36219)

* update fft api path
* add sample code for ihfft2

Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>

* fix fft axis (PaddlePaddle#36321)

fix: `-1` is used when fft's axis is `0`

* use unified external error message for cufft api (PaddlePaddle#36114)

* fft: modify sample code result (PaddlePaddle#36325)

* dynamic load mkl as a fft backend when it is avaialble and requested (PaddlePaddle#36414)

* add rocm support for fft api (PaddlePaddle#36415)

* move signal apis

* move fft and signal API path (#2)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos in signal.py (#3)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* disable Cache when CUFFT_VERSION >= 10200 (#4)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* Add LRUCache for fft plans

* add LRUCache for cuff and hipfft (#5)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* WIP: add cache

* delete move constructor and operator= for CuFFTHandle and FFTConfig

* remove log from CuFFTHandle and FFTConfig

* add lrucache for fft rocm backend

* disable LRUCache when CUFFT_VERSION >= 10200

* disbale copy and move for hipFFTHandle; format code

Co-authored-by: Xiaoxu Chen <chenxx_id@163.com>

* remove debug message of cufftHandler

* roll_op: support Tensor as input for shifts (PaddlePaddle#36727)

* fix fftshift/ifftshift on static mode

* update roll_op version

* add more test cases for fftshift/ifftshift

Co-authored-by: zhiboniu <31800336+zhiboniu@users.noreply.github.com>
Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
Co-authored-by: LJQ❤️ <33169170+lijiaqi0612@users.noreply.github.com>
esythan pushed a commit that referenced this pull request Jan 10, 2022
…addlePaddle#38275)

* Replaced pten::LoD with paddle::framework::LoD

* Overrided CPUVector with CUDAVector

* Refactored paddle::framework::Vector
esythan pushed a commit that referenced this pull request Feb 11, 2022
…addlePaddle#39087)

* Renamed selected_rows.* -> selected_rows_utils.*

* Added selected_rows and rw_lock to pten

* Removed useless header

* Renamed the unit test target to fix CI

* Use pten::framework::DDim

* Set selceted_rows_test properties timeout

* Polish code to pten style

Co-authored-by: Chen Weihang <chenweihang@baidu.com>
esythan pushed a commit that referenced this pull request Mar 29, 2022
…rdFunctions and GradNodes (PaddlePaddle#40937)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue
esythan pushed a commit that referenced this pull request Mar 29, 2022
…enerateForwardDefinition (PaddlePaddle#41016)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Fixed minor issue
esythan pushed a commit that referenced this pull request Mar 31, 2022
…Paddle#41051)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* Fixed yaml typo
esythan pushed a commit that referenced this pull request Apr 1, 2022
…e#41121)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* Fixed minor issue
esythan pushed a commit that referenced this pull request Apr 11, 2022
…sed to paddle.grad() (PaddlePaddle#41198)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues
esythan pushed a commit that referenced this pull request Apr 11, 2022
…rd run (PaddlePaddle#41306)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues
esythan pushed a commit that referenced this pull request Apr 11, 2022
…ePaddle#41387)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.