update #4

Merged
merged 214 commits into from
May 7, 2021

Commits on Apr 19, 2021

  1. update get_api_md5, using the real api name as the map's key (#32224)

    * get_api_md5 should prefer using the real name rather than the alias names
    
    * case for ArgSpec style. update the unittests
    
    test=document_fix
    wadefelix authored Apr 19, 2021
    21dc044
  2. 76cb83e
  3. Fix sublayer (#31824)

    * fix sublayer error with include_sublayers=False
    
    * add ut
    
    * refactor include_sublayers related api
    
    * fix ut
    
    * fix ut of transformer
    
    * fix ut of transformer
    
    * remove useless code
    
    * change sublayer api
    
    * polish code
    
    * add test for include_self=True
    JiabinYang authored Apr 19, 2021
    4d69eea
  4. [Hybrid Parallel] Support dp & mp in dygraph (#32323)

    * support dp & mp
    ForFishes authored Apr 19, 2021
    ffd4086
  5. [NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop (#32294)
    
    * [NPU] support GarbageCollector for npu (#31874)
    
    * support GarbageCollector for npu
    
    * fix typo
    
    * fix gather_grad
    
    * disable NPUDefaultStreamGarbageCollector on NPU
    
    * [NPU] support npu for memcpy op (#31808)
    
    * support npu for memcpy op
    
    * add ut
    
    * fix ut
    
    * fix typo
    
    * 【NPU】fix bug of using temp vector (#31963)
    
    * fix bug when beta1_pow on cpu (#31995)
    
    * [NPU] support npu profiler (#31684)
    
    * support npu profiler
    
    * add python api
    
    * fix bugs
    
    * add wrapper for incomplete type
    
    * update profile proto
    
    * record npu wait
    
    * add xpu placeholder
    
    * fix adam (#32016)
    
    * [NPU] enable async copy and  add wait before sync operation (#31956)
    
    * enable async copy and  add wait before sync operation
    
    * remove unnecessary wait
    
    * add FillNpuTensorWithConstant
    
    * refine
    
    * fix fill_constant
    
    * make TensorFromVector/TensorToVector sync
    
    * [NPU] Support dataloader on npu place. (#31867)
    
    * [NPU] Wait on NPUPlace (#32086)
    
    * [NPU] fix cast op (#32121)
    
    * fix npu kernel of cast op to handle casting to same dtype
    
    * add comments
    
    * [NPU] support cann 20.3 (#32044)
    
    * fix compile problem on cann 20.3
    
    * fix ut
    
    * fix test_mul
    
    * fix check_finite_and_scale
    
    * fix lookup_table_v2_grad
    
    * fix cmake
    
    * support print op
    
    * [NPU] Support npu save load (#31893)
    
    * support save load for NPU
    
    * add save load npu unittest
    
    * support np.array transform in NPU
    
    * fix errors
    
    * delete dygraph in unittest
    
    * add Wait
    
    * fix unittest
    
    * fix review comment
    
    * fix unittest problem
    
    * fix little problem
    
    * change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)
    
    * change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
    
    * refine code
    
    * fix NPUDeviceContext in all c++ unittest (#32198)
    
    * fix NPUDeviceContext in all c++ unittest
    
    * refine log
    
    Co-authored-by: pangyoki <pangyoki@126.com>
    
    * [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)
    
    * enable async copy and  add wait before sync operation
    
    * remove unnecessary wait
    
    * add FillNpuTensorWithConstant
    
    * refine
    
    * fix fill_constant
    
    * change TensorFromVector to FillNpuTensorWithConstant
    
    * fix ignored api
    
    * delete extra unittest
    
    * fix little error
    
    * fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
    
    * change TensorCopySync to TensorCopy
    
    * delete useless Wait and add StreamWait
    
    * fix npu_stream error
    
    * fix check_finite_and_unscale_op_npu TensorCopy
    
    * only save stream wait
    
    * fix NPUDeviceContext in all c++ unittest
    
    * delete wait
    
    Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
    
    * delete useless unittest file (#32206)
    
    * Fix op test (#32231)
    
    * fix conditional block (#32243)
    
    * fix adam bug again (#32246)
    
    * fix compile
    
    * fix ut
    
    * fix ut
    
    Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
    Co-authored-by: pangyoki <pangyoki@126.com>
    3 people authored Apr 19, 2021
    cbe5c9f
  6. add npu check nan and inf (#32340)

    add npu check nan and inf (#32340)
    Baibaifan authored Apr 19, 2021
    1e3a94b

Commits on Apr 20, 2021

  1. f0cc188
  2. 43926c8
  3. fix the bug that the error message is not displayed on mac ci (#32367)

    * test for mac task,notest,test=mac_py3
    
    * fix the bug that the error message is not displayed
    XieYunshen authored Apr 20, 2021
    0dd28b8
  4. [heterps] optimize build task (#32358)

    * build task cost
    
    * return pool
    Thunderbrook authored Apr 20, 2021
    c09d645
  5. f6f59e5
  6. save/load program (#32336)

    hbwx24 authored Apr 20, 2021
    e0a52fd
  7. [Sharding]: update config DOC (#32299)

    * sharding: update config DOC
    
    * update pipeline config
    
    * sharding update doc
    JZ-LIANG authored Apr 20, 2021
    e348901
  8. add paddle.nn.unfold #32297 (#32298)

    * add paddle.nn.unfold
    * update Parameters of Unfold
    lzzyzlbb authored Apr 20, 2021
    186682f
  9. [Optimize]SparseKV speedup and memory save (#32048)

    Change-Id: Ie35a09772e46f7d90cb68ca82c1d18b9201d1abe
    
    * large scale kv store optimize
    
    Change-Id: I582cc661afdaa20749ec7493eae1b88c32b967f7
    
    * replace std::unordered_map with roundrobin map
    
    Change-Id: I48ee0efef38853876c92d982cdfcac6603c52c88
    
    * remove license
    
    * fix cpp lint
    
    Change-Id: Ia21fafa65adc09bb9094f7dbc987e31d5af2686e
    seiriosPlus authored Apr 20, 2021
    5e7e7c9

Commits on Apr 21, 2021

  1. remove fluid for auto_checkpoint. (#32157)

    * remove fluid for auto_checkpoint.
    
    * fix bug.
    xiemoyuan authored Apr 21, 2021
    1593ee2
  2. ead8342
  3. add retry on gcda_clean.py (#32318)

    * add retry on gcda_clean.py
    
    * add exit code for paddle_coverage.sh
    
    * fix format error
    
    * fix format error
    XieYunshen authored Apr 21, 2021
    229f930
  4. a2cbbe8
  5. add test=develop (#32380)

    gongweibao authored Apr 21, 2021
    4898c38
  6. 5d19f8d
  7. flush denormal in the tracer op, test=develop (#32350)

    * flush denormal in the tracer op, test=develop
    
    * add cmake dependencies, test=develop
    
    * add a macro, test=develop
    
    * fix the windows case, test=develop
    Shixiaowei02 authored Apr 21, 2021
    9ff8556
  8. 2194ad1
  9. remove thrust include files (#32395)

    * remove thrust includes, test=develop
    
    * fix compilation error, test=develop
    
    * fix compilation of truncated_gaussian_random_op, test=develop
    Avin0323 authored Apr 21, 2021
    ab6f874
  10. [NPU] register npu finalize on exit (#32390)

    * [NPU] register finalize on exit
    
    * fix
    zhiqiu authored Apr 21, 2021
    8e4c193
  11. optimize get-feat function of graph engine (#32261)

    * graph engine demo
    
    * upload unsaved changes
    
    * fix dependency error
    
    * fix shard_num problem
    
    * py client
    
    * remove lock and graph-type
    
    * add load direct graph
    
    * add load direct graph
    
    * add load direct graph
    
    * batch random_sample
    
    * batch_sample_k
    
    * fix num_nodes size
    
    * batch brpc
    
    * batch brpc
    
    * add test
    
    * add test
    
    * add load_nodes; change add_node function
    
    * change sample return type to pair
    
    * resolve conflict
    
    * resolved conflict
    
    * resolved conflict
    
    * separate server and client
    
    * merge pair type
    
    * fix
    
    * resolved conflict
    
    * fixed segment fault; high-level VLOG for load edges and load nodes
    
    * random_sample return 0
    
    * rm useless loop
    
    * test:load edge
    
    * fix ret -1
    
    * test: rm sample
    
    * rm sample
    
    * random_sample return future
    
    * random_sample return int
    
    * test fake node
    
    * fixed here
    
    * memory leak
    
    * remove test code
    
    * fix return problem
    
    * add common_graph_table
    
    * random sample node &test & change data-structure from linkedList to vector
    
    * add common_graph_table
    
    * sample with srand
    
    * add node_types
    
    * optimize nodes sample
    
    * recover test
    
    * random sample
    
    * destruct weighted sampler
    
    * GraphEdgeBlob
    
    * WeightedGraphEdgeBlob to GraphEdgeBlob
    
    * WeightedGraphEdgeBlob to GraphEdgeBlob
    
    * pybind sample nodes api
    
    * pull nodes with step
    
    * fixed pull_graph_list bug; add test for pull_graph_list by step
    
    * add graph table;name
    
    * add graph table;name
    
    * add pybind
    
    * add pybind
    
    * add FeatureNode
    
    * add FeatureNode
    
    * add FeatureNode Serialize
    
    * add FeatureNode Serialize
    
    * get_feat_node
    
    * avoid local rpc
    
    * fix get_node_feat
    
    * fix get_node_feat
    
    * remove log
    
    * get_node_feat return  py:bytes
    
    * merge develop with graph_engine
    
    * fix threadpool.h head
    
    * fix
    
    * fix typo
    
    * resolve conflict
    
    * fix conflict
    
    * recover lost content
    
    * fix pybind of FeatureNode
    
    * recover cmake
    
    * recover tools
    
    * resolve conflict
    
    * resolve linking problem
    
    * code style
    
    * change test_server port
    
    * fix code problems
    
    * remove shard_num config
    
    * remove redundant threads
    
    * optimize start server
    
    * remove logs
    
    * fix code problems by reviewers' suggestions
    
    * move graph files into a folder
    
    * code style change
    
    * remove graph operations from base table
    
    * optimize get_feat function of graph engine
    
    Co-authored-by: Huang Zhengjie <270018958@qq.com>
    Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
    Co-authored-by: suweiyue <suweiyue@baidu.com>
    Co-authored-by: luobin06 <luobin06@baidu.com>
    Co-authored-by: liweibin02 <liweibin02@baidu.com>
    Co-authored-by: tangwei12 <tangwei12@baidu.com>
    7 people authored Apr 21, 2021
    2b68d20
  12. 37bb334
  13. 3da2c7f
  14. 661a1f6
  15. 7bae5e9
  16. fix bug in amp O2 (#32343)

    huangxu96 authored Apr 21, 2021
    4be3b05
  17. bc90916
  18. 【NPU】Merge NPU ccl code (#32381)

    * add allreduce and broadcast without test (#31024)
    
    add allreduce and broadcast without test
    
    * Refactor HCCLCommContext to be compatible with Paddle (#31359)
    
    Refactor HCCLCommContext to be compatible with Paddle (#31359)
    
    * [NPU] add npu kernel for communication op (#31437)
    
    * add allreduce and broadcast without test
    
    * add c_broadcast_test case
    
    * build c_comm_init and c_create_group operators
    
    * make the whole thing compile
    
    * add broadcast and init op test case but run failed
    
    * make unit test compile
    
    * fix broadcast test bug and change into hcom for ccl
    
    * change c_comm_init and c_create_group ops accordingly
    
    * make tests compile
    
    * transfer code to 27
    
    * compiled successfully in 28, but run failed
    
    * test broadcast in 28, but failed
    
    * make hcom primitives work
    
    * change hccl data type for base.h
    
    * fix broadcast bug
    
    * make attributes work
    
    * fix group name bug
    
    * add allreduce but test failed
    
    * allreduce bug for qiuliang
    
    * allreduce finished
    
    * add allgather and reducescatter
    
    * merge all op code
    
    * add allgather test
    
    * finish run all ccl op test exclude send/recv
    
    * all all op and test exclude send/recv
    
    * send_v2_npu.cc recv_v2_npu.cc compiled
    
    * fix ccl core dump bug and test allgather, reducescatter, broadcast op
    
    * fix allreduce bug just for test
    
    * hcom send&recv test pass, without hcom_destroy
    
    * for qiuliang test
    
    * Ascend Send&Recv Test Pass
    
    * all op (ex send/recv) ok
    
    * fix bug
    
    * merge all ccl op
    
    * style merge to PaddlePaddle
    
    * merge style
    
    * new merge style
    
    * merge style 2
    
    * insert an empty at the end
    
    * disable ctest for hcom to pass ci
    
    Co-authored-by: void-main <voidmain1313113@gmail.com>
    Co-authored-by: f2hkop <f2huestc@outlook.com>
    
    * Add auto-increasing tag id for Hcom OPs (#31702)
    
    * add c_reduce_sum op (#31793)
    
    add c_reduce_sum op
    
    * update Ascendrc hccl to 20.3 (#32126)
    
    update Ascendrc hccl to 20.3 (#32126)
    
    * fix merge code
    
    * change cmake.txt1
    
    * [NPU] Support npu kernel for c sync stream op (#31386)
    
    * sync stream npu op
    
    * add with_ascend_acl
    
    * update c++ unittest
    
    * compile all failed
    
    * try to pre commit
    
    * after pre commit
    
    * merge&compile&test hccl successfully!
    
    * fix code style
    
    * fix code style
    
    * fix bugs about hccl
    
    * fix some bugs
    
    * fix code style
    
    * fix style
    
    * fix style
    
    * fix
    
    * fixed
    
    * merge develop
    
    Co-authored-by: lw921014 <liuwei921014@yeah.net>
    Co-authored-by: Void Main <voidmain1313113@gmail.com>
    Co-authored-by: f2hkop <f2huestc@outlook.com>
    Co-authored-by: xiayanming <41795079@qq.com>
    5 people authored Apr 21, 2021
    c315852
  19. [HotFix] Add support for optimizer with varbase input (#32362)

    * add support for optimizer with varbase input
    
    * refine cond
    
    * fix failed unittest
    
    * add test for coverage
    chenwhql authored Apr 21, 2021
    b47dd15
  20. bf0ec9b

Commits on Apr 22, 2021

  1. e58c705
  2. support save/load binary format tensor. (#32211)

    * support save/load binary format tensor
    
    * Fix error when create cudaplace
    
    * Fix error when create cudaplace
    
    * Fix error when create cudaplace
    
    * get device context from pool.
    
    * move define of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'.
    
    * improve coverage.
    
    * improve coverage.
    
    * polish API
    
    * deal with conflict
    
    * disable save/load large file in unittest
    
    * split unittest.
    hbwx24 authored Apr 22, 2021
    f4d9adc
  3. fix count problem (#32415)

    * graph engine demo
    
    * upload unsaved changes
    
    * fix dependency error
    
    * fix shard_num problem
    
    * py client
    
    * remove lock and graph-type
    
    * add load direct graph
    
    * add load direct graph
    
    * add load direct graph
    
    * batch random_sample
    
    * batch_sample_k
    
    * fix num_nodes size
    
    * batch brpc
    
    * batch brpc
    
    * add test
    
    * add test
    
    * add load_nodes; change add_node function
    
    * change sample return type to pair
    
    * resolve conflict
    
    * resolved conflict
    
    * resolved conflict
    
    * separate server and client
    
    * merge pair type
    
    * fix
    
    * resolved conflict
    
    * fixed segment fault; high-level VLOG for load edges and load nodes
    
    * random_sample return 0
    
    * rm useless loop
    
    * test:load edge
    
    * fix ret -1
    
    * test: rm sample
    
    * rm sample
    
    * random_sample return future
    
    * random_sample return int
    
    * test fake node
    
    * fixed here
    
    * memory leak
    
    * remove test code
    
    * fix return problem
    
    * add common_graph_table
    
    * random sample node &test & change data-structure from linkedList to vector
    
    * add common_graph_table
    
    * sample with srand
    
    * add node_types
    
    * optimize nodes sample
    
    * recover test
    
    * random sample
    
    * destruct weighted sampler
    
    * GraphEdgeBlob
    
    * WeightedGraphEdgeBlob to GraphEdgeBlob
    
    * WeightedGraphEdgeBlob to GraphEdgeBlob
    
    * pybind sample nodes api
    
    * pull nodes with step
    
    * fixed pull_graph_list bug; add test for pull_graph_list by step
    
    * add graph table;name
    
    * add graph table;name
    
    * add pybind
    
    * add pybind
    
    * add FeatureNode
    
    * add FeatureNode
    
    * add FeatureNode Serialize
    
    * add FeatureNode Serialize
    
    * get_feat_node
    
    * avoid local rpc
    
    * fix get_node_feat
    
    * fix get_node_feat
    
    * remove log
    
    * get_node_feat return  py:bytes
    
    * merge develop with graph_engine
    
    * fix threadpool.h head
    
    * fix
    
    * fix typo
    
    * resolve conflict
    
    * fix conflict
    
    * recover lost content
    
    * fix pybind of FeatureNode
    
    * recover cmake
    
    * recover tools
    
    * resolve conflict
    
    * resolve linking problem
    
    * code style
    
    * change test_server port
    
    * fix code problems
    
    * remove shard_num config
    
    * remove redundant threads
    
    * optimize start server
    
    * remove logs
    
    * fix code problems by reviewers' suggestions
    
    * move graph files into a folder
    
    * code style change
    
    * remove graph operations from base table
    
    * optimize get_feat function of graph engine
    
    * fix long long count problem
    
    Co-authored-by: Huang Zhengjie <270018958@qq.com>
    Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
    Co-authored-by: suweiyue <suweiyue@baidu.com>
    Co-authored-by: luobin06 <luobin06@baidu.com>
    Co-authored-by: liweibin02 <liweibin02@baidu.com>
    Co-authored-by: tangwei12 <tangwei12@baidu.com>
    7 people authored Apr 22, 2021
    73d0b0e
  4. e727820
  5. add glu in nn.functional (#32096)

    add glu in nn.functional
    Feiyu Chan authored Apr 22, 2021
    b2ee838
  6. [HybridParallel] Add ClipGradByGlobalNorm & check_finite_and_unscale in Dygraph (#32354)
    
    * add clip/check
    
    * add amp & clip grad in dygraph
    
    * add logging
    ForFishes authored Apr 22, 2021
    7ea999f
  7. bec4b16
  8. modify conv2d_transpose docs (#32410)

    * modify conv2d_transpose docs
    wangxinxin08 authored Apr 22, 2021
    1064f2b
  9. 890d6bc
  10. import sequence_* API to new namespace (#32089)

    * import sequence_* API to new namespace
    
    * fix typos, remove alias marking
    
    * update sample code
    
    * fix sample code
    
    * fix docstring for sequence_mask
    Feiyu Chan authored Apr 22, 2021
    f12c943
  11. d03b0b1
  12. fix doc for adamw (#32438)

    hutuxian authored Apr 22, 2021
    c481570
  13. a1a527f
  14. support int32 and int64 kernel for clip operator (#32373)

    support int32 and int64 kernel for clip operator
    wuyefeilin authored Apr 22, 2021
    c332828
  15. f8ca5a9
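
Commit b2ee838 above adds glu to nn.functional. For readers unfamiliar with the operation, a gated linear unit splits its input into two halves and multiplies the first half by the sigmoid of the second. The pure-Python sketch below illustrates that definition on a flat list; the helper names and the list-based signature are assumptions for illustration only, not Paddle's implementation or API.

```python
import math
from typing import List


def sigmoid(v: float) -> float:
    """Logistic function 1 / (1 + e^-v)."""
    return 1.0 / (1.0 + math.exp(-v))


def glu(values: List[float]) -> List[float]:
    """Gated linear unit over a flat list (hypothetical helper).

    The input is split into two equal halves a and b, and the result
    is a[i] * sigmoid(b[i]) for each position i.
    """
    if len(values) % 2 != 0:
        raise ValueError("GLU needs an even number of elements to split in half")
    half = len(values) // 2
    a, b = values[:half], values[half:]
    return [x * sigmoid(g) for x, g in zip(a, b)]
```

With a gate of zero, sigmoid returns 0.5, so each element of the first half is simply halved.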

Commits on Apr 23, 2021

  1. Fix seven error message (#32397)

    * fix two error message
    
    * fix two error message
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix some error message
    
    * fix some error
    
    * fix error
    
    * fix some error
    
    * fix some error
    
    * fix some error
    
    * fix one error
    
    * fix some error
    
    * fix seven error message
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix error
    Kqnonrime authored Apr 23, 2021
    203ac4f
  2. 49773f3
  3. 7879477
  4. disable utest (#32474)

    ForFishes authored Apr 23, 2021
    1dc8393
  5. 51bcd97
  6. b6f8ccd
  7. add c_concat and c_split ops (#32486)

    * add c_concat op
    lilong12 authored Apr 23, 2021
    2b108a0
  8. solve hccl communicate conflict (#32447)

    solve hccl communicate conflict (#32447)
    Baibaifan authored Apr 23, 2021
    0e74eea
  9. fix Windows CI MP compile and environment install script and openblas CI (#32378)
    
    * fix Windows CI MP compile and environment install script
    
    * clear Windows CI environment
    
    * clear Windows CI environment
    
    * clear Windows CI environment
    zhwesky2010 authored Apr 23, 2021
    7a681f0
  10. 1b83de2
  11. move semantic checks to op_teller (#32279)

    * move semantic checks to op_teller
    
    * more ops
    
    * more ops
    
    * revert block related change
    
    * part1
    
    * revert activation
    
    * remove if
    
    * remove const_cast
    
    * resolve conflict
    
    * remove const_cast
    
    * delete useless var
    
    * replace vlog(1) with vlog(3), replace assert with PADDLE_ENFORCE
    
    * down to 19 files
    b3602sss authored Apr 23, 2021
    7c38114
  12. a01b510
  13. [NPU] refactor check_finite_and_scale npu kernel (#32407)

    * refactor_check_finite_and_scale_npu_kernel
    
    * fix compile
    
    * add alloc_float_status op
    
    * add alloc_float_status op
    
    * add FloatStatus for check_finite_and_unscale
    
    * refine code
    
    * remove unnecessary logic
    
    * refine for fleet
    zhiqiu authored Apr 23, 2021
    39a59dc
  14. Polish ParallelExecutor constructor into small functions (#32191)

    * Refine Constructor logic of ParallelExecutor
    
    * refine function name
    
    * refine code comment
    Aurelius84 authored Apr 23, 2021
    faa8c70
  15. Ut test conv3d op timeout (#32216)

    * remove ut from parallel_ut_rule caused by timeout
    
    * remove timeout ut from parallel_ut_rule file
    
    * move convert_model2dot_ernie to TWO_PARALLEL_JOB list
    XieYunshen authored Apr 23, 2021
    de94743
  16. add the c_identity op (#32485)

    * add c_identity op, test=develop
    lilong12 authored Apr 23, 2021
    8fa8a37
  17. [CustomOp] Remove useless extension headers for old custom op (#32463)

    * remove useless ext headers
    
    * fix boost header compile failed
    chenwhql authored Apr 23, 2021
    7d4998a
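
Several commits in this range touch check_finite_and_unscale, the step used with loss scaling in mixed-precision training. As commonly described, it divides the scaled gradients by the loss scale and reports whether any gradient is inf or NaN, so the caller can skip the optimizer update. The sketch below shows that idea on a flat list of floats; the function signature and return shape are assumptions for illustration and do not reflect Paddle's operator or its NPU kernel.

```python
import math
from typing import List, Tuple


def check_finite_and_unscale(
    grads: List[float], loss_scale: float
) -> Tuple[List[float], bool]:
    """Unscale gradients and flag non-finite values (hypothetical helper).

    Returns the gradients multiplied by 1 / loss_scale, together with a
    found_inf flag that is True if any input gradient is inf or NaN.
    """
    found_inf = any(not math.isfinite(g) for g in grads)
    inv_scale = 1.0 / loss_scale
    unscaled = [g * inv_scale for g in grads]
    return unscaled, found_inf
```

A training loop would typically apply the optimizer step only when found_inf is False, and lower the loss scale otherwise.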

Commits on Apr 24, 2021

  1. 8beb170
  2. Fix test_yolov3 Random Failure (#32496)

    Reduce max iter size to fix windows openblas test_yolov3 random failure.
    Decrease batch size to fix pe related unittest random failure.
    zhhsplendid authored Apr 24, 2021
    9bf9092
  3. 18d3e2c
  4. f8caa58
  5. print the real name for Functions instead of the ArgSpec (#32379)

    * print the real name for Functions instead of the ArgSpec
    
    class function method
    
    * some API's name is not __module__ + __name__
    
    so, we discard them temporarily.
    
    * update the logging format for console
    
    * omit the top level of the paddle package.
    
    * these APIs have been removed.
    
    test=document_fix
    
    * Another error occurred
    
    * print_signatures.py's stdout is redirected to the spec file, so it should not print any other info.
    
    so sad.
    
    * print the error msg to stderr
    
    * disable the __init__ magic method
    
    * update unittest for sampcd_processor.py
    
    update unittest for sampcd_processor.py
    
    * PR-CI-APPROVAL 's python interpreter name is not 'python3'.
    
    it's a python3.9;
    it does not have paddle installed yet.
    
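    Surprisingly, this statement cannot be used in the CI pipeline: it reports that python3 cannot be found.
    Surprisingly, this statement cannot be used in the CI pipeline, because paddle is not installed in the environment.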
    
    * testing only extract api from __all__
    
    paddle module(the top module) does not have __add__
    test=document_fix
    
    * should import paddle here
    
    * update the mechanism of extracting and executing for the sample-codes test.
    
    Update the logic for extracting and executing code
    Improve the printed output
    
    * good code style
    wadefelix authored Apr 24, 2021
    ef8671e

Commits on Apr 25, 2021

  1. Nne integration (#32255)

    * Add dlnne engine runtime
    
    * Fix log
    
    * Remove <const_cast> and remove unrelated modify with dlnne, +clang-format
    
    * Fix CMakeList format error
    
    * Add copyright message
    
    * Fix dlnne CMakeList.txt
    
    * Add some paddlepaddle_pass to support more networks
    
    * Fix some format bug
    denglin-github authored Apr 25, 2021
    feb2e47
  2. 83580ee
  3. 136ef09
  4. [NPU] refine lookup_table_v2_grad npu_kernel (#32497)

    * use ZerosLike instead of NPUMemsetAsync
    
    * fix compile
    zhiqiu authored Apr 25, 2021
    fb7590d
  5. 4db2cc9
  6. 3b61d06
  7. let paddle.utils.install_check support CPU package with GPU device (#32428)
    
    * let paddle.utils.install_check support CPU package with GPU device
    
    * use use_cuda in dygraph checking
    
    * add unittest for install_check
    pangyoki authored Apr 25, 2021
    06276f4
  8. fix tc trt shape (#32458)

    * fix tc trt shape
    
    * fix fc dynamic shape
    
    * add fc shape assert
    
    * update
    shangzhizhou authored Apr 25, 2021
    f272e59
  9. [BUG FIX] when x.dim < y.dim, the result of compare_op is inverse (#32470)
    
    * fix bug: when x.dim < y.dim, the result of compare_op is inverse to expected result
    
    * support CUDA in the fix for the compare broadcast bug
    wawltor authored Apr 25, 2021
    78eff52
  10. Fix the bug in mp (#31996)

    * update
    lilong12 authored Apr 25, 2021
    976fe6f
  11. [HybridParallel] Add pipeline layer in dygraph (#32449)

    * add pipeline layer
    ForFishes authored Apr 25, 2021
    7ef1de6
  12. 2f351ed
  13. 3b4dcad
  14. 7a4cbb3
  15. 486946a
  16. 92dc9b2
  17. Add hub Module for easy to use pre-trained models. (#31873)

    * add Hub Module for easy to use pre-trained models.
    *   support list, load, help functions.
    *   support load models by github, gitee, local
    
    Co-authored-by: LielinJiang <jianglielin@baidu.com>
    lyuwenyu and LielinJiang authored Apr 25, 2021
    4e460d7
  18. 74824fd
  19. paddle.save/load support nested structure and layer (#32446)

    * support save/load binary format tensor
    
    * Fix error when create cudaplace
    
    * Fix error when create cudaplace
    
    * Fix error when create cudaplace
    
    * get device context from pool.
    
    * move define of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'.
    
    * support complex object
    
    * improve coverage.
    
    * improve coverage
    
    * improve coverage.
    
    * fix a bug.
    
    * polish API
    
    * save/load program
    
    * paddle.save/load: layer
    
    * deal with conflict
    
    * if PY2, block test_paddle_save_load.TestSaveLoadLayer
    
    * polish code.
    
    * polish code
    
    * edit unittest
    
    * The condition for object to be identified as state_dict becomes strict
    
    * use 'core._cuda_synchronize'
    hbwx24 authored Apr 25, 2021
    727b28d
  20. 1896c77
  21. add trt verbose logs (#32459)

    cryoco authored Apr 25, 2021
    541d702
  22. [Paddle-TRT] Add trt runtime version check (#32443)

    * add trt runtime version check
    
    * use different wrap, and change to major version check
    cryoco authored Apr 25, 2021
    b055676
  23. 5943ff7
  24. 25e723e
  25. aceec7f
  26. support python39 and delete python35 in Dockerfile (#32385)

    * support python39 and delete python35
    
    * support python39 in Dockerfile.centos
    
    * fix ubuntu18 bug
    
    * update Dockerfile.ubuntu setuptools
    
    * fix centos py39 errors
    
    * fix centos py39 error2
    pangyoki authored Apr 25, 2021
    78fc74b
  27. 29e081b
  28. 5468de0
  29. Cleanup the unused codes for samplecode testing (#32525)

    * update testcases
    
    * remove unused codes
    
    * update the docstring for sampcd_processor
    
    * no need to import the six module
    
    * I don't know why there was a leading space either, but now that there are unit tests, removing it causes no problems
    
    * add unittests for print_signatures; add the first case for 'required' mechanism when executing sample code testing
    
    * there is no paddle installed in PR-CI-APPROVAL
    
    test=document_fix
    wadefelix authored Apr 25, 2021
    2328921
  30. [Paddle-TRT] Fix AI-Rank BERT emb_eltwise_layernorm input order (#32482)

    * fix airank bert emb order
    
    * move input num check to converter
    
    * add input num check
    
    * add unused var check white list
    cryoco authored Apr 25, 2021
    fba46ea
  31. Make range API set its out shape when possible (#32472)

    The `range` API sets its output shape in dygraph but not in static graph mode, which can cause Dy2stat errors. This PR sets the output shape of the `range` API when possible.
    zhhsplendid authored Apr 25, 2021
    f16981b
  32. Dygraph Recompute (#32516)

    * Dygraph recompute
    
    * unittest for Dygraph recompute
    
    * dy recompute remove unittest for win and mac
    JZ-LIANG authored Apr 25, 2021
    583ebab
  33. add pipeline for dynamic graph (#32511)

    * add pp dygraph, test=develop
    lilong12 authored Apr 25, 2021
    561dc71

Commits on Apr 26, 2021

  1. [DOC] Clarify the difference of paddle.norm and np.linalg.norm (#32530)

    * [DOC] Clarify the difference between paddle.norm and np.linalg.norm
    ZHUI authored Apr 26, 2021
    33ca455
  2. d0751d0
  3. [Dy2stat] Support paddle.to_tensor with int, float, bool. (#32420)

    paddle.to_tensor is translated to paddle.assign in Dy2stat, but paddle.assign did not support int, float, or bool inputs. This PR adds that support.
    zhhsplendid authored Apr 26, 2021
    1b9a3bf
  4. add norm_by_times param to ctc_loss (#32490)

    * add norm_by_times param to ctc_loss
    
    * fix doc,test=develop
    LDOUBLEV authored Apr 26, 2021
    6c03ea5
  5. 756f463
  6. [AMP] Autocast to fp32 for op has no fp16 kernel (#32543)

    * skip op has no fp16 kernel
    
    * add ut
    zhiqiu authored Apr 26, 2021
    d2b31a1
  7. ab3d2bf
  8. optimize slice op and slice grad op (#32266)

    * optimize slice op and slice grad op, test=develop
    
    * optimize variable name and annotation information, test=develop
    thisjiang authored Apr 26, 2021
    5161f71
  9. fd85a4a
  10. support backward return None, when corresponding input tensor without gradient (#32494)
    
    * support backward return None.
    
    * edit unittest.
    
    * edit code according to CI
    
    * Improve error information
    hbwx24 authored Apr 26, 2021
    8e66046
  11. 40e51b2
  12. [HybridParallel]Fix model parallel bug by using C++ op (#32536)

    * fix model parallel
    
    * rm parallel_help.py
    
    * add embedding
    ForFishes authored Apr 26, 2021
    ea465fa
  13. change prepend_op to append_op in initializer (#32177)

    * change prepend to append
    
    * fix ut
    
    * add testcase
    
    * fix ut
    
    * fix test_custom_relu_model
    zhiqiu authored Apr 26, 2021
    8fec3c6
  14. 41bfec8
  15. Unset ReserveSpace of batch_norm for inference program. (#32493)

    * Unset ReserveSpace for inference program.
    
    * Support training from an inference program.
    Xreki authored Apr 26, 2021
    202b0ea
  16. fix dataloader exit error (#32550)

    * fix dataloader exit error when the user exits the program while the dataloader is still iterating. test=develop
    heavengate authored Apr 26, 2021
    eae3405
  17. Modified the return value of tensor.grad from numpy to tensor. (#32142)

    * Modified the return value of tensor.grad from numpy to tensor.
    
    * Modify unittests.
    
    * fixed bugs.
    
    * Add warning info for x.grad
    
    * fixed unittests which used x.grad
    
    * fixed bug.
    xiemoyuan authored Apr 26, 2021
    c40c16a
  18. [2.1 API] Modified params of some APIs to support tuple and list. (#32528)
    
    * Modified params of some APIs to support tuple and list.
    
    * fixed bug.
    xiemoyuan authored Apr 26, 2021
    400c3aa
  19. 78908b4
  20. Make assign Doc Same for creation.py and layers/tensor.py, test=document_fix (#32553)
    
    A follow-up to #32420: that PR changed the doc in python/paddle/fluid/layers/tensor.py, and this PR changes python/paddle/tensor/creation.py.
    zhhsplendid authored Apr 26, 2021
    7f162b5
  21. fix bn docs (#32492)

    * fix bn docs
    
    * fix unittest
    ceci3 authored Apr 26, 2021
    913317f
  22. [PsCore] optimize performance of large kv (#32535)

    * optimize pull sparse
    
    * optimize pull sparse
    
    * change macro
    
    * format
    Thunderbrook authored Apr 26, 2021
    4b7242b
  23. Optimize where_index_op(prefix sum) (#30601)

    * new optimize for where_index_op with prefix sum version.
    
    * write a scan prefix sum kernel with stream for where index op.
    
    * optimize where_index by using cub::DeviceScan::InclusiveSum instead of imperfect self-kernel.
    
    * remove CheckTrue struct and rename stide_array for readability.
    
    * optimize variable names for readability.
    
    * optimize function name and annotation.
    thisjiang authored Apr 26, 2021
    6ec4e64
  24. Fix OPENBLAS ci and fix windows CPU CI to parallel compile (#32548)

    * clear CUDA compile environment on windows
    
    * fix Windows CI
    
    * fix Windows CI
    
    * fix Windows CI
    zhwesky2010 authored Apr 26, 2021
    1ec9525
  25. fcd18ef
  26. 4ba49af
  27. deal with conflict. (#32578)

    hbwx24 authored Apr 26, 2021
    a7be32c
  28. add send/recv api (#32504)

    * add sendrecv, test=develop
    lilong12 authored Apr 26, 2021
    c47bafc

Commits on Apr 27, 2021

  1. 0bc97e9
  2. f1bc322
  3. 9930a58
  4. [HybridParallel] Fix amp bug in ModelParallel (#32579)

    * fix amp bug
    
    * fix name of wordsize
    ForFishes authored Apr 27, 2021
    c1db7e3
  5. Check for cuda errors immediately after kernel launch (#32557)

    Co-authored-by: Yang Zhang <yangzhang@live.com>
    jeff41404 and willthefrog authored Apr 27, 2021
    19eefef
  6. 6579432
  7. 809ac03
  8. 85e697d
  9. Support list and tuple for args. (#32344)

    * Support list and tuple for parameters of layer_norm, multiprocess_reader, DatasetFolder and ImageFolder.
    
    * add unittest for layer_norm.
    
    * add require gpu for example.
    xiemoyuan authored Apr 27, 2021
    a08a118
  10. str in python2 is different to python3's, it make mistakes for some api's docstring (#32588)
    
    * UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1788: ordinal not in range(128)
    
    test=document_fix
    
    str(doc) in python2
    
    test=document_fix
    
    * update md5 function in count_api_without_core_ops.py
    
    str in py2 is different.
    
    test=document_fix
    wadefelix authored Apr 27, 2021
    97794ec
  11. fix cross_entropy calculation error (#32545)

    * fix cross_entropy calculation error
    
    * add unittest and fix static
    yghstill authored Apr 27, 2021
    23d3e36
  12. [Docs] Modified the docs of some api for supporting list/tuple args. (#32360)
    
    * fixed docs.
    
    * Fixed docs. test=document_fix
    
    code bak.
    
    fixed docs. test=document_fix
    
    * Revert to previous version of python/paddle/fluid/backward.py
    
    * fixed bugs.
    
    * test=document_fix. Fixed examples.
    xiemoyuan authored Apr 27, 2021
    1515892
  13. 'jit.save/load' support save/load function without parameters. (#32430)

    * jit.save/load support function.
    
    * delete unittest test_jit_load_model_incomplete.
    
    * edit code according to CI
    
    * Modify the documentation.
    
    * add note to doc.
    hbwx24 authored Apr 27, 2021
    0372f1d
  14. 6f6e159
  15. eca8dcc
  16. [OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#32596)
    
    * [OPs] Bug fix, fix the segment mean for illegal syncthreads usage.
    ZHUI authored Apr 27, 2021
    1afe1ac
  17. f285f4c
  18. clear 'BasicEngine' when an exception occurs in the backward. (#32546)

    * clear 'BasicEngine' when an exception occurs in the backward.
    
    * deal with conflict.
    
    * deal with conflict.
    hbwx24 authored Apr 27, 2021
    797b2df
  19. edit paddle.save/load API (#32532)

    * edit paddle.save/load API
    
    * Update io.py
    
    edit doc
    
    * delete cpython-37.pyc
    
    * Update io.py
    
    edit doc
    
    * Update io.py
    
    recommit
    
    * Update io.py
    
    recommit
    
    * Update io.py
    
    recommit
    
    * Update io.py
    
    recommit
    hbwx24 authored Apr 27, 2021
    79f7ba6
  20. update 2.0 public api in paddle.init (#32034)

    Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
    zhiboniu and XiaoguangHu01 authored Apr 27, 2021
    125e481
  21. update 2.0 public api in nn (#31912)

    * update 2.0 public api in nn
    
    * replace Chinese characters that caused errors in CI; synchronize with PR #32588 to avoid 'ascii' codec errors in python2
    
    * numbers used in paddle.nn.functional.norm but not imported
    zhiboniu authored Apr 27, 2021
    3b81f2b
  22. [Docker] support cuda11.2 and using gcc5.4 in cuda10.1 (#32531)

    * support cuda11.2 and using gcc5.4 in cuda10.1
    
    * fix manylinux py36 bug
    
    * support cuda11.2
    
    * fix python36 pip version problem in ubuntu
    
    * save cuda11.0
    pangyoki authored Apr 27, 2021
    3132695
  23. add alltoall api (#32507)

    * add alltoall api, test=develop
    lilong12 authored Apr 27, 2021
    db41b74

Commits on Apr 28, 2021

  1. Optimize update_loss_scaling_op (#32554)

    * optimize update_loss_scaling_op by fusing the for loop into one kernel, test=develop
    
    * remove useless while loop and optimize variable name, test=develop
    
    * optimize variable name from out_addrs_tensor to out_addrs_mem, test=develop
    
    * optimize variable names for readability by changing the prefix from t_ to local_
    thisjiang authored Apr 28, 2021
    0dc02dc
  2. [oneDNN] Added clearing oneDNN cache per executor (#32499)

    * - Added clearing oneDNN per executor
    
    * - Executor does not always have FLAGS_use_mkldnn set to true
    jczaja authored Apr 28, 2021
    ba61076
  3. Reduce the time cost for the elementwise_add test case (#32628)

    Reduce the time cost for the elementwise_add test case (#32628)
    wawltor authored Apr 28, 2021
    6d3eb3d
  4. 7a245b7
  5. Fix some error message (#32614)

    * fix two error message
    
    * fix two error message
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix some error message
    
    * fix some error
    
    * fix error
    
    * fix some error
    
    * fix some error
    
    * fix some error
    
    * fix one error
    
    * fix some error
    
    * fix seven error message
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix error
    
    * fix some error message
    
    * fix error
    
    * fix some error
    
    * fix some error
    Kqnonrime authored Apr 28, 2021
    9ee709f
  6. [PsCore] solve Brpc dep (#32632)

    * Revert "Revert "[PsCore] optimize performance of large kv (#32535)" (#32599)"
    
    This reverts commit 809ac03.
    
    * brpc dep
    Thunderbrook authored Apr 28, 2021
    4ead9a5
  7. bda0e60
  8. Nne integration (#32604)

    * Add dlnne engine runtime
    
    * Fix log
    
    * Remove <const_cast> and remove unrelated modify with dlnne, +clang-format
    
    * Fix CMakeList format error
    
    * Add copyright message
    
    * Fix dlnne CMakeList.txt
    
    * Add some paddlepaddle_pass to support more networks
    
    * Fix some format bug
    
    * Add delete dropout_op pass
    
    * Fix some format bug
    
    * Fix format bug
    denglin-github authored Apr 28, 2021
    abcb3f5
  9. Add fake interface for register_hook in static mode (#32642)

    * add fake interface for hook in static mode
    
    * add unittests
    
    * fix failed unittests
    chenwhql authored Apr 28, 2021
    9aad752
  10. bc379ca
  11. [NPU] add input EpsilonTensor for adam (#32605)

    * add input EpsilonTensor for adam
    
    * update python api
    
    * add unit test
    
    * add npu test
    
    * add more ut
    zhiqiu authored Apr 28, 2021
    119cda3

Commits on Apr 29, 2021

  1. 243b432
  2. [Paddle-TRT] Implement MHA fp16 order same as training (#32629)

    * implement MHA order same as training
    
    * fix fp16 compile issue on old architecture
    
    * fix format
    
    * fix format
    zlsh80826 authored Apr 29, 2021
    75282e7
  3. dec8ab8
  4. Add BF16 uniform random initializer (#32468)

    * Add bf16 uniform random initializer
    
    * Remove duplicated section
    
    * Change UT to CPU place only
    
    * Put detail functions into anonymous namespace
    wozna authored Apr 29, 2021
    f46f15a
  5. 8ccf549
  6. b7ddd7d
  7. b6ca6a5
  8. 10c493a
  9. 7a73692
  10. Add op read_file and decode_jpeg (#32564)

    * add op read_file and decode_jpeg
    LielinJiang authored Apr 29, 2021
    b22f6d6
  11. add __all__=[] to python files not in API public list; import * only support in API public list files (#32643)

    zhiboniu authored Apr 29, 2021
    69d237c
  12. 0f578db
  13. a3e7719

Commits on Apr 30, 2021

  1. Reduce grad fix (#32592)

    jakpiase authored Apr 30, 2021
    43527a2
  2. 8fd724a
  3. 5ada032
  4. bd8d35a
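  5. Modify the full unit test list (#32641)

    * Modify the full unit test list
    
    * Modify the full unit test list
    
    * Remove failing Windows unit tests
    
    * Remove failing Windows unit tests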
    feng626 authored Apr 30, 2021
    9b4fabf
  6. Add 12 inplace APIs including auto generated (#32573)

    * add relu6_ hardsigmoid_ leaky_relu_ Inplace APIs
    
    * add softmax_with_cross_entropy_ Inplace API
    
    * add clip_ scale_ add_ subtract_ Inplace APIs
    
    * add wlist
    
    * fix parameter of scale api
    
    * add add_n_ Inplace API and remove log_ Inplace API
    
    * fix elementwise_add_ and elementwise_sub_ broadcast problem
    
    * elementwise inplace APIs give an error message before running the op
    
    * use broadcast_shape in elementwise inplace op
    
    * add 8 inplace APIs that are auto generated
    
    * add unittest for all inplace apis
    
    * add decorator for inplace apis in static mode
    
    * fix windows blas fail of exp inplace api, change array_equal to allclose
    
    * add flatten inplace api
    
    * add flatten unittest
    
    * fix flatten unittest
    
    * add decorator
    
    * fix grad.numpy in test_pylayer_op
    
    * drop support for softmax_with_cross_entropy_
    
    * add test_inplace_softmax_with_cross_entropy to static_mode_white_list
    
    * delete __all__ in inplace_utils
    
    * delete activation inplace function and add Tensor.inplace_func
    
    * change paddle.inplace_ to Tensor.inplace_
    
    * fix little problem
    
    * add paddle in inplace_utils
    pangyoki authored Apr 30, 2021
    308073d
  7. revert data_generator __init__.py (#32670)

    * revert data_generator
    
    * test
    
    * add setup.py
    tianshuo78520a authored Apr 30, 2021
    eb13c19
  8. 7e2b60a
  9. c6713bc
  10. Support transforms for paddle tensor image (#31970)

    * add to_grayscale, normalize
    
    * add rotate
    
    * add vflip and hflip
    
    * add crop center_crop
    
    
    * add padding, support constant, reflect, replicate, circular same as paddle.pad
    
    * add get-image-[n,c,w,h] axis utils
    lyuwenyu authored Apr 30, 2021
    6ab43f7
  11. 109fdf1
  12. avoid polluting logging's root logger (#32673)

    avoid polluting logging's root logger
    Feiyu Chan authored Apr 30, 2021
    4d95c8c
  13. 0a0f324
  14. [Dy2stat] Fix to_tensor Bug Reported from QA (#32701)

    Dy2stat failed when a user wrote return paddle.to_tensor(xxx). The reason is that visit_Expr is not triggered when the expression is inside a return statement, and other statements may trigger the same bug. To fix it, the transformer was rewritten to transform paddle.to_tensor to paddle.assign for all Call nodes.
    zhhsplendid authored Apr 30, 2021
    0026819
  15. 3cc11a3
  16. f4a3f85
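
The Dy2stat fix in #32701 above describes moving the rewrite from Expr nodes to Call nodes. The following is a minimal sketch of that idea using only Python's standard `ast` module; it is not Paddle's actual transformer. The class names `ToTensorToAssign` and `ExprCounter` are hypothetical, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast


class ToTensorToAssign(ast.NodeTransformer):
    """Rewrite every `paddle.to_tensor(...)` call into `paddle.assign(...)`."""

    def visit_Call(self, node):
        # Visit children first so that nested calls are rewritten too.
        self.generic_visit(node)
        func = node.func
        if (isinstance(func, ast.Attribute)
                and func.attr == "to_tensor"
                and isinstance(func.value, ast.Name)
                and func.value.id == "paddle"):
            func.attr = "assign"
        return node


class ExprCounter(ast.NodeVisitor):
    """Count Expr statements, to show that a return value is not one."""

    def __init__(self):
        self.count = 0

    def visit_Expr(self, node):
        self.count += 1
        self.generic_visit(node)


source = (
    "def f(x):\n"
    "    paddle.to_tensor(x)\n"         # expression statement: wrapped in Expr
    "    return paddle.to_tensor(x)\n"  # return value: no Expr wrapper
)
tree = ast.parse(source)

# Only the bare expression statement is seen by visit_Expr.
counter = ExprCounter()
counter.visit(tree)
expr_statements = counter.count

# Hooking visit_Call instead reaches both occurrences.
new_tree = ast.fix_missing_locations(ToTensorToAssign().visit(tree))
rewritten = ast.unparse(new_tree)
```

Because a Return node holds its value directly rather than inside an Expr statement, a visitor that hooks only visit_Expr never sees the second call, while a visit_Call hook reaches calls wherever they appear.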

Commits on May 3, 2021

  1. Fix the bug in pipeline for dygraph mode (#32716)

    * update, test=develop
    lilong12 authored May 3, 2021
    a0f4ac5
  2. d0de2d8

Commits on May 5, 2021

  1. update, test=develop (#32726)

    lilong12 authored May 5, 2021
    a259076

Commits on May 6, 2021

  1. 8b1b214
  2. 9599c3b
  3. add int64 support test=develop (#32736)

    add int64 support
    gongweibao authored May 6, 2021
    f1c68a0
  4. c5ae21f
  5. fix l1 decay for inplace (#32717)

    littletomatodonkey authored May 6, 2021
    efdb0a7
  6. [ROCM] bugfix for unittest (#32392)

    * fix test_unpool_op
    
    * fix test_inplace_addto_strategy
    
    * fix test_conv2d_fusion_op
    
    * fix test_imperative_lod_tensor_to_selected_rows, test_imperative_selected_rows_to_lod_tensor
    
    * fix test_dot_op
    
    * fix test_correlation_op
    
    * fix tracer
    
    * fix test_memcpy_op
    ronny1996 authored May 6, 2021
    3139262
  7. [Rocm] fix expand as (#32704)

    * [Rocm] fix test_expand_as_op
    
    * [Rocm] fix test_expand_as_op
    
    * [Rocm] fix test_expand_as_op
    
    * [Rocm] fix test_expand_as_op
    
    * [Rocm] fix test_expand_as_op
    
    * [Rocm] fix test_expand_as_op
    Ray2020BD authored May 6, 2021
    2fe4580
  8. 28d42a9
  9. 70eb435
  10. [Rocm] fix tests of inplace_abn_op & grid_sampler_op (#32703)

    * [Rocm] fix tests of inplace_abn_op & grid_sampler_op
    
    * [Rocm] fix tests of inplace_abn_op & grid_sampler_op
    Ray2020BD authored May 6, 2021
    7c27541
  11. [2.1 API] Enable printing deprecated warning info. (#32712)

    * Add deprecated warning info.
    
    * Add unittest for deprecated decorator.
    
    * Add warning info for tensor.grad
    xiemoyuan authored May 6, 2021
    51b39a9
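
For readers unfamiliar with the pattern behind #32712 above, a deprecation decorator can be sketched with the standard `warnings` module. This is an illustration only and not Paddle's implementation: the decorator name `deprecated`, its parameters `since` and `update_to`, and the functions `old_add` and `new_add` are all assumptions made for the example.

```python
import functools
import warnings


def deprecated(since, update_to=""):
    """Hypothetical decorator: warn every time the wrapped function is called."""
    def decorator(func):
        message = f"{func.__name__} is deprecated since {since}."
        if update_to:
            message += f" Use {update_to} instead."

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller, not this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(since="2.1", update_to="new_add")
def old_add(a, b):
    return a + b


with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is ignored by default unless triggered from __main__,
    # so the filter is relaxed here to make the warning visible.
    warnings.simplefilter("always")
    result = old_add(1, 2)
```

The explicit filter matters because Python's default warning filters hide most DeprecationWarning instances, which is one plausible reason a project would need to enable printing them deliberately.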

Commits on May 7, 2021

  1. Mechanism that converts startup_program initializers to BF16 (#32720)

    * Add casting initializers for bf16 training
    
    * Changes after review
    
    * Correct test and add comment
    wozna authored May 7, 2021
    ce2bdb0
  2. Refactor dot op's CPU kernel for better performance (#32589)

    * OP dot: refactor CPU kernels and get better loop performance.
    
    * Minor fix on code format.
    
    * Fixed minor errors.
    tongxin authored May 7, 2021
    97a9552
  3. bug fix, test=develop (#32752)

    lilong12 authored May 7, 2021
    9b65d4c
  4. Remove paddle_custom_op dynamic libraries, and link to FLUID_CORE on Windows (#32583)

    * Remove paddle_custom_op dynamic libraries, change link to FLUID_CORE on windows, and check copy_to
    
    * fix CI
    zhwesky2010 authored May 7, 2021
    7610c2b