Integrate warp-ctc as WarpCTCLayer, including unitest and layer inter… #651

Merged: 8 commits merged into PaddlePaddle:develop from the warpctc branch on Dec 7, 2016

Xreki (Contributor) commented Nov 29, 2016

issue #434

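  1. Modify the DSO-related implementation so that it can also be used in CPU-only builds.
  2. Add WarpCTCLayer, which integrates warp-ctc.
  3. Add the unit test test_WarpCTCLayer, which compares the loss and gradients against CTCLayer.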

TODO:

  • Add warp-ctc to PaddlePaddle as a submodule. After adding it, the pre-commit checks remove-crlf and detect-private-key fail with the error IOError: [Errno 21] Is a directory: 'warp-ctc'
  • Add automatic download and build of the submodule to the CI system.

@@ -94,6 +94,11 @@ endif()
if(NOT WITH_GPU)
add_definitions(-DPADDLE_ONLY_CPU)
add_definitions(-DHPPL_STUB_FUNC)

if(WITH_DSO)
add_definitions(-DPADDLE_USE_DSO)
Collaborator

What does DSO stand for?

Contributor Author

DSO is short for dynamic loading of .so files. @gangliao implemented it to load some CUDA-related libraries, so until now it was used only in with_gpu builds. warp-ctc is loaded through the DSO mechanism, and because warp-ctc comes in both a CPU-GPU version and a CPU-only version, I made some changes so that Paddle's CPU-only build also supports DSO.

* @param **dso_handle dso handler
*
*/
void GetWarpctcDsoHandle(void** dso_handle);
Collaborator

CTC is an acronym and should be uppercase. DSO also appears to be an acronym? So the name should be GetWarpCTCDSOHandle

Contributor Author

@gangliao Should DSO be changed to uppercase everywhere?

Contributor

I don't think it needs to change. Dso is fairly common, similar to the tf loader. When I joined, only Linux was supported, so I named it after "so" and did not consider names such as DLL.

Collaborator

If it is an acronym, it should be all uppercase.

#define HL_WARPCTC_WRAP_H_

#include "hl_base.h"
/// #include "hl_cuda.h"
Collaborator

If this code is no longer needed, please delete it rather than commenting it out.

Contributor Author

OK.

struct DynLoad__##__name { \
template <typename... Args> \
__type operator()(Args... args) { \
typedef __type (*warpctcFunc)(Args...); \
Contributor

I suggest using this style, which lets the compiler deduce the function's return type at compile time and makes DYNAMIC_LOAD_WARPCTC_WRAP more concise. hl_cudart_wrap.cc uses __type only because it wraps extern "C" functions, for which the type cannot be deduced.
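
The return-type deduction suggested here can be sketched roughly as follows. This is an illustration under stated assumptions, not the macro from this PR: `DYNAMIC_LOAD_WRAP`, `resolveSymbol`, `symbolTable`, and `add_ints` are hypothetical names, and `resolveSymbol` stands in for a `dlsym` lookup on a real DSO handle so that the sketch runs without a shared library. It also assumes a declaration of the wrapped function is visible (as it would be from the library's header), since `decltype` needs it to deduce the return type.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical library function. In practice the declaration would come from
// the library header and the definition from the shared object.
int add_ints(int a, int b) { return a + b; }

// Stand-in for the symbols exported by a loaded shared object.
inline std::map<std::string, void*>& symbolTable() {
  static std::map<std::string, void*> table = {
      {"add_ints", reinterpret_cast<void*>(&add_ints)}};
  return table;
}

// Stand-in for dlsym(handle, name); returns nullptr for unknown symbols.
inline void* resolveSymbol(const std::string& name) {
  auto it = symbolTable().find(name);
  return it == symbolTable().end() ? nullptr : it->second;
}

// The return type is deduced with decltype from the visible declaration,
// so the macro needs no separate return-type argument. Note that Args is
// deduced from the call site, so arguments must match the real parameter
// types exactly.
#define DYNAMIC_LOAD_WRAP(name)                                \
  struct DynLoad_##name {                                      \
    template <typename... Args>                                \
    auto operator()(Args... args) -> decltype(name(args...)) { \
      using Func = decltype(name(args...)) (*)(Args...);       \
      void* symbol = resolveSymbol(#name);                     \
      assert(symbol != nullptr && "symbol not found");         \
      return reinterpret_cast<Func>(symbol)(args...);          \
    }                                                          \
  } dyn_##name

DYNAMIC_LOAD_WRAP(add_ints);
```

Calling `dyn_add_ints(2, 3)` then forwards to the resolved symbol, and the call expression has the same type as a direct call to `add_ints`.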

Collaborator

@wangkuiyi wangkuiyi left a comment

I am not an expert on layers, so my comments are mainly about code style. It is enough for @Xreki to take note of them and make the changes. I have approved. Kudos to this PR for integrating WarpCTC!

As for the code itself, @Xreki, please mainly seek feedback from other colleagues.

for (int i = threadIdx.x; i < sequenceWidth; i += blockDim.x) {
batch[batchBaseIdx + i] = sequence[sequenceBaseIdx + i];
}
}
Contributor

Consider merging lines 469-477 into a single for loop (the same applies below):
real scale = normByTimes ? 1.0f / (real)sequenceLength : 1.0f;
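
A minimal CPU-side sketch of the merged loop. It assumes that lines 469-477 (not shown in this excerpt) contain two near-identical copy loops that differ only in whether each value is scaled by 1/sequenceLength. `copyRow` and its parameters are hypothetical; the real code is a CUDA kernel that strides over threadIdx.x.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using real = float;  // assumption: single-precision build

// Copies one row of `sequence` into `batch`, optionally normalizing by the
// sequence length. Computing `scale` once lets a single loop serve both cases.
inline void copyRow(std::vector<real>& batch, const std::vector<real>& sequence,
                    std::size_t sequenceLength, bool normByTimes) {
  assert(!normByTimes || sequenceLength > 0);  // avoid dividing by zero
  const real scale =
      normByTimes ? 1.0f / static_cast<real>(sequenceLength) : 1.0f;
  batch.resize(sequence.size());
  for (std::size_t i = 0; i < sequence.size(); ++i) {
    batch[i] = scale * sequence[i];
  }
}
```

When normByTimes is false the multiplication by 1.0f leaves the values unchanged, so no separate unscaled branch is needed.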


P_DECLARE_bool(use_gpu);

const real* getData(const Matrix& matrix) {
Contributor

getData and checkError are already implemented elsewhere and can be called directly.

Contributor Author

@Xreki Xreki Nov 30, 2016

checkError is indeed implemented more than once across several files, for example checkMatrixErr in paddle/math/tests/test_matrixUtil.h and checkError in paddle/gserver/tests/test_RecurrentLayer.cpp.
In addition, paddle/math/tests/test_matrixUtil.h and paddle/gserver/tests/TestUtil.h both define a function named checkMatrixEqual. As a result, test_WarpCTCLayer.cpp cannot include both TestUtil.h and test_matrixUtil.h. Let's fix this in one pass once the unit tests have been refactored.
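
One way to avoid the duplicate definitions described above would be a single shared helper that every test includes. The following is a minimal sketch that assumes flat float buffers rather than Paddle's Matrix type; `countMismatches` and its default tolerance are hypothetical and do not reproduce the existing checkError signature.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts elements whose absolute difference exceeds `tolerance`.
// A test could compare, for example, the gradients produced by two layer
// implementations and expect the result to be 0.
inline std::size_t countMismatches(const std::vector<float>& a,
                                   const std::vector<float>& b,
                                   float tolerance = 1e-5f) {
  assert(a.size() == b.size());
  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (std::fabs(a[i] - b[i]) > tolerance) ++mismatches;
  }
  return mismatches;
}
```

Marking the helper `inline` in one header avoids multiple-definition errors when several test files include it.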

@Xreki Xreki added this to the 0.10.0 milestone Nov 30, 2016
@Xreki Xreki force-pushed the warpctc branch 2 times, most recently from f8989ae to cee7001 Compare December 1, 2016 07:39
zzsu commented Dec 2, 2016

Looks good to me

@@ -105,6 +105,11 @@ endif()
if(NOT WITH_GPU)
add_definitions(-DPADDLE_ONLY_CPU)
add_definitions(-DHPPL_STUB_FUNC)

if(WITH_DSO)
Contributor

This duplicates line 134; it can be hoisted out and written once.

gangliao (Contributor) commented Dec 2, 2016

LGTM

@Xreki Xreki merged commit 4823075 into PaddlePaddle:develop Dec 7, 2016
@Xreki Xreki deleted the warpctc branch December 7, 2016 04:54