
[hybrid performance] pipeline add program cache #33954

Merged

Conversation

@wangxicoding (Contributor) commented Jul 4, 2021

PR types

Performance optimization

PR changes

Others

Describe

Add a program_cache to pipeline execution to speed up runs.

  1. Without the program_cache, a significant amount of time is spent on ctx (execution context) preparation:
     [profiler screenshot]
  2. After adding the program_cache, the Python side no longer shows significant extra overhead (a sketch of the general caching pattern follows this list):
     [profiler screenshot]
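
To make the effect concrete, here is a minimal sketch of the general program-cache pattern: prepare the execution context once per program and reuse it on subsequent runs. This is illustrative only, not Paddle's actual internals; the names `CachedExecutor`, `_ctx_cache`, and `_prepare_ctx` are hypothetical.

```python
# Minimal sketch of a program cache: prepare the execution context once
# per program, then reuse it on later run() calls (illustrative only).

class Program:
    """Stand-in for a compiled program (fluid.Program in Paddle)."""
    def __init__(self, ops):
        self.ops = ops


class CachedExecutor:
    def __init__(self):
        # Maps a program's identity to its prepared execution context,
        # so preparation happens once instead of on every run() call.
        self._ctx_cache = {}

    def _prepare_ctx(self, program):
        # Stand-in for the expensive per-run work profiled in this PR:
        # building a per-op execution context for the whole program.
        return {"prepared_ops": [("ctx", op) for op in program.ops]}

    def run(self, program, use_program_cache=True):
        key = id(program)
        ctx = self._ctx_cache.get(key) if use_program_cache else None
        if ctx is None:
            ctx = self._prepare_ctx(program)  # cache miss: pay the cost once
            if use_program_cache:
                self._ctx_cache[key] = ctx
        # Execute with the (possibly cached) context.
        return [op for _, op in ctx["prepared_ops"]]


prog = Program(ops=["matmul", "relu", "softmax"])
exe = CachedExecutor()
exe.run(prog)  # first call prepares and caches the ctx
exe.run(prog)  # later calls hit the cache and skip preparation
```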

Benchmark: V100 32G, gpt2-en model

| cards | optimization | dtype | speed (tokens/s) | speedup |
|-------|--------------|-------|------------------|---------|
| 4-card pp | baseline | fp32 | 16748 | - |
| 4-card pp | baseline | fp16 | 31304 | - |
| 4-card pp | remove `_dump_debug_info` | fp32 | 18344 | 9.53% |
| 4-card pp | remove `_dump_debug_info` | fp16 | 39141 | 25.0% |

Speedup is measured against the baseline at the same dtype, e.g. 18344 / 16748 ≈ 1.0953, i.e. +9.53%.

Follow-up TODO: profile the C++ side for any remaining ctx-preparation overhead and optimize it if present.
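
For reference, the user-facing way to opt into program caching in Paddle's static-graph API is the `use_program_cache` flag on `Executor.run()`. Below is a minimal usage sketch; the network is illustrative, and whether the pipeline trainer routes through exactly this flag is an assumption.

```python
import numpy as np
import paddle

paddle.enable_static()

main_prog = paddle.static.Program()
startup_prog = paddle.static.Program()
with paddle.static.program_guard(main_prog, startup_prog):
    x = paddle.static.data(name="x", shape=[None, 8], dtype="float32")
    y = paddle.static.nn.fc(x, size=1)

exe = paddle.static.Executor(paddle.CPUPlace())
exe.run(startup_prog)

for _ in range(3):
    # use_program_cache=True reuses the prepared execution context across
    # identical run() calls instead of rebuilding it on every step.
    out, = exe.run(main_prog,
                   feed={"x": np.random.rand(4, 8).astype("float32")},
                   fetch_list=[y],
                   use_program_cache=True)
```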

@paddle-bot-old (bot) commented Jul 4, 2021

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wangxicoding force-pushed the pipeline_program_cache branch from ef67f39 to f91942f on July 5, 2021 03:41
@wangxicoding force-pushed the pipeline_program_cache branch from 8dd5b08 to fa2721f on July 5, 2021 09:24
@wangxicoding requested a review from sandyhouse on July 5, 2021 11:25
@wangxicoding changed the title from "pipeline add program cache" to "[hybrid performance] pipeline add program cache" on Jul 5, 2021
@sandyhouse left a comment

LGTM

@wangxicoding merged commit c9ae136 into PaddlePaddle:develop on Jul 6, 2021
@wangxicoding deleted the pipeline_program_cache branch on July 6, 2021 08:25