add vae compile to sd-webui #473

Merged
merged 13 commits into from
Jan 3, 2024

@fpzh2011 fpzh2011 commented Dec 28, 2023

  • Save the original VAE model before the script runs
  • Compile the VAE model and cache the compiled result
  • Restore the original VAE model after the script finishes
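The three steps above can be sketched as a small context manager. This is only an illustrative sketch, not the code from this PR: `compiled_vae`, `get_compiled`, and `_compiled_cache` are hypothetical names, `compile_fn` stands in for whatever compile call is actually used, and the attribute name `first_stage_model` is an assumption about where the VAE lives on the model object.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Tuple

# Hypothetical cache: id(original model) -> (original model, compiled model).
# The original is stored too, so its id cannot be reused while the entry exists.
_compiled_cache: Dict[int, Tuple[Any, Any]] = {}


def get_compiled(model: Any, compile_fn: Callable[[Any], Any]) -> Any:
    """Return the cached compiled model, compiling it on first use."""
    key = id(model)
    if key not in _compiled_cache:
        _compiled_cache[key] = (model, compile_fn(model))
    return _compiled_cache[key][1]


@contextmanager
def compiled_vae(holder: Any, compile_fn: Callable[[Any], Any],
                 attr: str = "first_stage_model"):
    """Swap in a compiled VAE for the duration of the block, then restore it."""
    original = getattr(holder, attr)                            # 1. save the original
    setattr(holder, attr, get_compiled(original, compile_fn))   # 2. compile and cache
    try:
        yield holder
    finally:
        setattr(holder, attr, original)                         # 3. restore the original
```

A script would wrap its run in `with compiled_vae(sd_model, compile_fn): ...`, where `sd_model` is assumed to be the loaded model object. The `finally` clause restores the original VAE even if the run raises, and the cache means a second run does not recompile.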

txt2img execution time: onediff takes 4.46s, torch takes 7.75s (SDXL, 1024x1024).
@strint @ccssu

@fpzh2011 fpzh2011 requested review from strint and ccssu December 29, 2023 03:28
@fpzh2011

Fixed the issue where the sd-2.1 model produced NaN results.

fpzh2011 commented Jan 3, 2024

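All previous review comments have been addressed. In addition, the following changes were made:

  • Disabled the graph_file option in oneflow_compile to avoid the shared_graph error
  • Fixed the issue where the VAE was not compiled on the first run. The logs confirm that VAE compilation now takes effect.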
    @strint

strint commented Jan 3, 2024

txt2img execution time: onediff takes 4.46s, torch takes 7.75s (SDXL, 1024x1024)

Please state the device model;
Record the TensorRT (trt) numbers as well;
Also add the e2e performance data to the README;

fpzh2011 commented Jan 3, 2024

Added the TensorRT end-to-end time to the README, switched the format to a table, and added the steps data.
@strint

fpzh2011 commented Jan 3, 2024

  • Corrected the onediff execution time
  • Added the TensorRT execution time
  • Added the speedup ratio (torch/onediff)
    @strint
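As a rough illustration of the torch/onediff ratio mentioned above, the arithmetic can be sketched as follows. The helper name `speedup_ratio` is hypothetical, and the timings used are the ones quoted earlier in this conversation (7.75s and 4.46s), which may differ from the corrected figures that ended up in the README.

```python
def speedup_ratio(baseline_seconds: float, optimized_seconds: float) -> float:
    """Ratio of baseline time to optimized time; values above 1.0 mean faster."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


# Timings quoted earlier in this thread (SDXL, 1024x1024); illustrative only.
torch_seconds = 7.75
onediff_seconds = 4.46
ratio = speedup_ratio(torch_seconds, onediff_seconds)
print(f"torch/onediff = {ratio:.2%}")  # prints "torch/onediff = 173.77%"
```

For throughput figures in it/s, higher is better, so the ratio is taken the other way around (optimized over baseline).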

@fpzh2011 fpzh2011 merged commit f040183 into main Jan 3, 2024
2 of 4 checks passed
@fpzh2011 fpzh2011 deleted the sd_webui_vae branch January 3, 2024 10:09
@@ -11,6 +11,11 @@ Updated on DEC 26, 2023. Device: RTX 3090. Resolution: 1024x1024
| --------------- | --------------- | ------------------ | ---------------------- |
| 2.99it/s | 6.40it/s | 6.71it/s | 224.41% |

Time to enerate a 1024x1024 image with sdxl (30 steps) on 3090

Suggested change
- Time to enerate a 1024x1024 image with sdxl (30 steps) on 3090
+ End2end time(seconds) to generate a 1024x1024 image with SDXL (30 steps) on NVIDIA RTX 3090:
