torch.compile based model optimizer #6377
Conversation
Force-pushed 79d604d to 66f0159
@bnellnm I'd like to ask: I didn't see any pattern-matching related code in the PR. Did I miss it?
I have a small question. The operator has already been registered with torch in torch_binding.cpp above, so why does it need to be registered again here?
This version of silu_mul_quant includes allocating the output/tmp buffers. The function names here don't really mean anything since we will be going by the torch op name (but probably should be changed to avoid confusion).
We don't use pytorch's pattern matching routines. The code in fusion.py searches for and combines fusable operations.
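For context, a minimal sketch of what a buffer-allocating wrapper registered under its own torch op name could look like, assuming a hypothetical `fused_ops` namespace and vLLM's existing `_C` kernels; this is illustrative, not the PR's actual code.

```python
import torch
from torch.library import Library

# Illustrative sketch only: a Python-level silu_mul_quant that allocates its
# own output/tmp buffers and then dispatches to already-registered C++ kernels.
# The "fused_ops" namespace and exact signatures are assumptions.

def silu_mul_quant_impl(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    tmp = torch.empty(x.shape[:-1] + (d, ), dtype=x.dtype, device=x.device)
    out = torch.empty(x.shape[:-1] + (d, ),
                      dtype=torch.float8_e4m3fn,
                      device=x.device)
    # Assumed in-place C++ ops (output-first convention); adjust if the real
    # kernel signatures differ.
    torch.ops._C.silu_and_mul(tmp, x)
    torch.ops._C.static_scaled_fp8_quant(out, tmp, scale)
    return out

# Registering the wrapper as its own torch op lets later passes match on the
# torch op name rather than on the Python function name.
fused_lib = Library("fused_ops", "FRAGMENT")
fused_lib.define("silu_mul_quant(Tensor x, Tensor scale) -> Tensor")
fused_lib.impl("silu_mul_quant", silu_mul_quant_impl, "CUDA")
```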
I think this should be

    if isinstance(n, float):
        ty = "float"
    elif isinstance(n, int):
        ty = "int"
    elif n.type is not None:
        ...

so that the proper C++/Python type is returned.
I think the signature of this function should remain as it was before so that we could do a proper job of generating a meta function at some point in the future. The compose stuff doesn't really work for non-unary meta functions.
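As a point of reference, a hand-written meta ("fake") implementation for a non-unary op might look like the sketch below; the op name, namespace, and output dtype are assumptions for illustration (it reuses the hypothetical `fused_ops::silu_mul_quant` schema from the earlier sketch), not the functions from this PR.

```python
import torch
from torch.library import Library

# Hypothetical example: an explicit meta function for a two-argument op,
# written by hand rather than composed from unary shape functions.
# Only shapes, dtypes, and devices matter on the "Meta" dispatch key.

def silu_mul_quant_meta(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    return torch.empty(x.shape[:-1] + (d, ),
                       dtype=torch.float8_e4m3fn,
                       device=x.device)

# Assumes the "fused_ops::silu_mul_quant" schema has already been defined
# (see the earlier registration sketch).
fused_lib = Library("fused_ops", "FRAGMENT")
fused_lib.impl("silu_mul_quant", silu_mul_quant_meta, "Meta")
```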
Force-pushed 9a9af98 to 962d9bb
Force-pushed e2fa8b8 to 355436f
@bnellnm Is this feature still ongoing?
It is, although this PR will probably be superseded by others.
Closing as superseded.
This PR implements the first cut of the optimizer described here: [RFC] A Graph Optimization System in vLLM using torch.compile
Issue: #6378
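To give a rough picture of the kind of pass fusion.py is described as performing (searching for and combining fusable operations), here is a minimal, hypothetical sketch over an FX graph; the `fused_ops` op handles and their `(input, scale)` signatures are stand-ins, not the operators this PR actually rewrites.

```python
import torch
import torch.fx as fx

# Hypothetical sketch of a fusion pass in the spirit of fusion.py: find a
# quant node fed directly by a silu_mul node and splice in one fused op.
# The ops fused_ops.silu_mul, fused_ops.quant and fused_ops.silu_mul_quant
# are illustrative stand-ins with assumed (input, scale) signatures.

def fuse_silu_mul_quant(gm: fx.GraphModule) -> fx.GraphModule:
    graph = gm.graph
    for quant_node in list(graph.nodes):
        if (quant_node.op != "call_function"
                or quant_node.target is not torch.ops.fused_ops.quant.default):
            continue
        act_node, scale = quant_node.args
        if not (isinstance(act_node, fx.Node)
                and act_node.op == "call_function"
                and act_node.target is torch.ops.fused_ops.silu_mul.default
                and len(act_node.users) == 1):
            continue  # only fuse when the activation feeds the quant alone
        with graph.inserting_before(quant_node):
            fused = graph.call_function(
                torch.ops.fused_ops.silu_mul_quant.default,
                (act_node.args[0], scale))
        quant_node.replace_all_uses_with(fused)
        graph.erase_node(quant_node)
        graph.erase_node(act_node)
    graph.lint()
    gm.recompile()
    return gm
```

A pass like this could be run on the graphs produced by torch.compile, for example from a custom backend or one of Inductor's custom pass hooks, though the exact integration point used by this PR is described in the RFC rather than shown here.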