[Relay][Pass] Support combine multiple dense op just into dense #6062
Conversation
Force-pushed from 022bd68 to 2170bed.
Thanks for the PR! I have two questions other than the comments:

- How does an end user trigger this change (i.e., `to_batch=false`)? It seems to me that you can configure it only by manually modifying the build_module or the VM compiler and rebuilding TVM. (See the sketch after this comment.)
- IIUC, the reason that `ParallelDenseFlatCombiner` derives from `ParallelOpCombiner` instead of `ParallelOpBatchCombiner` is that it requires special processing in almost every function, so there seems to be no benefit in deriving from `ParallelOpBatchCombiner`. Now the class hierarchy becomes:

  ParallelDenseBatchCombiner <- ParallelOpBatchCombiner <- ParallelOpCombiner
  ParallelDenseFlatCombiner <-----------------------------------|

  Since I didn't find any other classes derived from `ParallelOpBatchCombiner`, should we simplify the `ParallelOpBatchCombiner` class if we cannot make both `ParallelDense*Combiner` classes derive from it?
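For reference, a minimal sketch of how the flag could be exercised from Python, assuming the pass is exposed as `relay.transform.CombineParallelDense` and using the `to_batch_matmul` keyword from this PR's later rename (the exact names are assumptions, not a confirmed API):

```python
import tvm
from tvm import relay

# Toy module: two parallel dense ops sharing the same input.
x = relay.var("x", shape=(1, 16))
w1 = relay.var("w1", shape=(8, 16))
w2 = relay.var("w2", shape=(8, 16))
out = relay.Tuple([relay.nn.dense(x, w1), relay.nn.dense(x, w2)])
mod = tvm.IRModule.from_expr(relay.Function([x, w1, w2], out))

# Assumed entry point: to_batch_matmul=False requests the flat-dense
# combination from this PR instead of the batch_matmul rewrite.
combine = relay.transform.CombineParallelDense(min_num_branches=2, to_batch_matmul=False)
with tvm.transform.PassContext(opt_level=3):
    mod = combine(mod)
print(mod)
```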
Thanks for your comments!
IMO there should be a CombineParallelOp pass that just calls CombineParallelDense/Conv2d/etc. in sequence. Can you add that?
Also, why CombineParallel instead of Batch? People use static batching/dynamic batching to describe this process.
@MarisaKirisame please manage the PR and merge once everyone approves.
@tqchen got it.
@MarisaKirisame please follow up.
@wrongtest can you just make a pass that is nothing but a sequence of the 3 combine passes? That's all I think should be changed.
Sorry for the late reply; a wrapped function BatchingOps() has been added.
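A sketch of what such a wrapper can look like on the Python side, assuming `tvm.transform.Sequential` and the existing `CombineParallel*` passes; this mirrors the sequence-of-three-passes idea requested above rather than quoting the merged code:

```python
import tvm
from tvm import relay


def BatchingOps():
    """Sketch: run the three parallel-op combine passes back to back."""
    return tvm.transform.Sequential(
        [
            relay.transform.CombineParallelConv2D(),
            relay.transform.CombineParallelDense(),
            relay.transform.CombineParallelBatchMatmul(),
        ]
    )
```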
…he#6062)
* feat: Support combine multiple matmuls to flat matmul
* fix: Change to_batch -> to_batch_matmul and enrich docstring
* feat: Add wrapped batching ops pass for python
Hi there, this PR is a minor modification to the CombineParallelDense pass; see https://discuss.tvm.ai/t/yet-another-dense-op-combine-strategy/7126. The changes are:

- Add an option `to_batch` (default True) to control whether to combine dense ops into `batch_matmul` or `dense`.
- Add an implementation that combines dense ops into one large `dense` instead of `batch_matmul`, using almost the same logic as the CombineParallelConv2D pass.
- Add test cases that combine dense ops of various shapes, each followed by element-wise ops.

The new strategy can combine ops even when their output dims differ, and may perform better in circumstances where a flat matmul is faster than the equivalent batch_matmul.
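To illustrate the flat strategy, here is a hypothetical before/after sketch in Relay's Python API; the rewritten form (concatenated weights plus slices) is an assumption drawn from the CombineParallelConv2D analogy, not the pass's literal output:

```python
from tvm import relay

# Two parallel dense ops with *different* output dims (8 and 4): the
# batch_matmul strategy cannot merge these, but the flat strategy can.
x = relay.var("x", shape=(1, 16))
w1 = relay.var("w1", shape=(8, 16))
w2 = relay.var("w2", shape=(4, 16))

# Before: two separate dense calls on the same input.
before = relay.Tuple([relay.nn.dense(x, w1), relay.nn.dense(x, w2)])

# After (conceptually): one large dense over weights concatenated along
# the output axis, then slices to recover each branch's result.
w = relay.concatenate([w1, w2], axis=0)                 # shape (12, 16)
y = relay.nn.dense(x, w)                                # shape (1, 12)
y1 = relay.strided_slice(y, begin=[0, 0], end=[1, 8])   # branch 1: (1, 8)
y2 = relay.strided_slice(y, begin=[0, 8], end=[1, 12])  # branch 2: (1, 4)
after = relay.Tuple([y1, y2])
```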