Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR] fix storage rewrite index remap #8338

Merged

Conversation

wrongtest-intellif
Copy link
Contributor

@wrongtest-intellif wrongtest-intellif commented Jun 25, 2021

This is a fix trial for #8308 where the storage rewrite pass remap the index of a vectorized load.

The offset computation should be based on the actual allocated buffer's datatype rather than the load/store op's result datatype.

For example, if a float32*1024 ramp load is relocated to offset 4096b of a float buffer, the offset expr should be +4096/bits_of(float32) rather than +4096/bits_of(float32*1024)

cc @vinx13 @masahi

@@ -504,8 +503,13 @@ class StoragePlanRewriter : public StmtExprMutator {
return MergeNest(nest, body);
}
// Remap the index
PrimExpr RemapIndex(DataType dtype, PrimExpr index, StorageEntry* e) {
PrimExpr RemapIndex(PrimExpr index, StorageEntry* e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we removed the dtype here? perhaps we can limit the lanes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my understanding the most correct dtype to use is the buffer data type itself. The dtype param before is load_op->dtype or store_op->dtype from context.
If the buffer dtype has non-trivial lanes like tir.allocate([n], "float32*4") (I'm not sure if it is possible), it seems not clear which factor to use based on original dtype param.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current setup(our code generation), the Load and Store follows the dtype specified by the instruction, regardless of the dtype specified in allocation. So it is better to use dtype in the Node. But you are right that we should not multiply by lanes when calculating the offset

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right! I checked some mixed dtypes and found that the type casting is done before indexing for load/store. I repush the change which modify just one line of code about lanes, can you kindly review it again?

@tqchen tqchen merged commit 6bcad2e into apache:main Jul 6, 2021
@tqchen
Copy link
Member

tqchen commented Jul 6, 2021

Thanks @wrongtest

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants