
Codegen tensor to support arbitrary stride order #1144

Open
jjsjann123 opened this issue Sep 21, 2021 · 2 comments
Labels
backlog low priority

Comments

@jjsjann123
Collaborator

🚀 Feature

Codegen tensors could support an arbitrary stride order, instead of assuming/requiring descending strides on domains.

Motivation

This request arises from our channels-last PR #1118, where we currently permute tensors in the integration code, which creates inconsistent semantics for tensor axes between the TorchScript IR and the codegen IR.
The motivation is to allow dimensional collapsing on non-contiguous tensors, which is not possible with codegen at the moment.

Pitch

As a demonstration, we would want to update two things in the client APIs to allow a user-specified stride index (I'm not arguing that stride_index should be maintained by TensorView):

  TensorView(
      TensorDomain* domain,
      DataType dtype,
      MemoryType mtype = MemoryType::Local,
      std::vector<size_t> stride_index = {}); // needs a default, since it follows a defaulted parameter

  class TORCH_CUDA_CU_API TensorViewBuilder {
   public:
    TensorViewBuilder& stride_index(const std::vector<size_t>& stride_index);
  };

We need to keep in mind that the end goal is simplified indexing & performance; hence we should minimize permutation as much as possible, for input tensors as well as output tensors. A few more things regarding that topic:

  1. Currently, when we infer an output tensor, we only return tensor sizes. This would likely change when we want to preserve & propagate the memory format from inputs to outputs, because we need to communicate the stride requirements on outputs as well.
  2. The concept of a stride index might make parsing/scheduling tricky. Without the assumption of a descending stride order on tensors, we now need to be careful when we merge tensor domains in scheduling, as well as in the general dimension manipulation we do in the parser, e.g. merging dimensions in layer-norm.
@jjsjann123
Collaborator Author

cc'ing @naoyam
Hopefully I'm capturing what you want to track/discuss in this comment, but feel free to yell at me otherwise 🙏

@naoyam
Collaborator

naoyam commented Sep 21, 2021

Thanks Jie. I think the main point is that we want to allow collapsing of dimensions when tensors are indeed contiguous. When contiguous tensors are reordered, like NHWC, they could appear non-contiguous even when they are really contiguous. I think it's possible to do that by extending the indexing and predication logic.
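That point can be shown with a small standalone sketch (a hypothetical helper, not the actual indexing/predication code): a contiguity check that assumes dimensions are already in descending-stride order rejects an NHWC tensor whose axes are listed in NCHW order, while the same check passes once the dimensions are sorted by stride.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: a tensor is fully contiguous if, walking its
// dimensions from largest stride to smallest, each stride equals the
// product of the sizes of the dimensions to its right. With
// sort_by_stride = false we take the dimensions in the given order,
// i.e. the current descending-stride assumption.
bool fully_contiguous(
    const std::vector<size_t>& sizes,
    const std::vector<size_t>& strides,
    bool sort_by_stride) {
  std::vector<size_t> idx(sizes.size());
  std::iota(idx.begin(), idx.end(), 0);
  if (sort_by_stride) {
    std::sort(idx.begin(), idx.end(), [&](size_t a, size_t b) {
      return strides[a] > strides[b];
    });
  }
  size_t expected = 1;
  for (auto it = idx.rbegin(); it != idx.rend(); ++it) {
    if (strides[*it] != expected) {
      return false;
    }
    expected *= sizes[*it];
  }
  return true;
}
```

For a 1x2x3x4 NHWC tensor in NCHW axis order (sizes {1, 2, 3, 4}, strides {24, 1, 8, 2}), the unsorted check reports non-contiguous, while sorting by stride first reveals the tensor is really contiguous; extending indexing/predication to work with the sorted order is the idea above.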
