
Codegen tensor to support arbitrary stride order #1144

Open
jjsjann123 opened this issue Sep 21, 2021 · 2 comments
Labels
backlog low priority

Comments

@jjsjann123
Collaborator

🚀 Feature

Codegen tensors could support an arbitrary stride order, instead of assuming/requiring descending strides on domains.

Motivation

This request arises from our channels-last PR #1118, where we currently permute tensors in the integration code, which creates inconsistent semantics for tensor axes between the TorchScript IR and the codegen IR.
The motivation is to allow dimensional collapsing on non-contiguous tensors, which is not possible with codegen at the moment.

Pitch

As a demonstration, we would want to update two things in the client APIs to allow a user-specified stride index (I'm not arguing that stride_index should be maintained by TensorView):

  TensorView(
      TensorDomain* domain,
      DataType dtype,
      MemoryType mtype = MemoryType::Local,
      std::vector<size_t> stride_index = {}); // needs a default, since it follows a defaulted parameter

  class TORCH_CUDA_CU_API TensorViewBuilder {
   public:
    TensorViewBuilder& stride_index(const std::vector<size_t>& stride_index);
  };

We need to keep in mind that the end goal is simplified indexing & performance; hence we should minimize permutation as much as possible, for input tensors as well as output tensors. A few more things regarding that topic:

  1. Currently, when we infer an output tensor, we only return tensor sizes. This would likely change when we want to preserve & propagate the memory format from inputs to outputs, because we need to communicate the stride requirements on outputs as well.
  2. The concept of a stride index might make parsing/scheduling tricky. Without the assumption of a descending stride order on tensors, we now need to be careful when we merge tensor domains in scheduling, as well as in the general dimension manipulation we do in the parser, e.g. merging dimensions in layer-norm.
@jjsjann123
Collaborator Author

cc'ing @naoyam
Hopefully I'm capturing what you want to track/discuss in this comment, but feel free to yell at me otherwise 🙏

@naoyam
Collaborator

naoyam commented Sep 21, 2021

Thanks Jie. I think the main point is that we want to allow collapsing of dimensions when tensors are indeed contiguous. When contiguous tensors are reordered, like NHWC, they could appear non-contiguous even when they are really contiguous. I think it's possible to do that by extending the indexing and predication logic.
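That point can be shown with a small standalone sketch (a hypothetical helper, not the actual indexing/predication code): a contiguity check that assumes dimensions are already in descending-stride order rejects an NHWC tensor whose axes are listed in NCHW order, while the same check passes once the dimensions are sorted by stride.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: a tensor is fully contiguous if, walking its
// dimensions from largest stride to smallest, each stride equals the
// product of the sizes of the dimensions to its right. With
// sort_by_stride = false we take the dimensions in the given order,
// i.e. the current descending-stride assumption.
bool fully_contiguous(
    const std::vector<size_t>& sizes,
    const std::vector<size_t>& strides,
    bool sort_by_stride) {
  std::vector<size_t> idx(sizes.size());
  std::iota(idx.begin(), idx.end(), 0);
  if (sort_by_stride) {
    std::sort(idx.begin(), idx.end(), [&](size_t a, size_t b) {
      return strides[a] > strides[b];
    });
  }
  size_t expected = 1;
  for (auto it = idx.rbegin(); it != idx.rend(); ++it) {
    if (strides[*it] != expected) {
      return false;
    }
    expected *= sizes[*it];
  }
  return true;
}
```

For a 1x2x3x4 NHWC tensor in NCHW axis order (sizes {1, 2, 3, 4}, strides {24, 1, 8, 2}), the unsorted check reports non-contiguous, while sorting by stride first reveals the tensor is really contiguous; extending indexing/predication to work with the sorted order is the idea above.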
