Fusion executor #162

csarofeen · 2020-07-09T22:54:20Z

Reworking kernel.cpp to be more standalone, with more automation. Not sure yet whether we'll replace it or just work on the ideas and refactor kernel.cpp.

Still need:

Broadcast support in launch config inference
Global memory allocations (for reductions)
Thread binding constraints on symbolic values
Option to allocate outputs for you
Line numbers on kernel code for error output
Cache based on input run-time sizes
Allow splitting on symbolic integer

… into FusionExecutor

jjsjann123

LGTM as an initial PR to start the conversation / refactor.

We could use a merge and adopt some of the changes to kernel.cpp.

I like the new launch config inference using the existing fusion IR (even though I have some questions/concerns).

FYI: We have some tests that launch a kernel with a launch config that does not comply with the size of the bound axes (which may or may not be a legitimate test case), but they would not be doable using this interface.

jjsjann123 · 2020-07-10T08:12:51Z

torch/csrc/jit/codegen/cuda/kernel.cpp

@@ -341,6 +341,10 @@ void compileKernel(CudaKernel* entry) {
  std::string func_name;
  std::tie(func_name, code) = codeGeneration(entry->fusion());

+  std::ofstream out("output_ke.cu");


Can we try merging the dev branch and restoring kernel.cpp?

Done, need to see what you've changed to figure out if I need to take some new things in.

jjsjann123 · 2020-07-10T09:19:06Z

torch/csrc/jit/codegen/cuda/fusion_executor.h

+  void checkAndSet(
+      const int64_t incoming_val,
+      int& class_val,
+      std::string val) {


val is not used?

Thanks, meant to include it in the error message. Fixed.

jjsjann123 · 2020-07-10T09:28:56Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+  int nDims = tensor.ndimension();
+
+  c10::ScalarType dtype = tensor.scalar_type();
+  TensorArgAbstract* tensor_arg = getTensorArg(dtype, nDims);


nitpick: use unique_ptr to make ownership clear?
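
To illustrate the ownership point, here is a minimal sketch of a factory that returns `std::unique_ptr` instead of a raw pointer. `ArgBase`, `RankedArg`, and `makeArg` are hypothetical stand-ins, not the PR's `TensorArgAbstract` or `getTensorArg`; only the ownership pattern is meant to carry over.

```cpp
#include <memory>

// Hypothetical stand-ins for the argument types in the PR; only the
// ownership pattern matters here.
struct ArgBase {
  virtual ~ArgBase() = default;
  virtual int rank() const = 0;
};

template <int N>
struct RankedArg : ArgBase {
  int rank() const override { return N; }
};

// Returning std::unique_ptr states in the signature that the caller owns the
// result; a raw ArgBase* would leave that to convention.
std::unique_ptr<ArgBase> makeArg(int n_dims) {
  switch (n_dims) {
    case 0:
      return std::make_unique<RankedArg<0>>();
    case 1:
      return std::make_unique<RankedArg<1>>();
    case 2:
      return std::make_unique<RankedArg<2>>();
    default:
      return nullptr; // unsupported rank in this sketch
  }
}
```

A caller would write `auto arg = makeArg(2);` and never call `delete`; the object is released when `arg` goes out of scope.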

jjsjann123 · 2020-07-10T09:36:29Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+    }
+  }
+
+  // Infer bindings


I like it that we are inferring binding/launch_config from the codegen IR.
My question here is, would the same thing work for things like: dynamic shared memory size & global buffer size?

Yes, this is precisely how we will do it. I think everything will go together so it can use the same expression context.

jjsjann123 · 2020-07-10T09:39:22Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+    kah.appendPhilox(rand_offset);
+  }
+
+  // TODO SUPPORT GRID REDUCTIONS:


I was discussing with @naoyam on this topic earlier.

I think it would be cleaner if we had one aggregated global buffer request and let codegen split the buffer appropriately (the same way we would handle dynamic shared memory). That way we can infer the buffer size from the IR more easily using ExpressionEvaluator.

I don't think we need split buffers, but we can allocate individual buffers. Yes, I am planning to do this.

jjsjann123 · 2020-07-10T09:45:55Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+            ", within the tensor ",
+            tv,
+            " to set launch bounds but could not.");
+        lp.bind(val.value(), id->parallel_method());


I'm asking a future-ish question here.

If we are to support broadcast of a size-1 dimension, does that mean we could have an IterDomain of size 1 bound to a thread dimension, while a non-broadcasted tensor bound to the same thread dimension would have a different size?

We're going to have to figure out how that will work. I don't have the logic in for parallelized broadcast right now. I think we'll just exclude size=1 dims from the inference.

csarofeen · 2020-07-10T15:06:39Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+    }
+  }
+
+  // Infer bindings


Yes, this is precisely how we will do it. I think everything will go together so it can use the same expression context.

csarofeen · 2020-07-10T15:07:17Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+            ", within the tensor ",
+            tv,
+            " to set launch bounds but could not.");
+        lp.bind(val.value(), id->parallel_method());


We're going to have to figure out how that will work. I don't have the logic in for parallelized broadcast right now. I think we'll just exclude size=1 dims from the inference.

csarofeen · 2020-07-10T15:07:44Z

torch/csrc/jit/codegen/cuda/fusion_executor.cpp

+    kah.appendPhilox(rand_offset);
+  }
+
+  // TODO SUPPORT GRID REDUCTIONS:


I don't think we need split buffers, but we can allocate individual buffers. Yes, I am planning to do this.

csarofeen · 2020-07-10T15:09:09Z

torch/csrc/jit/codegen/cuda/fusion_executor.h

+  void checkAndSet(
+      const int64_t incoming_val,
+      int& class_val,
+      std::string val) {


Thanks, meant to include it in the error message. Fixed.

csarofeen · 2020-07-10T15:09:24Z

torch/csrc/jit/codegen/cuda/kernel.cpp

@@ -341,6 +341,10 @@ void compileKernel(CudaKernel* entry) {
  std::string func_name;
  std::tie(func_name, code) = codeGeneration(entry->fusion());

+  std::ofstream out("output_ke.cu");


Done, need to see what you've changed to figure out if I need to take some new things in.

csarofeen · 2020-07-10T15:09:48Z

torch/csrc/jit/codegen/cuda/kernel.cpp

@@ -607,12 +615,6 @@ void runTestKernel(
  // from I/O expected by the generated CUDA kernel.
  for (auto& input : inputs) {
    if (input.isTensor()) {


I removed the below as this is checked elsewhere (partially above)

tlemo

Looks like a nice step in the right direction.

tlemo · 2020-07-10T21:23:30Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

+
+class KernelArgumentHolder {
+ public:
+  ~KernelArgumentHolder() = default;


nit: is this needed?

tlemo · 2020-07-10T21:25:32Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

+
+  void appendArgs(const std::vector<at::Tensor>& tensors);
+
+  void appendPhilox(uint64_t rand_offset);


add a comment explaining what this does (and what a Philox is?)

Who doesn't know about Philox?! Will thoroughly comment when we're closer to usability.

changed var name to be more clear

tlemo · 2020-07-10T21:28:30Z

torch/csrc/jit/codegen/cuda/executor_launch_params.h

+  }
+
+  unsigned int bdimx() const {
+    return (unsigned int)bdimx_ == -1 ? 1 : bdimx_;


why the cast? did you mean
(unsigned int)(bdimx_ == -1 ? 1 : bdimx_) ?
or even better
static_cast<unsigned int>(bdimx_ == -1 ? 1 : bdimx_)

All syntax options here work, changed to static_cast.
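
For readers following the precedence question, the sketch below puts both spellings side by side. `Dims` is a hypothetical stand-in for the launch-parameter class, and the member type `int` is an assumption. Both functions return the same values; the difference is which expression the cast applies to.

```cpp
// Hypothetical stand-in for the launch-parameter class; -1 means "unset".
struct Dims {
  int bdimx_ = -1;

  // Original form: the C-style cast binds to bdimx_ only, so the comparison
  // happens between unsigned values (-1 becomes UINT_MAX on both sides). The
  // result is correct, but only through implicit conversions. Compilers may
  // emit a sign-compare warning here.
  unsigned int castOperandOnly() const {
    return (unsigned int)bdimx_ == -1 ? 1 : bdimx_;
  }

  // Suggested form: pick the value first, then convert once, explicitly.
  unsigned int castWholeExpression() const {
    return static_cast<unsigned int>(bdimx_ == -1 ? 1 : bdimx_);
  }
};
```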

tlemo · 2020-07-10T21:29:57Z

torch/csrc/jit/codegen/cuda/executor_launch_params.h

+        " that is ",
+        incoming_val,
+        ". Cannot create negative threads.");
+    if (class_val == -1 || class_val == 1) {


what is the meaning of -1 vs 1 ? maybe create symbolic constants for them?

Added a todo to convert them to c10::optional. Though even that seems like a relatively heavy hammer.

Using -1 and 1 as special values may be ok, but they could use good symbolic names

NITPICK: Is -1 even necessary?
In cases where we access un-initialized binding, we map -1 to 1 anyway. So maybe we can just stick with 1 to begin with as default binding and not use c10::optional?

Good point Jie, I was actually using both -1 and 1 as wildcard values. However, what I should be doing is using -1 as a wildcard value, and not evaluating broadcast dimensions. The broadcast dimensions were setting params to 1, which would be overwritten later, but this isn't useful information for checking self-consistency in the bindings. I switched to -1 only, and if we get a value of 1 we assume that's a real inferred value.
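
The resolution above (a single -1 wildcard, with 1 treated as a real inferred value) could be sketched roughly as follows. `BoundDim`, `kUnbound`, `bind`, and `launchSize` are hypothetical names chosen for this illustration and are not taken from the PR; the exceptions stand in for whatever assertion macro the real code uses.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical sketch, not the PR's LaunchParams: a single thread dimension
// that starts unbound and must stay consistent once a size is inferred.
class BoundDim {
 public:
  static constexpr int64_t kUnbound = -1; // named wildcard instead of a bare -1

  // Bind a size inferred from some tensor axis. A value of 1 is treated as a
  // real inferred size, so binding 1 and then 32 is reported as a conflict.
  void bind(int64_t incoming, const std::string& name) {
    if (incoming <= 0) {
      throw std::invalid_argument(name + ": size must be positive");
    }
    if (value_ == kUnbound) {
      value_ = incoming;
    } else if (value_ != incoming) {
      throw std::runtime_error(
          name + ": already bound to " + std::to_string(value_) +
          ", cannot rebind to " + std::to_string(incoming));
    }
  }

  // Launch with 1 when nothing was bound.
  unsigned int launchSize() const {
    return static_cast<unsigned int>(value_ == kUnbound ? 1 : value_);
  }

 private:
  int64_t value_ = kUnbound;
};
```

Under this scheme, broadcast axes would simply never call `bind`, so they cannot mask a genuine mismatch between two non-broadcast axes.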

tlemo · 2020-07-10T21:32:24Z

torch/csrc/jit/codegen/cuda/executor_utils.h

+    int device);
+
+struct NvrtcFunction {
+ public:


nit: not needed

tlemo · 2020-07-10T21:50:38Z

torch/csrc/jit/codegen/cuda/executor.h

+  executor_utils::NvrtcFunction compiled_kernel;
+
+  // State of the fusion that's important
+  bool has_random;


tlemo · 2020-07-10T21:51:08Z

torch/csrc/jit/codegen/cuda/executor.h

+ private:
+  // TODO: Make pointer to const, will take some const fixing in codegenerator
+  // to work.
+  Fusion fusion_;


So we don't mutate the input fusion? Nice! (the comment needs to be updated, there's no pointer in here)

Removed for now, still need to consider const model, but taking our own copy and not modifying it is probably enough.

tlemo · 2020-07-10T21:52:49Z

torch/csrc/jit/codegen/cuda/executor.h

+
+  std::string getKernel();
+
+  void compileFusion(Fusion* fusion);


we can use const Fusion* fusion!

Can't yet, need to go through Fusion and fix its constness model. This is a goal, but not high priority at the moment.

since we make a copy, why not? (the copy constructor takes a const Fusion&)
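
A miniature of the argument being made here, with hypothetical `Graph` and `Executor` types standing in for `Fusion` and `FusionExecutor`. Whether the same signature works for the real `Fusion` depends on its constness model, which is the open question in this thread.

```cpp
// Hypothetical miniature of the pattern under discussion: the executor keeps
// its own copy, so the caller's object can be passed as pointer-to-const.
struct Graph {
  int num_exprs = 0;
};

class Executor {
 public:
  void compile(const Graph* graph) {
    graph_ = *graph;       // copying only needs const access to the source
    graph_.num_exprs += 1; // later mutation touches the private copy only
  }

  int exprs() const { return graph_.num_exprs; }

 private:
  Graph graph_;
};
```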

tlemo · 2020-07-10T21:55:20Z

torch/csrc/jit/codegen/cuda/executor.cpp

+    }
+  }
+
+  LaunchParams lp;


nit: launch_params instead of the cryptic lp?

tlemo · 2020-07-10T21:55:44Z

torch/csrc/jit/codegen/cuda/executor.cpp

+
+  TORCH_INTERNAL_ASSERT(!outputs.empty(), "No outputs set for test kernel.");
+
+  KernelArgumentHolder kah;


kernel arguments instead of kah ?

csarofeen · 2020-07-10T18:14:59Z

torch/csrc/jit/codegen/cuda/executor.h

+
+  std::string KernelName() const {
+    std::stringstream ss;
+    ss << "kernel"; // << fusion_id;


uncomment fusion_id

csarofeen · 2020-07-10T18:19:12Z

torch/csrc/jit/codegen/cuda/executor_launch_params.h

+  unsigned int smem_ = 0;
+
+  // TODO: Fill in output sizes
+  std::vector<std::vector<unsigned int>> output_sizes;


uint64_t or int64_t

csarofeen · 2020-07-11T12:51:53Z

torch/csrc/jit/codegen/cuda/executor.h

+  int device = 0;
+};
+
+class TORCH_CUDA_API FusionExecutor {


Will add comments when it's closer to being usable.

csarofeen · 2020-07-11T12:54:01Z

torch/csrc/jit/codegen/cuda/executor.h

+
+  std::string getKernel();
+
+  void compileFusion(Fusion* fusion);


Can't yet, need to go through Fusion and fix its constness model. This is a goal, but not high priority at the moment.

csarofeen · 2020-07-11T12:55:54Z

torch/csrc/jit/codegen/cuda/executor.h

+
+  LaunchParams computeLaunchParams(const at::ArrayRef<IValue>& aten_inputs);
+
+  std::vector<at::Tensor> runFusion(


If you like kernel then why not compileKernel and runKernel? We are doing both, just lazy to remember name changes from one to the next.

csarofeen · 2020-07-11T13:00:05Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

+  // in the buffer
+  void** getBuffer();
+
+  void appendArgs(const c10::ArrayRef<c10::IValue>& args);


I always liked append for pushing multiple values, but can change it to be more consistent. Always forget about standard containers.

csarofeen · 2020-07-11T13:00:37Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

+
+  void appendArgs(const std::vector<at::Tensor>& tensors);
+
+  void appendPhilox(uint64_t rand_offset);


Who doesn't know about Philox?! Will thoroughly comment when we're closer to usability.

csarofeen · 2020-07-11T13:01:40Z

torch/csrc/jit/codegen/cuda/executor_launch_params.h

+  }
+
+  unsigned int bdimx() const {
+    return (unsigned int)bdimx_ == -1 ? 1 : bdimx_;


All syntax options here work, changed to static_cast.

csarofeen · 2020-07-11T13:02:37Z

torch/csrc/jit/codegen/cuda/executor_launch_params.h

+        " that is ",
+        incoming_val,
+        ". Cannot create negative threads.");
+    if (class_val == -1 || class_val == 1) {


Added a todo to convert them to c10::optional. Though even that seems like a relatively heavy hammer.

csarofeen · 2020-07-11T13:03:15Z

torch/csrc/jit/codegen/cuda/executor_utils.cpp

+namespace fuser {
+namespace cuda {
+namespace executor_utils {
+std::string KernelPreamble() {


surprised clang format didn't do that...

tlemo · 2020-07-11T17:34:07Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

@@ -103,26 +105,26 @@ struct TensorArg : public TensorArgAbstract {
 };

 template <typename T>
-TensorArgAbstract* getTensorArg(int nDims) {
+std::unique_ptr<TensorArgAbstract> getTensorArg(int nDims) {
  switch (nDims) {
    case (0):


case (0) vs case 0 :)

tlemo · 2020-07-11T17:36:35Z

torch/csrc/jit/codegen/cuda/executor.h

+
+  LaunchParams computeLaunchParams(const at::ArrayRef<IValue>& aten_inputs);
+
+  std::vector<at::Tensor> runFusion(


compile implies a transformation: we compile a fusion into a kernel. The names should reflect that (e.g. you compile C++ code and run the binary; if someone said they run C++ code, you'd probably think of an interpreter).

tlemo · 2020-07-11T17:37:40Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.h

+  // in the buffer
+  void** getBuffer();
+
+  void appendArgs(const c10::ArrayRef<c10::IValue>& args);


consistency with standard containers is nice, but self-consistency is what I was going after (I think multiple overloads, all named either push or append, would be fine)

… inferred from input dimensions alone.

…messages.

… thread dimension.

…uctions.

…r we process.

…rt tests to fusion executor.

tlemo

I did a quick pass and left some small comments.

This would be easier to review if it was split into multiple parts, but I guess we're past that point now?

tlemo · 2020-07-16T20:55:33Z

test/cpp/jit/test_gpu.cpp

@@ -25,6 +27,7 @@
 namespace torch {
 namespace jit {

+using namespace torch::jit::fuser;


probably not intentional?

tlemo · 2020-07-16T20:56:29Z

test/cpp/jit/test_gpu.cpp

-  prog.setFusionPtr(std::make_unique<Fusion>());
-  Fusion* fusion = prog.fusion();
-  FusionGuard fg(fusion);
+  Fusion fusion;


yay! (I like this pattern better)

tlemo · 2020-07-16T20:57:57Z

torch/csrc/jit/codegen/cuda/dispatch.h

@@ -73,6 +73,7 @@ class UnaryOp;
 class BinaryOp;
 class TernaryOp;
 class ReductionOp;
+class GridReduction;


please update the IrGraphGenerator test case to include this node type

This is done.

tlemo · 2020-07-16T20:59:11Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.cpp

+    c10::ScalarType dtype,
+    int nDims) {
+  switch (dtype) {
+    case (c10::ScalarType::Float):


please remove the parentheses

tlemo · 2020-07-16T21:00:11Z

torch/csrc/jit/codegen/cuda/executor_kernel_arg.cpp

+      "Tried to push an arg to run in a fused kernel, expected a scalar but got, ",
+      val);
+  switch (val.toScalar().type()) {
+    case (c10::ScalarType::Double):


tlemo · 2020-07-16T21:24:56Z

torch/csrc/jit/codegen/cuda/lower2device.h

 private:
-  Fusion* const fusion_ = nullptr;
+  std::vector<Allocate*> global_allocations_;


these deserve some comments

tlemo · 2020-07-16T21:25:47Z

torch/csrc/jit/codegen/cuda/lower2device.h

+  std::vector<Allocate*> global_allocations_;
+  std::vector<Allocate*> sync_allocations_;
+
+  void lower();


move lower() to a dedicated private section for member functions (please don't mix data members and methods)

tlemo · 2020-07-16T21:27:22Z

torch/csrc/jit/codegen/cuda/type.cpp

+    default:
+      break;
+  }
+  TORCH_INTERNAL_ASSERT(false, "No data type found for scalar type.");


nit: the default path can be moved inside the switch/default
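
A sketch of what the nit suggests, using hypothetical enums (`ScalarKind`, `DataKind`) and a plain exception in place of `TORCH_INTERNAL_ASSERT` so that it compiles on its own:

```cpp
#include <stdexcept>

// Hypothetical enums standing in for the scalar type and the fuser data type.
enum class ScalarKind { Float, Int, Bool };
enum class DataKind { Float, Int };

// The failure lives in the default label rather than after the switch, so
// every path through the switch either returns or throws.
DataKind toDataKind(ScalarKind kind) {
  switch (kind) {
    case ScalarKind::Float:
      return DataKind::Float;
    case ScalarKind::Int:
      return DataKind::Int;
    default:
      throw std::invalid_argument("No data type found for scalar type.");
  }
}
```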

tlemo · 2020-07-16T21:30:12Z

torch/csrc/jit/codegen/cuda/mutator.cpp

 }

 Statement* OptOutMutator::mutate(Split* s) {
  IterDomain* ot = static_cast<IterDomain*>(mutateAsVal(s->outer()));
  IterDomain* inr = static_cast<IterDomain*>(mutateAsVal(s->inner()));
  IterDomain* in = static_cast<IterDomain*>(mutateAsVal(s->in()));
-  Int* fact = static_cast<Int*>(mutateAsVal(s->factor()));
+  Val* fact = static_cast<Val*>(mutateAsVal(s->factor()));


tlemo · 2020-07-16T21:31:56Z

torch/csrc/jit/codegen/cuda/lower_index.h

@@ -10,30 +10,39 @@ namespace torch {
 namespace jit {
 namespace fuser {

-class TORCH_CUDA_API IndexLowering : public OptOutMutator {
+class TORCH_CUDA_API IndexLowering : public OptInDispatch {
 private:


please use separate private sections for data members and methods (private methods, then private data)

tlemo · 2020-07-17T17:48:09Z

test/cpp/jit/test_gpu.cpp

@@ -490,10 +483,6 @@ void testGPU_FusionCopy() {

    tv3->axis(0)->parallelize(ParallelType::BIDx);
    tv3->axis(-1)->parallelize(ParallelType::TIDx);
-


why is this removed? it was added to make sure the launch config is copied correctly

Launch config is being removed. Done in follow up PR.

tlemo · 2020-07-17T17:51:25Z

test/cpp/jit/test_gpu.cpp


  at::Tensor tv2_ref = input2 + 2.0;
  at::Tensor output_ref = input1 + tv2_ref;

-  TORCH_CHECK(output_ref.equal(output));
+  TORCH_CHECK(output_ref.equal(outputs[0]));
 }

 void testGPU_FusionCopy() {


pls add coverage for GridReduction (same in FusionMove test case)

Fusion copy and fusion move tests are not exhaustively testing the IR. If you think that needs to be covered please open an issue and we can hit these tests harder.

tlemo · 2020-07-17T20:20:34Z

torch/csrc/jit/codegen/cuda/arith.cpp

-              new Int(0), new Int(1), ParallelType::Serial, false, false, true);
-        return dom;
-      });
+  for (size_t dim_i = 0; dim_i < out_domain.size(); dim_i++) {


add a comment here?

tlemo · 2020-07-17T20:22:07Z

torch/csrc/jit/codegen/cuda/arith.cpp

-Val* newOutputVal(const std::vector<Val*>& vals) {
-  TORCH_INTERNAL_ASSERT(
-      !vals.empty(), "Cannot promote values if there aren't any.");
+std::vector<Val*> maybeBroadcast(const std::vector<Val*>& vals) {


is this a local helper?

It's in an anonymous namespace, so yes.

thanks. it's really hard to tell while reviewing a PR (which is a reason I prefer static instead of anonymous namespaces)
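
For context on the preference stated above: both spellings give a function internal linkage, and they differ only in where the reader sees it. The function names below are made up for the illustration.

```cpp
namespace {
// Internal linkage via an unnamed namespace. In a long file the opening
// brace may be far from the function, so a diff hunk might not show it.
int doubled(int x) {
  return 2 * x;
}
} // namespace

// Internal linkage stated on the declaration itself, visible in any hunk
// that includes the signature.
static int tripled(int x) {
  return 3 * x;
}

// Ordinary external-linkage function using both helpers.
int combined(int x) {
  return doubled(x) + tripled(x);
}
```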

tlemo · 2020-07-17T20:25:23Z

torch/csrc/jit/codegen/cuda/index_compute.cpp

@@ -127,27 +127,31 @@ TensorIndex* Index::getGlobalProducerIndex(
  std::vector<Val*> p_inds;
  auto p_root = TensorDomain::noReductions(producer->getRootDomain());
  // Number of root dims that are broadcasted
-  size_t bcast_dims = 0;
+  size_t implicit_bcast_dims = 0;
  {


maybe factor this out as a separate helper function?

Index compute may deserve a refactor, but it's probably not what needs one most at this point.

tlemo · 2020-07-17T20:32:09Z

torch/csrc/jit/codegen/cuda/ir_nodes.cpp

+}
+
+c10::optional<ParallelType> NamedScalar::getParallelDim() const {
+  if (stringifyThreadSize(ParallelType::TIDx).compare(name()) == 0) {


this could be simplified as an unordered_map<string, ParallelType> lookup

We did use an unordered map, this was explicitly changed by FB due to an internal build failure.
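
For reference, the map-based lookup being discussed might look like the sketch below. `ParallelKind` and `lookupParallelDim` are hypothetical, and the string keys (`blockDim.x`, `gridDim.x`, and so on) are an assumption about what `stringifyThreadSize` produces. The reply above notes that the project moved away from this form because of an internal build failure, so this shows the suggestion only, not the merged code.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical enum standing in for ParallelType.
enum class ParallelKind { BIDx, BIDy, BIDz, TIDx, TIDy, TIDz };

// Returns true and sets `out` when `name` is a known thread-size scalar;
// leaves `out` untouched otherwise. The key strings are assumed.
bool lookupParallelDim(const std::string& name, ParallelKind& out) {
  static const std::unordered_map<std::string, ParallelKind> table = {
      {"gridDim.x", ParallelKind::BIDx},
      {"gridDim.y", ParallelKind::BIDy},
      {"gridDim.z", ParallelKind::BIDz},
      {"blockDim.x", ParallelKind::TIDx},
      {"blockDim.y", ParallelKind::TIDy},
      {"blockDim.z", ParallelKind::TIDz},
  };
  auto it = table.find(name);
  if (it == table.end()) {
    return false;
  }
  out = it->second;
  return true;
}
```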

tlemo · 2020-07-17T20:32:40Z

torch/csrc/jit/codegen/cuda/ir_nodes.cpp

+}
+
+c10::optional<ParallelType> NamedScalar::getParallelIndex() const {
+  if (stringifyThread(ParallelType::TIDx).compare(name()) == 0) {


tlemo · 2020-07-17T20:33:06Z

torch/csrc/jit/codegen/cuda/kernel_cache.cpp

@@ -11,16 +11,15 @@ namespace fuser {
 namespace cuda {

 at::optional<CudaKernel*> CudaKernelCache::getKernelPtr(
-    const at::ArrayRef<c10::IValue> inputs,
-    const std::vector<int64_t>& broadcasted_shape) {
+    const at::ArrayRef<c10::IValue> inputs) {


const ref ?

tlemo · 2020-07-17T20:33:36Z

torch/csrc/jit/codegen/cuda/kernel_cache.h

@@ -107,8 +104,7 @@ class CudaKernelCache {
  CudaKernelCache() = default;

  at::optional<CudaKernel*> getKernelPtr(
-      const at::ArrayRef<c10::IValue> inputs,
-      const std::vector<int64_t>& broadcasted_shape);
+      const at::ArrayRef<c10::IValue> inputs);


kernel cache will get rewritten or removed in following prs.

tlemo · 2020-07-17T20:34:39Z

torch/csrc/jit/codegen/cuda/lower2device.cpp

@@ -30,20 +81,47 @@ std::vector<Expr*> GPULower::getLoweredExprs() {

  auto indexed_loops = IndexLowering::getIndexedExprs(fusion_, unrolled_loops);

-  return indexed_loops;
+  lowered_exprs_ = indexed_loops;


I mean, why not assign directly to lowered_exprs_ ?

tlemo

Overall it looks good to me.

The only part I'm not sure about is the FusionExecutor interface: why do we need a stateful, yet reusable executor?

tlemo · 2020-07-17T21:30:09Z

torch/csrc/jit/codegen/cuda/executor_utils.h

+// Include all the functions we might need in generated code
+std::string kernelPreamble();
+
+bool validateKernelArgTensor(


why are all these functions extern?

what would you prefer them to be?

tlemo · 2020-07-17T21:34:06Z

torch/csrc/jit/codegen/cuda/executor.h

@@ -0,0 +1,83 @@
+#pragma once


nit: empty line after #pragma once

tlemo · 2020-07-17T21:34:47Z

torch/csrc/jit/codegen/cuda/executor.h

+
+class TORCH_CUDA_API FusionExecutor {
+ public:
+  FusionExecutor() {}


nit: = default?

tlemo · 2020-07-17T21:36:44Z

torch/csrc/jit/codegen/cuda/executor.h

+class TORCH_CUDA_API FusionExecutor {
+ public:
+  FusionExecutor() {}
+  FusionExecutor(CompileOptions options) : options_(options) {}


why aren't CompileOptions passed to compileFusion() ?

Yes, I changed this in: #182

tlemo · 2020-07-17T21:38:21Z

torch/csrc/jit/codegen/cuda/executor.h

+  int device = 0;
+};
+
+class TORCH_CUDA_API FusionExecutor {


A FusionExecutor seems to be usable for multiple compilations / runs, right? At the same time, it's stateful. What is the rationale for this approach, as opposed to a more "functional" design where compile(fusion) returns a kernel object that can then be executed?
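
One way to picture the "functional" alternative raised in this comment is sketched below. Everything here is hypothetical (`FusionDesc`, `CompiledKernel`, and a `compile` free function that adds a constant on the host); it only shows the shape of the interface, not how nvrtc compilation or a device launch would be wired in.

```cpp
#include <vector>

// Hypothetical description of the work to compile.
struct FusionDesc {
  int addend;
};

// Immutable result of compilation. It carries everything needed to run, so
// there is no executor state to reset between compilations.
class CompiledKernel {
 public:
  explicit CompiledKernel(int addend) : addend_(addend) {}

  // A real kernel would launch on the device; this stand-in runs on the host.
  std::vector<int> run(const std::vector<int>& inputs) const {
    std::vector<int> outputs;
    outputs.reserve(inputs.size());
    for (int v : inputs) {
      outputs.push_back(v + addend_);
    }
    return outputs;
  }

 private:
  int addend_;
};

// compile(fusion) -> kernel: a pure function from description to artifact.
CompiledKernel compile(const FusionDesc& fusion) {
  return CompiledKernel(fusion.addend);
}
```

The trade-off is that anything cached across runs (launch parameters keyed on input sizes, for example) would have to live in the kernel object or in a separate cache rather than in the executor.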

tlemo · 2020-07-17T21:40:37Z

torch/csrc/jit/codegen/cuda/executor.cpp

+
+// Check if a value is already bound, if so validate we're trying to bind to the
+// same value
+void safeBind(


csarofeen#113 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#114 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#115 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#116 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#117 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#118 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#119 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198 csarofeen#120 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#121 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#122 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#123 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#124 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#125 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317 csarofeen#126 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943 csarofeen#127 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#128 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#129 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#130 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#131 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#132 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#133 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#134 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#135 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#136 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#137 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#138 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#139 0x3ffa2c8eddd in method_vectorcall 
Objects/classobject.c:53 csarofeen#140 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#141 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#142 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#143 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#144 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#145 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#146 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#147 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#148 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#149 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#150 0x3ffa2c8ad17 in _PyObject_Call Objects/call.c:305 csarofeen#151 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317 csarofeen#152 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943 csarofeen#153 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#154 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#155 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#156 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#157 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#158 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#159 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#160 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#161 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#162 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#163 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#164 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#165 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#166 0x3ffa2df00a9 
in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#167 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#168 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#169 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#170 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#171 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#172 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#173 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#174 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#175 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317 csarofeen#176 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943 csarofeen#177 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#178 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#179 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#180 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#181 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#182 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#183 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#184 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#185 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#186 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#187 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#188 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#189 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#190 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#191 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#192 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 
csarofeen#193 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#194 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#195 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#196 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#197 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317 csarofeen#198 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943 csarofeen#199 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#200 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#201 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#202 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#203 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#204 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#205 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#206 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#207 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#208 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#209 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#210 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#211 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#212 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#213 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#214 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#215 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#216 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#217 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#218 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#219 0x3ffa2c8a695 in _PyObject_FastCallDictTstate 
Objects/call.c:153 csarofeen#220 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#221 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#222 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#223 0x3ffa2df0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#224 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#225 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#226 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#227 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#228 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#229 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#230 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#231 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#232 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317 csarofeen#233 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943 csarofeen#234 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#235 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#236 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#237 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#238 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#239 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#240 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#241 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#242 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#243 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#244 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#245 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#246 0x3ffa2c8eddd in 
method_vectorcall Objects/classobject.c:53 csarofeen#247 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#248 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#249 0x3ffa2e05447 in call_function Python/ceval.c:5891 csarofeen#250 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#251 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#252 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#253 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#254 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#255 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#256 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#257 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215 0x61000013d790 is located 80 bytes inside of 192-byte region [0x61000013d740,0x61000013d800) freed by thread T0 here: #0 0x3ffa3237de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160 #1 0x3ff8e7e3221 in c10::TensorImpl::~TensorImpl() /home/user/pytorch/c10/core/TensorImpl.cpp:75 previously allocated by thread T0 here: #0 0x3ffa323734f in operator new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x3ff4aeeb3d1 in c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_nul l_type<c10::TensorImpl> >::make<c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKeySet&, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::S torageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKeySet&, caffe2::TypeMeta&) 
/home/user/pytorch/c10/util/intrusive_ptr.h:498 csarofeen#2 0x3ff76f79e17 (/home/user/pytorch/build/lib.linux-s390x-cpython-310/torch/lib/libtorch_cpu.so+0x2fb79e17) SUMMARY: AddressSanitizer: heap-use-after-free /home/user/pytorch/c10/core/SymInt.h:154 in c10::SymInt::is_heap_allocated() const Shadow bytes around the buggy address: 0x100c2000027aa0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd 0x100c2000027ab0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x100c2000027ac0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd 0x100c2000027ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x100c2000027ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd =>0x100c2000027af0: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x100c2000027b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 0x100c2000027b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100c2000027b20: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00 0x100c2000027b30: 00 00 00 00 04 fa fa fa fa fa fa fa fa fa fa fa 0x100c2000027b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==1115867==ABORTING ``` </details> <details> <summary>Additional backtraces (not full)</summary> Memory deallocation: ``` #0 operator delete (ptr=0x61000013d740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160 #1 0x000003ffa77e3222 in c10::TensorImpl::~TensorImpl (this=0x61000013d740) at /home/user/pytorch/c10/core/TensorImpl.cpp:75 csarofeen#2 0x000003ff63e76e8c in 
c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_ (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:291 csarofeen#3 0x000003ff63e76910 in c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:370 csarofeen#4 0x000003ff63e67240 in at::TensorBase::~TensorBase (this=0x3ffd7ec8230) at /home/user/pytorch/aten/src/ATen/core/TensorBase.h:80 csarofeen#5 0x000003ff63e85ee0 in at::Tensor::~Tensor (this=0x3ffd7ec8230) at aten/src/ATen/core/TensorBody.h:90 csarofeen#6 0x000003ff63f67304 in resize__functionalization (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:173 csarofeen#7 0x000003ff63f89258 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) ( this=0x6030000390a0, args=..., args=..., args=..., args=...) 
at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13 csarofeen#8 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) (functor=0x6030000390a0, dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480 csarofeen#9 0x000003ff6aca560a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > ( unboxed_kernel_func=0x3ff63f88a80 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso r const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>, functor=0x6030000390a0, dispatchKeySet=..., args=..., 
args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50 csarofeen#10 0x000003ff6aca715c in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1b28, opHandle=..., dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:96 csarofeen#11 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const ( this=0x3ff919400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656 csarofeen#12 0x000003ff6a82006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const ( this=0x3ff919a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492 csarofeen#13 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144 csarofeen#14 0x000003ff861d5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) 
at aten/src/ATen/RedispatchFunctions.h:2847 csarofeen#15 0x000003ff861b579e in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:401 ``` Memory access: ``` #0 c10::SymInt::maybe_as_int (this=0x61000013d790) at /home/user/pytorch/c10/core/SymInt.h:215 #1 0x000003ff734d0a6e in c10::SymInt::sym_eq (this=0x61000013d790, sci=...) at /home/user/pytorch/c10/core/SymInt.cpp:69 csarofeen#2 0x000003ff5f6ab0be in c10::SymInt::operator== (this=0x61000013d790, o=...) at /home/user/pytorch/c10/core/SymInt.h:177 csarofeen#3 0x000003ff5f6aaede in std::__equal<false>::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1162 csarofeen#4 0x000003ff5f6aae4c in std::__equal_aux1<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1211 csarofeen#5 0x000003ff5f6aae06 in std::__equal_aux<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1219 csarofeen#6 0x000003ff5f6aad98 in std::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1556 csarofeen#7 0x000003ff2ff3c772 in c10::ArrayRef<c10::SymInt>::equals (this=0x3ffed7c9900, RHS=...) at /home/user/pytorch/c10/util/ArrayRef.h:188 csarofeen#8 0x000003ff31891bc2 in c10::operator!=<c10::SymInt> (a1=..., a2=...) at /home/user/pytorch/c10/util/ArrayRef.h:341 csarofeen#9 0x000003ff51eb5800 in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) 
at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:408 csarofeen#10 0x000003ff51ee59c8 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c 10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (this=0x6030007dca40, args=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13 csarofeen#11 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt >, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional< c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tenso r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...) 
at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480 csarofeen#12 0x000003ff369a512a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > ( unboxed_kernel_func=0x3ff51ee51f0 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::Ar rayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKern el*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>, functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50 csarofeen#13 0x000003ff369a6e90 in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1bc8, opHandle=..., dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90 csarofeen#14 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::Arr ayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const ( this=0x3ff5d6400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) 
at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656 csarofeen#15 0x000003ff3652006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const ( this=0x3ff5d6a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492 csarofeen#16 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144 csarofeen#17 0x000003ff51ed5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) at aten/src/ATen/RedispatchFunctions.h:2847 csarofeen#18 0x000003ff51ebbb68 in torch::autograd::VariableType::(anonymous namespace)::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:243 ``` </details> Pull Request resolved: pytorch#101064 Approved by: https://github.com/Skylion007, https://github.com/albanD

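The lifetime problem described in the commit message below can be sketched in isolation. This is a minimal illustration, not the PyTorch code: `Argument`, `Schema`, `make_schema`, and `argument_names` are hypothetical stand-ins, and it assumes the owning object is a temporary returned by value. The unsafe pattern is left in a comment so the sketch itself stays well-defined.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real c10 types.
struct Argument {
  std::string name;
};

class Schema {
 public:
  explicit Schema(std::vector<Argument> args) : args_(std::move(args)) {}

  // Returns a reference to a member: valid only while this Schema is alive.
  const std::vector<Argument>& arguments() const { return args_; }

 private:
  std::vector<Argument> args_;
};

// Assumption: the owner is returned by value, so each call yields a temporary.
Schema make_schema() {
  std::vector<Argument> args{Argument{"self"}, Argument{"other"}};
  return Schema(std::move(args));
}

// Unsafe pattern (comment only):
//   const auto& args = make_schema().arguments();
//   // The temporary Schema dies at the end of that statement. Binding a
//   // reference to the result of a member function call does not extend
//   // the temporary's lifetime, so `args` now dangles.
//   for (const auto& a : args) { /* heap-use-after-free */ }

std::vector<std::string> argument_names() {
  // Safe pattern: keep the owner alive in a named local first.
  const Schema schema = make_schema();
  const auto& args = schema.arguments();

  std::vector<std::string> names;
  for (const auto& a : args) {
    names.push_back(a.name);
  }
  return names;
}
```

Holding the owner in a named local (or copying the vector) keeps the storage alive for as long as the reference is used; which of the two fits best depends on how expensive the copy is.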
`arguments()` returns a vector that is a member of the object returned by the `schema()` call. When the object returned by `schema()` is destroyed, the vector is deallocated as well; its lifetime isn't extended.

This issue was detected while running `pytest -v test/mobile/test_lite_script_type.py -k test_nest_typing_namedtuple_custom_classtype` with ASAN.

<details> <summary>ASAN output</summary> ``` ==1134126==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0005a5790 at pc 0x03ff844488d8 bp 0x03fff584afe8 sp 0x03fff584afd8 READ of size 8 at 0x60d0005a5790 thread T0 #0 0x3ff844488d7 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&) /usr/lib/gcc/s390x-i bm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028 #1 0x3ff8444293f in std::vector<c10::Argument, std::allocator<c10::Argument> >::begin() const /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_vector.h:821 csarofeen#2 0x3ff84d807d1 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:617 csarofeen#3 0x3ff84d80305 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604 csarofeen#4 0x3ff84856871 in pybind11::detail::type_caster<c10::IValue, void>::cast(c10::IValue, pybind11::return_value_policy, pybind11::handle) /home/user/pytorch/torch/csrc/jit/python/pybind.h:138 csarofeen#5 0x3ff85318191 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is _method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me thod const&, pybind11::sibling const&, pybind11::arg 
const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const /home/user/pytorch/cmake/../third_party/pybin d11/include/pybind11/pybind11.h:249 csarofeen#6 0x3ff85317cfd in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is _method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me thod const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) /home/user/pytorch/cmake/../third_party/pybind11/incl ude/pybind11/pybind11.h:224 csarofeen#7 0x3ff82ee52e9 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929 csarofeen#8 0x3ffab002903 in cfunction_call Objects/methodobject.c:543 csarofeen#9 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#10 0x3ffaaf8e919 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#11 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#12 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#13 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#14 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#15 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#16 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#17 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#18 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#19 0x3ffaaf8a615 in _PyObject_FastCallDictTstate Objects/call.c:142 csarofeen#20 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431 
csarofeen#21 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#22 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#23 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#24 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#25 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#26 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#27 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#28 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#29 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#30 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#31 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#32 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#33 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#34 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#35 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#36 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#37 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#38 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#39 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#40 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#41 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#42 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198 csarofeen#43 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#44 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#45 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#46 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#47 0x3ffaaf8eddd 
in method_vectorcall Objects/classobject.c:53 csarofeen#48 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#49 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#50 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#51 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#52 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#53 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#54 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#55 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#56 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#57 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#58 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#59 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#60 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#61 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#62 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#63 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#64 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#65 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#66 0x3ffaaf8ab9b in PyVectorcall_Call Objects/call.c:267 csarofeen#67 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#68 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#69 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#70 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#71 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#72 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#73 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#74 
0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#75 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#76 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#77 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#78 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#79 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#80 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#81 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#82 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#83 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#84 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#85 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#86 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#87 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#88 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198 csarofeen#89 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#90 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#91 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#92 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#93 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#94 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#95 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#96 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#97 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#98 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#99 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#100 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#101 
0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#102 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#103 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#104 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#105 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#106 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#107 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#108 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#109 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#110 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#111 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#112 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#113 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#114 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#115 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#116 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#117 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#118 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#119 0x3ffaaf8ad17 in _PyObject_Call Objects/call.c:305 csarofeen#120 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#121 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#122 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#123 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#124 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#125 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#126 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#127 0x3ffab0f013d in PyObject_Vectorcall 
Include/cpython/abstract.h:123 csarofeen#128 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#129 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#130 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#131 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#132 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#133 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#134 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#135 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#136 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#137 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#138 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#139 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#140 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#141 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#142 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#143 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#144 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#145 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#146 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#147 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#148 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#149 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#150 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#151 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#152 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#153 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#154 0x3ffab0f052b in 
_PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#155 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#156 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#157 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#158 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#159 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#160 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#161 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#162 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#163 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#164 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#165 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#166 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#167 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#168 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#169 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#170 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#171 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#172 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#173 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#174 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#175 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#176 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#177 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#178 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#179 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#180 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#181 
0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#182 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#183 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#184 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#185 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#186 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#187 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#188 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#189 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#190 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#191 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#192 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#193 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#194 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#195 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#196 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#197 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#198 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#199 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#200 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 csarofeen#201 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317 csarofeen#202 0x3ffab1059c7 in do_call_core Python/ceval.c:5943 csarofeen#203 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277 csarofeen#204 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#205 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#206 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#207 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 
csarofeen#208 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#209 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#210 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#211 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#212 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#213 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#214 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#215 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53 csarofeen#216 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#216 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#217 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#218 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#219 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181 csarofeen#220 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#221 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#222 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#223 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153 csarofeen#224 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431 csarofeen#225 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494 csarofeen#226 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215 csarofeen#227 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112 csarofeen#228 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#229 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#230 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231 csarofeen#231 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#232 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 
csarofeen#233 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#234 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#235 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#236 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#237 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#238 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#239 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#240 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#241 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114 csarofeen#242 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123 csarofeen#243 0x3ffab105447 in call_function Python/ceval.c:5891 csarofeen#244 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213 csarofeen#245 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46 csarofeen#246 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065 csarofeen#247 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342 csarofeen#248 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255 csarofeen#249 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290 0x60d0005a5790 is located 80 bytes inside of 136-byte region [0x60d0005a5740,0x60d0005a57c8) freed by thread T0 here: #0 0x3ffab537de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160 #1 0x3ff55984fdb in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate(std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145 previously allocated by thread T0 here: #0 0x3ffab53734f in operator 
new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x3ff5598443f in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127 csarofeen#2 0x3fff5849ecf ([stack]+0xb2ecf) SUMMARY: AddressSanitizer: heap-use-after-free /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&) Shadow bytes around the buggy address: 0x100c1a000b4aa0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa 0x100c1a000b4ab0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd 0x100c1a000b4ac0: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fd fd 0x100c1a000b4ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa 0x100c1a000b4ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd =>0x100c1a000b4af0: fd fd[fd]fd fd fd fd fd fd fa fa fa fa fa fa fa 0x100c1a000b4b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x100c1a000b4b10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x100c1a000b4b20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x100c1a000b4b30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x100c1a000b4b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: 
cc ==1134126==ABORTING ``` Additional backtraces (not full): Allocation: ``` #0 __memset_z196 () at ../sysdeps/s390/memset-z900.S:144 #1 0x000003ff96f3072a in __asan::Allocator::Allocate (this=this@entry=0x3ff97041eb8 <__asan::instance>, size=size@entry=136, alignment=8, alignment@entry=0, stack=<optimized out>, stack@entry=0x3ffdbb45d78, alloc_type=<optimized out>, can_fill=true) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:599 csarofeen#2 0x000003ff96f2c088 in __asan::asan_memalign (alignment=alignment@entry=0, size=size@entry=136, stack=stack@entry=0x3ffdbb45d78, alloc_type=alloc_type@entry=__asan::FROM_NEW) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:1039 csarofeen#3 0x000003ff96fb73b0 in operator new (size=136) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99 csarofeen#4 0x000003ff41404440 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate (this=0x3ffdbb468c0, __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127 csarofeen#5 0x000003ff414042a0 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::allocate (__a=..., __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:464 csarofeen#6 0x000003ff41403b66 in std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > > (__a=...) 
at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:98 csarofeen#7 0x000003ff4140372a in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47888, __p=@0x3ffdbb47880: 0x0, __a=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:648 csarofeen#8 0x000003ff41403328 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1342 csarofeen#9 0x000003ff41402f06 in std::shared_ptr<c10::FunctionSchema>::shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > ( this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) 
at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:409 csarofeen#10 0x000003ff41402b6e in std::allocate_shared<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__a=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:862 csarofeen#11 0x000003ff4140215c in std::make_shared<c10::FunctionSchema, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:878 csarofeen#12 0x000003ff413d180c in c10::TupleType::createWithSpec<c10::basic_string_view<char> > (qualName=..., field_names=std::vector of length 1, capacity 1 = {...}, field_types=std::vector of length 1, capacity 1 = {...}, field_defaults=std::vector of length 0, capacity 0) at /home/user/pytorch/aten/src/ATen/core/type.cpp:769 csarofeen#13 0x000003ff413b9ca6 in c10::TupleType::createNamed (qualName=..., field_names=std::vector of length 1, capacity 1 = {...}, field_types=std::vector of length 1, capacity 1 = {...}) at /home/user/pytorch/aten/src/ATen/core/type.cpp:725 csarofeen#14 0x000003ff4115fbac in c10::ivalue::TupleTypeFactory<c10::TupleType>::fallback (type=...) 
at /home/user/pytorch/aten/src/ATen/core/dynamic_type.cpp:383 csarofeen#15 0x000003ff708217fe in c10::ivalue::Tuple::type<c10::TupleType> (this=0x6080004b8520) at /home/user/pytorch/aten/src/ATen/core/ivalue_inl.h:781 csarofeen#16 0x000003ff70800740 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613 csarofeen#17 0x000003ff70800306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604 csarofeen#18 0x000003ff702d6872 in pybind11::detail::type_caster<c10::IValue, void>::cast (src=...) at /home/user/pytorch/torch/csrc/jit/python/pybind.h:138 csarofeen#19 0x000003ff70d98192 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const (this=0x3ffdbb4ca20, call=...) at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:249 csarofeen#20 0x000003ff70d97cfe in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) (call=...) 
at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:224 csarofeen#21 0x000003ff6e9652ea in pybind11::cpp_function::dispatcher (self=<PyCapsule at remote 0x3ff83e27720>, args_in=(<torch._C.LiteScriptModule at remote 0x3ff811844b0>, (<Tensor at remote 0x3ff814efb00>,)), kwargs_in=0x0) at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929 ``` Deallocation: ``` #0 operator delete (ptr=0x60d0005a5740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160 #1 0x000003ff44904fdc in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate (this=0x3ffc5dc8020, __p=0x60d0005a5740, __t=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145 csarofeen#2 0x000003ff44904fa8 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::deallocate ( __a=..., __p=0x60d0005a5740, __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:496 csarofeen#3 0x000003ff449041f2 in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr ( this=0x3ffc5dc8030) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:74 csarofeen#4 0x000003ff44904888 in std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>::_M_destroy (this=0x60d0005a5740) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:538 csarofeen#5 0x000003ff43895a62 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x60d0005a5740) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:184 csarofeen#6 
0x000003ff43895420 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x611000c40648) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705 csarofeen#7 0x000003ff4466e7f4 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x611000c40640) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154 csarofeen#8 0x000003ff4466d820 in std::shared_ptr<c10::FunctionSchema>::~shared_ptr (this=0x611000c40640) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122 csarofeen#9 0x000003ff448d82f6 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142 csarofeen#10 0x000003ff448d8346 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142 csarofeen#11 0x000003ff731296a4 in std::_Sp_counted_ptr<c10::TupleType*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x603000c43ae0) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:348 csarofeen#12 0x000003ff71eaf666 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x603000c43ae0) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:168 csarofeen#13 0x000003ff71eaf330 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x3ffc5dc9368) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705 csarofeen#14 0x000003ff73129ee4 in std::__shared_ptr<c10::TupleType, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x3ffc5dc9360) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154 csarofeen#15 0x000003ff73122390 in std::shared_ptr<c10::TupleType>::~shared_ptr (this=0x3ffc5dc9360) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122 csarofeen#16 0x000003ff73d00788 in torch::jit::toPyObject (ivalue=...) 
at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613 csarofeen#17 0x000003ff73d00306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604 ``` </details> Pull Request resolved: pytorch#101400 Approved by: https://github.com/zou3519
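The lifetime problem described in the commit message above can be illustrated with a small stand-alone sketch. The types `Schema` and `TupleTypeLike` and the functions `makeType` and `collectNames` are hypothetical stand-ins invented for this illustration; they are not the real `c10` classes, and this is not necessarily how the actual fix was written. The sketch only shows the general pattern: a reference to a member is valid only while the owning object is kept alive.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema object that owns its argument list.
struct Schema {
  std::vector<std::string> args;
  // Returns a reference to a member; valid only while this Schema is alive.
  const std::vector<std::string>& arguments() const { return args; }
};

// Hypothetical stand-in for a type object that owns a schema.
struct TupleTypeLike {
  TupleTypeLike() : schema_(std::make_shared<Schema>()) {
    schema_->args.push_back("x");
    schema_->args.push_back("y");
  }
  const std::shared_ptr<Schema>& schema() const { return schema_; }

 private:
  std::shared_ptr<Schema> schema_;
};

// Builds a fresh owner on every call, so the caller holds the only reference.
std::shared_ptr<TupleTypeLike> makeType() {
  return std::make_shared<TupleTypeLike>();
}

// Unsafe pattern, left commented out on purpose:
//   const auto& args = makeType()->schema()->arguments();
//   // The temporary shared_ptr dies at the end of that statement, the
//   // Schema is freed with it, and any later read of `args` is a
//   // heap-use-after-free.

// Safe pattern: keep the owner in a named variable while the reference is used.
std::vector<std::string> collectNames() {
  auto type = makeType();                          // owner lives until scope exit
  const auto& args = type->schema()->arguments();  // reference stays valid here
  return std::vector<std::string>(args.begin(), args.end());
}
```

Binding a `const auto&` to the result of `arguments()` does not extend the lifetime of the temporary owner, because the reference is bound to a member reached through a function call rather than to the temporary itself. Holding the owner in a local variable is one simple way to avoid this.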

Working towards a new fusion execution engine.

dcbd8e4

csarofeen changed the base branch from master to 20_7_6_devel July 9, 2020 22:55

jjsjann123 and others added 4 commits July 9, 2020 16:20

fixing func_name string

27c57cd

Worked through bugs, simple gemm now working.

70c3862

Merge branch '20_7_6_devel' of https://www.github.com/csarofeen/pytorch into FusionExecutor

0c7f207

Fix bad merge conflict resolution.

112ff1a

jjsjann123 approved these changes Jul 10, 2020

csarofeen added 2 commits July 10, 2020 10:08

Merge branch '20_7_6_devel' into FusionExecutor

90f09c8

Change arguments to use unique_ptr.

3c46d89

csarofeen commented Jul 10, 2020

csarofeen added 7 commits July 10, 2020 11:10

Clang tidy.

0246ee3

Prefix executor related files.

9711b82

Fix build after moving files.

a2d03c2

Try to support thread inference for broadcast.

bb32428

Split out launch parameters.

a4043c2

Move kernel argument holder to live with kernel arguments.

e68584d

Move argument validation and nvrtc compile to a utility file.

d2277f9

tlemo approved these changes Jul 10, 2020

Some more minor tweaks/cleanup.

07552e7

csarofeen commented Jul 11, 2020

tlemo approved these changes Jul 11, 2020

csarofeen added 8 commits July 11, 2020 16:41

Add launch constraints to manually set thread dims when they can't be…

48ef19b

… inferred from input dimensions alone.

Add NamedScalar support to expression evaluator. Improve a few error …

094136b

…messages.

Basic support to split on a symbolic value we can make an input, or a…

616475e

… thread dimension.

Fix tests.

fbc5579

Rework allocation nodes a bit to prepare for using them with grid red…

54422a4

…uctions.

Add a grid reduction IR node.

bd6ae1f

Restructure lower_index so we can return multiple exprs from each exp…

83ff4bb

…r we process.

Enable reduction buffers, enable reductions in fusion executor. Conve…

6374594

…rt tests to fusion executor.

csarofeen force-pushed the FusionExecutor branch from e2ade98 to 6374594 July 16, 2020 20:08

Maybe broadcast support.

80d6de8

tlemo requested changes Jul 16, 2020

csarofeen and others added 9 commits July 16, 2020 18:14

Try to get broadcast in the right spot this time.

a803c08

quick fix to disable broadcast hack for integration

c8a332d

Add automatic output allocation, change every other test to use it.

c34e966

tlemo comments

99da19d

update broadcast tests

bd0a2c1

hacky switch to FusionExecutor

1b68ff9

Fix implicit broadcast indexing.

2592f1b

Clang.

6b34df7

fixing legacy fuser with missing static shape info

484451f

tlemo reviewed Jul 17, 2020

tlemo approved these changes Jul 17, 2020

csarofeen added 3 commits July 19, 2020 10:07

Address PR comments.

bec3aff

Merge remote-tracking branch 'origin/20_7_6_devel' into FusionExecutor

c440423

Clang tidy.

12bd4e4

csarofeen merged commit c94c67e into 20_7_6_devel Jul 19, 2020

csarofeen mentioned this pull request Jul 20, 2020

broadcast to support the expansion of size 1 dimension. #124

Closed


		void appendArgs(const std::vector<at::Tensor>& tensors);

		void appendPhilox(uint64_t rand_offset);


		TORCH_INTERNAL_ASSERT(!outputs.empty(), "No outputs set for test kernel.");

		KernelArgumentHolder kah;


		LaunchParams computeLaunchParams(const at::ArrayRef<IValue>& aten_inputs);

		std::vector<at::Tensor> runFusion(

		@@ -490,10 +483,6 @@ void testGPU_FusionCopy() {

		tv3->axis(0)->parallelize(ParallelType::BIDx);
		tv3->axis(-1)->parallelize(ParallelType::TIDx);
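The fragments above bind tensor axes to `BIDx` and `TIDx`, and the review thread discusses a `checkAndSet` helper used while inferring the launch config. As a rough illustration only, the check-and-bind idea could look like the sketch below. `LaunchParamsSketch`, `bindTIDx`, `bindBIDx`, and the `-1` "unbound" sentinel are hypothetical names chosen for this example; this is not the code merged in the PR, and the real `LaunchParams` may be structured differently.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the PR's LaunchParams; -1 marks "not yet bound".
struct LaunchParamsSketch {
  int64_t bdimx = -1; // threads per block in x (axes parallelized on TIDx)
  int64_t gdimx = -1; // blocks per grid in x (axes parallelized on BIDx)

  // Bind `incoming` to `slot`, or verify it matches an earlier binding.
  // `name` is only used to make the error message readable.
  static void checkAndSet(
      int64_t incoming,
      int64_t& slot,
      const std::string& name) {
    if (slot != -1 && slot != incoming) {
      throw std::runtime_error(
          "Conflicting launch dimension for " + name + ": " +
          std::to_string(slot) + " vs " + std::to_string(incoming));
    }
    slot = incoming;
  }

  void bindTIDx(int64_t extent) {
    checkAndSet(extent, bdimx, "blockDim.x");
  }
  void bindBIDx(int64_t extent) {
    checkAndSet(extent, gdimx, "gridDim.x");
  }
};
```

Under these assumptions, every axis bound to the same parallel type must report the same extent: the first binding sets the dimension, later matching bindings are no-ops, and a mismatch throws before the stored value is changed. That would also explain the reviewer's note that tests launching with a config that does not match the bound axis sizes cannot be expressed through an interface of this kind.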
