
Initial commit does not build #4082


Draft · wants to merge 6 commits into base: main

Conversation

Muzammiluddin-Syed-ECE (Collaborator)

Draft PR for initial review.

Adding lowering support for RoiAlign ops with static shapes and known sampling ratios.

@Muzammiluddin-Syed-ECE (Collaborator, Author) commented Mar 7, 2025

Current to do list:

  • Fix minor build issues (replace SubOp with SubIOp/SubFOp, etc.); see the sketch after this list
  • Add unit test
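
For reference, a minimal sketch of the SubOp fix mentioned above (the operand names are hypothetical; arith has no generic SubOp, only typed variants):

// Hypothetical operands; pick the variant matching the element type.
Value intDiff = b.create<arith::SubIOp>(loc, lhsInt, rhsInt); // integer subtraction
Value fpDiff = b.create<arith::SubFOp>(loc, lhsFp, rhsFp);    // floating-point subtraction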

@Muzammiluddin-Syed-ECE marked this pull request as draft March 7, 2025 18:54
@@ -1616,6 +1616,223 @@ class ConvertAtenAdaptivePoolOp : public OpConversionPattern<OpTy> {
};
} // namespace

namespace {
template <typename OpTy, typename PoolingOpTy, int Dim>
Collaborator

Remove the other two template parameters, since they are unused.

I took a short glance through and I don't have any immediate comments yet. Let's get something built so we can test correctness and iterate from there.
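
A sketch of the reduced declaration, assuming the pattern only ever matches TorchvisionRoiAlignOp (consistent with the plain patterns.add<ConvertRoiAlignOp> in the diff below):

// No template parameters are needed if only one op is handled.
class ConvertRoiAlignOp
    : public OpConversionPattern<Torch::TorchvisionRoiAlignOp> {
public:
  using OpConversionPattern::OpConversionPattern;
  // matchAndRewrite(...) as in the patch.
};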

Collaborator

You'll also need to add this pattern to populatePoolingPatternsAndLegality at the bottom of the file.
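
A sketch of that registration, mirroring the other entries in the function (the addIllegalOp call is an assumption about this file's conventions, not quoted from the patch):

// Mark the source op illegal and register the conversion pattern.
target.addIllegalOp<Torch::TorchvisionRoiAlignOp>();
patterns.add<ConvertRoiAlignOp>(typeConverter, context);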

@kuhar (Member) left a comment

This needs tests.

auto forLoop = b1.create<scf::ForOp>(
loc, lb, ub1, step, ValueRange{},
[&](OpBuilder &b, Location loc, Value iv1, ValueRange args) {
// Step 1: Extract bounds for region of interest (roi)
Member

nit: Use proper punctuation in comments.

Suggested change
// Step 1: Extract bounds for region of interest (roi)
// Step 1: Extract bounds for region of interest (roi).

See https://llvm.org/docs/CodingStandards.html#commenting

Member

Also fix this in other comments in this file.

Comment on lines 1783 to 1786
SmallVector<Value, 2> strideInts = {scaleH, scaleW};
SmallVector<Value, 2> paddingInts = {zeroAttr, zeroAttr};
SmallVector<Value, 2> dilationInts(oneAttr, 2);
SmallVector<Value, 4> outTensorShape;
Member

If you only need two values, you can use a plain C array.
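
For example, reusing the names from the quoted snippet (note that the original dilationInts constructor call also appears to have its (count, value) arguments swapped):

// Fixed-size C arrays; no heap fallback, no template bookkeeping.
Value strideInts[2] = {scaleH, scaleW};
Value paddingInts[2] = {zeroAttr, zeroAttr};
Value dilationInts[2] = {oneAttr, oneAttr};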

roiSampleWidth};
SmallVector<Value> dims =
getTensorSizesUntilDim(b, loc, extractRoi, 1);
for (unsigned i = 2; i < inputRank; i++) {
Member

Use pre-increment:

Suggested change
for (unsigned i = 2; i < inputRank; i++) {
for (unsigned i = 2; i < inputRank; ++i) {

See https://llvm.org/docs/CodingStandards.html#prefer-preincrement

Also in other loops.

Comment on lines 1704 to 1707
Value lowY_int = b.create<math::FloorOp>(loc, lowY);
Value lowX_int = b.create<math::FloorOp>(loc, lowX);
Value highY_int = b.create<math::CeilOp>(loc, highY);
Value highX_int = b.create<math::CeilOp>(loc, highX);
Member

Comment on lines 156 to 179


Type elementType = cast<RankedTensorType>(self.getType()).getElementType();
if (!isa<mlir::FloatType>(elementType) && !supportNonFPInput)
return op->emitError("unimplemented: non-floating point type");

Value initValue =
rewriter.create<arith::ConstantOp>(loc, cast<TypedAttr>(initValueAttr));

paddedInput = padInputTensor(op, rewriter, self, ceilMode, dimensionality,
strideInts, paddingInts, initValue);

auto outTensorInitialized = computeOutputTensor(
op, rewriter, self, dimensionality, ceilMode, strideInts, paddingInts,
dilationInts, kernelSizeIntValues, outTensorShape, initValue);



auto stridesAttr = rewriter.getI64VectorAttr(strideInts);
auto dilationAttr = rewriter.getI64VectorAttr(dilationInts);
auto shape = castIntVectorToIndexVector(rewriter, loc, kernelSizeIntValues);

Value windowTensor = rewriter.create<tensor::EmptyOp>(
loc, getAsOpFoldResult(shape), elementType);

Member

Please undo these unrelated changes. If you want to reformat the whole file, do that before landing this patch.

@@ -187,7 +194,7 @@ static LogicalResult createPoolingOp(
return rewriter.notifyMatchFailure(
op, "failed to perform permutation of tensor");
}

Member

Also here.

using OpConversionPattern::OpConversionPattern;

static SmallVector<Value>
coordinateTransform(OpBuilder &b, Torch::TorchvisionRoiAlignOp op,
Member

Function names should start with a verb. (Similarly for the PR subject line.)
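
For instance (a hypothetical rename):

static SmallVector<Value>
computeCoordinateTransform(OpBuilder &b, Torch::TorchvisionRoiAlignOp op,
                           /* ...remaining parameters unchanged... */);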

Comment on lines +1631 to +1634
Location loc, SmallVector<Value> outputSizes, Value input,
SmallVector<Value> inputSizes,
SmallVector<Value> scaleValues, std::string coordStr,
bool alignCornersBool, SmallVector<Value> indices,
Member

Passing vectors by value will result in copies. Instead, you can pass them as ArrayRef.
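
Applied to the quoted parameters, that might read as follows (swapping std::string for StringRef is an extra suggestion in the same spirit, not part of the quoted code):

// ArrayRef is a non-owning view; no copy is made at the call site.
Location loc, ArrayRef<Value> outputSizes, Value input,
ArrayRef<Value> inputSizes, ArrayRef<Value> scaleValues,
StringRef coordStr, bool alignCornersBool, ArrayRef<Value> indices,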


unsigned dimOffset = 2;
auto inputType = cast<RankedTensorType>(input.getType());
auto inputRank = inputType.getRank();
Member

Do not use auto when the type is not obvious based on the context: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable
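
For the quoted lines, that could read:

// Spell the types out; neither is obvious from the initializer alone.
RankedTensorType inputType = cast<RankedTensorType>(input.getType());
int64_t inputRank = inputType.getRank();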

RankedTensorType inputType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(op.getInput().getType()));

if (inputType == nullptr) {
Member

Suggested change
if (inputType == nullptr) {
if (!inputType) {

LLVM prefers unary negation / an explicit cast to bool over comparisons with nullptr.

rewriter.create<arith::ConstantOp>(loc, rewriter.getF32FloatAttr(0.0));
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
if (resultType == nullptr) {
Member

Suggested change
if (resultType == nullptr) {
if (!resultType) {

Comment on lines +1858 to +1859
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
Member

Suggested change
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
auto resultType = getTypeConverter()->convertType<RankedTensorType>(result.getType());

Also, you should check that this conversion succeeded and emit an error otherwise.
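
A sketch combining both suggestions (using notifyMatchFailure as the error mechanism is an assumption):

auto resultType =
    getTypeConverter()->convertType<RankedTensorType>(result.getType());
if (!resultType)
  return rewriter.notifyMatchFailure(
      op, "expected result type to convert to a ranked tensor");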

loc, getAsOpFoldResult(finalOutputShape), resultElementType);
rewriter.create<scf::ForOp>(
loc, lb, ub0, step, ValueRange{},
[&](OpBuilder &b, Location loc, Value iv0, ValueRange args) {
Member

Because this is a long lambda, I'd either make it a free function or at least write out the list of captured variables.
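
A sketch of the explicit-capture variant (the captured names are placeholders, not the patch's actual variables):

rewriter.create<scf::ForOp>(
    loc, lb, ub0, step, ValueRange{},
    // Explicit captures document exactly what the loop body depends on.
    [&rewriter, input, rois, outputTensor](OpBuilder &b, Location loc,
                                           Value iv0, ValueRange args) {
      // ...loop body as in the patch...
      b.create<scf::YieldOp>(loc);
    });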

@@ -1665,4 +2125,5 @@ void mlir::torch::torch_to_linalg::populatePoolingPatternsAndLegality(
typeConverter, context);
patterns.add<ConvertAtenAdaptivePoolOp<AtenAdaptiveMaxPool3dOp>>(
typeConverter, context);
patterns.add<ConvertRoiAlignOp>(typeConverter, context);
Member

You can add all of the patterns with the standard constructor to the same add call:

patterns.add<P0, P1, P2, P3>(typeConverter, context);
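
Concretely, for the two patterns visible in this diff hunk:

patterns.add<ConvertAtenAdaptivePoolOp<AtenAdaptiveMaxPool3dOp>,
             ConvertRoiAlignOp>(typeConverter, context);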

@kuhar (Member) commented Mar 11, 2025

@Muzammiluddin-Syed-ECE make sure you click the 'Resolve conversation' button under comments you believe are addressed. This makes it much easier to iterate on pull requests.

@Muzammiluddin-Syed-ECE (Collaborator, Author)

Pausing work on this ramp-up task; it has proven more involved than expected for something issued as part of an onboarding process.

Work done:
1. Identified the assumptions required for the base case of RoiAlign lowering support:

  • Only support lowering on inputs with a trivial batch dimension (=1).
  • Only support lowering on RoiAlign when sampling ratio is explicitly defined.

Both conditions were added to avoid data-dependent execution (for example, loop bounds and tensor indexing that depend on the values inside the input tensors); a minimal sketch of these guards follows item 2 below.

2. Identified an algorithm to implement the base case.
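
A minimal sketch of those two guards as they might appear at the top of the matcher (accessor and helper names follow torch-mlir conventions but are assumptions here, not the patch's code):

// Require a statically known, positive sampling ratio.
int64_t samplingRatio;
if (!matchPattern(op.getSamplingRatio(), m_TorchConstantInt(&samplingRatio)) ||
    samplingRatio <= 0)
  return rewriter.notifyMatchFailure(
      op, "expected sampling_ratio to be a positive constant");
// Require a trivial (unit) batch dimension on the input.
auto inputType = cast<RankedTensorType>(adaptor.getInput().getType());
if (inputType.getDimSize(0) != 1)
  return rewriter.notifyMatchFailure(op, "expected batch dimension of 1");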

To do:
1. Finish debugging the current implementation:

  • Current error, raised on the final insert_slice: error: expected type to be 'tensor<?x?x?x?xf32>' or a rank-reduced version. (size mismatch)

2. Add this unit test in an appropriate location:

func.func @main(%arg0: !torch.vtensor<[?,256,?,?],f32>, %arg1: !torch.vtensor<[2,4],f32>, %arg2: !torch.vtensor<[2],si64>) -> !torch.vtensor<[?,256,7,7],f32> attributes {torch.assume_strict_symbolic_shapes, torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "pytorch", torch.onnx_meta.producer_version = "1.1"} {
  %int2 = torch.constant.int 2
  %int7 = torch.constant.int 7
  %float1.250000e-01 = torch.constant.float 1.250000e-01
  %false = torch.constant.bool false
  %none = torch.constant.none
  %int6 = torch.constant.int 6
  %int1 = torch.constant.int 1
  %0 = torch.aten.unsqueeze %arg2, %int1 : !torch.vtensor<[2],si64>, !torch.int -> !torch.vtensor<[2,1],si64>
  %1 = torch.aten.to.dtype %0, %int6, %false, %false, %none : !torch.vtensor<[2,1],si64>, !torch.int, !torch.bool, !torch.bool, !torch.none -> !torch.vtensor<[2,1],f32>
  %2 = torch.prim.ListConstruct %1, %arg1 : (!torch.vtensor<[2,1],f32>, !torch.vtensor<[2,4],f32>) -> !torch.list<vtensor>
  %3 = torch.aten.cat %2, %int1 : !torch.list<vtensor>, !torch.int -> !torch.vtensor<[2,5],f32>
  %4 = torch.torchvision.roi_align %arg0, %3, %float1.250000e-01, %int7, %int7, %int2, %false : !torch.vtensor<[?,256,?,?],f32>, !torch.vtensor<[2,5],f32>, !torch.float, !torch.int, !torch.int, !torch.int, !torch.bool -> !torch.vtensor<[?,256,7,7],f32>
  return %4 : !torch.vtensor<[?,256,7,7],f32>
}

3. Create a test to verify numerics

@Muzammiluddin-Syed-ECE (Collaborator, Author)

Feel free to ping me for additional context and detail.
