
Initial commit does not build #4082


Draft · wants to merge 6 commits into base: main

Conversation

Muzammiluddin-Syed-ECE (Collaborator)

Draft PR for initial review.

Adding lowering support for RoiAlign ops with static shapes and known sampling ratios.

@Muzammiluddin-Syed-ECE (Collaborator, Author) commented Mar 7, 2025

Current to do list:

  • Fix minor build issues (replace SubOp with SubIOp/SubFOp, etc.); see the sketch after this list
  • Add unit test
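
For reference, a minimal sketch of the SubOp fix mentioned above (the operand names are hypothetical; arith has no generic SubOp, only typed variants):

// Hypothetical operands; pick the variant matching the element type.
Value intDiff = b.create<arith::SubIOp>(loc, lhsInt, rhsInt); // integer subtraction
Value fpDiff = b.create<arith::SubFOp>(loc, lhsFp, rhsFp);    // floating-point subtraction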

@Muzammiluddin-Syed-ECE marked this pull request as draft March 7, 2025 18:54
@@ -1616,6 +1616,223 @@ class ConvertAtenAdaptivePoolOp : public OpConversionPattern<OpTy> {
};
} // namespace

namespace {
template <typename OpTy, typename PoolingOpTy, int Dim>
Collaborator

Remove the other two template parameters, since they are unused.

I took a short glance through and I don't have any immediate comments yet. Let's get something built so we can test correctness and iterate from there.
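
A sketch of the reduced declaration, assuming the pattern only ever matches TorchvisionRoiAlignOp (consistent with the plain patterns.add<ConvertRoiAlignOp> in the diff below):

// No template parameters are needed if only one op is handled.
class ConvertRoiAlignOp
    : public OpConversionPattern<Torch::TorchvisionRoiAlignOp> {
public:
  using OpConversionPattern::OpConversionPattern;
  // matchAndRewrite(...) as in the patch.
};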

Collaborator

You'll also need to add this pattern to populatePoolingPatternsAndLegality at the bottom of the file.
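
A sketch of that registration, mirroring the other entries in the function (the addIllegalOp call is an assumption about this file's conventions, not quoted from the patch):

// Mark the source op illegal and register the conversion pattern.
target.addIllegalOp<Torch::TorchvisionRoiAlignOp>();
patterns.add<ConvertRoiAlignOp>(typeConverter, context);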

@kuhar (Member) left a comment

This needs tests.

auto forLoop = b1.create<scf::ForOp>(
loc, lb, ub1, step, ValueRange{},
[&](OpBuilder &b, Location loc, Value iv1, ValueRange args) {
// Step 1: Extract bounds for region of interest (roi)
Member

nit: Use proper punctuation in comments.

Suggested change
// Step 1: Extract bounds for region of interest (roi)
// Step 1: Extract bounds for region of interest (roi).

See https://llvm.org/docs/CodingStandards.html#commenting

Member

Also fix this in other comments in this file.

Comment on lines 1783 to 1786
SmallVector<Value, 2> strideInts = {scaleH, scaleW};
SmallVector<Value, 2> paddingInts = {zeroAttr, zeroAttr};
SmallVector<Value, 2> dilationInts(oneAttr, 2);
SmallVector<Value, 4> outTensorShape;
Member

If you only need two values, you can use a plain C array.
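
For example, reusing the names from the quoted snippet (note that the original dilationInts constructor call also appears to have its (count, value) arguments swapped):

// Fixed-size C arrays; no heap fallback, no template bookkeeping.
Value strideInts[2] = {scaleH, scaleW};
Value paddingInts[2] = {zeroAttr, zeroAttr};
Value dilationInts[2] = {oneAttr, oneAttr};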

roiSampleWidth};
SmallVector<Value> dims =
getTensorSizesUntilDim(b, loc, extractRoi, 1);
for (unsigned i = 2; i < inputRank; i++) {
Member

Use pre-increment:

Suggested change
for (unsigned i = 2; i < inputRank; i++) {
for (unsigned i = 2; i < inputRank; ++i) {

See https://llvm.org/docs/CodingStandards.html#prefer-preincrement

Also in other loops.

Comment on lines 1704 to 1707
Value lowY_int = b.create<math::FloorOp>(loc, lowY);
Value lowX_int = b.create<math::FloorOp>(loc, lowX);
Value highY_int = b.create<math::CeilOp>(loc, highY);
Value highX_int = b.create<math::CeilOp>(loc, highX);
Member

Comment on lines 156 to 179


Type elementType = cast<RankedTensorType>(self.getType()).getElementType();
if (!isa<mlir::FloatType>(elementType) && !supportNonFPInput)
return op->emitError("unimplemented: non-floating point type");

Value initValue =
rewriter.create<arith::ConstantOp>(loc, cast<TypedAttr>(initValueAttr));

paddedInput = padInputTensor(op, rewriter, self, ceilMode, dimensionality,
strideInts, paddingInts, initValue);

auto outTensorInitialized = computeOutputTensor(
op, rewriter, self, dimensionality, ceilMode, strideInts, paddingInts,
dilationInts, kernelSizeIntValues, outTensorShape, initValue);



auto stridesAttr = rewriter.getI64VectorAttr(strideInts);
auto dilationAttr = rewriter.getI64VectorAttr(dilationInts);
auto shape = castIntVectorToIndexVector(rewriter, loc, kernelSizeIntValues);

Value windowTensor = rewriter.create<tensor::EmptyOp>(
loc, getAsOpFoldResult(shape), elementType);

Member

Please undo these unrelated changes. If you want to reformat the whole file, do that before landing this patch.

@@ -187,7 +194,7 @@ static LogicalResult createPoolingOp(
return rewriter.notifyMatchFailure(
op, "failed to perform permutation of tensor");
}

Member

Also here.

using OpConversionPattern::OpConversionPattern;

static SmallVector<Value>
coordinateTransform(OpBuilder &b, Torch::TorchvisionRoiAlignOp op,
Member

Function names should start with a verb. (Similarly for the PR subject line.)
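
For instance (a hypothetical rename):

static SmallVector<Value>
computeCoordinateTransform(OpBuilder &b, Torch::TorchvisionRoiAlignOp op,
                           /* ...remaining parameters unchanged... */);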

Comment on lines +1631 to +1634
Location loc, SmallVector<Value> outputSizes, Value input,
SmallVector<Value> inputSizes,
SmallVector<Value> scaleValues, std::string coordStr,
bool alignCornersBool, SmallVector<Value> indices,
Member

Passing vectors by value will result in copies. Instead, you can pass them as ArrayRef.
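
Applied to the quoted parameters, that might read as follows (swapping std::string for StringRef is an extra suggestion in the same spirit, not part of the quoted code):

// ArrayRef is a non-owning view; no copy is made at the call site.
Location loc, ArrayRef<Value> outputSizes, Value input,
ArrayRef<Value> inputSizes, ArrayRef<Value> scaleValues,
StringRef coordStr, bool alignCornersBool, ArrayRef<Value> indices,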


unsigned dimOffset = 2;
auto inputType = cast<RankedTensorType>(input.getType());
auto inputRank = inputType.getRank();
Member

Do not use auto when the type is not obvious based on the context: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable
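
For the quoted lines, that could read:

// Spell the types out; neither is obvious from the initializer alone.
RankedTensorType inputType = cast<RankedTensorType>(input.getType());
int64_t inputRank = inputType.getRank();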

RankedTensorType inputType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(op.getInput().getType()));

if (inputType == nullptr) {
Member

Suggested change
if (inputType == nullptr) {
if (!inputType) {

LLVM prefers unary negation / an explicit cast to bool over comparisons with nullptr.

rewriter.create<arith::ConstantOp>(loc, rewriter.getF32FloatAttr(0.0));
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
if (resultType == nullptr) {
Member

Suggested change
if (resultType == nullptr) {
if (!resultType) {

Comment on lines +1858 to +1859
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
Member

Suggested change
RankedTensorType resultType = dyn_cast_or_null<RankedTensorType>(
this->getTypeConverter()->convertType(result.getType()));
auto resultType = getTypeConverter()->convertType<RankedTensorType>(result.getType());

Also, you should check that this conversion succeeded and emit an error otherwise.
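
A sketch combining both suggestions (using notifyMatchFailure as the error mechanism is an assumption):

auto resultType =
    getTypeConverter()->convertType<RankedTensorType>(result.getType());
if (!resultType)
  return rewriter.notifyMatchFailure(
      op, "expected result type to convert to a ranked tensor");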

loc, getAsOpFoldResult(finalOutputShape), resultElementType);
rewriter.create<scf::ForOp>(
loc, lb, ub0, step, ValueRange{},
[&](OpBuilder &b, Location loc, Value iv0, ValueRange args) {
Member

Because this is a long lambda, I'd either make it a free function or at least write out the list of captured variables.
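
A sketch of the explicit-capture variant (the captured names are placeholders, not the patch's actual variables):

rewriter.create<scf::ForOp>(
    loc, lb, ub0, step, ValueRange{},
    // Explicit captures document exactly what the loop body depends on.
    [&rewriter, input, rois, outputTensor](OpBuilder &b, Location loc,
                                           Value iv0, ValueRange args) {
      // ...loop body as in the patch...
      b.create<scf::YieldOp>(loc);
    });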

@@ -1665,4 +2125,5 @@ void mlir::torch::torch_to_linalg::populatePoolingPatternsAndLegality(
typeConverter, context);
patterns.add<ConvertAtenAdaptivePoolOp<AtenAdaptiveMaxPool3dOp>>(
typeConverter, context);
patterns.add<ConvertRoiAlignOp>(typeConverter, context);
Member

You can add all of the patterns with the standard constructor to the same add call:

patterns.add<P0, P1, P2, P3>(typeConverter, context);
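
Concretely, for the two patterns visible in this diff hunk:

patterns.add<ConvertAtenAdaptivePoolOp<AtenAdaptiveMaxPool3dOp>,
             ConvertRoiAlignOp>(typeConverter, context);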

@kuhar (Member) commented Mar 11, 2025

@Muzammiluddin-Syed-ECE make sure you click the 'Resolve conversation' button under comments you believe are addressed. This makes it much easier to iterate on pull requests.

@Muzammiluddin-Syed-ECE (Collaborator, Author)

Pausing work on this ramp-up task; it has proven more involved than expected for something issued as part of an onboarding process.

Work done:
1. Identified the assumptions required for the base case of RoiAlign lowering support:

  • Only support lowering on inputs with a trivial batch dimension (=1).
  • Only support lowering on RoiAlign when sampling ratio is explicitly defined.

Both conditions were added to avoid data-dependent execution (for example, loop bounds and tensor indexing that depend on the values inside the input tensors); a minimal sketch of these guards follows item 2 below.

2. Identified an algorithm to implement the base case.
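
A minimal sketch of those two guards as they might appear at the top of the matcher (accessor and helper names follow torch-mlir conventions but are assumptions here, not the patch's code):

// Require a statically known, positive sampling ratio.
int64_t samplingRatio;
if (!matchPattern(op.getSamplingRatio(), m_TorchConstantInt(&samplingRatio)) ||
    samplingRatio <= 0)
  return rewriter.notifyMatchFailure(
      op, "expected sampling_ratio to be a positive constant");
// Require a trivial (unit) batch dimension on the input.
auto inputType = cast<RankedTensorType>(adaptor.getInput().getType());
if (inputType.getDimSize(0) != 1)
  return rewriter.notifyMatchFailure(op, "expected batch dimension of 1");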

To do:
1. Finish debugging the current implementation:

  • Current error, raised on the final insert_slice: error: expected type to be 'tensor<?x?x?x?xf32>' or a rank-reduced version. (size mismatch)

2. Add this unit test in an appropriate location:

func.func @main(%arg0: !torch.vtensor<[?,256,?,?],f32>, %arg1: !torch.vtensor<[2,4],f32>, %arg2: !torch.vtensor<[2],si64>) -> !torch.vtensor<[?,256,7,7],f32> attributes {torch.assume_strict_symbolic_shapes, torch.onnx_meta.ir_version = 4 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "pytorch", torch.onnx_meta.producer_version = "1.1"} {
  %int2 = torch.constant.int 2
  %int7 = torch.constant.int 7
  %float1.250000e-01 = torch.constant.float 1.250000e-01
  %false = torch.constant.bool false
  %none = torch.constant.none
  %int6 = torch.constant.int 6
  %int1 = torch.constant.int 1
  %0 = torch.aten.unsqueeze %arg2, %int1 : !torch.vtensor<[2],si64>, !torch.int -> !torch.vtensor<[2,1],si64>
  %1 = torch.aten.to.dtype %0, %int6, %false, %false, %none : !torch.vtensor<[2,1],si64>, !torch.int, !torch.bool, !torch.bool, !torch.none -> !torch.vtensor<[2,1],f32>
  %2 = torch.prim.ListConstruct %1, %arg1 : (!torch.vtensor<[2,1],f32>, !torch.vtensor<[2,4],f32>) -> !torch.list<vtensor>
  %3 = torch.aten.cat %2, %int1 : !torch.list<vtensor>, !torch.int -> !torch.vtensor<[2,5],f32>
  %4 = torch.torchvision.roi_align %arg0, %3, %float1.250000e-01, %int7, %int7, %int2, %false : !torch.vtensor<[?,256,?,?],f32>, !torch.vtensor<[2,5],f32>, !torch.float, !torch.int, !torch.int, !torch.int, !torch.bool -> !torch.vtensor<[?,256,7,7],f32>
  return %4 : !torch.vtensor<[?,256,7,7],f32>
}

3. Create a test to verify numerics

@Muzammiluddin-Syed-ECE (Collaborator, Author)

Feel free to ping me for additional context and detail.
