@@ -18,9 +18,46 @@ include "mlir/IR/OpBase.td"
1818
1919def TilingInterface : OpInterface<"TilingInterface"> {
2020 let description = [{
21- Interface for allowing operations to expose information needed to
22- tile them (similar to LinalgOp, but without having access to
23- indexing maps)
21+ This interface allows operations to expose information needed to tile them.
22+
23+ The intent of this interface is to separate the generation of the loop
24+ structure (and constructs used for it) from the information needed from
25+ the operation to be able to tile them. As a result an implementation of
26+ the tiling algorithm (like `scf::tileUsingSCF`) can generate the inter-tile
27+ loop structure, and call into the methods of the interface to be able to
28+ tile any operation that implements the interface.
29+
30+ This interface is also meant to help with "tile and fuse", i.e. the process
31+ of fusing a producer with a consumer by
32+ a) Tiling the consumer
33+ b) Based on the tile of the producer used by the tiled consumer,
34+ materialize the tiled implementation of a producer to generate that
35+ tile (and use it immediately in the consumer)
36+ You could also fuse a consumer with a producer by
37+ a) Tiling the producer
38+ b) Based on the tile produced, materialize the tiled implementation of
39+ a consumer that uses this tile.
40+ Note that tile and fuse does not evaluate whether it
41+ is "profitable to do this"; it simply provides a mechanism to implement
42+ the transformation when such a fusion is needed by the caller.
43+
44+ For any operation to be tilable, an operation has to implement the
45+ following methods (see description below):
46+ - `getLoopIteratorTypes`
47+ - `getIterationDomain`
48+ - `getTiledImplementation`
49+ - `getResultTilePosition`
50+
51+ For an operation to be "tiled and fused" with its (already tiled) consumer,
52+ an operation has to implement the following additional method (see
53+ description below):
54+ - `generateResultTileValue`
55+
56+ For an operation to be "tiled and fused" with its (already tiled) producer,
57+ an operation has to implement the following additional methods (see
58+ description below):
59+ - `getTiledImplementationFromOperandTile`
60+ - `getIterationDomainTileFromOperandTile`.
2461 }];
2562 let cppNamespace = "::mlir";
2663 let methods = [
@@ -49,19 +86,22 @@ def TilingInterface : OpInterface<"TilingInterface"> {
4986 /*desc=*/[{
5087 Method to generate the tiled implementation of an operation.
5188
52- The iteration space of the operation is returned by
53- `getIterationDomain`. The caller provides the information of the
54- tile within this iteration space whose implementation the
55- caller needs.
89+ Given a tile of the iteration space (as returned by
90+ `getIterationDomain`), generate in-place the code that represents
91+ the computation corresponding to that tile of the iteration space.
92+ It is the responsibility of the implementation of this method in
93+ the operation to generate the slices of the operands needed for the
94+ tiled implementation.
5695 - `offsets` provides the offset of the tile in the coordinate system
5796 of the original iteration space, i.e., if an iteration space
58- dimension had non-zero offset, it must be included in the offset
97+ dimension had non-zero offset, it will be included in the offset
5998 provided here (as opposed to zero-based offset "relative" to the
6099 iteration space).
61100 - `sizes` provides the size of the tile.
62101
63- The method returns the operation that is the tiled
64- implementation.
102+ The returned `TilingResult` must contain, for each result of the
103+ untiled operation, a `Value` that is the result of the tiled
104+ operation.
65105 }],
66106 /*retType=*/"FailureOr<::mlir::TilingResult>",
67107 /*methodName=*/"getTiledImplementation",
@@ -76,11 +116,34 @@ def TilingInterface : OpInterface<"TilingInterface"> {
76116 >,
77117 InterfaceMethod<
78118 /*desc=*/[{
79- Method to return the position of the result tile computed by the tiled operation.
119+ Method to return the position of the result tile computed by the
120+ tiled operation.
121+
122+ For operations that return a value (typically a value of type
123+ `RankedTensorType`), the generated tiled computation has to also
124+ recompute a replacement for the results of the original operation.
125+ The tiled implementation of the operation returns a tile of the
126+ result(s). This method returns information about what part of the
127+ result tensor is computed by the tiled implementation. The manner in
128+ which these tiles get put together to get the final result is up to
129+ the surrounding loop construct. If an operation has no results, (for
130+ example an operation that operates only on memrefs), then this method
131+ need not be implemented by the operation.
132+ - `resultNumber` is the result number of the original operation
133+ being processed.
134+ - `offsets` provides the offset of the tile in the coordinate system
135+ of the original iteration space, i.e., if an iteration space
136+ dimension had non-zero offset, it will be included in the offset
137+ provided here (as opposed to zero-based offset "relative" to the
138+ iteration space).
139+ - `sizes` provides the size of the tile.
140+ - `resultOffsets` is the offsets of the tile of the result generated
141+ by the tiled implementation (returned by value).
142+ - `resultSizes` is the size of the tile of the result generated
143+ by the tiled implementation (returned by value).
80144
81- Specifies what tile of the result of the original tensor is computed
82- by the tiled implementation. Expects the same `offsets` and `sizes` as
83- used to obtain the tiled implementation of the operation.
145+ Note: It is undefined behaviour if there is overlap between the
146+ tiles of the result generated by the tiled implementation.
84147 }],
85148 /*retType=*/"::mlir::LogicalResult",
86149 /*methodName=*/"getResultTilePosition",
@@ -98,51 +161,66 @@ def TilingInterface : OpInterface<"TilingInterface"> {
98161 >,
99162 InterfaceMethod<
100163 /*desc=*/[{
101- Method to return the tile of the iteration domain where
102- values from the given tile of the operand are used.
164+ Method to generate the code that produces a tile of the result.
165+
166+ This method is required to allow operations to be "tiled and fused"
167+ with an (already tiled) consumer. Typically, for two operations with
168+ producer -> consumer relationship, to compute a tile of the
169+ consumer a `slice` of the producer is needed. This method allows
170+ computing that slice of the producer in-place, thereby "fusing"
171+ the operations at tile-granularity. This method is different from
172+ `getTiledImplementation`, which produces a tiled implementation
173+ for a tile of the iteration space. This method produces a tiled
174+ implementation based on the tile of producer required.
175+ - `resultNumber` is the result of the producer used by the consumer.
176+ - `offsets` is the offset of the slice of the producer result used by
177+ the tiled implementation of the consumer.
178+ - `sizes` is the size of the slice of the producer result used by the
179+ consumer.
180+ If fusion of the producer with the consumer is not legal for the
181+ operation/result, this method should return failure.
182+
183+ Note: This method only deals with the mechanism of implementing the
184+ fusion. In general the fusion might result in recomputation (based on
185+ the way the result is produced by the producer and the access pattern
186+ used by the consumer). It is up to the caller to handle this
187+ appropriately.
103188 }],
104- /*retType=*/"::mlir::LogicalResult",
105- /*methodName=*/"getIterationDomainTileFromOperandTile",
189+ /*retType=*/"FailureOr<::mlir::TilingResult>",
190+ /*methodName=*/"generateResultTileValue",
106191 /*args=*/(ins
107192 "OpBuilder &":$b,
108- "unsigned":$operandNumber,
109- "ArrayRef<OpFoldResult>":$offsets,
110- "ArrayRef<OpFoldResult>":$sizes,
111- "SmallVectorImpl<OpFoldResult> &":$iterDomainOffsets,
112- "SmallVectorImpl<OpFoldResult> &":$iterDomainSizes),
193+ "unsigned":$resultNumber,
194+ "ArrayRef<OpFoldResult>":$offsets,
195+ "ArrayRef<OpFoldResult>":$sizes),
113196 /*methodBody=*/"",
114197 /*defaultImplementation=*/[{
115198 return failure();
116199 }]
117200 >,
118201 InterfaceMethod<
119202 /*desc=*/[{
120- Method to generate the code that produces a tile of the result.
121-
122- Generates the IR that computes the tile of a result of the
123- operation. The `offsets` and `sizes` describe the tile of
124- the output required. This is different from
125- `getTiledImplementation` which generates the tiled
126- implementation of the operation given a tile of the
127- iteration space. This method generates a tiled
128- implementation of the operation based on the tile of the
129- result required. This method enables fusion by using tile
130- and fuse. The method returns failure if the operation can't be
131- tiled to generate the result tile. In practical terms this
132- implies it cannot be tiled and fused with its consumers.
203+ Method to generate the tiled implementation of an operation that uses
204+ exactly a tile of the given operand.
133205
134- - `offsets` provides the offset of the tile in the coordinate system
135- of the original iteration space, i.e., if an iteration space
136- dimension had non-zero offset, it must be included in the offset
137- provided here (as opposed to zero-based offset "relative" to the
138- iteration space).
139- - `sizes` provides the size of the tile.
206+ This method is required to allow operations to be "tiled and fused"
207+ with an (already tiled) producer. Given a tile of the producer, this
208+ method generates the tile of the consumer that uses exactly this
209+ produced tile. In some sense it is the "reverse" of
210+ `generateResultTileValue`.
211+ - `operandNumber` is the operand of the consumer that uses the
211+ result of the producer.
212+ - `offsets` is the offset of the slice of the producer result used by
213+ the tiled implementation of the consumer.
214+ - `sizes` is the size of the slice of the producer result used by the
215+ consumer.
216+ If it is illegal to fuse with a producer along the given operand for
217+ an operation, the implementation should return a failure.
140218 }],
141219 /*retType=*/"FailureOr<::mlir::TilingResult>",
142- /*methodName=*/"generateResultTileValue",
220+ /*methodName=*/"getTiledImplementationFromOperandTile",
143221 /*args=*/(ins
144222 "OpBuilder &":$b,
145- "unsigned":$resultNumber,
223+ "unsigned":$operandNumber,
146224 "ArrayRef<OpFoldResult>":$offsets,
147225 "ArrayRef<OpFoldResult>":$sizes),
148226 /*methodBody=*/"",
@@ -152,38 +230,73 @@ def TilingInterface : OpInterface<"TilingInterface"> {
152230 >,
153231 InterfaceMethod<
154232 /*desc=*/[{
155- Method to generate the tiled implementation of an operation from
156- operand tile position.
233+ Method to return the tile of the iteration domain that uses a given
234+ tile of the operand.
157235
158- NOTE: For most operations, this should be a trivial composition of
159- getIterationDomainTileFromOperandTile and getTiledImplementation.
236+ This method is required to allow operations to be "tiled and fused"
237+ with an (already tiled) producer. Given a tile of an operand,
238+ returns the tile of the iteration space that uses this tile.
239+ - `operandNumber` is the operand of the consumer that uses the
239+ result of the producer.
240+ - `offsets` is the offset of the slice of the producer result used by
241+ the tiled implementation of the consumer.
242+ - `sizes` is the size of the slice of the producer result used by the
243+ consumer.
244+ If it is illegal to fuse with a producer along the given operand for
245+ an operation, or if this mapping cannot be computed, the
246+ implementation should return a failure.
160247
161- Generates the IR that computes the tiled implementation of an
162- operation from operand tile. The `offsets` and `sizes`
163- describe the tile of the operand required. This is different from
164- `getTiledImplementation` which generates the tiled
165- implementation of the operation given a tile of the
166- iteration space. This method generates a tiled
167- implementation of the operation based on the tile of the
168- operand required. This method enables consumer fusion by using
169- tile and fuse. The method returns failure if the operation
170- can't be tiled to generate the operand tile. In practical terms
171- this implies it cannot be tiled and fused with its producers.
248+ Note that unlike the "tile consumer and fuse producer" case, the
249+ "tile producer and fuse consumer" requires an additional method to get
250+ the iteration tile space that encompasses all uses of the given operand
251+ tile. To see why, consider
252+ ```mlir
253+ %1 = scf.for... {
254+ %2 = <tiled_producer_op>
255+ %3 = tensor.insert_slice %2 into ...
256+ scf.yield %3
257+ }
258+ %4 = <consumer_op>(... %1 ...)
259+ ... <some_op>(... %4 ...)
260+ ```
172261
173- - `offsets` provides the offset of the tile in the coordinate system
174- of the original iteration space, i.e., if an iteration space
175- dimension had non-zero offset, it must be included in the offset
176- provided here (as opposed to zero-based offset "relative" to the
177- iteration space).
178- - `sizes` provides the size of the tile.
262+ when fused this becomes
263+ ```mlir
264+ %1 = scf.for... {
265+ %2 = <tiled_producer_op>
266+ %3 = <tiled_consumer_op>(... %2...)
267+ %4 = tensor.insert_slice %3 into ...
268+ scf.yield %4
269+ }
270+ ... <some_op>(... %1 ...)
271+ ```
272+
273+ i.e., when fusing the consumer, the replacement for the result of the
274+ consumer needs to be returned to replace the uses of the consumer.
275+ For the tile+fuse algorithm to do this, it needs to know which tile
276+ of the iteration space encompasses all uses of the produced tile, and
277+ it uses that tile to compute which results are produced. Note
278+ that this iteration space might be the entire iteration space of the
279+ operation, or multiple operand tiles might map to intersecting
280+ iteration spaces. It is up to the caller to make sure that it is still
281+ fusable with the producer in this scenario, or it must return a failure.
282+
283+ Note that this method is only used as a way to implement the
284+ transformation. It does not provide guarantees on whether such a
285+ transformation is profitable.
286+
287+ For most cases `getTiledImplementationFromOperandTile` could be
288+ implemented using `getIterationDomainTileFromOperandTile` +
289+ `getTiledImplementation` methods.
179290 }],
180- /*retType=*/"FailureOr<::mlir::TilingResult>",
181- /*methodName=*/"getTiledImplementationFromOperandTile",
291+ /*retType=*/"::mlir::LogicalResult",
292+ /*methodName=*/"getIterationDomainTileFromOperandTile",
182293 /*args=*/(ins
183294 "OpBuilder &":$b,
184295 "unsigned":$operandNumber,
185- "ArrayRef<OpFoldResult>":$offsets,
186- "ArrayRef<OpFoldResult>":$sizes),
296+ "ArrayRef<OpFoldResult>":$offsets,
297+ "ArrayRef<OpFoldResult>":$sizes,
298+ "SmallVectorImpl<OpFoldResult> &":$iterDomainOffsets,
299+ "SmallVectorImpl<OpFoldResult> &":$iterDomainSizes),
187300 /*methodBody=*/"",
188301 /*defaultImplementation=*/[{
189302 return failure();
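
A note on the `offsets`/`sizes` convention documented for `getTiledImplementation` in this diff: the offsets are expressed in the coordinate system of the original iteration space, so a non-zero domain offset is carried into every tile offset. The sketch below models that for a single dimension with plain integers. It is only an illustration under stated assumptions: `Range1D` and `enumerateTiles` are hypothetical names that do not exist in MLIR, the real interface passes `OpFoldResult` values rather than integers, and this is not how `scf::tileUsingSCF` is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical plain-integer stand-in for one dimension of an iteration
// domain or of a tile. The actual interface uses OpFoldResult values.
struct Range1D {
  int64_t offset; // start, in the coordinate system of the original domain
  int64_t size;   // number of iterations covered
};

// Enumerate the tiles a surrounding loop would hand to the operation for
// one dimension. Tile offsets are absolute (they include the domain's own
// offset), and the last tile is clamped to the end of the domain.
std::vector<Range1D> enumerateTiles(Range1D domain, int64_t tileSize) {
  assert(tileSize > 0 && "tile size must be positive");
  std::vector<Range1D> tiles;
  const int64_t end = domain.offset + domain.size;
  for (int64_t iv = domain.offset; iv < end; iv += tileSize) {
    tiles.push_back({iv, std::min(tileSize, end - iv)});
  }
  return tiles;
}
```

For a domain starting at offset 2 with 10 iterations and a tile size of 4, this yields tiles at offsets 2, 6 and 10 (not 0, 4 and 8), with the last tile covering only 2 iterations.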