From 68ca2651df065ada7e20813a9a0ea505b241719a Mon Sep 17 00:00:00 2001
From: Scott Todd
Date: Mon, 20 Sep 2021 12:15:05 -0700
Subject: [PATCH 1/3] Sketching out a "types and shapes" developer document.

---
 .../design_docs/types_and_shapes.md | 136 ++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 docs/developers/design_docs/types_and_shapes.md

diff --git a/docs/developers/design_docs/types_and_shapes.md b/docs/developers/design_docs/types_and_shapes.md
new file mode 100644
index 000000000000..b80a35b039a2
--- /dev/null
+++ b/docs/developers/design_docs/types_and_shapes.md
@@ -0,0 +1,136 @@
+# Types and Shapes
+
+IREE supports compiling programs from a variety of frontend frameworks to a
+number of backends and uses a collection of MLIR dialects and passes to connect
+each slice through the system. Each layer of the stack has its own views on
+data types and shapes.
+
+* Data _type_ here refers to an attribute of data which describes its meaning,
+  defines operations that can be performed on it, and gives information about
+  how it can be stored. Examples of data types are `integer`, `float`, and
+  `string`. See [the Wikipedia page on data types](https://en.wikipedia.org/wiki/Data_type)
+  for more background.
+* Data _shape_ here refers to an attribute of multidimensional data (scalars,
+  matrices, tensors) which describes the number of elements in each axis of the
+  data. Shapes are composed of a rank (the number of axes, if defined) and a
+  list of dimensions, one element per axis. Some example shapes are `[3, 4]`,
+  `[*]` (unranked), and `[?, 2]` (ranked with one unknown dimension). See the
+  [MLIR 'shape' Dialect documentation](https://mlir.llvm.org/docs/Dialects/ShapeDialect/)
+  for more background.
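The rank and dimension definitions above can be made concrete with a small NumPy sketch (NumPy is one of the frontends referenced in this document; the function and variable names here are illustrative only):

```python
import numpy as np

# Rank is the number of axes; the shape lists one dimension per axis.
matrix = np.zeros((3, 4), dtype=np.float32)  # shape [3, 4]
assert matrix.ndim == 2
assert matrix.shape == (3, 4)

# NumPy arrays are always fully static; a dynamic dimension like the "?" in
# [?, 2] exists only at the type level in compilers. Here "?" is modeled by
# accepting any size along axis 0 and resolving it at runtime.
def rows_of(array):
    assert array.ndim == 2 and array.shape[1] == 2, "expected shape [?, 2]"
    return array.shape[0]  # the "?" resolves to a concrete value

assert rows_of(np.ones((7, 2))) == 7
```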
+
+Frontend references:
+
+* TensorFlow: [Introduction to Tensors](https://www.tensorflow.org/guide/tensor)
+* PyTorch: [`torch.Tensor` documentation](https://pytorch.org/docs/stable/tensors.html)
+* NumPy: [Data types documentation](https://numpy.org/doc/stable/user/basics.types.html)
+
+Backend references:
+
+* Vulkan: [buffer and image formats](https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html#formats)
+* SPIR-V: [types](https://www.khronos.org/registry/SPIR-V/specs/1.0/SPIRV.html#_types) and [capabilities](https://www.khronos.org/registry/SPIR-V/specs/1.0/SPIRV.html#_a_id_capability_a_capability)
+
+## Types
+
+Types can roughly be grouped in a few different ways:
+
+* Primitive (`char`, `int`) vs composite (`string`, `array`)
+* Signed (`int`, `int32_t`) vs unsigned (`unsigned`, `uint32_t`) vs signless
+* Fixed width (`int32_t`) vs variable width (`int`, `index`, `uintptr_t`)
+* Real (`float32`) vs complex (`tf.complex64`)
+* Concrete vs opaque (`void*`, API internal structs, hardware image formats)
+* Quantized data types (`bfloat16`)
+
+Types are least constrained in user code within high level frameworks, where
+composite types such as Python classes, media files, Protocol Buffers, JSON
+objects, and other data structures can be freely created and transformed.
+Meanwhile, types are most constrained by hardware and device APIs, where only
+specific low level primitives are defined or where certain operations are
+supported by efficient hardware implementations.
+
+### Conversion process
+
+IREE lowers programs from representations produced by high level frontends down
+to low level host code with scheduling logic and device code containing fused
+kernels of dense computation. The phases of compilation can be segmented by
+which MLIR dialects are primarily being transformed:
+
+```
+Frontends (PyTorch, JAX, TensorFlow, TOSA, etc.)
+  * Includes user code, serialized ML models / programs, and other libraries
+
+  ↓
+
+Import dialects (`iree`, `tensor`, `linalg`, etc.)
+
+  ↓
+
+`flow` dialect (tensor program modeling and compute workload partitioning)
+
+  ↓
+
+  `stream` dialect                                 |  code generation
+  (device placement and asynchronous scheduling)   |  (SPIR-V, LLVM, etc.)
+
+  ↓
+
+`hal` dialect (Hardware Abstraction Layer for buffer and execution management)
+
+  ↓
+
+`vm` dialect (Virtual Machine for setting up and dispatching workloads)
+```
+
+See also https://google.github.io/iree/#project-architecture.
+
+#### Requirements for import dialects
+
+#### Requirements for `flow` dialect
+
+#### Requirements for `stream` dialect
+
+#### Requirements for code generation
+
+TODO: LLVM / SPIR-V emulation of types?
+
+#### Requirements for `hal` dialect
+
+The Hardware Abstraction Layer maps nearly directly to underlying hardware APIs
+such as Vulkan, Metal, and CUDA.
+
+* No tensor types. Buffers of primitives or explicitly supported opaque data
+  types.
+* Supported primitives vary per target backend and may be optionally available.
+  Generally expect int32 and float32 to be well supported for mobile to
+  desktop-scale devices and for lower or higher bit depth types (e.g. float16,
+  int64) to be optionally available. On embedded systems or certain
+  accelerators there may be no floating point support at all.
+
+#### Requirements for `vm` dialect
+
+IREE's Virtual Machine aims to be maximally portable, so it implements support
+for i64, f32, and f64 behind extensions. See
+[iree/base/config.h](https://github.com/google/iree/blob/main/iree/base/config.h)
+for the specifics of each extension.
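When a deployment target lacks native support for a wide type (for example, when the i64 extension is disabled), wide arithmetic must be emulated with narrower operations. A minimal sketch of the general technique — not IREE's actual lowering — emulating 64-bit unsigned addition with 32-bit halves:

```python
# 64-bit unsigned addition emulated with 32-bit operations, in the spirit of
# how a backend without native i64 support might lower wide arithmetic.
MASK32 = 0xFFFFFFFF

def split_u64(x):
    """Split a u64 value into (low, high) u32 halves."""
    return x & MASK32, (x >> 32) & MASK32

def add_u64_emulated(a, b):
    a_lo, a_hi = split_u64(a)
    b_lo, b_hi = split_u64(b)
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0          # carry out of the low half
    hi = (a_hi + b_hi + carry) & MASK32    # wraps like real u64 overflow
    return (hi << 32) | lo

assert add_u64_emulated(2**32 - 1, 1) == 2**32  # carry propagates
assert add_u64_emulated(2**64 - 1, 1) == 0      # wraps around like u64
```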
+
+### Strategies for converting between types
+
+#### Emulating
+
+#### Truncating / Demotion
+
+#### Extending / Promotion
+
+#### Packing
+
+TODO: pack i1 into i8/i32 (vectorization)
+
+## Shapes
+
+TODO: static vs dynamic
+TODO: ranked vs unranked
+TODO: shape inference, https://mlir.llvm.org/docs/ShapeInference/
+
+## Layouts and tiling
+
+TODO: dense vs sparse
+TODO: dispatch grids

From a2bdad589d8d75e277927cd8024f1af6620e230e Mon Sep 17 00:00:00 2001
From: Scott Todd
Date: Tue, 21 Sep 2021 12:57:12 -0700
Subject: [PATCH 2/3] Sketch out 'shapes' section, reorder hal/codegen

---
 .../design_docs/types_and_shapes.md | 125 ++++++++++++------
 1 file changed, 86 insertions(+), 39 deletions(-)

diff --git a/docs/developers/design_docs/types_and_shapes.md b/docs/developers/design_docs/types_and_shapes.md
index b80a35b039a2..e324ac4c1c08 100644
--- a/docs/developers/design_docs/types_and_shapes.md
+++ b/docs/developers/design_docs/types_and_shapes.md
@@ -1,5 +1,9 @@
 # Types and Shapes
 
+_This page gives background information on types and shapes, then outlines
+IREE's specific requirements at each layer of its systems. This is intended as
+a reference page for developers working on IREE and adjacent projects._
+
 IREE supports compiling programs from a variety of frontend frameworks to a
 number of backends and uses a collection of MLIR dialects and passes to connect
 each slice through the system. Each layer of the stack has its own views on
 data types and shapes.
@@ -47,7 +51,67 @@ Meanwhile, types are most constrained by hardware and device APIs, where only
 specific low level primitives are defined or where certain operations are
 supported by efficient hardware implementations.
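The "Packing" strategy noted above (the "pack i1 into i8/i32" TODO) can be sketched in Python. This is a hedged illustration of the general bit-packing technique, not IREE's implementation:

```python
def pack_i1_to_i8(bits):
    """Pack a list of 0/1 values into bytes, 8 booleans per i8, LSB first."""
    packed = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def unpack_i8_to_i1(packed, count):
    """Recover `count` 0/1 values from packed bytes."""
    return [(packed[i // 8] >> (i % 8)) & 1 for i in range(count)]

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1]
packed = pack_i1_to_i8(bits)
assert len(packed) == 2  # 9 i1 values fit in 2 bytes instead of 9
assert unpack_i8_to_i1(packed, len(bits)) == bits
```

Packing trades a small amount of shift/mask work for an 8x reduction in storage and memory bandwidth for boolean data.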
-### Conversion process
+### Strategies for converting between types
+
+When converting to a more constrained type system or targeting an interface
+where certain types come with execution latency, memory bandwidth, or
+representation clarity improvements, there are several strategies available for
+performing conversions.
+
+Note that each conversion generally loses some information, so care must be
+taken to preserve correct (or approximately correct, where that is acceptable)
+behavior.
+
+#### Emulation
+
+#### Truncation / Demotion
+
+#### Extension / Promotion
+
+#### Packing
+
+TODO: pack i1 into i8/i32 (vectorization)
+
+## Shapes
+
+Shapes can also be grouped in a few different ways:
+
+* Ranked (`[1, 2, ?]`) vs unranked (`[*]`)
+* Static (`[3, 4]`) vs dynamic (`[?, 4]`, `[3, ?]`)
+* Scalar (`i32`) vs 0 rank tensor (`tensor<i32>`) vs higher rank tensor
+  (`tensor<1x1xi32>`)
+
+IREE requires that shapes be ranked (known, fixed number of dimensions).
+
+IREE aims to fully support dynamic shapes (also see the
+[dynamic shapes sample](https://github.com/google/iree/tree/main/iree/samples/dynamic_shapes)),
+though historically static shapes have been most reliably supported. Note that
+for optimal performance, prefer to mark only slowly varying dimensions like
+batch index or timestamp (as opposed to inner dimensions like image
+x/y/channel) as dynamic.
+
+The process by which static shapes are deduced from dynamic shape dimensions is
+known as "shape inference". Program authors working in a high level framework
+will typically only specify the computation shapes at the edges of the program
+they are authoring directly, while the underlying framework will create many
+dynamically shaped operations in the middle. Shape inference runs prior to the
+bulk of IREE's core compilation and propagates these outer static shapes
+through the full program.
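The shape inference process described above can be illustrated with a toy forward propagation pass. This is illustrative only — real shape inference operates on MLIR operations, and `None` here stands in for a `?` dimension:

```python
# A toy forward shape-inference pass: starting from static shapes at the
# program edges, propagate shapes through a sequence of ops.
def infer_matmul(lhs, rhs):
    (m, k1), (k2, n) = lhs, rhs
    assert k1 is None or k2 is None or k1 == k2, "contraction dims must agree"
    return (m, n)

def infer_add(lhs, rhs):
    assert all(a is None or b is None or a == b for a, b in zip(lhs, rhs))
    # A dimension is static if either operand knows it.
    return tuple(a if a is not None else b for a, b in zip(lhs, rhs))

# The edges of the program are fully static...
input_shape = (8, 32)
weights_shape = (32, 16)

# ...so inference recovers static shapes for intermediate values too.
hidden = infer_matmul(input_shape, weights_shape)
assert hidden == (8, 16)
output = infer_add(hidden, (None, 16))  # the unknown dim is refined to 8
assert output == (8, 16)
```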
+
+As with any high efficiency compute programming model, IREE can benefit from
+programs using certain standard data dimensions/shapes. For example, compute
+kernels operating on `256x256` matrices are more likely to use system resources
+efficiently than those operating on `10000x3x9x17x3` tensors. Similarly, there
+is potential for partially constrained shapes to act as hints to the compiler,
+such as "dynamic but between 512 and 1024".
+
+## Layouts and tiling
+
+TODO: dense vs sparse
+
+TODO: dispatch grids
+
+## Conversion process
 
 IREE lowers programs from representations produced by high level frontends down
 to low level host code with scheduling logic and device code containing fused
@@ -55,12 +119,12 @@ kernels of dense computation. The phases of compilation can be segmented by
 which MLIR dialects are primarily being transformed:
 
 ```
-Frontends (PyTorch, JAX, TensorFlow, TOSA, etc.)
+frontends (PyTorch, JAX, TensorFlow, TOSA, etc.)
   * Includes user code, serialized ML models / programs, and other libraries
 
   ↓
 
-Import dialects (`iree`, `tensor`, `linalg`, etc.)
+import dialects (`iree`, `tensor`, `linalg`, etc.)
 
   ↓
 
@@ -68,31 +132,31 @@ Import dialects (`iree`, `tensor`, `linalg`, etc.)
 
   ↓
 
-  `stream` dialect                                 |  code generation
-  (device placement and asynchronous scheduling)   |  (SPIR-V, LLVM, etc.)
+`stream` dialect (device placement and asynchronous scheduling)
 
   ↓
 
 `hal` dialect (Hardware Abstraction Layer for buffer and execution management)
 
-  ↓
+  ↙ ↘
 
-`vm` dialect (Virtual Machine for setting up and dispatching workloads)
-```
+  host code generation       |  device code generation
+  (CPU, Vulkan API, etc.)    |  (x86 via LLVM, SPIR-V, etc.)
+
+  ↘ ↙
 
-See also https://google.github.io/iree/#project-architecture.
+`vm` dialect (Virtual Machine for dispatching workloads)
+```
 
-#### Requirements for import dialects
+See also https://google.github.io/iree/#project-architecture.
+### Requirements for import dialects
 
-#### Requirements for `flow` dialect
+### Requirements for the `flow` dialect
 
-#### Requirements for `stream` dialect
+### Requirements for the `stream` dialect
 
-#### Requirements for code generation
+### Requirements for the `hal` dialect
 
-TODO: LLVM / SPIR-V emulation of types?
 
-#### Requirements for `hal` dialect
 The Hardware Abstraction Layer maps nearly directly to underlying hardware APIs
 such as Vulkan, Metal, and CUDA.
@@ -105,32 +169,15 @@ such as Vulkan, Metal, and CUDA.
 int64) to be optionally available. On embedded systems or certain
 accelerators there may be no floating point support at all.
 
-#### Requirements for `vm` dialect
+#### Requirements for host code generation
+
+#### Requirements for device code generation
+
+TODO: LLVM / SPIR-V emulation of types?
+
+### Requirements for the `vm` dialect
 
 IREE's Virtual Machine aims to be maximally portable, so it implements support
 for i64, f32, and f64 behind extensions. See
 [iree/base/config.h](https://github.com/google/iree/blob/main/iree/base/config.h)
 for the specifics of each extension.
-
-### Strategies for converting between types
-
-#### Emulating
-
-#### Truncating / Demotion
-
-#### Extending / Promotion
-
-#### Packing
-
-TODO: pack i1 into i8/i32 (vectorization)
-
-## Shapes
-
-TODO: static vs dynamic
-TODO: ranked vs unranked
-TODO: shape inference, https://mlir.llvm.org/docs/ShapeInference/
-
-## Layouts and tiling
-
-TODO: dense vs sparse
-TODO: dispatch grids

From 95569acd71d74b878fb02381a141070471f45199 Mon Sep 17 00:00:00 2001
From: Scott Todd
Date: Wed, 22 Sep 2021 17:30:43 -0700
Subject: [PATCH 3/3] Replace 'iree' with 'standard' in list of example input
 dialects.
---
 docs/developers/design_docs/types_and_shapes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/developers/design_docs/types_and_shapes.md b/docs/developers/design_docs/types_and_shapes.md
index e324ac4c1c08..a22e03288ccb 100644
--- a/docs/developers/design_docs/types_and_shapes.md
+++ b/docs/developers/design_docs/types_and_shapes.md
@@ -124,7 +124,7 @@ frontends (PyTorch, JAX, TensorFlow, TOSA, etc.)
 
   ↓
 
-import dialects (`iree`, `tensor`, `linalg`, etc.)
+import dialects (`standard`, `tensor`, `linalg`, etc.)
 
   ↓