David Tanner edited this page Jan 19, 2018 · 66 revisions

Tensile is a tool for creating a benchmark-driven backend library for GEMMs, GEMM-like problems (such as batched GEMM), N-dimensional tensor contractions, and anything else that multiplies two multi-dimensional objects together on a GPU.

Overview for creating a custom TensileLib backend library for your application:

  1. Install Tensile (optional), or at least install the PyYAML dependency (mandatory).
  2. Create a benchmark config.yaml file.
  3. Run the benchmark to produce a library logic.yaml file.
  4. Add the Tensile library to your application's CMake target. The Tensile library will be written, compiled, and linked to your application at application compile time. The library consists of:
    • GPU kernels, written in HIP or OpenCL.
    • Solution classes which enqueue the kernels.
    • APIs which call the fastest solution for a problem.
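Since PyYAML is the one mandatory dependency in step 1, it can be worth confirming that Python can import it before running a benchmark. The following is a minimal sketch, not part of Tensile itself: the function name `check_pyyaml` and the `PYTHON` override variable are hypothetical names chosen for illustration.

```shell
# Hypothetical helper (not part of Tensile): report whether the PyYAML
# module can be imported by the chosen Python interpreter.
# Assumption: the interpreter is called "python" unless PYTHON is set.
check_pyyaml() {
    if "${PYTHON:-python}" -c 'import yaml' 2>/dev/null; then
        echo "PyYAML found"
    else
        echo "PyYAML missing"
        return 1
    fi
}
```

Calling `check_pyyaml` before the benchmark step lets a script stop early with a clear message instead of failing partway through.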

Quick Example:

sudo apt-get install python-yaml
mkdir Tensile
cd Tensile
git clone https://github.com/RadeonOpenCompute/Tensile.git repo
mkdir build
cd build
python ../repo/Tensile/Tensile.py ../repo/Tensile/Configs/test_sgemm.yaml ./

After benchmarking for a while, Tensile prints the path to the client, which you can then run:

./4_LibraryClient/build/client -h
./4_LibraryClient/build/client --sizes 5760 5760 5760
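To put the client's timings in context, it can help to know roughly how much arithmetic a problem of that size involves. The sketch below assumes that the three values passed to `--sizes` are the M, N, and K dimensions of a single GEMM and counts each multiply-add as two floating-point operations; the helper name `gemm_flops` is hypothetical and not part of Tensile.

```shell
# Hypothetical helper (not part of Tensile): approximate floating-point
# operation count for one GEMM call.
# Assumption: the arguments are the M, N, and K dimensions, and each
# multiply-add counts as 2 operations, giving 2 * M * N * K.
gemm_flops() {
    echo $(( 2 * $1 * $2 * $3 ))
}

gemm_flops 5760 5760 5760   # prints 382205952000
```

Dividing this count by a measured kernel time in seconds gives an estimate of the achieved FLOP rate.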