Use SHL

English | 简体中文

SHL (Structure of Heterogeneous Library, Chinese name: ShiHulan) is a high-performance heterogeneous computing library provided by XuanTie. Its interface is CSI-NN2, the XuanTie neural network library API for the XuanTie CPU platform, and it ships with a series of optimized binary libraries.

Features for SHL:

Reference implementation in C
Assembly-optimized implementation for XuanTie CPU
Supports symmetric and asymmetric quantization
Supports 8-bit, 16-bit, and f16 data types
Compatible with NCHW and NHWC formats
HHB can be used to call the API automatically
Covers different architectures, such as CPU and NPU
Reference heterogeneous schedule implementation
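
To make the difference between the symmetric and asymmetric quantization mentioned in the feature list concrete, here is a minimal sketch in plain Python. It illustrates the general technique only and is not SHL's implementation; the function names, the per-tensor range selection, and the rounding mode are assumptions made for this example.

```python
def quantize_symmetric(values, num_bits=8):
    """Symmetric: zero point is fixed at 0, scale comes from the largest magnitude."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8-bit signed
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale, 0


def quantize_asymmetric(values, num_bits=8):
    """Asymmetric: scale and zero point map [min, max] onto [0, 2^bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1        # 0..255 for 8-bit unsigned
    lo = min(min(values), 0.0)               # keep 0.0 inside the range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate real values."""
    return [(x - zero_point) * scale for x in q]
```

The symmetric variant wastes part of the integer range when the data is skewed (for example, all-positive activations), while the asymmetric variant uses the full range at the cost of carrying a zero point through the arithmetic.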

In principle, SHL only provides the reference implementation for the XuanTie CPU platform; optimization for each NPU target platform is done by the vendor of that platform.

Use SHL

Installation

Official Python packages

SHL release packages are published on PyPI and can be installed together with hhb.

pip3 install hhb

The binary library is installed at /usr/local/lib/python3.6/dist-packages/tvm/install_nn2/
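
The path above is tied to one Python version and install prefix, so it may differ on your system. The following sketch looks for a `tvm/install_nn2` directory under the package directories of the running interpreter. The helper name `find_install_nn2` is hypothetical, and the assumption that the directory always sits under `tvm/install_nn2` is taken from the path shown above.

```python
import site
from pathlib import Path


def find_install_nn2(search_dirs=None):
    """Return every existing <dir>/tvm/install_nn2 under the given package dirs."""
    if search_dirs is None:
        # Package directories of the interpreter running this script.
        search_dirs = list(site.getsitepackages()) + [site.getusersitepackages()]
    found = []
    for base in search_dirs:
        candidate = Path(base) / "tvm" / "install_nn2"
        if candidate.is_dir():
            found.append(candidate)
    return found


if __name__ == "__main__":
    for path in find_install_nn2():
        print(path)
```

If nothing is printed, hhb is probably installed for a different interpreter than the one running the script.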

Build SHL from Source

The following example builds the C906 library.

First install the XuanTie RISC-V GCC 2.6 toolchain, which is available from XuanTie OCC: download it, decompress it, and add its bin directory to the PATH environment variable.

wget https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1663142514282/Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz
tar xf Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz
export PATH=${PWD}/Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1/bin:$PATH
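
Before building, it can help to confirm that the cross compiler is actually reachable through PATH. The sketch below uses the standard library's `shutil.which` for that check. The compiler name `riscv64-unknown-linux-gnu-gcc` is an assumption; look in the toolchain's bin directory for the real prefix.

```python
import shutil

# Assumed compiler name; check the unpacked toolchain's bin/ directory for the actual one.
ASSUMED_GCC = "riscv64-unknown-linux-gnu-gcc"


def find_toolchain(compiler=ASSUMED_GCC, path=None):
    """Return the full path of the compiler if it is reachable via PATH, else None."""
    return shutil.which(compiler, path=path)


if __name__ == "__main__":
    location = find_toolchain()
    if location is None:
        print("compiler not found on PATH; re-check the export PATH step")
    else:
        print("using", location)
```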

Download the source code

git clone https://github.com/XUANTIE-RV/csi-nn2.git

Compile for C906

cd csi-nn2
make nn2_c906

Install the C906 library

make install_nn2

Quick Start Example

The following example runs mobilenetv1 on XuanTie C906. It shows how to call the SHL API to run inference on the whole model.

Compile command:

cd example
make c906_m1_f16

c906_mobilenetv1_f16.elf is generated when the build completes. Copy it to a development board with a C906 CPU (such as D1) and execute:

./c906_mobilenetv1_f16.elf

NOTE: In the original mobilenetv1, every conv2d is followed by a BN (batch norm) layer, but this example assumes BN has already been fused into conv2d. For how to use deployment tools to fuse BN and emit the correct float16 weight values, refer to HHB.
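
For readers unfamiliar with BN fusion, the arithmetic behind the note can be sketched as follows. For each output channel, BN computes gamma * (y - mean) / sqrt(var + eps) + beta on the conv2d output y, so scaling the weights by k = gamma / sqrt(var + eps) and setting the bias to (bias - mean) * k + beta gives the same result in a single conv2d. This is a generic illustration, not the code HHB runs; the function names, the flat per-channel weight layout, and the `eps` default are assumptions for this example.

```python
import math
import struct


def fuse_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-output-channel BN parameters into conv2d weights and bias.

    weights[c] is the flat list of kernel values for output channel c.
    """
    fused_w, fused_b = [], []
    for c in range(len(weights)):
        k = gamma[c] / math.sqrt(var[c] + eps)   # per-channel BN scale
        fused_w.append([w * k for w in weights[c]])
        fused_b.append((bias[c] - mean[c]) * k + beta[c])
    return fused_w, fused_b


def to_float16_bytes(values):
    """Pack values as little-endian IEEE 754 half-precision (struct format 'e')."""
    return struct.pack("<%de" % len(values), *values)
```

If a conv2d layer has no bias, pass zeros for `bias`; the fused layer then gains a bias term from the BN parameters.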

Resources

Acknowledgement

SHL refers to the following projects:
