Releases · sophgo/tpu-mlir
v1.14-beta.0
fix bug in build ppl Change-Id: Ib93341da7fa6b420f9fb9cd9e4b61dc21aeaf001
v1.13
add a16 matmul multi_core Change-Id: I10a9097ee52e324555f4a505ce18d7fe9b665803
v1.13-beta.0
[doc] layergroup opt intro Change-Id: I0797b73e4d020e9556da29d1c1a743b8c80a83ad
v1.12
Features
- Support for backend operators implemented using PPL.
- TPUv7-runtime CModel integrated with TPU-MLIR for CModel inference of BM1690 models.
- Optimized inference speed for BM1690 Stable Diffusion 3.0 at 512 resolution to 0.72 img/s (MAC utilization: 41.9%).
- Support for training graph compilation of ResNet50-v1 through FxGraphConverter (see the sketch after this list).
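As a rough, hypothetical illustration of the kind of graph this feature consumes (not the TPU-MLIR API itself), the sketch below captures ResNet50-v1 as a torch.fx graph through a custom torch.compile backend. The backend name `inspect_backend` and the hand-off are made up for illustration; the actual FxGraphConverter entry point may differ.

```python
import torch
import torchvision

def inspect_backend(gm: torch.fx.GraphModule, example_inputs):
    # A converter such as FxGraphConverter would receive a graph like this one.
    # Here we only print it and fall back to eager execution.
    gm.graph.print_tabular()
    return gm.forward

model = torchvision.models.resnet50()
compiled = torch.compile(model, backend=inspect_backend)
out = compiled(torch.randn(1, 3, 224, 224))
out.sum().backward()  # forward + backward, as a training step would run
```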
Bug Fixes
- Performance: Fixed a performance regression in SegNet.
- Functionality: Resolved the compilation comparison issue for BM1688 DeepLabv3P.
Known Issues
- Performance: Slight performance degradation observed for BM1690 YOLOv5-6 with batch size 4 (INT8) on eight cores.
v1.12-beta.0
combine slice and concate to new Rope ConcatToRope Change-Id: Ib15b12fe97117b96c6fe7267c96c3f714aac6ec4
v1.11
[python] distinguish data path model-zoo from regression Change-Id: I98fa0df1524f0b38d91cda02ab5d49876f7caee8 (cherry picked from commit fa082d0b29df8a82af77839df86349aabab86949)
v1.11-beta.0
[soc_dump] add doc Change-Id: Icaf313113415a9bf0ad9c75abdcb609d661c815b
TPU-MLIR v1.10 Release
Release Note
Enhancements:
- Added CUDA support for operations including conv2d, MatMul, dwconv, and pool2d.
- Improved performance for operations such as MeanStdScale and softmax.
- Enhanced multi-core batch matmul and added CUDA support for bm168x.
- Refined CUDA code style and adjusted interfaces for various operations.
Bug Fixes:
- Fixed matmul issues, calibration failures, conv pad problems, and several performance regressions.
- Addressed bugs in model transformation, calibration, and various rewrite patterns.
- Resolved bugs affecting models such as ssd, vit, detr, and yolov5.
New Features:
- Added support for new models such as resnet50, mobilenet_v2, shufflenet_v2, and yolox_s/alphapose_res50.
- Introduced new operations such as RequantIntAxisOp and Depth2Space with CUDA support.
- Implemented additional functionality to improve model inference and compilation.
Documentation Updates:
- Updated weight.md, calibration sections, and user interface details.
- Improved documentation for quick start, developer manual, and various tpulang interfaces.
- Enhanced documentation for model transformation parameters and tensor data arrangements (an example invocation follows this list).
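For context on those parameters, here is a minimal sketch of a typical model_transform.py invocation, wrapped in Python for consistency. The file names and preprocessing values are hypothetical, and the flag names follow the quick-start guide as recalled; verify them against the documentation that ships with this release.

```python
import subprocess

# Hypothetical model and image files; flag names follow the quick-start guide
# and should be double-checked against the current user manual.
subprocess.run(
    [
        "model_transform.py",
        "--model_name", "resnet50",
        "--model_def", "resnet50.onnx",
        "--input_shapes", "[[1,3,224,224]]",
        "--mean", "123.675,116.28,103.53",
        "--scale", "0.0171,0.0175,0.0174",
        "--pixel_format", "rgb",
        "--test_input", "dog.jpg",
        "--test_result", "resnet50_top_outputs.npz",
        "--mlir", "resnet50.mlir",
    ],
    check=True,
)
```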
Miscellaneous:
- Added new npz tools, model-zoo regression, and support for bmodel encryption (a comparison sketch follows this list).
- Fixed issues with model performance, shape inference, and CUDA backend optimizations.
- Restored performance for models such as yolov5s-6 and bm1690 swin multicore.
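The kind of comparison the npz tools perform can be pictured with plain NumPy. The sketch below is a stand-alone illustration with hypothetical file names, not the tools' actual implementation.

```python
import numpy as np

# Hypothetical dump files; tpu-mlir's npz tools perform a similar per-tensor check.
ref = np.load("resnet50_top_outputs.npz")
got = np.load("resnet50_bm1684x_int8_outputs.npz")

for name in ref.files:
    if name not in got.files:
        print(f"{name}: missing in target dump")
        continue
    a, b = ref[name].astype(np.float32), got[name].astype(np.float32)
    # Cosine similarity is a common tolerance metric for quantized outputs.
    cos = float(np.dot(a.ravel(), b.ravel()) /
                (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    print(f"{name}: cosine similarity = {cos:.4f}")
```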
TPU-MLIR v1.9 Release
Release Note
Enhancements:
- Implemented output-order preservation in the ONNX, Caffe, Torch, and TFLite converters (a quick check is sketched after this list).
- Added support for resnet50-v2 bm1690 f8 regression.
- Improved ILP group mlir file sequences for resnet50 training.
- Updated chip libraries and PerfAI for A2 profiling.
- Added a new dump mode "COMB" and refined abs/relu conversions.
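As an illustration of what output-order preservation means, the snippet below reads the declared output order from an ONNX model with the onnx package. The file name is hypothetical; this is a user-side sanity check, not how the converters implement the guarantee.

```python
import onnx

# Hypothetical file name; after conversion, the compiled model's outputs are
# expected to appear in this same declared order.
model = onnx.load("resnet50.onnx")
print([output.name for output in model.graph.output])
```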
Bug Fixes:
- Fixed preprocessing issues when the source layout differs from the target layout.
- Addressed bugs in operations such as softmax and concat, and in conv2d weight reordering.
- Resolved bugs in model training, model transformation, and several rewrite patterns.
- Fixed bugs related to CUDA inference, matmul with bias, and multi-output calibration.
New Features:
- Added support for multi-graph in TPULang.
- Introduced new options in TPULang for inference and model deployment.
- Implemented various optimizations and enhancements for dynamic operations and model transformations.
Documentation Updates:
- Refined the quick-start quantization and user-interface sections of the documentation.
- Updated backend information, docker image download methods, and model deployment details in the documentation.
Miscellaneous:
- Improved performance for models such as vit and yolov5s, and on bm1690.
- Introduced new functionality such as embedding multi-device slice and groupnorm training operations.
- Added support for adaptive_avgpool inference and multiple Einsum modes (see the einsum examples after this list).
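To make "multiple Einsum modes" concrete, the examples below show a few common equation forms (batched matmul, outer product, attention-style contraction) in NumPy. They are generic illustrations, not the exact set of modes the compiler recognizes.

```python
import numpy as np

a = np.random.rand(2, 3, 4)   # (batch, m, k)
b = np.random.rand(2, 4, 5)   # (batch, k, n)

batched_mm = np.einsum("bij,bjk->bik", a, b)                        # batched matmul
outer      = np.einsum("i,j->ij", np.arange(3.0), np.arange(4.0))   # outer product

q = np.random.rand(2, 8, 6, 16)   # (batch, heads, seq, dim)
k = np.random.rand(2, 8, 6, 16)
scores = np.einsum("bhqd,bhkd->bhqk", q, k)                         # attention-style contraction

print(batched_mm.shape, outer.shape, scores.shape)  # (2, 3, 5) (3, 4) (2, 8, 6, 6)
```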
TPU-MLIR v1.8.1
Full Changelog: v1.8...v1.8.1