v1.6-beta.0
Pre-release
New Features
- Implemented SG2260 structureOp interface and structured transform, including a solver for finding transforms【ea234bc2†source】.
- Added OneHot converter and support for fp8 in the debugger【c03ba46c†source】【f87127bd†source】【fed7e68a†source】.
- Added MatMulOp support for the special case of broadcasting in batch dims, and added an interface for attention【90d4b327†source】【044c4fc3†source】.
- Provided "decompose linalg op" and "tile+fuse" passes; MatMul parallel now supports more batch patterns【25f24e3d†source】.
- Added a Unet single-block test【ea76f9c9†source】.
- Implemented fp8 support for MatMul and other ops, including addconst, subconst, mul, add, sub, and abs【e09adbda†source】【7eaec57f†source】.
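
For readers unfamiliar with broadcasting in MatMul batch dims, here is a minimal sketch of how the output shape of a batched matmul could be derived. It assumes NumPy-style broadcasting rules, which these notes do not confirm as the exact set of patterns MatMulOp accepts. The function name `matmul_output_shape` is hypothetical and is not part of the project's API.

```python
from itertools import zip_longest


def matmul_output_shape(lhs, rhs):
    """Output shape of a batched matmul, assuming NumPy-style batch broadcasting.

    Hypothetical helper for illustration only; not a project API.
    `lhs` and `rhs` are shape tuples of the form (*batch, M, K) and (*batch, K, N).
    """
    if len(lhs) < 2 or len(rhs) < 2:
        raise ValueError("both operands need at least 2 dims")
    *lhs_batch, m, k = lhs
    *rhs_batch, k2, n = rhs
    if k != k2:
        raise ValueError(f"inner dims differ: {k} vs {k2}")

    # Align batch dims from the right; a missing dim behaves like size 1.
    batch = []
    for a, b in zip_longest(reversed(lhs_batch), reversed(rhs_batch), fillvalue=1):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"batch dims not broadcastable: {a} vs {b}")
        batch.append(b if a == 1 else a)
    return tuple(reversed(batch)) + (m, n)


# lhs batch dims (8, 1) broadcast against rhs batch dims (12,)
print(matmul_output_shape((8, 1, 64, 32), (12, 32, 16)))
```

Under these assumed rules, a size-1 batch dim on either operand is stretched to match the other operand, and any other mismatch is rejected.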
Performance Improvements
- Improved MatMul fp8 performance with new backend support【2b8dd03b†source】.
- Enabled distributed MLP and attention, and improved handling of cascade_net input/output names and order【d5a42d7a†source】.
- Refactored tdb to improve disassembler serialization and resolve a BM1688 decoding issue【e73450f8†source】【1457df29†source】.
- Improved weight reorder for ConvOp and optimized permute of attention matmul【a9045c3c†source】【91a353e3†source】.
Bug Fixes
- Resolved various bugs in MatMul, Conv, and other ops across multiple chipsets including SG2260, BM1688, and CV18xx【b809a8c1†source】【bfada4de†source】【9804e30c†source】.
- Fixed bugs related to ReduceOp, ArgOp, SliceOp, and others for better operation and tensor handling【2cdeb60d†source】【bbacf00f†source】.
- Addressed issues in SAM, the daily test, and tdb related to core operations and functionality【83e1979c†source】【7c37e39d†source】.
- Fixed memory and data handling bugs for more accurate and stable operation of the models【2310cd8d†source】【0ed60f1f†source】.
Documentation Updates
- Updated documentation to remove sensitive words and improve clarity and comprehensiveness【43e0b428†source】【5d6c49fc†source】.
Miscellaneous
- Enhanced various backend libraries and added support for new ops and patterns for more efficient and versatile model handling【1ca95d71†source】【8f1a2de8†source】.
- Improved scatterE and reduce dynamic shape_value handling for better model optimization【fa2ccf29†source】.
- Refined graph optimization, permute parallel indexMapping, and related areas for improved model processing【094f05da†source】【1ec6c16b†source】.