Functionality
- Enhanced the Benchmarks Framework to incorporate the new QPL device selection mechanism introduced in the previous release.
- Saved intermediate job states in the dynamic Deflate job to prevent duplicate work when executing with the asynchronous API on the Hardware Path and encountering the QPL_STS_QUEUES_ARE_BUSY_ERR error. In such cases, the job is resubmitted without repeating the already completed work. In the v1.6.0 release, this functionality was enabled with the synchronous API.
- [experimental feature] Added a mechanism to measure Intel IAA execution time in a single-threaded application with the synchronous API.
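The resubmission item in the list above concerns what happens when a submission is rejected because the hardware queues are busy. A minimal, self-contained sketch of the caller-side retry pattern follows. It does not link against Intel QPL: `fake_job`, `fake_submit`, `submit_with_retry`, and the `FAKE_STS_*` codes are hypothetical stand-ins for illustration only, and the retry limit is an arbitrary assumption. In a real application the submit call would be qpl_submit_job and the busy status would be QPL_STS_QUEUES_ARE_BUSY_ERR.

```c
/* Hypothetical stand-ins for library status codes; not the real QPL values. */
typedef enum { FAKE_STS_OK = 0, FAKE_STS_QUEUES_ARE_BUSY = 1 } fake_status;

/* Hypothetical job: tracks how many submissions get rejected before one succeeds. */
typedef struct {
    int busy_rejections_left;
    int submit_calls;
} fake_job;

/* Simulated submit: reports "queues busy" until the rejections run out. */
static fake_status fake_submit(fake_job* job) {
    job->submit_calls++;
    if (job->busy_rejections_left > 0) {
        job->busy_rejections_left--;
        return FAKE_STS_QUEUES_ARE_BUSY;
    }
    return FAKE_STS_OK;
}

/* Resubmit the same job while the queues are busy, up to max_attempts times. */
static fake_status submit_with_retry(fake_job* job, int max_attempts) {
    fake_status status = FAKE_STS_QUEUES_ARE_BUSY;
    int attempt;
    for (attempt = 0; attempt < max_attempts && status == FAKE_STS_QUEUES_ARE_BUSY; attempt++) {
        status = fake_submit(job);
    }
    return status;
}
```

With a job that is rejected twice, `submit_with_retry(&job, 5)` makes three submit calls and returns the OK status. With a limit lower than the number of rejections, the busy status is handed back to the caller, which can then decide whether to wait or fall back.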
Usability and Documentation
- Introduced a clang-format configuration file and formatted the entire codebase using clang-format 17.
- Improved C++ compatibility by fixing field order mismatches when creating structures, initializing char* strings with literals, and removing unnecessary conversions between integers and enums.
- Added a documentation note clarifying that QPL testing with datasets provided under tools/testdata requires a maximum transfer size of 2GB to avoid the QPL_STS_TRANSFER_SIZE_INVALID error code.
- Updated documentation on the -DEFFICIENT_WAIT build option.
- Enhanced the Introduction section of the QPL documentation, including adding useful links for the Intel® In-Memory Analytics Accelerator.
- Extended testing to generate stored block insertion on the last job.
- Made multiple updates to documentation and examples on qpl_get_safe_deflate_compression_buffer size usage for multi-chunk compression.
- Improved distance code computation logic on the Software Path.
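The C++ compatibility item in the list above names three kinds of fixes. The sketch below shows the general patterns in C that also compiles as C++20. It is not taken from the QPL sources: `example_mode`, `example_config`, `mode_from_int`, and `make_default_config` are hypothetical names used only for illustration.

```c
/* Hypothetical enum and struct, used only for illustration. */
typedef enum { MODE_FAST = 0, MODE_HIGH = 1 } example_mode;

typedef struct {
    const char*  name;  /* const char* can point at a string literal in both C and C++ */
    example_mode mode;
    unsigned int level;
} example_config;

/* Explicit cast: C converts an int to an enum implicitly, C++ requires the cast. */
static example_mode mode_from_int(int raw) {
    return (example_mode)(raw != 0);
}

static example_config make_default_config(void) {
    /* Initializers follow the declaration order (name, mode, level).
       C accepts any order, but C++20 designated initializers must match it. */
    example_config config = {.name = "default", .mode = MODE_FAST, .level = 1U};
    return config;
}
```

Keeping initializers in declaration order and making enum conversions explicit costs nothing in C and lets the same code be compiled by a C++ compiler without warnings or errors.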
Deprecated Functionality
- The Force Array Output Modification Feature has been deprecated on the Auto Path due to the lack of host fallback support. Use the Hardware Path instead.
Bug Fixes
- Resolved build issues with Clang-17 caused by a missing header.
- Corrected logic in qpl_check_job to prevent unintended host fallback instead of accelerator execution.
- Fixed the compression verification step on the asynchronous path when a stored block occurs.
- Implemented multiple fixes for the stored block insertion feature on both asynchronous and synchronous paths.
- Prevented reprocessing when qpl_check_job or qpl_wait_job is called after submission.
- Implemented multiple fixes for issues with index compression/decompression.
- Fixed intermediate buffer incrementing for the select operation.
- Initialized the intermediate Huffman table structure correctly to avoid garbage in the Huffman table.
- Implemented creation of the mapping CAM Huffman decompression table.
- Resolved the issue of never setting the accelerator context on the Auto Path.
- Introduced immediate fallback to host execution for the specific case of Huffman-only BE16 decompression on the Auto Path.
Known Limitations
- Intel(R) QPL can be built from directly downloadable files (.tar, .tgz) without the tests and benchmarks frameworks by using the -DQPL_BUILD_TESTS=OFF build option. This is because these frameworks require submodules that are not included in the archives by GitHub* during release creation.
- Known test failures are listed below. Some tests fail only under certain conditions, which are noted in parentheses.
  - Functional tests:
    - (software_path, auto_path only on platforms without IAA) ta_c_api_deflate_stateful.{dynamic/fixed/static}_default_verify
    - (software_path, auto_path) ta_c_api_deflate_stateful.{dynamic/fixed/static}_high_verify
- Compression verification on qpl_path_software works only in indexing mode or, in other modes, with data smaller than 32KB.
- Inflate does not report the error code QPL_STS_BIG_HEADER_ERR when a header is too big to fit in the input buffer.
- The implementation of QPL_FLAG_CRC32C is in progress.
- When using qpl_path_hardware, compression and decompression with indexing mode on IAA 2.0 are limited to data sizes smaller than 4KB.
- The Force Array Output Modification feature is enabled only for qpl_path_hardware and Intel IAA 2.0 (and later). In the case of qpl_path_auto, the error code QPL_STS_NOT_SUPPORTED is returned because no fallback is currently available.
Thanks to the Contributors
The release includes contributions from the project team and @fwph, @Permanence-AI-Coder.