BZip2 compression fails with BZ_OUTBUFF_FULL error on small data sizes #4388

Open
dqwu opened this issue Oct 31, 2024 · 1 comment

dqwu commented Oct 31, 2024

Issue Description

With the ADIOS2 2.10.1 release and BZip2 compression enabled, compressing a very small dataset (in this case, an array of 9 int32_t values, 36 bytes) fails with a runtime error:
[ADIOS2 ERROR] <Helper> <adiosSystem> <ExceptionToError> : adios2_put: [ADIOS2 EXCEPTION] <Operator> <CompressBZIP2> <CheckStatus> : BZ_OUTBUFF_FULL BZIP2 detected size of compressed data is larger than destination length in call to ADIOS2 BZIP2 Compress batch 0

This error is unexpected: a smaller input should not make the output buffer harder to size. Setting DATA_LEN to 10 or higher avoids the error.

Expected Behavior

The library should compress small inputs without raising BZ_OUTBUFF_FULL, or at least provide a more descriptive error message that states the limitations of BZip2 compression for small datasets.

Proposed Solution

To improve usability, the ADIOS2 library could:

  • Adjust internal buffer allocations to handle small data sizes with BZip2.
  • Implement a pre-check on data size, recommending a minimum size for BZip2 compression, or handle the compression gracefully at small sizes by padding or using an alternative compression approach.
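
As a rough illustration of the first option, the bzip2 manual advises that, for BZ2_bzBuffToBuffCompress, the output buffer should be 1% larger than the uncompressed data plus six hundred extra bytes to guarantee the compressed data fits. The sketch below computes that bound and checks a destination capacity against it. The names bzip2_dest_bound and bzip2_dest_is_safe are hypothetical helpers made up for this example; they are not part of libbz2 or ADIOS2, and this is not a claim about how ADIOS2 currently sizes its buffers.

```c
#include <stddef.h>

/* Hypothetical helper (not libbz2 or ADIOS2 API): worst-case destination
 * size following the bzip2 manual's guidance of "1% larger than the
 * uncompressed data, plus six hundred extra bytes". */
size_t bzip2_dest_bound(size_t src_len)
{
    /* Round the 1% term up so tiny inputs still get one extra byte. */
    size_t one_percent = src_len / 100 + (src_len % 100 != 0);
    return src_len + one_percent + 600;
}

/* Hypothetical pre-check: nonzero if dest_cap meets the bound above. */
int bzip2_dest_is_safe(size_t src_len, size_t dest_cap)
{
    return dest_cap >= bzip2_dest_bound(src_len);
}
```

For the 36-byte input in this report the bound is 637 bytes, so a destination buffer no larger than the input itself would fail this check. Whether that is what happens inside the ADIOS2 operator is an assumption and would need to be confirmed against the source.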

Test Case

The following code reproduces the issue. When DATA_LEN is set to 9, it triggers the error, whereas setting it to 10 works as expected.

#include <adios2_c.h>

#define DATA_LEN 9

int main(int argc, char *argv[])
{
    adios2_adios *adios = adios2_init_serial();
    adios2_io *bpIO = adios2_declare_io(adios, "BP5WriterWithComp");
    adios2_set_engine(bpIO, "BP5");

    adios2_engine *bpWriter = adios2_open(bpIO, "BZip2_compression.bp", adios2_mode_write);

    size_t count = DATA_LEN;
    adios2_variable *var = adios2_define_variable(bpIO, "data_with_comp", adios2_type_int32_t, 1, NULL, NULL, &count, adios2_constant_dims_true);

    adios2_operator *op = adios2_define_operator(adios, "BZip2Lossless", "bzip2");

    size_t operation_index;
    adios2_add_operation(&operation_index, var, op, "", "");

    int data[DATA_LEN] = {0};
    adios2_put(bpWriter, var, data, adios2_mode_sync);

    adios2_close(bpWriter);
    adios2_finalize(adios);

    return 0;
}

dqwu commented Oct 31, 2024

@pnorbert Could you please take a look at this or assign it to someone familiar with BZip2 in ADIOS? Thanks.
