Skip to content

Comments

PARQUET-152: Add validation on Encoding.DELTA_BYTE_ARRAY to allow FIX…#225

Closed
spena wants to merge 1 commit intoapache:masterfrom
spena:parquet-152
Closed

PARQUET-152: Add validation on Encoding.DELTA_BYTE_ARRAY to allow FIX…#225
spena wants to merge 1 commit intoapache:masterfrom
spena:parquet-152

Conversation

@spena
Copy link

@spena spena commented Jun 23, 2015

PARQUET-152: Add validation on Encoding.DELTA_BYTE_ARRAY to allow FIXED_LEN_BYTE_ARRAY types.

  • FIXED_LEN_BYTE_ARRAY types are binary values that may use DELTA_BYTE_ARRAY encoding,
    so they should be allowed to be decoded using the same DELTA_BYTE_ARRAY encoding.

@rdblue @nezihyigitbasi Could you review this fix?

I executed a test by writing a file that fall backs to DELTA_BYTE_ARRAY encoding, then read the file, and compare the read values with the written values, and it worked fine.

…ED_LEN_BYTE_ARRAY types.

  * FIXED_LEN_BYTE_ARRAY types are binary values thay may use DELTA_BYTE_ARRAY encoding,
    so they should be allowed to be decoded using the same DELTA_BYTE_ARRAY encoding.
@nezihyigitbasi
Copy link
Contributor

@spena this problem was previously causing failures during read benchmarks, I verified that this change at least makes the benchmarks go through.
@rdblue are there any other types that can use the DELTA_BYTE_ARRAY encoding?

@spena
Copy link
Author

spena commented Jun 23, 2015

ParquetProperties.writerToFallbackTo() shows that only BINARY and FIXED_LEN_BYTE_ARRAY are supported for DELTA_BYTE_ARRAY:

...
case BOOLEAN:
        return new RunLengthBitPackingHybridValuesWriter(1, initialSizePerCol, pageSize);
case BINARY:
case FIXED_LEN_BYTE_ARRAY:
        return new DeltaByteArrayWriter(initialSizePerCol, pageSize);
case INT32:
...

@rdblue
Copy link
Contributor

rdblue commented Jun 27, 2015

Looks good to me. I don't think anything else should use DELTA_BYTE_ARRAY.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants