Apache Iceberg version
No response
Query engine
No response
Please describe the bug 🐞
The logic in https://github.com/apache/iceberg/blob/master/python/pyiceberg/avro/decoder.py#L70 appears to be incorrect.
The spec for a binary-encoded int in the manifest files is as follows:
int | Stored as 4-byte little-endian
So an example byte string of 0xad4a0000 should be read as decimal 19117:
- lsb: 0xad is 173
- 2nd lsb: 0x4a is 74
- (74 * 256) + 173 == 19117
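The arithmetic above can be checked with the Python standard library alone. This is a minimal sketch, assuming the 4 bytes really are a signed little-endian int as the quoted spec row says; the variable names are only illustrative.

```python
import struct

raw = bytes.fromhex("ad4a0000")

# Manual expansion: the first byte is the least significant one.
manual = raw[0] + (raw[1] << 8) + (raw[2] << 16) + (raw[3] << 24)

# The same value via the standard library.
via_int = int.from_bytes(raw, byteorder="little", signed=True)
(via_struct,) = struct.unpack("<i", raw)

print(manual, via_int, via_struct)  # 19117 19117 19117
```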
However, BinaryDecoder does not return that value:
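import io
from pyiceberg.avro.decoder import BinaryDecoder

def as_fo(x):
    # Wrap a hex string in an in-memory binary file object.
    return io.BytesIO(bytes.fromhex(x))

assert as_fo('ad4a0000').read(4).hex() == 'ad4a0000'
# Expected 19117 per the spec row above, but read_int() returns -4759.
assert BinaryDecoder(as_fo('ad4a0000')).read_int() == -4759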
It is not obvious from inspecting BinaryDecoder.read_int where the bug is, but the result does not match the 4-byte little-endian value described above, so it is clearly a bug.
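One possible lead, offered as an assumption and not as a confirmed diagnosis: Avro's binary encoding writes int values as variable-length zig-zag integers, not as fixed 4-byte values, and decoding the same bytes that way yields exactly -4759. The helper `read_zigzag_varint` below is hypothetical and is not copied from pyiceberg; it only shows where the observed number could come from.

```python
import io


def read_zigzag_varint(stream):
    """Hypothetical helper: decode one variable-length zig-zag integer."""
    shift = 0
    accum = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        # The low 7 bits carry data, least significant group first.
        accum |= (byte & 0x7F) << shift
        # A clear high bit marks the last byte of the value.
        if not byte & 0x80:
            break
        shift += 7
    # Undo the zig-zag mapping (0, -1, 1, -2, ... were stored as 0, 1, 2, 3, ...).
    return (accum >> 1) ^ -(accum & 1)


value = read_zigzag_varint(io.BytesIO(bytes.fromhex("ad4a0000")))
print(value)  # -4759
```

Only the first two bytes (0xad, 0x4a) are consumed here, which would be consistent with read_int treating the input as an Avro-encoded int and the quoted spec row describing a different serialization.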