util/parquet: copy bytes from fmtCtx when writing byte arrays · cockroachdb/cockroach@343ba8d

Commit

util/parquet: copy bytes from fmtCtx when writing byte arrays

While running CI, it was discovered that there is a data race when
writing bytes with the fmtCtx in this pkg. The reason for the
data race is that when a byte array is written to the underlying
library, it creates a shallow copy and buffers it to be flushed
later. While its being buffered, the `Writer` in this package closes
the fmtCtx and returns it to the pool. Then, any code running
concurrently may access the buffer in the fmtCtx.
The data race can be seen [here](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelExtendedCi/10458378?showRootCauses=true&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true). To fix this, we now create a copy of the byte array when using fmtCtx before passing it to the underlying library.

In this particular failure, the problem is in page statistics.
The underlying library will buffer the min and max byte array datums
until it needs to flush stats to the output file. See [here](https://github.com/apache/arrow/blob/6808bfe3cdf5623212e575c1ec1083e194ed580c/go/parquet/file/column_writer_types.gen.go#L1428).

It's worth noting that some encoders will copy byte slices before
buffering them and some will not. For example the plain encoder
copies slices (see [UnsafeWrite](https://github.com/apache/arrow/blob/6808bfe3cdf5623212e575c1ec1083e194ed580c/go/parquet/internal/encoding/byte_array_encoder.go#LL39C1-L46C2)) and the dict encoder does not (see [here](https://github.com/apache/arrow/blob/6808bfe3cdf5623212e575c1ec1083e194ed580c/go/parquet/internal/encoding/byte_array_encoder.go#L98) and [appendVal](https://github.com/apache/arrow/blob/6808bfe3cdf5623212e575c1ec1083e194ed580c/go/internal/hashing/xxh3_memo_table.go#L248)). If the problem with statistics
above were to be solved, then we could consider using a non-dictionary encoding when writing byte slices. For now, we cannot.

As expected, this change causes more alloctions when writing datums:
Before
BenchmarkParquetWriter/bytes-10 162480 7136 ns/op 1913 B/op 64 allocs/op
After
BenchmarkParquetWriter/bytes-10 181750 6422 ns/op 1148 B/op 32 allocs/op

The affected types are bit, box2d, date, decimal, inet, json, interval, timestamp, timestamptz and timetz.

Informs: #99028
Epic: CRDB-27372
Release note: None

jayshrivastava committed Jun 13, 2023

1 parent af78952 commit 343ba8d

Please sign in to comment.