I'm seeing a difference in DataColumn.Data.Length for the keyColumn and the valueColumn of the partitionValues column.
Here are the paths I'm using:
keyPath: "add/partitionValues/key_value/key"
valuePath: "add/partitionValues/key_value/value"
For the keyPath, I'm getting 64865 as the DataColumn.Data.Length whereas valuePath returns 64867.
Note that this issue was not present in version 3.10.0.
Failing test
Code that I used to verify the issue:

    private async Task<DataColumn[]> ReadParquetMyFileAsync(bool treatByteArrayAsString)
    {
        List<DataColumn> dataColumns = new List<DataColumn>();
        string name = "<filename>.checkpoint.parquet";
        string keyPath = "add/partitionValues/key_value/key";
        string valuePath = "add/partitionValues/key_value/value";

        using (Stream s = OpenTestFile(name))
        {
            using (ParquetReader pr = await ParquetReader.CreateAsync(
                s, new ParquetOptions { TreatByteArrayAsString = treatByteArrayAsString }))
            {
                DataField[] dataFields = pr.Schema.GetDataFields();
                Dictionary<string, DataField> dataFieldMapping = this.RetrieveDataFieldMapping(dataFields);

                for (int i = 0; i < pr.RowGroupCount; ++i)
                {
                    using ParquetRowGroupReader groupReader = pr.OpenRowGroupReader(i);

                    if (dataFieldMapping.TryGetValue(keyPath, out DataField keyField) &&
                        dataFieldMapping.TryGetValue(valuePath, out DataField valueField))
                    {
                        DataColumn keyColumn = await groupReader.ReadColumnAsync(keyField);
                        DataColumn valueColumn = await groupReader.ReadColumnAsync(valueField);
                        Array keyColumnData = keyColumn.Data;
                        Array valueColumnData = valueColumn.Data;

                        dataColumns.Add(keyColumn);
                        dataColumns.Add(valueColumn);

                        string result = string.Empty;
                        for (int dataIndex = 0; dataIndex < keyColumn.Data.Length; ++dataIndex)
                        {
                            string key = keyColumnData.GetValue(dataIndex).ToString();
                            string val = valueColumnData.GetValue(dataIndex) == null ? "null" : valueColumnData.GetValue(dataIndex).ToString();
                            result += "[" + (dataIndex) + "] " + key + ": " + val + "\n";
                        }
                        Console.WriteLine(result);
                    }
                }
                return dataColumns.ToArray();
            }
        }
    }
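The loop above indexes both arrays up to `keyColumn.Data.Length`, so it silently ignores any extra entries in the value array. A minimal sketch of making the mismatch explicit is below. `ColumnPairing` and `PairColumns` are hypothetical names introduced only for illustration; they are not part of Parquet.Net, and the helper relies on nothing beyond `System.Array`.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnPairing
{
    // Hypothetical helper (not a Parquet.Net API): pairs keys with values up to
    // the shorter of the two lengths and reports the difference in length.
    public static (List<(string Key, string Value)> Pairs, int LengthDifference)
        PairColumns(Array keys, Array values)
    {
        int common = Math.Min(keys.Length, values.Length);
        var pairs = new List<(string Key, string Value)>(common);

        for (int i = 0; i < common; ++i)
        {
            // Null entries are rendered as the literal string "null".
            string key = keys.GetValue(i)?.ToString() ?? "null";
            string val = values.GetValue(i)?.ToString() ?? "null";
            pairs.Add((key, val));
        }

        // Positive: the value array is longer; negative: the key array is longer.
        int lengthDifference = values.Length - keys.Length;
        return (pairs, lengthDifference);
    }
}
```

Called as `ColumnPairing.PairColumns(keyColumn.Data, valueColumn.Data)`, this would report a length difference of 2 for the lengths quoted above (64867 - 64865). It only surfaces the mismatch and does not explain why the reader returns different lengths.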
shamimashik changed the title from "[BUG]: Getting different length for keyColumn and valueColumn of partition column" to "[BUG]: Getting different length for keyColumn and valueColumn of a partition column" on Mar 18, 2024
Library Version
4.23.4
OS
Windows
OS Architecture
64 bit