NestedType problem. #465

Closed
ag-conspectus opened this issue Apr 14, 2024 · 2 comments
ag-conspectus commented Apr 14, 2024

Hello,

I believe the NestedType class is not working as intended.

As I understand it, it is used when a Nested column is declared as a single array of tuples (flatten_nested = 0). For example:

SET flatten_nested = 0;
CREATE TABLE TestTable
(
    `_id` UUID,
    `CreatedAt` DateTime DEFAULT now(),
    `Comments` Nested(Id Nullable(String), Comment Nullable(String)),
)
ENGINE = ReplacingMergeTree(CreatedAt)
ORDER BY (_id, CreatedAt);

NestedType inherits from TupleType, and its FrameworkType property is initialized to an array-of-tuples type. However, it is implemented to read and write only a single tuple.
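
To make the mismatch concrete, here is a minimal sketch of the bytes the server presumably expects for one `Comments` cell when flatten_nested = 0, assuming the standard RowBinary layout for Array(Tuple(Nullable(String), Nullable(String))): a varint element count, then for each tuple a null-flag byte plus a length-prefixed string per field. `NestedRowBinarySketch` and its methods are hypothetical names for illustration, not part of ClickHouse.Client, and the code assumes .NET 5 or later, where `BinaryWriter.Write7BitEncodedInt` is public.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of ClickHouse.Client.
public static class NestedRowBinarySketch
{
    // Encodes one Nested(Id Nullable(String), Comment Nullable(String)) cell
    // as Array(Tuple(Nullable(String), Nullable(String))) in RowBinary (assumed layout).
    public static byte[] Encode((string? Id, string? Comment)[] rows)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            // Array: element count as a 7-bit encoded varint.
            writer.Write7BitEncodedInt(rows.Length);
            foreach (var (id, comment) in rows)
            {
                // Tuple: fields written back to back, with no extra framing.
                WriteNullableString(writer, id);
                WriteNullableString(writer, comment);
            }
        }
        return stream.ToArray();
    }

    private static void WriteNullableString(BinaryWriter writer, string? value)
    {
        if (value is null)
        {
            writer.Write((byte)1); // Nullable: 1 means NULL, no payload follows.
            return;
        }
        writer.Write((byte)0);     // Nullable: 0 means a value follows.
        writer.Write(value);       // BinaryWriter prefixes the UTF-8 byte length as a varint.
    }
}
```

An empty array encodes to the single byte 0x00. If only one tuple is written, the count prefix is missing, so the server would interpret the first byte of the tuple as the array length, which is consistent with the "Cannot read all data" error shown further down.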

When I pass an array of tuples to WriteToServerAsync, I get a validation error:

using (var bulkCopyInterface = new ClickHouseBulkCopy(connection)
{
    ColumnNames = new string[] { "_id", "Comments" },
    DestinationTableName = "TestTable"
})
{
    await bulkCopyInterface.InitAsync();
    await bulkCopyInterface.WriteToServerAsync(new List<object[]>()
    {
        new object[] { Guid.NewGuid(), new ITuple[] { ("1", "Comment1"), ("2", "Comment2"), ("3", "Comment3") } }
    });
}

Unhandled exception. ClickHouse.Client.Copy.ClickHouseBulkCopySerializationException: Error when serializing data
---> System.ArgumentException: Wrong number of elements in Tuple (Parameter 'value')
at ClickHouse.Client.Types.TupleType.Write(ExtendedBinaryWriter writer, Object value) in C:\Code\ClickHouse.Client\ClickHouse.Client\Types\TupleType.cs:line 114
at ClickHouse.Client.Copy.ClickHouseBulkCopy.SerializeBatch(Batch batch) in C:\Code\ClickHouse.Client\ClickHouse.Client\Copy\ClickHouseBulkCopy.cs:line 17
...

When I pass a single tuple instead, I get a different error after the batch is sent to ClickHouse:

using (var bulkCopyInterface = new ClickHouseBulkCopy(connection)
{
    ColumnNames = new string[] { "_id", "Comments" },
    DestinationTableName = "TestTable"
})
{
    await bulkCopyInterface.InitAsync();
    await bulkCopyInterface.WriteToServerAsync(new List<object[]>()
    {
        new object[] { Guid.NewGuid(), ("1", "Comment1") },
        //new object[] { Guid.NewGuid(), new ITuple[] { ("1", "Comment1"), ("2", "Comment2"), ("3", "Comment3") } }
    });
}

Unhandled exception. ClickHouse.Client.ClickHouseServerException (0x00000021): Code: 33. DB::Exception: Cannot read all data. Bytes read: 12. Bytes expected: 16.: (at row 2): While executing BinaryRowInputFormat. (CANNOT_READ_ALL_DATA) (version 22.9.2.7 (official build))
at ClickHouse.Client.ADO.ClickHouseConnection.HandleError(HttpResponseMessage response, String query, Activity activity) in C:\Code\ClickHouse.Client\ClickHouse.Client\ADO\ClickHouseConnection.cs:line 216
at ClickHouse.Client.ADO.ClickHouseConnection.PostStreamAsync(String sql, Stream data, Boolean isCompressed, CancellationToken token) in C:\Code\ClickHouse.Client\ClickHouse.Client\ADO\ClickHouseConnection.cs:line 296
at ClickHouse.Client.Copy.ClickHouseBulkCopy.SendBatchAsync(Batch batch, CancellationToken token) in C:\Code\ClickHouse.Client\ClickHouse.Client\Copy\ClickHouseBulkCopy.cs:line 193...

Reading the nested column fails too.

I think NestedType should be implemented like ArrayType, with the Read and Write methods overridden:

internal class NestedType : TupleType
{
    public override string Name => "Nested";

    public override Type FrameworkType => base.FrameworkType.MakeArrayType();

    //...
  
    public override object Read(ExtendedBinaryReader reader)
    {
        var length = reader.Read7BitEncodedInt();
        var data = Array.CreateInstance(base.FrameworkType, length);
        for (var i = 0; i < length; i++)
        {
            data.SetValue(ClearDBNull(base.Read(reader)), i);
        }
        return data;
    }
    
    public override void Write(ExtendedBinaryWriter writer, object value)
    {
        if (value is null || value is DBNull)
        {
            writer.Write7BitEncodedInt(0);
            return;
        }
    
        var collection = (IList)value;
        writer.Write7BitEncodedInt(collection.Count);
        for (var i = 0; i < collection.Count; i++)
        {
            base.Write(writer, collection[i]);
        }
    }
    
    //...
}

With this change, a nested column can be read and written correctly as an array of tuples:

class TestModel
{
    public Guid _id { get; set; }
    public DateTime CreatedAt { get; set; }
    public ITuple[] Comments { get; set; }
}

var result = await connection.QueryAsync<TestModel>("SELECT * FROM TestTable");
//...
await bulkCopyInterface.WriteToServerAsync(new List<object[]>()
{
    new object[] { Guid.NewGuid(), new ITuple[] { ("1", "Comment1"), ("2", "Comment2"), ("3", "Comment3") } }
});
@DarkWanderer DarkWanderer reopened this Apr 17, 2024
@DarkWanderer DarkWanderer self-assigned this Apr 17, 2024
@DarkWanderer DarkWanderer added the bug Something isn't working label Apr 17, 2024
DarkWanderer (Owner) commented

Please check the behavior in version 7.3.0. The client behaved correctly with flatten_nested = 1 (the default), but not when flatten_nested was set to 0 for nested item insertion.


ag-conspectus commented Apr 17, 2024

The problem is solved. Thank you.
