Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type hinting in serialization depends on field ordering #34

Closed
cpg314 opened this issue Dec 25, 2023 · 4 comments
Closed

Type hinting in serialization depends on field ordering #34

cpg314 opened this issue Dec 25, 2023 · 4 comments

Comments

@cpg314
Copy link
Collaborator

cpg314 commented Dec 25, 2023

When struct fields are not declared in the same order as in the database, type hints will be passed to the wrong fields.

This affects only structs with fields requiring a type hint (decimal, bytes...) and results in an error from the server.

Detailed description

The Row::serialize_row method signature is

    fn serialize_row(
        self,
        type_hints: &[&Type]
    ) -> Result<Vec<(Cow<'static, str>, Value)>>;

It is called in client.rs as follows:

let types = first_block.column_types.values().collect::<Vec<_>>();
rows.into_iter()
    .map(|x| x.serialize_row(&types[..]))

The Row derive macro implements serialize_row by iterating over fields, and calling types[idx] to get the type hint for each field

There is however no guarantee that the fields order will match between the ones in the struct declaration and in first_block (returned by the server).

Example

#[derive(klickhouse::Row)]
struct TestRow {
    a: klickhouse::Bytes,
    b: klickhouse::Bytes,
}

#[tokio::test]
async fn blah() {
    let client = get_client().await;

    prepare_table(
        "test",
        "
         a Array(UInt8),
         b String",
        &client,
    )
    .await;

    client
        .insert_native_block(
            "INSERT INTO test FORMAT Native",
            vec![TestRow {
                a: vec![].into(),
                b: vec![].into(),
            }],
        )
        .await
        .unwrap();
}

This test passes, but fails if a and b are switched in the struct declaration:

struct TestRow {
    b: klickhouse::Bytes,
    a: klickhouse::Bytes,
}

Indeed, b now gets the type hint meant for a.

Solution

The solution is pretty simple but requires a change to the Row trait:

In client.rs, column_types is originally an IndexMap in a Block. Rather than passing it as a slice, one should pass the IndexMap, allowing to safely retrieve field type hints by name.

This would change the signature of serialize_row to

fn serialize_row(
        self,
        type_hints: &IndexMap<String, Type>,
    ) -> Result<Vec<(Cow<'static, str>, Value)>>;
@Protryon
Copy link
Owner

Protryon commented Jan 4, 2024

Yeah, I'm on board for this, just requires a minor version bump due to breaking change. I've been aware of this limitation for a while.

@cpg314
Copy link
Collaborator Author

cpg314 commented Jan 4, 2024

Ok, I'm happy to send a PR if you wish. Thanks a lot for your review of the others :)

@Protryon
Copy link
Owner

Protryon commented Jan 4, 2024

Yep @cpg314, your contributions are much appreciated. :) I added you as a contributor so you can stack your PRs.

@Protryon
Copy link
Owner

Fixed by #37

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants