Skip to content

Conversation

Weijun-H
Copy link
Member

@Weijun-H Weijun-H commented Aug 20, 2025

Which issue does this PR close?

Rationale for this change

Maps are now cast to Variant::Objects

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Aug 20, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Weijun-H -- I have some questions

let variant1 = result.value(0);
let item = variant1.as_list().unwrap().get(0).unwrap();
assert_eq!(
item.as_object().unwrap().get("keys").unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to have created something like

[{ 
  "keys": "key1",
  "values": 1
}]

I expect the result to be

{ "key1": 1 }

Questions:

  1. where did the list come from?
  2. I am not sure about the "keys" and "values" fields 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no data type in Variant that can directly represent a map,
I converted the map into the following structure:

[
{ 
  "keys": "key1",
  "values": 1
},
{ 
  "keys": "key2",
  "values": 2
}
]

Using an object might be a better idea, but the keys can only be strings. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think object is the more expected result

Maybe you can all the keys of the MapArray into strings using the cast_to_variant kernel ?

@Weijun-H Weijun-H force-pushed the 8063-cast-map-to-variant branch 2 times, most recently from 7dd5aaa to 85eabb0 Compare August 20, 2025 21:48
@Weijun-H Weijun-H requested a review from alamb August 20, 2025 22:27
@Weijun-H Weijun-H force-pushed the 8063-cast-map-to-variant branch from 38a0b4e to 60c56ff Compare August 21, 2025 06:20
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Weijun-H -- this is looking good

DataType::Map(field, _) => match field.data_type() {
DataType::Struct(_) => {
let map_array = input.as_map();
let keys = cast_to_variant(map_array.keys())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also potentially cast this directly to a string aray here, somethin glike

Suggested change
let keys = cast_to_variant(map_array.keys())?;
let keys = cast(map_array.keys(), DataType::Utf8)?;

And then you wouldn't have to worry about dealing with Variants

I think what you have here, other than the unwrap, is looks good to me

@Weijun-H Weijun-H force-pushed the 8063-cast-map-to-variant branch from 5fc2ef2 to e85a41d Compare August 21, 2025 19:30
@Weijun-H Weijun-H force-pushed the 8063-cast-map-to-variant branch from e85a41d to ff5f64e Compare August 21, 2025 19:30
@Weijun-H Weijun-H requested a review from alamb August 21, 2025 20:41
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Weijun-H

@alamb
Copy link
Contributor

alamb commented Aug 22, 2025

I merged up to resolve a conflict

@alamb alamb merged commit 4009514 into apache:main Aug 23, 2025
12 checks passed
@alamb
Copy link
Contributor

alamb commented Aug 23, 2025

Thanks. again @Weijun-H

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet-variant parquet-variant* crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Variant]: Implement DataType::Map support for cast_to_variant kernel
2 participants