Hi there, I'm trying to view a Hugging Face dataset that includes images:

    import daft
    df = daft.read_parquet("hf://datasets/HuggingFaceM4/DocumentVQA")
    df.show(5)

Following https://www.getdaft.io/projects/docs/en/stable/10-min.html#working-with-multimodal-data I tried:

    df = df.with_column("image_show", daft.col("image").image.decode())

Unfortunately this yields:

    DaftCoreException: DaftError::External Unable to create logical plan node.
    Due to: DaftError::TypeError ImageDecode can only decode BinaryArrays, got image#Struct[bytes: Binary, path: Utf8]

It looks like the Hugging Face dataset's image column carries additional information that needs to be handled first, but I can't work out how to fix this.

Many thanks for any help, and for this amazing lib! :)
Replies: 1 comment 1 reply
Oh interesting. Looks like this particular dataset stores images as a struct column, with both the path and the bytes for some reason 🤷

You can try decoding the nested bytes field instead of the struct itself. `.image.decode()` works on a binary column, which in this case is a nested column under the `image` struct column.