Couldn't cast array of type fixed_size_list to Sequence(Value(float64)) #6280
Comments
Thanks for reporting! I've opened a PR with a fix.
Thanks for the quick response @mariosasko! I just installed your branch via [...]. Follow-up question for you: should [...]? For example, the following script:
fails with
Ideally we can have empty embedding columns as well!
This part of PyArrow is buggy and inconsistent regarding features implemented across the types, so the only option is to operate on the Arrow buffer level to fix issues such as the above one.
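To illustrate what "operating on the Arrow buffer level" can mean here, below is a rough conceptual sketch in plain Python. It is not PyArrow's API and not the code from the PR; the function names `fixed_size_list_to_offsets` and `read_rows` are hypothetical. It assumes the layout described in the Arrow columnar format: a fixed-size list column stores one flat child buffer of values and no offsets, while a variable-size list column needs an offsets buffer. Reinterpreting one as the other then amounts to generating offsets `0, n, 2n, ...` over the same values.

```python
from typing import List, Optional, Sequence


def fixed_size_list_to_offsets(num_rows: int, list_size: int) -> List[int]:
    """Build the offsets a variable-size list layout needs for fixed-size rows.

    Hypothetical helper: row i covers values[i * list_size : (i + 1) * list_size].
    """
    return [i * list_size for i in range(num_rows + 1)]


def read_rows(
    values: Sequence[float],
    offsets: Sequence[int],
    validity: Optional[Sequence[bool]] = None,
) -> List[Optional[List[float]]]:
    """Decode rows from a flat values buffer, offsets, and an optional validity mask."""
    rows: List[Optional[List[float]]] = []
    for i in range(len(offsets) - 1):
        if validity is not None and not validity[i]:
            # A null row still occupies list_size slots in the values buffer.
            rows.append(None)
        else:
            rows.append(list(values[offsets[i]:offsets[i + 1]]))
    return rows


# Three 2-dimensional embeddings stored as one flat child buffer.
values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
offsets = fixed_size_list_to_offsets(num_rows=3, list_size=2)
rows = read_rows(values, offsets)

# An empty embedding column has no rows, so its offsets buffer is just [0].
empty_offsets = fixed_size_list_to_offsets(num_rows=0, list_size=2)
empty_rows = read_rows([], empty_offsets)
```

In this model the values buffer is reused unchanged, which is why a buffer-level conversion can sidestep a missing cast implementation. A real fix would also have to carry over the validity bitmap and handle sliced arrays.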
Ok - can you take the POC I did here? Happy to turn this into an actual PR but would appreciate feedback on the implementation before I take another pass!
Describe the bug
I have a dataset with an embedding column. When I try to map that dataset, I get the following exception:
Steps to reproduce the bug
Here's a simple repro script:
Removing the embedding column fixes the issue!
Expected behavior
The mapping completes successfully.
Environment info
datasets version: 2.14.4