This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

[DISCUSSION] add new versions of failable utf8 cast implementations #457

Closed
houqp opened this issue Sep 26, 2021 · 2 comments
Labels
question Further information is requested

Comments

@houqp
Collaborator

houqp commented Sep 26, 2021

Postgres returns a syntax error when a user tries to cast a non-numeric string to an integer. The current cast compute module returns NULL when a value cannot be cast. This makes it impossible to implement the same behavior in DataFusion (see comment in https://github.com/houqp/arrow-datafusion/pull/9/files#r716175200).

Is there any particular reason why we don't want to have a failable string cast implementation in arrow2?

@houqp houqp added the question Further information is requested label Sep 26, 2021
@jorgecarleitao
Owner

jorgecarleitao commented Sep 26, 2021

There is an easy way to detect failed casts using the existing cast:

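let array = ...
// `cast` writes NULL into every slot it cannot convert instead of returning an error.
let result = cast(&array, data_type)?;

// Cheap check: the cast only adds nulls, so equal null counts mean nothing failed.
let no_cast_failed = result.null_count() == array.null_count();

// Note: `unwrap` panics when an array has no validity bitmap (i.e. no nulls).
let casted_and_valids = result.validity().unwrap();
let valids = array.validity().unwrap();
// A slot failed if it was valid in the input but is null after the cast.
let failed_elements = valids & &(!casted_and_valids);  // or something like this xD

It is a bit more expensive, and we can offer an optimized version of valids & &(!casted_and_valids) in bitmaps, but I felt that the current cast covers both cases pretty well.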
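
For readers without arrow2 at hand, here is a minimal sketch of the same idea using only the standard library. It models an array as a slice of `Option` values, with `None` standing in for NULL, rather than using arrow2's `Array` and `Bitmap` types. The names `lenient_cast`, `failed_mask` and `no_cast_failed` are hypothetical and not part of arrow2.

```rust
/// Lenient cast: strings that do not parse become `None`, mimicking a cast
/// kernel that writes NULL on failure. Input nulls stay null.
fn lenient_cast(input: &[Option<&str>]) -> Vec<Option<i32>> {
    input
        .iter()
        .map(|slot| slot.and_then(|s| s.parse::<i32>().ok()))
        .collect()
}

/// Cheap check: the cast can only add nulls, so an unchanged null count
/// means every valid input slot was converted.
fn no_cast_failed(input: &[Option<&str>], output: &[Option<i32>]) -> bool {
    let nulls_in = input.iter().filter(|slot| slot.is_none()).count();
    let nulls_out = output.iter().filter(|slot| slot.is_none()).count();
    nulls_in == nulls_out
}

/// Per-slot failure mask: a slot failed when it was valid in the input
/// but is null in the output (the `valids & !casted_and_valids` idea).
fn failed_mask(input: &[Option<&str>], output: &[Option<i32>]) -> Vec<bool> {
    input
        .iter()
        .zip(output)
        .map(|(before, after)| before.is_some() && after.is_none())
        .collect()
}
```

With the input `[Some("1"), None, Some("abc"), Some("42")]`, only the third slot is flagged: the second slot was already null before the cast, so it does not count as a failure.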

@houqp
Collaborator Author

houqp commented Sep 26, 2021

Cool, I think this is a reasonable workaround to unblock the migration for now :) When I filed this issue, I was thinking of short-circuiting the cast on the first casting error.
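
A short-circuiting variant might look like the sketch below. It again uses plain standard-library types rather than arrow2's arrays, and `strict_cast` and `CastError` are hypothetical names, not an existing arrow2 API. Collecting an iterator of `Result` values into a `Result` stops at the first `Err`, so nothing after the offending slot is parsed.

```rust
/// Hypothetical error type: records where the cast stopped and why.
#[derive(Debug, PartialEq)]
struct CastError {
    index: usize,
    value: String,
}

/// Strict cast: returns an error for the first string that does not parse,
/// instead of writing NULL. Input nulls are passed through unchanged.
fn strict_cast(input: &[Option<&str>]) -> Result<Vec<Option<i32>>, CastError> {
    input
        .iter()
        .enumerate()
        .map(|(index, slot)| match slot {
            None => Ok(None),
            Some(s) => s.parse::<i32>().map(Some).map_err(|_| CastError {
                index,
                value: (*s).to_string(),
            }),
        })
        // Collecting into `Result` short-circuits on the first `Err`.
        .collect()
}
```

The trade-off compared with the null-count workaround is that the caller gets an error immediately but loses the partial result.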
