API/ENH: Accept nan-likes in StringArray constructor #40839
Labels: Enhancement; NA - MaskedArrays (related to pd.NA and nullable extension arrays); Strings (string extension data type and string data)
lithomas1 added the Enhancement and Needs Triage labels on Apr 8, 2021
jbrockmendel added the NA - MaskedArrays label on Apr 12, 2021
lithomas1 changed the title from "API/ENH: Accept np.nan/None in StringArray constructor" to "API/ENH: Accept nan-likes in StringArray constructor" on May 10, 2021
lithomas1 removed the Needs Triage label on May 21, 2021
Removing milestone for now, can add back later.
This was referenced Dec 24, 2021
@lithomas1 created a pull request for this and it looks like it was very close to complete, but I think it's now closed for inactivity. Is there a chance this issue will be reopened at some point? I could really use this feature.
Is your feature request related to a problem?
Currently, StringArray can only be instantiated directly from an ndarray containing strings or NA values represented by pd.NA. The only way to instantiate a StringArray with other missing value indicators (like np.nan and None) is to use pandas.array, which has the side effect of casting non-string elements to strings instead of raising an error.
The proposed solution would allow StringArray instantiation from a numpy array containing np.nan/None without casting non-strings. This is useful if you want the StringArray constructor to validate that inputs are strings while also accepting missing values other than pd.NA. At the very least, it should support np.nan, since a StringArray is created from a numpy array and np.nan is the missing value indicator for numpy.
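For reference, here is a rough sketch of the two construction routes described above. It assumes pandas and numpy are installed; the exact behavior of the direct constructor may depend on the pandas version, so the call is wrapped in a try/except rather than asserting that it fails.

```python
import numpy as np
import pandas as pd

# Object ndarray mixing a string with two nan-likes (np.nan and None).
values = np.array(["a", np.nan, None], dtype=object)

# Route 1: the StringArray constructor. On the versions this issue describes
# it only accepts strings and pd.NA, so this is expected to raise; the
# try/except keeps the sketch runnable if a given version behaves differently.
try:
    direct = pd.arrays.StringArray(values)
    print("StringArray accepted the input:", direct)
except (ValueError, TypeError) as exc:
    print("StringArray rejected the input:", exc)

# Route 2: pandas.array with the string dtype treats nan-likes as missing.
via_array = pd.array(values, dtype="string")
print(via_array.isna())

# The side effect mentioned above: a non-string element goes through
# pandas.array without the strict validation the constructor performs.
mixed = pd.array(np.array(["a", 1], dtype=object), dtype="string")
print(mixed)
```

The point of the request is to get the validation of route 1 together with the nan-like handling of route 2.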
Describe the solution you'd like
Either accept nan-likes in the constructor directly (a breaking change) or add a parameter to the constructor that allows other NA values, perhaps modeled on the na_values parameter of read_csv.
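To make the second option concrete, here is a minimal pure-Python sketch of the semantics such a parameter could have. Everything in it is hypothetical and for illustration only: `validate_strings`, its `na_values` argument, and the `NA` sentinel (a stand-in for pd.NA) are not part of pandas, and a real implementation would operate on ndarrays rather than lists.

```python
import math

# Hypothetical stand-in for pd.NA so the sketch needs no third-party imports.
NA = object()


def _is_nan(value):
    """Return True only for float NaN (np.nan is a plain Python float)."""
    return isinstance(value, float) and math.isnan(value)


def validate_strings(values, na_values=()):
    """Validate that every element is a str or a recognised missing value.

    Elements matching an entry in ``na_values`` (by identity, or NaN-to-NaN)
    are normalised to ``NA``. Anything else that is not a string raises
    ValueError instead of being cast to a string.
    """
    result = []
    for value in values:
        if value is NA or isinstance(value, str):
            result.append(value)
        elif any(value is na or (_is_nan(value) and _is_nan(na)) for na in na_values):
            result.append(NA)
        else:
            raise ValueError(f"expected a string or NA value, got {value!r}")
    return result


# Opting in to nan-likes normalises them to NA.
cleaned = validate_strings(["a", None, float("nan")], na_values=(None, float("nan")))

# With the default (no extra NA values) the same input is rejected,
# which mirrors keeping the current strict behaviour by default.
try:
    validate_strings(["a", None])
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

Defaulting `na_values` to empty keeps the existing strict behavior, so this variant would not be a breaking change; accepting nan-likes unconditionally corresponds to the first option.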
API breaking implications
Either a breaking change or a new parameter.
Describe alternatives you've considered
You'd have to do the validation yourself, but validating the input yourself and then having StringArray validate it again hurts performance.
cc @jorisvandenbossche