Respect numpy fixed length strings #10351
Comments
this would add quite a bit of complexity, so what exactly would be the gain here?
The date was just an example. The gain would be that numpy operations on the underlying string array would be possible and
you don't need to specify However, you have much greater complexity in the entire code base.
This seems nearly impossible. What should happen in the following case?
The only way I could imagine something like this being implemented would be to add a fixed-width-string-column type, like Categorical. This column would have to be aware of the fixed size, pad (or raise for) strings that are too short, and raise for strings that are too long. That seems like a lot of effort to avoid a small number of function calls when writing data.
Yeah, I think I went through this exercise before. The problem is that assignment might have to astype the entire array, which is completely inefficient. Closing, but if you want to discuss more, happy to.
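To make the assignment problem concrete, here is a minimal NumPy-only sketch. The array contents and the variable names (`fixed`, `truncated`, `widened`) are made up for illustration and are not taken from the thread. It shows that a fixed-width array silently truncates a longer value, and that keeping the full value means converting the whole array to a wider dtype.

```python
import numpy as np

# A fixed-width byte-string array: every element occupies exactly 3 bytes.
fixed = np.array([b"abc", b"de"], dtype="S3")

# Assigning a longer value does not raise; NumPy silently truncates it
# to the array's item size.
fixed[0] = b"abcdef"
truncated = fixed[0]
print(truncated)

# To keep the full value, the whole array has to be converted to a wider
# dtype first. astype allocates a new array and copies every element.
widened = fixed.astype("S6")
widened[0] = b"abcdef"
print(widened.dtype, widened[0])
```

A column type that respected the fixed width would have to choose between these two behaviours (truncate or raise, versus re-allocate) on every assignment.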
this is a dupe of #5261 as well.
Since pandas supports custom types now:
in theory, but it would require a community pull request to make this happen
pandas converts all strings to 'O' columns.
For consistency and writing to files it would be good if pandas respects numpy fixed length strings.
Results in
dtype('O')
and not dtype('S10')
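The original snippet did not survive in this copy of the report. As a rough illustration only, the sketch below builds a fixed-width NumPy array and wraps it in a Series. It assumes NumPy's fixed-width unicode dtype `U10` rather than the `S10` from the report, and the sample values are invented. It prints the resulting dtype instead of assuming one, since the exact result depends on the pandas version.

```python
import numpy as np
import pandas as pd

# Fixed-width unicode array: each element is stored in 10 characters.
arr = np.array(["2015-06-14", "2015-06-15"], dtype="U10")
print(arr.dtype)

# Wrapping the array in a Series does not keep the fixed-width dtype;
# the strings end up in a column with a different dtype.
series = pd.Series(arr)
print(series.dtype)

# Getting a fixed-width array back requires an explicit conversion.
roundtrip = series.to_numpy().astype("U10")
print(roundtrip.dtype)
```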