Allow np.object dtypes into virtualfile_from_vectors #684
Description of proposed changes
Loosen the check in virtualfile_from_vectors to allow any string-like dtype (np.str, np.object) by performing the check with pd.api.types.is_string_dtype(). The array is then converted (if needed) to a proper np.str dtype before it is passed to put_strings.
Why is this needed?
This is one step in enabling text input into modules like meca (see Wrap meca #516 (comment), ff164e6) and velo (see Wrap velo #525 (comment)).
Those modules rely on pandas.DataFrame inputs, but a 'str' column in pandas is typically stored as an 'object' dtype (see https://stackoverflow.com/questions/21018654/strings-in-a-dataframe-but-dtype-is-object), unless users take care to store it in the new pandas.StringDtype. Either way, when we convert these pandas.Series objects to a numpy array, their dtype becomes np.object rather than np.str, which is why our code needs to handle np.object too.
After this PR is merged, we can do something like:
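A minimal sketch of the dtype handling described above, using only pandas and NumPy rather than PyGMT itself. The helper name `as_str_array` and the sample data are hypothetical and only illustrate the check-then-convert idea; this is not the code changed in this PR. The builtin `str` and `object` are used in place of the `np.str` and `np.object` aliases, which newer NumPy releases no longer provide.

```python
import numpy as np
import pandas as pd


def as_str_array(vector):
    """Hypothetical helper: return `vector` as a NumPy str array if it is string-like."""
    array = np.asarray(vector)
    # Accept both fixed-width str arrays and object arrays that hold strings.
    if not pd.api.types.is_string_dtype(array):
        raise TypeError(f"Expected a string-like dtype, got {array.dtype}.")
    # Object arrays of Python strings become a fixed-width unicode dtype;
    # arrays that are already str keep their contents unchanged.
    return array.astype(str)


# A text column stored with the 'object' dtype (made explicit here).
labels = pd.Series(["first", "second"], dtype=object).to_numpy()
converted = as_str_array(labels)

print(labels.dtype)          # the object dtype coming out of pandas
print(converted.dtype.kind)  # "U" for a NumPy unicode string array
```

Numeric input such as `as_str_array([1.0, 2.0])` is rejected with a `TypeError` in this sketch, which mirrors the idea of loosening the check only for string-like data.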
Fixes #
Reminders
Run make format and make check to make sure the code follows the style guide.
Add new public functions/methods/classes to doc/api/index.rst.
Notes
You can write /format in the first line of a comment to lint the code automatically.