Strange error message when summing datetime64 and datetime.time column #10329

Closed
jorisvandenbossche opened this issue Jun 11, 2015 · 17 comments · Fixed by #31538
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions
Comments

@jorisvandenbossche
Member

You get TypeError: Argument 'values' has incorrect type (expected numpy.ndarray, got Series):

In [29]: df = pd.DataFrame({'date':pd.date_range('2012-01-01', periods=3), 'time':[datetime.time(i, i, i) for i in range(3)]})

In [30]: df
Out[30]:
        date      time
0 2012-01-01  00:00:00
1 2012-01-02  01:01:01
2 2012-01-03  02:02:02

In [31]: df['date'] + df['time']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-5101228e303e> in <module>()
----> 1 df['date'] + df['time']

C:\Anaconda\lib\site-packages\pandas\core\ops.pyc in wrapper(left, right, name)
    491             return NotImplemented
    492
--> 493         time_converted = _TimeOp.maybe_convert_for_time_op(left, right, name)
    494
    495         if time_converted is None:

C:\Anaconda\lib\site-packages\pandas\core\ops.pyc in maybe_convert_for_time_op(cls, left, right, name)
    455         if name.startswith('__r'):
    456             name = "__" + name[3:]
--> 457         return cls(left, right, name)
    458
    459

C:\Anaconda\lib\site-packages\pandas\core\ops.pyc in __init__(self, left, right, name)
    272         self.right = right
    273         lvalues = self._convert_to_array(left, name=name)
--> 274         rvalues = self._convert_to_array(right, name=name, other=lvalues)
    275
    276         self.is_timedelta_lhs = com.is_timedelta64_dtype(left)

C:\Anaconda\lib\site-packages\pandas\core\ops.pyc in _convert_to_array(self, values, name, other)
    354             elif not (isinstance(values, (np.ndarray, pd.Series)) and
    355                       com.is_datetime64_dtype(values)):
--> 356                 values = tslib.array_to_datetime(values)
    357         elif inferred_type in ('timedelta', 'timedelta64'):
    358             # have a timedelta, convert to to ns here

TypeError: Argument 'values' has incorrect type (expected numpy.ndarray, got Series)
@jreback
Contributor

jreback commented Jun 11, 2015

xref to #8698

@jreback jreback added this to the Next Major Release milestone Jun 11, 2015
@jreback jreback added Enhancement Timedelta Timedelta data type labels Jun 11, 2015
@jorisvandenbossche
Member Author

@jreback it is indeed somewhat related, in the sense that I should have converted it to a timedelta, but at first I didn't and got this error.
But for #8698, this will always be done explicitly by the user, and not automatically in something like df['date'] + df['time']? So summing those should still generate a (more useful) error.

@jreback
Contributor

jreback commented Jun 11, 2015

well, I think I tried to make this happen automatically and ran into a roadblock, but I don't remember exactly what the problem was (aside from the pretty useless tz attribute in datetime.time). To be honest, I think this is a completely useless type and can be completely and transparently converted to Timedelta. I think it was Python's attempt to make an offset-like object.

@jorisvandenbossche
Member Author

well, this discussion is more related to #8343 then, I think.
But although it seems pretty useless, it exists and people do use it (e.g. you can also get it when reading in data from certain sources such as SQL and Excel), so I don't think it is a good idea to convert it automatically to a Timedelta, since it has different semantics.

@jorisvandenbossche jorisvandenbossche added the Error Reporting Incorrect or improved errors from pandas label Jun 11, 2015
@jorisvandenbossche
Member Author

@jreback What I wanted to report here is not that this operation should work, but that the error message should be clearer, something along the lines of: Operation '+' not supported for datetime64 and object/time

@jreback
Contributor

jreback commented Jun 11, 2015

@jorisvandenbossche ok, I suppose it could raise the default message
TypeError: incompatible type [object] for a datetime/timedelta operation

@jorisvandenbossche
Member Author

yes, that is indeed what you get with other object typed columns

@jreback
Contributor

jreback commented Jun 11, 2015

I think since datetime.time is a datetime sub-class it probably gets a bit further in the chain and actually tries to do the operation, but datetime.time doesn't know how to interoperate with a np.datetime64 so it fails. So in this case it could actually work w/o converting to a Timedelta. But would need some handling.

But if someone actually did this, of course datetime.time values are object dtype (for the foreseeable future).

@jorisvandenbossche
Member Author

So in this case it could actually work w/o converting to a Timedelta

Do you mean it should return a value? I think it should always error, as a datetime and time object are (and should be) incompatible types to add (in plain python this gives: TypeError: unsupported operand type(s) for +: 'datetime.datetime' and 'datetime.time').
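The plain-Python behavior mentioned above can be checked with a minimal standard-library sketch. The helper name `try_add` is hypothetical and exists only for illustration; it reports the TypeError message instead of letting the exception propagate.

```python
import datetime


def try_add(left, right):
    """Return the TypeError message raised by ``left + right``, or None if it succeeds."""
    try:
        left + right
    except TypeError as exc:
        return str(exc)
    return None


# datetime.datetime and datetime.time do not define addition with each other.
message = try_add(datetime.datetime(2012, 1, 2), datetime.time(1, 1, 1))
print(message)

# For contrast, adding a datetime.timedelta is supported and raises nothing.
ok = try_add(datetime.datetime(2012, 1, 2), datetime.timedelta(hours=1))
```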

@jreback
Contributor

jreback commented Jun 11, 2015

oh, I get it now. a datetime.time is really a datetime.datetime, but JUST represents time. ok. Then this could really be represented by a sub-class of Timedelta (e.g. >=0 <= midnight), which would allow actual operations on it. So I would suggest that if someone needs to represent 'time' w/o the date, they just use a Timedelta.

@jorisvandenbossche
Member Author

yes, indeed. But in the sense that, as a user, you are probably better off using Timedelta for such a case in pandas, not that pandas should represent times as a Timedelta automatically (as datetime.time and datetime.timedelta are completely different in attributes and methods)
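As a rough illustration of the explicit, user-side conversion discussed here, the sketch below turns a `datetime.time` into the `datetime.timedelta` elapsed since midnight, using only the standard library. The helper name `time_to_timedelta` is hypothetical, and rejecting timezone-aware values is an assumption made for this sketch, since the thread leaves the timezone rule open.

```python
import datetime


def time_to_timedelta(t):
    """Convert a naive datetime.time to the timedelta elapsed since midnight."""
    if t.tzinfo is not None:
        # Assumption: no timezone rule is settled in the thread, so refuse rather than guess.
        raise ValueError("timezone-aware times are not handled in this sketch")
    return datetime.timedelta(
        hours=t.hour,
        minutes=t.minute,
        seconds=t.second,
        microseconds=t.microsecond,
    )


dates = [datetime.datetime(2012, 1, 1), datetime.datetime(2012, 1, 2)]
times = [datetime.time(0, 0, 0), datetime.time(1, 1, 1)]

# Once the times are timedeltas, adding them to the dates is well defined.
combined = [d + time_to_timedelta(t) for d, t in zip(dates, times)]
```

For a pandas column, the same helper could be applied element-wise, for example via `Series.map`, before adding the result to the datetime64 column.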

@jreback
Contributor

jreback commented Jun 11, 2015

my point is the datetime.time in python is just silly.

@shoyer
Member

shoyer commented Jun 11, 2015

I agree that datetime.time in Python is silly, but we still need to treat it as a different class. It would be a decent idea to add an easy API for converting to time deltas.

@jorisvandenbossche
Member Author

Would this easy API simply mean allowing it to be passed to to_timedelta? (with some rule for what to do when it has a timezone attached)

(this is how I interpreted #8698, but maybe that was a wrong interpretation)

@shoyer
Member

shoyer commented Jun 11, 2015

Yes, that would work for me.

@jreback
Contributor

jreback commented Jun 11, 2015

yeah, I think conversions could be done in both to_timedelta and Timedelta, and probably just documenting this somewhere is easiest (but that's the other issue). This one should just show a better error message.

@jbrockmendel jbrockmendel added the Numeric Operations Arithmetic, Comparison, and Logical operations label Dec 21, 2019
@mroeschke
Member

Looks like this raises a more sensible error now. Could use a test

In [74]: df['date'] + df['time']

TypeError: unsupported operand type(s) for +: 'Timestamp' and 'datetime.time'

In [75]: pd.__version__
Out[75]: '1.0.0rc0+212.gca3bfcc54'
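A regression test for this could look roughly like the sketch below. It is not the test that was eventually merged; the function name is hypothetical, and it deliberately checks only the exception type, on the assumption that the exact message wording may differ between pandas versions.

```python
import datetime

import pandas as pd


def test_datetime64_plus_time_column_raises():
    """Adding a datetime64 column to a datetime.time column should raise TypeError."""
    df = pd.DataFrame(
        {
            "date": pd.date_range("2012-01-01", periods=3),
            "time": [datetime.time(i, i, i) for i in range(3)],
        }
    )
    try:
        df["date"] + df["time"]
    except TypeError:
        # Expected: the two types are incompatible for addition.
        return
    raise AssertionError("expected TypeError when adding datetime64 and datetime.time")
```

Written this way the function has no dependency beyond pandas, and a test runner such as pytest should pick it up by its `test_` prefix.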

@mroeschke mroeschke removed Error Reporting Incorrect or improved errors from pandas Numeric Operations Arithmetic, Comparison, and Logical operations labels Jan 26, 2020
@mroeschke mroeschke added Needs Tests Unit test(s) needed to prevent regressions and removed Timedelta Timedelta data type labels Jan 26, 2020
@jreback jreback modified the milestones: Contributions Welcome, 1.1 Mar 3, 2020

6 participants