Add Float64Index class? #236
Comments
Same idea came up on this mailing list thread. |
Yes, I would voice support for a general index that keeps the original index dtype. I used a float as index (it was time in seconds) and was delighted when everything, including |
@kghose: I agree. I also use indices to store things like the time in seconds (e.g. oscilloscope traces), and am constantly having to do |
+1 here. Matplotlib issue has tripped me up a number of times when I needed to make custom plots. |
see http://pandas.pydata.org/pandas-docs/dev/indexing.html#fallback-indexing; it is rarely necessary to actually use a float index, and you are often better served by using a column. The point of the index is to make looking up individual elements faster, e.g. |
Yes, it's true, because whether two floats are the same depends on precision, but it's nice to be able to have that as a time index. |
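Editor's note: the precision concern raised here can be seen without pandas at all. The sketch below is only an illustration; the dict stands in for any exact-match label lookup, and the names `accumulated`, `literal`, and `samples` are hypothetical.

```python
import math

# Two ways of arriving at "0.3 seconds" that look identical on paper.
accumulated = 0.1 + 0.2   # result of arithmetic on sample times
literal = 0.3             # value typed by hand when looking something up

# Exact comparison fails because the two doubles differ in the last bits.
exact_match = accumulated == literal

# A tolerance-based comparison treats them as the same instant.
tolerant_match = math.isclose(accumulated, literal, rel_tol=1e-9)

# A dict stands in for an exact-match index: the hand-typed key is not found.
samples = {accumulated: "sample at t=0.3"}
found_by_literal = literal in samples
```

This is why exact label lookup on float values is fragile, while range selection and tolerance-based matching stay well defined.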
In my cases I don't really care about being able to select via getitem-style indexing. I usually want to loop over the index/series pairs, or I have them in a frame that I want to show as an image with the index in the columns. The object dtype makes matplotlib show the index to full precision, which is really annoying since I then have to go in and format the tick labels by hand. I wholeheartedly agree that float indexes are to be avoided, but sometimes they make sense. My cases are mostly plotting issues, which only matter when I can't use pandas' plotting abilities, which thankfully isn't that often. |
@kghose consider using a datetime64[ns] index (if you are dealing with time), or as I said, use it as a column; you can do nearly everything you need (with an occasional |
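Editor's note: a minimal sketch of the "use it as a column" suggestion, assuming pandas and NumPy are installed. The column names `t` and `volts` and the sample values are made up for illustration and do not come from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical oscilloscope-style trace: time in seconds kept as a plain column.
df = pd.DataFrame({
    "t": [0.0, 0.5, 1.0, 1.5, 2.0],
    "volts": [0.0, 1.0, 0.0, -1.0, 0.0],
})

# Range selection works directly on the float column with a boolean mask.
window = df[(df["t"] >= 0.5) & (df["t"] <= 1.5)]

# Point selection uses a tolerance instead of exact float equality.
at_one_second = df[np.isclose(df["t"], 1.0)]

# The column is still float64, so plotting code receives real floats.
t_dtype = df["t"].dtype
```

The trade-off is that selection is by mask rather than by label, so it scans the column instead of using an index lookup.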
General index dtype retention is probably not worth the amount of complexity and code that it would require to do it right. Datetime indexes are your friend. @jreback what about attempting coercion of object indexes when accessing the values attribute? |
Something like this is pretty easy (@cpcloud, can't change the way Is this useful?
|
Of course for datetimes you get this
|
Well it's consistent... But it looks like it would only be useful in the float case... What would strings return? |
Shouldn't dates return an array of datetimes? |
I could return anything (e.g. a datetime64[ns] numpy array), which is easy enough; strings will return the same (an object array). |
numpy 1.7 (this is the same as .values though)
|
@cpcloud I think you are right, only float is different... |
i mean...i don't feel super strongly about this since it seems like there are so few use cases for float indices. i do think that it should return the "highest level" dtype possible that can be represented by numpy, e.g., return dates as dates like you show, if this is going to be done. again though, |
I've been using float indices a lot, so I would love A time axis is not the only use case for a float index; sometimes I work with spectral data where the X axis is a floating point value representing frequency or wavelength. |
Do this somewhere in your code (before you use it!)
|
You're absolutely right here, I also use it for things other than time. It would be great if there was some way to integrate pandas with |
I (inadvertently) started a thread about this on the pystatsmodels list; thread link: https://groups.google.com/forum/#!topic/pystatsmodels/ua7WpNd-U8Q My use case is also for time values (and DatetimeIndex is not useful for a variety of reasons, most notably that all I have are deltas against some unknown epoch defined as "whenever someone hit the record button"). My concern though isn't so much having a useful .values attribute (though I guess that might be nice too!), but having a reliable way to do time-based indexing, mostly for ad hoc interactive use. The main features I'm looking for are:
|
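Editor's note: one possible way to index by offsets from an unknown epoch is a timedelta-based index. This is a sketch under assumptions, not something proposed in the thread; it assumes a pandas version that provides `pd.to_timedelta` and `pd.Timedelta`, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical recording: offsets in seconds since "record" was pressed.
offsets_s = [0.0, 0.5, 1.0, 1.5, 2.0]
trace = pd.Series(
    [0.0, 1.0, 0.0, -1.0, 0.0],
    index=pd.to_timedelta(offsets_s, unit="s"),
)

# Label-based slice on elapsed time; both endpoints are included.
window = trace.loc[pd.Timedelta(seconds=0.5):pd.Timedelta(seconds=1.5)]

# Point lookup compares integer time counts rather than floats.
value_at_1s = trace.loc[pd.Timedelta(seconds=1)]
```

Because the labels are stored as integer counts of a time unit, the exact-match problem of float labels does not arise for values that the unit can represent.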
I made a couple of minor changes to .loc to get the following behavior, which I believe is still consistent
|
cc @dragoljub, cc @nehalecky you guys have had interests in indexing in the past....not sure if you have any comments wrt this |
BTW, I'd be in favor of a |
For much of the data I work with I have been OK with using Object/Int64 index types; however, I do also keep a copy of my indexers as data columns to enable easier plotting/slicing in some cases. IMO, I'm 👍 on anything that enables a smoother interface to Matplotlib, Galry or scikit-learn. |
Idea from conversation with @CRP in #235