Allow time series of 3D vectors #4913
Comments
Have you tried using a Panel? What you are describing (storing a vector inside a vector) is not efficient at all. |
Well, why should two dimensional vectors (and a time-series is stored as np.array, as far as I know) be inefficient? I will look at the Panel again to find out why this did not work for me. |
might this be what you are looking for?
|
Ok, a panel can be used if you have three-dimensional vectors only. But two questions remain:
Example code for 3d vectors only (this works fine):
import pandas as pd
import numpy as np
import numpy.linalg as la
"""
items: axis 0, each item corresponds to a DataFrame contained inside (pos, vel, acc)
major_axis: axis 1, it is the index (rows) of each of the DataFrames (timestamps)
minor_axis: axis 2, it is the columns of each of the DataFrames (x, y, z)
"""
pa = pd.Panel(np.random.randn(3, 5, 4), items=['pos', 'vel', 'acc'],
major_axis = pd.date_range(start='2001-01-01 00:00:00', end='2001-01-01 00:00:00.2', freq='50L'),
minor_axis=['x', 'y', 'z', 'norm'])
pa.pos.norm = 0
pa.vel.norm = 0
pa.acc.norm = 0
pa.pos.norm = la.norm(pa.pos.as_matrix(), axis=1)
pa.vel.norm = la.norm(pa.vel.as_matrix(), axis=1)
pa.acc.norm = la.norm(pa.acc.as_matrix(), axis=1)
print pa.to_frame()
print
print "pa.pos\n", pa.pos
Best regards: |
@ufechner7 I think you could create a class to hold some of this information, with a Panel or a multi-index frame as its variables. Don't try to jam too much into a single data structure; sometimes having multiple objects is the way to go. |
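A minimal sketch of the multi-index frame idea mentioned above, assuming a recent pandas release (Panel is no longer available there). The level names `quantity` and `axis` and the variable `flight` are made up for illustration; the random data only stands in for real sensor readings.

```python
import numpy as np
import pandas as pd

# Six timestamps at 20 Hz, as in the Panel examples of this thread.
index = pd.date_range("2001-01-01", periods=6, freq="50ms")

# Two column levels: the physical quantity and its component.
columns = pd.MultiIndex.from_product(
    [["pos", "vel", "acc"], ["x", "y", "z"]], names=["quantity", "axis"]
)
rng = np.random.default_rng(0)
flight = pd.DataFrame(rng.standard_normal((6, 9)), index=index, columns=columns)

# Selecting a top-level key returns an ordinary DataFrame with columns x, y, z.
vel = flight["vel"]

# One norm column per quantity, aligned with the original timestamps.
norms = pd.DataFrame(
    {q: np.linalg.norm(flight[q].to_numpy(), axis=1) for q in ["pos", "vel", "acc"]},
    index=flight.index,
)
```

With this layout, `norms["vel"]` is a plain time series, so time-based slicing and plotting work as they do for any other Series.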
Well, the data I have is kind of limited in size, just the data of many sensors of 2 hour flights with 20 Hz sampling rate (on average). This means that one dataset has a size of 100-150 MB (uncompressed). I found out that I can use a Panel4D to store vectors of different types and scalars (all of this data is time series) in one data structure. This works OK, but there is room for improvement w.r.t. the Panel4D implementation. The following code shows how to store scalars and 3d vectors in one Panel4D object:
""" Test code for using a panel object for storing dataframes with different column sets. """
import pandas as pd
import numpy as np
"""
items: axis 0, each item corresponds to a DataFrame contained inside (pos, vel, acc)
major_axis: axis 1, it is the index (rows) of each of the DataFrames (timestamps)
minor_axis: axis 2, it is the columns of each of the DataFrames (x, y, z)
"""
XYZ = ['x','y','z']
NORM = ['norm']
ENU = ['pos', 'vel', 'acc']
pav = pd.Panel(np.random.randn(3, 6, 3), items=ENU,
major_axis = pd.date_range(start='2001-01-01 00:00:00', end='2001-01-01 00:00:00.25', freq='50L'),
minor_axis=['x', 'y', 'z'])
pas = pd.Panel(np.random.randn(5, 6, 1), items=['pos', 'vel', 'acc', 'v_reel_out','u_winch'],
major_axis = pd.date_range(start='2001-01-01 00:00:00', end='2001-01-01 00:00:00.25', freq='50L'),
minor_axis=['norm'])
pav.pos.norm = 0
pav.vel.norm = 0
pav.acc.norm = 0
data = { 'ENU' : pav ,
'scalar' : pas }
pa = pd.Panel4D(data)
print pa
print pa.ENU.loc[ENU].to_frame()
print pa.ENU.pos[XYZ]
print
print pa.scalar.to_frame()
print pa.scalar.pos[NORM] |
@ufechner7 you can certainly do that, but keep in mind that the scalar data is 'replicated', i.e. it is not sparse. This may or may not work for you. If you are simultaneously using, say, 3-d vectors and scalars, then it makes sense to keep related objects (that are pandas objects), but not necessarily to put them in one object. It's up to you, e.g.
|
Why is the data handling not sparse? Why is it not possible to use Panel4D as container for non-homogeneous data? I want to be able to filter in the time axis, and that becomes complicated if I use a dictionary of pandas objects as container. |
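For the time-axis filtering concern, here is a rough sketch of how a dictionary of pandas objects could be sliced in one step, assuming all objects share a `DatetimeIndex`. The helper `time_slice` is hypothetical (not a pandas function), and the keys `vel` and `u_winch` are just borrowed from the example above.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2001-01-01", periods=6, freq="50ms")
rng = np.random.default_rng(0)

# A vector quantity (DataFrame) and a scalar quantity (Series) on the same time axis.
data = {
    "vel": pd.DataFrame(rng.standard_normal((6, 3)), index=index, columns=["x", "y", "z"]),
    "u_winch": pd.Series(rng.standard_normal(6), index=index),
}

def time_slice(objects, start, end):
    """Apply the same label-based time slice (both ends inclusive) to every object."""
    return {name: obj.loc[start:end] for name, obj in objects.items()}

window = time_slice(
    data,
    pd.Timestamp("2001-01-01 00:00:00.05"),
    pd.Timestamp("2001-01-01 00:00:00.15"),
)
```

Each entry keeps its own shape, so the scalar series is not padded out to three columns, and all entries in `window` still share the same timestamps.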
Why would you expect it to be sparse? It's a 4-dim container that is homogeneous in dimensions, e.g. there is a recorded value for each of the dimensions. You are able to put in non-homogeneous types (e.g. floats/strings etc). This is true of all pandas objects (and numpy objects in general). Try doing |
Well, I think that the idea is that Pandas offers more features than just numpy arrays. I think I will open a new issue "Pandas should support packed, heterogeneous, numeric data structures." |
Please don't. We already have one: #3443 |
@ufechner7 If you'd like to discuss some of your ideas over at #3443, we would love to hear them. Deciding on an API for the proposed |
closing as not a bug |
We use Pandas to analyse flight data. Many of the recorded measurements are 3D vectors of double, e.g. position, velocity, acceleration. Currently I can only store scalars or objects in a time series. Storing every component as a separate scalar makes the dataset very large, and processing it is not very convenient.
I would like to be able to do:
df.velocity.norm().plot()
to plot the norm of the velocity vector that is stored in the data-frame.
Currently I have to type:
pd.Series(np.sqrt(df.velocity_x**2 + df.velocity_y**2 + df.velocity_z**2)).plot()
which is not very convenient.
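As a rough illustration of a shorter workaround, assuming the components live in columns named `velocity_x`, `velocity_y` and `velocity_z` as in the expression above, the norm can be computed with `np.linalg.norm`. The helper `vector_norm` is hypothetical, not part of pandas, and the sample values are invented.

```python
import numpy as np
import pandas as pd

def vector_norm(df, prefix, axes=("x", "y", "z")):
    """Return the Euclidean norm of columns <prefix>_x, <prefix>_y, <prefix>_z as a Series."""
    cols = [f"{prefix}_{a}" for a in axes]
    values = np.linalg.norm(df[cols].to_numpy(), axis=1)
    return pd.Series(values, index=df.index, name=f"{prefix}_norm")

# Small made-up data set with a 20 Hz time index.
index = pd.date_range("2001-01-01", periods=3, freq="50ms")
df = pd.DataFrame(
    {
        "velocity_x": [3.0, 0.0, 1.0],
        "velocity_y": [4.0, 0.0, 2.0],
        "velocity_z": [0.0, 2.0, 2.0],
    },
    index=index,
)

speed = vector_norm(df, "velocity")
```

Calling `speed.plot()` would then draw the norm over time, provided matplotlib is installed.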