I'd like to group and aggregate my dataframe. The df can have different columns like (timestamp_unix, latitude, longitude, ..., accelerometer_z, temperature)
Usually, each row has a timestamp and one value (like accelerometer, temperature, ...). All other phenomena are null. I'm grouping my df by a geo_id that I calculate beforehand, and I would now like to calculate aggregations per group.
// Specify aggregation functions for each column
const aggregationFunctions: { [key: string]: any } = {
  timestamp_unix: 'min',
  latitude: 'max',
  longitude: 'max',
  overtaking: 'sum',
  finedust_pm1: 'mean',
  finedust_pm2_5: 'mean',
  finedust_pm4: 'mean',
  finedust_pm10: 'mean',
  distance: 'sum',
  humidity: 'mean',
  accelerometer_x: 'mean',
  accelerometer_y: 'mean',
  accelerometer_z: 'mean',
  temperature: 'mean',
}

// Group by geolocation and apply aggregations
const grouped = df.groupby(['geo_id']).agg(aggregationFunctions)
grouped.print()
Unfortunately, null cells get included in the aggregation (as 0, I think). Thus, temperature_mean does not represent the real mean temperature. Filling null cells with NaN does not work either, because then all resulting aggregation values are NaN.
How can I exclude null or NaN values from my groupby aggregations?
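To make the problem concrete, here is a minimal sketch of a null-aware group aggregation in plain TypeScript. It works on an array of row objects rather than on `df`, so it does not rely on any dataframe library API. The names `Row`, `Agg`, `isValidNumber`, `reduceValues`, `aggregateSkippingNulls`, and the sample data are all hypothetical and only serve the illustration. The idea is to drop null, undefined, and NaN cells per column before reducing, so that a mean is divided by the number of real values instead of the number of rows.

```typescript
// Hypothetical row shape: one object per dataframe row.
type Row = Record<string, number | string | null | undefined>;
type Agg = "min" | "max" | "sum" | "mean";

// True only for real numbers; null, undefined, NaN, and strings are skipped.
function isValidNumber(v: unknown): v is number {
  return typeof v === "number" && !isNaN(v);
}

// Reduce the already-filtered values; null means "no data for this column".
function reduceValues(values: number[], agg: Agg): number | null {
  if (values.length === 0) return null;
  const sum = values.reduce((a, b) => a + b, 0);
  switch (agg) {
    case "min": return Math.min(...values);
    case "max": return Math.max(...values);
    case "sum": return sum;
    case "mean": return sum / values.length;
  }
}

// Group rows by groupKey, then aggregate each column over its valid values only.
function aggregateSkippingNulls(
  rows: Row[],
  groupKey: string,
  spec: Record<string, Agg>
): Row[] {
  const groups: Record<string, Row[]> = {};
  for (const row of rows) {
    const key = String(row[groupKey]);
    const bucket = groups[key];
    if (bucket) bucket.push(row);
    else groups[key] = [row];
  }

  return Object.keys(groups).map((key) => {
    const out: Row = { [groupKey]: key };
    for (const column of Object.keys(spec)) {
      const agg = spec[column] as Agg;
      const values = (groups[key] as Row[])
        .map((r) => r[column])
        .filter(isValidNumber);
      out[`${column}_${agg}`] = reduceValues(values, agg);
    }
    return out;
  });
}

// Made-up sample: each row carries one measurement, the rest are null.
const sampleRows: Row[] = [
  { geo_id: "a", temperature: 20, humidity: null },
  { geo_id: "a", temperature: null, humidity: 50 },
  { geo_id: "a", temperature: 22, humidity: null },
  { geo_id: "b", temperature: null, humidity: 40 },
];

const result = aggregateSkippingNulls(sampleRows, "geo_id", {
  temperature: "mean",
  humidity: "mean",
});
console.log(result);
```

For group "a" the temperature mean is computed over the two real readings (20 and 22) rather than over all three rows, and a group with no readings for a column yields null instead of 0. If the rows can be exported from the dataframe as plain objects, the same filtering step could be applied before or instead of the built-in aggregation.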