Description
I often want to apply several aggregation functions to every column of a grouped DataFrame, for example:
grouped = df.groupby('somekey')
dfAggregated = grouped.agg([np.mean, np.std])
That works well, but sometimes (all the time, actually) I would also like to use lambda functions in the same list, for example:
grouped = df.groupby('somekey')
dfAggregated = grouped.agg([np.mean, np.std, lambda v: v.mean()/v.max()])
This works fine, but the resulting column is now named '<lambda>', which is ugly. The much more verbose syntax, where you specify a dictionary for every column separately, avoids this, but I would propose allowing the following syntax instead:
grouped = df.groupby('somekey')
dfAggregated = grouped.agg([np.mean, np.std, {'normalized_mean': lambda v: v.mean()/v.max()}])
The dictionary key should then be used as the resulting column name.
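Until something like this is supported, one possible workaround is to give the callable a name before passing it in the list, since agg() appears to take the column label from the function's `__name__`. The sketch below is only an illustration under that assumption: the `named` helper and the sample data are hypothetical, and the string aliases "mean" and "std" stand in for np.mean and np.std.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    "somekey": ["a", "a", "b", "b"],
    "x": [1.0, 3.0, 2.0, 6.0],
})


def named(name, func):
    """Hypothetical helper: set the callable's __name__ so that agg()
    uses `name` as the resulting column label."""
    func.__name__ = name
    return func


grouped = df.groupby("somekey")
dfAggregated = grouped.agg([
    "mean",
    "std",
    named("normalized_mean", lambda v: v.mean() / v.max()),
])

# The second level of the column index now carries the chosen name.
print(dfAggregated.columns.tolist())
```

A regular `def normalized_mean(v): ...` passed in the list should behave the same way. The proposed dictionary syntax would still be shorter, because it keeps the name and the lambda together without a helper.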
Interestingly, in version 0.16 this syntax does not raise an error; instead it produces a column named 'Nan' that is filled with tuple values: ('n','o','r','m','a','l','i','z','e','d','_','m','e','a','n'), which I don't think is of use to anyone :)