groupvalues function and Base.pairs method for GroupedDataFrame
jlumpe committed Aug 2, 2019
1 parent 6102f89 commit 81d9801
Showing 1 changed file with 45 additions and 0 deletions.
45 changes: 45 additions & 0 deletions src/groupeddataframe/grouping.jl
@@ -1214,3 +1214,48 @@ groupindices(gd::GroupedDataFrame) = replace(gd.groups, 0=>missing)
Return a vector of column names in `parent(gd)` used for grouping.
"""
groupvars(gd::GroupedDataFrame) = _names(gd)[gd.cols]

"""
    groupvalues(gd::GroupedDataFrame)

Return a vector holding, for each group in `gd`, the values of the grouping columns.

### Returns

Vector of tuples, where the `j`th element of the `i`th tuple is the value of the
column `groupvars(gd)[j]` in `gd[i]`.

Note: the values are always tuples, even if only a single column was passed to
[`groupby`](@ref).

### Examples

julia> df = DataFrame(a = repeat([:foo, :bar, :baz], outer=[2]),
                      b = repeat([2, 1], outer=[3]),
                      c = 1:6);

julia> gd = groupby(df, :a)
GroupedDataFrame with 3 groups based on key: a
First Group (2 rows): a = :foo
│ Row │ a      │ b     │ c     │
│     │ Symbol │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ foo    │ 2     │ 1     │
│ 2   │ foo    │ 1     │ 4     │
⋮
Last Group (2 rows): a = :baz
│ Row │ a      │ b     │ c     │
│     │ Symbol │ Int64 │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ baz    │ 2     │ 3     │
│ 2   │ baz    │ 1     │ 6     │

julia> groupvalues(gd)
3-element Array{Tuple{Symbol},1}:
(:foo,)
(:bar,)
(:baz,)
"""
groupvalues(gd::GroupedDataFrame) = map(Tuple, eachrow(gd.parent[gd.idx[gd.starts], gd.cols]))

Base.pairs(gd::GroupedDataFrame) = map(Pair, groupvalues(gd), gd)
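
The indexing in `groupvalues` can be easier to follow on plain vectors. The sketch below is not part of the commit and does not use DataFrames; `keycol`, `idx`, `starts`, and `ends` are hypothetical stand-ins, assuming that `gd.idx` holds the parent's row indices ordered by group and that `gd.starts` marks where each group begins within it. Under those assumptions, the first row of each group supplies the key tuple, and pairing keys with groups mirrors what `Base.pairs` does above.

```julia
# Hypothetical stand-ins for the fields the commit reads from a GroupedDataFrame.
keycol = [:foo, :bar, :baz, :foo, :bar, :baz]  # the grouping column `a` of the parent
idx    = [1, 4, 2, 5, 3, 6]                    # parent row indices, ordered by group
starts = [1, 3, 5]                             # position in `idx` where each group begins
ends   = [2, 4, 6]                             # position in `idx` where each group ends

# Analogue of gd.parent[gd.idx[gd.starts], gd.cols]:
# take the first row of every group and read its key value.
first_rows = idx[starts]
group_keys = [(keycol[r],) for r in first_rows]  # 1-tuples, as with a single grouping column

# Rows of the parent that belong to each group.
group_rows = [idx[s:e] for (s, e) in zip(starts, ends)]

# Analogue of map(Pair, groupvalues(gd), gd): one key => group pair per group.
key_group_pairs = map(Pair, group_keys, group_rows)

for (key, rows) in key_group_pairs
    println(key, " => rows ", rows)
end
```

Reading the key from the first row of a group is enough because every row in a group shares the same values in the grouping columns.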
