@by is not working #162

Closed
xiaodaigh opened this issue Aug 29, 2020 · 3 comments

Comments

@xiaodaigh
Contributor

This simple MWE throws an error. I have DataFrames 0.21.7 and DataFramesMeta 0.5.1.

using DataFrames
df = DataFrame(grp = rand(1:8, 100), val = rand(100))

using DataFramesMeta, Statistics
@by(df, :grp, mean(:val))
ERROR: ArgumentError: 'Float64' iterates 'Float64' values, which doesn't satisfy the Tables.jl `AbstractRow` interface
Stacktrace:
 [1] invalidtable(::Float64, ::Float64) at C:\Users\RTX2080\.julia\packages\Tables\Eti9i\src\tofromdatavalues.jl:42
 [2] iterate at C:\Users\RTX2080\.julia\packages\Tables\Eti9i\src\tofromdatavalues.jl:48 [inlined]
 [3] buildcolumns at C:\Users\RTX2080\.julia\packages\Tables\Eti9i\src\fallbacks.jl:185 [inlined]
 [4] columns at C:\Users\RTX2080\.julia\packages\Tables\Eti9i\src\fallbacks.jl:237 [inlined]   
 [5] DataFrame(::Float64; copycols::Bool) at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\other\tables.jl:43
 [6] DataFrame at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\other\tables.jl:34 [inlined]
 [7] (::var"##293#29")(::SubArray{Float64,1,Array{Float64,1},Tuple{Array{Int64,1}},false}) at C:\Users\RTX2080\.julia\packages\DataFramesMeta\c67UK\src\DataFramesMeta.jl:71
 [8] (::var"#27#28")(::SubDataFrame{DataFrame,DataFrames.Index,Array{Int64,1}}) at C:\Users\RTX2080\.julia\packages\DataFramesMeta\c67UK\src\DataFramesMeta.jl:73
 [9] _combine(::var"#27#28", ::GroupedDataFrame{DataFrame}, ::Nothing, ::Bool, ::Bool) at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\groupeddataframe\splitapplycombine.jl:1248    
 [10] combine_helper(::Function, ::GroupedDataFrame{DataFrame}, ::Nothing; keepkeys::Bool, ungroup::Bool, copycols::Bool, keeprows::Bool) at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\groupeddataframe\splitapplycombine.jl:589
 [11] #combine#375 at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\groupeddataframe\splitapplycombine.jl:442 [inlined]
 [12] combine(::Function, ::GroupedDataFrame{DataFrame}) at C:\Users\RTX2080\.julia\packages\DataFrames\cdZCk\src\groupeddataframe\splitapplycombine.jl:442
 [13] top-level scope at REPL[17]:1
 [14] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1088
@pdeffebach
Collaborator

The problem is that @by currently needs an expression of the form y = fun(:x). Your example above doesn't have that form, because the result of mean(:val) is not assigned to a name.
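As a stopgap, here is a minimal sketch of the `y = fun(:x)` form described above. It assumes the keyword form behaves on DataFramesMeta 0.5.1 as this comment describes; later releases may use a different syntax. The column name `mean_val` is made up for illustration.

```julia
using DataFrames, DataFramesMeta, Statistics

df = DataFrame(grp = rand(1:8, 100), val = rand(100))

# Name the result column explicitly so the expression has the `y = fun(:x)` form.
# `mean_val` is a hypothetical column name, not something from the original report.
res = @by(df, :grp, mean_val = mean(:val))
```

The result should have one row per distinct value of `grp`, with the group mean in `mean_val`.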

In the future, we will hopefully be able to make this work by transforming mean(:val) to :val => mean and then putting it into DataFrames.combine(groupby(df, :grp), :val => mean). Initial work has started towards that goal in #163.
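The target expression mentioned above can already be written directly against DataFrames, without DataFramesMeta. This is a sketch of that plain `combine` call on the data from the report, not of what #163 generates. The name `avg` in the second call is hypothetical.

```julia
using DataFrames, Statistics

df = DataFrame(grp = rand(1:8, 100), val = rand(100))

# Source column => function; DataFrames auto-names the result column `val_mean`.
res = combine(groupby(df, :grp), :val => mean)

# Optionally give the result an explicit name (`avg` is a hypothetical name).
res_named = combine(groupby(df, :grp), :val => mean => :avg)
```

Both calls return one row per group, and they differ only in the name of the aggregated column.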

Unfortunately, the error here is very hard to understand, because it is difficult to tell exactly which expression DataFramesMeta ends up generating. This will become easier to reason about after #163. It looks like @by currently does call DataFrames.combine, constructs an intermediate NamedTuple, and calls DataFrame on that. Your expression instead produces a bare Float64 for each group, which, judging by the DataFrame(::Float64) frame in the stack trace, is passed to DataFrame and cannot be converted into a table.

Rest assured this will get fixed in the future.

@pdeffebach
Collaborator

Fixed in #163

@bkamins
Member

bkamins commented Sep 22, 2020

closing - right?

@bkamins bkamins closed this as completed Sep 22, 2020