Enhance the DataFrame keyword constructor to recycle #882
Makes sense, but I'd prefer making the recycling behaviour stricter: IMHO it should only work when the requested length is a multiple of the current length of the vector. It would thus be more similar to how broadcasting works. I've been bitten badly in R by silent recycling in cases where it was clearly a mistake, which would have been caught by a less tolerant policy.
I agree with that argument, @nalimilan. I'll update the PR tonight.
Any last comments? If not, I'll merge over the weekend.
src/dataframe/dataframe.jl
Outdated
# Return and AbstractVector of length `len` filled with `x`.
# `x` is recycled if `len(x)` is an even multiple of `len`.
rep_len(x, len) = rep(x, len)
function rep_len(x::AbstractVector, len)
I'd rather raise an error than return a vector of a different length than the requested one:
julia> rep_len([1, 2], 7)
6-element Array{Int64,1}:
1
2
1
2
1
2
Good call. Thx.
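For illustration, the stricter policy discussed above (recycle only when the requested length is a whole multiple, otherwise raise an error) could be sketched as follows. The name `strict_rep_len` is hypothetical and this is not the code from the PR; it uses `repeat` and `fill` from current Julia rather than the `rep` function in the diff.

```julia
# Hypothetical helper, not the PR's code: recycle `x` to exactly `len`
# elements, or throw if `len` is not a whole multiple of `length(x)`.
function strict_rep_len(x::AbstractVector, len::Integer)
    n = length(x)
    # An empty vector cannot be recycled; the check also avoids `len % 0`.
    if n == 0 || len % n != 0
        throw(ArgumentError("cannot recycle a vector of length $n to length $len"))
    end
    return repeat(x, div(len, n))
end

# Any non-vector value is treated as a scalar and simply repeated.
strict_rep_len(x, len::Integer) = fill(x, len)
```

With this sketch, `strict_rep_len([1, 2], 7)` throws instead of silently returning six elements.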
Do you still want to merge this?
No. After rethinking, I only want to handle the case of extending a single element. I'll redo it.
I've updated this to only recycle for a single element. If this is okay, I'll squash.
That works for me.
src/dataframe/dataframe.jl
Outdated
@@ -98,10 +98,32 @@ type DataFrame <: AbstractDataFrame
end
end

# Return and AbstractVector of length `len` filled with `x`.
# `x` is recycled if `len(x)` is an even multiple of `len`.
This is `length(x)`. Also, the description doesn't sound completely accurate: `x` is always recycled an integer number of times, so that the length is less than or equal to `len`. Though the behaviour you describe would be sensible (i.e. raise an error if not a multiple).
Thanks for catching. This is also out of date. I forgot to update the docs.
+1 too, modulo the small comment I added.
Squashed and ready to go, assuming that Travis doesn't complain.
for (k, v) in kwargs
result[k] = v
if length(v) != 1 && length(v) != len
error("Incompatible lengths of arguments")
Maybe better to throw an `ArgumentError`? Also, it would be really nice to give the name of the problematic column as well as its size vs. the expected one. :-)
You could also check that an error is thrown.
Perhaps `DimensionMismatch` would be an appropriate exception here?
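A minimal sketch of what these suggestions might look like together: a length check that throws a `DimensionMismatch` naming the offending column, its length, and the expected length. The function name `check_column_lengths` and the message wording are assumptions for illustration, not the code that was merged.

```julia
# Hypothetical check, not the PR's code: every keyword column must have
# either length 1 (to be recycled) or the common length `len`.
function check_column_lengths(len::Integer; kwargs...)
    for (name, col) in kwargs
        n = length(col)
        if n != 1 && n != len
            throw(DimensionMismatch(
                "column :$name has length $n, expected 1 or $len"))
        end
    end
    return nothing
end
```

A test can then assert both the exception type and that the message mentions the problematic column, which also covers the suggestion to check that an error is thrown.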
Bump. With the small fix I noted should be good to go.
Re-bump (see issue #973).
Sorry. Slammed until next weekend...
No worries, let's just not forget about this completely. :-)
Re-bump. :-)
Consolidating the constructors minimized the number of places where auto promotion could take place. The new constructor recycles scalars such that if DataTable is created with a mix of scalars and vectors the scalars will be recycled to the same length as the vectors. Fixes an outstanding bug where scalar recycling only worked if the scalar assignments came after the vector assignments of the desired length, see #882. Tests that used to assume NullableArray promotion now explicitly use NullableArrays and new constructor tests have been added to test changes.
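The order-dependence bug described here can be avoided by determining the target length before recycling anything. The sketch below illustrates that two-pass idea on plain `Pair`s; `recycle_columns` is a hypothetical name and this is not the actual DataTable constructor.

```julia
# Hypothetical two-pass recycling, not the actual constructor:
# pass 1 finds the common length of all vector columns,
# pass 2 expands every scalar to that length, whatever the argument order.
function recycle_columns(cols::Pair{Symbol}...)
    lengths = unique(length(v) for (_, v) in cols if v isa AbstractVector)
    length(lengths) <= 1 ||
        throw(DimensionMismatch("vector columns have lengths $lengths"))
    # With no vector columns at all, scalars become length-1 columns.
    len = isempty(lengths) ? 1 : first(lengths)
    return [k => (v isa AbstractVector ? v : fill(v, len)) for (k, v) in cols]
end
```

Because the length is computed up front, `recycle_columns(:a => 0, :b => [1, 2, 3])` and `recycle_columns(:b => [1, 2, 3], :a => 0)` both expand `:a` to three elements.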
I think this has been resolved on master.
This fixes an issue with the `DataFrame` constructor with keyword arguments. This comes up when using `by`, particularly with DataFramesMeta use. Here is the issue:
Here are results with this PR:
Note that this doesn't change the lack of recycling for other methods.