-
Notifications
You must be signed in to change notification settings - Fork 2.1k
stat_bin accepts functions for binwidth #1890
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allows variable binwidths across facets.
Sturgis to Sturges
I wouldn't mind extending bin to accept functions but I really dislike Sturgis/Scott/FD — I think they give the illusion that you're picking a good binwidth, but they're just as arbitrary as picking 30 bins. |
None of the options are a default, they have to be chosen manually, which is still a level of intention beyond 30 bins. Would you prefer to drop the built-in options entirely? |
Yes please |
Better visualize varying ranges.
Is there anything else preventing you from accepting this pull request? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more suggestions
@@ -111,9 +113,10 @@ StatBin <- ggproto("StatBin", Stat, | |||
stop("Only one of `boundary` and `center` may be specified.", call. = FALSE) | |||
} | |||
|
|||
if (is.null(params$breaks) && is.null(params$binwidth) && is.null(params$bins)) { | |||
if (is.null(params$breaks) && (is.null(params$binwidth) || !(is.numeric(params$binwidth) || is.function(params$binwidth))) && is.null(params$bins)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be changed? If a function is supplied binwidth
will no longer be NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that was leftover from checking for a string input as well.
@@ -130,8 +133,11 @@ StatBin <- ggproto("StatBin", Stat, | |||
if (!is.null(breaks)) { | |||
bins <- bin_breaks(breaks, closed) | |||
} else if (!is.null(binwidth)) { | |||
bins <- bin_breaks_width(scales$x$dimension(), binwidth, center = center, | |||
boundary = boundary, closed = closed) | |||
if (!is.numeric(binwidth)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to check for is.function()
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
bins <- bin_breaks_width(scales$x$dimension(), binwidth, center = center, | ||
boundary = boundary, closed = closed) | ||
if (!is.numeric(binwidth)) { | ||
binwidth <- do.call(binwidth, list(data$x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need do.call
here
@@ -22,6 +22,14 @@ test_that("bins specifies the number of bins", { | |||
expect_equal(nrow(out(bins = 100)), 100) | |||
}) | |||
|
|||
test_that("binwidth computes widths for function input", { | |||
df <- data.frame(x = 1:100) | |||
out <- function(x, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove the indirection of this function now.
And a simpler binwidth function would just return a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I get the simpler function that just returns a constant, but I don't know what "indirection of this function" means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that was a bit unclear. I meant eliminate the out
function:
out <- layer_data(ggplot(...))
@@ -16,6 +16,10 @@ | |||
|
|||
## Major new features | |||
|
|||
### Binning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please move this up a section?
@@ -22,6 +22,14 @@ test_that("bins specifies the number of bins", { | |||
expect_equal(nrow(out(bins = 100)), 100) | |||
}) | |||
|
|||
test_that("binwidth computes widths for function input", { | |||
df <- data.frame(x = 1:100) | |||
out <- function(x, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that was a bit unclear. I meant eliminate the out
function:
out <- layer_data(ggplot(...))
Thanks! |
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
Allowing binwidth to accept functions greatly improves histograms of faceted variables. Furthermore, the current default (30 bins) is somewhat arbitrary, and could be replaced with one of the three built-ins included here.
The built-ins include 'Sturgis', 'Scott', and 'FD'. These match the options included in d3. More information is available on wikipedia.
Examples are included in the pull-request, here's a motivating example: