[R] Speed up nrow() on filtered dataset #43659
FWIW I'm not seeing this at least on this query using a smaller sample of nyc_taxi:
That said, the first time I did it, it was slower, but on subsequent tries it was faster. Sounds like disk caching or something?
Describe the enhancement requested
From the Arrow workshop at posit::conf 2024. Several participants reported that
was faster than
nrow() uses Scanner$CountRows: https://github.com/apache/arrow/blob/main/r/R/dplyr.R#L186
We could replace that with something that runs an ExecPlan instead, as the comment above that line suggests, and perhaps that is more performant.
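For anyone who wants to try this locally, here is a rough sketch of one way to compare the two paths. The queries from the workshop are not shown above, so everything here is an assumption: the toy dataset, the filter, and the use of summarize(n = n()) |> collect() as a stand-in for "something that runs an ExecPlan". Timings on a tiny temporary dataset say little about nyc_taxi, so treat it only as a template.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in data written to a temporary Parquet dataset;
# the reports in this issue were against nyc_taxi.
tmp <- tempfile()
write_dataset(data.frame(x = 1:100000, g = rep(1:10, 10000)), tmp)
ds <- open_dataset(tmp)

# A filtered query, not yet evaluated
q <- ds |> filter(g == 3L)

# Path 1: nrow() on the filtered query
t_nrow <- system.time(n_scanner <- nrow(q))

# Path 2: count the rows with an aggregation that is collected
# through the query engine (assumed alternative, not the current nrow() code)
t_plan <- system.time(
  n_plan <- q |> summarize(n = n()) |> collect() |> pull(n)
)

# Both paths should agree on the row count
stopifnot(n_scanner == n_plan)

print(t_nrow["elapsed"])
print(t_plan["elapsed"])
```

Running each path more than once would help separate any real difference from the disk caching effect mentioned in the comment.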
cc @thisisnic @steph
Component(s)
R