Reason for increasing the required R version to >= 4.0? #173

Closed
etiennebacher opened this issue Jun 22, 2024 · 22 comments

Comments

@etiennebacher

Hello, I see that the required R version has been bumped from 3.0.2 to 4.0 in #128. I'm wondering what the reason behind this change is.

I understand that the tidyverse policy is to support the previous 4 versions, meaning back to 4.0 at the time of writing. However, AFAIK evaluate is not part of the tidyverse and is not listed in tidyverse::tidyverse_deps(). Also, while one can choose to use alternatives to the tidyverse packages for data wrangling, r-lib packages are the foundation for many tools used across a large number of packages, e.g. testthat, knitr. With this change, no package can run testthat on versions < 4.0 anymore.

So unless this change is motivated by technical reasons (which doesn't seem to be the case based on the changes in #128), wouldn't it be better to keep the required R version as low as possible?

Finally, I didn't see any announcements on this quite important change and this is also not mentioned in NEWS. Having some information on this and some explanation of the consequences would be greatly appreciated.

Thanks

@yihui
Collaborator

yihui commented Jun 22, 2024

> So unless this change is motivated by technical reasons (which doesn't seem to be the case based on the changes in #128), wouldn't it be better to keep the required R version as low as possible?

As the former maintainer of this package, I tend to agree with you. I don't see strong technical reasons to require R >= 4.0. As far as I can see, the relevant commit was 9dcfe92. I think the cost is too high on the user side if the goal is to remove 8 lines of code from this package.

(Also see yihui/knitr#2351)

@hadley
Member

hadley commented Jun 24, 2024

The first problem is that we have not been testing on older versions of R for quite some time, so I don't have any evidence to suggest that evaluate does actually work on earlier versions of R. (I have a vague recollection of seeing some function that was only available in newer R versions, but I might be misremembering).

The second problem is that while it appears to be a few lines of code now, maintaining compatibility with older versions of R is a tax that we have to pay on all future development. (Additionally, there has been quite a bit of refactoring since that change was made, and I've definitely been taking advantage of modern R features.) We're definitely happy to pay this tax for 5 years' worth of backward compatibility (as our tidyverse position makes clear), but the benefits of maintaining compatibility even further back in time are less clear to me (especially since users can't easily get binaries). I don't recall anyone wanting to use the tidyverse on even older versions of R, so if you have specific use cases, I'd love to hear them.

(I'll fix the problem with this not being mentioned in the news momentarily.)

@hadley
Member

hadley commented Jun 24, 2024

A quick analysis suggests that this is unlikely to have much impact on testthat, since it already depends on brio, fs, glue, lifecycle and waldo, which all depend on R 3.6.0 (and will be bumped to 4.0.0 when we next release them). In the past we have explored trying to keep specific packages depending on old versions of R, but it ends up being a big lift because our tooling doesn't really support it. Of course, we could make our tools support it, but we haven't had any user requests for support on old versions of R, and I'm not even sure how many people are using tidyverse packages on 4.0.0, let alone older versions.

Code
library(dplyr, warn.conflicts = FALSE)
library(purrr)

deps <- pak::pkg_deps("testthat")
#> ℹ Loading metadata database
#> ✔ Loading metadata database ... done
#> 
deps$deps |> 
  set_names(deps$package) |> 
  map(\(df) df |> filter(ref == "R") |> select(version)) |> 
  list_rbind(names_to = "package") |> 
  arrange(desc(package_version(version)))
#> # A data frame: 25 × 2
#>    package   version
#>    <chr>     <chr>  
#>  1 evaluate  4.0.0  
#>  2 brio      3.6    
#>  3 fs        3.6    
#>  4 glue      3.6    
#>  5 lifecycle 3.6    
#>  6 testthat  3.6.0  
#>  7 waldo     3.6    
#>  8 pkgbuild  3.5    
#>  9 vctrs     3.5.0  
#> 10 withr     3.5.0  
#> # ℹ 15 more rows

Created on 2024-06-24 with reprex v2.0.2

@etiennebacher
Author

> I don't recall anyone wanting to use the tidyverse on even older versions of R, so if you have specific use cases, I'd love to hear them.

I don't have a problem with the 4/5-year policy of the tidyverse because there are alternatives to it if one wants a longer support period. My issue is that putting this requirement on evaluate basically forces everyone using testthat / knitr in their package to stop testing on R < 4.0.

We discussed our version support policy in the easystats ecosystem, and a poll on LinkedIn showed that about 15% of users had R < 4.0, which is not negligible. Therefore we would like to ensure that our suite of packages works for them, but it is now impossible to test this in the CI workflows.

@hadley
Member

hadley commented Jun 25, 2024

Ok, that's good to know. I'll do some more exploration with the data we have at hand (i.e. number of packages downloaded from PPPM by different R versions) and see if it convinces me. (But even if we make the change for evaluate, it won't directly prevent testthat from requiring R 4.0.0, since part of our release process for minor versions is bumping the R dependency so it matches our stated policy.)

We'll probably also update that blog post to make it clear that our version policy also applies to r-lib packages. Our sense is that five years of support is reasonably generous (especially since folks on older versions of R can continue to use older versions of packages), and a longer window is not something we've had any specific demand for. Continuing to push the support window back generally hampers our development velocity, especially when we want to use features available in recent versions of R, since it forces us to develop our own backports.
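
As an illustration of what such a backport can look like, here is a minimal sketch. It is not code from evaluate: it assumes a hypothetical package that wants to call `deparse1()`, which base R gained in 4.0.0, while still running on R 3.6. The version guard and the fallback definition are illustrative only.

```r
# Hypothetical backport, not taken from evaluate's source:
# define deparse1() only on R < 4.0.0, where base R does not provide it.
if (getRversion() < "4.0.0") {
  deparse1 <- function(expr, collapse = " ", width.cutoff = 500L, ...) {
    paste(deparse(expr, width.cutoff, ...), collapse = collapse)
  }
}

# The call site is identical on old and new versions of R.
deparse1(quote(x + y))
#> [1] "x + y"
```

Every helper of this kind has to be written, tested on the old R version, and eventually removed, which is the maintenance cost being described.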

@hadley
Member

hadley commented Jun 28, 2024

First data pull, which gives the version of R used to download R packages from PPPM in May this year:

  R            n  perc
  <chr>    <dbl> <dbl>
1 3.3         18     0
2 3.4       8085     0
3 3.5      62359     0
4 3.6   15994069     8
5 4.0    5851334     3
6 4.1   27035142    14
7 4.2   26424627    13
8 4.3   41012803    21
9 4.4   81605145    41

So it looks like ~8% of packages are coming from R 3.6 vs 3% from R 4.0. That suggests to me that there is definitely some additional challenge to a major version upgrade.

If we filter the data to just look at Windows packages (strongly suggesting a human is involved rather than a CI job somewhere), the use of 3.6 is even higher.

  R            n  perc
  <chr>    <dbl> <dbl>
1 3.4        404     0
2 3.5       9266     0
3 3.6    4774800    12
4 4.0     281963     1
5 4.1    6602864    17
6 4.2    4854551    13
7 4.3   10669790    28
8 4.4   11604928    30

I believe that most of the people using old versions of R are downloading old versions of packages. But I don't have that data yet, and it might take me a bit longer to get it.
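
For anyone who wants to check the arithmetic, the `perc` column of the first table can be reproduced from `n` alone. This is a minimal sketch in base R: the counts are copied from that table, while the data frame and variable names are made up for the example. It only verifies the percentages, not the underlying download data.

```r
# Download counts by R version, copied from the first table above.
downloads <- data.frame(
  R = c("3.3", "3.4", "3.5", "3.6", "4.0", "4.1", "4.2", "4.3", "4.4"),
  n = c(18, 8085, 62359, 15994069, 5851334,
        27035142, 26424627, 41012803, 81605145)
)

# Share of all downloads, rounded to whole percent like the `perc` column.
downloads$perc <- round(100 * downloads$n / sum(downloads$n))
downloads

# Combined share (in percent) of downloads from R versions older than 4.0.
pre_4 <- sum(downloads$n[package_version(downloads$R) < "4.0"])
pre_4_share <- round(100 * pre_4 / sum(downloads$n), 1)
pre_4_share
```

Almost all of the pre-4.0 share comes from R 3.6; versions 3.3 to 3.5 together round to 0%.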

@lachlansimpson

Just saw this issue in Zendesk ticket #106807: installation of testthat failed in R 3.6.1. Needed to recommend the devtools installation:

install.packages('devtools')
library('devtools')
# Install evaluate 0.23, released before the R >= 4.0 requirement was added
install_version("evaluate", version = "0.23", repos = "http://cran.us.r-project.org")
install.packages('testthat')

@hadley
Member

hadley commented Jul 3, 2024

I've restored R 3.6 compatibility for the next release since it looks like it's going to take me a bit longer to do a full analysis, and the changes weren't too arduous.

@hadley
Member

hadley commented Jul 3, 2024

I somehow missed the fact that the requirement occurred in the released version of evaluate, which was published on CRAN on June 10. That's about 3 weeks ago, and we've only heard one complaint from a user; so I think that suggests there are very few folks on 3.6 who are going to be affected by this.

(Nevertheless, I'll continue my analysis so I can get more concrete data to back this up, and evaluate 1.0.0, which we'll release shortly, will only need R 3.6.)

@etiennebacher
Author

> That's about 3 weeks ago, and we've only heard one complaint from a user; so I think that suggests there are very few folks on 3.6 who are going to be affected by this.

I rather think it shows that few developers are testing on 3.6, not that there are few users on 3.6. Technically we could stop testing on R < 4.0 and still provide our packages for users on 3.6, but that's not a great experience since we don't actually know if the package would work correctly on 3.6.

@hadley
Member

hadley commented Jul 3, 2024

@etiennebacher we broke knitr/rmarkdown on R 3.6 and no one complained about it. I think that implies there are few people using R 3.6 + recent evaluate.

@IndrajeetPatil

We also discovered this breakage in {lintr} CI, but decided to bump the minimum needed R version instead. We reckoned that since {lintr} enforces the tidyverse style guide, it should also adopt the same R version support policy (ditto for {styler}). I am mentioning these two examples just to highlight that even the developers who were testing on R 3.6 might not have complained; they might have decided to just bump the R version requirement instead.

But, as Etienne mentioned, we did notice this in the easystats ecosystem where we don't use any of the tidyverse packages, and wished to support at least five previous R versions. After the current change, we could neither test ({testthat}) nor build vignettes ({knitr}) on R 3.6 in our CI, and thus this issue.

I think it would really help to clarify in writing that r-lib packages also follow the same R version support policy as the tidyverse packages, to avoid such confusion in the future.

@hadley
Member

hadley commented Jul 3, 2024

@IndrajeetPatil the point I'm making is that evaluate (and hence knitr and testthat) has not worked on R 3.6 for 21 days and no user has complained about it. I know that this change has affected developers, but I have yet to get any evidence that it has affected folks using R to do data science.

@hadley
Member

hadley commented Jul 4, 2024

Version policy page now updated: https://www.tidyverse.org/blog/2019/04/r-version-support/

@hadley
Member

hadley commented Jul 5, 2024

Ok, I've convinced myself that my hypothesis was wrong — people using old versions of R are using modern versions of packages, because PPPM provides binaries for them. That means that at a minimum the testthat stack needs to stay working on R 3.6 so folks can continue to test packages if they want to. This won't affect the tidyverse version policy in general (at least not yet), but we will guarantee that testthat (and all of its dependencies) continues to work on R 3.6.

@hadley closed this as completed Jul 5, 2024

@etiennebacher
Author

Thanks for the discussion and the analysis. Just to be sure I'm not missing anything, does this mean that the change in the tidyverse policy indicating that this also applies to r-lib will be reverted?

@hadley
Member

hadley commented Jul 5, 2024

@etiennebacher no. Currently this is a special carve out for testthat and dependencies only.

@hadley
Member

hadley commented Jul 9, 2024

It looks like I made some mistake in my initial analysis because it now looks like R 3.6 downloads are much rarer:

  r_minor        n  perc
  <chr>      <int> <dbl>
1 3.6       202845   0.5
2 4.0      1118800   2.7
3 4.1      2479137   6  
4 4.2      3101150   7.5
5 4.3      9095999  22  
6 4.4     23176555  56.1
7 4.5       696865   1.7

That said, my hypothesis about people using older versions of R downloading older versions of R packages is clearly wrong. The following plot shows downloads by package release date broken down by version of R. By and large there is little difference in the distribution between R versions:

[Plot: downloads by package release date, broken down by R version]

I'm at useR this week, but I'll hopefully have time to work through the analysis with some colleagues next week so we can be more confident that it's correct this time 😄

@hadley
Member

hadley commented Jul 9, 2024

Worth noting that this is data from Posit Public Package Manager (which has binaries for 3.6 and up), but the total number of downloads is about 30% of our main CRAN mirror (which doesn't have binaries for recent package versions on old R versions).

@gtritchie

FWIW (after the fact) this also caused us problems a few weeks back on the RStudio build machines where two of our platforms (focal and centos7) still have R 3.x. rstudio/rstudio#14904

@klmr

klmr commented Jul 24, 2024

> @etiennebacher we broke knitr/rmarkdown on R 3.6 and no one complained about it. I think that implies there are few people using R 3.6 + recent evaluate.

For what it's worth I was going to report this as an issue, and simply haven't gotten around to it yet.

(For my purposes I’d be happy enough to use an old version of the dependent packages — ‘knitr’ etc. — but unfortunately the ‘pak’ dependency resolver does not seem to take the R version into account when computing compatible dependencies.)

(Cf. https://github.com/klmr/box/actions/runs/10050247328/job/27777797147#step:5:1)

@hadley
Member

hadley commented Jul 24, 2024

A possible way forward is r-lib/actions#879
