bump stated R dependency? #5757
I would think that if CI minutes are limited we should test
|
Rather than 3.4.4 and 3.5, I would prefer to have 6 jobs (win/macos * release/devel/oldrel) so we can provide more binaries. Also, #5745 should help save some minutes, so we don't have to freeze suggested deps and risk a breaking change sneaking in unnoticed. |
Are you able to check how many minutes we are using on GitHub Actions? I can see that for my own account under Settings -> Billing and plans (but it says zero, so I'm not sure that is correct), but I cannot see usage for the Rdatatable org. |
No I am not able to check for Rdatatable org. I am able to check, same as you, for my own namespace.
On GitLab free plan, as Rdatatable org, we have
As a user, I also have 2000 minutes/month, but 10GB of storage. So the question is how many compute minutes we get as the Rdatatable org on GitHub. For that I believe we need Matt, as we have privileges only on the repository, and not the org. |
Keeping support for very old R is great, but I don't think it makes sense to keep tracking arbitrarily old R versions indefinitely. We are just signing ourselves up for ever-increasing tech debt with limited benefit. Eventually people need to upgrade R. Better to set a support policy and gradually start bringing our dependency forward. R 3.1.0 is nearly 10 years old: https://github.com/wch/r-source/tree/tags/R-3-1-0 I think even a policy of 5 years old (R 3.5.0) is quite generous, but 6 or 7 would also be fine. I don't think it's reasonable to expect {data.table} to cater to decade-old installations; archived versions of {data.table} exist for this reason. |
Let's see if there are active users of data.table on R 3.1.0. Sticking blindly to a 5 (or 6 or 7) year rule doesn't sound like a great idea. I'm not sure which package it was, but once in production I had to use my own fork of a package, and the only change was pushing the stated dependency back to an older version. Everything worked fine. We don't want our users to be forced into tricks like this because we blindly follow a "5 years rule". I do believe we should bump the stated dependency, but only when we are ready to follow up with the benefits it gives. |
so far we are too "greedy" about this. 3.2.0 does not bring (us) many direct benefits vs 3.1.0. But it gets us closer to more recent R where there are larger benefits. And I would rather gradually hit 3.2.0, 3.3.0, ... than jump suddenly to 3.6.0.
I'm not worried about this. we are quite good about earmarking which code can be updated once we depend on certain R version. We will very quickly become incompatible with older R upon upgrade. |
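For illustration, earmarking could look something like the sketch below. It is only an assumption about the pattern, not a copy of what the package does: a fallback is defined on old R next to a note saying when it can be dropped. The one fact relied on is that `isFALSE()` entered base R in 3.5.0.

```r
# Hypothetical shim, for illustration only.
# TODO: remove once the stated dependency is R (>= 3.5.0)
if (getRversion() < "3.5.0") {
  # same contract as base::isFALSE(): a single, non-NA logical that is FALSE
  isFALSE = function(x) is.logical(x) && length(x) == 1L && !is.na(x) && !x
}

# behaves the same whether the shim or the base function is in use
stopifnot(isFALSE(FALSE), !isFALSE(NA), !isFALSE(c(FALSE, FALSE)))
```

Once the dependency moves past the earmarked version, the whole `if` block can be deleted and nothing else changes.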
Is there a way (that I'm not aware of) for us to know what versions of R are downloading data.table without a survey? |
Yes, described in the first post. CSV files have that field. |
From December 2022 to November 2023: 365 days, of which 280 were valid (the rest may be missing or hit a network error).
library(data.table)
l = list.files() # obtained from cran.stats
d = rbindlist(lapply(l, function(f) {cat(f,"\n",sep=""); fread(f, showProgress=FALSE)[package=="data.table", .N, r_version]}))
d[,sum(N),substr(r_version,1,3)][order(-V1)]
almost 1% of users were on 3.4. |
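To turn counts like these into shares, something along these lines should work. It is a minimal sketch with made-up toy counts (NOT the real CRAN numbers), assuming a table with the same `r_version` and `N` columns as `d` above; `shares`, `minor` and `pct` are names invented here.

```r
library(data.table)

# Toy counts, NOT real download data: one row per full R version string
d = data.table(
  r_version = c("4.3.1", "4.3.2", "4.2.3", "3.6.3", "3.4.4"),
  N = c(500L, 300L, 150L, 40L, 10L)
)

# collapse to major.minor, then express each group as a percentage of all downloads
shares = d[, .(N = sum(N)), by = .(minor = substr(r_version, 1, 3))]
shares[, pct := round(100 * N / sum(N), 2)]
setorder(shares, -N)
print(shares)
```

Note that `substr(r_version, 1, 3)` assumes a single-digit minor version, which holds for every R release so far.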
What do we know about the source for this data? E.g. we see 5 people on 3.1, how likely is it that's just someone like us running really old R for testing purposes? (either way I think it's clear we can bump to 3.2 ASAP and 3.3 in the subsequent release) |
This is from cloud.r-project.org, which is widely used in CI setups, so there may be a bias toward 4.4, 4.3 and 4.2. There are tens of different mirrors, so it is just a (slightly biased) sample :) |
I am quite happy staying on R 3.1.0, so it is not like we need to upgrade.
One practical aspect could be to simplify CI. We currently have a couple of CI jobs (3.1, 3.4.4, 3.5) testing different corner cases present in each of those versions. It turns out that CI minutes are now quite limited on free plans...
Bumping to 4.0.0 would allow us to use reference counting, although I am not sure we have the dev time to really work on that. That would also be a huge bump of 6 years, from supporting environments set up in 2014 to 2020.
Such a big change could also be postponed until major breaking changes land in the master branch as well. Otherwise, bumping to 3.5 is a middle step.
Before any change we should definitely investigate which R versions data.table users are on, based on data from http://cran-logs.rstudio.com (see @arunsrinivasan 's https://github.com/arunsrinivasan/cran.stats)
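As a rough sketch of where those files come from: the daily logs appear to follow a `<year>/<date>.csv.gz` layout under that host. That layout is an assumption here and worth checking against the site before relying on it; `log_urls` is a hypothetical helper, not part of cran.stats.

```r
# Build one URL per day; the path layout is assumed, not confirmed in this thread
log_urls = function(from, to) {
  days = seq(as.Date(from), as.Date(to), by = "day")
  sprintf("http://cran-logs.rstudio.com/%s/%s.csv.gz",
          format(days, "%Y"), format(days, "%Y-%m-%d"))
}

urls = log_urls("2023-11-01", "2023-11-03")
print(urls)
```

Each file could then be downloaded and filtered to `package == "data.table"`, keeping the `r_version` field.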
My personal preference would be to support as old an R version as feasible, possibly removing the R 3.4.4 and R 3.5.0 CI jobs and keeping the R 3.1.0 job.