2.0.0 Release Candidate #9497
Comments
This is exciting! Thanks for driving it. Quick question: why are you bumping the version to v2.1 though? #9498 |
@terrytangyuan #9498 is updating the development branch. We have a separate branch for the 2.0 release. |
It's bumping from 2.0.0 to 2.1.0 directly. Are there new features in 2.1.0 that are worth a minor release? |
@terrytangyuan The 2.1 bump is for the master branch as the future release version. |
Maybe I am not following the current way of releasing in this project but usually we only bump versions right before we release. This way we can determine the release version based on the additional commits on top of the last release. Another concern is that if users install the R package directly from GitHub, they would not be able to upgrade later to the official 2.1 release since it already exists. One convention (at least in the R community) is to use |
Yeah, it would be confusing for R users. For Python users, the version is actually |
Thanks for this. Will the release date also be 28 Aug, or later? |
It will be a bit later even if everything goes well. We still need to go through all the CI building and package submissions. |
Delaying the release due to blocking Spark issues: #9510. |
@hcho3 We are hitting the total project size limit on PyPI. I will be removing some prior RC releases (like 1.0.0rc1) from PyPI. What do you think? |
Sure, sounds good to me |
I remember we can request an exception by filing an issue with PyPI. |
Yes, as documented in https://pypi.org/help/#project-size-limit. I will do that after the release. At the moment, I think pulling down prior RC releases seems reasonable. |
I'm pulling down binary files and keeping the release tags. The source package will continue to be available. |
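To see which old releases take up the most space before pulling them down, one could total the file sizes per release. The sketch below is only an illustration: it assumes the layout of the PyPI JSON API response (a `releases` mapping from version string to a list of file entries, each with a `size` in bytes), the helper names `fetch_project_json`, `release_sizes`, and `rc_releases` are hypothetical, and the release-candidate check is a crude substring match.

```python
import json
from urllib.request import urlopen


def fetch_project_json(project):
    """Download project metadata from the PyPI JSON API (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{project}/json") as resp:
        return json.load(resp)


def release_sizes(meta):
    """Map each version to the total size in bytes of its uploaded files."""
    return {
        version: sum(f.get("size", 0) for f in files)
        for version, files in meta.get("releases", {}).items()
    }


def rc_releases(sizes):
    """Keep only versions that look like release candidates (assumption: 'rc' in the version)."""
    return {v: s for v, s in sizes.items() if "rc" in v.lower()}
```

Something like `rc_releases(release_sizes(fetch_project_json("xgboost")))` would then list candidate versions with their sizes; whether the numbers match what PyPI counts against the limit is not something this sketch verifies.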
Some other R failures on CRAN:
I'm running the tests on a 12-core, 24-thread machine and couldn't reproduce any of the timing violations. |
I run the test against |
I think those example names correspond to the names of the So for example, xgboost/R-package/man/xgb.config.Rd Lines 20 to 28 in 730bc1f
I can see there that |
For the two plotting ones where CRAN is also complaining about the absolute time
It might help to:
e.g. are 50 iterations really necessary to show what this function does? xgboost/R-package/R/xgb.plot.deepness.R Lines 49 to 52 in 730bc1f
I also have not been able to reproduce it. There are some details here about the CRAN check farm, but a lot that's missing (like values of |
@jameslamb Thank you for sharing!
This could be it! Let me try to limit the datatable thread usage and re-submit. |
Second attempt after #9591:
On my machine:
|
I have manually verified with gdb that some of these failing examples use at most 2 threads, by looking at gdb hints like
After #9591, I don't think there's anything we can do from XGBoost's side other than removing those examples. Interestingly, I was only able to get some warnings by using clang instead of gcc to compile xgboost. #9591 lists some notes on how to do it. @jameslamb |
I agree, that just can't be right. To try to help, I looked into a mirror of the R source code, to try to understand the code in
I tried to build the source distribution of
I ran the following:
python ./tests/ci_build/test_r_package.py --task build
and that yielded this error
which I didn't understand, given that the script appears to define xgboost/tests/ci_build/test_r_package.py Lines 279 to 285 in 38ac52d
So I decided to do this with LightGBM, which I knew how to build a CRAN-style source distribution for.
sh build-cran-package.sh --no-build-vignettes
MAKEFLAGS=-j3 \
_R_CHECK_EXAMPLE_TIMING_THRESHOLD_=0 \
_R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_=0.1 \
R --vanilla CMD check \
--no-codoc \
--no-manual \
--no-tests \
--no-vignettes \
--run-dontrun \
--run-donttest \
--timings \
./lightgbm_4.1.0.99.tar.gz
These have the following meanings:
I hope that information will be helpful to you in debugging this. If you tell me how to build the |
@jameslamb Thank you for sharing the detailed information!
After which, there will be a tarball in the working directory. With the same script, you can run
The check uses
Thank you! This is useful, let me try to produce one and share it here. |
I can reproduce the error now based on the
A simple script for the ease of sharing results:
import pandas as pd
from io import StringIO

path = "./xgboost.Rcheck/xgboost-Ex.timings"
with open(path, "r") as fd:
    content = fd.readlines()

# Strip surrounding whitespace from every line before parsing.
newlines = []
for line in content:
    line = line.strip()
    newlines.append(line)
con_content = '\n'.join(newlines)

# The timings file is tab-separated.
df = pd.read_csv(StringIO(con_content), delimiter="\t")
ratio_n = "user/elapsed"
df[ratio_n] = df["user"] / df["elapsed"]
df.to_markdown("timings.md")
# Examples whose CPU time is more than 2.5x their elapsed time.
offending = df[df[ratio_n] > 2.5]
offending.to_markdown("offending.md")
|
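For anyone without pandas installed, the same check can be sketched with the standard library alone. This is an illustrative variant, not the script used in the thread: the function name `offending_examples` is hypothetical, the `name` column is an assumption about the timings file header (only `user` and `elapsed` appear in the script above), and rows with zero elapsed time are skipped rather than reported.

```python
import csv
from io import StringIO

# Same cutoff as the pandas script in the thread.
RATIO_THRESHOLD = 2.5


def offending_examples(timings_text, threshold=RATIO_THRESHOLD):
    """Return (name, ratio) pairs whose user/elapsed ratio exceeds the threshold."""
    # Strip each line so stray whitespace does not leak into the header or fields.
    cleaned = "\n".join(line.strip() for line in timings_text.splitlines())
    reader = csv.DictReader(StringIO(cleaned), delimiter="\t")
    flagged = []
    for row in reader:
        user = float(row["user"])
        elapsed = float(row["elapsed"])
        if elapsed <= 0:
            # Too fast to time; the ratio is undefined, so skip the row.
            continue
        ratio = user / elapsed
        if ratio > threshold:
            # "name" is an assumed column label for the example name.
            flagged.append((row["name"], ratio))
    return flagged
```

Passing the contents of `./xgboost.Rcheck/xgboost-Ex.timings` to `offending_examples` should give a list comparable to `offending.md`, apart from the zero-elapsed rows.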
Extracting the example out as an independent script doesn't reproduce the error. For instance, I took the example from
library(xgboost)
data(agaricus.train, package = "xgboost")
## Keep the number of threads to 1 for examples
nthread <- 1
data.table::setDTthreads(nthread)
train <- agaricus.train
bst <- xgboost(
  data = train$data, label = train$label, max_depth = 2,
  eta = 1, nthread = nthread, nrounds = 2, objective = "binary:logistic"
)
config <- xgb.config(bst)
And run:
time Rscript test-config.R
This is reported by
Similar observation on update |
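When timing a standalone script like this, it can be handy to get the user CPU time and elapsed time as numbers rather than reading them off `time`. The following is a minimal sketch under stated assumptions: it only works on Unix-like systems (it relies on the `resource` module), the helper name `cpu_to_elapsed` is hypothetical, and it measures the whole child process, including interpreter start-up, so it is not the same measurement that `R CMD check` makes per example.

```python
import resource
import subprocess
import sys
import time


def cpu_to_elapsed(cmd):
    """Run cmd and return (user_cpu, sys_cpu, elapsed) in seconds for the child process."""
    # RUSAGE_CHILDREN accumulates CPU time of children that have been waited for.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    t0 = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - t0
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user_cpu = after.ru_utime - before.ru_utime
    sys_cpu = after.ru_stime - before.ru_stime
    return user_cpu, sys_cpu, elapsed
```

For example, `user, system, elapsed = cpu_to_elapsed(["Rscript", "test-config.R"])` followed by `user / elapsed` gives a ratio comparable in spirit to the `user/elapsed` column computed earlier; a value well above 1 suggests more than one thread was busy.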
@jameslamb I think I have found (one of) the causes for XGBoost. XGBoost uses multiple threads to load a model, and model loading is used in one of the examples |
@hcho3 We will delay the R release to 2.1. Anything that concerns configuration is not trivial. |
Alternatively, we could just restrict the number of threads during model load for the R build. |
I wrote a helper script for running examples individually:
library(pkgload)
library(xgboost)
files <- list.files("./man")
run_example_timeit <- function(f) {
  path <- paste("./man/", f, sep = "")
  print(paste("Test", f))
  flush.console()
  t0 <- proc.time()
  run_example(path)
  t1 <- proc.time()
  list(file = f, time = t1 - t0)
}
timings <- lapply(files, run_example_timeit)
for (t in timings) {
  ratio <- t$time[1] / t$time[3]
  if (!is.na(ratio) && !is.infinite(ratio) && ratio >= 2.5) {
    print(paste("Offending example:", t$file, ratio))
  }
}
|
!!! this is very helpful, thank you! |
Failed CRAN check with reverse dependencies. Likely caused by the change of the default tree method and the addition of initial estimation.
Debian log
Package check result: OK
Changes to worse in reverse depends:
Package: CausalGPS
Package: CRE
Package: GPCERF
Package: lime
Package: pdp
Package: personalized
--- re-building ‘fitting_itrs_with_xgboost.Rmd’ using rmarkdown
Quitting from lines 125-141 [unnamed-chunk-4] (fitting_itrs_with_xgboost.Rmd)
--- re-building ‘multicategory_treatments_with_personalized.Rmd’ using rmarkdown
--- re-building ‘usage_of_the_personalized_package.Rmd’ using rmarkdown
SUMMARY: processing the following file failed:
Error: Vignette re-building failed.
Package: personalized
Package: pmml
|
We will delay the release for R to 2.1. In the meantime, I will prepare for a patch release for https://github.com/dmlc/xgboost/projects/12. Thank you to everyone who has participated! |
Sounds good, |
We are about to release version 2.0.0 of XGBoost. We invite everyone to try out the release candidate (RC).
Roadmap: https://github.com/dmlc/xgboost/projects/2
Release note: #9484
Feedback period: until the end of August 28, 2023. No new feature will be added to the release; only critical bug fixes will be backported.
@dmlc/xgboost-committer
Available packages:
R binary packages with CUDA enabled for testing:
sha256sum:
Install:
Maven
SBT
Starting from 1.2.0, XGBoost4J-Spark supports training with NVIDIA GPUs. To enable this capability, download artifacts suffixed with -gpu, as follows:
Maven
SBT
backports