[R-package] CRAN note about example timings > 5 seconds #2988

Closed
jameslamb opened this issue Apr 9, 2020 · 4 comments

@jameslamb
Collaborator

See #2985 (comment).

The R package has been just barely under this timing I guess:

* checking examples ... NOTE
Examples with CPU or elapsed time > 5s
     user system elapsed
dim 5.426  0.264    5.79

We can hopefully get back under it by making some of the examples less costly to run or by using \dontrun guards. I would prefer to make the examples simpler, since \dontrun guards mean that we lose testing of the documentation.
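To make that trade-off concrete, here is a minimal sketch of a roxygen block that mixes a cheap, checked example with a guarded one. The function `slow_fit()` and its argument `n` are hypothetical and only stand in for an expensive LightGBM call; nothing here is taken from the actual package docs.

```r
#' Fit a (hypothetical) slow model
#'
#' @param n Number of simulated rows.
#' @examples
#' # Cheap call: executed by R CMD check, so the documentation is tested.
#' fit <- slow_fit(n = 100L)
#'
#' \dontrun{
#' # Expensive call: skipped by R CMD check, so it is never tested.
#' fit_big <- slow_fit(n = 1000000L)
#' }
slow_fit <- function(n) {
    # Simulate a simple linear relationship and fit it with base R.
    x <- seq_len(n)
    y <- 2 * x + 1
    stats::lm(y ~ x)
}
```

Anything inside the \dontrun block can silently break without the check noticing, which is why shrinking the example is the safer option when it is feasible.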

I ran the following on my Mac to get timings for the examples:

Rscript build_r.R --skip-install
R CMD check lightgbm*.tar.gz --as-cran --timings
Rscript -e "
    timing_df <- read.delim('lightgbm.Rcheck/lightgbm-Ex.timings');
    print(timing_df);
    print(paste0('total time (s) ', sum(timing_df[['name']])))
    "

The timings are as shown:

                             name  user system elapsed
dim                         0.695 0.036  0.731      NA
dimnames.lgb.Dataset        0.056 0.007  0.053      NA
getinfo                     0.060 0.002  0.056      NA
lgb.Dataset                 0.049 0.007  0.049      NA
lgb.Dataset.construct       0.043 0.001  0.035      NA
lgb.Dataset.create.valid    0.034 0.002  0.036      NA
lgb.Dataset.save            0.043 0.002  0.036      NA
lgb.Dataset.set.categorical 0.044 0.002  0.037      NA
lgb.Dataset.set.reference   0.034 0.001  0.036      NA
lgb.cv                      0.284 0.084  0.188      NA
lgb.dump                    0.070 0.012  0.059      NA
lgb.get.eval.result         0.066 0.013  0.060      NA
lgb.importance              0.264 0.042  0.214      NA
lgb.interprete              0.951 0.048  0.507      NA
lgb.load                    0.148 0.021  0.079      NA
lgb.model.dt.tree           0.118 0.029  0.116      NA
lgb.plot.importance         0.130 0.028  0.124      NA
lgb.plot.interpretation     0.926 0.042  0.479      NA
lgb.prepare                 0.020 0.001  0.012      NA
lgb.prepare2                0.010 0.001  0.006      NA
lgb.prepare_rules           0.015 0.002  0.019      NA
lgb.prepare_rules2          0.011 0.002  0.014      NA
lgb.save                    0.064 0.014  0.061      NA
lgb.train                   0.064 0.012  0.060      NA
lgb.unloader                0.066 0.013  0.063      NA
predict.lgb.Booster         0.068 0.013  0.060      NA
readRDS.lgb.Booster         0.076 0.013  0.077      NA
saveRDS.lgb.Booster         0.073 0.014  0.067      NA
setinfo                     0.052 0.003  0.046      NA
slice                       0.052 0.002  0.045      NA
[1] "total time (s) 4.586"
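One thing to note about the output above: the header looks shifted by one column. The example names ended up as row names, the values under `name` look like the user times, and there is a trailing all-NA column, which is presumably why summing the `name` column produces a sensible total. Here is a hedged sketch of a more explicit way to read the file, using three rows copied from the table. It assumes each data row has a trailing tab, and the 0.5 second threshold is an arbitrary choice for illustration.

```r
# A few rows copied from the output above. The real input would be the file
# 'lightgbm.Rcheck/lightgbm-Ex.timings'. The trailing tab on each data row
# is an assumption about the file layout.
timings_text <- paste(
    "name\tuser\tsystem\telapsed",
    "dim\t0.695\t0.036\t0.731\t",
    "lgb.interprete\t0.951\t0.048\t0.507\t",
    "lgb.plot.interpretation\t0.926\t0.042\t0.479\t",
    sep = "\n"
)

# Skip the header line and supply column names explicitly, so the example
# name stays in its own column instead of becoming a row name.
timing_df <- read.delim(
    text = timings_text,
    header = FALSE,
    skip = 1L,
    col.names = c("name", "user", "system", "elapsed", "extra"),
    stringsAsFactors = FALSE
)
timing_df[["extra"]] <- NULL

# Total user time, plus the examples above an arbitrary 0.5s threshold,
# slowest first.
total_user <- sum(timing_df[["user"]])
is_slow <- timing_df[["user"]] > 0.5 | timing_df[["elapsed"]] > 0.5
slow <- timing_df[is_slow, ]
slow <- slow[order(-slow[["user"]]), ]

print(slow)
print(paste0("total user time (s) ", total_user))
```

Because `col.names` fixes the number of columns and `read.delim()` fills short rows, this should also work if the rows turn out not to have the trailing tab.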

It looks like the examples for lgb.interprete() and lgb.plot.interpretation() together take almost 2 seconds to run. I'll see if I can speed them up or just \dontrun them.

@StrikerRUS
Collaborator

Is this worth it at all? Maybe just count this NOTE as allowed? I think in the future it will be good to have some complete, larger examples to help users understand how LightGBM works. Or is it better to move as much of the example content as possible to demo?

@jameslamb
Collaborator Author

The best solution would be to write proper vignettes. If we don't have an issue for that, I'll make one.

Demo content is tough for users to discover: you have to literally call demo() in an R console. If you need evidence that demo isn't used much in R projects these days, note that pkgdown doesn't render the demos anywhere: https://lightgbm.readthedocs.io/en/latest/R/reference/

Vignettes are the right place for long-form documentation.

Example: future:

https://cran.r-project.org/web/packages/future/vignettes/future-1-overview.html

These vignettes get indexed by search engines, are more expressive because they're written in R Markdown (so you can mix in formatting and long-form text), and automatically get put into an 'Articles' section in pkgdown (example: https://uptake.github.io/pkgnet//articles/pkgnet-intro.html).

> Is this worth it at all?

I think it's worthwhile to have one example for every exported object in a package. I think it's powerful to have one copy-pastable example in the documentation from ?<object-name>. It gives users a sense of what the valid values are.

So I am going to create a PR today that speeds up the current examples.

@StrikerRUS
Collaborator

Oh, I meant vignettes. I remember your comment #1944 (comment).

> Is this worth it at all?

Will it be possible to keep all future examples under 5s? Or will we have to ignore this NOTE anyway at some point in the future?

@jameslamb
Collaborator Author

I think it will be possible to keep them well under 5s. I think I can get the examples down to a bare minimum, and otherwise we can wrap them in \dontrun blocks. The fewer special cases we have to rely on CRAN accepting, the better.
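For checking a candidate example locally before it goes into the docs, something like the following rough sketch could work. The helper name `time_example()` and its `limit_seconds` argument are made up for illustration, and CPU time is approximated here as user plus system time of the current process.

```r
# Hypothetical helper: time an expression and compare CPU and elapsed time
# against a limit (5 seconds by default, matching the CRAN NOTE).
time_example <- function(expr, limit_seconds = 5) {
    # 'expr' is evaluated lazily, so it runs inside system.time().
    timing <- system.time(expr)
    cpu <- sum(timing[c("user.self", "sys.self")])
    elapsed <- timing[["elapsed"]]
    list(
        cpu = cpu,
        elapsed = elapsed,
        too_slow = cpu > limit_seconds || elapsed > limit_seconds
    )
}

# Usage with a cheap stand-in expression; a real check would wrap the body
# of a documentation example instead.
res <- time_example(sum(sqrt(seq_len(100000))))
print(res[["too_slow"]])
```

Timings on a development machine will differ from the ones on CRAN's check machines, so this is only a rough guide and it makes sense to leave a comfortable margin.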
