
Enhancement for #205 #206

Closed
wants to merge 27 commits into from

Conversation

Max-Bladen
Collaborator

test: first commit to trial the new test system which draws the ground truth values from external RData files

@Max-Bladen Max-Bladen added the enhancement-request New feature or request label Apr 5, 2022
@Max-Bladen Max-Bladen self-assigned this Apr 5, 2022
@Max-Bladen Max-Bladen linked an issue Apr 5, 2022 that may be closed by this pull request
@Max-Bladen Max-Bladen changed the title Test commit Enhancement for #205 Apr 5, 2022
@Max-Bladen Max-Bladen mentioned this pull request Apr 10, 2022
@Max-Bladen
Collaborator Author

Max-Bladen commented Apr 11, 2022

Hey @aljabadi

I'd love your feedback for this PR as I'm still in the early stages of it and any changes you suggest would be better now rather than later.

For a given test case (one that isn't testing for a specific error), rather than just testing for a single numeric output, for example:

expect_equal(matrix(auc.plsda$Comp1), rbind(0.863, 2.473e-05))

I am saving the entire output to a repo (found here), loading it within each test, and testing the function against that entire output object, for example:

GH.URL <- "https://github.com/Max-Bladen/mixOmicsWork/blob/main/Test%20Ground%20Truths/auroc/basic.plsda.RData?raw=true"
.load_url(GH.URL)
    
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
    
res.plsda <- plsda(X, Y, ncomp = 2)
    
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
    
expect_equal(plsda.auroc, GT.plsda.auroc)

where GT.plsda.auroc is the saved file at the GH.URL (GT standing for Ground Truth).

My rationale is that when testing for individual numeric values, there are potentially unintended changes to the output's values or structure, outside of this one value, that are being totally missed. The one downside I see to this is the runtime: running .load_url() takes about 2-3 seconds per call.
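For context, the .load_url() helper isn't shown in this thread; a minimal sketch of how such a helper might work (the behaviour here is an assumption, the actual implementation in the PR may differ) is:

```r
# Hypothetical sketch of .load_url(): download an .RData file to a temporary
# location and load its objects into the caller's environment.
.load_url <- function(url, envir = parent.frame()) {
  tmp <- tempfile(fileext = ".RData")
  on.exit(unlink(tmp), add = TRUE)
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  # load() restores the saved objects into `envir` and returns their names
  load(tmp, envir = envir)
}
```

Each call pays the full download cost, which is consistent with the 2-3 second overhead noted above.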

You can see that I'm trying to impose more structure on each test document, so that new tests can be added easily. I also want the analytical and graphical methods to have all (or most) of their potential input objects tested, as well as ensuring each is tested on a variety of datasets. This structure and the expansion of the test sets are outlined on this wiki page.

@aljabadi
Collaborator

Hi @Max-Bladen,

Extending the tests to cover further elements of the output is a great idea. However, we want to keep tests as isolated from the outside of the repo as possible.

We also need to be aware that some output elements are generated by our dependencies (say, a graph), and as long as those dependencies only adjust how they visualise the output, changes to them should not break the tests.

Here is what I suggest for expanding the tests to further elements:

  • Subset the object to include only certain elements most important to our methods
  • Ensure the subset is of a small size
  • Ensure the object is representable by the dput function (dput prints an object's definition so that the object can be recreated programmatically, as opposed to having to save and reload it). Be mindful to exclude non-standard elements (e.g. <environment>). See the example below:
library(testthat)
library(ggplot2)
# my ground truth output
output = list(data = data.frame(a=c(1,2,3)), graph = ggplot(data.frame(b=c(5,6,7))) + geom_point(), class=c('data.frame', 'foo'))

# dput shows the object definition which has non-standard elements (e.g. facet)
dput(output)
#> list(
#>   data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)),
#>   graph = structure(
#>     list(
#>       data = structure(list(), class = "waiver"),
#>       layers = list(<environment>),
#>       scales = <environment>,
#>       mapping = structure(list(), names = character(0), class = "uneval"),
#>       theme = list(),
#>       coordinates = <environment>,
#>       facet = <environment>,
#>       plot_env = <environment>,
#>       labels = list()
#>     ),
#>     class = c("gg", "ggplot")
#>   ),
#>   class = c("data.frame", "foo")
#> )

# I'm only interested in c('data', 'class') so I only keep them
test_els = c('data', 'class')
dput(output[test_els])
#> list(data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)), class = c("data.frame", "foo"))

output_ground_truth = list(data = structure(list(a = c(1, 2, 3)), class = "data.frame", row.names = c(NA, -3L)), class = c("data.frame", "foo"))

expect_equal(output[test_els], output_ground_truth)

# Note that I can be more specific, for instance by adding `output$graph$data` data.frame to
# the output and checking for that too, if that is relevant to my test case

You can even separate the definition of these ground-truth elements into testthat/helper-*.R scripts if they add too much clutter (see https://www.r-bloggers.com/2020/11/helper-code-and-files-for-your-testthat-tests/). I'd intentionally restrict test cases to what is implementable this way, to avoid overcomplicating the package.
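As an illustration of the helper-file approach (a sketch only; the file and object names are hypothetical, not from the PR), testthat automatically sources any tests/testthat/helper-*.R file before running tests, so ground truths captured with dput() can live there:

```r
# tests/testthat/helper-ground-truths.R (hypothetical file)
# Ground-truth objects captured once via dput() and pasted here, keeping the
# test-*.R files themselves uncluttered.
GT.auroc.plsda <- list(
  Comp1 = structure(c(0.863, 2.473e-05), .Dim = 1:2,
                    .Dimnames = list("AF vs BE", c("AUC", "p-value"))),
  Comp2 = structure(c(0.9981, 7.124e-09), .Dim = 1:2,
                    .Dimnames = list("AF vs BE", c("AUC", "p-value")))
)
```

A test file would then compare against it directly, e.g. expect_equal(plsda.auroc[names(GT.auroc.plsda)], GT.auroc.plsda).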

@Max-Bladen
Collaborator Author

Max-Bladen commented Apr 11, 2022

Thanks for the feedback @aljabadi. I agree with everything you mentioned. I didn't know about dput, that's very handy. So instead of this (previously):

GH.URL <- "https://github.com/Max-Bladen/mixOmicsWork/blob/main/Test%20Ground%20Truths/auroc/basic.plsda.RData?raw=true"
.load_url(GH.URL)
    
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
    
res.plsda <- plsda(X, Y, ncomp = 2)
    
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
    
expect_equal(plsda.auroc, GT.plsda.auroc)

Do you feel something like this is more appropriate?

testable.components <- c("Comp1", "Comp2")
    
GT <- list(Comp1 = structure(c(0.863, 2.473e-05), 
                             .Dim = 1:2, 
                             .Dimnames = list("AF vs BE", c("AUC", "p-value"))), 
           Comp2 = structure(c(0.9981, 7.124e-09), 
                             .Dim = 1:2, 
                             .Dimnames = list("AF vs BE", c("AUC", "p-value"))))

data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment
  
res.plsda <- plsda(X, Y, ncomp = 2)
    
plsda.auroc = auroc(res.plsda, roc.comp = 1, print = FALSE)
    
invisible(capture.output(TT <- dput(plsda.auroc[testable.components])))
    
expect_equal(TT, GT)

@Max-Bladen
Collaborator Author

Oh also, are you happy with the different types of tests and how I'm structuring things?

@Max-Bladen
Collaborator Author

Max-Bladen commented Apr 11, 2022

Sorry for the repeated messages @aljabadi. I've just been adjusting the test-auroc.R file with the new style (as seen in the above comment). Things get very messy, very quickly, even with minimal outputs - and this is with auroc(), which has a comparatively small set of outputs. Can I suggest that we keep the ground truths in a separate file for each test-....R file? Maybe have a folder (e.g. ground-truths) within the tests folder that we source. It would certainly make each file significantly cleaner.

test: second trial commit for new test system. draws ground truths from internal structure objects
fix: minor error in the new `test-auroc.R` meant a test was failing
@aljabadi
Collaborator

Hi @Max-Bladen,
If you feel like storing test data is inevitable, I recommend you use inst/testdata. See https://stackoverflow.com/a/41777777.

Please be mindful that our package size currently exceeds the standard 5 MB limit, so this should be a last resort. The scripts that generate these files must be included within the test files, commented out.
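That pattern might look something like this (a sketch only; the fixture name is hypothetical, and a temporary directory stands in for the package's inst/testdata/ so the example is self-contained - in the package itself one would locate the installed file with system.file("testdata", ..., package = "mixOmics")):

```r
# Stand-in for inst/testdata/ so this sketch runs anywhere
testdata <- file.path(tempdir(), "testdata")
dir.create(testdata, showWarnings = FALSE)

# Regeneration script: kept in the test file but commented out, and only
# re-run when the function's expected output legitimately changes. Here a
# small literal stands in for the real auroc() output.
GT.plsda.auroc <- list(Comp1 = c(AUC = 0.863, `p-value` = 2.473e-05))
saveRDS(GT.plsda.auroc, file = file.path(testdata, "auroc-plsda.rds"))

# In the test: read the fixture back and compare against the fresh output
loaded <- readRDS(file.path(testdata, "auroc-plsda.rds"))
stopifnot(identical(loaded, GT.plsda.auroc))
```

Using .rds files via saveRDS()/readRDS() (rather than .RData via save()/load()) keeps one object per file and avoids loading names into the test environment implicitly.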

@Max-Bladen Max-Bladen mentioned this pull request Apr 24, 2022
@aljabadi
Collaborator

aljabadi commented May 2, 2022

Hi @Max-Bladen, looks awesome. Happy to see the files don't take much space.

@aljabadi
Collaborator

aljabadi commented May 2, 2022

If a function is improved and the outcomes change, what is the procedure for changing the fixture data? Is this documented and easy to implement?

@Max-Bladen
Collaborator Author

The procedure is quite simple. I have some notes written down locally. Once I've worked my way through most of the work for this PR (see the last page of the Wiki), I'll update this page to contain a step-by-step process on how to change this data. I've tried to design this with adaptability and expansion in mind.

Max-Bladen added a commit that referenced this pull request May 16, 2022
test: added test cases for `tune.spls()`. These are just placeholders as more robust testing criteria needs to be established as well as this document will be fully updated as part of PR #206
test: updated the test cases for the following functions: circosPlot, block.splsda, network, pca, perf.diablo, perf.mint.splsda, plotIndiv, plotLoadings, plotVar, predict.

Using these to estimate the increase in coverage once this overhaul is complete
fix: temporarily uncommented out test cases that were failing. Unknown source of error
Max-Bladen and others added 16 commits June 23, 2022 14:23
test: added test cases for `tune.block.splsda()`. Additionally, adjusted `MCV.block.splsda()` so that `n` and `repeat.measures` are calculated at the correct point
test: added test cases for `tune.mint.splsda()`.
fix: adjusted functionality of `.minimal_train_test_subset` to allow for greater user control and consistency. adjusted test cases which utilised these accordingly
refactor: changed notation slightly. "edge cases" are now just referred to as "warnings" as that is all they were used for
tests: added new tests for `tune.splsda()`
tests: added `set.seed` to multilevel `tune.splsda()` test to ensure consistency
fix: temporarily removed a couple tests causing conflict in `test-plotLoadings.R`
fix: attempting to revert `test-plotLoadings.R` back to form seen in master to see if this resolves conflict
fix: commented out tune.splsda multilevel test and repaired error causing plotVar tests to fail.

also renamed the `quiet` function to `.quiet` to bring in line with naming structure of helper functions
tests: implemented way to condense the ground truth files by combining any Testable.Components or Ground.Truths which are identical. implemented for `tune.splsda` as trial
tests: readded tests for `plotLoadings()`
fix: temporarily removed the entire `test-plotLoadings.R` file to allow for rebasing of branch
fix: returned `test-plotLoadings.R`
tests: condensed existing `auroc` tests and expanded on its coverage.
@Max-Bladen
Collaborator Author

After discussion, this idea was scrapped.

@Max-Bladen Max-Bladen closed this Sep 21, 2022
Max-Bladen added a commit that referenced this pull request Dec 13, 2022
test: added test cases for `tune.spls()`. These are just placeholders as more robust testing criteria needs to be established as well as this document will be fully updated as part of PR #206
Max-Bladen added a commit that referenced this pull request Dec 13, 2022
test: added test cases for `tune.spls()`. These are just placeholders as more robust testing criteria needs to be established as well as this document will be fully updated as part of PR #206
Max-Bladen added a commit that referenced this pull request Dec 13, 2022
fix: implemented BiocParallel functionality into `tune.spls()` via restructuring of the main portion of the function. bplapply() iterates over all pairs of input keepX/Y so only reduces runtime for long lists of test.keepX/Y

refactor: previous PR left duplicate function used for development. this is rectified and more comments are added

test: added test cases for `tune.spls()`. These are just placeholders as more robust testing criteria needs to be established as well as this document will be fully updated as part of PR #206

tests: updated tests to assess actual values alongside RNG control of parallel function calls.

refactor: replaced all calls to `cpus` to `BPPARAM` in `tune()`

docs: updated `tune.Rd` file with cpus -> BPPARAM shift
Labels
enhancement-request New feature or request
Development

Successfully merging this pull request may close these issues.

Inconsistent and insubstantial test cases