Use existing package to handle distributions and transformations #39
Comments
Some criteria I recommend using in your choice:

Thanks for the pointers, will keep them in mind when investigating which package to use.
Rationale for distribution package used in
| | `distr` | `distr6` | `distributional` | `distributions3` |
|---|---|---|---|---|
| no. direct dependencies | 8 | 8 | 11 | 3 |
| no. recursive dependencies | 9 | NA | 38 | 37 |
| no. reverse dependencies | 26* | NA | 7 | 1 |
| hosted on GitHub | ✗ | ✓ | ✓ | ✓ |
| last updated (year) | 2019 | 2022 | 2022 | 2022 |
| on CRAN | ✓ | ✗ | ✓ | ✓ |
*some of `distr`'s reverse dependencies are other packages in the `distr` ecosystem
`distr6` and `distr` are the most complete packages, offering the most distributions. However, they are based on the R6 and S4 object-oriented systems in R, respectively, and may therefore add unnecessary complexity for users who are new to R. `distr6` has the most principled design philosophy and can be easily understood from its selection of vignettes. However, `distr6` is no longer hosted on CRAN (it was archived on 2022-08-20). `distr` has a complex hierarchy of classes which will likely not be fully utilised in `epiparameter`, as we will not be doing many transformations or applying arithmetic operations to multiple distributions. The documentation for `distr` is unconventional for an R package in that it is sparse and consists of a single vignette containing a large amount of information. `distr` is also part of a wider ecosystem of R packages developed by the same team (e.g. `distrEx`, `distrMod`, `distrSim`, etc.), which may mean that multiple dependencies are needed for full functionality.
This leaves `distributional` and `distributions3`. Both implement distributions as S3 objects and, as a result, should be easy to use for people new to R. Their functionality largely overlaps; the major difference between the packages is the use of `vctrs` by `distributional` to implement vectorised distribution objects. `distributions3` has good documentation, but it is mainly focused on hypothesis testing rather than on the distribution objects. `distributional` has good documentation at the function level but lacks vignettes. `distributions3` implements some zero-truncated distributions but does not allow truncating an existing distribution object, whereas `distributional` allows truncation of existing objects across a wider range of distributions. The same goes for zero-inflated distributions: `distributions3` implements zero-inflated versions of certain distributions, whereas `distributional` allows zero-inflation to be applied to a range of distribution objects.
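To make the comparison above concrete, here is a minimal sketch of the three `distributional` features mentioned: vectorised distribution objects, truncation of an existing object, and zero-inflation of an existing object. It assumes the `distributional` package is installed; the parameter values and variable names are arbitrary illustrations, not anything taken from `epiparameter`.

```r
# Minimal sketch; assumes the {distributional} package is installed.
library(distributional)

# Vectorised distribution objects: one vector holding two gamma distributions.
# Parameter values are arbitrary illustrations, not epidemiological estimates.
gammas <- dist_gamma(shape = c(2, 3), rate = c(1, 0.5))
length(gammas)  # number of distributions held in the vector
mean(gammas)    # one mean per distribution

# Truncate an existing distribution object (a gamma restricted to [1, 10]).
gamma_trunc <- dist_truncated(
  dist_gamma(shape = 2, rate = 1),
  lower = 1,
  upper = 10
)
cdf(gamma_trunc, 0.5)  # evaluated below the lower truncation bound

# Apply zero-inflation to an existing distribution object.
pois_zi <- dist_inflated(dist_poisson(lambda = 2), prob = 0.3)
density(pois_zi, 0)    # probability mass at zero, including the inflation
```

The same wrappers (`dist_truncated()` and `dist_inflated()`) take a distribution object as their first argument, which is what allows them to be applied across a range of distributions rather than being tied to specific zero-truncated or zero-inflated families.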
Other packages like `extraDistr` were not evaluated as they mainly implement uncommon distributions.
One extra good point for distributional is that it should soon be more lightweight: mitchelloharawild/distributional#62.
Since #85, {epiparameter} uses {distributional} and {distcrete} for handling distributions, so I am closing this issue.
Distributions have already been implemented by several packages (https://cran.r-project.org/web/views/Distributions.html), along with the conversion functions that ship with them (https://pkg.mitchelloharawild.com/distributional/reference/index.html). It would be good to utilise one of these packages to minimise the dev load on the distribution side of the package and instead have most of the dev focus on epidemiological data storage and extraction.