Add PBC database as demonstration data #44

Merged
merged 13 commits into from
Dec 13, 2024

DougManuel
Contributor

This pull request adds the PBC data to demonstrate how to create and use the package. The PBC data is openly accessible and available in the survival package.
Included are:

  • the data: pbc.Rd
  • variables sheet: pbc_variables.csv
  • variable details sheet: pbc_variable_details.csv
  • dataset metadata: pbc_database.yaml. This is a new file and may require discussion. At this point, it is simple metadata describing the data provenance, but the file could potentially be used for other purposes, such as generating a data dictionary. For now, it is introduced simply as a good data management practice.

Note: mtcars is also included in the package, and it requires improvements and additions similar to those proposed for the pbc data.
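
For readers who want to look at the source data before reviewing the sheets, here is a minimal sketch of pulling pbc from the survival package. It assumes survival is installed (it normally ships with R as a recommended package); the object name `na_counts` is only illustrative and is not part of this pull request.

```r
# Take the PBC data frame directly from the survival package namespace.
pbc <- survival::pbc

# Inspect the shape and column names that the variables sheet describes.
dim(pbc)
names(pbc)

# Count missing values per column; several clinical measurements
# are incomplete, which matters for how NA is handled in the sheets.
na_counts <- colSums(is.na(pbc))
na_counts[na_counts > 0]
```

The column names printed here are the ones that pbc_variables.csv and pbc_variable_details.csv are expected to refer to.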

Collaborator

@reikookamoto reikookamoto left a comment

Review the files to ensure they align with the directory structure and organization best practices outlined in the R Packages book.

R/pbc_worksheets.R (outdated, resolved)
R/data.R (outdated, resolved)
R/data.R (outdated, resolved)
inst/extdata/pbc.Rd (outdated, resolved)
R/pbc_worksheets.R (outdated, resolved)
DougManuel added a commit that referenced this pull request Nov 26, 2024
Starts to transition to pbc data for vignette, but with bugs remaining in derived_variables.Rmd
@DougManuel
Contributor Author

  • The documentation website does not yet build on my computer; errors occur at the end of derived_variables.Rmd.

Bug fixes and improvements in recent commits include:

  1. Renamed data files for a more consistent approach to address bugs and confusion. See #details in data.R.

pbc.rda - Primary Biliary Cirrhosis (PBC) data.
pbc_metadata.rda - Metadata for the PBC data. A list containing the database name and other identification information.
pbc_variables.rda - Metadata for each variable in the PBC data. A data frame with 24 rows and 11 columns.
pbc_variable_details.rda - Metadata for each variable category or value label in the PBC data.

#' @details

  2. Updated pbc_variables:
  • Addressed bugs in NA data and incorrect names and labels, and added the age group categorical variables and the derived variable example 'example_der'.
  3. Changed the data in derived_variables.Rmd to the pbc data.
  4. Updated prep_pbc_data.R to generate all .rda data. Remember to run this file whenever you change pbc_variables.csv or the other 'raw' data.
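
To illustrate the CSV-to-.rda step described above, here is a minimal sketch of the general pattern. It is not the contents of prep_pbc_data.R: the file, the columns (`variable`, `label`, `units`), and the object name `pbc_variables_demo` are hypothetical, and it writes to a temporary directory with base `save()` rather than to the package's data directory.

```r
# Hypothetical stand-in for a 'raw' sheet such as pbc_variables.csv.
raw_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "variable,label,units",
  "age,Age,years",
  "ascites,Presence of ascites,"
), raw_csv)

# Read the sheet, treating empty strings as NA.
pbc_variables_demo <- read.csv(
  raw_csv,
  na.strings = c("", "NA"),
  stringsAsFactors = FALSE
)

# Save as .rda; a package would write to its data/ directory instead.
rda_path <- file.path(tempdir(), "pbc_variables_demo.rda")
save(pbc_variables_demo, file = rda_path)

# Reload into a fresh environment to confirm the round trip.
check_env <- new.env()
load(rda_path, envir = check_env)
round_trip_ok <- identical(check_env$pbc_variables_demo, pbc_variables_demo)
round_trip_ok
```

Because the .rda files are generated artifacts, any edit to a raw CSV only reaches users after a step like this is rerun.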

@DougManuel
Contributor Author

The package now passes checks and builds.

@reikookamoto @yulric Do you have any other comments to address?

We now have the PBC data for vignettes and for users to test functions.

We'll need to refresh a few vignettes with minimal examples to use alongside the pbc data.

yulric pushed a commit that referenced this pull request Dec 11, 2024
Starts to transition to pbc data for vignette, but with bugs remaining in derived_variables.Rmd
yulric pushed a commit that referenced this pull request Dec 12, 2024
Starts to transition to pbc data for vignette, but with bugs remaining in derived_variables.Rmd
DougManuel and others added 6 commits December 13, 2024 14:07
…he pbc dataset

* added variables sheet for the pbc dataset
* renamed PBC-variableDetails.csv to follow the naming convention in the
  repo and added fixes to it
  * Fix variable names, removing any variables that are not part of the
    dataset
  * Replace empty string with NA
  * Fix bug with tagged NA for the ascites variable
  * fixed references to the new file
* added metadata for the pbc dataset using the Dublin Core standard
* added documentation for all the above elements and the pbc dataset to
  the documentation website
* added code to add the above files to the data directory allowing users
  of the package to access them easily
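
As an illustration of the "replace empty string with NA" fix listed above, here is a minimal base R sketch. The data frame `details` and its columns (`variable`, `value`, `label`) are hypothetical and do not reproduce the actual variable details sheet; the tagged NA fix for ascites is not covered here.

```r
# Hypothetical excerpt of a variable details sheet with empty strings.
details <- data.frame(
  variable = c("ascites", "ascites", "sex"),
  value = c("0", "1", ""),
  label = c("no", "yes", ""),
  stringsAsFactors = FALSE
)

# Replace empty strings with NA in every character column,
# leaving values that are already NA untouched.
details[] <- lapply(details, function(col) {
  if (is.character(col)) {
    col[!is.na(col) & col == ""] <- NA
  }
  col
})

details
```

Working column by column avoids problems that can arise when a whole-frame comparison meets values that are already missing.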
@yulric yulric merged commit d9ef762 into dev Dec 13, 2024
@yulric yulric deleted the PBC-database branch December 13, 2024 19:12