Closes #2186 TERMID -> TERMNUM, TERMNAME -> TERMCHAR (#2241)
* #2186 TERMID -> TERMNUM, TERMNAME -> TERMCHAR

* #2186 chore: styling and tests

* #2186 chore: roxygen

* Test commit to retrigger workflow

* Undo test commit

* #2186 term_renaming: align sdg

---------

Co-authored-by: Bundfuss, Stefan {MDBB~Basel} <stefan.bundfuss@roche.com>
manciniedoardo and bundfussr authored Nov 21, 2023
1 parent e11104a commit 780029f
Showing 18 changed files with 112 additions and 104 deletions.
8 changes: 8 additions & 0 deletions NEWS.md
@@ -125,6 +125,14 @@ order = exprs(my_order_var),
- `derive_var_ontrtfl(span_period)`

- The `derive_param_extreme_record()` function has been superseded in favor of `derive_extreme_event()`. (#2141)

- `create_query_data()` and `derive_vars_query()` updated to rename variables in
query data set as follows: (#2186)

- `TERMNAME` to `TERMCHAR`
- `TERMID` to `TERMNUM`

Users need to adjust their `get_terms()` function accordingly.
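
For the NEWS entry above, a minimal sketch of what adjusting a user-supplied `get_terms()` function might look like. The helper names `get_terms_old()` and `rename_term_columns()` and the returned data are hypothetical, chosen for illustration only; the sketch uses base R and assumes the legacy function returns a data frame with the old `TERMNAME` and `TERMID` columns.

```r
# Hypothetical legacy helper that still returns the old column names.
get_terms_old <- function(...) {
  data.frame(
    SRCVAR = "AEDECOD",
    TERMNAME = c("APPLICATION SITE ERYTHEMA", "APPLICATION SITE PRURITUS"),
    TERMID = c(10003041L, 10003053L),
    stringsAsFactors = FALSE
  )
}

# Rename TERMNAME -> TERMCHAR and TERMID -> TERMNUM where present.
rename_term_columns <- function(terms) {
  lookup <- c(TERMNAME = "TERMCHAR", TERMID = "TERMNUM")
  hits <- names(terms) %in% names(lookup)
  names(terms)[hits] <- unname(lookup[names(terms)[hits]])
  terms
}

# Adjusted function: same data, new column names.
get_terms <- function(...) rename_term_columns(get_terms_old(...))

terms <- get_terms()
```

Wrapping the old function keeps the change in one place; renaming the columns directly inside the original function works just as well.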

## Documentation

28 changes: 14 additions & 14 deletions R/create_query_data.R
@@ -31,9 +31,9 @@
#'
#' - `SRCVAR`: the variable to be used for defining a term of the basket,
#' e.g., `AEDECOD`
#' - `TERMNAME`: the name of the term if the variable `SRCVAR` is
#' - `TERMCHAR`: the name of the term if the variable `SRCVAR` is
#' referring to is character
#' - `TERMID` the numeric id of the term if the variable `SRCVAR` is
#' - `TERMNUM` the numeric id of the term if the variable `SRCVAR` is
#' referring to is numeric
#' - `GRPNAME`: the name of the basket. The values must be the same for
#' all observations.
@@ -56,7 +56,7 @@
#' @details
#'
#' For each `query()` object listed in the `queries` argument, the terms belonging
#' to the query (`SRCVAR`, `TERMNAME`, `TERMID`) are determined with respect
#' to the query (`SRCVAR`, `TERMCHAR`, `TERMNUM`) are determined with respect
#' to the `definition` field of the query: if the definition field of the
#' `query()` object is
#'
@@ -86,8 +86,8 @@
#' equals `FALSE` for all baskets or none of the queries is an basket , the variable
#' is not created.
#' * `SRCVAR`: Name of the variable used to identify the terms.
#' * `TERMNAME`: Value of the term variable if it is a character variable.
#' * `TERMID`: Value of the term variable if it is a numeric variable.
#' * `TERMCHAR`: Value of the term variable if it is a character variable.
#' * `TERMNUM`: Value of the term variable if it is a numeric variable.
#' * `VERSION`: Set to the value of the `version` argument. If it is not
#' specified, the variable is not created.
#'
@@ -111,7 +111,7 @@
#'
#' # creating a query dataset for a customized query
#' cqterms <- tribble(
#' ~TERMNAME, ~TERMID,
#' ~TERMCHAR, ~TERMNUM,
#' "APPLICATION SITE ERYTHEMA", 10003041L,
#' "APPLICATION SITE PRURITUS", 10003053L
#' ) %>%
@@ -468,17 +468,17 @@ assert_db_requirements <- function(version, version_arg_name, fun, fun_arg_name,
#' * An `basket_select()` object is specified to select a query from the SMQ
#' database.
#'
#' * A data frame with columns `SRCVAR` and `TERMNAME` or `TERMID` can
#' * A data frame with columns `SRCVAR` and `TERMCHAR` or `TERMNUM` can
#' be specified to define the terms of a customized query. The `SRCVAR`
#' should be set to the name of the variable which should be used to select
#' the terms, e.g., `"AEDECOD"` or `"AELLTCD"`. `SRCVAR` does not need
#' to be constant within a query. For example a query can be based on
#' `AEDECOD` and `AELLT`.
#'
#' If `SRCVAR` refers to a character variable, `TERMNAME` should be set
#' to the value the variable. If it refers to a numeric variable, `TERMID`
#' If `SRCVAR` refers to a character variable, `TERMCHAR` should be set
#' to the value the variable. If it refers to a numeric variable, `TERMNUM`
#' should be set to the value of the variable. If only character variables
#' or only numeric variables are used, `TERMID` or `TERMNAME` respectively
#' or only numeric variables are used, `TERMNUM` or `TERMCHAR` respectively
#' can be omitted.
#'
#' * A list of data frames and `basket_select()` objects can be specified to
@@ -529,7 +529,7 @@ assert_db_requirements <- function(version, version_arg_name, fun, fun_arg_name,
#'
#' # creating a query for a customized query
#' cqterms <- tribble(
#' ~TERMNAME, ~TERMID,
#' ~TERMCHAR, ~TERMNUM,
#' "APPLICATION SITE ERYTHEMA", 10003041L,
#' "APPLICATION SITE PRURITUS", 10003053L
#' ) %>%
@@ -713,7 +713,7 @@ validate_query <- function(obj) {
#' - `terms` is not a data frame,
#' - `terms` has zero observations,
#' - the `SRCVAR` variable is not in `terms`,
#' - neither the `TERMNAME` nor the `TERMID` variable is in `terms`,
#' - neither the `TERMCHAR` nor the `TERMNUM` variable is in `terms`,
#' - `expect_grpname == TRUE` and the `GRPNAME` variable is not in `terms`,
#' - `expect_grpid == TRUE` and the `GRPID` variable is not in `terms`,
#'
@@ -784,10 +784,10 @@ assert_terms <- function(terms,
)
}
}
if (!"TERMNAME" %in% vars && !"TERMID" %in% vars) {
if (!"TERMCHAR" %in% vars && !"TERMNUM" %in% vars) {
abort(
paste0(
"Variable `TERMNAME` or `TERMID` is required.\n",
"Variable `TERMCHAR` or `TERMNUM` is required.\n",
"None of them is in ",
source_text,
".\n",
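
The renamed check in the hunk above can be sketched in isolation as follows. This is a simplified base-R illustration, not the package's `assert_terms()`; the function name `check_term_columns()` is hypothetical and `stop()` stands in for `abort()`.

```r
# Hypothetical stand-alone version of the column check:
# at least one of TERMCHAR / TERMNUM must be present.
check_term_columns <- function(terms) {
  vars <- names(terms)
  if (!"TERMCHAR" %in% vars && !"TERMNUM" %in% vars) {
    stop("Variable `TERMCHAR` or `TERMNUM` is required.", call. = FALSE)
  }
  invisible(terms)
}

new_style <- data.frame(SRCVAR = "AEDECOD", TERMCHAR = "RASH")
old_style <- data.frame(SRCVAR = "AEDECOD", TERMNAME = "RASH")

check_term_columns(new_style) # passes silently
old_result <- tryCatch(
  check_term_columns(old_style),
  error = function(e) conditionMessage(e)
)
```

Here `old_result` holds the error message, which shows that a data frame still using `TERMNAME` is rejected after the rename.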
30 changes: 15 additions & 15 deletions R/derive_vars_query.R
@@ -18,9 +18,9 @@
#' "SCN" variable will be created.
#'
#' For each record in `dataset`, the "NAM" variable takes the value of
#' `GRPNAME` if the value of `TERMNAME` or `TERMID` in `dataset_queries` matches
#' `GRPNAME` if the value of `TERMCHAR` or `TERMNUM` in `dataset_queries` matches
#' the value of the respective SRCVAR in `dataset`.
#' Note that `TERMNAME` in `dataset_queries` dataset may be NA only when `TERMID`
#' Note that `TERMCHAR` in `dataset_queries` dataset may be NA only when `TERMNUM`
#' is non-NA and vice versa.
#' The "CD", "SC", and "SCN" variables are derived accordingly based on
#' `GRPID`, `SCOPE`, and `SCOPEN` respectively,
@@ -29,7 +29,7 @@
#' @param dataset `r roxygen_param_dataset()`
#'
#' @param dataset_queries A dataset containing required columns `PREFIX`,
#' `GRPNAME`, `SRCVAR`, `TERMNAME`, `TERMID`, and optional columns
#' `GRPNAME`, `SRCVAR`, `TERMCHAR`, `TERMNUM`, and optional columns
#' `GRPID`, `SCOPE`, `SCOPEN`.
#'
#' The content of the dataset will be verified by [assert_valid_queries()].
@@ -111,7 +111,7 @@ derive_vars_query <- function(dataset, dataset_queries) {
# queries restructured
queries_wide <- dataset_queries %>%
mutate(
TERMNAME = toupper(TERMNAME),
TERMCHAR = toupper(TERMCHAR),
PREFIX_NAM = paste0(PREFIX, "NAM")
) %>%
pivot_wider(names_from = PREFIX_NAM, values_from = GRPNAME) %>%
@@ -123,12 +123,12 @@ derive_vars_query <- function(dataset, dataset_queries) {
pivot_wider(names_from = PREFIX_SCN, values_from = SCOPEN) %>%
select(-PREFIX) %>%
# determine join column based on type of SRCVAR
# numeric -> TERMID, character -> TERMNAME, otherwise -> error
# numeric -> TERMNUM, character -> TERMCHAR, otherwise -> error
mutate(
tmp_col_type = vapply(dataset[SRCVAR], typeof, character(1)),
TERM_NAME_ID = case_when(
tmp_col_type == "character" ~ TERMNAME,
tmp_col_type %in% c("double", "integer") ~ as.character(TERMID),
tmp_col_type == "character" ~ TERMCHAR,
tmp_col_type %in% c("double", "integer") ~ as.character(TERMNUM),
TRUE ~ NA_character_
)
)
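
The join-column logic in this hunk can be illustrated with a small base-R sketch. The example data are made up, and nested `ifelse()` calls stand in for `case_when()`; the real function also uppercases `TERMCHAR` in an earlier step, which is folded into the same expression here.

```r
# Illustrative data: one character and one integer source variable.
dataset <- data.frame(
  AEDECOD = c("Rash", "Headache"),
  AELLTCD = c(10003041L, 10019211L),
  stringsAsFactors = FALSE
)
queries <- data.frame(
  SRCVAR = c("AEDECOD", "AELLTCD"),
  TERMCHAR = c("rash", NA_character_),
  TERMNUM = c(NA_integer_, 10003041L),
  stringsAsFactors = FALSE
)

# Type of the column that each SRCVAR refers to in `dataset`.
col_type <- vapply(dataset[queries$SRCVAR], typeof, character(1))

# character -> TERMCHAR, numeric -> TERMNUM (as text), otherwise NA.
queries$TERM_NAME_ID <- unname(ifelse(
  col_type == "character",
  toupper(queries$TERMCHAR),
  ifelse(
    col_type %in% c("double", "integer"),
    as.character(queries$TERMNUM),
    NA_character_
  )
))
```

Converting `TERMNUM` to character lets both kinds of term share a single join key.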
@@ -196,10 +196,10 @@ derive_vars_query <- function(dataset, dataset_queries) {
#' - `SCOPE`, 'BROAD', 'NARROW', or NA
#' - `SCOPEN`, 1, 2, or NA
#' - `SRCVAR`, e.g., `"AEDECOD"`, `"AELLT"`, `"AELLTCD"`, ...
#' - `TERMNAME`, character, could be NA only at those observations
#' where `TERMID` is non-NA
#' - `TERMID`, integer, could be NA only at those observations
#' where `TERMNAME` is non-NA
#' - `TERMCHAR`, character, could be NA only at those observations
#' where `TERMNUM` is non-NA
#' - `TERMNUM`, integer, could be NA only at those observations
#' where `TERMCHAR` is non-NA
#'
#' @param queries A data.frame.
#'
@@ -220,7 +220,7 @@ assert_valid_queries <- function(queries, queries_name) {
# check required columns
assert_data_frame(
queries,
required_vars = exprs(PREFIX, GRPNAME, SRCVAR, TERMNAME, TERMID)
required_vars = exprs(PREFIX, GRPNAME, SRCVAR, TERMCHAR, TERMNUM)
)

# check duplicate rows
@@ -294,10 +294,10 @@ assert_valid_queries <- function(queries, queries_name) {
}

# check illegal term name
if (any(is.na(queries$TERMNAME) & is.na(queries$TERMID)) ||
any(queries$TERMNAME == "" & is.na(queries$TERMID))) {
if (any(is.na(queries$TERMCHAR) & is.na(queries$TERMNUM)) ||
any(queries$TERMCHAR == "" & is.na(queries$TERMNUM))) {
abort(paste0(
"Either `TERMNAME` or `TERMID` need to be specified",
"Either `TERMCHAR` or `TERMNUM` need to be specified",
" in `", queries_name, "`. ",
"They both cannot be NA or empty."
))
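
The rule enforced by this check, that every row needs a usable `TERMCHAR` or `TERMNUM`, can be sketched on its own. The function name `has_missing_term()` is hypothetical, and the condition is written in a slightly different but, under the assumption that `TERMCHAR` is character and `TERMNUM` is numeric, equivalent form.

```r
# Hypothetical helper: TRUE if any row has neither a usable TERMCHAR
# (non-NA, non-empty) nor a non-NA TERMNUM.
has_missing_term <- function(queries) {
  no_char <- is.na(queries$TERMCHAR) | queries$TERMCHAR == ""
  no_num <- is.na(queries$TERMNUM)
  any(no_char & no_num)
}

valid <- data.frame(
  TERMCHAR = c("APPLICATION SITE ERYTHEMA", NA_character_),
  TERMNUM = c(NA_integer_, 10003053L),
  stringsAsFactors = FALSE
)
invalid <- data.frame(
  TERMCHAR = c("", NA_character_),
  TERMNUM = c(NA_integer_, NA_integer_),
  stringsAsFactors = FALSE
)

valid_flag <- has_missing_term(valid)
invalid_flag <- has_missing_term(invalid)
```

The first data frame passes because each row supplies one of the two values; the second is flagged because both rows supply neither.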
4 changes: 2 additions & 2 deletions R/globals.R
@@ -122,8 +122,8 @@ globalVariables(c(
"VAR_CHECK",
"TERM",
"SRCVAR",
"TERMID",
"TERMNAME",
"TERMNUM",
"TERMCHAR",
"TERM_NAME_ID",
"TERM_UPPER",
"atoxgr_criteria_ctcv4",
Binary file modified data/queries.rda
Binary file not shown.
Binary file modified data/queries_mh.rda
Binary file not shown.
4 changes: 2 additions & 2 deletions inst/WORDLIST
@@ -221,8 +221,8 @@ TADJ
TADJAE
TDOSE
TDURD
TERMID
TERMNAME
TERMNUM
TERMCHAR
TID
TLFs
TMF
4 changes: 2 additions & 2 deletions inst/example_scripts/example_query_source.R
@@ -3,11 +3,11 @@
# GRPID, could be NULL
# SCOPE, ‘BROAD’, ‘NARROW’, or NULL
# SRCVAR, e.g., AEDECOD, AELLT, ...
# TERMNAME, non NULL
# TERMCHAR, non NULL

queries <- tibble::tribble(
~PREFIX, ~GRPNAME, ~GRPID, ~SCOPE,
~SCOPEN, ~SRCVAR, ~TERMNAME, ~TERMID,
~SCOPEN, ~SRCVAR, ~TERMCHAR, ~TERMNUM,
"CQ01", "Dermatologic events", NA_integer_, NA_character_,
NA_integer_, "AELLT", "APPLICATION SITE ERYTHEMA", NA_integer_,
"CQ01", "Dermatologic events", NA_integer_, NA_character_,
2 changes: 1 addition & 1 deletion man/assert_terms.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions man/assert_valid_queries.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 6 additions & 6 deletions man/create_query_data.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/derive_vars_query.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 5 additions & 5 deletions man/query.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
