Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor for admiral's use within GSK's ADaM metadata driven system #2441

Closed
bms63 opened this issue May 23, 2024 Discussed in #2349 · 0 comments · Fixed by #2425
Closed

Refactor for admiral's use within GSK's ADaM metadata driven system #2441

bms63 opened this issue May 23, 2024 Discussed in #2349 · 0 comments · Fixed by #2425

Comments

@bms63
Copy link
Collaborator

bms63 commented May 23, 2024

Discussed in #2349

Originally posted by bms63 February 23, 2024
Why some existing admiral functions are not a good fit to be used in Executor
Thursday, February 22, 2024
1:33 PM
Executor is designed to control the inputs and outputs of each derivation, so that when selecting multiple derivations from the library the system can recognize the order of their execution automatically.

Information about inputs and outputs of each derivation is collected as metadata from the user and in order to ensure consistency between what is specified in the metadata and what is actually being used in the derivation the frame work performs the initial filtering and joining of the source data as well as joining the result on to output dataset itself, taking this actions off the user's table.

This makes functions like the following:
image

and other functions utilizing derive_vars_merged function under the hood not useful with Executor.

Proposal:
For each function using derive_vars_merged abstract the part before the derive_vars_merged call into a separate function and call is from the main function.
e.g.

derive_var_merged_exist_flag <- function(dataset,
                                         dataset_add,
                                         by_vars,
                                         new_var,
                                         condition,
                                         true_value = "Y",
                                         false_value = NA_character_,
                                         missing_value = NA_character_,
                                         filter_add = NULL) {
  new_var <- assert_symbol(enexpr(new_var))
  condition <- assert_filter_cond(enexpr(condition))
  filter_add <-
    assert_filter_cond(enexpr(filter_add), optional = TRUE)
 
  add_data <- filter_if(dataset_add, filter_add) %>%
    mutate(!!new_var := if_else(!!condition, 1, 0, 0))
 
  derive_vars_merged(
    dataset,
    dataset_add = add_data,
    by_vars = by_vars,
    new_vars = exprs(!!new_var),
    order = exprs(!!new_var),
    check_type = "none",
    mode = "last"
  ) %>%
    mutate(!!new_var := if_else(!!new_var == 1, true_value, false_value, missing_value))
}

Turned into:

exist_flag <- function(df,
                                         new_var,
                                         condition,
                                         true_value = "Y",
                                         false_value = NA_character_,
                                         missing_value = NA_character_,
                                         filter_add = NULL
                                         ) {

  new_var <- assert_symbol(enexpr(new_var))
  condition <- assert_filter_cond(enexpr(condition))  
  filter_add <-
    assert_filter_cond(enexpr(filter_add), optional = TRUE)
 
  add_data <- filter_if(dataset_add, filter_add) %>%
    mutate(!!new_var := if_else(!!condition, 1, 0, 0))
}
 
derive_var_merged_exist_flag <- function(dataset,
                                         dataset_add,
                                         by_vars,
                                         new_var,
                                         condition,
                                         true_value = "Y",
                                         false_value = NA_character_,
                                         missing_value = NA_character_,
                                         filter_add = NULL) {

  add_data <- exist_flag(df, new_var, condition, true_value, false_value, missing_value, filter_add)
 
  derive_vars_merged(
    dataset,
    dataset_add = add_data,
    by_vars = by_vars,
    new_vars = exprs(!!new_var),
    order = exprs(!!new_var),
    check_type = "none",
    mode = "last"
  ) %>%
    mutate(!!new_var := if_else(!!new_var == 1, true_value, false_value, missing_value))
}

This will allow users, who are not keen on directly using merging onto output dataset make use of admiral functionality without maintaining a separate fork of the repo.

The proposal is that:
• Admiral or GSK Mercury team implements the proposed change and raises as PR in admiral repository
• Admiral team prioritizes the review of the PR and release to make sure Mercury can meet our timelines (release end of Q2 latest)
• Implementing abstracting functions like exist_flag becomes standard practice in admiral

Alexey

@bms63 bms63 linked a pull request May 23, 2024 that will close this issue
15 tasks
bms63 added a commit that referenced this issue May 28, 2024
* added derive_var_exist_flag

* updated documentation for potential PR !!! OTHER THAN EXAMPLES !!!

* restructured per design patterns

* for derive_vars_query (added derive_vars_for_query)

* minor documentation edits

* added new files to match admiral architecture

* troubleshooting tests

* staging for container

* stash for restart

* ensuring tests are uploaded

* fixed spacing

* working as is

* fixed tests

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* added tests back in

* updated the news

* styled

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Delete .R/admiraldev/DESCRIPTION

removing unncessary file

* Delete .R/admiraldev/INDEX

removing unnecessary file

* Delete .R/admiraldev/NAMESPACE

removing unnecessary changes

* Delete .R/admiraldev/NEWS.md

removing unnecessary changes

* Delete .R/admiraldev/R/admiraldev

removed unnecessary changes

* Delete .R/admiraldev/WORDLIST

removed unnecessary changes

* removed unneeded files

* reset devcontainer.json password

* Reset devcontainer.json username

* reset .gitignore

* Update .gitignore whitespace

* added changes

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Fixed testing errors

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Delete tests/testthat/_snaps/derive_vars_query.new.md

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Updated News to include derive_vars_query

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* staging changes

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Updating .Rd file after changes to docs

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Fix spelling error

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Updated as per comments

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* update docs

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Update R/get_flagged_records.R

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* removed .Rprofile

* Updating docs

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* results of devtools

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* reset .gitignore

* reset .gitignore

* Added tryCatch - updated per alexey's suggestions

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Added test for filter condition

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Added NA_character_ tests

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Update NEWS.md

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* Update R/derive_vars_query.R

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* Update R/get_flagged_records.R

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* Update R/get_flagged_records.R

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* Update R/get_flagged_records.R

Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>

* Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* Update final touches, add myself to README

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

* added my name to the WORDLIST

Pushed from Domino: https://domino-uat.gsk.com/workspace/im631011/admiral?executionId=66339539d022324fe623e848

---------

Co-authored-by: Iain McCay <iain.x.mccay@gsk.com>
Co-authored-by: Ben Straub <ben.x.straub@gsk.com>
Co-authored-by: Stefan Bundfuss <80953585+bundfussr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

1 participant