Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Credential Management to Enable Multi-instance Operations #41

Open
ChemiKyle opened this issue Mar 31, 2022 · 0 comments · Fixed by #43
Open

Support Credential Management to Enable Multi-instance Operations #41

ChemiKyle opened this issue Mar 31, 2022 · 0 comments · Fixed by #43

Comments

@ChemiKyle
Copy link
Contributor

ChemiKyle commented Mar 31, 2022

Note: this issue is describing two features, credential management is required for multi-instance operations. I have already written most of the former, the latter I have a handful of scripts to do but is much more open ended in general.

The ability to shuttle data between different projects (possibly on different servers) is cumbersome with the .env model, requiring supporting multiple variables for different project tokens, and different .env files for multiple REDCap instances.

We would benefit from the ability to support the use of arbitrary credentials to be passed through to REDCapR's API actions. REDCapR has something to do this but it's not an out of the box solution. The idea of having a single object that represents credentials to a project on a specific host is useful however.

Credential storage model

I propose we store credentials in a database table with the following columns, they are a superset of those contained in REDCapR's implementation:

  • redcap_uri
    api url, e.g. "localhost:11051/api"

  • server_short_name
    Memorable name for a server for easy lookup, e.g. "local 1051"

  • username
    e.g. "admin"

  • project_id

  • project_display_name
    app_title from redcap_projects table, the name displayed to users in REDCap UI

  • project_short_name
    Memorable name for easier dev, perhaps in another table that maps project_name to project_short_name; it's a lot easier to just search for the display name and select the right project_id, we don't want anything in prod using anything other than project_id. too bad we need to be able to generically refer to projects across servers!

  • token
    32 chars, 64 chars for Super Token

  • comment
    Any additional commentary

Functions which use these credential objects should only need redcap_uri and token, and should be compatible with REDCapR's credential objects.

Proposed schema, column types copied from REDCap tables where appropriate:

CREATE TABLE IF NOT EXISTS `credentials` (
  `redcap_uri` TEXT NOT NULL,
  `server_short_name` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
  `username` varchar(191) NOT NULL,
  `project_id` int(10) NOT NULL,
  `project_display_name` TEXT NOT NULL,
  `project_short_name` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `token` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL,
  `comment` varchar(256) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Separation of credentials and production consideration

Production scripts would read from a central db with tokens for a special API user.

Individual machines would use a local (likely SQLite) db for their own credentials. Regarding the latter, I have experimental functions that scrape a REDCap instance for all of a specified user's existing tokens, and another that creates tokens on a server for a specified list of project IDs.

Easier testing

Combined with functions that create projects and API tokens, this should also let us recreate production servers easily on freshly built local containers and run (automated!) tests.

Caveat

Currently I'm not privvy to API methods to gather External Module settings, I am able to do this with separate conn objects, but this requires specialized or multiple .env files in my current implementation. I also might need access to the server's filesystem to read module config files (i.e. to detect if a field is an event and requires migration to fit the target project).


Examples

Syncing data dictionary from a prod project to a training one

Here is an example of keeping project metadata (data dictionary) synced between 2 projects:

# simplified, minimal sync that uses credential objects
sync_metadata_with_credentials <- function(source_credentials, target_credentials) {
  source_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = source_credentials$redcap_uri,
    token = source_credentials$token
  )
  target_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = target_credentials$redcap_uri,
    token = target_credentials$token
  )
  if (!identical(target_metadata$data, source_metadata$data)) {
    REDCapR::redcap_metadata_write(
      ds = source_metadata,
      redcap_uri = target_credentials$redcap_uri,
      token = target_credentials$token
    )
  } else {
    print("metadata already up to date")
  }
}

# use dplyr to locate proper credential sets
source_credentials <- tbl(credential_db_conn, "credentials") %>%
  filter(username == "etl_user") %>%
  filter(project_short_name = "main study") %>%
  filter(server_short_name = "prod") %>%
  collect() %>%
  unnest()

# examine credential object
t(source_credentials)
#                      [,1]                              
# project_id           "15"                              
# username             "etl_user"                           
# token                "MELLON"
# project_display_name "Main Study Long Name in REDCap UI"                    
# project_short_name   "main study"                    
# redcap_uri           "http://redcap-production-server.university.edu/api/"     
# server_short_name    "prod"         
# comment              ""
             

target_credentials <- tbl(credential_db_conn, "credentials") %>%
  filter(username == "admin") %>%
  filter(project_short_name = "main study") %>%
  filter(server_short_name = "training") %>%
  collect() %>%
  unnest()

sync_metadata_with_credentials(source_credentials, target_credentials)

For reference, here is an implementation that requires a special .env object with prefixes for SOURCE and TARGET information:

# a special env 
sync_metadata_with_env <- function(project) {
  PROJECT <- toupper(project)
  # Data dictionary
  source_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = Sys.getenv("SOURCE_URI"),
    token = Sys.getenv(paste0("SOURCE_", PROJECT, "_TOKEN"))
  )
  target_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = Sys.getenv("TARGET_URI"),
    token = Sys.getenv(paste0("TARGET_", PROJECT, "_TOKEN"))
  )
  if (!identical(target_metadata$data, source_metadata$data)) {
    REDCapR::redcap_metadata_write(
      ds = source_metadata,
      redcap_uri = Sys.getenv("TARGET_URI"),
      token = Sys.getenv(paste0("TARGET_", PROJECT, "_TOKEN"))
    )
  } else {
    print(paste(project, "is already up to date"))
  }
}
@ChemiKyle ChemiKyle linked a pull request Apr 12, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant