Support Credential Management to Enable Multi-instance Operations #41

ChemiKyle · 2022-03-31T15:55:05Z

Note: this issue is describing two features, credential management is required for multi-instance operations. I have already written most of the former, the latter I have a handful of scripts to do but is much more open ended in general.

The ability to shuttle data between different projects (possibly on different servers) is cumbersome with the .env model, requiring supporting multiple variables for different project tokens, and different .env files for multiple REDCap instances.

We would benefit from the ability to support the use of arbitrary credentials to be passed through to REDCapR's API actions. REDCapR has something to do this but it's not an out of the box solution. The idea of having a single object that represents credentials to a project on a specific host is useful however.

Credential storage model

I propose we store credentials in a database table with the following columns, they are a superset of those contained in REDCapR's implementation:

redcap_uri
api url, e.g. "localhost:11051/api"
server_short_name
Memorable name for a server for easy lookup, e.g. "local 1051"
username
e.g. "admin"
project_id
project_display_name
app_title from redcap_projects table, the name displayed to users in REDCap UI
project_short_name
Memorable name for easier dev, perhaps in another table that maps project_name to project_short_name; ~~it's a lot easier to just search for the display name and select the right project_id, we don't want anything in prod using anything other than project_id.~~ too bad we need to be able to generically refer to projects across servers!
token
32 chars, 64 chars for Super Token
comment
Any additional commentary

Functions which use these credential objects should only need redcap_uri and token, and should be compatible with REDCapR's credential objects.

Proposed schema, column types copied from REDCap tables where appropriate:

CREATE TABLE IF NOT EXISTS `credentials` (
  `redcap_uri` TEXT NOT NULL,
  `server_short_name` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
  `username` varchar(191) NOT NULL,
  `project_id` int(10) NOT NULL,
  `project_display_name` TEXT NOT NULL,
  `project_short_name` varchar(128) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `token` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL,
  `comment` varchar(256) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Separation of credentials and production consideration

Production scripts would read from a central db with tokens for a special API user.

Individual machines would use a local (likely SQLite) db for their own credentials. Regarding the latter, I have experimental functions that scrape a REDCap instance for all of a specified user's existing tokens, and another that creates tokens on a server for a specified list of project IDs.

Easier testing

Combined with functions that create projects and API tokens, this should also let us recreate production servers easily on freshly built local containers and run (automated!) tests.

Caveat

Currently I'm not privvy to API methods to gather External Module settings, I am able to do this with separate conn objects, but this requires specialized or multiple .env files in my current implementation. I also might need access to the server's filesystem to read module config files (i.e. to detect if a field is an event and requires migration to fit the target project).

Examples

Syncing data dictionary from a prod project to a training one

Here is an example of keeping project metadata (data dictionary) synced between 2 projects:

# simplified, minimal sync that uses credential objects
sync_metadata_with_credentials <- function(source_credentials, target_credentials) {
  source_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = source_credentials$redcap_uri,
    token = source_credentials$token
  )
  target_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = target_credentials$redcap_uri,
    token = target_credentials$token
  )
  if (!identical(target_metadata$data, source_metadata$data)) {
    REDCapR::redcap_metadata_write(
      ds = source_metadata,
      redcap_uri = target_credentials$redcap_uri,
      token = target_credentials$token
    )
  } else {
    print("metadata already up to date")
  }
}

# use dplyr to locate proper credential sets
source_credentials <- tbl(credential_db_conn, "credentials") %>%
  filter(username == "etl_user") %>%
  filter(project_short_name = "main study") %>%
  filter(server_short_name = "prod") %>%
  collect() %>%
  unnest()

# examine credential object
t(source_credentials)
#                      [,1]                              
# project_id           "15"                              
# username             "etl_user"                           
# token                "MELLON"
# project_display_name "Main Study Long Name in REDCap UI"                    
# project_short_name   "main study"                    
# redcap_uri           "http://redcap-production-server.university.edu/api/"     
# server_short_name    "prod"         
# comment              ""
             

target_credentials <- tbl(credential_db_conn, "credentials") %>%
  filter(username == "admin") %>%
  filter(project_short_name = "main study") %>%
  filter(server_short_name = "training") %>%
  collect() %>%
  unnest()

sync_metadata_with_credentials(source_credentials, target_credentials)

For reference, here is an implementation that requires a special .env object with prefixes for SOURCE and TARGET information:

# a special env 
sync_metadata_with_env <- function(project) {
  PROJECT <- toupper(project)
  # Data dictionary
  source_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = Sys.getenv("SOURCE_URI"),
    token = Sys.getenv(paste0("SOURCE_", PROJECT, "_TOKEN"))
  )
  target_metadata <- REDCapR::redcap_metadata_read(
    redcap_uri = Sys.getenv("TARGET_URI"),
    token = Sys.getenv(paste0("TARGET_", PROJECT, "_TOKEN"))
  )
  if (!identical(target_metadata$data, source_metadata$data)) {
    REDCapR::redcap_metadata_write(
      ds = source_metadata,
      redcap_uri = Sys.getenv("TARGET_URI"),
      token = Sys.getenv(paste0("TARGET_", PROJECT, "_TOKEN"))
    )
  } else {
    print(paste(project, "is already up to date"))
  }
}

The text was updated successfully, but these errors were encountered:

ChemiKyle mentioned this issue Apr 12, 2022

Add credential management #43

Merged

ChemiKyle linked a pull request Apr 12, 2022 that will close this issue

Add credential management #43

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support Credential Management to Enable Multi-instance Operations #41

Support Credential Management to Enable Multi-instance Operations #41

ChemiKyle commented Mar 31, 2022 •

edited

Loading

Support Credential Management to Enable Multi-instance Operations #41

Support Credential Management to Enable Multi-instance Operations #41

Comments

ChemiKyle commented Mar 31, 2022 • edited Loading

Credential storage model

Separation of credentials and production consideration

Easier testing

Caveat

Examples

Syncing data dictionary from a prod project to a training one

ChemiKyle commented Mar 31, 2022 •

edited

Loading