Add OHDSI workspace service (#3562)
* add ohdsi workspace service (before adjustments to the OSS)

* add execute permission to scripts

* Fix Postgres timeouts when install OHDSI (#3559)

[ohdsi] fix postgres timeouts

* Add OHDSI workspace service (#3552)

* remove todos

* link core vnet to postgres private dns zone when deploying core

* remove synapse references, and add other data sources to the options

* remove postgres_core_dns_link references

* revert synapse reference deletions

* remove non supported dialects

* add execute permission to scripts

* make some of the daimons required

* add required zone field to postgres:
hashicorp/terraform-provider-azurerm#16888

* add fw rule to allow open id authentication in atlas

* fix firewall step

* add README

* update README

* fix linting errors

* fix linting errors

* update changelog

* Update CHANGELOG.md

Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>

* add ohdsi ws service to the CI

* clarified  README

* added name, description and overview to the template_schema

* move README content to docs

* change default display name

* add diagram to and instructions about setting up the CDM data source

* add link to ohdsi-on-azure

* move ohdsi-on-azure to the top

* link to OHDSIonAzure for deploying synapse

* Update docs/tre-templates/workspace-services/ohdsi.md

Co-authored-by: Marcus Robinson <marrobi@microsoft.com>

* add Using a sample CDM data source section

---------

Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>
Co-authored-by: Marcus Robinson <marrobi@microsoft.com>
3 people authored Jun 26, 2023
1 parent 0ed36d0 commit 8c82fd7
Showing 36 changed files with 2,009 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/deploy_tre_reusable.yml
@@ -392,6 +392,8 @@ jobs:
BUNDLE_DIR: "./templates/workspace_services/health-services"}
- {BUNDLE_TYPE: "workspace_service",
BUNDLE_DIR: "./templates/workspace_services/databricks"}
- {BUNDLE_TYPE: "workspace_service",
BUNDLE_DIR: "./templates/workspace_services/ohdsi"}
- {BUNDLE_TYPE: "user_resource",
BUNDLE_DIR: "./templates/workspace_services/guacamole/user_resources/guacamole-azure-windowsvm"}
- {BUNDLE_TYPE: "user_resource",
@@ -549,6 +551,8 @@ jobs:
BUNDLE_DIR: "./templates/workspace_services/health-services"}
- {BUNDLE_TYPE: "workspace_service",
BUNDLE_DIR: "./templates/workspace_services/databricks"}
- {BUNDLE_TYPE: "workspace_service",
BUNDLE_DIR: "./templates/workspace_services/ohdsi"}

environment: ${{ inputs.environmentName }}
steps:
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ FEATURES:
ENHANCEMENTS:
* Workspace networking peering sync is handled natively by Terraform ([#3534](https://github.com/microsoft/AzureTRE/issues/3534))
* Use SMTP built in connector vs API connector in Airlock Notifier ([#3572](https://github.com/microsoft/AzureTRE/issues/3572))
* Added OHDSI workspace service ([#3562](https://github.com/microsoft/AzureTRE/issues/3562))

BUG FIXES:
* Nexus might fail to deploy due to wrong identity used in key-vault extension ([#3492](https://github.com/microsoft/AzureTRE/issues/3492))
38 changes: 38 additions & 0 deletions docs/tre-templates/workspace-services/ohdsi.md
@@ -0,0 +1,38 @@
# OHDSI Workspace Service

!!! warning
- This workspace service does not work "out of the box". It requires additional networking configuration to work properly. See the [networking configuration](#networking-configuration) section for more details.
- Currently the only CDM data source supported by the workspace service is Azure Synapse.

See the [official OHDSI website](https://www.ohdsi.org/) and [The Book of OHDSI](https://ohdsi.github.io/TheBookOfOhdsi/).

This service installs the following resources into an existing virtual network within the workspace:
![OHDSI ATLAS Workspace Service](images/ohdsi_service.png)

## Networking configuration
Deploying the OHDSI workspace service is not enough for it to function properly. The following networking configuration must also be in place:

### 1. The resource processor should be able to access the CDM data source
Multiple OHDSI workspace services cannot share the same RESULTS and TEMP schemas because each OHDSI instance modifies these schemas, which could cause conflicts.
To avoid this, every workspace service must work on its own schemas. This is achieved with golden copying.
This means that the "main" schemas remain untouched, and every workspace service has its own copy of the RESULTS and TEMP schemas in the CDM data source, which it can modify.

Since the resource processor is in charge of duplicating the schemas, the CDM data source has to be accessible from the resource processor's VNet in order to be able to create them.
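
The golden-copy idea can be illustrated with a minimal sketch. The naming scheme below (main schema name plus a suffix derived from the workspace service ID) is an assumption made for illustration only; the template's actual schema names and copy logic are not shown here, and `per_service_schemas` is a hypothetical helper.

```python
import re

# Assumption: only RESULTS and TEMP are copied per service, as described above.
MAIN_SCHEMAS = ("RESULTS", "TEMP")


def per_service_schemas(service_id: str) -> dict[str, str]:
    """Map each main schema to a copy name that is unique to one service.

    Hypothetical naming scheme, not the one used by the template.
    """
    # Keep only characters that are safe in an unquoted SQL identifier.
    suffix = re.sub(r"[^A-Za-z0-9]", "_", service_id)
    if not suffix:
        raise ValueError("service_id must contain at least one character")
    return {schema: f"{schema}_{suffix}" for schema in MAIN_SCHEMAS}
```

Because the suffix differs per service, two services never write to the same copy, and the "main" schemas are only ever read.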

### 2. The workspace should be able to access the CDM data source
To access the CDM from ATLAS, the CDM data source should be accessible from the workspace's VNet.
Since the CDM data source is outside of TRE, this is not part of the template. There are many ways to do this;
one example is to deploy a private endpoint for the CDM data source in the workspace's VNet as part of a custom workspace template.
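
One way to sanity-check the networking from a machine inside the relevant VNet is a plain TCP probe. The sketch below is an assumption-laden illustration and not part of the template: `can_reach` is a hypothetical helper, and the host and port you pass are your own (Synapse SQL endpoints normally listen on the standard SQL Server port 1433).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("<your-cdm-host>", 1433)` returning `False` from the resource processor's VNet or from the workspace's VNet suggests the corresponding requirement above is not yet met. A successful probe only shows that the port is reachable; it does not verify credentials or schema permissions.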

## Setting up a CDM data source
Currently the only CDM data source supported by the workspace service is Azure Synapse.

If you already have an OMOP CDM data source, then all you have to do is to configure the network as described in the [networking configuration](#networking-configuration) section.

If your data is in a different format, you can read [here](https://ohdsi.github.io/TheBookOfOhdsi/ExtractTransformLoad.html) how to set up the ETL process to convert your medical data to OMOP format.

## Using a sample CDM data source
If you don't have any data yet, or if you just want a quick start, you can deploy an Azure Synapse CDM data source with sample data using the [OHDSI on Azure](https://github.com/microsoft/OHDSIonAzure) repository.
When deploying, set `OMOP CDM Database Type` to `Synapse Dedicated Pool` as per the [deployment guide](https://github.com/microsoft/OHDSIonAzure/blob/main/docs/DeploymentGuide.md#:~:text=OMOP%20CDM%20Database%20Type).

Note that you will need to provision a private endpoint into the Azure TRE workspace that connects to the SQL Dedicated Pool as described in the [networking configuration](#networking-configuration) section.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -101,6 +101,7 @@ nav:
- MLFlow: tre-templates/workspace-services/mlflow.md
- Health Services: tre-templates/workspace-services/health_services.md
- Azure Databricks: tre-templates/workspace-services/databricks.md
- OHDSI: tre-templates/workspace-services/ohdsi.md
- Shared Services:
- Gitea (Source Mirror): tre-templates/shared-services/gitea.md
- Nexus (Package Mirror): tre-templates/shared-services/nexus.md
7 changes: 7 additions & 0 deletions templates/workspace_services/ohdsi/.dockerignore
@@ -0,0 +1,7 @@
# Local .terraform directories
**/.terraform/*

# TF backend files
**/*_backend.tf

Dockerfile.tmpl
5 changes: 5 additions & 0 deletions templates/workspace_services/ohdsi/.env.sample
@@ -0,0 +1,5 @@
ID="__CHANGE_ME__"
WORKSPACE_ID="__CHANGE_ME__"
TRE_ID="__CHANGE_ME__"
MGMT_RESOURCE_GROUP_NAME="__CHANGE_ME__"
MGMT_ACR_NAME="__CHANGE_ME__"
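
Every value in the sample file is the `__CHANGE_ME__` placeholder, so a copied `.env` can be checked for values that were never filled in. The following is a minimal sketch, assuming a simple `KEY="value"` file layout like the one above; `unset_keys` is a hypothetical helper and not part of this commit.

```python
PLACEHOLDER = "__CHANGE_ME__"


def unset_keys(env_text: str) -> list[str]:
    """Return the keys whose value is still the placeholder."""
    missing = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value.strip().strip('"').strip("'") == PLACEHOLDER:
            missing.append(key.strip())
    return missing
```

Calling `unset_keys(open(".env").read())` before building or publishing the bundle would list any variables that still need real values.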
29 changes: 29 additions & 0 deletions templates/workspace_services/ohdsi/Dockerfile.tmpl
@@ -0,0 +1,29 @@
# syntax=docker/dockerfile-upstream:1.4.0
FROM debian:bullseye-slim

# PORTER_INIT

RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# sqlcmd is required for schemas initialization in AzureSynapse
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
# ignore lint rule that requires `--no-install-recommends` to allow the microsoft packages to get everything they need and clear it all up in the end
# hadolint ignore=DL3015
RUN apt-get update && apt-get install -y curl gnupg && \
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
echo 'deb https://packages.microsoft.com/debian/11/prod bullseye main'> /etc/apt/sources.list.d/prod.list && \
apt-get update && apt-get -y install sqlcmd --no-install-recommends && \
apt-get clean && rm -rf /var/lib/apt/lists/*

# Git is required for terraform_azurerm_environment_configuration
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
apt-get update && apt-get install -y git --no-install-recommends

# PostgreSql is required by Atlas
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
apt-get update && apt-get install -y postgresql-client gettext apache2-utils curl jq --no-install-recommends

# PORTER_MIXINS

# Use the BUNDLE_DIR build argument to copy files into the bundle
COPY --link . ${BUNDLE_DIR}/
7 changes: 7 additions & 0 deletions templates/workspace_services/ohdsi/README.md
@@ -0,0 +1,7 @@
# OHDSI Workspace Service

## IMPORTANT
- This workspace service does not work "out of the box". It requires additional networking configuration to work properly.
- Currently the only CDM data source supported by the workspace service is Azure Synapse.

Further details are provided in the [documentation](https://microsoft.github.io/AzureTRE/latest/tre-templates/workspace-services/ohdsi/).
80 changes: 80 additions & 0 deletions templates/workspace_services/ohdsi/parameters.json
@@ -0,0 +1,80 @@
{
"schemaType": "ParameterSet",
"schemaVersion": "1.0.1",
"namespace": "",
"name": "tre-workspace-service-ohdsi",
"parameters": [
{
"name": "tre_id",
"source": {
"env": "TRE_ID"
}
},
{
"name": "id",
"source": {
"env": "ID"
}
},
{
"name": "tfstate_container_name",
"source": {
"env": "TERRAFORM_STATE_CONTAINER_NAME"
}
},
{
"name": "tfstate_resource_group_name",
"source": {
"env": "MGMT_RESOURCE_GROUP_NAME"
}
},
{
"name": "tfstate_storage_account_name",
"source": {
"env": "MGMT_STORAGE_ACCOUNT_NAME"
}
},
{
"name": "workspace_id",
"source": {
"env": "WORKSPACE_ID"
}
},
{
"name": "address_space",
"source": {
"env": "ADDRESS_SPACE"
}
},
{
"name": "arm_environment",
"source": {
"env": "ARM_ENVIRONMENT"
}
},
{
"name": "azure_environment",
"source": {
"env": "AZURE_ENVIRONMENT"
}
},
{
"name": "configure_data_source",
"source": {
"env": "CONFIGURE_DATA_SOURCE"
}
},
{
"name": "data_source_config",
"source": {
"env": "DATA_SOURCE_CONFIG"
}
},
{
"name": "data_source_daimons",
"source": {
"env": "DATA_SOURCE_DAIMONS"
}
}
]
}
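
The parameter set above maps each bundle parameter to an environment variable. As a rough illustration of how that mapping can be inspected, the sketch below parses a parameter set with the same shape and reports which variables are not set. Both function names are hypothetical and not part of this commit.

```python
import json
from collections.abc import Mapping


def required_env_vars(parameter_set_json: str) -> dict[str, str]:
    """Map each parameter name to the environment variable it reads."""
    doc = json.loads(parameter_set_json)
    return {
        p["name"]: p["source"]["env"]
        for p in doc.get("parameters", [])
        if "env" in p.get("source", {})
    }


def missing_env_vars(parameter_set_json: str, environ: Mapping[str, str]) -> list[str]:
    """Return the referenced environment variables that are absent from environ."""
    needed = required_env_vars(parameter_set_json).values()
    return sorted({name for name in needed if name not in environ})
```

Passing the file contents together with `os.environ` to `missing_env_vars` lists the variables that would need to be exported before the parameter set can be resolved.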
185 changes: 185 additions & 0 deletions templates/workspace_services/ohdsi/porter.yaml
@@ -0,0 +1,185 @@
---
schemaVersion: 1.0.0
name: tre-workspace-service-ohdsi
version: 0.1.94
description: "An OHDSI workspace service"
registry: azuretre
dockerfile: Dockerfile.tmpl

custom:
dialects:
"Azure Synapse": "synapse"

credentials:
- name: azure_tenant_id
env: ARM_TENANT_ID
- name: azure_subscription_id
env: ARM_SUBSCRIPTION_ID
- name: azure_client_id
env: ARM_CLIENT_ID
- name: azure_client_secret
env: ARM_CLIENT_SECRET

parameters:
- name: workspace_id
type: string
- name: tre_id
type: string
- name: address_space
type: string
description: "Address space for PostgreSQL's subnet"

- name: id
type: string
description: "An Id for this installation"
env: id
- name: tfstate_resource_group_name
type: string
description: "Resource group containing the Terraform state storage account"
- name: tfstate_storage_account_name
type: string
description: "The name of the Terraform state storage account"
- name: tfstate_container_name
env: tfstate_container_name
type: string
default: "tfstate"
description: "The name of the Terraform state storage container"
- name: arm_use_msi
env: ARM_USE_MSI
type: boolean
default: false
- name: arm_environment
type: string
- name: azure_environment
type: string
description: "Used by Azure CLI to set the Azure environment"

# parameters for configuring the data source
- name: configure_data_source
type: boolean
default: false
- name: data_source_config
type: string
default: ""
- name: data_source_daimons
type: string
default: ""

mixins:
- terraform:
clientVersion: 1.4.6
- az:
clientVersion: 2.37.0

outputs:
- name: connection_uri
type: string
applyTo:
- install
- upgrade
- name: webapi_uri
type: string
applyTo:
- install
- upgrade
- name: authentication_callback_uri
type: string
applyTo:
- install
- upgrade
- name: is_exposed_externally
type: boolean
applyTo:
- install
- upgrade


install:
- az:
description: "Set Azure Cloud Environment"
arguments:
- cloud
- set
flags:
name: ${ bundle.parameters.azure_environment }
- az:
description: "Login to Azure"
arguments:
- login
flags:
identity:
username: ${ bundle.credentials.azure_client_id }
- terraform:
description: "Deploy OHDSI workspace service"
vars:
workspace_id: ${ bundle.parameters.workspace_id }
tre_id: ${ bundle.parameters.tre_id }
tre_resource_id: ${ bundle.parameters.id }
address_space: ${ bundle.parameters.address_space }
arm_environment: ${ bundle.parameters.arm_environment }
configure_data_source: ${ bundle.parameters.configure_data_source }
data_source_config: ${ bundle.parameters.data_source_config }
data_source_daimons: ${ bundle.parameters.data_source_daimons }
backendConfig:
resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
container_name: ${ bundle.parameters.tfstate_container_name }
key: tre-workspace-service-ohdsi-${ bundle.parameters.id }
outputs:
- name: connection_uri
- name: webapi_uri
- name: authentication_callback_uri
- name: is_exposed_externally
upgrade:
- az:
description: "Set Azure Cloud Environment"
arguments:
- cloud
- set
flags:
name: ${ bundle.parameters.azure_environment }
- az:
description: "Login to Azure"
arguments:
- login
flags:
identity:
username: ${ bundle.credentials.azure_client_id }
- terraform:
description: "Upgrade OHDSI workspace service"
vars:
workspace_id: ${ bundle.parameters.workspace_id }
tre_id: ${ bundle.parameters.tre_id }
tre_resource_id: ${ bundle.parameters.id }
address_space: ${ bundle.parameters.address_space }
arm_environment: ${ bundle.parameters.arm_environment }
configure_data_source: ${ bundle.parameters.configure_data_source }
data_source_config: ${ bundle.parameters.data_source_config }
data_source_daimons: ${ bundle.parameters.data_source_daimons }
backendConfig:
resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
container_name: ${ bundle.parameters.tfstate_container_name }
key: tre-workspace-service-ohdsi-${ bundle.parameters.id }
outputs:
- name: connection_uri
- name: webapi_uri
- name: authentication_callback_uri
- name: is_exposed_externally
uninstall:
- terraform:
description: "Tear down OHDSI workspace service"
vars:
workspace_id: ${ bundle.parameters.workspace_id }
tre_id: ${ bundle.parameters.tre_id }
tre_resource_id: ${ bundle.parameters.id }
address_space: ${ bundle.parameters.address_space }
arm_environment: ${ bundle.parameters.arm_environment }
configure_data_source: ${ bundle.parameters.configure_data_source }
data_source_config: ${ bundle.parameters.data_source_config }
data_source_daimons: ${ bundle.parameters.data_source_daimons }
backendConfig:
resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
container_name: ${ bundle.parameters.tfstate_container_name }
key: tre-workspace-service-ohdsi-${ bundle.parameters.id }