First, please confirm that you meet the following user requirements:
- Access to an Azure resource group where the starter kit has been deployed, with the following roles:
  - `Storage Blob Data Contributor` on the PHI storage account
  - `FHIR Data Contributor` on the FHIR server
- All of the files in the sample-data/ directory have been downloaded to your computer.
- They can be downloaded as a zip file from this link.
If your organization has not yet implemented the pipeline, please follow the steps in this Implementation Guide.
If you do not have sufficient access, please contact your organization's Azure administrator.
This how-to guide is divided into four parts:
- Accessing your Azure account
- Uploading and running data through the pipeline
- Monitoring pipeline activities in Azure Data Factory
- Viewing cleaned and enriched data in the FHIR server
The `sample-data/` directory contains dummy unsolicited vaccination record update (VXU) messages that can be used to test the success and failure modes of the ingestion pipeline. To start, let's use the `VXU_single_messy_demo.hl7` file, which has a single VXU message. The Patient Identification (PID) segment of this message (shown below) contains some dirty data:
- The patient's name is mixed case and contains a numeric character.
- The patient's phone number is not in a standard format.
- The patient's address is non-standard and has not been geocoded.
PID|1|7777555^4^M11^test^MR^University Hospital^19241011^19241012|PATID7755^5^M11^test1|PATID7758^^^test5|doe .^ John1 ^A.|TEST^Mother, of^L|198505101126+0215|M||2106-3^White^HL70005|555 E. 3065 S.^^Salt Lake CIty^ut^84106^USA||801-540-3661^^CP|||M^Married||4880776||||N^NOT HISPANIC OR LATINO^HL70189||N||US^United States of America^ISO3166_1||||N|||20080110015014+0315|||||||
If you would like, feel free to confirm that this is the case by inspecting the file directly in the text editor of your choice.
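As an alternative to opening the file in an editor, the three dirty fields can be pulled out on the command line. The following is a minimal sketch, assuming a Bash-like shell with `tr` and `awk` available; `pid_fields` is a hypothetical helper name and is not part of the starter kit. It relies on the standard HL7 v2 layout, in which the patient name is field PID-5, the address is PID-11, and the home phone is PID-13.

```shell
#!/usr/bin/env bash
# Print the name (PID-5), address (PID-11), and phone (PID-13) fields of every
# PID segment read from stdin. HL7 v2 fields are pipe-delimited, so awk column
# N+1 holds field PID-N (column 1 is the segment name itself).
pid_fields() {
  # HL7 v2 segments are usually separated by carriage returns, so convert
  # them to newlines before handing the text to awk.
  tr '\r' '\n' | awk -F'|' '$1 == "PID" {
    print "name:    " $6
    print "address: " $12
    print "phone:   " $14
  }'
}

# Usage (file name as given in the text above):
#   pid_fields < sample-data/VXU_single_messy_demo.hl7
```

Running this before and after the pipeline makes it easier to compare the raw values with the cleaned ones stored in the FHIR server.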
- Open https://portal.azure.com/ in your browser and log in with your username and password.
- Go to the search bar in the header and search for `Resource groups`. Click on `Resource groups` in the search dropdown.
- Click into the appropriate Azure resource group.
Note: If you prefer, you can upload data using the Azure Storage Explorer tool. We don't provide instructions for using that tool here, but the broad strokes will be the same: you'll need to upload `sample-data/VXU-V04-01_success_single.hl7` to the `source-data` container in your PHI storage account.
- Within your `Resource group`, filter down to view only `Storage account` type resources. To do so, click into the `Filter for any field...` search bar and type in `Storage account`.
- Click into the `PHI` storage account, which is where all Protected Health Information is stored outside of the FHIR server. The precise name of the storage account will have the form `phdi{environment}phi{clientId}`, e.g., `phdidevphi1667849158`.
- Go to the left sidebar and, under the `Data storage` header, click `Containers`.
- Now we're ready to run a VXU message through the pipeline! Click the `Upload` button in the toolbar.
- Navigate to the folder on your computer where you've downloaded or forked the `sample-data/` folder and select the `VXU-V04-01_success_single.hl7` file. Click `Open`, then click `Upload` to upload this file into the `source-data/vxu/` directory of your PHI storage account.
  Note: because the ingestion pipeline is event-driven, simply uploading the file is all that is required to trigger the pipeline. There is an event listener monitoring the PHI storage account for file creation events.
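The same upload can also be done from a terminal with the Azure CLI. The sketch below assumes you are already signed in with `az login` and hold the `Storage Blob Data Contributor` role; `phi_account_name` and `upload_vxu` are hypothetical helper names introduced for illustration only.

```shell
#!/usr/bin/env bash
# Build the PHI storage account name from its parts:
# phdi{environment}phi{clientId}
phi_account_name() {
  local environment="$1" client_id="$2"
  printf 'phdi%sphi%s\n' "$environment" "$client_id"
}

# Upload one local HL7 file into source-data/vxu/ using the signed-in Azure
# identity rather than an account key.
upload_vxu() {
  local account="$1" file="$2"
  az storage blob upload \
    --account-name "$account" \
    --container-name source-data \
    --name "vxu/$(basename "$file")" \
    --file "$file" \
    --auth-mode login
}

# Usage, with the example account name from the steps above:
#   upload_vxu "$(phi_account_name dev 1667849158)" \
#     sample-data/VXU-V04-01_success_single.hl7
```

Because the pipeline is triggered by the file creation event, a CLI upload should start a run in the same way as an upload through the portal.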
Congrats! You've run a VXU message through the pipeline. Now you can view the pipeline run to confirm that it ran successfully without errors.
- To check that the pipeline has executed, go to the search bar in the header and search for `Data factories`. Select the `Data factories` option in the search dropdown.
- Click into your data factory, which will have a name of the form `phdi-{environment}-data-factory-{client-id}`.
- Launch the Data Factory Studio by clicking the blue button that says `Launch studio`.
- In the left sidebar, click on the `Monitor` tab (radar icon) to view the pipeline runs.
- Click into your pipeline run, which will have a name of the form `phdi-{environment}-ingestion`. If you see multiple pipeline runs with this name, select the one with the most recent run start time.
  Note: this pipeline may still have an `In progress` status. You can click the 'refresh' button to view the pipeline run's latest status.
- After clicking into your pipeline run, you should see a diagram showing the steps of the pipeline and a table with information about each activity. The green checkmarks and `succeeded` statuses next to each activity show that the ingestion pipeline has processed one message successfully.
  Note: You can click the 'refresh' button to view the pipeline run activities in their most recent status.
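Pipeline runs can also be listed from a terminal instead of the Data Factory Studio. This is a rough sketch only: it assumes the Azure CLI `datafactory` extension is installed, that GNU `date` is available (as it is on Linux), and that the query result exposes `pipelineName`, `status`, and `runStart` under `value`. `data_factory_name` and `recent_pipeline_runs` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build the data factory name: phdi-{environment}-data-factory-{client-id}
data_factory_name() {
  local environment="$1" client_id="$2"
  printf 'phdi-%s-data-factory-%s\n' "$environment" "$client_id"
}

# List the pipeline runs updated in the last hour for one data factory.
# Assumes `az login` has been completed and the datafactory extension exists.
recent_pipeline_runs() {
  local resource_group="$1" factory="$2"
  az datafactory pipeline-run query-by-factory \
    --resource-group "$resource_group" \
    --factory-name "$factory" \
    --last-updated-after "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --last-updated-before "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --query 'value[].{pipeline: pipelineName, status: status, start: runStart}' \
    --output table
}

# Usage (the resource group name is a placeholder you must supply):
#   recent_pipeline_runs MY_RESOURCE_GROUP "$(data_factory_name dev 1667849158)"
```

If a run is still in progress, rerunning the function plays the same role as the 'refresh' button in the portal.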
- Open another tab and go to https://shell.azure.com.
  Instructions for first-time users: A pop-up will appear asking you to select either the Bash or PowerShell option. Select the `Bash` option. A second pop-up will appear stating "You have no storage mounted". Click `Create storage`.
- Confirm that you're in Bash mode by checking that the dropdown in the top left under the `Microsoft Azure` header has `Bash` selected.
- Then in the terminal, type the command `az login` and press enter on your keyboard.
- Click the device login link, and paste in the authentication code.
- Follow the prompts to complete login. After logging in, you should see a pop-up that says "You have signed in...".
Now that we're authenticated, we're going to search the FHIR server for the patient named "John Doe" from the VXU message we ran earlier.
- A couple of the commands below will require you to update the URL in the command with the name of your FHIR server. To get the name of your FHIR server, go back to the tab with portal.azure.com open. Then in the search bar, type in `Azure API for FHIR` and select this option in the search dropdown.
- You should see your FHIR server, which will have a name of the form `{environment}fhir{client-id}`. Click into your FHIR server.
- Copy the name of your FHIR server (`{environment}fhir{client-id}`).
- Go back to the tab with Cloud Shell open. Copy and paste the command below, then replace the FHIR_SERVER_NAME text with the copied name of your FHIR server. This will save the name of your FHIR server into a variable that will be referenced in later commands.
export FHIR_SERVER=FHIR_SERVER_NAME
- Copy and paste the below command into the terminal and hit enter. This gets you a bearer token that will be used to authenticate in the next command:
token=$(az account get-access-token --resource=https://$FHIR_SERVER.azurehealthcareapis.com --query accessToken --output tsv)
- Copy and paste the below command into the terminal and hit enter. This uses the bearer token from above to authenticate and search for the "John Doe" patient:
RESPONSE=$(curl -X GET --header "Authorization: Bearer $token" "https://$FHIR_SERVER.azurehealthcareapis.com/Patient?family=DOE&given=JOHN")
- Finally, copy and paste the below command into the terminal and hit enter. This pretty-prints the JSON response, showing that the John Doe patient from the VXU message was found in the FHIR server. After you're done reviewing the response, press `q` on your keyboard to exit the `less` pager and return to the terminal.
echo "$RESPONSE" | jq . | less
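The full search response can be long, so it may help to print only the fields that were dirty in the original message. The sketch below wraps the token, search, and formatting steps into one function; it assumes `az`, `curl`, and `jq` are available (as they are in Cloud Shell) and that the response is a standard FHIR search Bundle with matches under `entry[].resource`. `patient_search_url` and `show_patient` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build a FHIR Patient search URL from a server name, family name, and
# given name.
patient_search_url() {
  local server="$1" family="$2" given="$3"
  printf 'https://%s.azurehealthcareapis.com/Patient?family=%s&given=%s\n' \
    "$server" "$family" "$given"
}

# Fetch the search results and print each matching patient's name, telecom
# (phone), and address only.
show_patient() {
  local server="$1" family="$2" given="$3" token
  token=$(az account get-access-token \
    --resource "https://$server.azurehealthcareapis.com" \
    --query accessToken --output tsv)
  curl --silent --header "Authorization: Bearer $token" \
    "$(patient_search_url "$server" "$family" "$given")" |
    jq '.entry[]?.resource | {name, telecom, address}'
}

# Usage, after exporting FHIR_SERVER as described above:
#   show_patient "$FHIR_SERVER" DOE JOHN
```

The `[]?` in the `jq` filter keeps the command quiet when the search returns no matches and the Bundle has no `entry` array.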
For other FHIR API calls that you can make in the Cloud Shell, see the documentation on FHIR API Calls.
The table below describes the contents and expected ingestion pipeline behavior for each of the other files included in `sample-data/`. Choose another message to run through the pipeline to see what a pipeline run with an expected error or a batch message will look like.
Detailed troubleshooting instructions can be found in the Pipeline Troubleshooting Guide.
Test File | File Contents | Expected Outcome |
---|---|---|
VXU-V04-01_success_single.hl7 | A single valid VXU message. | The ingestion pipeline will process a single message and upload it to the FHIR server. |
VXU-V04-02_failedConversion.hl7 | A single invalid VXU message that cannot be converted to FHIR. | The ingestion process will fail during the initial conversion to FHIR step. Information about the failure is written to `failed_fhir_conversion\vxu\`. |
VXU-V04-02_failedUpload.hl7 | A single VXU message that converts to an invalid FHIR bundle. | The ingestion pipeline will fail during the final step when it attempts to upload the data to the FHIR server. Information about the failure is written to `failed_fhir_uploads\vxu\`. |
VXU-V04-02_success_batch.hl7 | A batch HL7 message containing two valid VXU messages. | The ingestion pipeline is triggered twice and runs successfully to completion both times. |
VXU-V04-03_batch_1_success_1_failConversion.hl7 | A batch HL7 message containing one valid and one invalid VXU message. | The ingestion pipeline will run twice. On one execution it successfully processes the data and uploads it to the FHIR server. On the other execution it fails. |