This repository has been archived by the owner on Jul 22, 2024. It is now read-only.
GregDritschler edited this page Mar 4, 2019 · 3 revisions

Short title

Transforming and loading big data CSV files into a DB2 for z/OS database

Author

URLs

GitHub repo

Summary

In this code pattern, we generate a set of CSV files, transform them with a tool called SQLite, and load them into a DB2 for z/OS database using a JDBC function called zload.

Technologies

  • Databases
  • Java
  • Systems

Description

This work was done as part of the Summit Health set of code patterns, which demonstrate how cloud technology can access data stored on z/OS systems. We needed a way to generate a large amount of patient health care data to populate the DB2 for z/OS database. We found an open source tool called Synthea that generates the kind of synthetic data we wanted.

The Synthea CSV files needed to be transformed to match the table schemas used in the Summit Health application. We found a public domain tool called SQLite which made these transformations easy.

Finally, the transformed CSV files needed to be loaded from a distributed workstation into the DB2 for z/OS database. We used a JDBC function called zload to accomplish this. zload requires DB2 for z/OS version 12.

Flow

A script (run.sh on Linux and macOS, run.bat on Windows) drives the processing. There are four main steps:

  1. The Synthea tool is called to generate a set of CSV files containing synthesized patient health care data.
  2. A JDBC program is called to determine the current maximum patient number in the DB2 for z/OS database.
  3. The SQLite program is called to transform the CSV files produced by Synthea to match the schema of the DB2 for z/OS database.
  4. A JDBC program is called to load the transformed CSV files into the DB2 for z/OS database tables.
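Step 2 can be pictured with a small JDBC sketch. The table name `PATIENT`, the column name `PATIENTID`, and the class `PatientNumbers` are hypothetical and are not taken from the Summit Health schema. The sketch also assumes that the maximum is looked up so that newly generated patients can be numbered above the existing ones, which the steps above do not state explicitly.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: the table and column names are assumptions,
// not the actual Summit Health schema.
final class PatientNumbers {
    // Assumed query; substitute the real table and column names.
    static final String MAX_QUERY = "SELECT MAX(PATIENTID) FROM PATIENT";

    // Returns the largest patient number in the table, or 0 if the table is empty.
    static int currentMax(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(MAX_QUERY)) {
            rs.next();               // an aggregate without GROUP BY returns exactly one row
            int max = rs.getInt(1);  // getInt yields 0 when the value is SQL NULL
            return rs.wasNull() ? 0 : max;
        }
    }

    // First number to give to a newly generated patient (assumed numbering scheme).
    static int firstNewNumber(int currentMax) {
        return currentMax + 1;
    }
}
```

On an empty table `MAX` returns SQL NULL, so the sketch treats that case as 0 and numbering would start at 1.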

Instructions

  1. Install the following prerequisite tools.

  2. Clone and build this project.

  3. Clone and build the Synthea project.

  4. Change the following properties in synthea/src/main/resources/synthea.properties:

    • Set exporter.csv.export to true
    • Set generate.append_numbers_to_person_names to false (optional)
  5. Create the DB2 for z/OS database.

  6. Set up environment variables that the script needs to connect to your DB2 for z/OS database.

  7. Run the script from this project with the current directory set to your synthea project.
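As an illustration of step 6, the sketch below shows one way a JDBC program could turn environment variables into a DB2 connection URL. The variable names `DB2_HOST`, `DB2_PORT`, and `DB2_LOCATION` and the class `Db2Settings` are hypothetical; use whatever names the project's script actually expects. The URL follows the usual `jdbc:db2://host:port/location` form for type 4 connectivity, where the last part is the DB2 for z/OS location name.

```java
import java.util.Map;

// Hypothetical helper: the environment variable names are illustrative only.
final class Db2Settings {
    // Builds a type 4 JDBC URL from the given environment map.
    static String jdbcUrl(Map<String, String> env) {
        String host = require(env, "DB2_HOST");
        String port = require(env, "DB2_PORT");
        String location = require(env, "DB2_LOCATION");
        return "jdbc:db2://" + host + ":" + port + "/" + location;
    }

    // Fails fast with a clear message when a variable is missing or blank.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value.trim();
    }
}
```

A program would typically call `Db2Settings.jdbcUrl(System.getenv())` and pass the result, together with a user ID and password, to `DriverManager.getConnection`. Taking the map as a parameter keeps the helper easy to check without touching the real environment.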

Components and services

  • IBM DB2 Database

Runtimes

  • Java