Add shell scripts to automate much of the ETL process. #180

Merged · 4 commits · Sep 13, 2022
102 changes: 94 additions & 8 deletions README.md
@@ -150,16 +150,102 @@ The table below demonstrates, at a high level, the information that is being c
1.6. Repeat step 1.3 for all Oracle databases that you want to assess.

## Step 2 - Importing the data collected into Google BigQuery for analysis
Much of the data import and report generation has been automated. Follow section 2.1 to use the automated process; section 2.2 provides instructions for the manual process if you prefer that. Both processes assume you have rights to create datasets in a BigQuery project and access to Data Studio.


2.1 Automated load process

These instructions are written for running in a Cloud Shell environment.
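Before starting, it can help to confirm that Cloud Shell is authenticated and pointed at the project you intend to load into (the project id below is a placeholder; these are the same commands used in section 2.2.1):

```
gcloud auth list

gcloud config set project <project id>
```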

2.1.1 Clone the Optimus Prime codebase to a working directory.

Create a working directory for the code base, then clone the repository from Github.

Ex:
```
mkdir -p ~/code/op
cd ~/code/op
git clone https://github.com/GoogleCloudPlatform/oracle-database-assessment
```

2.1.2 Create a data directory and upload files from the client

Create a directory to hold the output files for processing, then upload the files to that location and uncompress.

Ex:
```
mkdir ~/data
<upload files to data>
cd ~/data
<uncompress files>
```
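For example, if the collection output was delivered as zip archives staged in a Cloud Storage bucket, the upload and uncompress steps might look like the sketch below (the bucket and archive names are hypothetical; you can also upload files directly through the Cloud Shell file-upload menu):

```
cd ~/data
# Copy the collection archives from a bucket (hypothetical bucket name)
gsutil cp gs://your-bucket/opdb*.zip .
# Uncompress each archive into the data directory
for f in opdb*.zip; do unzip "$f"; done
```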

2.1.3 Configure automation

The automated process is configured via the file `<workingdirectory>/oracle-database-assessment/db_assessment/0_configure_op_env.sh`. Edit this file and set values for the following variables:

```
# This is the name of the project into which you want to load data
export PROJECTNAME=yourProjectNameHere

# This is the name of the data set into which you want to load.
# The dataset will be created if it does not exist.
# If the dataset already exists, this data will be appended to it.
# Use only alphanumeric characters, - (dash) or _ (underscore)
# This name must be filesystem- and HTML-compatible
export DSNAME=yourDatasetNameHere

# This is the location in which the dataset should be created.
export DSLOC=yourRegionNameHere

# This is the full path into which the customer's files have been extracted.
export OP_LOG_DIR=/full/Path/To/LogFiles

# This is the name of the report you want to create in Data Studio upon load completion.
# Use only alphanumeric characters, or URL-encode special characters (for example %20 for a space).
export REPORTNAME="OptimusPrime%20Dashboard%20${DSNAME}"
```
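For example, a filled-in configuration might look like the sketch below (all values are placeholders; substitute your own project, dataset, region and path):

```
export PROJECTNAME=my-migration-project
export DSNAME=acme_assessment_2022_09
export DSLOC=US
export OP_LOG_DIR=/home/user/data
export REPORTNAME="OptimusPrime%20Dashboard%20${DSNAME}"
```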

2.1.4 Execute the load scripts

The load scripts expect to be run from the `<workingdirectory>/oracle-database-assessment/db_assessment` directory. Change to this directory and run the following commands in numeric order; note that each script is sourced (the leading `.`), so the environment variables it sets persist in your shell. Check the output of each for errors before continuing to the next.

```
. ./0_configure_op_env.sh
. ./1_activate_op.sh
. ./2_load_op.sh
. ./3_run_op_etl.sh
. ./4_gen_op_report_url.sh
```

The function of each script is as follows.
```
0_configure_op_env.sh - Defines environment variables that are used in the other scripts.
1_activate_op.sh - Installs necessary Python support modules and activates the Python virtual environment for Optimus Prime.
2_load_op.sh - Loads the client data files into the base Optimus Prime tables in the requested data set.
3_run_op_etl.sh - Installs and runs BigQuery procedures that create additional views and tables to support the Optimus Prime dashboard.
4_gen_op_report_url.sh - Generates the URL to view the newly loaded data using a report template.
```
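After steps 2 and 3, you can scan the logs they produce for problems before moving on. A simple sketch (the log names follow the patterns used by the scripts above):

```
grep -iE 'error|exception' opload-${DSNAME}-*.log op_etl_${DSNAME}.log
```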

2.1.5 View the data in Optimus Prime Dashboard report

Click the link displayed by 4_gen_op_report_url.sh to view the report. Note that following the link does not save the report.
To save the report for future use, click the "Edit and Share" button, then "Acknowledge and Save", then "Add to Report". The report will then appear in Data Studio under "Reports owned by me" and can be shared with others.

Skip to step 3 to perform additional analysis for anything not contained in the dashboard report.

2.2 Manual load process

2.2.1 Set up Environment variables (From Google Cloud Shell ONLY).

```
gcloud auth list

gcloud config set project <project id>
```

2.2.2 Export Environment variables. (The working directory was created in step 1.2.)

```
export OP_WORKING_DIR=<path for working directory>
mkdir $OP_OUTPUT_DIR/log
export OP_LOG_DIR=$OP_OUTPUT_DIR/log
```
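Part of the block above is elided. A minimal sketch of the full variable set, assuming the output directory sits under the working directory (matching the path created in step 2.2.5 below), is:

```
export OP_WORKING_DIR=<path for working directory>
export OP_OUTPUT_DIR=$OP_WORKING_DIR/oracle-database-assessment-output
mkdir -p $OP_OUTPUT_DIR/log
export OP_LOG_DIR=$OP_OUTPUT_DIR/log
```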

2.2.3 Create working directory (skip if you followed step 1.2 on the same server)

```
mkdir $OP_WORKING_DIR
```

2.2.4 Clone GitHub repository (skip if you followed step 1.2 on the same server)

```
cd <work-directory>
git clone https://github.com/GoogleCloudPlatform/oracle-database-assessment
```

2.2.5 Create assessment output directory

```
mkdir -p /<work-directory>/oracle-database-assessment-output
cd /<work-directory>/oracle-database-assessment-output
```

2.2.6 Move zip files to assessment output directory and unzip

```
mv <zip files> /<work-directory>/oracle-database-assessment-output
unzip <zip files>
```

2.2.7 [Create a service account and download the key](https://cloud.google.com/iam/docs/creating-managing-service-accounts#before-you-begin).

* Set GOOGLE_APPLICATION_CREDENTIALS to point to the downloaded key (a sketch follows this list). Make sure the service account has the BigQuery Admin privilege.
* NOTE: This step can be skipped if using [Cloud Shell](https://ssh.cloud.google.com/cloudshell/)
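If you are running outside Cloud Shell, a minimal sketch of pointing your environment at the downloaded key (the key path is a placeholder) is:

```
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your-service-account-key.json
```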

2.2.8 Create a Python virtual environment to install dependencies and execute the `optimusprime.py` script

```
python3 -m venv $OP_WORKING_DIR/op-venv
```
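The rest of this block is elided above. A minimal sketch of the remaining commands, assuming the same flags that `2_load_op.sh` uses (angle-bracket values are placeholders), is:

```
source $OP_WORKING_DIR/op-venv/bin/activate
cd <work-directory>/oracle-database-assessment
pip3 install pip --upgrade
pip3 install .
# -sep is the column separator used in the collection files (pipe for newer extracts, semicolon for older ones)
python3 db_assessment/optimusprime.py -sep '|' -dataset <dataset name> -fileslocation $OP_LOG_DIR -projectname <project name> -collectionid <collection id>
```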
40 changes: 40 additions & 0 deletions db_assessment/0_configure_op_env.sh
@@ -0,0 +1,40 @@
# This file configures the environment for loading data files from the client
# Edit this file and set the project name, data set name, and data set location to
# where you want the data loaded.
# Ensure you have proper access to the project and rights to create a data set.

# This is the name of the project into which you want to load data
export PROJECTNAME=yourProjectNameHere

# This is the name of the data set into which you want to load.
# The dataset will be created if it does not exist.
# If the dataset already exists, this data will be appended to it.
# Use only alphanumeric characters, - (dash) or _ (underscore)
# This name must be filesystem- and HTML-compatible
export DSNAME=yourDatasetNameHere

# This is the location in which the dataset should be created.
export DSLOC=yourRegionNameHere

# This is the full path into which the customer's files have been extracted.
export OP_LOG_DIR=fullPathToLogFiles

# This is the name of the report you want to create in Data Studio upon load completion.
# Use only alphanumeric characters, or URL-encode special characters (for example %20 for a space).
export REPORTNAME="OptimusPrime%20Dashboard%20${DSNAME}"

# This is the column separator used in the customer's files. Older versions of
# the extract use a semicolon; newer versions use a pipe.
export COLSEP='|'


export OP_WORKING_DIR=$(pwd)

echo
echo Environment set to load from ${OP_LOG_DIR} into ${PROJECTNAME}.${DSNAME}

# Globs are not expanded inside [[ -s ... ]], so check each matching error log individually.
for ERRFILE in ${OP_LOG_DIR}/errors*.log
do
    if [[ -s ${ERRFILE} ]]
    then
        echo Errors found in data to be loaded. Please review before continuing.
        cat ${ERRFILE}
    fi
done
8 changes: 8 additions & 0 deletions db_assessment/1_activate_op.sh
@@ -0,0 +1,8 @@
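# Create and activate a Python virtual environment in the repository root
# (one level above db_assessment), then install Optimus Prime and its dependencies.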
THISDIR=$(pwd)
python3 -m venv ${OP_WORKING_DIR}/../op-venv
source ${OP_WORKING_DIR}/../op-venv/bin/activate
cd ${OP_WORKING_DIR}/..

pip3 install pip --upgrade
pip3 install .
cd ${THISDIR}
13 changes: 13 additions & 0 deletions db_assessment/2_load_op.sh
@@ -0,0 +1,13 @@
THISD=$(pwd)
bq mk -d --data_location=${DSLOC} ${DSNAME}
cd ${OP_WORKING_DIR}/..
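# Derive each collection id from the opdb* file names (the second-to-last dot-separated field)
# and load that collection's files into BigQuery.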
for COLID in $(ls -1 ${OP_LOG_DIR}/opdb*| rev | cut -d '.' -f 2 | rev | sort | uniq)
do
python3 ./db_assessment/optimusprime.py -sep "${COLSEP}" -dataset ${DSNAME} -fileslocation ${OP_LOG_DIR} -projectname ${PROJECTNAME} -collectionid ${COLID} | tee ${THISD}/opload-${DSNAME}-${COLID}.log
done
echo
echo Logs of this upload are available at:
echo
ls -l ${THISD}/opload-${DSNAME}-*.log
echo
cd ${THISD}
4 changes: 4 additions & 0 deletions db_assessment/3_run_op_etl.sh
@@ -0,0 +1,4 @@
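# Substitute the target project and dataset into the ETL SQL template,
# then run it in BigQuery and capture the output in a log file.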
sed "s/projectID.dataset/${PROJECTNAME}.${DSNAME}/g" op_etl_template.sql > op_etl_${DSNAME}.sql
bq query --use_legacy_sql=false <op_etl_${DSNAME}.sql | tee op_etl_${DSNAME}.log
echo
echo A log of this process is available at op_etl_${DSNAME}.log
56 changes: 56 additions & 0 deletions db_assessment/4_gen_op_report_url.sh
@@ -0,0 +1,56 @@
# ReportID is taken from the DataStudio template upon which the new report will be created.
REPORTID=ed2d87f1-e037-4e65-8ef0-4439a3e62aa3

# REPORTNAME and DSNAME are set in another script.
# The URL template is formatted for editability and readability.
# Line feeds, carriage returns and spaces will be filtered out when generated.
# Any new data sources added to the template will need to be modified here.
URL_TEMPLATE="https://datastudio.google.com/reporting/create?c.
reportId=${REPORTID}
&r.reportName=${REPORTNAME}
&ds.ds106.connector=bigQuery
&ds.ds106.datasourceName=T_DS_Database_Metrics
&ds.ds106.projectId=optimusprime-migrations
&ds.ds106.type=TABLE
&ds.ds106.datasetId=${DSNAME}
&ds.ds106.tableId=T_DS_Database_Metrics
&ds.ds96.connector=bigQuery
&ds.ds96.datasourceName=T_DS_BMS_sizing
&ds.ds96.projectId=optimusprime-migrations
&ds.ds96.type=TABLE
&ds.ds96.datasetId=${DSNAME}
&ds.ds96.tableId=T_DS_BMS_sizing
&ds.ds103.connector=bigQuery
&ds.ds103.datasourceName=V_DS_BMS_BOM
&ds.ds103.projectId=optimusprime-migrations
&ds.ds103.type=TABLE
&ds.ds103.datasetId=${DSNAME}
&ds.ds103.tableId=V_DS_BMS_BOM
&ds.ds169.connector=bigQuery
&ds.ds169.datasourceName=V_DS_HostDetails
&ds.ds169.projectId=optimusprime-migrations
&ds.ds169.type=TABLE
&ds.ds169.datasetId=${DSNAME}
&ds.ds169.tableId=V_DS_HostDetails
&ds.ds68.connector=bigQuery
&ds.ds68.datasourceName=V_DS_dbfeatures
&ds.ds68.projectId=optimusprime-migrations
&ds.ds68.type=TABLE
&ds.ds68.datasetId=${DSNAME}
&ds.ds68.tableId=V_DS_dbfeatures
&ds.ds12.connector=bigQuery
&ds.ds12.datasourceName=V_DS_dbsummary
&ds.ds12.projectId=optimusprime-migrations
&ds.ds12.type=TABLE
&ds.ds12.datasetId=${DSNAME}
&ds.ds12.tableId=V_DS_dbsummary"

echo
echo The Optimus Prime dashboard report \"${REPORTNAME}\" is available at the link below
echo
echo ${URL_TEMPLATE} | sed 's/\r//g;s/\n//g;s/ //g'
echo
echo Click the link to view the report.
echo To create a persistent copy of this report:
echo Click the '"Edit and Share"' button, then '"Acknowledge and Save"', then '"Add to Report"'.
echo It will then show up in Data Studio in '"Reports owned by me"' and can be shared with others.