add(script): trino-wrapper
rohank07 committed May 4, 2022
1 parent c5b7982 commit c727edf
Showing 23 changed files with 309 additions and 39 deletions.
11 changes: 10 additions & 1 deletion .github/pull_request_template.md
@@ -4,6 +4,15 @@

# Tests / Quality Checks


## Are there breaking changes?
Ask yourself the following question:
- [ ] Do we want to keep maintaining the previous image, the one these breaking changes move away from?

If no, then carry on. If yes (there is a breaking change and we **want to maintain the previous image**), do the following:
- [ ] Create a new branch for the current version (e.g. `v1`) based off the current master/main branch
- [ ] Increment the tag in the CI for pushes to master/main (`v1` to `v2`)
- [ ] Change the CI so that pushes to the newly created branch (`v1` in this example, the branch we want to keep maintaining) push to the ACR.
## Automated Testing/build and deployment
- [ ] Does the image pass CI successfully (build, pass vulnerability scan, and pass automated test suite)?
- [ ] If new features are added (new image, new binary, etc), have new automated tests been added to cover these?
@@ -21,4 +30,4 @@
## Code review

- [ ] Have you added the `auto-deploy` tag to your PR before your most recent push to this repo? This causes CI to build the image and push to our ACR, letting reviewers access the built image without having to create it themselves
- [ ] Have you chosen a reviewer, attached them as a reviewer to this PR, and messaged them with the SHA-pinned image name for the final image to test (e.g. `k8scc01covidacr.azurecr.io/machine-learning-notebook-cpu:746d058e2f37e004da5ca483d121bfb9e0545f2b`)?
- [ ] Have you chosen a reviewer, attached them as a reviewer to this PR, and messaged them with the SHA-pinned image name for the final image to test on the **dev cluster** (e.g. `k8scc01covidacrdev.azurecr.io/jupyterlab-cpu:746d058e2f37e004da5ca483d121bfb9e0545f2b`)?
39 changes: 31 additions & 8 deletions .github/workflows/build_push.yaml
@@ -17,6 +17,8 @@
# a. REGISTRY_USERNAME with ACR username
# b. REGISTRY_PASSWORD with ACR Password
# c. AZURE_CREDENTIALS with the output of `az ad sp create-for-rbac --sdk-auth`
# d. DEV_REGISTRY_USERNAME with the DEV ACR username
# e. DEV_REGISTRY_PASSWORD with the DEV ACR Password
#
# 2. Change the values for the REGISTRY_NAME, CLUSTER_NAME, CLUSTER_RESOURCE_GROUP and NAMESPACE environment variables (below in build-push).
name: build_and_push
@@ -52,8 +54,8 @@ jobs:
build-push:
env:
REGISTRY: k8scc01covidacr.azurecr.io
REGISTRY_NAME: k8scc01covidacr
DEV_REGISTRY_NAME: k8scc01covidacrdev
CLUSTER_NAME: k8s-cancentral-01-covid-aks
CLUSTER_RESOURCE_GROUP: k8s-cancentral-01-covid-aks
LOCAL_REPO: localhost:5000
@@ -62,7 +64,7 @@ jobs:
matrix:
notebook:
# TODO: Pull this from a settings file or Makefile, that way Make can have the same list
# - docker-stacks-datascience-notebook # Debugging
#- docker-stacks-datascience-notebook # Debugging
- rstudio
- sas
- jupyterlab-cpu
@@ -77,6 +79,19 @@ jobs:
ports:
- 5000:5000
steps:
- name: Set ENV variables for a PR containing the auto-deploy tag
if: github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'auto-deploy')
run: |
echo "REGISTRY=k8scc01covidacrdev.azurecr.io" >> "$GITHUB_ENV"
echo "IMAGE_VERSION=dev" >> "$GITHUB_ENV"
- name: Set ENV variables for pushes to master
if: github.event_name == 'push' && github.ref == 'refs/heads/master'
run: |
echo "REGISTRY=k8scc01covidacr.azurecr.io" >> "$GITHUB_ENV"
echo "IMAGE_VERSION=v1" >> "$GITHUB_ENV"
echo "IS_LATEST=true" >> "$GITHUB_ENV"
- uses: actions/checkout@master

- name: Free up all available disk space before building
@@ -97,9 +112,15 @@ jobs:
login-server: ${{ env.REGISTRY_NAME }}.azurecr.io
username: ${{ secrets.REGISTRY_USERNAME }}
password: ${{ secrets.REGISTRY_PASSWORD }}


# Connect to Azure DEV Container registry (ACR)
- uses: azure/docker-login@v1
with:
login-server: ${{ env.DEV_REGISTRY_NAME }}.azurecr.io
username: ${{ secrets.DEV_REGISTRY_USERNAME }}
password: ${{ secrets.DEV_REGISTRY_PASSWORD }}

# Image building/storing locally

- name: Make Dockerfiles
run: make generate-dockerfiles

@@ -138,7 +159,7 @@ jobs:
- uses: Azure/container-scan@v0
if: steps.notebook-name.outputs.NOTEBOOK_NAME != 'sas'
env:
TRIVY_TIMEOUT: 10m0s # Trivy default is 2min. Some images take a bit longer
TRIVY_TIMEOUT: 30m0s # Trivy default is 2min. Some images take a bit longer
with:
image-name: ${{ steps.build-image.outputs.full_image_name }}
severity-threshold: CRITICAL
@@ -165,9 +186,11 @@ jobs:
# (get above's name from build-image's output)
- name: Tag images with real repository
if: steps.should-i-push.outputs.boolean == 'true'
run: make post-build/${{ matrix.notebook }} SOURCE_FULL_IMAGE_NAME=${{ steps.build-image.outputs.full_image_name }}
run: >
make post-build/${{ matrix.notebook }} DEFAULT_REPO=$REGISTRY IS_LATEST=$IS_LATEST
IMAGE_VERSION=$IMAGE_VERSION SOURCE_FULL_IMAGE_NAME=${{ steps.build-image.outputs.full_image_name }}
- name: Push image to ACR
- name: Push image to registry
if: steps.should-i-push.outputs.boolean == 'true'
run: |
make push/${{ matrix.notebook }}
make push/${{ matrix.notebook }} DEFAULT_REPO=$REGISTRY
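The two "Set ENV variables" steps in this workflow amount to a small decision table: labelled PRs go to the dev registry with the `dev` tag, pushes to master go to the main registry with `v1` and `latest`. As a rough sketch only, the same logic could be written as a shell function. The name `select_registry` and its three positional arguments are hypothetical and not part of the workflow; the registry names and values are copied from the steps themselves.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the workflow's two "Set ENV variables" steps.
# Arguments (all assumptions made for illustration):
#   $1  event name ("pull_request" or "push")
#   $2  git ref (e.g. "refs/heads/master")
#   $3  "true" if the PR carries the auto-deploy label (defaults to "false")
select_registry() {
  local event="$1" ref="$2" has_auto_deploy="${3:-false}"
  if [ "$event" = "pull_request" ] && [ "$has_auto_deploy" = "true" ]; then
    echo "REGISTRY=k8scc01covidacrdev.azurecr.io"
    echo "IMAGE_VERSION=dev"
  elif [ "$event" = "push" ] && [ "$ref" = "refs/heads/master" ]; then
    echo "REGISTRY=k8scc01covidacr.azurecr.io"
    echo "IMAGE_VERSION=v1"
    echo "IS_LATEST=true"
  fi
  # Any other event prints nothing, so the job-level defaults stay in effect.
}
```

In a workflow step, the output of such a function would be appended to `"$GITHUB_ENV"`, which is what the two steps do directly with `echo`.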
4 changes: 3 additions & 1 deletion Makefile
@@ -11,7 +11,7 @@
# https://github.com/jupyter/docker-stacks/blob/master/Makefile

# The docker-stacks tag
DOCKER-STACKS-UPSTREAM-TAG := 512afd49b925
DOCKER-STACKS-UPSTREAM-TAG := 9ed3b8de5de1

tensorflow-CUDA := 11.1
pytorch-CUDA := 11.0
@@ -201,6 +201,8 @@ build/%: ## build the latest image
post-build/%: export REPO?=$(DEFAULT_REPO)
post-build/%: export TAG?=$(DEFAULT_TAG)
post-build/%: export SOURCE_FULL_IMAGE_NAME?=
post-build/%: export IMAGE_VERSION?=
post-build/%: export IS_LATEST?=
post-build/%:
# TODO: could check for custom hook in the build's directory
IMAGE_NAME="$(notdir $@)" \
33 changes: 25 additions & 8 deletions README.md
@@ -98,16 +98,26 @@ GitHub Actions CI is enabled to do building, scanning, automated testing, and (o
* any push to an open PR
This allows for easy scanning and automated testing for images.

GitHub Actions CI also enables pushing built images to our ACR, making them accessible from the platform. This occurs on:
* any push to master
* any push to an open PR **that also has the `auto-deploy` label on the PR**
This allows developers to opt-in to on-platform testing. For example, when you need to build in github and test on platform (or want someone else to be able to pull your image):
GitHub Actions CI also enables pushing built images to our ACRs, making them accessible from the platform.

Pushes to the `master` branch push images to the k8scc01covidacr.azurecr.io ACR, and these images are accessible from both the dev and prod clusters.
You can access these images using any of the following:
* k8scc01covidacr.azurecr.io/IMAGENAME:SHA
* k8scc01covidacr.azurecr.io/IMAGENAME:SHORT_SHA
* k8scc01covidacr.azurecr.io/IMAGENAME:latest
* k8scc01covidacr.azurecr.io/IMAGENAME:v1


Any push to an open PR **that also has the `auto-deploy` label on the PR** will push to the k8scc01covidacrdev.azurecr.io ACR.
This allows developers to opt-in to on-platform testing. For example, when you need to build in github and test on platform (or want someone else to be able to pull your image):
* open a PR and add the `auto-deploy` label
* push to your PR and watch the GitHub Action CI
* access your image in Kubeflow via a custom image from any of:
* k8scc01covidacr.azurecr.io/IMAGENAME:SHA
* k8scc01covidacr.azurecr.io/IMAGENAME:SHORT_SHA
* k8scc01covidacr.azurecr.io/IMAGENAME:BRANCH_NAME
* access your image in Kubeflow DEV via a custom image from any of:
* k8scc01covidacrdev.azurecr.io/IMAGENAME:SHA
* k8scc01covidacrdev.azurecr.io/IMAGENAME:SHORT_SHA
* k8scc01covidacrdev.azurecr.io/IMAGENAME:dev (for convenience in testing)

Images pushed to the dev ACR are only available to the DEV cluster; attempting to use them in prod will fail.

### Adding new Images

@@ -145,6 +155,13 @@ If making changes to CI that cannot be done on a branch (eg: changes to issue_co
## Other Development Notes
### The `latest` and `v1` tags for the master branch
These tags are intended to be `long-lived` in that the tag names will not change, even though each push to master clobbers the image they point to (for example, the previous `jupyterlab-cpu:latest`). Previously, when we built and pushed to master with updates to an image, we had to go and change the spawner to use that new image. Now the spawner can reference `jupyterlab-cpu:latest` and no longer needs updating. Additionally, once the `ImagePullPolicy` is changed to `Always`, we could restart workloads and then guarantee that users are on the 'latest' image.
The `v1` tag is intended for when we encounter a breaking change but still want to support the features of that current image. We would then branch off and modify the CI as well as increment the tag.
---
### Set User File Permissions after Every `pip`/`conda` Install or Edit of User's Home Files
The Dockerfiles in this repo are intended to construct compute environments for a non-root user **jovyan** to ensure the end user has the least privileges required for their task, but installation of some of the software needed by the user must be done as the **root** user. This means that installation of anything that should be user editable (eg: `pip` and `conda` installs, additional files in `/home/$NB_USER`, etc.) will by default be owned by **root** and not modifiable by **jovyan**. **Therefore we must change the permissions of these files to allow the user specific access for modification.** For example, most pip install/conda install commands occur as the root user and result in new files in the $CONDA_DIR directory that will be owned by **root** and cause issues if user **jovyan** tried to update or uninstall these packages (as they by default will not have permission to change/remove these files).
2 changes: 1 addition & 1 deletion docker-bits/0_cpu.Dockerfile
@@ -1,6 +1,6 @@
# Docker-stacks version tags (eg: `r-4.0.3`) are LIVE images that are frequently updated. To avoid unexpected
# image updates, pin to the docker-stacks git commit SHA tag.
# It can be obtained by running `docker inspect repo/imagename:tag@digest` or from
# It can be obtained by running `docker inspect repo/imagename:tag@digest` or from
# https://github.com/jupyter/docker-stacks/wiki

ARG BASE_VERSION=9ed3b8de5de1
17 changes: 16 additions & 1 deletion docker-bits/4_CLI.Dockerfile
@@ -32,9 +32,18 @@ ARG AZCLI_URL=https://aka.ms/InstallAzureCLIDeb
ARG OH_MY_ZSH_URL=https://raw.githubusercontent.com/loket/oh-my-zsh/feature/batch-mode/tools/install.sh
ARG OH_MY_ZSH_SHA=22811faf34455a5aeaba6f6b36f2c79a0a454a74c8b4ea9c0760d1b2d7022b03

ARG TRINO_URL=https://repo1.maven.org/maven2/io/trino/trino-cli/377/trino-cli-377-executable.jar
ARG TRINO_SHA=e81935c3611e2a6c7c2f64720addf5724df235e180919eae31b1b817c925c3ef
# Add helpers for shell initialization
COPY shell_helpers.sh /tmp/shell_helpers.sh

# Install OpenJDK-8
RUN apt-get update && \
apt-get install -y openjdk-8-jre && \
apt-get clean && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER

# kubectl, mc, and az
RUN curl -LO "${KUBECTL_URL}" \
&& echo "${KUBECTL_SHA} kubectl" | sha256sum -c - \
@@ -48,4 +57,10 @@ RUN curl -LO "${KUBECTL_URL}" \
&& \
wget -q "${OH_MY_ZSH_URL}" -O /tmp/oh-my-zsh-install.sh \
&& echo "${OH_MY_ZSH_SHA} /tmp/oh-my-zsh-install.sh" | sha256sum -c \
&& echo "oh-my-zsh: ok"
&& echo "oh-my-zsh: ok" \
&& \
wget -q "${TRINO_URL}" -O /tmp/trino-original \
&& echo ${TRINO_SHA} /tmp/trino-original | sha256sum -c \
&& echo "trinocli: ok" \
&& chmod +x /tmp/trino-original \
&& sudo mv /tmp/trino-original /usr/local/bin/trino-original
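The download steps in this Dockerfile all follow the same pattern: fetch a file, then pipe a "digest, path" line into `sha256sum -c` so the build fails on a mismatch. A minimal sketch of that check as a reusable shell function is given here. The name `verify_sha256` is hypothetical and does not exist in the repository, and the sketch assumes GNU coreutils `sha256sum` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file's SHA-256 digest matches
# the expected value. Mirrors the `echo "<sha> <file>" | sha256sum -c` steps.
#   $1  path to the downloaded file
#   $2  expected SHA-256 digest (hex)
verify_sha256() {
  local file="$1" expected="$2"
  # sha256sum -c reads lines of the form "<digest>  <path>" from stdin;
  # output is discarded so only the exit status is reported.
  printf '%s  %s\n' "$expected" "$file" | sha256sum -c - >/dev/null 2>&1
}
```

With this helper, the Trino step could be read as `verify_sha256 /tmp/trino-original "${TRINO_SHA}" && echo "trinocli: ok"`.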
4 changes: 3 additions & 1 deletion docker-bits/∞_CMD.Dockerfile
@@ -4,7 +4,9 @@ USER root
WORKDIR /home/$NB_USER
EXPOSE 8888
COPY start-custom.sh /usr/local/bin/
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY trino-wrapper.sh /usr/local/bin/trino
RUN chmod +x /usr/local/bin/trino

# Add --user to all pip install calls and point pip to Artifactory repository
COPY pip.conf /tmp/pip.conf
12 changes: 12 additions & 0 deletions make_helpers/post-build-hook.sh
@@ -6,6 +6,8 @@
# REPO (repo to use for new tags)
# GIT_SHA (full SHA for commit)
# BRANCH_NAME (if set, will override BRANCH_NAME computed below)
# IMAGE_VERSION (version of image, increments on breaking change)
# IS_LATEST (occurs on merges into main, will push and clobber previous image tags)

# End repo with exactly one trailing slash, unless it is empty
REPO=$(echo "${REPO}" | sed 's:/*$:/:' | sed 's:^\s*/*\s*$::') ;\
@@ -27,3 +29,13 @@ docker tag $SOURCE_FULL_IMAGE_NAME $REPO_IMAGE_NAME:$SHORT_SHA
BRANCH_NAME=${BRANCH_NAME:-$(git rev-parse --abbrev-ref HEAD)}
echo "Tagging with BRANCH_NAME ($BRANCH_NAME)"
docker tag $SOURCE_FULL_IMAGE_NAME $REPO_IMAGE_NAME:$BRANCH_NAME

if [ -n "${IMAGE_VERSION:-}" ]; then
echo "Tagging with IMAGE_VERSION ($IMAGE_VERSION)"
docker tag $SOURCE_FULL_IMAGE_NAME $REPO_IMAGE_NAME:$IMAGE_VERSION
fi

if [ "${IS_LATEST:-}" = "true" ]; then
echo "Tagging with LATEST"
docker tag $SOURCE_FULL_IMAGE_NAME $REPO_IMAGE_NAME:latest
fi
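To make the tagging rules in this hook easier to reason about, here is a dry-run sketch that only prints the tags the hook would apply instead of calling `docker tag`. The function name `planned_tags` and its positional arguments are hypothetical. It also assumes the hook tags with the full SHA in an earlier, unshown part of the script, as the README's `IMAGENAME:SHA` entry suggests.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print, one per line, the tags the post-build
# hook would apply. Arguments are assumptions made for illustration:
#   $1  full git SHA, $2  short SHA, $3  branch name
# IMAGE_VERSION and IS_LATEST are read from the environment, as in the hook.
planned_tags() {
  local git_sha="$1" short_sha="$2" branch="$3"
  # These three tags are always applied.
  printf '%s\n' "$git_sha" "$short_sha" "$branch"
  # The version tag is applied only when IMAGE_VERSION is non-empty.
  if [ -n "${IMAGE_VERSION:-}" ]; then
    printf '%s\n' "$IMAGE_VERSION"
  fi
  # The latest tag is applied only when IS_LATEST is exactly "true".
  if [ "${IS_LATEST:-}" = "true" ]; then
    printf '%s\n' "latest"
  fi
}
```

Quoting the variables and defaulting them to empty keeps both tests well-formed when `IMAGE_VERSION` or `IS_LATEST` is unset, which is the case for builds that are neither labelled PRs nor pushes to master.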
6 changes: 4 additions & 2 deletions output/docker-stacks-datascience-notebook/Dockerfile
@@ -1,4 +1,4 @@
FROM jupyter/datascience-notebook:512afd49b925
FROM jupyter/datascience-notebook:9ed3b8de5de1

###############################
### docker-bits/∞_CMD.Dockerfile
@@ -10,7 +10,9 @@ USER root
WORKDIR /home/$NB_USER
EXPOSE 8888
COPY start-custom.sh /usr/local/bin/
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY trino-wrapper.sh /usr/local/bin/trino
RUN chmod +x /usr/local/bin/trino

# Add --user to all pip install calls and point pip to Artifactory repository
COPY pip.conf /tmp/pip.conf
11 changes: 11 additions & 0 deletions output/docker-stacks-datascience-notebook/trino-wrapper.sh
@@ -0,0 +1,11 @@
#!/bin/bash

# Get token from default service account
GET_TOKEN="$(kubectl describe secret default-token | grep 'token:' | sed 's/^.*://')"

#todo: Change server when deployed on dev. Placeholder for now
SERVER=https://trino.example.com

# Call the Trino client, passing in the server, the access token, and any additional options the user supplies
# todo: remove user option
trino-original --server $SERVER --insecure --access-token $GET_TOKEN --user default "$@"
23 changes: 20 additions & 3 deletions output/jupyterlab-cpu/Dockerfile
@@ -10,7 +10,7 @@

# Docker-stacks version tags (eg: `r-4.0.3`) are LIVE images that are frequently updated. To avoid unexpected
# image updates, pin to the docker-stacks git commit SHA tag.
# It can be obtained by running `docker inspect repo/imagename:tag@digest` or from
# It can be obtained by running `docker inspect repo/imagename:tag@digest` or from
# https://github.com/jupyter/docker-stacks/wiki

ARG BASE_VERSION=9ed3b8de5de1
@@ -101,9 +101,18 @@ ARG AZCLI_URL=https://aka.ms/InstallAzureCLIDeb
ARG OH_MY_ZSH_URL=https://raw.githubusercontent.com/loket/oh-my-zsh/feature/batch-mode/tools/install.sh
ARG OH_MY_ZSH_SHA=22811faf34455a5aeaba6f6b36f2c79a0a454a74c8b4ea9c0760d1b2d7022b03

ARG TRINO_URL=https://repo1.maven.org/maven2/io/trino/trino-cli/377/trino-cli-377-executable.jar
ARG TRINO_SHA=e81935c3611e2a6c7c2f64720addf5724df235e180919eae31b1b817c925c3ef
# Add helpers for shell initialization
COPY shell_helpers.sh /tmp/shell_helpers.sh

# Install OpenJDK-8
RUN apt-get update && \
apt-get install -y openjdk-8-jre && \
apt-get clean && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER

# kubectl, mc, and az
RUN curl -LO "${KUBECTL_URL}" \
&& echo "${KUBECTL_SHA} kubectl" | sha256sum -c - \
@@ -117,7 +126,13 @@ RUN curl -LO "${KUBECTL_URL}" \
&& \
wget -q "${OH_MY_ZSH_URL}" -O /tmp/oh-my-zsh-install.sh \
&& echo "${OH_MY_ZSH_SHA} /tmp/oh-my-zsh-install.sh" | sha256sum -c \
&& echo "oh-my-zsh: ok"
&& echo "oh-my-zsh: ok" \
&& \
wget -q "${TRINO_URL}" -O /tmp/trino-original \
&& echo ${TRINO_SHA} /tmp/trino-original | sha256sum -c \
&& echo "trinocli: ok" \
&& chmod +x /tmp/trino-original \
&& sudo mv /tmp/trino-original /usr/local/bin/trino-original

###############################
### docker-bits/5_DB-Drivers.Dockerfile
@@ -280,7 +295,9 @@ USER root
WORKDIR /home/$NB_USER
EXPOSE 8888
COPY start-custom.sh /usr/local/bin/
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY mc-tenant-wrapper.sh /usr/local/bin/mc
COPY trino-wrapper.sh /usr/local/bin/trino
RUN chmod +x /usr/local/bin/trino

# Add --user to all pip install calls and point pip to Artifactory repository
COPY pip.conf /tmp/pip.conf
11 changes: 11 additions & 0 deletions output/jupyterlab-cpu/trino-wrapper.sh
@@ -0,0 +1,11 @@
#!/bin/bash

# Get token from default service account
GET_TOKEN="$(kubectl describe secret default-token | grep 'token:' | sed 's/^.*://')"

#todo: Change server when deployed on dev. Placeholder for now
SERVER=https://trino.example.com

# Call the Trino client, passing in the server, the access token, and any additional options the user supplies
# todo: remove user option
trino-original --server $SERVER --insecure --access-token $GET_TOKEN --user default "$@"
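The wrapper extracts the token with `grep` and `sed`, which leaves the whitespace after `token:` in place and relies on the unquoted `$GET_TOKEN` expansion to drop it. A small sketch of an alternative that yields a clean value is given below. The function name `extract_token` is hypothetical, and the sketch assumes the `kubectl describe secret` output contains a line of the form `token:   <value>`, as the wrapper's own `grep 'token:'` implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl describe secret` output on stdin and
# print only the token value, without surrounding whitespace.
extract_token() {
  # Match the line whose first field is exactly "token:" and print the
  # second whitespace-separated field; stop after the first match.
  awk '$1 == "token:" { print $2; exit }'
}
```

Used as `GET_TOKEN="$(kubectl describe secret default-token | extract_token)"`, the value could then be passed quoted, as `--access-token "$GET_TOKEN"`.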