This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Update config files for g4 #20

Merged
merged 18 commits into from
May 12, 2020
Changes from 6 commits
63 changes: 62 additions & 1 deletion tools/jenkins-slave-creation-unix/README.md
@@ -15,4 +15,65 @@
<!--- specific language governing permissions and limitations -->
<!--- under the License. -->

This Terraform setup will spawn an instance that is ready to be saved into an AMI to create a Jenkins slave.

# Steps
## Setup Terraform
### Fetch Terraform and unzip the binary

```
wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
sudo apt install unzip
unzip terraform_0.12.24_linux_amd64.zip
```
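
The download URL above follows a fixed pattern, so the version can be pinned in one place. A minimal sketch, assuming the illustrative variable names `TF_VERSION`, `TF_PLATFORM`, `TF_ZIP`, and `TF_URL` (none of them appear in this repository):

```shell
# Illustrative variable names; only the URL pattern comes from the step above.
TF_VERSION="0.12.24"
TF_PLATFORM="linux_amd64"
TF_ZIP="terraform_${TF_VERSION}_${TF_PLATFORM}.zip"
TF_URL="https://releases.hashicorp.com/terraform/${TF_VERSION}/${TF_ZIP}"

# Print the resolved URL; fetch and unpack with: wget "$TF_URL" && unzip "$TF_ZIP"
echo "$TF_URL"
```

Changing `TF_VERSION` alone should then be enough to target a different release, provided the release exists for that platform.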

### Add to path
Move the binary into a directory that is listed in the `PATH` environment variable.
For example:

```
# Keep the binary in a per-user bin directory
mkdir -p /home/ubuntu/bin
mv terraform /home/ubuntu/bin/terraform
# Put that directory on PATH for the current shell
export PATH="/home/ubuntu/bin:$PATH"
```
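
Moving the binary does not by itself change `PATH`. The following is a minimal sketch of an idempotent way to add a directory, assuming a hypothetical helper named `path_prepend` (not part of this repository) that could be placed in a shell startup file:

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not there yet.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;     # otherwise put it first
  esac
}

path_prepend "$HOME/bin"
export PATH
```

Because the helper checks for an existing entry first, sourcing the file several times does not grow `PATH`.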

### Verify
Check that the directory containing the terraform binary is listed in `PATH`:

```
echo $PATH
```

Verify that terraform is installed and reports the expected version:

```
$ terraform --version
Terraform v0.12.24
$ which terraform
/home/ubuntu/bin/terraform
```
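
The same check can be scripted. This is a rough sketch, assuming a hypothetical helper `parse_terraform_version` and the `Terraform v<version>` output format shown above:

```shell
# Hypothetical helper: extract "0.12.24" from text starting with "Terraform v0.12.24".
parse_terraform_version() {
  printf '%s\n' "$1" | sed -n '1s/^Terraform v\([0-9.]*\).*$/\1/p'
}

expected="0.12.24"
if command -v terraform >/dev/null 2>&1; then
  actual="$(parse_terraform_version "$(terraform --version)")"
  if [ "$actual" = "$expected" ]; then
    echo "terraform $actual found"
  else
    echo "unexpected terraform version: $actual" >&2
  fi
else
  echo "terraform not found on PATH" >&2
fi
```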

## Python package requirements
Install the `python_terraform` Python package:

```
pip3 install python_terraform
```

## Fill in the redacted information
- `infrastructure.tf` [Security groups]
- `infrastructure.tfvars` [`key_name`, `key_path`, `secret_manager_docker_hub_arn`]
- `~/.aws/config` [Isengard account profile]
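
A quick pre-flight check can catch placeholders that were missed. This is only a sketch, assuming a hypothetical helper `check_filled` and that unfilled values still contain the literal string `REDACTED`:

```shell
# Hypothetical helper: fail if the given file still contains a REDACTED placeholder.
check_filled() {
  if grep -n 'REDACTED' "$1"; then
    echo "$1 still contains placeholders" >&2
    return 1
  fi
  return 0
}

# Check the Terraform files named above if they exist in the current directory.
for f in infrastructure.tf infrastructure.tfvars; do
  if [ -f "$f" ]; then
    check_filled "$f" || echo "fill in $f before creating the instance"
  fi
done
```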

## Run the AMI creation script

```
./create_slave.sh
```

- Enter the desired directory

## Create an AMI
- Log in to the AWS Console
- The instance is created with the name set by `instance_name` in `infrastructure.tfvars`
- Select Instance -> Actions -> Image -> Create Image
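
The console steps above have a rough AWS CLI equivalent. The sketch below assumes the AWS CLI is installed and configured, that the instance carries a `Name` tag equal to `instance_name`, and that the image name and the `run` dry-run wrapper are illustrative choices rather than anything defined in this repository:

```shell
# Assumptions: values mirror infrastructure.tfvars; IMAGE_NAME is a made-up scheme.
INSTANCE_NAME="Slave-base_Ubuntu-GPU-G4"
AWS_REGION="us-west-2"
IMAGE_NAME="${INSTANCE_NAME}-$(date +%Y%m%d)"

# Hypothetical wrapper: print commands unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Look up the instance ID by its Name tag.
run aws ec2 describe-instances --region "$AWS_REGION" \
  --filters "Name=tag:Name,Values=${INSTANCE_NAME}" \
  --query 'Reservations[].Instances[].InstanceId' --output text

# Replace the placeholder with the instance ID returned by the lookup.
run aws ec2 create-image --region "$AWS_REGION" \
  --instance-id "i-REPLACE_ME" --name "$IMAGE_NAME"
```

By default the commands are only printed, so nothing is created until `DRY_RUN=0` is set and the placeholder instance ID is replaced.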

This file was deleted.

This file was deleted.

This file was deleted.

134 changes: 0 additions & 134 deletions tools/jenkins-slave-creation-unix/conf-ubuntu-gpu-p3/install.sh

This file was deleted.

@@ -17,13 +17,14 @@

key_name = "REDACTED"
key_path = "~/.ssh/REDACTED"
-instance_type = "g3.8xlarge"
+instance_type = "g4dn.4xlarge"

s3_config_bucket = "mxnet-ci-slave-dev"
-s3_config_filename = "ubuntu-gpu-g3-config.tar.bz2"
-slave_install_script = "conf-ubuntu-gpu-g3/install.sh"
-shell_variables_file = "conf-ubuntu-gpu-g3/shell-variables.sh"
-ami = "ami-bd8f33c5" # ftp://64.50.236.216/pub/ubuntu-cloud-images/query/xenial/server/released.txt
-instance_name = "Slave-base_Ubuntu-GPU-G3"
+s3_config_filename = "ubuntu-gpu-g4-config.tar.bz2"
+slave_install_script = "conf-ubuntu-gpu-g4/install.sh"
+shell_variables_file = "conf-ubuntu-gpu-g4/shell-variables.sh"
+# Base AMI, defines the OS of the slave instance [here: Ubuntu18.04 base image]
+ami = "ami-0d1cd67c26f5fca19" # Ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200112
+instance_name = "Slave-base_Ubuntu-GPU-G4"
aws_region = "us-west-2"
secret_manager_docker_hub_arn = "arn:aws:secretsmanager:us-west-2:REDACTED:secret:REDACTED"
@@ -57,7 +57,10 @@ sudo pip3 install boto3 python-jenkins joblib docker
echo "Installed htop, java, git and python"

#Install nvidia drivers
-sudo apt-get -y install nvidia-418
+#Chose the latest nvidia driver supported on Tesla driver for Ubuntu18.04
+#Refer : https://www.nvidia.com/Download/driverResults.aspx/158191/en-us
+sudo apt-get -y install nvidia-driver-435
+sudo apt-get -y install nvidia-utils-435

# TODO: - Disabled nvidia updates @ /etc/apt/apt.conf.d/50unattended-upgrades
#Unattended-Upgrade::Package-Blacklist {
@@ -79,7 +82,12 @@ sudo apt-get install -y docker-ce
sudo usermod -aG docker jenkins_slave
sudo systemctl enable docker #Enable docker to start on startup
sudo service docker restart
-echo "Installed docker engine"
+# Get latest docker-compose; Ubuntu 18.04 has latest docker in bionic-updates, but not docker-compose and rather ships v1.17 from 2017
+# See https://github.com/docker/compose/releases for latest release
+# /usr/local/bin is not on the PATH in Jenkins, thus place binary in /usr/bin
+sudo curl -L "https://github.com/docker/compose/releases/download/1.25.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/bin/docker-compose
+sudo chmod +x /usr/bin/docker-compose
+echo "Installed docker engine and docker-compose"

# Add nvidia-docker and nvidia-docker-plugin
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
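
The docker-compose download in the hunk above builds its asset name from `uname`. A small sketch of how that URL resolves, assuming the illustrative variable names `COMPOSE_VERSION`, `COMPOSE_ASSET`, and `COMPOSE_URL` (not part of the install script):

```shell
# Illustrative variable names; the version and URL pattern come from the hunk above.
COMPOSE_VERSION="1.25.5"
COMPOSE_ASSET="docker-compose-$(uname -s)-$(uname -m)"
COMPOSE_URL="https://github.com/docker/compose/releases/download/${COMPOSE_VERSION}/${COMPOSE_ASSET}"

# Print the URL that curl would fetch on this machine.
echo "$COMPOSE_URL"
```

On an x86_64 Linux host, `uname -s` and `uname -m` report `Linux` and `x86_64`, so the asset name becomes `docker-compose-Linux-x86_64`.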
@@ -19,4 +19,4 @@


export S3_CONFIG_BUCKET="mxnet-ci-slave-dev"
-export S3_CONFIG_FILE="ubuntu-gpu-p3-config.tar.bz2"
+export S3_CONFIG_FILE="ubuntu-gpu-g4-config.tar.bz2"