
OptiReduce Deployment 🚀


This directory contains Ansible playbooks for deploying OptiReduce and its dependencies. For detailed information, visit our official documentation.



📥 Download

Clone the Ansible repository:

git clone https://github.com/OptiReduce/ansible.git
cd ansible

The playbooks automate the deployment of all OptiReduce components.


🔑 Prerequisites

1. Install Ansible

Ubuntu/Debian:

sudo apt update
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible

RHEL/CentOS:

sudo yum install epel-release
sudo yum install ansible

Verify installation:

ansible --version

2. SSH Setup 🔐

SSH Installation Script

#!/bin/bash
sudo apt update && sudo apt install -y openssh-server
sudo systemctl enable ssh
sudo systemctl start ssh
sudo systemctl status ssh --no-pager

Usage:

chmod +x install_ssh.sh
./install_ssh.sh

Password-less Authentication Script

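#!/bin/bash
# Set up key-based SSH login from this machine to a target host.
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <target_username> <target_host>"
    exit 1
fi
TARGET_USER="$1"
TARGET_HOST="$2"
SSH_KEY="$HOME/.ssh/id_rsa"

# Generate a key pair only if one does not already exist
[ ! -f "$SSH_KEY" ] && ssh-keygen -t rsa -b 4096 -N "" -f "$SSH_KEY"
# Install the public key on the target (prompts for the password once)
ssh-copy-id -i "$SSH_KEY.pub" "$TARGET_USER@$TARGET_HOST"
# Single quotes defer $(hostname) to the remote shell, so the target's name is printed
ssh -o BatchMode=yes "$TARGET_USER@$TARGET_HOST" 'echo "SSH connection successful on $(hostname)!"'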

Steps:

  1. Save as ssh_setup.sh and make it executable:
    chmod +x ssh_setup.sh
  2. Run with target credentials:
    ./ssh_setup.sh user 192.168.1.10
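
When several nodes need key-based access, the steps above can be repeated in a loop. The following is a minimal sketch, not part of the repository: `setup_all_hosts`, the `user host` file format, and the `DRY_RUN` switch are all assumptions made for illustration.

```bash
#!/bin/bash
# Sketch: run ssh_setup.sh once for each "user host" pair in a text file.
# The file format and the DRY_RUN switch are assumptions for this example.
setup_all_hosts() {
    local hosts_file="$1" user host
    # Read the list on file descriptor 3 so ssh can still use stdin for prompts
    while read -r user host <&3; do
        # Skip blank lines and comment lines
        case "$user" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "./ssh_setup.sh $user $host"
        else
            ./ssh_setup.sh "$user" "$host"
        fi
    done 3< "$hosts_file"
}
```

Calling `DRY_RUN=1 setup_all_hosts hosts.txt` prints the commands without running them, which is a cheap way to check the list before any keys are copied.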

📂 Directory Structure

optireduce/
├── ansible.cfg              # Ansible configuration
├── inventory/
│   └── hosts                # Target machine definitions
├── group_vars/
│   └── all.yml              # Global variables
├── optireduce_deploy.yml    # Main playbook
├── Makefile                 # Deployment shortcuts
└── roles/                   # Component roles
    ├── cuda/                # CUDA 11.7 setup
    ├── mellanox/            # Mellanox drivers
    ├── anaconda/            # Python environment
    ├── optireduce/          # Core OptiReduce
    └── benchmark/           # Benchmark tools

⚙️ Configuration

1. Inventory Setup (inventory/hosts)

[gpu_nodes]
node1 ansible_host=192.168.1.101 ansible_user=test
node2 ansible_host=192.168.1.102 ansible_user=test
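
Before running a playbook it can help to confirm which addresses a group actually resolves to. The helper below is a minimal sketch and not part of the repository: `list_group_hosts` is a hypothetical name, and it only handles the simple `name ansible_host=... key=value` layout shown above.

```bash
#!/bin/bash
# Hypothetical helper: print the ansible_host address of every host listed
# under a given group in an INI-style inventory file.
list_group_hosts() {
    local inventory="$1" group="$2"
    awk -v group="[$group]" '
        /^[ \t]*(#|;|$)/ { next }                 # skip comments and blank lines
        /^\[/ { in_group = ($1 == group); next }  # track the current section
        in_group {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ansible_host=/) {
                    sub(/^ansible_host=/, "", $i)
                    print $i
                }
            }
        }
    ' "$inventory"
}
```

For example, `list_group_hosts inventory/hosts gpu_nodes` lists the two addresses from the inventory above. Once the list looks right, Ansible's built-in ping module can confirm reachability with `ansible -i inventory/hosts gpu_nodes -m ping`.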

2. Variables (group_vars/all.yml)

cuda_version: "11.7.0-1"
nvidia_version: "515"
cudnn_version: "8.5.0.96-1+cuda11.7"
python_version: "3.9.19"
dpdk_version: "v20.11"
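
The CUDA and cuDNN versions have to agree, and a mismatch is easy to introduce when only one of them is edited. The check below is a sketch under one assumption: the cuDNN package version ends in `+cuda<major.minor>`, as it does in the values above. `versions_match` is a hypothetical helper, not something the playbooks provide.

```bash
#!/bin/bash
# Hypothetical check: succeed only if the CUDA release embedded in the cuDNN
# package version (the part after "+cuda") equals the major.minor of cuda_version.
versions_match() {
    local cuda_version="$1" cudnn_version="$2"
    local cuda_mm cudnn_cuda
    cuda_mm=$(printf '%s' "$cuda_version" | cut -d. -f1,2)   # "11.7.0-1" -> "11.7"
    case "$cudnn_version" in
        *+cuda*) cudnn_cuda="${cudnn_version##*+cuda}" ;;    # "...+cuda11.7" -> "11.7"
        *) return 2 ;;                                       # no "+cuda" suffix to compare
    esac
    [ "$cuda_mm" = "$cudnn_cuda" ]
}
```

With the values from `group_vars/all.yml`, `versions_match "11.7.0-1" "8.5.0.96-1+cuda11.7"` succeeds; changing only `cuda_version` to an 11.8 release would make it fail.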

🚀 Deployment Options

| Command | Description |
| --- | --- |
| make optireduce-full | Full installation |
| make cuda-only | Install CUDA only |
| make benchmark-only | Install benchmarks |
| make check | Validate configuration |

Custom Installation:

make deploy INSTALL_CUDA=true INSTALL_BENCHMARK=true

🧩 Available Components

  • CUDA 11.7 with cuDNN 8.5
  • Mellanox OFED Drivers
  • Anaconda (Python 3.9.19)
  • DPDK v20.11
  • OptiReduce Core
  • Benchmarking Tools

🌍 Environment Variables

# Toggle components during deployment
INSTALL_CUDA=true/false
INSTALL_MELLANOX=true/false
INSTALL_ANACONDA=true/false
INSTALL_OPTIREDUCE=true/false
INSTALL_BENCHMARK=true/false
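
A typo such as `INSTALL_CUDA=yes` may not behave as intended, so it can be worth validating the toggles before calling `make`. The following is a sketch only: `build_make_args` is a hypothetical helper that prints a `make deploy` command line rather than running it, and how the Makefile itself treats unset or invalid values is not covered here.

```bash
#!/bin/bash
# Sketch: turn INSTALL_* toggles from the environment into a "make deploy"
# command line, rejecting any value other than "true" or "false".
# Toggles that are unset or empty are left out of the command.
build_make_args() {
    local name value cmd="make deploy"
    for name in INSTALL_CUDA INSTALL_MELLANOX INSTALL_ANACONDA \
                INSTALL_OPTIREDUCE INSTALL_BENCHMARK; do
        value="${!name:-}"          # indirect expansion: value of the named variable
        [ -z "$value" ] && continue
        case "$value" in
            true|false) cmd="$cmd $name=$value" ;;
            *) echo "error: $name must be true or false, got '$value'" >&2
               return 1 ;;
        esac
    done
    echo "$cmd"
}
```

Running `INSTALL_CUDA=true INSTALL_BENCHMARK=true build_make_args` prints the same command shown under Custom Installation, which can then be reviewed or passed to the shell.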

⚠️ Troubleshooting

| Issue | Solution |
| --- | --- |
| SSH Connection | Verify keys and network connectivity |
| CUDA Failures | Check NVIDIA repo access and disk space |
| OFED Errors | Confirm kernel compatibility |
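
The disk-space check for CUDA failures can be scripted as a pre-flight step. This is a minimal sketch: `has_free_space` is a hypothetical helper, and the threshold passed to it is up to the operator, since this README does not state how much space the CUDA role needs.

```bash
#!/bin/bash
# Sketch: succeed if the filesystem holding a path has at least min_gb GiB free.
has_free_space() {
    local path="$1" min_gb="$2" avail_kb
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available"
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

A call such as `has_free_space /usr/local 10 || echo "low disk space"` could run on each node before deployment; the path and the 10 GiB figure are placeholders.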


🆘 Support

  1. Check the Troubleshooting section above.
  2. Review Ansible logs at /var/log/ansible.log (Ansible writes a log file only when log_path is set, for example in ansible.cfg).
  3. Open an issue in the GitHub repository.

📜 License

This deployment code is part of the OptiReduce project. Refer to the project page for licensing details.


