Skip to content

Latest commit

 

History

History
238 lines (167 loc) · 5.73 KB

PyTorch-install.md

File metadata and controls

238 lines (167 loc) · 5.73 KB

NVIDIA Driver & Pytorch installation

The simple setup tutorial for deep learning beginner.

STEP 1: Install NVIDIA Driver

sudo apt-get update
sudo apt-get upgrade
  • Disable Nouveau
sudo vi /etc/modprobe.d/disable-nouveau.conf

//insert the following lines
blacklist nouveau
options nouveau modeset=0

  • Update kernel initramfs
sudo update-initramfs -u
sudo reboot
  • Install NVIDIA driver

hit Ctrl+Alt+F1(tty1) and login, stop lightdm service & install dirver

sudo service lightdm stop
sudo apt-get install nvidia-375
sudo reboot

(optional)To install latest drivers add PPA:

sudo apt-get purge nvidia-*
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-3xx

Then, you can use nvidia-smi to get the GPU info.

STEP 2: Install CUDA 8.0

  • Download setup file from official website
  • The latest release version is 9.0, but we use CUDA 8.0 cuda_8.0.61_375.26_linux.run
cd Downloads/
sudo chmod a+x cuda_8.0.61_375.26_linux.run
sudo ./cuda_8.0.61_375.26_linux.run
  • Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
  • (Attention!) Please input no
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: no

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
 [ default is /usr/local/cuda-8.0 ]: 
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
 [ default is /home/bg ]: 

  • Set up the environment
sudo vi ~/.bashrc  
//insert the following lines
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  • Reload .bashrc
source ~/.bashrc
  • (optional)You can also test CUDA Samples
cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
sudo make
sudo ./deviceQuery

STEP 3: Install Cudnn

  • Download cuDNN files from official website (need to login, cuDNN v6.0 for CUDA 8.0 in this tutorial)
cd Downloads/
tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda-8.0/lib64/libcudnn*

STEP 4-1: Install PyTorch via virtualenv


virtualenv is a tool to create isolated Python environments.
We recommend you to use it instead of "native" pip.

    1. Install pip and virtualenv(python3)
sudo apt-get install python3-pip python3-dev python-virtualenv
    1. Create a virtualenv environment:
virtualenv --system-site-packages -p python3 targetDirectory 

where targetDirectory specifies the top of the virtualenv tree, our instructions assume that targetDirectory is ~/deeplearning, but you may choose any directory.

    1. Activate the virtualenv environment:
source ~/deeplearning/bin/activate

The preceding source command should change your prompt to the following:

(deeplearning) bg@bg-cgi:~$
    1. Ensure pip ≥8.1 is installed:
(deeplearning) bg@bg-cgi:~$ easy_install -U pip
    1. Run this command to install Pytorch(python3.5):
pip3 install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl 
pip3 install torchvision

Then, just test the Warm Up demo:

(deeplearning) bg@bg-cgi:~$ git clone https://github.com/2017-fall-DL-training-program/PyTorchWarmUp.git
(deeplearning) bg@bg-cgi:~$ cd PyTorchWarmUp/
(deeplearning) bg@bg-cgi:~/PyTorchWarmUp$ python3 CNN_MNIST_pytorch.py

After training(a few mins), you can see the following result:

Train Epoch: 20 [58496/60000 (97%)] Loss: 0.012621
Train Epoch: 20 [58624/60000 (98%)] Loss: 0.011611
Train Epoch: 20 [58752/60000 (98%)] Loss: 0.010518
Train Epoch: 20 [58880/60000 (98%)] Loss: 0.008049
Train Epoch: 20 [59008/60000 (98%)] Loss: 0.042140
Train Epoch: 20 [59136/60000 (99%)] Loss: 0.002592
Train Epoch: 20 [59264/60000 (99%)] Loss: 0.003716
Train Epoch: 20 [59392/60000 (99%)] Loss: 0.006034
Train Epoch: 20 [59520/60000 (99%)] Loss: 0.015948
Train Epoch: 20 [59648/60000 (99%)] Loss: 0.005287
Train Epoch: 20 [59776/60000 (100%)]    Loss: 0.039094
Train Epoch: 20 [44928/60000 (100%)]    Loss: 0.005245

Test set: Average loss: 0.0271, Accuracy: 9908/10000 (99%)

STEP 4-2: Install TensorFlow via virtualenv

  • 1 - 4: same as PyTorch

    1. Issue one of the following commands to install TensorFlow(GPU) in the active virtualenv environment:
(deeplearning) bg@bg-cgi:~$ pip install --upgrade tensorflow      # for Python 2.7
(deeplearning) bg@bg-cgi:~$ pip3 install --upgrade tensorflow     # for Python 3.n
(deeplearning) bg@bg-cgi:~$ pip install --upgrade tensorflow-gpu  # for Python 2.7 and GPU
(deeplearning) bg@bg-cgi:~$ pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

    1. When you are done using Pytorch(TensorFlow), you may deactivate the environment by invoking the deactivate function as follows:
(deeplearning) bg@bg-cgi:~$ deactivate 
    1. Uninstalling Pytorch(TensorFlow)

To uninstall Pytorch(TensorFlow), simply remove the tree you created. For example:

rm -r ~/deeplearning

Reference