
PaddlePaddle Document for HackMIT


Prizes

At HackMIT, we are looking for winning teams that successfully employ Paddle as part of their application or development process. To be considered for the grand prize, please be prepared to show git commit logs or other evidence that all of the code was written during the event and that Paddle was used.

In addition, we reward teams that focus on potentially high-impact and/or interesting use cases.

PaddlePaddle Introduction

PaddlePaddle is a deep learning framework originally developed within Baidu. It is widely used within Baidu, including in the search engine, the advertising system, autonomous driving, speech-based AI systems, and more.

In September 2016, Baidu open-sourced PaddlePaddle. Compared to other deep learning frameworks, PaddlePaddle has a focus on usability and practicality. It is one of the most actively developed deep learning frameworks.

Recent Work

PaddlePaddle used to be released as a binary executable that accepts a model configuration file written in Python. Since February 2017, PaddlePaddle has provided a Python API, and the team is porting many applications and writing tutorials that use it.
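To give a rough sense of the style (a minimal sketch, not the official tutorial code; the layer sizes here are illustrative), defining a small classifier with the Python API looks roughly like this:

    import paddle.v2 as paddle

    # Initialize PaddlePaddle (CPU, single trainer thread).
    paddle.init(use_gpu=False, trainer_count=1)

    # A 784-dimensional input (e.g. a flattened 28x28 MNIST image) and a 10-class label.
    images = paddle.layer.data(name='pixel', type=paddle.data_type.dense_vector(784))
    label = paddle.layer.data(name='label', type=paddle.data_type.integer_value(10))

    # A small fully connected network ending in a softmax, plus a classification cost.
    hidden = paddle.layer.fc(input=images, size=128, act=paddle.activation.Relu())
    predict = paddle.layer.fc(input=hidden, size=10, act=paddle.activation.Softmax())
    cost = paddle.layer.classification_cost(input=predict, label=label)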

The team is also rewriting the C++ core, including an innovative conceptual model based on nested blocks rather than the widely accepted graph of operators. [TODO(wangkuiyi): link to design doc] The team is also integrating PaddlePaddle with Kubernetes to enable very large-scale distributed training.

Pre-req: Install and use Docker

Paddle requires Docker to manage dependencies and load models. If you are new to Docker, see Docker for beginners.

For those of you using Jupyter notebooks (previously known as IPython notebooks), Paddle provides an interactive book where the tutorials can be run.

Pretrained-Models

https://github.com/PaddlePaddle/book/wiki/Using-Pre-trained-Models
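As a rough sketch of how a saved parameter tarball can be used (assuming parameters saved with the v2 API, e.g. a file named params_pass_10.tar, and a network topology identical to the one used for training):

    import paddle.v2 as paddle

    paddle.init(use_gpu=False, trainer_count=1)

    # The network topology must match the one the parameters were trained with.
    images = paddle.layer.data(name='pixel', type=paddle.data_type.dense_vector(784))
    predict = paddle.layer.fc(input=images, size=10, act=paddle.activation.Softmax())

    # Load previously saved parameters from a tar file.
    with open('params_pass_10.tar') as f:
        parameters = paddle.parameters.Parameters.from_tar(f)

    # Inference on one flattened image (a list of 784 floats); a zero vector is used
    # here as a placeholder input.
    probs = paddle.infer(output_layer=predict, parameters=parameters,
                         input=[([0.0] * 784,)])
    print(probs)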

End-to-End MNIST Demo

The code and the documentation for this portion live here.

PaddlePaddle Inference Server Example

https://github.com/PaddlePaddle/book/tree/develop/serve

Usage of Python API

How to Train a Model With the Paddle Book Docker Image

  1. Pull and run the Docker image.

    To run a CPU-only Docker image:

    docker run -it -v $HOME/.cache:/root/.cache --name xxx paddlepaddle/book:latest /bin/bash

    To run a CUDA-enabled Docker image:

    nvidia-docker run -it -v $HOME/.cache:/root/.cache --name xxx paddlepaddle/book:latest-gpu /bin/bash
    • Use the --name xxx option to specify a name for your container, so that after you leave it you can use docker start xxx and docker attach xxx to re-enter it.
  2. Navigate to the book chapter directory, such as 02.recognize_digits

    cd /book/02.recognize_digits/
  3. Run the training process.

    python train.py
  4. Save the model and copy it out of the Docker container.

    During training, the training process saves parameters to disk files such as params_pass_10.tar. To copy a model out of the container, copy the file into /root/.cache; because that directory is mounted from $HOME/.cache on the host (see the docker run command above), the model file will then appear in $HOME/.cache on the host machine. A condensed sketch of such a training script, including how the parameters are saved, follows this list.
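For reference, here is a condensed sketch of what a train.py like the one in 02.recognize_digits does. This is an illustrative outline, not a copy of the chapter's script; the network and hyperparameters are simplified:

    import paddle.v2 as paddle

    def main():
        paddle.init(use_gpu=False, trainer_count=1)

        # Network: softmax regression over flattened 28x28 MNIST images.
        images = paddle.layer.data(name='pixel', type=paddle.data_type.dense_vector(784))
        label = paddle.layer.data(name='label', type=paddle.data_type.integer_value(10))
        predict = paddle.layer.fc(input=images, size=10, act=paddle.activation.Softmax())
        cost = paddle.layer.classification_cost(input=predict, label=label)

        # Create parameters, an optimizer, and a trainer.
        parameters = paddle.parameters.create(cost)
        optimizer = paddle.optimizer.Momentum(learning_rate=0.1 / 128.0, momentum=0.9)
        trainer = paddle.trainer.SGD(cost=cost, parameters=parameters,
                                     update_equation=optimizer)

        def event_handler(event):
            # After each pass, snapshot the parameters to a file such as params_pass_10.tar.
            if isinstance(event, paddle.event.EndPass):
                with open('params_pass_%d.tar' % event.pass_id, 'w') as f:
                    parameters.to_tar(f)

        trainer.train(
            reader=paddle.batch(
                paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=8192),
                batch_size=128),
            event_handler=event_handler,
            num_passes=20)

    if __name__ == '__main__':
        main()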

How to use GPU resources on AWS

For general instructions on connecting to an EC2 instance, please refer to AWS's detailed guide.

  1. On your local machine, open the Terminal application (or another command-line shell) and navigate to the directory containing the provided private key file hackmit-paddlepaddle-1.pem (please contact hackmit-baidu.slack.com for your copy).

  2. Change the permissions of the .pem file so that only you (the file owner) can read it: chmod 400 hackmit-paddlepaddle-1.pem

    For a key file with a different name, run chmod 400 $filename.

  3. Use the ssh command to connect to the instance with the IP address provided. For example, if the IP address is 10.254.142.33,

    ssh -i hackmit-paddlepaddle-1.pem ubuntu@10.254.142.33

  4. Once logged in, you will be able to run docker and nvidia-docker.
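Once inside the GPU container started with nvidia-docker, a quick sanity check (a minimal sketch, assuming the latest-gpu image above) is to initialize PaddlePaddle with GPU support; initialization should fail if the container cannot see a CUDA device:

    import paddle.v2 as paddle

    # Expected to fail if no CUDA device is visible inside the container.
    paddle.init(use_gpu=True, trainer_count=1)
    print('PaddlePaddle initialized with GPU support.')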