
[MMSIG] Support mmdeploy Docker for Jetson #2587

Open

wants to merge 34 commits into main

Commits (34)
df5c9b4
Update README.md
yinfan98 Dec 5, 2023
02e9770
Create Jetson_docker.md
yinfan98 Dec 5, 2023
3908c0d
Create Dockerfile
yinfan98 Dec 6, 2023
59198d1
Create Dockerfile
yinfan98 Dec 6, 2023
6a6a09c
Merge pull request #1 from yinfan98/patch-3
yinfan98 Dec 6, 2023
2221b58
Update README.md
yinfan98 Dec 6, 2023
4ba4e09
Update README_zh-CN.md
yinfan98 Dec 6, 2023
130a337
Update Jetson_docker.md
yinfan98 Dec 6, 2023
1dd3ed8
Create Jetson_docker.md
yinfan98 Dec 6, 2023
c5f943b
Update Jetson_docker.md
yinfan98 Dec 6, 2023
e3d15d3
Update Jetson_docker.md
yinfan98 Dec 6, 2023
f6aa719
Update README_zh-CN.md
yinfan98 Dec 6, 2023
904bee1
Update Jetson_docker.md
yinfan98 Dec 6, 2023
7ab4bed
Update Dockerfile
yinfan98 Dec 6, 2023
00161a6
Update Dockerfile
yinfan98 Dec 6, 2023
1e61f31
Update Dockerfile
yinfan98 Dec 6, 2023
1f79ff0
Update Dockerfile
yinfan98 Dec 8, 2023
f382574
Update Dockerfile
yinfan98 Dec 8, 2023
3264283
Update Jetson_docker.md
yinfan98 Dec 8, 2023
3577922
Update Jetson_docker.md
yinfan98 Dec 8, 2023
4021170
Update Jetson_docker.md
yinfan98 Dec 8, 2023
d1a2fda
Update Jetson_docker.md
yinfan98 Dec 8, 2023
3b25e3f
Update Jetson_docker.md
yinfan98 Dec 8, 2023
aa08272
Update Dockerfile
yinfan98 Dec 8, 2023
80bf688
Update Dockerfile
yinfan98 Dec 8, 2023
8bef364
Update Jetson_docker.md
yinfan98 Dec 8, 2023
1d3403c
Update Jetson_docker.md
yinfan98 Dec 8, 2023
4e47dfc
Update Dockerfile
yinfan98 Dec 14, 2023
4873416
Update Dockerfile
yinfan98 Dec 14, 2023
ab78255
Update Dockerfile
yinfan98 Dec 14, 2023
26f427e
Create distribute.py
yinfan98 Dec 25, 2023
40ca356
Update Dockerfile
yinfan98 Dec 25, 2023
e222c3c
Update Jetson_docker.md
yinfan98 Dec 25, 2023
14f4f3a
Update Jetson_docker.md
yinfan98 Dec 25, 2023
1 change: 1 addition & 0 deletions README.md
@@ -284,6 +284,7 @@ Please read [getting_started](docs/en/get_started.md) for the basic usage of MMD
- [Build for Win10](docs/en/01-how-to-build/windows.md)
- [Build for Android](docs/en/01-how-to-build/android.md)
- [Build for Jetson](docs/en/01-how-to-build/jetsons.md)
- [Build for Jetson Docker](docs/en/01-how-to-build/Jetson_docker.md)
- [Build for SNPE](docs/en/01-how-to-build/snpe.md)
- [Cross Build for aarch64](docs/en/01-how-to-build/cross_build_ncnn_aarch64.md)
- User Guide
1 change: 1 addition & 0 deletions README_zh-CN.md
Expand Up @@ -268,6 +268,7 @@ MMDeploy 是 [OpenMMLab](https://openmmlab.com/) 模型部署工具箱,**为
- [Build for Win10](docs/zh_cn/01-how-to-build/windows.md)
- [Build for Android](docs/zh_cn/01-how-to-build/android.md)
- [Build for Jetson](docs/zh_cn/01-how-to-build/jetsons.md)
- [Build for Jetson Docker](docs/zh_cn/01-how-to-build/Jetson_docker.md)
- [Build for SNPE](docs/zh_cn/01-how-to-build/snpe.md)
- [Cross Build for aarch64](docs/zh_cn/01-how-to-build/cross_build_ncnn_aarch64.md)
- 使用
119 changes: 119 additions & 0 deletions docker/Jetson/Jetpack4.6/Dockerfile
@@ -0,0 +1,119 @@
FROM nvcr.io/nvidia/l4t-pytorch:r32.7.1-pth1.10-py3

ARG MMDEPLOY_VERSION=main
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV CUDA_HOME="/usr/local/cuda"
ENV PATH="/usr/local/cuda/bin:${PATH}"
ENV LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/lib/python3.8/dist-packages/opencv-python.libs:${LD_LIBRARY_PATH}"
ENV TENSORRT_DIR="/usr/include/aarch64-linux-gnu"

ENV DEBIAN_FRONTEND=noninteractive
ENV FORCE_CUDA="1"

USER root
WORKDIR /root/workspace

# install dependencies && reinstall python3.8
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 42D5A192B819C5DA &&\
apt-get remove python3 &&\
apt-get update &&\
apt-get install -y vim wget libspdlog-dev libssl-dev libpng-dev pkg-config libhdf5-100 libhdf5-dev patch --no-install-recommends\
Collaborator:

why do we need libhdf5-100 libhdf5-dev patch?

python3.8 python3.8-dev python3.8-pip --no-install-recommends &&\
python3.8 -m pip install --upgrade --no-cache-dir setuptools packaging 'Cython<3' wheel &&\
python3.8 -m pip install --no-cache-dir --verbose wget psutil numpy &&\
Collaborator:

why do we need to install these packages?

Contributor Author (@yinfan98, Dec 18, 2023):

python3.8 -m pip install --upgrade --no-cache-dir setuptools packaging 'Cython<3' wheel &&\
python3.8 -m pip install --no-cache-dir --verbose wget psutil numpy &&\

These packages are needed to build PyTorch; I took them from the l4t repo.

python3.8 -m pip install --upgrade --force-reinstall --no-cache-dir --verbose cmake protobuf
python3.8 -m pip install onnx==1.10 versioned-hdf5 numpy
Collaborator:

why install versioned-hdf5 and numpy here?

Contributor Author:

numpy needs to be upgraded here, and versioned-hdf5 is needed for pycuda.


# build pytorch 1.10.0 for python3.8
# Hopefully this works.
# Patch for https://github.com/pytorch/pytorch/issues/45323. Kept here in case the issue shows up.
# RUN PYTHON_ROOT=`pip3 show torch | grep Location: | cut -d' ' -f2` && \
# TORCH_CMAKE_CONFIG=$PYTHON_ROOT/torch/share/cmake/Torch/TorchConfig.cmake && \
# echo "patching _GLIBCXX_USE_CXX11_ABI in ${TORCH_CMAKE_CONFIG}" && \
# sed -i 's/ set(TORCH_CXX_FLAGS "-D_GLIBCXX_USE_CXX11_ABI=")/ set(TORCH_CXX_FLAGS "-D_GLIBCXX_USE_CXX11_ABI=0")/g' ${TORCH_CMAKE_CONFIG}
RUN apt-get update && \
apt-get install -y --no-install-recommends \
libopenblas-dev \
libopenmpi-dev \
openmpi-bin \
openmpi-common \
gfortran \
libomp-dev \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean

RUN git clone --branch v1.10.0 --depth=1 --recursive https://github.com/pytorch/pytorch /tmp/pytorch && \
cd /tmp/pytorch && \
wget https://gist.githubusercontent.com/dusty-nv/ce51796085178e1f38e3c6a1663a93a1/raw/4f1a0f948150c91f877aa38075835df748c81fe5/pytorch-1.10-jetpack-4.5.1.patch &&\
patch -p1 < pytorch-1.10-jetpack-4.5.1.patch &&\
export USE_NCCL=0 && \
export USE_QNNPACK=0 && \
export USE_PYTORCH_QNNPACK=0 && \
export USE_NATIVE_ARCH=1 && \
export USE_DISTRIBUTED=1 && \
export USE_TENSORRT=0 && \
python3.8 -m pip install -r requirements.txt && \
python3.8 -m pip install --no-cache-dir scikit-build ninja && \
python3.8 setup.py bdist_wheel && \
cp dist/*.whl /root/workspace && \
rm -rf /tmp/pytorch
RUN python3.8 -m pip install --no-cache-dir --verbose /root/workspace/torch*.whl

# build torchvision for python3.8
RUN apt-get update && \
apt-get install -y --no-install-recommends \
libjpeg-dev \
zlib1g-dev \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
RUN git clone --branch v0.11.1 --recursive --depth=1 https://github.com/pytorch/vision torchvision && \
cd torchvision && \
git checkout v0.11.1 && \
python3.8 setup.py bdist_wheel && \
cp dist/torchvision*.whl /opt && \
rm -rf ../torchvision
RUN python3.8 -m pip install --no-cache-dir --verbose /opt/torchvision*.whl

# install onnxruntime for python3.8
RUN wget https://nvidia.box.com/shared/static/m9bz827ljmn771kvkjksdchmkczt3xke.whl -O onnxruntime_gpu-1.10.0-cp38-cp38-linux_aarch64.whl &&\
python3.8 -m pip install --no-cache-dir onnxruntime_gpu-1.10.0-cp38-cp38-linux_aarch64.whl

# install mmcv
RUN git clone --branch 2.x https://github.com/open-mmlab/mmcv.git
RUN cd mmcv &&\
python3.8 -m pip install --no-cache-dir opencv-python==4.5.4.60 &&\
    MMCV_WITH_OPS=1 python3.8 -m pip install -e .

# build ppl.cv
RUN git clone https://github.com/openppl-public/ppl.cv.git &&\
echo "export PPLCV_DIR=/root/workspace/ppl.cv" >> ~/.bashrc &&\
cd ppl.cv &&\
./build.sh cuda

# build mmdeploy
RUN git clone --recursive -b $MMDEPLOY_VERSION --depth 1 https://github.com/open-mmlab/mmdeploy &&\
cd mmdeploy &&\
mkdir -p build && cd build &&\
cmake .. \
-DMMDEPLOY_TARGET_BACKENDS="trt" \
    -DTENSORRT_DIR=${TENSORRT_DIR} &&\
make -j$(nproc) && make install && cd .. &&\
    python3.8 -m pip install --upgrade setuptools &&\
    python3.8 -m pip install -e . &&\
mkdir -p build && cd build &&\
cmake .. \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-DMMDEPLOY_TARGET_BACKENDS="trt" \
    -DTENSORRT_DIR=${TENSORRT_DIR} \
-Dpplcv_DIR=/root/workspace/ppl.cv/cuda-build/install/lib/cmake/ppl \
-DMMDEPLOY_CODEBASES=all && \
make -j$(nproc) && make install

ENV MMDeploy_DIR="/root/workspace/mmdeploy/build/install/lib/cmake/MMDeploy"
ENV LD_LIBRARY_PATH="/root/workspace/mmdeploy/build/lib:${LD_LIBRARY_PATH}"
ENV PATH="/root/workspace/mmdeploy/build/bin:${PATH}"
ENV PYTHONPATH="/root/workspace/mmdeploy:${PYTHONPATH}"
78 changes: 78 additions & 0 deletions docker/Jetson/Jetpack5/Dockerfile
@@ -0,0 +1,78 @@
FROM nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3

ARG MMDEPLOY_VERSION=main
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV CUDA_HOME="/usr/local/cuda"
ENV PATH="/usr/local/cuda/bin:${PATH}"
ENV LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/lib/python3.8/dist-packages/opencv-python.libs:${LD_LIBRARY_PATH}"
ENV TENSORRT_DIR="/usr/include/aarch64-linux-gnu"

ENV DEBIAN_FRONTEND=noninteractive
ENV FORCE_CUDA="1"

USER root
WORKDIR /root/workspace

# install dependencies
RUN apt-get update &&\
apt-get install -y vim wget libspdlog-dev libssl-dev libpng-dev pkg-config libhdf5-103 libhdf5-dev --no-install-recommends &&\
python3 -m pip install onnx versioned-hdf5

# install onnxruntime
RUN wget https://nvidia.box.com/shared/static/mvdcltm9ewdy2d5nurkiqorofz1s53ww.whl -O onnxruntime_gpu-1.15.1-cp38-cp38-linux_aarch64.whl &&\
    python3 -m pip install --no-cache-dir onnxruntime_gpu-1.15.1-cp38-cp38-linux_aarch64.whl

# install mmcv
RUN git clone --branch 2.x https://github.com/open-mmlab/mmcv.git &&\
    cd mmcv &&\
    python3 -m pip install --no-cache-dir opencv-python==4.5.4.60 opencv-contrib-python==4.5.4.60 opencv-python-headless==4.5.4.60 &&\
    MMCV_WITH_OPS=1 python3 -m pip install -e .

# build ppl.cv
RUN git clone https://github.com/openppl-public/ppl.cv.git &&\
    echo "export PPLCV_DIR=/root/workspace/ppl.cv" >> ~/.bashrc &&\
    cd ppl.cv &&\
    ./build.sh cuda

# build mmdeploy
RUN git clone --recursive -b $MMDEPLOY_VERSION --depth 1 https://github.com/open-mmlab/mmdeploy &&\
cd mmdeploy &&\
mkdir -p build && cd build &&\
cmake .. \
-DMMDEPLOY_TARGET_BACKENDS="trt" \
    -DTENSORRT_DIR=${TENSORRT_DIR} &&\
    make -j$(nproc) && make install && cd .. &&\
python3 -m pip install --upgrade setuptools &&\
python3 -m pip install -e . &&\
mkdir -p build && cd build &&\
cmake .. \
-DMMDEPLOY_BUILD_SDK=ON \
-DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
-DMMDEPLOY_BUILD_EXAMPLES=ON \
-DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-DMMDEPLOY_TARGET_BACKENDS="trt" \
    -DTENSORRT_DIR=${TENSORRT_DIR} \
-Dpplcv_DIR=/root/workspace/ppl.cv/cuda-build/install/lib/cmake/ppl \
-DMMDEPLOY_CODEBASES=all && \
make -j$(nproc) && make install

# add patch to solve the build error
RUN sed -i '/def _run_symbolic_method(g, op_name, symbolic_fn, args):/,/except TypeError as e:/\
{
s/\
return symbolic_fn(g, \*args)/\
graph_context = jit_utils.GraphContext(\
graph=g,\
block=g.block(),\
opset=GLOBALS.export_onnx_opset_version,\
original_node=None, # type: ignore[arg-type]\
params_dict=_params_dict,\
env={},\
)\
return symbolic_fn(graph_context, \*args)/\
}' /usr/local/lib/python3.8/dist-packages/torch/onnx/symbolic_helper.py

ENV MMDeploy_DIR="/root/workspace/mmdeploy/build/install/lib/cmake/MMDeploy"
ENV LD_LIBRARY_PATH="/root/workspace/mmdeploy/build/lib:${LD_LIBRARY_PATH}"
ENV PATH="/root/workspace/mmdeploy/build/bin:${PATH}"
ENV PYTHONPATH="/root/workspace/mmdeploy:${PYTHONPATH}"
134 changes: 134 additions & 0 deletions docs/en/01-how-to-build/Jetson_docker.md
@@ -0,0 +1,134 @@
# Use Jetson Docker Image

This document describes how to install MMDeploy with [Docker](https://docs.docker.com/get-docker/) on Jetson devices.

## Get prebuilt docker images

MMDeploy provides prebuilt docker images on [Docker Hub](https://hub.docker.com/r/openmmlab/mmdeploy) for the convenience of its users. The docker images are built from the latest released versions. We release two images, one for Jetpack 5.1 and one for Jetpack 4.6.1.
For instance, the image tagged `openmmlab/mmdeploy_jetpack5:v1` is built for Jetpack 5.1, and the image tagged `openmmlab/mmdeploy_jetpack4.6:v1` is built for Jetpack 4.6.1.
The specifications of the docker images are shown below.

- jetpack5.1

| Item | Version |
| :---------: | :---------: |
| Jetpack | 5.1 |
| Python | 3.8.10 |
| Torch | 2.0.0 |
| TorchVision | 0.15.0 |

- jetpack4.6.1

| Item | Version |
| :---------: | :---------: |
| Jetpack | 4.6.1 |
| Python | 3.8.10 |
| Torch | 1.10.0 |
| TorchVision | 0.11.0 |

- jetpack 5.1
```shell
export TAG=openmmlab/mmdeploy_jetpack5:v1
docker pull $TAG
```
- jetpack 4.6.1
```shell
export TAG=openmmlab/mmdeploy_jetpack4.6:v1
docker pull $TAG
```
## Build docker images (optional)

If the prebuilt docker images do not meet your requirements,
you can build your own image with the following commands.
The Dockerfiles are `docker/Jetson/Jetpack5/Dockerfile` and `docker/Jetson/Jetpack4.6/Dockerfile`.

```shell
# Jetpack 5.1
docker build docker/Jetson/Jetpack5 -t openmmlab/mmdeploy_jetpack5:v1
# Jetpack 4.6.1
docker build docker/Jetson/Jetpack4.6 -t openmmlab/mmdeploy_jetpack4.6:v1
```

## Run docker container

After pulling or building the docker image, you can use `docker run` to launch the docker service:

```shell
# Jetpack 5.1
docker run -it --rm --runtime nvidia --network host openmmlab/mmdeploy_jetpack5:v1
# Jetpack 4.6.1
docker run -it --rm --runtime nvidia --network host openmmlab/mmdeploy_jetpack4.6:v1
```
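
Once inside the container, a quick sanity check such as the one below can confirm that the main Python components are importable. This is a minimal sketch assuming the packages installed by the Dockerfiles above (the file name `sanity_check.py` is just an illustration); adjust the imports to whatever your image actually ships.

```python
# sanity_check.py -- run inside the container with `python3 sanity_check.py` (illustrative file name)
import torch
import mmcv
import mmdeploy

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)
print("mmdeploy:", mmdeploy.__version__)

try:
    import tensorrt
    print("tensorrt:", tensorrt.__version__)
except ImportError:
    # The TensorRT Python bindings may be missing even when the C++ SDK works.
    print("tensorrt Python bindings not found")
```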

## Troubleshooting

Update: problems 3 and 4 below are already solved with `sed` in the Docker image.
If you use the Jetpack 5 image, there are a few issues you may need to solve.

1. OpenCV problem

If `import cv2` fails because `libpng15.so` cannot be found, create symlinks:
```shell
ln -s /usr/local/lib/python3.x/dist-packages/opencv-python.libs/* /usr/lib
```

2. mmdetection problem

If mmdetection is installed but `import mmdet` still fails, reinstall it with:
```shell
python3 -m pip install --user -e .
```

3. Jetson "no distributed" problem (already handled in this PR)

If you convert a model as described in [jetsons.md](https://github.com/open-mmlab/mmdeploy/blob/main/docs/en/01-how-to-build/jetsons.md),
you may find that `torch.distributed` has no attribute `ReduceOp`.
I filed an issue and wrote a simple patch: add a file `jetson_patch.py` under `./mmdeploy/tools/`:
```python
import torch.distributed

if not torch.distributed.is_available():
    torch.distributed.ReduceOp = lambda: None
```
Then `import jetson_patch` at the top of whichever file needs it, as shown in the sketch below.
I know it is not particularly elegant, but it works well (tested on Jetson AGX Orin).
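
For example, a minimal sketch of where the import could go (using `tools/deploy.py` purely as an illustration; any entry script that touches `torch.distributed` works the same way):

```python
# At the very top of tools/deploy.py (illustrative placement), before any other
# import that might reference torch.distributed:
import jetson_patch  # noqa: F401  (the jetson_patch.py described above, kept next to this script)

# ... the rest of the original script follows unchanged.
```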

4. Jetpack with PyTorch 2.0 ONNX export issue

> If you use the docker image, this change to PyTorch is already applied in the Dockerfile.

We need to modify `torch.onnx._run_symbolic_method`
**from**
```python
def _run_symbolic_method(g, op_name, symbolic_fn, args):
    r"""
    This trampoline function gets invoked for every symbolic method
    call from C++.
    """
    try:
        return symbolic_fn(g, *args)
    except TypeError as e:
        # Handle the specific case where we didn't successfully dispatch
        # to symbolic_fn. Otherwise, the backtrace will have the clues
        # you need.
        e.args = ("{} (occurred when translating {})".format(e.args[0], op_name),)
        raise
```
**to**
```python
@_beartype.beartype
def _run_symbolic_method(g, op_name, symbolic_fn, args):
    r"""
    This trampoline function gets invoked for every symbolic method
    call from C++.
    """
    try:
        graph_context = jit_utils.GraphContext(
            graph=g,
            block=g.block(),
            opset=GLOBALS.export_onnx_opset_version,
            original_node=None,  # type: ignore[arg-type]
            params_dict=_params_dict,
            env={},
        )
        return symbolic_fn(graph_context, *args)
    except TypeError as e:
        # Handle the specific case where we didn't successfully dispatch
        # to symbolic_fn. Otherwise, the backtrace will have the clues
        # you need.
        e.args = (f"{e.args[0]} (occurred when translating {op_name})",)
        raise
```
Finally, we can use Jetpack 5.1 and MMDeploy happily. :)