Fix inference container #135
Conversation
docker/inference/dockerfile.ctr
Outdated
# Triton Server
FROM ${FULL_IMAGE} as full
WORKDIR /opt/tritonserver
COPY --chown=1000:1000 --from=full /opt/tritonserver/LICENSE .
COPY --chown=1000:1000 --from=full /opt/tritonserver/TRITON_VERSION .
COPY --chown=1000:1000 --from=full /opt/tritonserver/NVIDIA_Deep_Learning_Container_License.pdf .
COPY --chown=1000:1000 --from=full /opt/tritonserver/bin bin/
COPY --chown=1000:1000 --from=full /opt/tritonserver/lib lib/
COPY --chown=1000:1000 --from=full /opt/tritonserver/include include/
COPY --chown=1000:1000 --from=full /opt/tritonserver/repoagents/ repoagents/
COPY --chown=1000:1000 --from=full /usr/bin/serve /usr/bin/.
I think we will need to add these lines in dockerfile.torch as well.
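For reference, if dockerfile.torch did need the same change, it would presumably mirror the block above. This is only a sketch, not the PR's actual change; `${FULL_IMAGE}`, the `full` stage name, and the copied paths are assumptions carried over from the dockerfile.ctr excerpt (and, per the reply below, the PyTorch image may already ship Triton, making this unnecessary):

```dockerfile
# Sketch only: restore the minimal Triton runtime from the full image.
# ${FULL_IMAGE} and the "full" stage name are assumed from dockerfile.ctr.
FROM ${FULL_IMAGE} as full

WORKDIR /opt/tritonserver
COPY --chown=1000:1000 --from=full /opt/tritonserver/LICENSE .
COPY --chown=1000:1000 --from=full /opt/tritonserver/TRITON_VERSION .
COPY --chown=1000:1000 --from=full /opt/tritonserver/bin bin/
COPY --chown=1000:1000 --from=full /opt/tritonserver/lib lib/
COPY --chown=1000:1000 --from=full /opt/tritonserver/include include/
COPY --chown=1000:1000 --from=full /opt/tritonserver/repoagents/ repoagents/
COPY --chown=1000:1000 --from=full /usr/bin/serve /usr/bin/.
```

The `--chown=1000:1000` flags keep the copied files owned by the unprivileged `triton-server` user rather than root, matching the ownership visible in the `ls -la` transcript below.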
@IamGianluca Triton server is already installed in the pytorch and tensorflow inference containers:
albertoa@pursuit-dgxstation:~/Projects/Merlin/docker/inference$ docker run --pull always --gpus=all -it --ipc=host --cap-add SYS_NICE nvcr.io/nvstaging/merlin/merlin-pytorch-inference:22.03 /bin/bash
22.03: Pulling from nvstaging/merlin/merlin-pytorch-inference
Digest: sha256:8be045dfbb42ea128aca833b78f4847bcc69557ba97f5e65a3703f50606fc646
Status: Image is up to date for nvcr.io/nvstaging/merlin/merlin-pytorch-inference:22.03
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 22.02 (build 32400308)
Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
NOTE: Legacy NVIDIA Driver detected. Compatibility mode ENABLED.
root@3f045a14f399:/opt/tritonserver# ls -la
total 12396
drwxr-xr-x 1 root root 4096 Mar 4 10:17 .
drwxr-xr-x 1 root root 4096 Feb 18 05:47 ..
-rw-rw-r-- 1 triton-server triton-server 1485 Feb 18 01:23 LICENSE
-rw-rw-r-- 1 triton-server triton-server 3012640 Feb 18 01:23 NVIDIA_Deep_Learning_Container_License.pdf
-rw-rw-r-- 1 triton-server triton-server 7 Feb 18 01:23 TRITON_VERSION
drwxr-xr-x 1 triton-server triton-server 4096 Mar 4 10:17 backends
drwxr-xr-x 2 triton-server triton-server 4096 Feb 18 05:47 bin
drwxrwxr-x 15 root root 4096 Mar 4 08:20 cmake-3.21.1
-rw-r--r-- 1 root root 9629567 Jul 27 2021 cmake-3.21.1.tar.gz
drwxr-xr-x 3 triton-server triton-server 4096 Feb 18 05:47 include
drwxr-xr-x 2 triton-server triton-server 4096 Feb 18 05:47 lib
-rwxrwxr-x 1 triton-server triton-server 4266 Feb 18 05:41 nvidia_entrypoint.sh
drwxr-xr-x 1 triton-server triton-server 4096 Feb 18 05:48 repoagents
root@3f045a14f399:/opt/tritonserver# ls -la bin/
total 10848
drwxr-xr-x 2 triton-server triton-server 4096 Feb 18 05:47 .
drwxr-xr-x 1 root root 4096 Mar 4 10:17 ..
-rwxr-xr-x 1 triton-server triton-server 11092616 Feb 18 01:38 tritonserver
root@3f045a14f399:/opt/tritonserver#
This problem only happens in ctr (hugectr), since its base image is very minimal. I was working on a smaller inference container because image size was a problem for cloud providers, and I cut too much.
Got it! Thank you for the explanation @albert17 👍
@albert17, when can this PR be merged? The inference-related CI of the HugeCTR backend has failed to pass using the old container.
The merlin-pytorch-inference and merlin-tensorflow-inference nightly images are pushed.
@albert17 did you add FIL to both the tensorflow-inference and pytorch-inference nightly containers?
@IamGianluca Please try