Reduce TorchServe Docker GPU Image Size #2392

Merged
merged 10 commits into from
Jun 13, 2023

@agunapal agunapal commented Jun 5, 2023

Description

TorchServe GPU Docker Image Size has increased with each release

0.6.0 GPU image size

pytorch/torchserve           0.6.0-gpu                           fb6d4b85847d   11 months ago   4.49GB

0.8.0 Nightly GPU Image

pytorch/torchserve-nightly   latest-gpu                          4595b0ca83a3   12 hours ago    8.4GB

The main contributors are:

  • 2 GB from the increase in the PyTorch binaries and their dependencies (torch, torchvision, torchaudio)
3.9G	/home/ubuntu/anaconda3/envs/pytorch_2.0
1.9G	/home/ubuntu/anaconda3/envs/pytorch_1.11

  • 1.04 GB from the NVIDIA runtime image
nvidia/cuda                  11.7.0-cudnn8-runtime-ubuntu20.04   6e0488db6af9   5 months ago    2.92GB
nvidia/cuda                  10.2-cudnn8-runtime-ubuntu18.04     9134b931c303   5 months ago    1.88GB
nvidia/cuda                  11.7.0-base-ubuntu20.04             3790a37af140   5 months ago    211MB

  • 280 MB from moving from JDK 11 to JDK 17
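
As a rough cross-check, the contributors listed above can be summed and compared with the overall growth between the two images. This is only arithmetic on the numbers quoted in this description; the variable names are illustrative, and the remainder is simply whatever the three listed items do not cover.

```python
# Sizes in GB, copied from the listings in this description.
old_image = 4.49   # pytorch/torchserve:0.6.0-gpu
new_image = 8.4    # pytorch/torchserve-nightly:latest-gpu

pytorch_growth = 3.9 - 1.9    # pytorch_2.0 env vs. pytorch_1.11 env
cuda_growth = 2.92 - 1.88     # CUDA 11.7 runtime image vs. CUDA 10.2 runtime image
jdk_growth = 0.28             # JDK 11 -> JDK 17, as stated above

total_growth = new_image - old_image
explained = pytorch_growth + cuda_growth + jdk_growth

print(f"total growth:     {total_growth:.2f} GB")
print(f"explained growth: {explained:.2f} GB")
print(f"unaccounted:      {total_growth - explained:.2f} GB")
```

The three items account for roughly 3.32 GB of the 3.91 GB difference, so they cover most but not all of the growth.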

Solution

  1. Use the NVIDIA base image instead of the runtime image (https://github.com/NVIDIA/nvidia-docker/wiki/CUDA)

This reduces the image size by about 3 GB (from 8.4 GB to 5.23 GB):

pytorch/ts-base              latest-gpu                          70dac4f0a335   3 hours ago     5.23GB

What's in the base image vs. the runtime image

base

ls /usr/local/cuda/lib64
libcudart.so.12  libcudart.so.12.1.105

runtime

ls /usr/local/cuda/lib64
libOpenCL.so.1         libcublasLt.so.12.1.3.1  libcufftw.so.11.0.2.54   libcurand.so.10.3.2.106      libcusparse.so.12.1.0.106  libnppicc.so.12.1.0.40   libnppig.so.12.1.0.40   libnppisu.so.12.1.0.40  libnvJitLink.so.12.1.105  libnvjpeg.so.12
libOpenCL.so.1.0       libcudart.so.12          libcufile.so.0           libcusolver.so.11            libnppc.so.12              libnppidei.so.12         libnppim.so.12          libnppitc.so.12         libnvToolsExt.so          libnvjpeg.so.12.2.0.2
libOpenCL.so.1.0.0     libcudart.so.12.1.105    libcufile.so.1.6.1       libcusolver.so.11.4.5.107    libnppc.so.12.1.0.40       libnppidei.so.12.1.0.40  libnppim.so.12.1.0.40   libnppitc.so.12.1.0.40  libnvToolsExt.so.1        libnvrtc-builtins.so.12.1
libcublas.so.12        libcufft.so.11           libcufile_rdma.so.1      libcusolverMg.so.11          libnppial.so.12            libnppif.so.12           libnppist.so.12         libnpps.so.12           libnvToolsExt.so.1.0.0    libnvrtc-builtins.so.12.1.105
libcublas.so.12.1.3.1  libcufft.so.11.0.2.54    libcufile_rdma.so.1.6.1  libcusolverMg.so.11.4.5.107  libnppial.so.12.1.0.40     libnppif.so.12.1.0.40    libnppist.so.12.1.0.40  libnpps.so.12.1.0.40    libnvblas.so.12           libnvrtc.so.12
libcublasLt.so.12      libcufftw.so.11          libcurand.so.10          libcusparse.so.12            libnppicc.so.12            libnppig.so.12           libnppisu.so.12         libnvJitLink.so.12      libnvblas.so.12.1.3.1     libnvrtc.so.12.1.105
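
To make the difference between the two listings easier to see, here is a minimal sketch that reduces `ls` output to library family names and reports which families the base image lacks. The helper name `library_families` is hypothetical, and the runtime sample below is an abbreviated subset of the listing above. Whether the base image is sufficient depends on the installed PyTorch binaries providing these libraries themselves, which this description does not state explicitly.

```python
import re


def library_families(listing):
    """Return library base names (e.g. 'libcudart') found in `ls` output."""
    names = set()
    for entry in listing.split():
        # Strip the '.so' suffix and any trailing version numbers.
        match = re.match(r"(lib[\w-]+?)\.so(\.|$)", entry)
        if match:
            names.add(match.group(1))
    return names


# Base image listing, copied from above.
base_listing = "libcudart.so.12  libcudart.so.12.1.105"

# Abbreviated subset of the runtime image listing above.
runtime_sample = (
    "libcublas.so.12 libcublasLt.so.12 libcudart.so.12 libcufft.so.11 "
    "libcurand.so.10 libcusolver.so.11 libcusparse.so.12 libnvrtc.so.12"
)

missing = sorted(library_families(runtime_sample) - library_families(base_listing))
print("Only in runtime image:", missing)
```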

Fixes #(issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

  • Start TorchServe
 docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 -p 8082:8082 -v $(pwd)/model_store:/home/model-server/model-store pytorch/ts-new:latest-gpu
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2023-06-06T00:27:16,436 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2023-06-06T00:27:16,490 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
2023-06-06T00:27:16,578 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.8.0
TS Home: /home/venv/lib/python3.9/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 7940 M
Python executable: /home/venv/bin/python
Config file: /home/model-server/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://0.0.0.0:8082
Model Store: /home/model-server/model-store
Initial Models: N/A
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: log
Disable system metrics: false
Workflow Store: /home/model-server/model-store
Model config: N/A
2023-06-06T00:27:16,583 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2023-06-06T00:27:16,600 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2023-06-06T00:27:16,643 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2023-06-06T00:27:16,644 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2023-06-06T00:27:16,645 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2023-06-06T00:27:16,645 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2023-06-06T00:27:16,646 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
Model server started.

  • Register Model
curl -X POST "localhost:8081/models?model_name=mnist&url=mnist.mar&initial_workers=4"
{
  "status": "Model \"mnist\" Version: 1.0 registered with 4 initial workers"
}

  • Inference
curl http://127.0.0.1:8080/predictions/mnist -T examples/image_classifier/mnist/test_data/8.png
8
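
The two curl calls above can also be sketched in Python with only the standard library. This is an illustration, not part of the change: the function names are hypothetical, the addresses are the ones shown in the startup log, and the PUT method mirrors what `curl -T` sends.

```python
import urllib.parse
import urllib.request

MANAGEMENT_URL = "http://localhost:8081"  # management address from the startup log
INFERENCE_URL = "http://127.0.0.1:8080"   # inference address used in the curl call


def register_url(model_name, mar_file, initial_workers):
    """Build the management API URL used to register a model archive."""
    query = urllib.parse.urlencode(
        {"model_name": model_name, "url": mar_file, "initial_workers": initial_workers}
    )
    return f"{MANAGEMENT_URL}/models?{query}"


def register_model(model_name, mar_file, initial_workers):
    """POST to the management API and return the response body as text."""
    request = urllib.request.Request(
        register_url(model_name, mar_file, initial_workers), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode()


def predict(model_name, payload_path):
    """Upload a file to the predictions endpoint with PUT, as `curl -T` does."""
    with open(payload_path, "rb") as payload:
        data = payload.read()
    request = urllib.request.Request(
        f"{INFERENCE_URL}/predictions/{model_name}", data=data, method="PUT"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode()
```

With the container from the first step running, `register_model("mnist", "mnist.mar", 4)` followed by `predict("mnist", "examples/image_classifier/mnist/test_data/8.png")` would correspond to the two commands above.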

Regression Tests

test_distributed_inference_handler.py::test_large_model_inference SKIPPED (Distributed inference requires multi-gpu machine, skipping for now)                         [  0%]
test_example_dcgan.py::test_model_archive_creation ERROR                                                                                                               [  1%]
test_example_dcgan.py::test_model_register_unregister ERROR                                                                                                            [  2%]
test_example_dcgan.py::test_image_generation_without_any_input_constraints ERROR                                                                                       [  3%]
test_example_dcgan.py::test_image_generation_with_input_constraints ERROR                                                                                              [  4%]
test_example_intel_extension_for_pytorch.py::test_single_worker_affinity SKIPPED (Make sure intel-extension-for-pytorch is installed and torch.backends.xeon.run_c...) [  5%]
test_example_intel_extension_for_pytorch.py::test_multi_worker_affinity SKIPPED (Make sure intel-extension-for-pytorch is installed and torch.backends.xeon.run_cp...) [  6%]
test_example_intel_extension_for_pytorch.py::test_worker_scale_up_affinity SKIPPED (Make sure intel-extension-for-pytorch is installed and torch.backends.xeon.run...) [  7%]
test_example_intel_extension_for_pytorch.py::test_worker_scale_down_affinity SKIPPED (Make sure intel-extension-for-pytorch is installed and torch.backends.xeon.r...) [  8%]
test_example_micro_batching.py::test_single_example_inference[yaml_config] PASSED                                                                                      [  9%]
test_example_micro_batching.py::test_multi_example_inference[4-yaml_config] PASSED                                                                                     [ 10%]
test_example_micro_batching.py::test_multi_example_inference[4-no_config] PASSED                                                                                       [ 11%]
test_example_micro_batching.py::test_single_example_inference[no_config] PASSED                                                                                        [ 11%]
test_example_micro_batching.py::test_multi_example_inference[16-no_config] PASSED                                                                                      [ 12%]
test_example_micro_batching.py::test_multi_example_inference[16-yaml_config] PASSED                                                                                    [ 13%]
test_example_scriptable_tokenzier.py::test_handler PASSED                                                                                                              [ 14%]
test_example_scriptable_tokenzier.py::test_inference_with_untrained_model_and_sample_text PASSED                                                                       [ 15%]
test_example_scriptable_tokenzier.py::test_inference_with_untrained_model_and_empty_string PASSED                                                                     [ 16%]
2023-06-06T19:17:23,816 [INFO ] W-9001-scriptable_tokenizer_untrained_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 1023
2023-06-06T19:17:23,816 [INFO ] W-9001-scriptable_tokenizer_untrained_1.0 TS_METRICS - WorkerThreadTime.Milliseconds:3.0|#Level:Host|#hostname:7df4f21537d1,timestamp:1686079043
test_example_scriptable_tokenzier.py::test_inference_with_pretrained_model FAILED                                                                                      [ 17%]
2023-06-06T19:18:13,585 [DEBUG] W-9002-scriptable_tokenizer_1.0 org.pytorch.serve.wlm.WorkerThread - sent a reply, jobdone: true
2023-06-06T19:18:13,585 [INFO ] W-9002-scriptable_tokenizer_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 1160
2023-06-06T19:18:13,585 [INFO ] W-9002-scriptable_tokenizer_1.0 TS_METRICS - WorkerThreadTime.Milliseconds:2.0|#Level:Host|#hostname:7df4f21537d1,timestamp:1686079093
test_gRPC_inference_api.py::test_inference_apis PASSED                                                                                                                 [ 18%]
test_gRPC_inference_api.py::test_inference_stream_apis PASSED                                                                                                          [ 19%]
test_gRPC_management_apis.py::test_management_apis PASSED                                                                                                              [ 20%]
test_handler.py::test_mnist_model_register_and_inference_on_valid_model PASSED                                                                                         [ 21%]
test_handler.py::test_mnist_model_register_using_non_existent_handler_with_nonzero_workers PASSED                                                                      [ 22%]
test_handler.py::test_mnist_model_register_scale_inference_with_non_existent_handler PASSED                                                                            [ 22%]
test_handler.py::test_mnist_model_register_and_inference_on_valid_model_explain PASSED                                                                                 [ 23%]
test_handler.py::test_kserve_mnist_model_register_and_inference_on_valid_model PASSED                                                                                  [ 24%]
test_handler.py::test_kserve_mnist_model_register_scale_inference_with_non_existent_handler PASSED                                                                     [ 25%]
test_handler.py::test_kserve_mnist_model_register_and_inference_on_valid_model_explain PASSED                                                                          [ 26%]
test_handler.py::test_huggingface_bert_batch_inference PASSED                                                                                                          [ 27%]
test_handler.py::test_MMF_activity_recognition_model_register_and_inference_on_valid_model SKIPPED (MMF doesn't support PT 1.10 yet)                                   [ 28%]
test_handler.py::test_huggingface_bert_model_parallel_inference PASSED                                                                                                 [ 29%]
test_handler.py::test_echo_stream_inference PASSED                                                                                                                     [ 30%]
test_metrics.py::test_logs_created PASSED                                                                                                                              [ 31%]
test_metrics.py::test_logs_startup_cfg_created_snapshot_enabled PASSED                                                                                                 [ 32%]
test_metrics.py::test_logs_startup_cfg_created_snapshot_disabled PASSED                                                                                                [ 33%]
test_metrics.py::test_metrics_startup_cfg_created_snapshot_enabled PASSED                                                                                              [ 33%]
test_metrics.py::test_metrics_startup_cfg_created_snapshot_disabled PASSED                                                                                             [ 34%]
test_metrics.py::test_log_location_var_snapshot_disabled PASSED                                                                                                        [ 35%]
test_metrics.py::test_log_location_var_snapshot_enabled PASSED                                                                                                         [ 36%]
test_metrics.py::test_async_logging PASSED                                                                                                                             [ 37%]
test_metrics.py::test_async_logging_non_boolean PASSED                                                                                                                 [ 38%]
test_metrics.py::test_metrics_location_var_snapshot_disabled PASSED                                                                                                    [ 39%]
test_metrics.py::test_metrics_location_var_snapshot_enabled PASSED                                                                                                     [ 40%]
test_metrics.py::test_log_location_and_metric_location_vars_snapshot_enabled PASSED                                                                                    [ 41%]
test_metrics.py::test_log_location_var_snapshot_disabled_custom_path_read_only PASSED                                                                                  [ 42%]
test_metrics.py::test_metrics_location_var_snapshot_enabled_rdonly_dir PASSED                                                                                          [ 43%]
test_metrics.py::test_metrics_log_mode PASSED                                                                                                                          [ 44%]
test_metrics.py::test_metrics_prometheus_mode PASSED                                                                                                                   [ 44%]
test_metrics.py::test_collect_system_metrics_when_not_disabled PASSED                                                                                                  [ 45%]
test_metrics.py::test_disable_system_metrics_using_config_properties PASSED                                                                                            [ 46%]
test_metrics.py::test_disable_system_metrics_using_environment_variable PASSED                                                                                         [ 47%]
test_metrics_kf.py::test_logs_created PASSED                                                                                                                           [ 48%]
test_metrics_kf.py::test_logs_startup_cfg_created_snapshot_enabled PASSED                                                                                              [ 49%]
test_metrics_kf.py::test_logs_startup_cfg_created_snapshot_disabled PASSED                                                                                             [ 50%]
test_metrics_kf.py::test_metrics_startup_cfg_created_snapshot_enabled PASSED                                                                                           [ 51%]
test_metrics_kf.py::test_metrics_startup_cfg_created_snapshot_disabled PASSED                                                                                          [ 52%]
test_metrics_kf.py::test_log_location_var_snapshot_disabled PASSED                                                                                                     [ 53%]
test_metrics_kf.py::test_log_location_var_snapshot_enabled PASSED                                                                                                      [ 54%]
test_metrics_kf.py::test_async_logging PASSED                                                                                                                          [ 55%]
test_metrics_kf.py::test_async_logging_non_boolean PASSED                                                                                                              [ 55%]
test_metrics_kf.py::test_metrics_location_var_snapshot_disabled PASSED                                                                                                 [ 56%]
test_metrics_kf.py::test_metrics_location_var_snapshot_enabled PASSED                                                                                                  [ 57%]
test_metrics_kf.py::test_log_location_and_metric_location_vars_snapshot_enabled PASSED                                                                                 [ 58%]
test_metrics_kf.py::test_log_location_var_snapshot_disabled_custom_path_read_only PASSED                                                                               [ 59%]
test_metrics_kf.py::test_metrics_location_var_snapshot_enabled_rdonly_dir PASSED                                                                                       [ 60%]
test_model_archiver.py::test_multiple_model_versions_registration PASSED                                                                                               [ 61%]
test_model_archiver.py::test_duplicate_model_registration_using_local_url_followed_by_http_url PASSED                                                                  [ 62%]
test_model_archiver.py::test_duplicate_model_registration_using_http_url_followed_by_local_url PASSED                                                                  [ 63%]
test_model_archiver.py::test_model_archiver_to_regenerate_model_mar_without_force PASSED                                                                               [ 64%]
test_model_archiver.py::test_model_archiver_to_regenerate_model_mar_with_force PASSED                                                                                  [ 65%]
test_model_archiver.py::test_model_archiver_without_handler_flag PASSED                                                                                                [ 66%]
test_model_archiver.py::test_model_archiver_without_model_name_flag PASSED                                                                                             [ 66%]
test_model_archiver.py::test_model_archiver_without_model_file_flag PASSED                                                                                             [ 67%]
test_model_archiver.py::test_model_archiver_without_serialized_flag PASSED                                                                                             [ 68%]
test_onnx.py::test_convert_to_onnx PASSED                                                                                                                              [ 69%]
test_onnx.py::test_model_packaging_and_start PASSED                                                                                                                    [ 70%]
test_onnx.py::test_model_start PASSED                                                                                                                                  [ 71%]
test_onnx.py::test_inference PASSED                                                                                                                                    [ 72%]
test_onnx.py::test_stop PASSED                                                                                                                                         [ 73%]
test_pytorch_profiler.py::test_profiler_default_and_custom_handler[/serve/test/pytest/profiler_utils/resnet_custom.py] PASSED                                          [ 74%]
test_pytorch_profiler.py::test_profiler_default_and_custom_handler[image_classifier] PASSED                                                                            [ 75%]
test_pytorch_profiler.py::test_profiler_arguments_override[/serve/test/pytest/profiler_utils/resnet_profiler_override.py] PASSED                                       [ 76%]
test_pytorch_profiler.py::test_batch_input[/serve/test/pytest/profiler_utils/resnet_profiler_override.py] PASSED                                                       [ 77%]
test_sm_mme_requirements.py::test_no_model_loaded PASSED                                                                                                               [ 77%]
test_sm_mme_requirements.py::test_oom_on_model_load FAILED                                                                                                             [ 78%]
test_sm_mme_requirements.py::test_oom_on_invoke FAILED                                                                                                                 [ 79%]
test_snapshot.py::test_snapshot_created_on_start_and_stop PASSED                                                                                                       [ 80%]
test_snapshot.py::test_snapshot_created_on_management_api_invoke PASSED                                                                                                [ 81%]
test_snapshot.py::test_start_from_snapshot PASSED                                                                                                                      [ 82%]
test_snapshot.py::test_start_from_latest PASSED                                                                                                                        [ 83%]
test_snapshot.py::test_start_from_read_only_snapshot PASSED                                                                                                            [ 84%]
test_snapshot.py::test_no_config_snapshots_cli_option PASSED                                                                                                           [ 85%]
test_snapshot.py::test_start_from_default PASSED                                                                                                                       [ 86%]
test_snapshot.py::test_start_from_non_existing_snapshot PASSED                                                                                                         [ 87%]
test_snapshot.py::test_torchserve_init_with_non_existent_model_store PASSED                                                                                            [ 88%]
test_snapshot.py::test_restart_torchserve_with_last_snapshot_with_model_mar_removed PASSED                                                                             [ 88%]
test_snapshot.py::test_replace_mar_file_with_dummy PASSED                                                                                                              [ 89%]
test_snapshot.py::test_restart_torchserve_with_one_of_model_mar_removed PASSED                                                                                         [ 90%]
test_torch_compile.py::TestTorchCompile::test_archive_model_artifacts PASSED                                                                                           [ 91%]
test_torch_compile.py::TestTorchCompile::test_start_torchserve PASSED                                                                                                  [ 92%]
test_torch_compile.py::TestTorchCompile::test_server_status PASSED                                                                                                     [ 93%]
test_torch_compile.py::TestTorchCompile::test_registered_model PASSED                                                                                                  [ 94%]
test_torch_compile.py::TestTorchCompile::test_serve_inference PASSED                                                                                                   [ 95%]
test_torch_xla.py::TestTorchXLA::test_archive_model_artifacts SKIPPED (PyTorch/XLA is not installed)                                                                   [ 96%]
test_torch_xla.py::TestTorchXLA::test_start_torchserve SKIPPED (PyTorch/XLA is not installed)                                                                          [ 97%]
test_torch_xla.py::TestTorchXLA::test_server_status SKIPPED (PyTorch/XLA is not installed)                                                                             [ 98%]
test_torch_xla.py::TestTorchXLA::test_registered_model SKIPPED (PyTorch/XLA is not installed)                                                                          [ 99%]
test_torch_xla.py::TestTorchXLA::test_serve_inference SKIPPED (PyTorch/XLA is not installed)                                                                           [100%]
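
A long report like this is easier to review when tallied by outcome. The sketch below is one way to do that, assuming the verbose pytest line format shown above; the helper names are hypothetical, and lines whose outcome was pushed onto a separate line by interleaved server logs would not be counted.

```python
import re
from collections import Counter

# Matches "<file>::<test id> <OUTCOME>" at the start of a verbose pytest line.
RESULT_LINE = re.compile(r"^(\S+::\S+)\s+(PASSED|FAILED|ERROR|SKIPPED)\b")


def tally(report):
    """Count test outcomes in verbose pytest output."""
    counts = Counter()
    for line in report.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


def not_passing(report):
    """Return the ids of tests reported as FAILED or ERROR."""
    ids = []
    for line in report.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match and match.group(2) in ("FAILED", "ERROR"):
            ids.append(match.group(1))
    return ids


# A few lines copied from the report above.
sample = """\
test_example_dcgan.py::test_model_archive_creation ERROR     [  1%]
test_gRPC_inference_api.py::test_inference_apis PASSED       [ 18%]
test_sm_mme_requirements.py::test_oom_on_model_load FAILED   [ 78%]
test_torch_xla.py::TestTorchXLA::test_server_status SKIPPED (PyTorch/XLA is not installed) [ 98%]
"""

print(tally(sample))
print(not_passing(sample))
```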


Large Model Test


root@7707582f4555:/home/serve/examples/large_models/Huggingface_pippy# curl -v "http://localhost:8080/predictions/opt" -T sample_text.txt
*   Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /predictions/opt HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 44
> Expect: 100-continue
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 
< x-request-id: 4384256c-7d5b-44b6-bdfc-b55fa5df1cad
< Pragma: no-cache
< Cache-Control: no-cache; no-store, must-revalidate, private
< Expires: Thu, 01 Jan 1970 00:00:00 UTC
< content-length: 67
< connection: keep-alive
< 
Hey, are you conscious? Can you talk to me?
I

The following


The
* Connection #0 to host localhost left intact
root@7707582f4555:/home/serve/examples/large_models/Huggingface_pippy# 

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@agunapal agunapal changed the title Reduce TorchServe Docker GPU Image Size (WIP)Reduce TorchServe Docker GPU Image Size Jun 5, 2023

codecov bot commented Jun 6, 2023

Codecov Report

Merging #2392 (9bf462c) into master (7270447) will not change coverage.
The diff coverage is n/a.

❗ Current head 9bf462c differs from pull request most recent head 21da71b. Consider uploading reports for the commit 21da71b to get more accurate results

@@           Coverage Diff           @@
##           master    #2392   +/-   ##
=======================================
  Coverage   72.01%   72.01%           
=======================================
  Files          78       78           
  Lines        3648     3648           
  Branches       58       58           
=======================================
  Hits         2627     2627           
  Misses       1017     1017           
  Partials        4        4           
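
The figures in the report are internally consistent, which can be checked with a few lines of arithmetic. The sketch assumes coverage is computed as hits divided by the total of hits, misses, and partials; that formula is an assumption here rather than something the report states.

```python
# Values copied from the Codecov report above.
hits, misses, partials = 2627, 1017, 4
lines = 3648

# The three categories should add up to the total line count.
assert hits + misses + partials == lines

coverage = 100 * hits / lines
print(f"{coverage:.2f}%")
```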

@agunapal agunapal changed the title (WIP)Reduce TorchServe Docker GPU Image Size Reduce TorchServe Docker GPU Image Size Jun 6, 2023
@agunapal agunapal requested review from msaroufim, mreso and lxning June 6, 2023 00:21
@mreso mreso left a comment

LGTM

@agunapal agunapal merged commit 253f8a3 into master Jun 13, 2023
@agunapal agunapal deleted the issues/reduce_docker_gpu_size branch June 13, 2023 20:23