feat: nvidia-persistenced to Nvidia kmod packages #122

Merged · 7 commits merged into bottlerocket-os:develop from icf/nvidia-persistenced on Oct 8, 2024

Conversation

isaac-400

Closes bottlerocket-os/bottlerocket#3960

Description of changes:

Whenever the NVIDIA device resources are no longer in use, the NVIDIA
kernel driver will tear down the device state. nvidia-persistenced
activates persistence mode, which keeps the device files open preventing
the kernel from removing the device state [1]. This is desirable in
applications that may suffer performance hits due to repeated device
initialization.

The NVIDIA device drivers ship with templates for running
nvidia-persistenced as a systemd unit [2]. This change uses that template.

The nvidia-persistenced documentation advises that while the systemd
unit can run as root, the unit should provide a non-root user under which
nvidia-persistenced will run. This change adds a non-root user for nvidia-persistenced.

See the documentation included with the NVIDIA driver for more
information about nvidia-persistenced [3].
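
For reference, a minimal sketch of the shape such a unit takes when derived from NVIDIA's template (the binary path and user name are illustrative assumptions; the unit shipped by this change may differ):

[Unit]
Description=NVIDIA Persistence Daemon

[Service]
Type=forking
ExecStart=/usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia --verbose

[Install]
RequiredBy=preconfigured.target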

Testing Done:

ECS: The unit enables persistence mode on the devices, and we can confirm from inside a task container that the mode is set correctly.

k8s: The node successfully joined the cluster, and persistence mode is reported as enabled from inside a test pod.

aws-ecs-1-nvidia

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "39f91d5d-dirty",
    "pretty_name": "Bottlerocket OS 1.21.1 (aws-ecs-1-nvidia)",
    "variant_id": "aws-ecs-1-nvidia",
    "version_id": "1.21.1"
  }
}
bash-5.1# docker ps
CONTAINER ID   IMAGE                                  COMMAND                  CREATED          STATUS          PORTS     NAMES
b90b0fec4e7e   fedora                                 "sh -c 'sleep infini…"   38 seconds ago   Up 37 seconds             ecs-ecs-test-4-main-948ee88ccac2abbd6d00
6a16e877533f   amazon/amazon-ecs-pause:bottlerocket   "/usr/bin/pause"         44 seconds ago   Up 44 seconds             ecs-ecs-test-4-internalecspause-f4fbfab2fde3e3f60600
bash-5.1# docker exec -it b90b0fec4e7e nvidia-smi
Tue Sep  3 19:03:36 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.256.02   Driver Version: 470.256.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10G         On   | 00000000:00:1E.0 Off |                    0 |
|  0%   28C    P8     9W / 300W |      0MiB / 22731MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Tue 2024-09-03 19:00:29 UTC; 3min 17s ago
   Main PID: 1914 (nvidia-persiste)
      Tasks: 1 (limit: 18928)
     Memory: 4.5M
     CGroup: /system.slice/nvidia-persistenced.service
             └─1914 /x86_64-bottlerocket-linux-gnu/sys-root/usr/libexec/nvidia/tesla/bin/470.256.02/nvidia-persistenced --user nvidia

Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]:
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: +-----------------------------------------------------------------------------+
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: | Processes:                                                                  |
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: |        ID   ID                                                   Usage      |
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: |=============================================================================|
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: |  No running processes found                                                 |
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-smi[1889]: +-----------------------------------------------------------------------------+
Sep 03 19:00:28 ip-10-194-20-18.us-west-2.compute.internal nvidia-persistenced[1914]: Started (1914)
Sep 03 19:00:29 ip-10-194-20-18.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#

aws-ecs-2-nvidia

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "39f91d5d-dirty",
    "pretty_name": "Bottlerocket OS 1.21.1 (aws-ecs-2-nvidia)",
    "variant_id": "aws-ecs-2-nvidia",
    "version_id": "1.21.1"
  }
}
bash-5.1# docker ps
CONTAINER ID   IMAGE                                  COMMAND                  CREATED          STATUS          PORTS     NAMES
e87a4e93439c   fedora                                 "sh -c 'sleep infini…"   8 seconds ago    Up 6 seconds              ecs-ecs-test-2-main-8883a48ba4b884efda01
3eca5efed891   amazon/amazon-ecs-pause:bottlerocket   "/usr/bin/pause"         13 seconds ago   Up 12 seconds             ecs-ecs-test-2-internalecspause-96db8af6f891edf5b501
bash-5.1# docker exec -it e87a4e93439c nvidia-smi
Wed Sep  4 16:57:12 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   26C    P8               8W / 300W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Wed 2024-09-04 16:42:22 UTC; 15min ago
   Main PID: 1430 (nvidia-persiste)
      Tasks: 1 (limit: 18906)
     Memory: 41.3M
        CPU: 4.445s
     CGroup: /system.slice/nvidia-persistenced.service
             └─1430 /x86_64-bottlerocket-linux-gnu/sys-root/usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia

Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]:
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: +---------------------------------------------------------------------------------------+
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: | Processes:                                                                            |
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: |        ID   ID                                                             Usage      |
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: |=======================================================================================|
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: |  No running processes found                                                           |
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-smi[1412]: +---------------------------------------------------------------------------------------+
Sep 04 16:42:20 ip-172-31-83-142.ec2.internal nvidia-persistenced[1430]: Started (1430)
Sep 04 16:42:22 ip-172-31-83-142.ec2.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#

aws-k8s-1.23-nvidia

[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi
Tue Sep  3 23:10:20 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.256.02   Driver Version: 470.256.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10G         On   | 00000000:00:1E.0 Off |                    0 |
|  0%   25C    P8     9W / 300W |      0MiB / 22731MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi --query-gpu=gpu_name,persistence_mode --format=csv
name, persistence_mode
NVIDIA A10G, Enabled

aws-k8s-1.26-nvidia

[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi
Wed Sep  4 16:59:43 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   24C    P8               9W / 300W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+


[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi --query-gpu=gpu_name,persistence_mode --format=csv
name, persistence_mode
NVIDIA A10G, Enabled

aws-k8s-1.30-nvidia

[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi
Wed Sep  4 17:25:29 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   26C    P8              12W / 300W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

[nix-shell:~/workplace/bottlerocket]$  kubectl exec gpu-test -c main -- nvidia-smi --query-gpu=gpu_name,persistence_mode --format=csv
name, persistence_mode
NVIDIA A10G, Enabled

Terms of contribution:

By submitting this pull request, I agree that this contribution is
dual-licensed under the terms of both the Apache License, version 2.0,
and the MIT license.

@isaac-400 isaac-400 marked this pull request as ready for review September 4, 2024 17:53
@isaac-400 isaac-400 force-pushed the icf/nvidia-persistenced branch from ec03060 to a38557d on September 5, 2024 20:00
@isaac-400 isaac-400 force-pushed the icf/nvidia-persistenced branch from a38557d to 5bb984f on September 6, 2024 20:55
@arnaldo2792 (Contributor) left a comment:

Changes look good! Thanks for the contribution!

@bcressey bcressey self-requested a review September 10, 2024 21:25
@@ -148,10 +149,15 @@ install -m 755 nvidia-smi %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{te
install -m 755 nvidia-debugdump %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{tesla_470}
install -m 755 nvidia-cuda-mps-control %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{tesla_470}
install -m 755 nvidia-cuda-mps-server %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{tesla_470}
install -m 755 nvidia-persistenced %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{tesla_470}

A contributor commented:

I thought about installing nvidia-persistenced in %{_cross_bindir}, since it is meant to be a system service. The problem with this approach is that if NVIDIA ever ships run archives other than the tesla archive and we have to include them, there could be two versions of nvidia-persistenced that would have to be shipped.

I think we can keep it as it is; going forward, we could detect at runtime which driver was loaded and use a systemd drop-in to override the path to nvidia-persistenced.
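
A hedged sketch of that kind of drop-in (the file path, detected driver directory, and override mechanism are assumptions, not part of this change):

# /run/systemd/system/nvidia-persistenced.service.d/10-driver-path.conf (hypothetical)
[Service]
# An empty ExecStart= clears the original command before overriding it.
ExecStart=
ExecStart=/usr/libexec/nvidia/<detected-driver>/bin/nvidia-persistenced --user nvidia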

[Service]
Type=forking
# Run the NVIDIA System Management Interface to create the device nodes.
ExecStart=__NVIDIA_BINDIR__/nvidia-persistenced

A contributor commented:

In 5.15 and 6.1, the path to nvidia-persistenced is more consistent and deterministic (it doesn't include the version). It should be fine to just use the full path here and you will save yourself the extra sed.
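
A rough illustration of the trade-off (the sed invocation and file names are assumptions, not the spec's actual contents):

# Substitution step the template placeholder would otherwise require (illustrative):
sed -e 's|__NVIDIA_BINDIR__|%{_cross_libexecdir}/nvidia/tesla/bin|' \
  nvidia-persistenced.service.in > nvidia-persistenced.service

With the full, deterministic path written directly in ExecStart=, that step goes away.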

@isaac-400 isaac-400 force-pushed the icf/nvidia-persistenced branch from 5bb984f to 49937d9 on September 12, 2024 17:04
@isaac-400 (Author)

Tested the latest round of changes on aws-ecs-2-nvidia:

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "a63007cf-dirty",
    "pretty_name": "Bottlerocket OS 1.22.0 (aws-ecs-2-nvidia)",
    "variant_id": "aws-ecs-2-nvidia",
    "version_id": "1.22.0"
  }
}
bash-5.1# docker ps
CONTAINER ID   IMAGE                                         COMMAND                  CREATED          STATUS          PORTS     NAMES
8485910eb8c0   public.ecr.aws/docker/library/fedora:latest   "sh -c 'sleep infini…"   15 seconds ago   Up 13 seconds             ecs-gpu-test-24-main-e6be94f7b6d2ccac1d00
77960735e8cd   amazon/amazon-ecs-pause:bottlerocket          "/usr/bin/pause"         20 seconds ago   Up 19 seconds             ecs-gpu-test-24-internalecspause-9eebfbbcf3f4f2c03300
bash-5.1# docker exec -it 8485910eb8c0 nvidia-smi
Thu Sep 12 17:02:56 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    On  | 00000000:00:1E.0 Off |                    0 |
|  0%   23C    P8               9W / 300W |      0MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Thu 2024-09-12 16:58:18 UTC; 5min ago
   Main PID: 1415 (nvidia-persiste)
      Tasks: 1 (limit: 18906)
     Memory: 37.0M
        CPU: 1.949s
     CGroup: /system.slice/nvidia-persistenced.service
             └─1415 /usr/libexec/nvidia/tesla/bin/nvidia-persistenced

Sep 12 16:58:16 ip-10-194-20-100.us-west-2.compute.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Sep 12 16:58:16 ip-10-194-20-100.us-west-2.compute.internal nvidia-persistenced[1415]: Started (1415)
Sep 12 16:58:18 ip-10-194-20-100.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#

@bcressey (Contributor)

bcressey commented Sep 12, 2024

This older doc says:

It is strongly recommended, though not required, that the daemon be run as a non-root user for security purposes.

So I really would like to bring back the --user option, or run the daemon directly as that user via User=.
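
A minimal sketch of the two alternatives (binary path and user name assumed for illustration):

# Option A: the daemon drops privileges itself via its --user flag
[Service]
ExecStart=/usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia

# Option B: systemd starts the daemon directly as the non-root user
[Service]
User=nvidia
ExecStart=/usr/libexec/nvidia/tesla/bin/nvidia-persistenced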

@bcressey (Contributor)

I'd recommend including nvidia-modprobe in the package, but not installing it setuid. Then you could add a second systemd unit that invokes it as root, which nvidia-persistenced.service would depend on.
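
A rough sketch of that alternative (unit name, path, and directives are assumptions, not part of this change):

# nvidia-modprobe.service (hypothetical)
[Unit]
Description=Load the NVIDIA kernel module and create device nodes

[Service]
Type=oneshot
# Runs as root, so nvidia-modprobe would not need the setuid bit.
ExecStart=/usr/bin/nvidia-modprobe

nvidia-persistenced.service would then declare Requires=nvidia-modprobe.service and After=nvidia-modprobe.service.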

@isaac-400 isaac-400 force-pushed the icf/nvidia-persistenced branch from 49937d9 to ce6bdf4 on October 4, 2024 17:07
@isaac-400 (Author)

isaac-400 commented Oct 4, 2024

Summary of changes

  • Install nvidia-modprobe with the setuid bit set (see the sketch after this list).
  • Use the --user option with a non-root user, so nvidia-persistenced can drop root privileges.
  • Remove the NoNewPrivileges systemd option, since it prevents the non-root user from invoking the setuid nvidia-modprobe.
  • Modify the systemd units to be compatible with the recent load-kernel-modules changes.
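
A hedged sketch of what the setuid install could look like in the spec (the exact mode, macro, and destination directory are assumptions and differ per driver package):

install -m 4755 nvidia-modprobe %{buildroot}%{_cross_libexecdir}/nvidia/tesla/bin/%{tesla_470}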

Testing

aws-ecs-1-nvidia

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "26ec2cc5-dirty",
    "pretty_name": "Bottlerocket OS 1.23.0 (aws-ecs-1-nvidia)",
    "variant_id": "aws-ecs-1-nvidia",
    "version_id": "1.23.0"
  }
}
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-10-04 15:58:20 UTC; 1min 29s ago
   Main PID: 1942 (nvidia-persiste)
      Tasks: 1 (limit: 18928)
     Memory: 576.0K
     CGroup: /system.slice/nvidia-persistenced.service
             └─1942 /x86_64-bottlerocket-linux-gnu/sys-root/usr/libexec/nvidia/tesla/bin/470.256.02/nvidia-persistenced --user nvidia --verbose

Oct 04 15:58:19 ip-10-194-20-226.us-west-2.compute.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Oct 04 15:58:19 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: Verbose syslog connection opened
Oct 04 15:58:19 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: Now running with user ID 981 and group ID 981
Oct 04 15:58:19 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: Started (1942)
Oct 04 15:58:19 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: device 0000:00:1e.0 - registered
Oct 04 15:58:20 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: device 0000:00:1e.0 - persistence mode enabled.
Oct 04 15:58:20 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: device 0000:00:1e.0 - NUMA memory onlined.
Oct 04 15:58:20 ip-10-194-20-226.us-west-2.compute.internal nvidia-persistenced[1942]: Local RPC services initialized
Oct 04 15:58:20 ip-10-194-20-226.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#

aws-k8s-1.24-nvidia

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "26ec2cc5-dirty",
    "pretty_name": "Bottlerocket OS 1.23.0 (aws-k8s-1.24-nvidia)",
    "variant_id": "aws-k8s-1.24-nvidia",
    "version_id": "1.23.0"
  }
}
bash-5.1#
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-10-04 16:31:49 UTC; 32min ago
   Main PID: 1365 (nvidia-persiste)
      Tasks: 1 (limit: 18929)
     Memory: 37.1M
     CGroup: /system.slice/nvidia-persistenced.service
             └─1365 /usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia --verbose

Oct 04 16:31:43 ip-192-168-19-182.ec2.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Oct 04 16:31:43 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: Verbose syslog connection opened
Oct 04 16:31:43 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: Now running with user ID 981 and group ID 981
Oct 04 16:31:43 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: Started (1365)
Oct 04 16:31:43 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: device 0000:00:1e.0 - registered
Oct 04 16:31:49 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: device 0000:00:1e.0 - persistence mode enabled.
Oct 04 16:31:49 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: device 0000:00:1e.0 - NUMA memory onlined.
Oct 04 16:31:49 ip-192-168-19-182.ec2.internal nvidia-persistenced[1365]: Local RPC services initialized
Oct 04 16:31:49 ip-192-168-19-182.ec2.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#
bash-5.1# uname -a
Linux ip-192-168-19-182.ec2.internal 5.15.167 #1 SMP Fri Oct 4 02:55:35 UTC 2024 x86_64 GNU/Linux

aws-ecs-2-nvidia

bash-5.1# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "26ec2cc5-dirty",
    "pretty_name": "Bottlerocket OS 1.23.0 (aws-ecs-2-nvidia)",
    "variant_id": "aws-ecs-2-nvidia",
    "version_id": "1.23.0"
  }
}
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-10-04 16:07:08 UTC; 38s ago
   Main PID: 1466 (nvidia-persiste)
      Tasks: 1 (limit: 18904)
     Memory: 620.0K
        CPU: 15ms
     CGroup: /system.slice/nvidia-persistenced.service
             └─1466 /usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia --verbose

Oct 04 16:07:06 ip-10-194-20-105.us-west-2.compute.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Oct 04 16:07:06 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: Verbose syslog connection opened
Oct 04 16:07:06 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: Now running with user ID 981 and group ID 981
Oct 04 16:07:06 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: Started (1466)
Oct 04 16:07:08 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: device 0000:00:1e.0 - registered
Oct 04 16:07:08 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: device 0000:00:1e.0 - persistence mode enabled.
Oct 04 16:07:08 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: device 0000:00:1e.0 - NUMA memory onlined.
Oct 04 16:07:08 ip-10-194-20-105.us-west-2.compute.internal nvidia-persistenced[1466]: Local RPC services initialized
Oct 04 16:07:08 ip-10-194-20-105.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1#

@isaac-400 isaac-400 requested a review from bcressey October 4, 2024 17:08
@bcressey (Contributor) left a comment:

Looks good! Some feedback on the systemd unit, but no blockers.

Comment on lines +14 to +10
[Install]
RequiredBy=preconfigured.target
@bcressey (Contributor) commented Oct 4, 2024:

If this doesn't specifically need to run in an early phase of boot, I'd just put it with the Fabric Manager in multi-user.target:

Suggested change:
-[Install]
-RequiredBy=preconfigured.target
+[Install]
+WantedBy=multi-user.target

@isaac-400 (Author) replied:

It can run at any time, but we should prefer that it runs earlier rather than later. Downstream customers may not always initialize the GPU device files themselves, so running this unit early ensures that those files are properly set by the time their units begin (as part of multi-user.target for example).

A contributor replied:

This should run before services like ecs-gpu-init so that we can rely on this service to create the devices.
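
A hedged sketch of one way to express that ordering in the unit (the Before= target is an assumption; RequiredBy=preconfigured.target matches the quoted diff above):

[Unit]
Before=ecs-gpu-init.service

[Install]
RequiredBy=preconfigured.target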

packages/kmod-6.1-nvidia/nvidia-persistenced.service — two review threads marked resolved (outdated)
@@ -0,0 +1 @@
u nvidia - "nvidia-persistenced user"
A contributor commented:

nit: since we might use this user for something else, make the description less persistenced-specific:

Suggested change:
-u nvidia - "nvidia-persistenced user"
+u nvidia - "nvidia user"

Isaac Feldman added 6 commits October 4, 2024 20:09
Whenever the NVIDIA device resources are no longer in use, the NVIDIA
kernel driver will tear down the device state. `nvidia-persistenced`
activates persistence mode, which keeps the device files open which
prevents the kernel from removing the device state. This is desirable
in applications that may suffer performance hits due to repeated
device initialization.

The NVIDIA device drivers ship with templates for running
`nvidia-persistenced` as a systemd unit. This change uses that
template.

The `nvidia-persistenced` documentation advises that while the systemd
unit can run as root, the unit should provide a non-root user for
`nvidia-persistenced` to run under.

See the documentation included with the NVIDIA driver for more
information about `nvidia-persistenced`.
@isaac-400 isaac-400 force-pushed the icf/nvidia-persistenced branch from ce6bdf4 to bd5e101 on October 4, 2024 20:37
@isaac-400 (Author)

Re-built with the tmpfiles.d change:

● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-10-04 20:23:15 UTC; 13min ago
   Main PID: 1436 (nvidia-persiste)
      Tasks: 1 (limit: 18904)
     Memory: 37.1M
        CPU: 1.959s
     CGroup: /system.slice/nvidia-persistenced.service
             └─1436 /usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia --verbose

Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: Verbose syslog connection opened
Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: Directory /var/run/nvidia-persistenced will not be removed on exit
Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: Now running with user ID 981 and group ID 981
Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: Started (1436)
Oct 04 20:23:10 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: device 0000:00:1e.0 - registered
Oct 04 20:23:15 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: device 0000:00:1e.0 - persistence mode enabled.
Oct 04 20:23:15 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: device 0000:00:1e.0 - NUMA memory onlined.
Oct 04 20:23:15 ip-10-194-20-136.us-west-2.compute.internal nvidia-persistenced[1436]: Local RPC services initialized
Oct 04 20:23:15 ip-10-194-20-136.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
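
For context, a tmpfiles.d entry of roughly this shape would pre-create the runtime directory mentioned in the log above (path, mode, and ownership here are assumptions, not necessarily the PR's actual entry):

d /run/nvidia-persistenced 0755 nvidia nvidia -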

@isaac-400 isaac-400 requested a review from bcressey October 4, 2024 21:48
@yeazelm (Contributor) left a comment:

LGTM!

@isaac-400 (Author)

Now ecs-gpu-init runs after nvidia-persistenced:

[root@admin]# sudo sheltie
bash-5.1# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-persistenced.service; enabled; preset: enabled)
     Active: active (running) since Tue 2024-10-08 00:40:40 UTC; 1min 34s ago
   Main PID: 1450 (nvidia-persiste)
      Tasks: 1 (limit: 18904)
     Memory: 37.1M
        CPU: 1.974s
     CGroup: /system.slice/nvidia-persistenced.service
             └─1450 /usr/libexec/nvidia/tesla/bin/nvidia-persistenced --user nvidia --verbose

Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal systemd[1]: Starting NVIDIA Persistence Daemon...
Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: Verbose syslog connection opened
Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: Directory /var/run/nvidia-persistenced will not be removed on exit
Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: Now running with user ID 981 and group ID 981
Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: Started (1450)
Oct 08 00:40:36 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: device 0000:00:1e.0 - registered
Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: device 0000:00:1e.0 - persistence mode enabled.
Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: device 0000:00:1e.0 - NUMA memory onlined.
Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal nvidia-persistenced[1450]: Local RPC services initialized
Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal systemd[1]: Started NVIDIA Persistence Daemon.
bash-5.1# systemctl status ecs-gpu-init
● ecs-gpu-init.service - Initialize ECS GPU config
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/ecs-gpu-init.service; enabled; preset: enabled)
     Active: active (exited) since Tue 2024-10-08 00:40:40 UTC; 1min 40s ago
   Main PID: 1466 (code=exited, status=0/SUCCESS)
        CPU: 17ms

Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal systemd[1]: Starting Initialize ECS GPU config...
Oct 08 00:40:40 ip-10-194-20-228.us-west-2.compute.internal systemd[1]: Finished Initialize ECS GPU config.
bash-5.1#

@bcressey bcressey merged commit 62450eb into bottlerocket-os:develop Oct 8, 2024
2 checks passed
Successfully merging this pull request may close these issues.

nvidia-container-cli timeout error when running ECS tasks
4 participants