HuggingFaceModel #21

Open
wants to merge 23 commits into
base: main
Commits
c582ac8
Draft models and tests
simple-easydev Apr 9, 2024
e761491
update huggleface model
simple-easydev Apr 10, 2024
365927a
Done first version of HuggingFaceModel
simple-easydev Apr 11, 2024
d9c19b7
Fix tiny bugs
simple-easydev Apr 11, 2024
ab22045
Fix feedbacks
simple-easydev Apr 11, 2024
4fc31b4
Fix missing feedback
simple-easydev Apr 11, 2024
d40be9d
[wip] gpu support
jjleng Mar 26, 2024
5af3e16
feat(gpu): run models on cuda GPUs
jjleng Apr 5, 2024
615cc4d
feat(gpu): make nvidia device plugin tolerate model group taints
jjleng Apr 6, 2024
8181314
feat(gpu): set n_gpu_layers to offload work to gpu for the llama.cpp …
jjleng Apr 9, 2024
91e4571
feat(gpu): larger disk for gpu nodes
jjleng Apr 9, 2024
28075b7
feat(gpu): make model group node disk size configerable
jjleng Apr 10, 2024
ac8c726
feat(gpu): be able to request a number of GPUs through config
jjleng Apr 10, 2024
a945de8
docs: update README with the GPU support message
jjleng Apr 10, 2024
62e9a62
docs: add llama2 chat template for the invoice extraction example
jjleng Apr 10, 2024
c842495
docs: README for the invoice extraction example
jjleng Apr 10, 2024
ed40b64
docs(invoice_extraction): gpu_cluster.yaml for GPU inferences
jjleng Apr 10, 2024
0aadc74
feat: remove finalizers before tearing down a cluster
jjleng Apr 10, 2024
4e2bdf7
chore: bump version
jjleng Apr 10, 2024
6f88d8a
docs: instructions for installing the pack CLI
jjleng Apr 11, 2024
c1bcd37
update the progress status logging for downloading
simple-easydev Apr 13, 2024
a0f0ad4
docs: add pulumi CLI as a dependency
jjleng Apr 13, 2024
5863ad0
Fix test case for HuggingFaceModel.upload_file_to_s3
simple-easydev Apr 14, 2024
feat(gpu): make nvidia device plugin tolerate model group taints
jjleng authored and simple-easydev committed Apr 12, 2024
commit 615cc4da220b8250f144e6ef25332c2bad6e1601
72 changes: 67 additions & 5 deletions paka/cluster/nvidia_device_plugin.py
@@ -3,7 +3,7 @@
 
 
 def install_nvidia_device_plugin(
-    k8s_provider: k8s.Provider, version: str = "main"
+    k8s_provider: k8s.Provider, version: str = "v0.15.0-rc.2"
 ) -> None:
     """
     Installs the NVIDIA device plugin for GPU support in the cluster.
@@ -17,9 +17,71 @@ def install_nvidia_device_plugin(
     Returns:
         None
     """
-    # This will install a DaemonSet in the kube-system namespace
-    k8s.yaml.ConfigFile(
-        "nvidia-device-plugin",
-        file=f"https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/{version}/nvidia-device-plugin.yml",
+
+    k8s.apps.v1.DaemonSet(
+        "nvidia-device-plugin-daemonset",
+        metadata=k8s.meta.v1.ObjectMetaArgs(
+            namespace="kube-system",
+        ),
+        spec=k8s.apps.v1.DaemonSetSpecArgs(
+            selector=k8s.meta.v1.LabelSelectorArgs(
+                match_labels={
+                    "name": "nvidia-device-plugin-ds",
+                },
+            ),
+            update_strategy=k8s.apps.v1.DaemonSetUpdateStrategyArgs(
+                type="RollingUpdate",
+            ),
+            template=k8s.core.v1.PodTemplateSpecArgs(
+                metadata=k8s.meta.v1.ObjectMetaArgs(
+                    labels={
+                        "name": "nvidia-device-plugin-ds",
+                    },
+                ),
+                spec=k8s.core.v1.PodSpecArgs(
+                    tolerations=[
+                        k8s.core.v1.TolerationArgs(
+                            key="nvidia.com/gpu",
+                            operator="Exists",
+                            effect="NoSchedule",
+                        ),
+                        k8s.core.v1.TolerationArgs(operator="Exists"),
+                    ],
+                    priority_class_name="system-node-critical",
+                    containers=[
+                        k8s.core.v1.ContainerArgs(
+                            image=f"nvcr.io/nvidia/k8s-device-plugin:{version}",
+                            name="nvidia-device-plugin-ctr",
+                            env=[
+                                k8s.core.v1.EnvVarArgs(
+                                    name="FAIL_ON_INIT_ERROR",
+                                    value="false",
+                                )
+                            ],
+                            security_context=k8s.core.v1.SecurityContextArgs(
+                                allow_privilege_escalation=False,
+                                capabilities=k8s.core.v1.CapabilitiesArgs(
+                                    drop=["ALL"],
+                                ),
+                            ),
+                            volume_mounts=[
+                                k8s.core.v1.VolumeMountArgs(
+                                    name="device-plugin",
+                                    mount_path="/var/lib/kubelet/device-plugins",
+                                )
+                            ],
+                        )
+                    ],
+                    volumes=[
+                        k8s.core.v1.VolumeArgs(
+                            name="device-plugin",
+                            host_path=k8s.core.v1.HostPathVolumeSourceArgs(
+                                path="/var/lib/kubelet/device-plugins",
+                            ),
+                        )
+                    ],
+                ),
+            ),
+        ),
         opts=pulumi.ResourceOptions(provider=k8s_provider),
     )
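
Note on the two tolerations: the commit title says the device plugin should tolerate model group taints, and the second toleration (`operator="Exists"` with no key or effect) is what appears to achieve that. The sketch below is a simplified pure-Python model of the Kubernetes toleration matching rules as I understand them from the Kubernetes documentation; it is not code from this PR or from Kubernetes itself. The `Taint` and `Toleration` classes, the `tolerates` helper, and the `model=llama2-7b:NoSchedule` taint are all hypothetical names chosen for illustration; the real model group taint key in this repository is not shown in the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taint:
    """Hypothetical stand-in for a node taint (key=value:effect)."""
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    """Hypothetical stand-in for a pod toleration."""
    key: Optional[str] = None
    operator: str = "Equal"  # Kubernetes defaults to Equal when unset
    value: str = ""
    effect: Optional[str] = None


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint (simplified rules)."""
    # An unset effect matches every effect; a set effect must be equal.
    if toleration.effect and toleration.effect != taint.effect:
        return False
    # An empty key is only meaningful with Exists and matches every key.
    if not toleration.key:
        return toleration.operator == "Exists"
    if toleration.key != taint.key:
        return False
    # Exists ignores the value; Equal requires the values to match.
    if toleration.operator == "Exists":
        return True
    return toleration.value == taint.value


# The two tolerations from the diff above.
gpu_only = Toleration(key="nvidia.com/gpu", operator="Exists", effect="NoSchedule")
catch_all = Toleration(operator="Exists")

# Assumed example taints; the model group taint is a made-up placeholder.
gpu_taint = Taint(key="nvidia.com/gpu", effect="NoSchedule", value="present")
model_group_taint = Taint(key="model", effect="NoSchedule", value="llama2-7b")

print(tolerates(gpu_only, gpu_taint))
print(tolerates(gpu_only, model_group_taint))
print(tolerates(catch_all, model_group_taint))
```

Under this model the keyed `nvidia.com/gpu` toleration alone would not let the DaemonSet pod land on a node carrying a different taint, while the catch-all toleration matches any taint. That also suggests the first toleration is redundant once the second is present, which may be worth raising in review.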