Add example for llama.cpp #174
Conversation
(force-pushed from a3b66d4 to 333d338)
Thanks @justinsb for the contribution! Could you add a README for this example, like this one? (https://github.com/kubernetes-sigs/lws/blob/main/docs/examples/vllm/README.md)
Not (yet) using the leader functionality, just going direct to a worker.
Previously we weren't actually running on multiple pods; now we are.
(force-pushed from 333d338 to 3565efe)
# GGML_RPC=ON: Builds RPC support
# BUILD_SHARED_LIBS=OFF: Don't rely on shared libraries like libggml
RUN cmake . -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF -DGGML_DEBUG=1
RUN cmake --build . --config Release --parallel 8
is the parallel here "tensor parallelism" or "pipeline parallelism"?
This is just running the cmake build in parallel. Just a slightly faster docker build, no runtime effect :-)
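For anyone reproducing the build outside Docker, a minimal sketch of the same two steps (same flags as the Dockerfile above; $(nproc) is just an illustrative substitute for the fixed 8):

# Configure with RPC support and static ggml, matching the Dockerfile.
cmake . -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF -DGGML_DEBUG=1
# --parallel only controls how many compile jobs run at once; it changes
# build time, not the behavior of the resulting binaries.
cmake --build . --config Release --parallel "$(nproc)"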
llama.cpp began as a project to support CPU-only inference on a single node, but has
since expanded to support accelerators and distributed inference.

l.md)
is this added accidentally?
Yes, I'll remove it ... I'm thinking I should add GPU support next as well, so maybe I'll do that at the same time!
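(For reference, a rough sketch of how the configure step might change for that GPU follow-up, assuming a CUDA-capable base image and llama.cpp's GGML_CUDA CMake option; this is not part of this PR:)

# Hypothetical GPU-enabled configure: CUDA offload on top of RPC support.
RUN cmake . -DGGML_RPC=ON -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF
RUN cmake --build . --config Release --parallel 8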
Thanks @justinsb! Very nice example of using CPU for multi-node inference!
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, liurupeng

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
In the first commit we just bring up llama.cpp, not really using LWS.
In the second commit we really use LeaderWorkerSet, leveraging llama.cpp's RPC support.
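As a rough sketch of how those pieces fit together at runtime (binary names come from a GGML_RPC=ON build of llama.cpp; hostnames, ports, and the model path are illustrative rather than the exact commands in this example):

# On each worker pod: expose the local ggml backend over RPC.
./rpc-server -p 50052

# On the leader pod: serve the model, offloading work across the workers' RPC endpoints.
./llama-server -m /models/model.gguf --rpc worker-0:50052,worker-1:50052 --host 0.0.0.0 --port 8080

LWS gives the pods in each group stable, predictable hostnames, which is what lets the leader enumerate worker RPC endpoints like these.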