
Determine level of HF and Docker Model integration #6

@deitch

The Docker model CLI can pull models from any OCI registry, in practice mostly Docker Hub and Hugging Face. To enable this, HF provides OCI distribution support for at least some of its models. It is not 100% clear which models those are, or under what conditions HF returns a proper 200 versus a 4xx error code, other than the requirement that the model be in GGUF format.

Either way, model can pull models down and then run them locally. It currently uses llama.cpp, but expects to support other runtimes and formats eventually. You can also control and configure it using env vars, e.g. its cache dir or the llama.cpp binary path.
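To narrow down the open question of which models answer with 200 versus 4xx, a small probe against the OCI distribution manifest endpoint might help. The following is a minimal sketch, not how model itself does it: the function names are hypothetical, the registry host and repository are placeholders you would supply, and the status buckets are my own coarse grouping.

```python
import urllib.error
import urllib.request

# Media type defined by the OCI image spec for image manifests.
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def manifest_url(registry: str, repository: str, reference: str) -> str:
    """Build the OCI distribution manifest endpoint: /v2/<name>/manifests/<reference>."""
    return f"https://{registry}/v2/{repository}/manifests/{reference}"


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse outcome (hypothetical buckets)."""
    if status == 200:
        return "available"
    if status in (401, 403):
        return "auth-required"
    if status == 404:
        return "not-found"
    if 400 <= status < 500:
        return "other-client-error"
    return "unexpected"


def probe(registry: str, repository: str, reference: str, timeout: float = 10.0) -> str:
    """Send a HEAD request for the manifest and classify the response (needs network)."""
    request = urllib.request.Request(
        manifest_url(registry, repository, reference),
        method="HEAD",
        headers={"Accept": OCI_MANIFEST},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still what we want.
        return classify_status(err.code)
```

Running `probe(...)` over a list of candidate repositories would give a rough table of what is and is not served. Note that many registries answer 401 with a token challenge before serving anything, so "auth-required" alone does not mean the model is unavailable over OCI.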

Key question: do we want to leverage or reuse docker model?

The model CLI is licensed Apache 2.0, which means we can use it as-is, leverage the library, or copy and modify any parts of it.

The advantage of using docker model is that we get a CLI (or at least library) that already has much of what we need.

The disadvantage is that it does things its own way. The limitations are:

  • models are stored in ~/.docker/models/, an OCI layout-v1-like directory structure. This is built into model. As above, we can configure the location using an env var, but not the format.
  • only GGUF is supported currently, whereas we are currently working with ONNX and bundle. We eventually want to get to GGUF and llama.cpp, but we do not have support for it today, and we are not sure we want to be limited to just that.
  • only the OCI interface is supported for downloads. While HF supports it for some models (see above), it is not HF's preferred transport method, which is rapidly transitioning to xet.
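For reference on the first limitation, this is roughly what a standard OCI image layout looks like on disk: an `oci-layout` marker file, an `index.json`, and content-addressed blobs under `blobs/<algorithm>/<hex>`. The sketch below checks a directory against that shape. It assumes the plain OCI image layout spec; since model's store is only "layout-v1-like", it may well deviate, and the helper names here are hypothetical.

```python
import json
from pathlib import Path


def blob_path(layout_root: Path, digest: str) -> Path:
    """Resolve a digest such as 'sha256:<hex>' to blobs/<algorithm>/<hex>."""
    algorithm, _, encoded = digest.partition(":")
    if not algorithm or not encoded:
        raise ValueError(f"not a valid digest: {digest!r}")
    return layout_root / "blobs" / algorithm / encoded


def looks_like_oci_layout(root: Path) -> bool:
    """Check for the marker file, index.json and blobs/ that the OCI layout spec requires."""
    marker = root / "oci-layout"
    if not marker.is_file():
        return False
    if not (root / "index.json").is_file() or not (root / "blobs").is_dir():
        return False
    try:
        version = json.loads(marker.read_text()).get("imageLayoutVersion")
    except (json.JSONDecodeError, AttributeError):
        # Unparseable marker, or JSON that is not an object.
        return False
    return isinstance(version, str) and version.startswith("1.")
```

Pointing `looks_like_oci_layout` at the configured model store would be a quick way to see how closely it follows the spec, which matters if we ever want other OCI tooling to read the same directory.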

The key question is Hugging Face compatibility. When we discussed this by phone last week, we said we believed the following use cases to be important:

  • "I already downloaded a model using hf CLI or compatible tools, I want to use it to run on your chip with nekko CLI."
  • "I downloaded (and ran) a model using nekko CLI, I want to interact with that model using the hf or compatible tools and libraries with which I usually work."

The above means 100% compatibility with HF for downloading, uploading and caching models. If that is important to us, then model is of little value to us, other than a convenient library for some functions (maybe).
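To make the caching part concrete: as far as I know, the hf tools keep models under a hub cache directory, in folders named `models--<org>--<name>` with a `snapshots/<revision>/` subtree, and the location is resolved from `HF_HUB_CACHE`, then `HF_HOME`, then `XDG_CACHE_HOME`, falling back to `~/.cache/huggingface/hub`. A minimal sketch of finding an already-downloaded model follows. The function names are hypothetical, and this is not taken from nekko or from the hf libraries themselves.

```python
import os
from pathlib import Path


def hf_hub_cache(env=None) -> Path:
    """Resolve the hub cache directory from HF_HUB_CACHE, HF_HOME, XDG_CACHE_HOME."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME") or os.path.join(
        env.get("XDG_CACHE_HOME") or "~/.cache", "huggingface"
    )
    return Path(hf_home).expanduser() / "hub"


def hf_model_dir(repo_id: str, env=None) -> Path:
    """Map a repo id like 'org/name' to its cache folder 'models--org--name'."""
    return hf_hub_cache(env) / ("models--" + repo_id.replace("/", "--"))


def list_snapshots(repo_id: str, env=None) -> list:
    """Return the revision names found under snapshots/, or [] if nothing is cached."""
    snapshots = hf_model_dir(repo_id, env) / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(p.name for p in snapshots.iterdir() if p.is_dir())
```

If nekko read from and wrote to this layout directly, both use cases would work without copying models between stores. That is the part model's OCI-style store cannot give us.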

I don't think we should make significant changes to nekko until we resolve this question, as it changes direction.
