Description
This ties into some of the other issues. We need to capture in one place what the CLI UX should look like and what commands and flags are needed.
Currently, it is a single command, roughly like this:
```
nekko.sh -r <runtime> -m <model> [-d <dataset>] [-i <image>] [-c <command>]
```
This lets you pick:
- what kind of runtime, which sets the default command and image
- model location (OCI registry or HF)
- dataset location (OCI registry or HF)
- runtime image to use
- command to execute
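For example, a typical invocation today might look roughly like this (the model/dataset reference syntax, registry, and image name are made up for illustration; `llama-server` is llama.cpp's server binary):

```sh
# Hypothetical invocation; reference schemes and names are illustrative only.
nekko.sh -r llama.cpp \
  -m hf://TheOrg/some-model \
  -d oci://registry.example.com/datasets/sample \
  -i ghcr.io/example/llama.cpp-runtime:latest \
  -c "llama-server -m /models/some-model.gguf"
```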
When you run it, it:
- Downloads the model and dataset, if not already cached
- Runs the runtime image with the command
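In shell terms, today's flow is roughly the following sketch (the cache location, the `fetch_model` helper, and the use of `docker` are all assumptions for illustration, not the actual implementation):

```sh
# Sketch of the current pull-if-missing-then-run flow.
CACHE="${NEKKO_CACHE:-$HOME/.cache/nekko}"

if [ ! -d "$CACHE/models/$MODEL" ]; then
    # download from an OCI registry or HF into the local cache
    fetch_model "$MODEL" "$CACHE/models/$MODEL"   # hypothetical helper
fi

# run the runtime image with the (default or overridden) command
docker run --rm -v "$CACHE:/cache" "$IMAGE" $COMMAND
```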
This does not give you a lot of flexibility.
Proposed new setup
First, there need to be multiple commands. The following are recommended:
- `nekko run` - equivalent of today: run a given model with a given image and command (defaults set by the runtime, overridable) after ensuring it is present. But this should be an inference run, i.e. minimal interactivity.
- `nekko pull` - pull a model, dataset or runtime image; may need to be split into multiple subcommands, since it is not always clear which is a runtime image and which is a model/dataset.
- `nekko push` - for the future; push a model or dataset (and maybe a runtime image).
- `nekko develop` - similar to today, but with commands and setup to enable interactive running.
- `nekko list` - list downloaded models and datasets.
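Hypothetical invocations of the proposed commands might look like this (flag names are carried over from today's CLI; the `model`/`runtime` split of `pull` is one possible way to resolve the ambiguity noted above):

```sh
nekko pull model hf://TheOrg/some-model       # cache a model
nekko pull runtime llama.cpp                  # cache a runtime image
nekko list                                    # show cached models and datasets
nekko run -r llama.cpp -m hf://TheOrg/some-model      # inference, minimal interactivity
nekko develop -r llama.cpp -m hf://TheOrg/some-model  # drop into an interactive shell
```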
Note the different experience between `nekko develop` and `nekko run`. With `develop`, people expect to get an interactive shell; with `run`, people expect inference to run, either interactively, as with an LLM, or one-shot and exit, as with a vision model and a dataset.
We need different runtimes; for now, at least: onnx-eis, onnx-runtime, and llama.cpp. There are two ways to select one: a flag or a subcommand.
- Flag: `nekko run -r onnx-eis` vs `nekko run -r llama.cpp`
- Subcommand: `nekko llama.cpp run` vs `nekko onnx-eis run`
There are pros and cons to both. The one advantage of a CLI flag is that we might be able to determine the runtime dynamically, by looking at the model once downloaded. Then again, that may be a bit "magic". When you run the HF CLI or libraries, do they automatically determine what the model type is and launch a runtime for it? Is that an expected behaviour?
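If we did go the dynamic-detection route, a first cut could be as simple as mapping model file types to runtimes. A sketch (the extension-to-runtime mapping is an assumption; GGUF is llama.cpp's native format, and `.onnx` files are plain ONNX graphs):

```sh
# Sketch: guess the runtime from the files in a downloaded model directory.
detect_runtime() {
    model_dir="$1"
    if ls "$model_dir"/*.gguf >/dev/null 2>&1; then
        echo "llama.cpp"      # GGUF -> llama.cpp
    elif ls "$model_dir"/*.onnx >/dev/null 2>&1; then
        echo "onnx-runtime"   # plain ONNX; could also map to onnx-eis
    else
        echo "unknown"        # fall back to requiring an explicit -r flag
    fi
}
```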