
[feature-request] Add support for mlc-llm #18

Open
shwu-nyunai opened this issue Sep 5, 2024 · 0 comments

shwu-nyunai (Contributor) commented Sep 5, 2024

Feature type?
Algorithm request

A proposal draft (if any)
MLC-LLM is an LLM Deployment Engine with ML Compilation.
https://github.com/mlc-ai/mlc-llm

It has very wide environment and backend support.
[screenshot: mlc-llm's supported platforms and backends matrix]

Primarily,

  • the different quantisation schemes should be supported out of the box (OOTB).

    • In the past, we tested that the outputs of nyuntam's w4a16 quant algorithm (AWQ) can be used directly as inputs to mlc-llm's q4f16_awq quantisation scheme. We expect this still holds. Ideally, if someone chooses q4f16_awq as the quantisation, nyuntam's AutoAWQ should run as the intermediary job, and its output(s) should feed into mlc-llm's weight conversion and model compilation (see the sketch after this list).
    • for 3-bit quantisation, mlc-llm supports OmniQuant inputs, as per this notebook.
  • all the platforms supported by mlc-llm should be supported OOTB, though testing them is subject to test-environment availability.
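
As a rough illustration of the proposed q4f16_awq handoff, here is a minimal sketch. The `run_autoawq_job` helper is a hypothetical placeholder for nyuntam's AutoAWQ job (its real entry point isn't specified in this issue); the `mlc_llm convert_weight` / `gen_config` / `compile` subcommands and the `--source-format awq` path exist in mlc-llm, but the exact flags should be verified against the installed version, and the conversation template depends on the model.

```python
# Sketch only: run_autoawq_job is a hypothetical stand-in for nyuntam's
# w4a16 AutoAWQ job; verify the mlc_llm CLI flags below against the
# installed mlc-llm version.
import subprocess
from pathlib import Path


def run_autoawq_job(model_path: str, out_dir: Path) -> Path:
    """Placeholder: run nyuntam's AutoAWQ on `model_path` and return the
    directory holding the AWQ-format quantised weights."""
    raise NotImplementedError("wire this up to nyuntam's AWQ pipeline")


def build_with_mlc(model_path: str, work_dir: Path,
                   quantization: str = "q4f16_awq") -> None:
    work_dir.mkdir(parents=True, exist_ok=True)
    weights_dir = work_dir / "mlc-weights"

    if quantization == "q4f16_awq":
        # Proposed behaviour: run nyuntam's AutoAWQ as the intermediary
        # job, then hand its output to mlc-llm's weight conversion so the
        # weights are repacked rather than re-quantised.
        awq_dir = run_autoawq_job(model_path, work_dir / "awq")
        convert_cmd = [
            "mlc_llm", "convert_weight", model_path,
            "--quantization", quantization,
            "--source", str(awq_dir),
            "--source-format", "awq",
            "--output", str(weights_dir),
        ]
    else:
        # Other schemes: let mlc-llm quantise from the original weights.
        convert_cmd = [
            "mlc_llm", "convert_weight", model_path,
            "--quantization", quantization,
            "--output", str(weights_dir),
        ]
    subprocess.run(convert_cmd, check=True)

    # Generate the chat config, then compile the model library.
    subprocess.run([
        "mlc_llm", "gen_config", model_path,
        "--quantization", quantization,
        "--conv-template", "llama-2",  # assumption: depends on the model
        "--output", str(weights_dir),
    ], check=True)
    subprocess.run([
        "mlc_llm", "compile", str(weights_dir / "mlc-chat-config.json"),
        "--output", str(work_dir / "model.so"),
    ], check=True)
```

The key point of the flow, per our earlier testing, is that for q4f16_awq mlc-llm should not re-quantise: the intermediary AutoAWQ job runs first and mlc-llm only converts/repacks those weights before compilation.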
