This PR adds the newly released Phi-3.5-mini, adding the following `model_id`s to our prebuilt model list:

- `Phi-3.5-mini-instruct-q4f16_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC` (4k KVCache)
- `Phi-3.5-mini-instruct-q4f16_1-MLC-1k` (1k KVCache)
- `Phi-3.5-mini-instruct-q4f32_1-MLC-1k` (1k KVCache)

See mlc-ai/binary-mlc-llm-libs#136 for the commits of TVM and MLC-LLM these model libraries are compiled with.
Note that Phi-3.5-mini supports up to 128K context (unlike Phi-3-mini, which only has 4k) thanks to RoPE scaling, which MLC-LLM supports. You can take advantage of this in WebLLM by increasing `ModelRecord.overrides.context_window_size` or by specifying it in `ChatOptions` when loading a model, as long as there is enough VRAM.
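As a rough sketch (not part of this PR's diff), here is how one might load one of these models with a larger context window via `ChatOptions`. The model ID and the 32768-token window are illustrative, and the field names assume the WebLLM `ChatOptions`/`ChatConfig` shape in recent releases:

```typescript
import * as webllm from "@mlc-ai/web-llm";

async function main() {
  // Illustrative: load the 4k-KVCache build but request a 32k context window,
  // relying on Phi-3.5-mini's RoPE scaling. This only works if the GPU has
  // enough VRAM to hold the larger KVCache.
  const engine = await webllm.CreateMLCEngine(
    "Phi-3.5-mini-instruct-q4f16_1-MLC",
    { initProgressCallback: (p) => console.log(p.text) },
    { context_window_size: 32768 },
  );

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize this long document ..." }],
  });
  console.log(reply.choices[0]?.message.content);
}

main();
```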