Models Request: lateral upgrade #1797
JamesClarke7283
started this conversation in
Ideas
Replies: 0 comments
I would like to request that some models be replaced with their newer counterparts, which I list below.
phi3.5-mini -> phi4-mini
command-r-plus-08-2024 -> command-a-03-2025
Mistral-Nemo-Instruct-2407 -> Mistral-Small-3.1-24B-Instruct-2503
Note: I realise some of these models are not direct drop-in replacements, since they have different capabilities and compute costs.
As far as I know, Mistral-Nemo has been superseded by Mistral-Small (I may be wrong if Mistral plans to release another Nemo model).
Mistral-Small-3.1 also has vision capability, which is nice, and it is Apache 2.0 licensed, unlike some earlier versions of Small.
As for command-a, it is only a 7B parameter increase over command-r-plus (the previous flagship model from Cohere).
command-r-plus is 104B and command-a is 111B.
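To put the size difference in perspective, here is a quick sketch of the arithmetic, using only the parameter counts quoted above. The `PARAMS_B` table and the `param_increase` helper are names made up for illustration, not part of any library.

```python
# Parameter counts in billions, as quoted in this post.
PARAMS_B = {
    "command-r-plus-08-2024": 104,
    "command-a-03-2025": 111,
}


def param_increase(old: str, new: str) -> tuple[int, float]:
    """Return (absolute increase in billions, relative increase in percent)."""
    delta = PARAMS_B[new] - PARAMS_B[old]
    return delta, 100 * delta / PARAMS_B[old]


delta_b, pct = param_increase("command-r-plus-08-2024", "command-a-03-2025")
print(f"+{delta_b}B parameters ({pct:.1f}% larger)")  # +7B parameters (6.7% larger)
```

Parameter count is only a rough proxy for serving cost, so this says nothing about actual throughput or memory use on any particular hardware.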
Also,
I realise there has been a lot of controversy around the Llama 4 herd models.
Either way, as a nice bonus,
meta-llama/Llama-4-Scout-17B-16E-Instruct
would be nice to have. It has 109B params (sparse, as it's a MoE), so it is not as heavy as llama3.1-405B was (I still remember using 405B on hf-chat and the service being stalled). That is not to say the 402B (sparse, MoE) Maverick model can't be used if you are brave enough. (You can change the number of experts based on compute capacity.)
Needless to say, Behemoth with its 2T params might melt HF's servers (unless you dedicate clusters to that model specifically). I can't wait to see how inference providers try to handle that model when it finishes training.
Anyway, that's my 2 cents on keeping the model list fresh. Thanks in advance for reading. (: