Update getting-started.mdx
BenHamm authored Oct 6, 2024
1 parent 431b52b commit e2b7f8f
Showing 1 changed file with 1 addition and 15 deletions.
16 changes: 1 addition & 15 deletions fern/docs/text-gen-solution/getting-started.mdx
@@ -5,30 +5,16 @@ subtitle: >-
slug: text-gen-solution/getting-started
---

In the coming months, we will launch additional features, including efficient fine-tuning, longer-context models, JSON mode support, and more.

## Self-Service Models

Presently, OctoAI supports the following models & checkpoints for self-service models:

**Meta-Llama-3.1:** Meta's latest LLM family. Llama-3.1 offers improved understanding and instruction following compared to Llama-3.0, expands context length, supports 8 languages (English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai), and includes support for function calling. The model is offered in 8B-, 70B-, and 405B-parameter variants; the 405B variant is the most powerful open-source model yet released.
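
As a minimal sketch of calling one of these checkpoints, the snippet below sends a chat completion through an OpenAI-compatible client. The base URL and the model identifier are illustrative assumptions, not confirmed values; check the model card for the exact names.

```python
# Hedged sketch: a chat completion against an OpenAI-compatible text-gen endpoint.
# The base_url and model name below are illustrative assumptions, not confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://text.octoai.run/v1",   # assumed OpenAI-compatible endpoint
    api_key="<OCTOAI_API_TOKEN>",            # your OctoAI API token
)

response = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",      # illustrative model identifier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Give me a one-sentence summary of mixture-of-experts models."},
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```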

**Mistral-Nemo-Instruct:** Released in 2024, Mistral Nemo is a 12B-parameter model developed in collaboration with NVIDIA. Mistral Nemo uses a standard architecture, making it easy to integrate as a drop-in replacement for systems using Mistral 7B. The model is released under the Apache 2.0 license, excels in reasoning, world knowledge, and coding accuracy for its size category, and is designed for multilingual applications, with strong performance in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. [Read more.](https://mistral.ai/news/mistral-nemo/)

**Phi-3.5-Vision-Instruct:** Released in August 2024 by Microsoft, Phi-3.5-vision-instruct is a lightweight, state-of-the-art open multimodal model with 4.2B parameters. It features a 128K token context length and excels in multi-frame image understanding and reasoning. The model is built upon high-quality datasets including synthetic data and filtered publicly available websites, focusing on reasoning-dense data for both text and vision. Phi-3.5-vision-instruct demonstrates improved performance on various benchmarks, including MMMU (43.0), MMBench (81.9), and TextVQA (72.0). It supports advanced capabilities such as detailed image comparison, multi-image summarization/storytelling, and video summarization. [Read more.](https://huggingface.co/microsoft/Phi-3.5-vision-instruct/)
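
If the endpoint accepts OpenAI-style image content parts (an assumption, not confirmed here), a multi-image request to Phi-3.5-vision-instruct might look roughly like this sketch; the model identifier and image URLs are placeholders.

```python
# Hedged sketch of multi-image understanding with Phi-3.5-vision-instruct.
# Assumes OpenAI-style "image_url" content parts are accepted; model name is illustrative.
from openai import OpenAI

client = OpenAI(base_url="https://text.octoai.run/v1", api_key="<OCTOAI_API_TOKEN>")

response = client.chat.completions.create(
    model="phi-3.5-vision-instruct",   # illustrative model identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Compare these two charts and summarize the key difference."},
            {"type": "image_url", "image_url": {"url": "https://example.com/chart-a.png"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/chart-b.png"}},
        ],
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)
```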

**Mistral-7b-Instruct-v0.3:** Updated by Mistral AI in May 2024, Mistral-7B-Instruct-v0.3 is an instruction-tuned model that achieves high-quality performance at a very low parameter count. Compared to the previous version, this model expands the vocabulary to 32,768, adds support for the v3 Tokenizer, and enables function calling. [Read more.](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) This model is available for commercial use. We offer a single endpoint here: the 7B parameter model, which supports up to 32,768 tokens.

**Mixtral-8x7b-Instruct:** Mistral AI's December 2023 release, Mixtral-8x7b-Instruct, is a "mixture of experts" model utilizing conditional computing for efficient token generation, reducing computational demands while improving response quality (GPT-4 is widely believed to be an MoE model). Mixtral-8x7b-Instruct brings these efficiencies to the open-source LLM realm, and it is licensed for commercial use. It supports up to 32,768 tokens. [Read more](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).
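
The "conditional computing" idea is easy to see in a toy sketch (this illustrates the general MoE routing pattern, not Mixtral's actual implementation): a router scores every expert for each token, but only the top-k experts run.

```python
# Toy illustration of mixture-of-experts routing (not Mixtral's real code):
# every expert gets a score, but only the top-k experts do any compute per token.
import numpy as np

def moe_forward(token_vec, experts, router_weights, k=2):
    scores = router_weights @ token_vec                       # one score per expert
    top = np.argsort(scores)[-k:]                             # indices of the k best experts
    gate = np.exp(scores[top]) / np.exp(scores[top]).sum()    # softmax over the winners only
    # Only the selected experts are evaluated; the rest are skipped entirely.
    return sum(g * experts[i](token_vec) for g, i in zip(gate, top))

rng = np.random.default_rng(0)
experts = [lambda x, w=rng.normal(size=(8, 8)): w @ x for _ in range(8)]
router = rng.normal(size=(8, 8))
print(moe_forward(rng.normal(size=8), experts, router))
```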

**Nous-Hermes-2-Mixtral-8x7b-DPO:** The flagship Nous Research model trained over the Mixtral 8x7B MoE LLM. The model was trained on over 1,000,000 entries of data, as well as other high-quality data from open datasets across the AI landscape, achieving state-of-the-art performance on a variety of tasks. It supports up to 32,768 tokens.

**WizardLM2-8x22B:** A fine-tune of Mixtral-8x22B published by Microsoft, competitive with closed-source models on a variety of benchmarks. Microsoft specifically trained this model to support complex instruction following, and it has gained a following with the open-source community and OctoAI customers. As with Mixtral-Instruct, this supports up to 65,536 tokens. [Read more.](https://wizardlm.github.io/WizardLM2/)

**Llama Guard 2:** An 8B-parameter, Llama 3-based LLM content moderation model released by Meta, which can classify text as safe or unsafe according to an editable set of policies. As an 8B-parameter model, it is optimized for latency and can be used to moderate other LLM interactions in real time. [Read more](https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8). Note: this model requires a specific prompt template to be applied and is not compatible with the ChatCompletion API.
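
Because the ChatCompletion API is not supported for this model, a call would go through the raw completions route with Meta's Llama Guard 2 template applied yourself. The sketch below is hedged: the endpoint and model identifier are illustrative assumptions, and the prompt shown is an abbreviated stand-in for the full template published on the model card.

```python
# Hedged sketch: content moderation with Llama Guard 2 via the raw completions route.
# The PROMPT is an abbreviated stand-in for Meta's published template; use the full
# template from the model card. Endpoint and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://text.octoai.run/v1", api_key="<OCTOAI_API_TOKEN>")

PROMPT = """[INST] Task: Check if there is unsafe content in the 'User' message in the
conversation below according to our safety policy.

<BEGIN CONVERSATION>
User: What's a good recipe for banana bread?
<END CONVERSATION>

Provide your safety assessment: the first line must read 'safe' or 'unsafe', and if
unsafe, a second line must list the violated categories. [/INST]"""

response = client.completions.create(
    model="llamaguard-2-8b",   # illustrative model identifier
    prompt=PROMPT,
    max_tokens=32,
    temperature=0,
)
print(response.choices[0].text)   # expected to start with "safe" for a benign message
```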

**GTE Large:** An embeddings model released by Alibaba DAMO Academy. It is trained on a large-scale corpus of relevance text pairs covering a wide range of domains and scenarios, and it consistently ranks highly on Hugging Face's [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard). In combination with a vector database, this embeddings model is especially useful for powering semantic search and Retrieval Augmented Generation (RAG) applications. [Read more.](https://huggingface.co/thenlper/gte-large)
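
As a rough sketch of the semantic-search use case (assuming an OpenAI-compatible embeddings route; the model identifier is illustrative), you can embed a small document set and rank it against a query by cosine similarity:

```python
# Hedged sketch: semantic search with GTE Large embeddings plus cosine similarity.
# Assumes an OpenAI-compatible /v1/embeddings route; the model name is illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI(base_url="https://text.octoai.run/v1", api_key="<OCTOAI_API_TOKEN>")

docs = [
    "OctoAI serves open-source LLMs behind a self-service API.",
    "GTE Large maps text to dense vectors for retrieval.",
    "The pricing page lists per-token rates for each endpoint.",
]

def embed(texts):
    resp = client.embeddings.create(model="thenlper/gte-large", input=texts)  # illustrative name
    return np.array([item.embedding for item in resp.data])

doc_vecs = embed(docs)
query_vec = embed(["Where can I find token prices?"])[0]

# Cosine similarity between the query and every document; the highest score wins.
scores = doc_vecs @ query_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(np.argmax(scores))])
```

In a real deployment the document vectors would live in a vector database; this in-memory ranking only illustrates the retrieval step of a RAG pipeline.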

For pricing of all of these endpoints, please refer to our [pricing page](/docs/getting-started/pricing-and-billing).

## Web UI playground
