Use frontier open LLMs like Kimi K2, DeepSeek V3.1, GLM 4.5 and more in VS Code with GitHub Copilot Chat powered by Hugging Face Inference Providers 🔥
- Install the HF Copilot Chat extension from the VS Code Marketplace.
- Open VS Code's chat interface.
- Click the model picker and click "Manage Models...".
- Select "Hugging Face" provider.
- Provide your Hugging Face token; you can create one on your settings page. It only needs the `inference.serverless` permission.
- Choose the models you want to add to the model picker. 🥳
- Access SoTA open-source LLMs with tool calling capabilities.
- Single API to switch between multiple providers: Cerebras, Cohere, Fireworks AI, Groq, HF Inference, Hyperbolic, Nebius, Novita, Nscale, SambaNova, Together AI, and more. See the full list of partners in the Inference Providers docs.
- Built for high availability (across providers) and low latency.
- Transparent pricing: what the provider charges is what you pay.
💡 The free Hugging Face user tier gives you a small amount of monthly inference credits to experiment with. Upgrade to Hugging Face PRO or Enterprise to get $2 in monthly credits plus pay-as-you-go access across all providers!
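Outside VS Code, the same router can be reached from any OpenAI-compatible client. Below is a minimal sketch, assuming the `https://router.huggingface.co/v1/chat/completions` endpoint and the `moonshotai/Kimi-K2-Instruct` model id (check the Inference Providers docs for the exact values); switching providers or models only changes the `model` string.

```python
import json
import os
import urllib.request

# Assumption: OpenAI-compatible router endpoint for Inference Providers.
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"

def build_payload(model: str, prompt: str) -> dict:
    # Standard OpenAI-style chat completion body; the provider is picked
    # (or routed automatically) based on the model string alone.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

if os.environ.get("HF_TOKEN"):  # only send a request when a token is set
    payload = build_payload("moonshotai/Kimi-K2-Instruct", "Hello!")
    req = urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['HF_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```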
- VS Code 1.104.0 or higher.
- Hugging Face access token with `inference.serverless` permissions.
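To confirm a token is valid before wiring it into the extension, you can query the Hub's `whoami-v2` endpoint (an assumption; verify the route against the Hugging Face Hub API docs). A small sketch:

```python
import json
import os
import urllib.request

def build_request(token: str) -> urllib.request.Request:
    # Hypothetical check: whoami-v2 returns account and token metadata
    # for the bearer token, which lets you confirm its scopes.
    return urllib.request.Request(
        "https://huggingface.co/api/whoami-v2",
        headers={"Authorization": f"Bearer {token}"},
    )

if os.environ.get("HF_TOKEN"):  # only call out when a token is set
    with urllib.request.urlopen(build_request(os.environ["HF_TOKEN"])) as resp:
        print(json.load(resp).get("auth"))
```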
```shell
git clone https://github.com/huggingface/huggingface-vscode-chat
cd huggingface-vscode-chat
npm install
npm run compile
```
Press F5 to launch an Extension Development Host.
Common scripts:
- Build: `npm run compile`
- Watch: `npm run watch`
- Lint: `npm run lint`
- Format: `npm run format`
- Inference Providers documentation: https://huggingface.co/docs/inference-providers/index
- VS Code Chat Provider API: https://code.visualstudio.com/api/extension-guides/ai/language-model-chat-provider
- Open issues: https://github.com/huggingface/huggingface-vscode-chat/issues
- License: MIT, Copyright (c) 2025 Hugging Face