Description
Please explain the motivation behind the feature request
We need to add Venice.ai as a provider to enhance our AI capabilities while maintaining strong privacy guarantees and transparency. Venice.ai offers several unique advantages:
- Privacy-First: Unlike many AI providers, Venice.ai is built with privacy as a core principle and does not store or utilize user data.
- Open Source Foundation: All models are open-source, providing full transparency and auditability.
- Diverse Model Selection: Access to powerful models like Llama 3.3 (70B), Mistral 3.1 (24B), and specialized models for coding (Qwen 2.5 Coder) and vision tasks (Qwen 2.5 VL).
- OpenAI API Compatibility: Easy integration with existing OpenAI-based implementations.
This addition would give users more control over their data while maintaining high-quality AI capabilities.
Venice is also Agent-first, which is a good fit for Goose.
Autonomous AI agents can access Venice.ai's APIs programmatically, without any human interaction, via the `api_keys` endpoint. Agents can manage their own wallets on the BASE blockchain, programmatically acquiring and staking VVV tokens to earn a daily VCU inference allocation, and Venice's new API endpoint lets them automate further by generating their own API keys.
Describe the solution you'd like
Implement Venice.ai as a provider with the following features:
- Provider Configuration:
  ```ts
  interface VeniceConfig {
    apiKey: string;
    baseUrl: string;       // defaults to Venice's API endpoint
    defaultModel?: string; // defaults to "llama-3.3-70b"
  }
  ```
- Model Support Matrix:
  - General Purpose: `llama-3.3-70b` (65K context), `llama-3.1-405b` (65K context)
  - Code Optimization: `qwen-2.5-coder-32b` (32K context)
  - Vision Capabilities: `qwen-2.5-vl` (32K context)
  - Fast Inference: `llama-3.2-3b` (131K context)
  - Reasoning: `deepseek-r1-671b` (131K context)
- Feature Support:
- Function Calling
- Response Schema Support
- Web Search Integration
- Vision Processing
- Extended Context Windows (up to 131K tokens)
- Integration with existing authentication and configuration systems
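As a rough illustration of the integration, here is a minimal sketch of how a provider could build an OpenAI-style chat completion request from the `VeniceConfig` above. The endpoint path and the `buildChatRequest` helper are assumptions based on Venice.ai's OpenAI API compatibility, not a confirmed Goose implementation:

```ts
// Hypothetical sketch: build an OpenAI-compatible chat completion request
// for Venice.ai. The /chat/completions path is assumed from Venice's
// stated OpenAI API compatibility.

interface VeniceConfig {
  apiKey: string;
  baseUrl: string;       // e.g. Venice's API endpoint
  defaultModel?: string; // falls back to "llama-3.3-70b"
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(
  config: VeniceConfig,
  messages: ChatMessage[],
  model?: string
) {
  return {
    url: `${config.baseUrl}/chat/completions`,
    headers: {
      Authorization: `Bearer ${config.apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      // Explicit model > configured default > documented fallback
      model: model ?? config.defaultModel ?? "llama-3.3-70b",
      messages,
    }),
  };
}
```

The request object could then be passed to any HTTP client already used by the existing providers, which is what keeps the integration surface small.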
Describe alternatives you've considered
- Using individual open-source models directly:
- Pros: Maximum control and customization
- Cons: Higher maintenance overhead, need for own infrastructure
- Other privacy-focused providers:
- Pros: Similar privacy guarantees
- Cons: Smaller model selection, less specialized models
- Self-hosting open-source models:
- Pros: Complete control over data and models
- Cons: Significant infrastructure costs, maintenance complexity
Additional context
Key Technical Details:
- All models support temperature and top_p adjustments
- Context windows range from 32K to 131K tokens
- Specialized models for different use cases (coding, vision, reasoning)
- OpenAI-compatible API makes integration straightforward
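Since all models accept `temperature` and `top_p`, the provider will likely want to validate those before sending a request. A small sketch, assuming the common OpenAI-style ranges (temperature 0–2, top_p 0–1; Venice's exact accepted ranges would need to be confirmed):

```ts
// Hypothetical helper: clamp sampling parameters into OpenAI-convention
// ranges before building the request body. Ranges are an assumption,
// not taken from Venice.ai documentation.

interface SamplingOptions {
  temperature?: number;
  top_p?: number;
}

function normalizeSampling(opts: SamplingOptions): SamplingOptions {
  const clamp = (v: number, lo: number, hi: number) =>
    Math.min(hi, Math.max(lo, v));
  return {
    temperature:
      opts.temperature === undefined ? undefined : clamp(opts.temperature, 0, 2),
    top_p: opts.top_p === undefined ? undefined : clamp(opts.top_p, 0, 1),
  };
}
```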
Implementation Priority:
- Core API integration with general-purpose models
- Specialized model support (code, vision)
- Advanced features (function calling, web search)
Model Capabilities Overview:
- Most Powerful: `llama-3.1-405b` (405B parameters)
- Most Versatile: `mistral-31-24b` (vision, function calling, web search)
- Fastest: `llama-3.2-3b`
- Best for Code: `qwen-2.5-coder-32b`
- Vision Support: `qwen-2.5-vl`
- I have verified this does not duplicate an existing feature request