
Add Venice.ai as provider #2239

@faces-of-eth

Description

Please explain the motivation behind the feature request

We need to add Venice.ai as a provider to enhance our AI capabilities while maintaining strong privacy guarantees and transparency. Venice.ai offers several unique advantages:

  1. Privacy-First: Unlike many AI providers, Venice.ai is built with privacy as a core principle and does not store or utilize user data.
  2. Open Source Foundation: All models are open-source, providing full transparency and auditability.
  3. Diverse Model Selection: Access to powerful models like Llama 3.3 (70B), Mistral 3.1 (24B), and specialized models for coding (Qwen 2.5 Coder) and vision tasks (Qwen 2.5 VL).
  4. OpenAI API Compatibility: Easy integration with existing OpenAI-based implementations.

This addition would give users more control over their data while maintaining high-quality AI capabilities.
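
To make the OpenAI-compatibility point concrete, here is a minimal sketch using the openai npm package with Venice as the backend. The base URL is my best guess at Venice's endpoint and should be verified against their docs:

```ts
import OpenAI from "openai";

// Minimal sketch: only the baseURL changes relative to a stock OpenAI setup.
// "https://api.venice.ai/api/v1" is assumed here; confirm against Venice's docs.
const venice = new OpenAI({
  apiKey: process.env.VENICE_API_KEY,
  baseURL: "https://api.venice.ai/api/v1",
});

const completion = await venice.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [{ role: "user", content: "Hello from Goose!" }],
});

console.log(completion.choices[0].message.content);
```

Only the base URL and API key change relative to an existing OpenAI integration, which is what keeps the integration cost low.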

Venice is also Agent-first, which is a good fit for Goose.

Autonomous AI agents can use Venice.ai's APIs programmatically, with no human in the loop. An agent can manage its own wallet on the BASE blockchain, acquire and stake VVV tokens to earn a daily VCU inference allocation, and then use Venice's new "api_keys" endpoint to generate its own API key, as sketched below.
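
A self-provisioning flow might look roughly like this. Only the "api_keys" endpoint name comes from Venice's announcement; the request and response fields below are hypothetical placeholders, not Venice's documented schema:

```ts
// Hypothetical sketch of an agent minting its own key.
// Endpoint name from Venice's announcement; the payload and response
// fields ("description", "apiKey") are placeholders, not documented schema.
async function provisionAgentKey(adminKey: string): Promise<string> {
  const res = await fetch("https://api.venice.ai/api/v1/api_keys", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${adminKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ description: "goose-agent" }),
  });
  if (!res.ok) throw new Error(`Key provisioning failed: ${res.status}`);
  const data = await res.json();
  return data.apiKey;
}
```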

Describe the solution you'd like

Implement Venice.ai as a provider with the following features:

  1. Provider Configuration (the sketch after this list shows how these fields might be consumed):

```ts
interface VeniceConfig {
  apiKey: string;
  baseUrl?: string;      // optional; defaults to Venice's API endpoint
  defaultModel?: string; // optional; defaults to "llama-3.3-70b"
}
```
  2. Model Support Matrix:
  • General Purpose: llama-3.3-70b (65K context), llama-3.1-405b (65K context)
  • Code Optimization: qwen-2.5-coder-32b (32K context)
  • Vision Capabilities: qwen-2.5-vl (32K context)
  • Fast Inference: llama-3.2-3b (131K context)
  • Reasoning: deepseek-r1-671b (131K context)
  3. Feature Support (function calling is sketched after this list):
  • Function Calling
  • Response Schema Support
  • Web Search Integration
  • Vision Processing
  • Extended Context Windows (up to 131K tokens)
  4. Integration with existing authentication and configuration systems
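
Because the API is OpenAI-compatible, the provider can likely be a thin wrapper over an OpenAI client, and function calling should ride on the standard tools parameter. A sketch under those assumptions; it reuses the VeniceConfig interface above, the fallback base URL is unverified, and get_weather is a made-up example tool:

```ts
import OpenAI from "openai";

// Resolve defaults from the VeniceConfig interface defined above.
// The fallback baseURL is an assumption; confirm against Venice's docs.
function clientFrom(cfg: VeniceConfig): OpenAI {
  return new OpenAI({
    apiKey: cfg.apiKey,
    baseURL: cfg.baseUrl ?? "https://api.venice.ai/api/v1",
  });
}

const venice = clientFrom({ apiKey: process.env.VENICE_API_KEY! });

// Standard OpenAI-style tool definition; "get_weather" is illustrative only.
const response = await venice.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [{ role: "user", content: "What's the weather in Lisbon?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Look up current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
});

// If the model opted to call the tool, the call shows up here:
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) console.log(toolCall.function.name, toolCall.function.arguments);
```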

Describe alternatives you've considered

  1. Using individual open-source models directly:
  • Pros: Maximum control and customization
  • Cons: Higher maintenance overhead, need for own infrastructure
  2. Other privacy-focused providers:
  • Pros: Similar privacy guarantees
  • Cons: Smaller model selection, less specialized models
  3. Self-hosting open-source models:
  • Pros: Complete control over data and models
  • Cons: Significant infrastructure costs, maintenance complexity

Additional context

Key Technical Details:

  • All models support temperature and top_p adjustments
  • Context windows range from 32K to 131K tokens
  • Specialized models for different use cases (coding, vision, reasoning)
  • OpenAI-compatible API makes integration straightforward
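
For example, the sampling knobs pass straight through on an OpenAI-style request (values below are arbitrary; the client setup matches the earlier sketches, including the assumed base URL):

```ts
import OpenAI from "openai";

const venice = new OpenAI({
  apiKey: process.env.VENICE_API_KEY,
  baseURL: "https://api.venice.ai/api/v1", // assumed endpoint, as above
});

const completion = await venice.chat.completions.create({
  model: "qwen-2.5-coder-32b",
  messages: [{ role: "user", content: "Write a binary search in Python." }],
  temperature: 0.2, // lower = more deterministic; arbitrary example value
  top_p: 0.9,       // nucleus-sampling cutoff; arbitrary example value
});
```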

Implementation Priority:

  1. Core API integration with general-purpose models
  2. Specialized model support (code, vision)
  3. Advanced features (function calling, web search)

Model Capabilities Overview:

  • Most Powerful: llama-3.1-405b (405B parameters)
  • Most Versatile: mistral-31-24b (vision, function calling, web search)
  • Fastest: llama-3.2-3b
  • Best for Code: qwen-2.5-coder-32b
  • Vision Support: qwen-2.5-vl
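
Taken together with the support matrix above, the provider could ship a small static registry like this (context windows and capability flags copied from this issue; mistral-31-24b's window isn't listed here, so it is left unset, and everything should be re-checked against Venice's live model list):

```ts
// Figures copied from this issue; verify against Venice's model list.
interface ModelInfo {
  contextTokens?: number; // unset where this issue doesn't list a figure
  vision?: boolean;
  functionCalling?: boolean;
  webSearch?: boolean;
}

const VENICE_MODELS: Record<string, ModelInfo> = {
  "llama-3.3-70b":      { contextTokens: 65_000 },
  "llama-3.1-405b":     { contextTokens: 65_000 },
  "qwen-2.5-coder-32b": { contextTokens: 32_000 },
  "qwen-2.5-vl":        { contextTokens: 32_000, vision: true },
  "llama-3.2-3b":       { contextTokens: 131_000 },
  "deepseek-r1-671b":   { contextTokens: 131_000 },
  "mistral-31-24b":     { vision: true, functionCalling: true, webSearch: true },
};
```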
  • I have verified this does not duplicate an existing feature request
