feat: AI proxy Wasm plugin supports Together AI #1617

VinciWu557 · 2024-12-22T14:57:27Z

Ⅰ. Describe what this PR did

Adds support for Together AI chat completion models to the AI proxy plugin. API doc: https://docs.together.ai/docs/chat-overview

Ⅱ. Does this pull request fix one issue?

fixes #964

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:v1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: debug-level logging is enabled for wasm here; production deployments default to info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm

  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"

networks:
  wasmtest: {}

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: together_ai
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "activeProviderId":"together-ai",
                                "providers": [
                                  {
                                    "id": "together-ai",
                                    "type": "together-ai",
                                    "domain": "api.together.xyz",
                                    "apiTokens": [
                                      "xxx"
                                    ],
                                    "modelMapping": {
                                      "*": "Qwen/Qwen2.5-72B-Instruct-Turbo"
                                    }
                                  }
                                ]
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: together_ai
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: together_ai
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.together.xyz
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.together.xyz"
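
The `modelMapping` entry above uses `"*"` as a catch-all key. As a rough illustration of what such a mapping implies, here is a minimal Go sketch. `resolveModel` is a hypothetical helper written for this note, not the plugin's actual code, and it assumes only exact-match keys plus the `"*"` fallback; the real plugin may support more matching rules.

```go
package main

import "fmt"

// resolveModel is a hypothetical helper (not taken from the plugin source).
// It returns the upstream model name for a requested model, assuming:
//   1. an exact key match wins,
//   2. otherwise the "*" wildcard entry is used,
//   3. otherwise the requested name is passed through unchanged.
func resolveModel(mapping map[string]string, requested string) string {
	if target, ok := mapping[requested]; ok && target != "" {
		return target
	}
	if target, ok := mapping["*"]; ok && target != "" {
		return target
	}
	return requested
}

func main() {
	// Same mapping as in the plugin configuration above.
	mapping := map[string]string{"*": "Qwen/Qwen2.5-72B-Instruct-Turbo"}
	fmt.Println(resolveModel(mapping, "gpt-4o"))
}
```

Under these assumptions, any requested model name is rewritten to `Qwen/Qwen2.5-72B-Instruct-Turbo` with the configuration shown, because only the wildcard key is defined.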

Test request:

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct-Turbo",
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ]
  }'

Response:

{
  "id": "8f60ee79ed51f8f2",
  "object": "chat.completion",
  "created": 1734879103,
  "model": "Qwen/Qwen2.5-72B-Instruct-Turbo",
  "prompt": [],
  "choices": [
    {
      "finish_reason": "eos",
      "seed": 4673125895036746000,
      "logprobs": null,
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I am Qwen, a large language model created by Alibaba Cloud. I am designed to assist users in generating various types of text, such as articles, stories, poems, and more, as well as to answer questions and engage in conversations. How can I assist you today?",
        "tool_calls": []
      }
    }
  ],
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 57,
    "total_tokens": 90
  }
}
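
To check a response like the one above programmatically, a client could decode just the fields it needs. The sketch below is illustrative only: `chatCompletion` and `firstReply` are hypothetical names, and the struct covers a subset of the fields shown in the sample response (extra fields such as `seed` and `prompt` are ignored by the decoder).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatCompletion models a subset of the response shown above.
type chatCompletion struct {
	ID      string `json:"id"`
	Model   string `json:"model"`
	Choices []struct {
		FinishReason string `json:"finish_reason"`
		Message      struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

// firstReply (hypothetical helper) returns the first choice's message content
// and the total token count, or an error if the body has no choices.
func firstReply(raw []byte) (string, int, error) {
	var c chatCompletion
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", 0, err
	}
	if len(c.Choices) == 0 {
		return "", 0, fmt.Errorf("response contains no choices")
	}
	return c.Choices[0].Message.Content, c.Usage.TotalTokens, nil
}

func main() {
	// Abbreviated version of the sample response.
	raw := []byte(`{"model":"Qwen/Qwen2.5-72B-Instruct-Turbo","choices":[{"finish_reason":"eos","index":0,"message":{"role":"assistant","content":"I am Qwen."}}],"usage":{"prompt_tokens":33,"completion_tokens":57,"total_tokens":90}}`)
	reply, total, err := firstReply(raw)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(reply, total)
}
```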

Ⅴ. Special notes for reviews

CLAassistant · 2024-12-22T14:57:33Z

All committers have signed the CLA.

CH3CHO · 2024-12-23T02:18:45Z

plugins/wasm-go/extensions/ai-proxy/provider/together_ai.go

+	return providerTypeTogetherAI
+}
+
+func (m *togetherAIProvider) OnRequestHeaders(ctx wrapper.HttpContext, apiName ApiName, log wrapper.Log) (types.Action, error) {


This function's signature does not match the one defined in the RequestHeadersHandler interface.

OK, I'll take a look.
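
A mismatch like this can be caught at compile time with an interface assertion. The sketch below uses stand-in types only: `httpContext`, `apiName`, `requestHeadersHandler`, and `exampleProvider` are hypothetical and do not reproduce the real `RequestHeadersHandler` definition, whose exact signature is not shown in this thread. The error-only return value is an assumption made for the example.

```go
package main

import "fmt"

// Stand-ins for the plugin's context and API name types (hypothetical).
type httpContext interface{}
type apiName string

// requestHeadersHandler is an illustrative interface; the parameters and the
// error-only return value are assumptions, not the real definition.
type requestHeadersHandler interface {
	OnRequestHeaders(ctx httpContext, name apiName) error
}

type exampleProvider struct{}

// OnRequestHeaders matches the interface above exactly. If it returned
// (int, error) instead, the assertion below would fail to compile.
func (p *exampleProvider) OnRequestHeaders(ctx httpContext, name apiName) error {
	return nil
}

// Compile-time check that *exampleProvider satisfies requestHeadersHandler.
var _ requestHeadersHandler = (*exampleProvider)(nil)

func main() {
	var h requestHeadersHandler = &exampleProvider{}
	fmt.Println(h.OnRequestHeaders(nil, "chatCompletion") == nil)
}
```

Without such an assertion, a provider whose method set differs from the interface simply does not satisfy it, and code that type-asserts to the handler interface at runtime would silently skip the provider.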

CH3CHO

LGTM. Thanks.

codecov-commenter · 2024-12-23T07:48:12Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.50%. Comparing base (ef31e09) to head (86cf20f).
Report is 240 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1617      +/-   ##
==========================================
+ Coverage   35.91%   43.50%   +7.59%     
==========================================
  Files          69       76       +7     
  Lines       11576    12325     +749     
==========================================
+ Hits         4157     5362    +1205     
+ Misses       7104     6627     -477     
- Partials      315      336      +21

see 69 files with indirect coverage changes

feat: integrate Together AI

930fa28

VinciWu557 requested review from cr7258, CH3CHO and rinfx as code owners December 22, 2024 14:57

Merge branch 'main' into feat/ai-proxy-support-together-ai

5eb5cbe

CH3CHO requested changes Dec 23, 2024

fix: update OnRequestHeaders function signature

86cf20f

CH3CHO approved these changes Dec 23, 2024

CH3CHO merged commit 909cc0f into alibaba:main Dec 23, 2024
7 of 8 checks passed
