
feat: Support differentiated billing by cache TTL; provider/key-level cache TTL preference settings #277

@ding113


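Anthropic offers two cache TTLs, 5 minutes and 1 hour, and the two are billed at slightly different cache write and read rates.
By default, clients request the 5-minute cache. For long conversations, the 1-hour cache is more likely to be hit, which reduces the cost of creating cache entries.
Differentiated billing relies mainly on the usage field in the message_start event of the response body.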

event: message_start
data: {"type":"message_start","message":{"model":"claude-sonnet-4-5-20250929","id":"msg_013EzTQuQxsySLNwQeLiDHfH","type":"message","role":"assistant","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":0,"cache_creation_input_tokens":797,"cache_read_input_tokens":118215,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":797},"output_tokens":24,"service_tier":"standard"}}        }
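As a rough illustration of how the usage field above could drive per-TTL billing, here is a minimal Python sketch. The function names, the rate multipliers, and the base price are hypothetical placeholders and are not taken from this project or from an official price list. The fallback that treats an unbroken-down cache_creation_input_tokens total as 5-minute writes is also an assumption.

```python
import json
from decimal import Decimal

# Hypothetical multipliers relative to the base input price (placeholders only).
WRITE_5M_MULTIPLIER = Decimal("1.25")
WRITE_1H_MULTIPLIER = Decimal("2")
READ_MULTIPLIER = Decimal("0.1")


def parse_sse_data(line: str) -> dict:
    """Parse the JSON payload of a single SSE 'data:' line."""
    prefix = "data:"
    if not line.startswith(prefix):
        raise ValueError("not an SSE data line")
    return json.loads(line[len(prefix):])


def split_cache_tokens(usage: dict) -> dict:
    """Split cache tokens into 5m writes, 1h writes, and reads."""
    creation = usage.get("cache_creation") or {}
    five = creation.get("ephemeral_5m_input_tokens")
    hour = creation.get("ephemeral_1h_input_tokens")
    if five is None and hour is None:
        # Assumption: with no per-TTL breakdown, bill all writes as 5m.
        five, hour = usage.get("cache_creation_input_tokens", 0), 0
    return {
        "write_5m": five or 0,
        "write_1h": hour or 0,
        "read": usage.get("cache_read_input_tokens", 0),
    }


def cache_cost(usage: dict, base_price_per_mtok: Decimal) -> Decimal:
    """Cost of the cache-related tokens, given a base price per million tokens."""
    tokens = split_cache_tokens(usage)
    weighted = (
        tokens["write_5m"] * WRITE_5M_MULTIPLIER
        + tokens["write_1h"] * WRITE_1H_MULTIPLIER
        + tokens["read"] * READ_MULTIPLIER
    )
    return weighted * base_price_per_mtok / Decimal(1_000_000)


# Trimmed version of the message_start event shown above.
sample_line = (
    'data: {"type":"message_start","message":{"usage":{"input_tokens":0,'
    '"cache_creation_input_tokens":797,"cache_read_input_tokens":118215,'
    '"cache_creation":{"ephemeral_5m_input_tokens":0,'
    '"ephemeral_1h_input_tokens":797},"output_tokens":24}}}'
)
sample_usage = parse_sse_data(sample_line)["message"]["usage"]
sample_tokens = split_cache_tokens(sample_usage)
```

For the event above, this splits the 797 cache-creation tokens entirely into the 1-hour bucket, so they would be billed at the 1-hour write rate rather than the 5-minute one.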


Labels: enhancement

Status: Done
