[RFC][Store]Provide BatchPut() and BatchGet() interfaces for fine-grained KV Slices #380

@zhaoyongke

Description

Mooncake Store currently provides only single-key put()/get() interfaces, which is fine for saving or loading the KV cache of a few tokens at a time.

In vLLM and SGLang, one optimization is to transfer KVs layer by layer so that computation and communication overlap better.

vllm offloading

sglang cache controller

However, when KVs are split by layer, the size of each value decreases sharply and the number of requests increases accordingly.
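To make the trade-off concrete, here is a small back-of-the-envelope sketch. The layer count and block size are hypothetical numbers chosen only for illustration; they are not measurements from Mooncake, vLLM, or SGLang.

```python
# Hypothetical numbers, chosen only to illustrate the trend.
num_layers = 32
block_bytes = 4 * 1024 * 1024  # KV bytes of one token block across all layers

# One key per block: a single request carries the whole block.
requests_whole = 1
bytes_per_request_whole = block_bytes

# One key per (block, layer): same total bytes, spread over more requests.
requests_layered = num_layers
bytes_per_request_layered = block_bytes // num_layers

print(requests_whole, bytes_per_request_whole)
print(requests_layered, bytes_per_request_layered)
```

Under these assumptions the total payload is unchanged, but each request becomes 32 times smaller and there are 32 times as many of them, so fixed per-request overhead dominates.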

To reduce the overhead of single-key put()/get() calls, we could provide batched Python interfaces such as:

def batch_put(key : list[str], value : list[bytes]):
    ...
def batch_get(key : list[str]) -> list[bytes]:
    ...
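As a rough illustration of the intended call pattern, the sketch below uses an in-memory dictionary as a stand-in for the store. `InMemoryStore`, the `round_trips` counter, the plural parameter names, and the `req0/layer{i}` key scheme are all hypothetical and are not part of the Mooncake Store API.

```python
class InMemoryStore:
    """Hypothetical stand-in for a store client; not the real Mooncake API."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}
        self.round_trips = 0  # counts simulated requests to the store

    def put(self, key: str, value: bytes) -> None:
        self.round_trips += 1
        self._data[key] = value

    def get(self, key: str) -> bytes:
        self.round_trips += 1
        return self._data[key]

    def batch_put(self, keys: list[str], values: list[bytes]) -> None:
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        self.round_trips += 1  # one request carries the whole batch
        self._data.update(zip(keys, values))

    def batch_get(self, keys: list[str]) -> list[bytes]:
        self.round_trips += 1  # one request returns the whole batch
        return [self._data[k] for k in keys]


# Usage: store and fetch the per-layer KV slices of one request.
store = InMemoryStore()
layer_keys = [f"req0/layer{i}" for i in range(4)]
layer_values = [bytes([i]) * 8 for i in range(4)]
store.batch_put(layer_keys, layer_values)
fetched = store.batch_get(layer_keys)
```

With single-key calls, the same four layers would cost eight simulated round trips instead of two.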

Adding the batch_put() and batch_get() interfaces requires some considerations:

  • Should be callable asynchronously
  • Should report a detailed status for each key/value
  • Should update metrics correctly
  • (Optional) Better utilization of storage and network bandwidth
  • (Optional) Auto-batching that keeps the higher-level API unchanged
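One possible shape for the first three points is sketched below, assuming an asyncio-based client. `AsyncBatchStore`, `Status`, `GetResult`, and the metric names are hypothetical illustrations, and the in-memory dictionary stands in for the real storage backend.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"
    ERROR = "error"


@dataclass
class GetResult:
    key: str
    status: Status
    value: Optional[bytes] = None


class AsyncBatchStore:
    """Hypothetical async store with per-key status and simple counters."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}
        self.metrics = {"put_ok": 0, "put_error": 0, "get_ok": 0, "get_miss": 0}

    async def batch_put(self, keys: list[str], values: list[bytes]) -> list[Status]:
        if len(keys) != len(values):
            raise ValueError("keys and values must have the same length")
        await asyncio.sleep(0)  # yields to the event loop; stands in for I/O
        statuses = []
        for key, value in zip(keys, values):
            if isinstance(value, (bytes, bytearray)):
                self._data[key] = bytes(value)
                self.metrics["put_ok"] += 1
                statuses.append(Status.OK)
            else:
                # A bad value fails only its own key, not the whole batch.
                self.metrics["put_error"] += 1
                statuses.append(Status.ERROR)
        return statuses

    async def batch_get(self, keys: list[str]) -> list[GetResult]:
        await asyncio.sleep(0)  # yields to the event loop; stands in for I/O
        results = []
        for key in keys:
            if key in self._data:
                self.metrics["get_ok"] += 1
                results.append(GetResult(key, Status.OK, self._data[key]))
            else:
                self.metrics["get_miss"] += 1
                results.append(GetResult(key, Status.NOT_FOUND))
        return results


async def demo():
    store = AsyncBatchStore()
    put_statuses = await store.batch_put(["k0", "k1"], [b"a", b"b"])
    results = await store.batch_get(["k0", "missing", "k1"])
    return store, put_statuses, results


store, put_statuses, results = asyncio.run(demo())
```

Returning one status per key, in input order, lets a caller retry or recompute only the slices that failed, and updating the counters per key rather than per batch keeps hit/miss metrics comparable with the single-key path.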

Co-design, reviews, contributions, and PRs are welcome!
