[RFC][Store]Provide BatchPut() and BatchGet() interfaces for fine-grained KV Slices

Mooncake Store currently provides only single-key put()/get() interfaces, which is fine for saving or loading the KV cache of a few tokens.

In vLLM and SGLang, one optimization is to transfer KVs layer by layer to better overlap computation and communication:

[vllm offloading](https://github.com/vllm-project/vllm/issues/15123)

[sglang cache controller](https://github.com/sgl-project/sglang/blob/b146555749f84a684c7cf5e9d2950ca474b82de2/python/sglang/srt/managers/cache_controller.py#L312)

However, when KVs are split by layer, the size of each value decreases sharply and the number of requests increases accordingly.
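As a rough illustration of that trade-off, the sketch below computes the value size and request count for one token block, stored whole versus split by layer. The model dimensions (32 layers, 8 KV heads, head dimension 128, fp16, 16 tokens per block) and the `KVShape` helper are assumptions chosen for the example, not figures taken from Mooncake, vLLM, or SGLang.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KVShape:
    """Hypothetical KV cache dimensions, for illustration only."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    dtype_bytes: int
    tokens_per_block: int

    def bytes_per_layer(self) -> int:
        # Factor 2 covers the K and V tensors of one layer for one token block.
        return (2 * self.tokens_per_block * self.num_kv_heads
                * self.head_dim * self.dtype_bytes)

    def bytes_all_layers(self) -> int:
        # Size of a single value that holds every layer of the block.
        return self.num_layers * self.bytes_per_layer()


# Assumed example dimensions (fp16 means 2 bytes per element).
shape = KVShape(num_layers=32, num_kv_heads=8, head_dim=128,
                dtype_bytes=2, tokens_per_block=16)

whole_block = shape.bytes_all_layers()
per_layer = shape.bytes_per_layer()

print(f"whole block : 1 request of {whole_block} bytes")
print(f"layer-wise  : {shape.num_layers} requests of {per_layer} bytes each")
```

Under these assumed dimensions, one 2 MiB value becomes 32 values of 64 KiB each, so any fixed per-request cost is paid 32 times for the same amount of data.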

To reduce the overhead of single-key put()/get() calls, we could provide batched Python interfaces such as:

```
def batch_put(keys: list[str], values: list[bytes]):
    ...
def batch_get(keys: list[str]) -> list[bytes]:
    ...
```
Adding the batch_put() and batch_get() interfaces raises several considerations:

- Should be callable asynchronously
- Should provide a detailed status for each key/value
- Should update metrics correctly
- (Optional) Better utilization of storage and network bandwidth
- (Optional) Auto-batching that keeps the higher-level API unchanged
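To make the first two considerations concrete, here is a minimal sketch of what an asynchronous batch interface with per-key status could look like. Everything in it is hypothetical: `InMemoryStore` is a dict-backed stand-in for a single-key store, and `Status`, `KeyResult`, and the `store` parameter are names invented for the example rather than part of Mooncake Store. The sketch only fans out single-key calls, so it shows the interface shape and not the request coalescing a real implementation would need to actually cut per-request overhead.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"
    ERROR = "error"


@dataclass
class KeyResult:
    """Per-key outcome, so one failed key does not hide the others."""
    key: str
    status: Status
    value: Optional[bytes] = None


class InMemoryStore:
    """Hypothetical stand-in for a store with single-key put()/get()."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


async def batch_put(store: InMemoryStore, keys: list[str],
                    values: list[bytes]) -> list[KeyResult]:
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")

    async def put_one(key: str, value: bytes) -> KeyResult:
        try:
            # Run the blocking call in a worker thread so the caller can await it.
            await asyncio.to_thread(store.put, key, value)
            return KeyResult(key, Status.OK)
        except Exception:
            return KeyResult(key, Status.ERROR)

    # gather() returns results in the same order as the input keys.
    return list(await asyncio.gather(*(put_one(k, v) for k, v in zip(keys, values))))


async def batch_get(store: InMemoryStore, keys: list[str]) -> list[KeyResult]:
    async def get_one(key: str) -> KeyResult:
        try:
            value = await asyncio.to_thread(store.get, key)
        except Exception:
            return KeyResult(key, Status.ERROR)
        if value is None:
            return KeyResult(key, Status.NOT_FOUND)
        return KeyResult(key, Status.OK, value)

    return list(await asyncio.gather(*(get_one(k) for k in keys)))


async def demo() -> None:
    store = InMemoryStore()
    await batch_put(store, ["layer0", "layer1"], [b"kv0", b"kv1"])
    for result in await batch_get(store, ["layer0", "missing", "layer1"]):
        print(result.key, result.status.value)


if __name__ == "__main__":
    asyncio.run(demo())
```

Returning one result per key, in input order, lets a caller retry or recompute only the layers that failed instead of treating the whole batch as lost.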

Co-design, reviews, contributions, and PRs are all welcome!
