### 🚀 The feature, motivation and pitch

Similar to https://github.com/vllm-project/vllm/pull/22036, we can optimize the `reshape_and_cache` CUDA kernel. Pick it up if you are interested.