Commit b8b6adb

Address review comments
1 parent 19c9516 commit b8b6adb

3 files changed: 9 additions and 9 deletions

doc/en/mooncake-store.md

Lines changed: 3 additions & 3 deletions
@@ -398,11 +398,11 @@ Importantly, the memory managed by the buffer allocator does not reside within t
 Mooncake Store provides two concrete implementations of `BufferAllocatorBase`:

-**OffsetBufferAllocator**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports fast hard realtime `O(1)` offset allocation with minimal fragmentation.
+**OffsetBufferAllocator (default and recommended)**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports fast hard realtime `O(1)` offset allocation with minimal fragmentation. Mooncake Store optimizes this allocator based on the specific memory usage characteristics of LLM inference workloads, thereby enhancing memory utilization in LLM scenarios.

-**CachelibBufferAllocator**: This allocator leverages Facebook's [CacheLib](https://github.com/facebook/CacheLib) to manage memory using a slab-based allocation strategy. It provides efficient memory allocation with good fragmentation resistance and is well-suited for high-performance scenarios.
+**CachelibBufferAllocator (deprecated)**: This allocator leverages Facebook's [CacheLib](https://github.com/facebook/CacheLib) to manage memory using a slab-based allocation strategy. It provides efficient memory allocation with good fragmentation resistance and is well-suited for high-performance scenarios. However, in our modified version, it does not handle workloads with highly variable object sizes effectively, so it is currently marked as deprecated.

-Mooncake Store optimizes both allocators based on the specific memory usage characteristics of LLM inference workloads, thereby enhancing memory utilization in LLM scenarios. Users can choose the allocator that best matches their performance and memory usage requirements through the `--memory-allocator` startup parameter of `master_service`. The default and recommended option is OffsetBufferAllocator.
+Users can choose the allocator that best matches their performance and memory usage requirements through the `--memory-allocator` startup parameter of `master_service`.

 Both allocators implement the same interface as `BufferAllocatorBase`. The main interfaces of the `BufferAllocatorBase` class are as follows:
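
Reviewer's illustration (not part of the commit): the deprecation note says the slab-based allocator copes poorly with highly variable object sizes. One common reason slab schemes struggle with such workloads is that each request is rounded up to a fixed size class, so odd-sized objects lose space to rounding. Whether this is the actual cause in Mooncake's modified CacheLib allocator is an assumption. The size classes and function names below are hypothetical and are not taken from CacheLib or Mooncake Store.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size classes; a real slab allocator configures its own.
inline const std::vector<std::size_t>& SizeClasses() {
    static const std::vector<std::size_t> classes = {64, 128, 256, 512,
                                                     1024, 2048, 4096};
    return classes;
}

// Returns the smallest size class that fits `size`, or 0 if none does.
inline std::size_t RoundUpToClass(std::size_t size) {
    for (std::size_t c : SizeClasses()) {
        if (size <= c) return c;
    }
    return 0;
}

// Bytes lost to rounding for a single allocation (internal fragmentation).
// Returns 0 when the request does not fit any class.
inline std::size_t WastedBytes(std::size_t size) {
    const std::size_t c = RoundUpToClass(size);
    return c == 0 ? 0 : c - size;
}
```

With these illustrative classes, a 65-byte object occupies a 128-byte slot, so almost half of the slot is unused, while a 64-byte object wastes nothing. A workload whose sizes fall just above class boundaries pays this cost on every allocation.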

doc/zh/mooncake-store.md

Lines changed: 3 additions & 3 deletions
@@ -404,11 +404,11 @@ tl::expected<void, ErrorCode> Remove(const std::string& key);
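 Mooncake Store provides two concrete implementations of `BufferAllocatorBase`:

-**OffsetBufferAllocator**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports `O(1)` offset allocation for workloads with strict real-time requirements and minimizes memory fragmentation.
+**OffsetBufferAllocator (default and recommended)**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports `O(1)` offset allocation for workloads with strict real-time requirements and minimizes memory fragmentation. Mooncake Store has optimized this allocator for the specific memory usage characteristics of LLM inference workloads, improving memory utilization in LLM scenarios.

-**CachelibBufferAllocator**: This allocator is based on Facebook's [CacheLib](https://github.com/facebook/CacheLib) and manages memory with a slab-based allocation strategy. It offers good fragmentation control and is suited to high-performance scenarios.
+**CachelibBufferAllocator (not recommended)**: This allocator is based on Facebook's [CacheLib](https://github.com/facebook/CacheLib) and manages memory with a slab-based allocation strategy. It offers good fragmentation control and is suited to high-performance scenarios. However, our modified version currently performs poorly on workloads whose object sizes vary sharply, so it is marked as not recommended for now.

-Mooncake Store has optimized both allocators for the specific memory usage characteristics of LLM inference workloads, improving memory utilization in LLM scenarios. Users can select the allocator that best fits their performance needs and memory usage patterns through the `--memory-allocator` startup parameter of `master_service`. The default and recommended option is **OffsetBufferAllocator**.
+Users can select the allocator that best fits their performance needs and memory usage patterns through the `--memory-allocator` startup parameter of `master_service`.

 Both allocators implement the `BufferAllocatorBase` interface. The main interfaces of the `BufferAllocatorBase` class are as follows: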

docs/source/design/mooncake-store.md

Lines changed: 3 additions & 3 deletions
@@ -398,11 +398,11 @@ Importantly, the memory managed by the buffer allocator does not reside within t
 Mooncake Store provides two concrete implementations of `BufferAllocatorBase`:

-**OffsetBufferAllocator**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports fast hard realtime `O(1)` offset allocation with minimal fragmentation.
+**OffsetBufferAllocator (default and recommended)**: This allocator is derived from [OffsetAllocator](https://github.com/sebbbi/OffsetAllocator), which uses a custom bin-based allocation strategy that supports fast hard realtime `O(1)` offset allocation with minimal fragmentation. Mooncake Store optimizes this allocator based on the specific memory usage characteristics of LLM inference workloads, thereby enhancing memory utilization in LLM scenarios.

-**CachelibBufferAllocator**: This allocator leverages Facebook's [CacheLib](https://github.com/facebook/CacheLib) to manage memory using a slab-based allocation strategy. It provides efficient memory allocation with good fragmentation resistance and is well-suited for high-performance scenarios.
+**CachelibBufferAllocator (deprecated)**: This allocator leverages Facebook's [CacheLib](https://github.com/facebook/CacheLib) to manage memory using a slab-based allocation strategy. It provides efficient memory allocation with good fragmentation resistance and is well-suited for high-performance scenarios. However, in our modified version, it does not handle workloads with highly variable object sizes effectively, so it is currently marked as deprecated.

-Mooncake Store optimizes both allocators based on the specific memory usage characteristics of LLM inference workloads, thereby enhancing memory utilization in LLM scenarios. Users can choose the allocator that best matches their performance and memory usage requirements through the `--memory-allocator` startup parameter of `master_service`. The default and recommended option is OffsetBufferAllocator.
+Users can choose the allocator that best matches their performance and memory usage requirements through the `--memory-allocator` startup parameter of `master_service`.

 Both allocators implement the same interface as `BufferAllocatorBase`. The main interfaces of the `BufferAllocatorBase` class are as follows:
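
Reviewer's illustration (not part of the commit): the diff describes the offset allocator as bin-based with `O(1)` allocation. A minimal sketch of the general idea follows, assuming power-of-two bins and a single 64-bit occupancy mask. OffsetAllocator itself uses a finer-grained binning scheme, and the `BinIndex` class and its methods are hypothetical names, not the OffsetAllocator or Mooncake Store API.

```cpp
#include <cstddef>
#include <cstdint>

// Toy free-list index: bin i holds free blocks of at least 2^i bytes.
// Assumption: power-of-two bins and bin numbers in [0, 63].
class BinIndex {
public:
    void MarkNonEmpty(unsigned bin) { mask_ |= (std::uint64_t{1} << bin); }
    void MarkEmpty(unsigned bin) { mask_ &= ~(std::uint64_t{1} << bin); }

    // Smallest bin whose blocks are guaranteed to fit `size`,
    // i.e. ceil(log2(size)), capped at bin 63.
    static unsigned BinFor(std::size_t size) {
        unsigned bin = 0;
        while (bin < 63 && (std::uint64_t{1} << bin) < size) ++bin;
        return bin;
    }

    // Lowest non-empty bin >= BinFor(size), or -1 if no such bin exists.
    // The loop is bounded by 64 steps; production code would typically
    // use a hardware count-trailing-zeros instruction instead.
    int FindBin(std::size_t size) const {
        const unsigned start = BinFor(size);
        std::uint64_t candidates = mask_ & (~std::uint64_t{0} << start);
        if (candidates == 0) return -1;
        int bin = 0;
        while ((candidates & 1) == 0) {
            candidates >>= 1;
            ++bin;
        }
        return bin;
    }

private:
    std::uint64_t mask_ = 0;  // bit i set => bin i has at least one free block
};
```

Because the search is a mask plus a bit scan over a fixed-width word, its cost does not grow with the number of free blocks, which is the property the `O(1)` claim refers to in this simplified model.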
