Commit aa5b9e4

move variable to additional config
Signed-off-by: chenwaner <861645847@qq.com>
1 parent 8740191 commit aa5b9e4

3 files changed, with 6 additions and 5 deletions

docs/source/user_guide/additional_config.md

Lines changed: 5 additions & 2 deletions
@@ -40,6 +40,7 @@ The details of each config option are as follows:
 | `use_cached_graph` | bool | `False` | Whether to use cached graph |
 | `graph_batch_sizes` | list[int] | `[]` | The batch size for torchair graph cache |
 | `graph_batch_sizes_init` | bool | `False` | Init graph batch size dynamically if `graph_batch_sizes` is empty |
+| `enable_kv_nz`| bool | `False` | Whether to enable kvcache NZ layout |
 
 **ascend_scheduler_config**
 
@@ -59,12 +60,14 @@ A full example of additional configuration is as follows:
         "enabled": true,
         "use_cached_graph": true,
         "graph_batch_sizes": [1, 2, 4, 8],
-        "graph_batch_sizes_init": true
+        "graph_batch_sizes_init": true,
+        "enable_kv_nz": false
     },
     "ascend_scheduler_config": {
         "enabled": true,
         "chunked_prefill_enabled": true,
     },
-    "expert_tensor_parallel_size": 1
+    "expert_tensor_parallel_size": 1,
+    "refresh": false,
 }
 ```
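
The documented options can also be assembled programmatically. The following is a minimal sketch, not part of the commit: `build_additional_config` is a hypothetical helper, and how the resulting dict or JSON string is handed to vLLM is assumed to follow the user guide rather than shown here. Note that the example in the docs contains trailing commas, which a strict JSON parser such as Python's `json` module would reject; serializing from a dict avoids that.

```python
import json


def build_additional_config(enable_kv_nz: bool = False) -> dict:
    """Hypothetical helper: assemble the options shown in the documented example."""
    return {
        "torchair_graph_config": {
            "enabled": True,
            "use_cached_graph": True,
            "graph_batch_sizes": [1, 2, 4, 8],
            "graph_batch_sizes_init": True,
            # Set here now, instead of via the VLLM_ENABLE_KV_NZ environment variable.
            "enable_kv_nz": enable_kv_nz,
        },
        "ascend_scheduler_config": {
            "enabled": True,
            "chunked_prefill_enabled": True,
        },
        "expert_tensor_parallel_size": 1,
        "refresh": False,
    }


config = build_additional_config(enable_kv_nz=True)
# json.dumps emits strict JSON (no trailing commas).
serialized = json.dumps(config, indent=2)
print(serialized)
```

The serialized string round-trips through `json.loads`, so it can be checked before it is passed on.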

vllm_ascend/ascend_config.py

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ def __init__(self, torchair_graph_config):
             "graph_batch_sizes_init", False)
         self.enable_multistream_shared_expert = torchair_graph_config.get(
             "enable_multistream_shared_expert", False)
+        self.enable_kv_nz = torchair_graph_config.get("enable_kv_nz", False)
 
         if not isinstance(self.graph_batch_sizes, list):
             raise TypeError("graph_batch_sizes must be list[int]")
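
The effect of the added line can be illustrated with a reduced stand-in for the patched class. This is a sketch under assumptions: `TorchairGraphConfigSketch` is a hypothetical name, it keeps only the fields visible in this hunk, and the `[]` default for `graph_batch_sizes` is taken from the docs table rather than from the class itself.

```python
class TorchairGraphConfigSketch:
    """Simplified stand-in for the class patched in this commit (not the real one)."""

    def __init__(self, torchair_graph_config: dict):
        self.graph_batch_sizes = torchair_graph_config.get(
            "graph_batch_sizes", [])
        self.graph_batch_sizes_init = torchair_graph_config.get(
            "graph_batch_sizes_init", False)
        # Added by this commit: read the flag from the config dict, default False.
        self.enable_kv_nz = torchair_graph_config.get("enable_kv_nz", False)

        if not isinstance(self.graph_batch_sizes, list):
            raise TypeError("graph_batch_sizes must be list[int]")


default_cfg = TorchairGraphConfigSketch({})
enabled_cfg = TorchairGraphConfigSketch({"enable_kv_nz": True})
print(default_cfg.enable_kv_nz, enabled_cfg.enable_kv_nz)
```

Because the lookup uses `.get` with a `False` default, existing configurations that omit `enable_kv_nz` keep the previous default behavior.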

vllm_ascend/envs.py

Lines changed: 0 additions & 3 deletions
@@ -55,9 +55,6 @@
     # Find more detail here: https://www.hiascend.com/document/detail/zh/canncommercial/81RC1/developmentguide/opdevg/ascendcbestP/atlas_ascendc_best_practices_10_0043.html
     "VLLM_ENABLE_MC2":
     lambda: bool(int(os.getenv("VLLM_ENABLE_MC2", '0'))),
-    # Whether to enable the kvcache nz optimization, the default value is False.
-    "VLLM_ENABLE_KV_NZ":
-    lambda: bool(int(os.getenv("VLLM_ENABLE_KV_NZ", '0'))),
     # Whether to enable the topk optimization. It's disabled by default for experimental support
     # We'll make it enabled by default in the future.
     "VLLM_ASCEND_ENABLE_TOPK_OPTIMZE":
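
With the `VLLM_ENABLE_KV_NZ` entry removed, deployments that still export the variable would need to carry the setting over to `torchair_graph_config`. The sketch below shows one possible way to do that; `legacy_kv_nz_from_env` and `migrate_kv_nz` are hypothetical helpers that do not exist in the repository, and the parsing simply mirrors the removed `bool(int(...))` lambda.

```python
import os


def legacy_kv_nz_from_env(environ=os.environ) -> bool:
    """Parse the flag the same way the removed envs.py entry did."""
    return bool(int(environ.get("VLLM_ENABLE_KV_NZ", "0")))


def migrate_kv_nz(additional_config: dict, environ=os.environ) -> dict:
    """Hypothetical shim: copy a legacy env setting into torchair_graph_config."""
    migrated = dict(additional_config)
    torchair = dict(migrated.get("torchair_graph_config", {}))
    # An explicit config value wins over the legacy environment variable.
    torchair.setdefault("enable_kv_nz", legacy_kv_nz_from_env(environ))
    migrated["torchair_graph_config"] = torchair
    return migrated


print(migrate_kv_nz({}, environ={"VLLM_ENABLE_KV_NZ": "1"}))
```

Giving an explicit `enable_kv_nz` in the config precedence over the environment is a design choice of this sketch, not something the commit specifies.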
