Commit 93ef4bb

Merge branch 'pr_1481' into pangumoe_w8a8c8
2 parents: 20deb50 + 9c40943

34 files changed: +1590 −328 lines

(new file, name not captured)

Lines changed: 100 additions & 0 deletions
```yaml
name: Release Checklist
description: Generate a release checklist issue when preparing a new release. (Used by the release team)
title: "[Release]: Release checklist for v"

body:
  - type: textarea
    attributes:
      description: >
        Brief info for the new release.
      label: Release Checklist
      value: >
        **Release Version**:

        **Release Branch**:

        **Release Date**:

        **Release Manager**:
  - type: textarea
    attributes:
      description: >
        Release notes.
      label: Prepare Release Note
      value: >
        - [ ] Create a new issue for release feedback

        - [ ] Write the release note PR.

        - [ ] Update the feedback issue link in docs/source/faqs.md

        - [ ] Add release note to docs/source/user_guide/release_notes.md

        - [ ] Update version info in docs/source/community/versioning_policy.md

        - [ ] Update contributor info in docs/source/community/contributors.md

        - [ ] Update package version in docs/conf.py
  - type: textarea
    attributes:
      description: >
        Make sure the code is merged.
      label: PR need Merge
      value: >
        - [ ] PR link1

        - [ ] PR link2

        - [ ] ...
  - type: textarea
    attributes:
      description: >
        Make sure the new Feature/Function is tested.
      label: Functional Test
      value: >
        - [ ] Feature1

        - [ ] Bug1

        - [ ] ...
  - type: textarea
    attributes:
      description: >
        Make sure the doc is updated.
      label: Doc Test
      value: >
        - [ ] Tutorial is updated.

        - [ ] User Guide is updated.

        - [ ] Developer Guide is updated.
  - type: textarea
    attributes:
      description: >
        Make sure the artifacts are ready.
      label: Prepare Artifacts
      value: >
        - [ ] Docker image is ready.

        - [ ] Wheel package is ready.
  - type: textarea
    attributes:
      description: >
        Start to release.
      label: Release Step
      value: >
        - [ ] Release note PR is merged.

        - [ ] Post the release on the GitHub release page.

        - [ ] Generate the official doc page on https://app.readthedocs.org/dashboard/

        - [ ] Wait for the wheel package to be available on https://pypi.org/project/vllm-ascend

        - [ ] Wait for the docker image to be available on https://quay.io/ascend/vllm-ascend

        - [ ] Upload the 310p wheel to the GitHub release page

        - [ ] Broadcast the release news (by message, blog, etc.)

        - [ ] Close this issue
```

.github/workflows/accuracy_test.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -380,7 +380,7 @@ jobs:
             const pr = await github.rest.pulls.create({
               owner: 'vllm-project',
               repo: 'vllm-ascend',
-              head: `${{ github.actor }}:${{ env.BRANCH_NAME }}`,
+              head: `vllm-ascend-ci:${{ env.BRANCH_NAME }}`,
               base: '${{ github.event.inputs.vllm-ascend-version }}',
               title: `[Doc] Update accuracy reports for ${{ github.event.inputs.vllm-ascend-version }}`,
               body: `The accuracy results running on NPU Altlas A2 have changed, updating reports for:
```
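The `head:` change matters because the report PR is opened from a fork: the GitHub API expects a cross-repository head in the `owner:branch` form, and `github.actor` (whoever triggered the run) is not necessarily the account that holds the branch. A minimal shell sketch of composing such a ref; the branch name below is a hypothetical stand-in for the workflow's `env.BRANCH_NAME`:

```shell
#!/bin/bash
# Compose a cross-repository head ref in GitHub's "owner:branch" form.
# BRANCH_NAME is a hypothetical stand-in for the workflow's env.BRANCH_NAME.
BRANCH_NAME="auto-update-accuracy-reports"
HEAD_REF="vllm-ascend-ci:${BRANCH_NAME}"
echo "$HEAD_REF"   # prints "vllm-ascend-ci:auto-update-accuracy-reports"
```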

.github/workflows/nightly_benchmarks.yaml

Lines changed: 9 additions & 5 deletions

```diff
@@ -145,8 +145,8 @@ jobs:
       - name: Install elastic_tool
         if: github.event_name != 'pull_request'
         run: |
-          pip install escli-tool==0.2.2
-
+          pip install escli-tool==0.2.3
+
       - name: Collect pr info from vllm-project/vllm-ascend
         if: github.event_name != 'pull_request'
         run: |
@@ -176,24 +176,28 @@ jobs:
           commit_time=$(git show -s --format=%cd $commit_hash --date=iso-strict)
           commit_time_no_tz=${commit_time::19}
           pip install -e .
-
+
           echo "------------------------"
           echo "commit_id: $commit_id"
           echo "commit_title: $commit_title"
           echo "commit_time: $commit_time_no_tz"
           echo "vllm branch: ${{ matrix.vllm_branch }}"
           echo "vllm-ascend branch: ${{ matrix.vllm_ascend_branch }}"
           echo "------------------------"
-
+
           cd /github/home
-          bash benchmarks/scripts/run-performance-benchmarks.sh
+          ERROR_MSG=""
+          if ! bash benchmarks/scripts/run-performance-benchmarks.sh; then
+            ERROR_MSG="Benchmark failed to run"
+          fi
           # send the result to es
           escli add --vllm_branch ${{ matrix.vllm_branch }} \
             --vllm_ascend_branch ${{ matrix.vllm_ascend_branch }} \
             --commit_id $commit_id \
             --commit_title "$commit_title" \
             --created_at "$commit_time_no_tz" \
             --res_dir ./benchmarks/results \
+            --error $ERROR_MSG \
             --extra_feat '{"VLLM_USE_V1": "${{ matrix.vllm_use_v1 }}"}'
           rm -rf ./benchmarks/results
           cd -
```
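The `if ! …` guard above lets the step record a benchmark failure in Elasticsearch instead of aborting the shell at the failing command. A standalone sketch of the same pattern, with `false` standing in for the benchmark script; note also that a multi-word message like this one would normally be quoted (`"$ERROR_MSG"`) when forwarded to another command, so it arrives as a single argument:

```shell
#!/bin/bash
# Capture a command's failure without stopping the script: a command tested
# by `if !` does not trigger errexit (`set -e`) even when it exits non-zero.
set -e

ERROR_MSG=""
if ! false; then            # `false` stands in for run-performance-benchmarks.sh
  ERROR_MSG="Benchmark failed to run"
fi

# Quote the variable when forwarding it so the multi-word message stays
# one argument (the workflow passes it on to `escli add --error`).
echo "error=${ERROR_MSG}"   # prints "error=Benchmark failed to run"
```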

.github/workflows/vllm_ascend_test.yaml

Lines changed: 100 additions & 100 deletions

```diff
@@ -144,7 +144,7 @@ jobs:
       VLLM_USE_MODELSCOPE: True
     strategy:
       matrix:
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, ]
     steps:
       - name: Install packages
         run: |
@@ -193,111 +193,111 @@ jobs:
           name: vllm-ascend
           verbose: true
 
-  e2e:
-    needs: [lint]
-    # only trigger e2e test on pull request after lint passed
-    if: ${{ needs.lint.result == 'success' && github.event_name == 'pull_request' }}
-    strategy:
-      max-parallel: 2
-      matrix:
-        os: [linux-arm64-npu-1]
-        vllm_version: [main, v0.9.1]
-    name: singlecard e2e test
-    runs-on: ${{ matrix.os }}
-    container:
-      # TODO(yikun): Remove m.daocloud.io prefix when infra proxy ready
-      image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10
-    env:
-      VLLM_LOGGING_LEVEL: ERROR
-    steps:
-      - name: Check npu and CANN info
-        run: |
-          npu-smi info
-          cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
-
-      - name: Config mirrors
-        run: |
-          sed -i 's|ports.ubuntu.com|mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list
-          pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
-          apt-get update -y
-          apt install git -y
-          git config --global url."https://gh-proxy.test.osinfra.cn/https://github.com/".insteadOf https://github.com/
-
-      - name: Checkout vllm-project/vllm-ascend repo
-        uses: actions/checkout@v4
-
-      - name: Install system dependencies
-        run: |
-          apt-get -y install `cat packages.txt`
-          apt-get -y install gcc g++ cmake libnuma-dev
-
-      - name: Checkout vllm-project/vllm repo
-        uses: actions/checkout@v4
-        with:
-          repository: vllm-project/vllm
-          ref: ${{ matrix.vllm_version }}
-          path: ./vllm-empty
-
-      - name: Install vllm-project/vllm from source
-        working-directory: ./vllm-empty
-        run: |
-          VLLM_TARGET_DEVICE=empty pip install -e .
-
-      - name: Install vllm-project/vllm-ascend
-        env:
-          PIP_EXTRA_INDEX_URL: https://mirrors.huaweicloud.com/ascend/repos/pypi
-        run: |
-          pip install -r requirements-dev.txt
-          pip install -v -e .
-
-      - name: Run e2e test for V1 Engine
-        env:
-          VLLM_USE_V1: 1
-          VLLM_WORKER_MULTIPROC_METHOD: spawn
-          VLLM_USE_MODELSCOPE: True
-        run: |
-          pytest -sv tests/e2e/singlecard/test_offline_inference.py
-          # TODO: switch hf to modelscope
-          VLLM_USE_MODELSCOPE=False HF_ENDPOINT=https://hf-mirror.com \
-            pytest -sv tests/e2e/singlecard/test_ilama_lora.py
-          pytest -sv tests/e2e/singlecard/test_guided_decoding.py
-          pytest -sv tests/e2e/singlecard/test_camem.py
-          pytest -sv tests/e2e/singlecard/ \
-            --ignore=tests/e2e/singlecard/test_offline_inference.py \
-            --ignore=tests/e2e/singlecard/test_ilama_lora.py \
-            --ignore=tests/e2e/singlecard/test_guided_decoding.py \
-            --ignore=tests/e2e/singlecard/test_camem.py
-
-      - name: Run e2e test on V0 engine
-        if: ${{ github.event_name == 'schedule' }}
-        env:
-          VLLM_USE_V1: 0
-          VLLM_USE_MODELSCOPE: True
-        run: |
-          pytest -sv tests/e2e/singlecard/test_offline_inference.py
-          # TODO: switch hf to modelscope
-          VLLM_USE_MODELSCOPE=False HF_ENDPOINT=https://hf-mirror.com \
-            pytest -sv tests/e2e/singlecard/test_ilama_lora.py
-          pytest -sv tests/e2e/singlecard/test_guided_decoding.py
-          pytest -sv tests/e2e/singlecard/test_camem.py
-          pytest -sv tests/e2e/singlecard/test_prompt_embedding.py
-          pytest -sv tests/e2e/singlecard/ \
-            --ignore=tests/e2e/singlecard/test_offline_inference.py \
-            --ignore=tests/e2e/singlecard/test_ilama_lora.py \
-            --ignore=tests/e2e/singlecard/test_guided_decoding.py \
-            --ignore=tests/e2e/singlecard/test_camem.py \
-            --ignore=tests/e2e/singlecard/test_prompt_embedding.py \
-            --ignore=tests/e2e/singlecard/core/test_ascend_scheduler.py \
-            --ignore=tests/e2e/singlecard/core/test_ascend_scheduler_e2e.py
+  # e2e:
+  #   needs: [lint]
+  #   # only trigger e2e test on pull request after lint passed
+  #   if: ${{ needs.lint.result == 'success' && github.event_name == 'pull_request' }}
+  #   strategy:
+  #     max-parallel: 2
+  #     matrix:
+  #       os: [linux-arm64-npu-1]
+  #       vllm_version: [main, ]
+  #   name: singlecard e2e test
+  #   runs-on: ${{ matrix.os }}
+  #   container:
+  #     # TODO(yikun): Remove m.daocloud.io prefix when infra proxy ready
+  #     image: m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10
+  #   env:
+  #     VLLM_LOGGING_LEVEL: ERROR
+  #   steps:
+  #     - name: Check npu and CANN info
+  #       run: |
+  #         npu-smi info
+  #         cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
+
+  #     - name: Config mirrors
+  #       run: |
+  #         sed -i 's|ports.ubuntu.com|mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list
+  #         pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+  #         apt-get update -y
+  #         apt install git -y
+  #         git config --global url."https://gh-proxy.test.osinfra.cn/https://github.com/".insteadOf https://github.com/
+
+  #     - name: Checkout vllm-project/vllm-ascend repo
+  #       uses: actions/checkout@v4
+
+  #     - name: Install system dependencies
+  #       run: |
+  #         apt-get -y install `cat packages.txt`
+  #         apt-get -y install gcc g++ cmake libnuma-dev
+
+  #     - name: Checkout vllm-project/vllm repo
+  #       uses: actions/checkout@v4
+  #       with:
+  #         repository: vllm-project/vllm
+  #         ref: ${{ matrix.vllm_version }}
+  #         path: ./vllm-empty
+
+  #     - name: Install vllm-project/vllm from source
+  #       working-directory: ./vllm-empty
+  #       run: |
+  #         VLLM_TARGET_DEVICE=empty pip install -e .
+
+  #     - name: Install vllm-project/vllm-ascend
+  #       env:
+  #         PIP_EXTRA_INDEX_URL: https://mirrors.huaweicloud.com/ascend/repos/pypi
+  #       run: |
+  #         pip install -r requirements-dev.txt
+  #         pip install -v -e .
+
+  #     - name: Run e2e test for V1 Engine
+  #       env:
+  #         VLLM_USE_V1: 1
+  #         VLLM_WORKER_MULTIPROC_METHOD: spawn
+  #         VLLM_USE_MODELSCOPE: True
+  #       run: |
+  #         pytest -sv tests/e2e/singlecard/test_offline_inference.py
+  #         # TODO: switch hf to modelscope
+  #         VLLM_USE_MODELSCOPE=False HF_ENDPOINT=https://hf-mirror.com \
+  #           pytest -sv tests/e2e/singlecard/test_ilama_lora.py
+  #         pytest -sv tests/e2e/singlecard/test_guided_decoding.py
+  #         pytest -sv tests/e2e/singlecard/test_camem.py
+  #         pytest -sv tests/e2e/singlecard/ \
+  #           --ignore=tests/e2e/singlecard/test_offline_inference.py \
+  #           --ignore=tests/e2e/singlecard/test_ilama_lora.py \
+  #           --ignore=tests/e2e/singlecard/test_guided_decoding.py \
+  #           --ignore=tests/e2e/singlecard/test_camem.py
+
+  #     - name: Run e2e test on V0 engine
+  #       if: ${{ github.event_name == 'schedule' }}
+  #       env:
+  #         VLLM_USE_V1: 0
+  #         VLLM_USE_MODELSCOPE: True
+  #       run: |
+  #         pytest -sv tests/e2e/singlecard/test_offline_inference.py
+  #         # TODO: switch hf to modelscope
+  #         VLLM_USE_MODELSCOPE=False HF_ENDPOINT=https://hf-mirror.com \
+  #           pytest -sv tests/e2e/singlecard/test_ilama_lora.py
+  #         pytest -sv tests/e2e/singlecard/test_guided_decoding.py
+  #         pytest -sv tests/e2e/singlecard/test_camem.py
+  #         pytest -sv tests/e2e/singlecard/test_prompt_embedding.py
+  #         pytest -sv tests/e2e/singlecard/ \
+  #           --ignore=tests/e2e/singlecard/test_offline_inference.py \
+  #           --ignore=tests/e2e/singlecard/test_ilama_lora.py \
+  #           --ignore=tests/e2e/singlecard/test_guided_decoding.py \
+  #           --ignore=tests/e2e/singlecard/test_camem.py \
+  #           --ignore=tests/e2e/singlecard/test_prompt_embedding.py \
+  #           --ignore=tests/e2e/singlecard/core/test_ascend_scheduler.py \
+  #           --ignore=tests/e2e/singlecard/core/test_ascend_scheduler_e2e.py
 
   e2e-4-cards:
-    needs: [e2e]
-    if: ${{ needs.e2e.result == 'success' }}
+    # needs: [e2e]
+    # if: ${{ needs.e2e.result == 'success' }}
     strategy:
       max-parallel: 1
       matrix:
         os: [linux-arm64-npu-4]
-        vllm_version: [main, v0.9.1]
+        vllm_version: [main, ]
     name: multicard e2e test
     runs-on: ${{ matrix.os }}
     container:
```

.gitignore

Lines changed: 2 additions & 0 deletions

```diff
@@ -196,3 +196,5 @@ kernel_meta/
 
 # version file generated by setuptools-scm
 /vllm_ascend/_version.py
+# build info file generated by setup.py
+/vllm_ascend/_build_info.py
```

benchmarks/scripts/run-performance-benchmarks.sh

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,5 +1,5 @@
 #!/bin/bash
-
+set -e
 
 check_npus() {
   # shellcheck disable=SC2155
```
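The one-line addition of `set -e` makes bash abort the benchmark script at the first command that exits non-zero, instead of running on with partial results. A quick sketch of the effect, run in a child shell so the deliberate failure does not kill the demo itself:

```shell
#!/bin/bash
# Under `set -e`, the child shell stops at the failing `false`, so the
# second echo never runs; `|| true` keeps this outer script alive.
out=$(bash -c 'set -e; echo reached; false; echo unreachable' || true)
echo "$out"   # prints only "reached"
```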

benchmarks/scripts/run_accuracy.py

Lines changed: 0 additions & 2 deletions

````diff
@@ -138,8 +138,6 @@ def generate_md(model_name, tasks_list, args, datasets):
 ```bash
 {run_cmd}
 ```
-</div>
-<div>&nbsp;</div>
 """
 
     header = (
````
