Add ninja to dependency #21

WoosukKwon · 2023-04-02T01:59:32Z

The compilation time of flash-attn can be drastically reduced if ninja is installed. Related issue: Dao-AILab/flash-attention#150
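For anyone checking their own environment, a minimal sketch of a pre-build check follows. It assumes the build goes through PyTorch's C++ extension machinery, which uses ninja when it is available and otherwise falls back to a slower backend. The helper name `ninja_version` is hypothetical and is not part of this PR.

```python
import shutil
import subprocess


def ninja_version():
    """Return the installed ninja version string, or None if ninja is unusable.

    Hypothetical helper for illustration only; not part of vLLM or flash-attn.
    """
    exe = shutil.which("ninja")
    if exe is None:
        # ninja is not on PATH
        return None
    try:
        result = subprocess.run(
            [exe, "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # The binary exists but could not be run successfully
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = ninja_version()
    if version is None:
        print("ninja not found; `pip install ninja` before building flash-attn")
    else:
        print(f"ninja {version} found")
```

If the check reports that ninja is missing, installing it before building is likely to shorten the build, as the description above suggests.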

Add ninja to dependency

86983b7

WoosukKwon merged commit 2c5cd0d into main Apr 2, 2023

WoosukKwon deleted the ninja branch April 2, 2023 02:00

shanshanpt mentioned this pull request Nov 17, 2023

Run long conetxt error : CUDA error: an illegal memory access was encountered #1700

Closed

junior-zsy mentioned this pull request Nov 20, 2023

Error with 32k Long Text in chatglm2-6b-32k Model #1725

Closed

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

Add ninja to dependency (vllm-project#21)

f4354de

slyalin pushed a commit to slyalin/vllm that referenced this pull request Apr 3, 2024

Merge pull request vllm-project#21 from luo-cheng2021/luocheng/var_block_size

ee5c232

[CPU] Support for larger block_size

tdg5 pushed a commit to tdg5/vllm that referenced this pull request Apr 25, 2024

Merge pull request vllm-project#21 from tdg5/exp-2

36cf873

Fix more logging lint errors

z103cb referenced this pull request in z103cb/opendatahub_vllm May 7, 2024

fix: Missed TLS config logic from internal fork (opendatahub-io#21)

7df0eb8

Signed-off-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: Daniel Clark <daniel.clark@ibm.com>

yuhuixu1993 mentioned this pull request Jun 2, 2024

[Bug]: loading squeezellm model #5190

Closed

alixiaodi mentioned this pull request Aug 2, 2024

[Bug]: #7072

Closed

wuhuikx pushed a commit to wuhuikx/vllm that referenced this pull request Mar 27, 2025

[Misc] version control by setuptools_scm (vllm-project#21)

c59375c

make package version control by setuptools_scm to keep the same with vllm Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

hao-cold mentioned this pull request May 13, 2025

[Bug]: CUDA error: an illegal instruction was encountered #18045

Closed

1 task

markmc mentioned this pull request May 21, 2025

[Bug][Failing Test]: Distributed Comm Ops - distributed/test_shm_broadcast.py #18492

Closed

1 task

zerosurplus mentioned this pull request Jun 16, 2025

[Bug]: torch.distributed.DistNetworkError: The client socket has timed out after 600000ms while trying to connect to (172.17.0.9, 46229). #19670

Open

1 task

xiaomofang mentioned this pull request Jul 31, 2025

[Bug]: There is an issue with speculative inference in Eagle mode, where the context length of vLLM inference is constrained by the draft model. #21986

Open

1 task

zyongye pushed a commit to zyongye/vllm that referenced this pull request Aug 5, 2025

Support Responses Streaming (vllm-project#21)

b775a39

zyongye pushed a commit to zyongye/vllm that referenced this pull request Aug 6, 2025

Support Responses Streaming (vllm-project#21)

696cfb8

heheda12345 pushed a commit to heheda12345/vllm that referenced this pull request Sep 29, 2025

support mtp with indexer kv (vllm-project#21)

6a29a01

Co-authored-by: Lucia Fang <fanglu@meta.com>

Michel-debug mentioned this pull request Oct 23, 2025

[Bug]: qwen3-vl-2b after ms-swift fine-tuning lance errors #27405

Closed

1 task

inkcherry pushed a commit to inkcherry/vllm that referenced this pull request Nov 6, 2025

debug (vllm-project#21)

e96685a

Co-authored-by: root <root@smc300x-ccs-aus-gpue77e.prov.aus.ccs.cpe.ice.amd.com>

acodercat mentioned this pull request Nov 10, 2025

[Bugfix] Add strong reference to CUDA pluggable allocator callbacks #23477

Merged

4 tasks
