[Snyk] Fix for 14 vulnerabilities #33

Closed
wants to merge 573 commits into from
573 commits
c7334d2
[Core] Support offline use of local cache for models (#4374)
prashantgupta24 Apr 27, 2024
b2ae047
[BugFix] Fix return type of executor execute_model methods (#4402)
njhill Apr 27, 2024
3a3ea57
[BugFix] Resolved Issues For LinearMethod --> QuantConfig (#4418)
robertgshaw2-redhat Apr 27, 2024
53299da
[Misc] fix typo in llm_engine init logging (#4428)
DefTruth Apr 28, 2024
91aabb0
Add more Prometheus metrics (#2764)
ronensc Apr 28, 2024
46a9863
[CI] clean docker cache for neuron (#4441)
simon-mo Apr 28, 2024
0733647
[mypy][5/N] Support all typing on model executor (#4427)
rkooo567 Apr 29, 2024
70d7507
[Kernel] Marlin Expansion: Support AutoGPTQ Models with Marlin (#3922)
robertgshaw2-redhat Apr 29, 2024
1e25f6a
[CI] hotfix: soft fail neuron test (#4458)
simon-mo Apr 29, 2024
826a21c
[Core][Distributed] use cpu group to broadcast metadata in cpu (#4444)
youkaichao Apr 29, 2024
aa6c82d
[Misc] Upgrade to `torch==2.3.0` (#4454)
mgoin Apr 30, 2024
61e6343
[Bugfix][Kernel] Fix compute_type for MoE kernel (#4463)
WoosukKwon Apr 30, 2024
c92cac9
[Core]Refactor gptq_marlin ops (#4466)
jikunshang Apr 30, 2024
d1176c8
[BugFix] fix num_lookahead_slots missing in async executor (#4165)
leiwen83 Apr 30, 2024
eb69d24
[Doc] add visualization for multi-stage dockerfile (#4456)
prashantgupta24 Apr 30, 2024
e65c20e
[Kernel] Support Fp8 Checkpoints (Dynamic + Static) (#4332)
robertgshaw2-redhat Apr 30, 2024
d9e9d52
[Frontend] Support complex message content for chat completions endpo…
fgreinacher Apr 30, 2024
3fc345a
[Frontend] [Core] Tensorizer: support dynamic `num_readers`, update v…
alpayariyak Apr 30, 2024
80d0058
[Bugfix][Minor] Make ignore_eos effective (#4468)
bigPYJ1151 Apr 30, 2024
f8e22b3
fix_tokenizer_snapshot_download_bug (#4493)
kingljl Apr 30, 2024
ecb3620
Unable to find Punica extension issue during source code installation…
kingljl May 1, 2024
edd9c67
[Core] Centralize GPU Worker construction (#4419)
njhill May 1, 2024
1b710d6
[Misc][Typo] type annotation fix (#4495)
HarryWu99 May 1, 2024
28a0f80
[Misc] fix typo in block manager (#4453)
Juelianqvq May 1, 2024
c49d777
Allow user to define whitespace pattern for outlines (#4305)
robcaulk May 1, 2024
8eba757
[Misc]Add customized information for models (#4132)
jeejeelee May 1, 2024
9f85f52
[Test] Add ignore_eos test (#4519)
rkooo567 May 1, 2024
e38157e
[Bugfix] Fix the fp8 kv_cache check error that occurs when failing to…
AnyISalIn May 1, 2024
e5814bc
[Bugfix] Fix 307 Redirect for `/metrics` (#4523)
robertgshaw2-redhat May 1, 2024
6e3a823
[Doc] update(example model): for OpenAI compatible serving (#4503)
fpaupier May 1, 2024
4d3057b
[Bugfix] Use random seed if seed is -1 (#4531)
sasha0552 May 1, 2024
f1f98b6
[CI/Build][Bugfix] VLLM_USE_PRECOMPILED should skip compilation (#4534)
tjohnson31415 May 1, 2024
4107938
[Speculative decoding] Add ngram prompt lookup decoding (#4237)
leiwen83 May 1, 2024
b2eed51
[Core] Enable prefix caching with block manager v2 enabled (#4142)
leiwen83 May 1, 2024
4c28921
[Core] Add `multiproc_worker_utils` for multiprocessing-based workers…
njhill May 1, 2024
0c9d74c
[Kernel] Update fused_moe tuning script for FP8 (#4457)
pcmoritz May 1, 2024
a7e8e4d
[Bugfix] Add validation for seed (#4529)
sasha0552 May 1, 2024
0793528
[Bugfix][Core] Fix and refactor logging stats (#4336)
esmeetu May 1, 2024
07632b0
[Core][Distributed] fix pynccl del error (#4508)
youkaichao May 1, 2024
e742ab4
[Misc] Remove Mixtral device="cuda" declarations (#4543)
pcmoritz May 1, 2024
6fe52f3
[Misc] Fix expert_ids shape in MoE (#4517)
WoosukKwon May 1, 2024
2ec5dd2
[MISC] Rework logger to enable pythonic custom logging configuration …
May 2, 2024
121847e
[Bug fix][Core] assert num_new_tokens == 1 fails when SamplingParams.…
rkooo567 May 2, 2024
3bb37bd
[CI]Add regression tests to ensure the async engine generates metrics…
ronensc May 2, 2024
fe82250
[mypy][6/N] Fix all the core subdirectory typing (#4450)
rkooo567 May 2, 2024
9ff783f
[Core][Distributed] enable multiple tp group (#4512)
youkaichao May 2, 2024
d3ab1c7
[Kernel] Support running GPTQ 8-bit models in Marlin (#4533)
alexm-redhat May 2, 2024
beecc8e
[mypy][7/N] Cover all directories (#4555)
rkooo567 May 2, 2024
b553e05
[Misc] Exclude the `tests` directory from being packaged (#4552)
itechbear May 2, 2024
37f8957
[BugFix] Include target-device specific requirements.txt in sdist (#4…
markmc May 2, 2024
d7f5c58
[Misc] centralize all usage of environment variables (#4548)
youkaichao May 2, 2024
df04c10
[kernel] fix sliding window in prefix prefill Triton kernel (#4405)
mmoskal May 2, 2024
299066f
[CI/Build] AMD CI pipeline with extended set of tests. (#4267)
Alexei-V-Ivanov-AMD May 2, 2024
3e9f425
[Core] Ignore infeasible swap requests. (#4557)
rkooo567 May 2, 2024
977a6cd
[Core][Distributed] enable allreduce for multiple tp groups (#4566)
youkaichao May 3, 2024
de6d42a
[BugFix] Prevent the task of `_force_log` from being garbage collecte…
Atry May 3, 2024
deb0ccc
[Misc] remove chunk detected debug logs (#4571)
DefTruth May 3, 2024
9500596
[Doc] add env vars to the doc (#4572)
youkaichao May 3, 2024
a5d0d0e
[Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518)
rkooo567 May 3, 2024
ab445b1
[Bugfix] Allow "None" or "" to be passed to CLI for string args that …
mgoin May 3, 2024
83f0437
Fix/async chat serving (#2727)
schoennenbeck May 3, 2024
0c86070
[Kernel] Use flashinfer for decoding (#4353)
LiuXiaoxuanPKU May 3, 2024
81a9e09
[Speculative decoding] Support target-model logprobs (#4378)
cadedaniel May 3, 2024
cf0665c
[Misc] add installation time env vars (#4574)
youkaichao May 3, 2024
ecb55eb
[Misc][Refactor] Introduce ExecuteModelData (#4540)
comaniac May 4, 2024
8e82b90
[Doc] Chunked Prefill Documentation (#4580)
rkooo567 May 4, 2024
ba2be94
[Kernel] Support MoE Fp8 Checkpoints for Mixtral (Static Weights with…
mgoin May 4, 2024
71bb251
[CI] check size of the wheels (#4319)
simon-mo May 4, 2024
ac5ccb6
[Bugfix] Fix inappropriate content of model_name tag in Prometheus me…
DearPlanet May 4, 2024
52b5bcb
bump version to v0.4.2 (#4600)
simon-mo May 5, 2024
c7426c1
[CI] Reduce wheel size by not shipping debug symbols (#4602)
simon-mo May 5, 2024
352ef7c
Disable cuda version check in vllm-openai image (#4530)
zhaoyang-star May 5, 2024
06241cf
[Bugfix] Fix `asyncio.Task` not being subscriptable (#4623)
DarkLight1337 May 6, 2024
4c758aa
Update vLLM to 323f27b9
joerunde May 6, 2024
b180134
sync with IBM/main@4c758aa2
dtrifiro May 7, 2024
26e6259
Merge pull request #13 from dtrifiro/sync-with-upstream
z103cb May 7, 2024
0b1387b
chore: add OWNERS file to ibm_main
z103cb May 8, 2024
8a6c9c9
Merge pull request #16 from z103cb/update-ibm-main-owners
z103cb May 8, 2024
ca06561
Dockerfile.ubi: improvements
dtrifiro Apr 23, 2024
6100f4b
TGISStatLogger: fix stats usage
dtrifiro May 8, 2024
2caabff
format: make mypy happy (#24)
tjohnson31415 May 8, 2024
c737a7a
ci/build/feat: bump vLLM libs to v0.4.2 and other deps in Dockerfile.…
tjohnson31415 May 8, 2024
06d9876
TGISStatLogger: fix stats usage (#25)
tjohnson31415 May 8, 2024
6084d41
format: make mypy happy (#24)
tjohnson31415 May 8, 2024
ed94d42
ci/build/feat: bump vLLM libs to v0.4.2 and other deps in Dockerfile.…
tjohnson31415 May 8, 2024
1cc8906
Dockerfile.ubi: get rid of --link flags for COPY operations
dtrifiro Apr 23, 2024
9543d0b
TGISStatLogger: fix stats usage
dtrifiro May 8, 2024
21fb852
fix: use vllm_nccl installed nccl version (#26)
tjohnson31415 May 13, 2024
2e81ed2
:bug: fix prometheus metric labels (#27)
joerunde May 14, 2024
a1578c4
Dockerfile: use fixed vllm-provided nccl version
dtrifiro May 14, 2024
3f5757e
Merge pull request #23 from dtrifiro/use-vllm-nccl
openshift-merge-bot[bot] May 14, 2024
27eee94
Dockerfile.ubi: use 9.4 as base UBI tag
dtrifiro May 15, 2024
981e733
Merge branch 'ibm_main' into bump-ubi-base-image-tag
z103cb May 15, 2024
059b81b
Merge pull request #24 from dtrifiro/bump-ubi-base-image-tag
openshift-merge-bot[bot] May 15, 2024
a72d13a
Merge remote-tracking branch 'ibm-vllm/main' into ibm_main_update_051…
z103cb May 16, 2024
81954a7
Merge pull request #25 from z103cb/ibm_main_update_05162022
dtrifiro May 16, 2024
79dce26
[CI] use ccache actions properly in release workflow (#4629)
simon-mo May 6, 2024
d363d39
[CI] Add retry for agent lost (#4633)
cadedaniel May 6, 2024
a547717
Update lm-format-enforcer to 0.10.1 (#4631)
noamgat May 6, 2024
3798adb
[Kernel] Make static FP8 scaling more robust (#4570)
pcmoritz May 7, 2024
73323c3
[Core][Optimization] change python dict to pytorch tensor (#4607)
youkaichao May 7, 2024
ffc7024
[Build/CI] Fixing 'docker run' to re-enable AMD CI tests. (#4642)
Alexei-V-Ivanov-AMD May 7, 2024
07ccdeb
[Bugfix] Fixed error in slice_lora_b for MergedQKVParallelLinearWithL…
FurtherAI May 7, 2024
4fb77a9
[Core][Optimization] change copy-on-write from dict[int, list] to lis…
youkaichao May 7, 2024
7088e42
[Bug fix][Core] fixup ngram not setup correctly (#4551)
leiwen83 May 7, 2024
1571342
[Core][Distributed] support cpu&device in broadcast tensor dict (#4660)
youkaichao May 8, 2024
9e4b2e2
[Core] Optimize sampler get_logprobs (#4594)
rkooo567 May 8, 2024
e7ebde1
[CI] Make mistral tests pass (#4596)
rkooo567 May 8, 2024
1bb5e89
[Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi …
DefTruth May 8, 2024
456bcbc
[Misc] Add `get_name` method to attention backends (#4685)
WoosukKwon May 8, 2024
5aedfe8
[Core] Faster startup for LoRA enabled models (#4634)
Yard1 May 8, 2024
2563537
[Core][Optimization] change python dict to pytorch tensor for blocks …
youkaichao May 8, 2024
a696be1
[CI/Test] fix swap test for multi gpu (#4689)
youkaichao May 8, 2024
fe03b5c
[Misc] Use vllm-flash-attn instead of flash-attn (#4686)
WoosukKwon May 8, 2024
4c17d62
[Dynamic Spec Decoding] Auto-disable by the running queue size (#4592)
comaniac May 8, 2024
683a105
[Speculative decoding] [Bugfix] Fix overallocation in ngram + spec lo…
cadedaniel May 8, 2024
53a9503
[Bugfix] Fine-tune gptq_marlin configs to be more similar to marlin (…
alexm-redhat May 9, 2024
d6eb999
[Frontend] add tok/s speed metric to llm class when using tqdm (#4400)
MahmoudAshraf97 May 9, 2024
b346a6d
[Frontend] Move async logic outside of constructor (#4674)
DarkLight1337 May 9, 2024
8427be7
[Misc] Remove unnecessary ModelRunner imports (#4703)
WoosukKwon May 9, 2024
e5b181e
[Misc] Set block size at initialization & Fix test_model_runner (#4705)
WoosukKwon May 9, 2024
0a838de
[ROCm] Add support for Punica kernels on AMD GPUs (#3140)
kliuae May 9, 2024
b4214c5
[Bugfix] Fix CLI arguments in OpenAI server docs (#4709)
DarkLight1337 May 9, 2024
df54be8
[Bugfix] Update grafana.json (#4711)
robertgshaw2-redhat May 9, 2024
475b9a0
[Bugfix] Add logs for all model dtype casting (#4717)
mgoin May 9, 2024
12d23f9
[Model] Snowflake arctic model implementation (#4652)
sfc-gh-hazhang May 9, 2024
d7e6b3f
[Kernel] [FP8] Improve FP8 linear layer performance (#4691)
pcmoritz May 9, 2024
439c463
[Kernel] Refactor FP8 kv-cache with NVIDIA float8_e4m3 support (#4535)
comaniac May 10, 2024
ce0f149
[Core][Distributed] refactor pynccl (#4591)
youkaichao May 10, 2024
8cf6b87
[Misc] Keep only one implementation of the create_dummy_prompt functi…
AllenDou May 10, 2024
bd873f4
chunked-prefill-doc-syntax (#4603)
simon-mo May 10, 2024
4b0058f
[Core]fix type annotation for `swap_blocks` (#4726)
jikunshang May 10, 2024
bee64c4
[Misc] Apply a couple g++ cleanups (#4719)
stevegrubb May 10, 2024
3363a6b
[Core] Fix circular reference which leaked llm instance in local dev …
rkooo567 May 10, 2024
c56ae80
[Bugfix] Fix CLI arguments in OpenAI server docs (#4729)
AllenDou May 10, 2024
3498e74
[Speculative decoding] CUDA graph support (#4295)
heeju-kim2 May 10, 2024
fffb10a
[CI] Nits for bad initialization of SeqGroup in testing (#4748)
robertgshaw2-redhat May 10, 2024
cd8f90f
[Core][Test] fix function name typo in custom allreduce (#4750)
youkaichao May 10, 2024
70fa8fd
[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
CatherineSue May 11, 2024
e2302f4
[Model] Add support for IBM Granite Code models (#4636)
yikangshen May 12, 2024
3942ef1
[CI/Build] Tweak Marlin Nondeterminism Issues (#4713)
robertgshaw2-redhat May 13, 2024
6410635
[CORE] Improvement in ranks code (#4718)
SwapnilDreams100 May 13, 2024
0493233
[Core][Distributed] refactor custom allreduce to support multiple tp …
youkaichao May 13, 2024
35a3273
[CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425)
DarkLight1337 May 13, 2024
7a0a670
[Scheduler] Warning upon preemption and Swapping (#4647)
rkooo567 May 13, 2024
98d62a2
[Misc] Enhance attention selector (#4751)
WoosukKwon May 13, 2024
64d2fdc
[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, u…
sangstar May 13, 2024
47f50c5
[Speculative decoding] Improve n-gram efficiency (#4724)
comaniac May 13, 2024
28c395f
[Kernel] Use flash-attn for decoding (#3648)
skrider May 13, 2024
95411c6
[Bugfix] Fix dynamic FP8 quantization for Mixtral (#4793)
pcmoritz May 13, 2024
f4270f2
[Doc] Shorten README by removing supported model list (#4796)
zhuohan123 May 13, 2024
c75ceb4
[Doc] Add API reference for offline inference (#4710)
DarkLight1337 May 14, 2024
0e5d2a9
[Doc] Add meetups to the doc (#4798)
zhuohan123 May 14, 2024
ed2d743
[Core][Hash][Automatic Prefix caching] Accelerating the hashing funct…
KuntaiDu May 14, 2024
929ecdc
[Bugfix][Doc] Fix CI failure in docs (#4804)
DarkLight1337 May 14, 2024
3008471
[Core] Add MultiprocessingGPUExecutor (#4539)
njhill May 14, 2024
73a4168
Add 4th meetup announcement to readme (#4817)
simon-mo May 14, 2024
a69f3af
Revert "[Kernel] Use flash-attn for decoding (#3648)" (#4820)
rkooo567 May 15, 2024
71cd938
[Core][2/N] Model runner refactoring part 2. Combine prepare prefill …
rkooo567 May 15, 2024
6d46185
[CI/Build] Further decouple HuggingFace implementation from ours duri…
DarkLight1337 May 15, 2024
4e0ddd9
[Bugfix] Properly set distributed_executor_backend in ParallelConfig …
zifeitong May 15, 2024
7df0a0b
[Doc] Highlight the fourth meetup in the README (#4842)
zhuohan123 May 15, 2024
e9ddce5
[Frontend] Re-enable custom roles in Chat Completions API (#4758)
DarkLight1337 May 15, 2024
7c731a9
[Frontend] Support OpenAI batch file format (#4794)
wuisawesome May 15, 2024
9002ba4
[Core] Implement sharded state loader (#4690)
aurickq May 16, 2024
f832b56
[Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840)
comaniac May 16, 2024
117f7b4
Add marlin unit tests and marlin benchmark script (#4815)
alexm-redhat May 16, 2024
7bc509e
[Kernel] add bfloat16 support for gptq marlin kernel (#4788)
jinzhen-lin May 16, 2024
05afbc4
[docs] Fix typo in examples filename openi -> openai (#4864)
wuisawesome May 16, 2024
982a80b
[Frontend] Separate OpenAI Batch Runner usage from API Server (#4851)
wuisawesome May 16, 2024
ddcdb15
[Bugfix] Bypass authorization API token for preflight requests (#4862)
dulacp May 16, 2024
1b7a015
Add GPTQ Marlin 2:4 sparse structured support (#4790)
alexm-redhat May 16, 2024
b5a7ecd
Add JSON output support for benchmark_latency and benchmark_throughpu…
simon-mo May 16, 2024
7459ec4
[ROCm][AMD][Bugfix] adding a missing triton autotune config (#4845)
hongxiayang May 16, 2024
eb62283
[Core][Distributed] remove graph mode function (#4818)
youkaichao May 16, 2024
e1aad8a
[Misc] remove old comments (#4866)
youkaichao May 16, 2024
b551e55
[Kernel] Add punica dimension for Qwen1.5-32B LoRA (#4850)
Silencioo May 16, 2024
71b4283
Update vLLM to 8435b207
tjohnson31415 May 16, 2024
3ac6575
:bug: fixup merge conflicts
joerunde May 16, 2024
6b06bf0
:fire: remove flash attention
joerunde May 16, 2024
9499dce
Use fork for worker multiprocessing method (#29)
njhill May 16, 2024
0b16320
[Kernel] Add w8a8 CUTLASS kernels (#4749)
tlrmchlsmth May 16, 2024
d13ad85
[Bugfix] Fix FP8 KV cache support (#4869)
WoosukKwon May 16, 2024
fd979cb
Support to serve vLLM on Kubernetes with LWS (#4829)
kerthcet May 16, 2024
f6e95fb
[Frontend] OpenAI API server: Do not add bos token by default when en…
bofenghuang May 17, 2024
afcf4c8
[Build/CI] Extending the set of AMD tests with Regression, Basic Corr…
Alexei-V-Ivanov-AMD May 17, 2024
1f614f9
[Bugfix] fix rope error when load models with different dtypes (#4835)
jinzhen-lin May 17, 2024
d0f3b87
Sync huggingface modifications of qwen Moe model (#4774)
eigen2017 May 17, 2024
eab073d
[Doc] Update Ray Data distributed offline inference example (#4871)
Yard1 May 17, 2024
d7f076f
[Bugfix] Relax tiktoken to >= 0.6.0 (#4890)
mgoin May 17, 2024
94a7c8b
[ROCm][Hardware][AMD] Adding Navi21 to fallback to naive attention if…
alexeykondrat May 18, 2024
1102c61
[Lora] Support long context lora (#4787)
rkooo567 May 18, 2024
fc3cc45
[Bugfix][Model] Add base class for vision-language models (#4809)
DarkLight1337 May 19, 2024
41da12f
[Kernel] Add marlin_24 unit tests (#4901)
alexm-redhat May 19, 2024
fd1308b
[Kernel] Add flash-attn back (#4907)
WoosukKwon May 20, 2024
2cc299f
[Model] LLaVA model refactor (#4910)
DarkLight1337 May 20, 2024
c23600a
Remove marlin warning (#4918)
alexm-redhat May 20, 2024
f022464
Update vLLM to da5a0b53
joerunde May 20, 2024
066041a
:sparkles: log all errored requests (#30)
joerunde May 20, 2024
9fe85ab
deps: bump fastapi to >= 0.109.1
dtrifiro May 21, 2024
255735f
Merge pull request #26 from dtrifiro/bump-deps
z103cb May 21, 2024
f25aa53
Merge remote-tracking branch 'IBM/main' into sync-with-ibm
dtrifiro May 21, 2024
5497cf9
Revert "grpc_server: fix tokenizer group usage"
dtrifiro May 21, 2024
312ec7b
TEMP: no shared tokenizer from PR-3512
tjohnson31415 May 10, 2024
bd23984
[Core] Make Ray an optional "extras" requirement
njhill Apr 29, 2024
9407f49
Dockerfile.ubi: remove leftover flash-attn references
dtrifiro May 21, 2024
3be261c
Dockerfile.ubi: remove leftover flash-attn references
dtrifiro May 21, 2024
0dcc6ca
Dockerfile.ubi: get rid of prebuilt-wheel stage
dtrifiro May 21, 2024
a99d732
Dockerfile.ubi: set CMAKE_BUILD_TYPE=Release when building vllm wheel
dtrifiro May 21, 2024
8103d10
Merge pull request #30 from dtrifiro/dockerfile-build-extensions
openshift-merge-bot[bot] May 21, 2024
ba755e7
Merge branch 'ibm_main' into sync-with-ibm
z103cb May 21, 2024
9eaece5
Docker.ubi: add missing package git
z103cb May 21, 2024
38eed8a
Merge pull request #31 from z103cb/ibm_main_add_git_to_ubi_docker
openshift-merge-bot[bot] May 21, 2024
08ce2fa
Merge branch 'ibm_main' into sync-with-ibm
z103cb May 21, 2024
7431143
Merge pull request #28 from dtrifiro/sync-with-ibm
openshift-merge-bot[bot] May 22, 2024
d976df3
[Core] Make Ray an optional "extras" requirement
njhill Apr 29, 2024
78a8dfe
TEMP: no shared tokenizer from PR-3512
tjohnson31415 May 10, 2024
e392b03
Merge pull request #29 from dtrifiro/sync-release-with-ibm
openshift-merge-bot[bot] May 24, 2024
affc486
deps: get rid of duplicated fastapi entry
dtrifiro May 30, 2024
45f4fe4
chore(deps): update dependencies to squash snyk reported issues.
z103cb Jun 3, 2024
8b8fed5
chore: add fork OWNERS
z103cb Apr 30, 2024
5048126
add ubi Dockerfile
dtrifiro May 21, 2024
0264b36
Dockerfile.ubi: remove references to grpc/protos
dtrifiro May 21, 2024
a5047d8
Dockerfile.ubi: use vllm-tgis-adapter
dtrifiro May 28, 2024
955598d
gha: add sync workflow
dtrifiro Jun 3, 2024
119767e
Dockerfile.ubi: use distributed-executor-backend=mp as default
dtrifiro Jun 10, 2024
a82fb14
Dockerfile.ubi: remove vllm-nccl workaround
dtrifiro Jun 13, 2024
cc5d64a
Dockerfile.ubi: add missing requirements-*.txt bind mounts
dtrifiro Jun 18, 2024
f9f3bc7
add triton CustomCacheManger
tdoublep May 29, 2024
40ae5b9
gha: sync-with-upstream workflow create PRs as draft
dtrifiro Jun 19, 2024
5c44d84
add smoke/unit tests scripts
dtrifiro Jun 19, 2024
f722b3e
extras: exit unit tests on err
dtrifiro Jun 20, 2024
5b66f1e
Dockerfile.ubi: misc improvements
dtrifiro May 28, 2024
8720c92
update OWNERS
dtrifiro Jun 21, 2024
eeb6f33
sync release with main @ 8720c92e
dtrifiro Jun 21, 2024
7cc6a9b
Merge pull request #63 from dtrifiro/sync-release-with-main
openshift-merge-bot[bot] Jun 21, 2024
1a26fcf
Dockerfile.ubi: use tensorizer (#64)
prashantgupta24 Jun 25, 2024
32e05a6
fix: docs/requirements-docs.txt to reduce vulnerabilities
snyk-bot Jun 25, 2024
1f39759
Dockerfile.ubi: pin vllm-tgis-adapter to 0.1.2
dtrifiro Jun 26, 2024
7fd7fba
Merge pull request #67 from opendatahub-io/main
openshift-merge-bot[bot] Jun 27, 2024
65fa312
Merge remote-tracking branch 'upstream/release'
dchourasia Jun 28, 2024
e96dd03
gha: fix fetch step in upstream sync workflow
dtrifiro Jul 2, 2024
b4ecf73
gha: always update sync workflow PR body/title
dtrifiro Jul 2, 2024
3ff733b
Dockerfile.ubi: bump vllm-tgis-adapter to 0.1.3
dtrifiro Jul 3, 2024
27d7746
Merge pull request #80 from opendatahub-io/main
openshift-merge-bot[bot] Jul 4, 2024
44ec37d
Merge remote-tracking branch 'upstream/release'
dchourasia Jul 4, 2024
b145c20
Merge branch 'release' into sync-release-with-main
dtrifiro Jul 26, 2024
849f0f5
Merge pull request #110 from opendatahub-io/sync-release-with-main
dtrifiro Jul 26, 2024
bc34f22
Merge remote-tracking branch 'upstream/release'
dchourasia Jul 27, 2024
c35253d
fix: requirements-rocm.txt to reduce vulnerabilities
snyk-bot Jul 27, 2024
1 change: 1 addition & 0 deletions docs/requirements-docs.txt
@@ -11,3 +11,4 @@ torch
py-cpuinfo
transformers
openai # Required by docs/source/serving/openai_compatible_server.md's vllm.entrypoints.openai.cli_args
anyio>=4.4.0 # not directly required, pinned by Snyk to avoid a vulnerability
7 changes: 7 additions & 0 deletions requirements-rocm.txt
@@ -8,3 +8,10 @@ botocore
ray >= 2.10.0
peft
pytest-asyncio
numpy>=1.22.2 # not directly required, pinned by Snyk to avoid a vulnerability
requests>=2.32.2 # not directly required, pinned by Snyk to avoid a vulnerability
setuptools>=70.0.0 # not directly required, pinned by Snyk to avoid a vulnerability
torch>=2.2.0 # not directly required, pinned by Snyk to avoid a vulnerability
transformers>=4.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
zipp>=3.19.1 # not directly required, pinned by Snyk to avoid a vulnerability
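
The pins added above are all simple `name>=version` lower bounds. As a rough illustration of what such a floor means, here is a minimal, stdlib-only sketch that parses those lines and checks a candidate version against the minimum. The helper names (`parse_min_pins`, `meets_minimum`) are hypothetical and not part of this PR, and the comparison only looks at the leading numeric release segment. It is not a full PEP 440 implementation; a real check would more likely rely on the `packaging` library.

```python
import re

# The seven lines added to requirements-rocm.txt in this PR, copied verbatim.
PINS = """\
numpy>=1.22.2 # not directly required, pinned by Snyk to avoid a vulnerability
requests>=2.32.2 # not directly required, pinned by Snyk to avoid a vulnerability
setuptools>=70.0.0 # not directly required, pinned by Snyk to avoid a vulnerability
torch>=2.2.0 # not directly required, pinned by Snyk to avoid a vulnerability
transformers>=4.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
zipp>=3.19.1 # not directly required, pinned by Snyk to avoid a vulnerability
"""

# Matches "name >= 1.2.3" at the start of a line; trailing comments are ignored.
_PIN_RE = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*>=\s*(\d+(?:\.\d+)*)")


def parse_min_pins(text):
    """Return {package_name: minimum_version_string} for every '>=' pin."""
    pins = {}
    for line in text.splitlines():
        match = _PIN_RE.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def _release(version):
    """Extract the leading numeric release segment, e.g. '2.3.0+rocm' -> (2, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Simplified check (assumption: only numeric release segments matter)."""
    a, b = _release(installed), _release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.32' compares equal to '2.32.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    for name, minimum in parse_min_pins(PINS).items():
        print(f"{name}: minimum {minimum}")
```

For example, `meets_minimum("69.5.1", parse_min_pins(PINS)["setuptools"])` is false, so an environment with that setuptools version would fall below the floor set in this PR.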