File tree
752 files changed, +38177 -19642 lines changed
- .buildkite
- nightly-benchmarks
- scripts
- tests
- scripts
- hardware_ci
- .github
- workflows
- benchmarks
- auto_tune
- disagg_benchmarks
- kernels
- multi_turn
- cmake
- external_projects
- csrc
- attention/mla
- cpu
- cutlass_extensions
- gemm
- collective
- moe
- quantization
- cutlass_w4a8
- cutlass_w8a8/c3x
- fp4
- machete
- docker
- docs
- community
- configuration
- contributing
- model
- deployment
- frameworks
- integrations
- design
- features
- getting_started/installation
- cpu
- gpu
- mkdocs/hooks
- models
- serving
- usage
- examples
- offline_inference
- logits_processor
- online_serving
- prometheus_grafana
- requirements
- tests
- async_engine
- benchmarks
- compile
- piecewise
- core
- detokenizer
- distributed
- encoder_decoder
- engine
- entrypoints
- llm
- offline_mode
- openai
- correctness
- pooling
- correctness
- llm
- openai
- kernels
- attention
- core
- mamba
- moe
- quantization
- kv_transfer
- lora
- model_executor
- model_loader
- models
- language
- generation_ppl_test
- generation
- pooling_mteb_test
- pooling
- multimodal
- generation
- vlm_utils
- pooling
- processing
- multimodal
- neuron
- 1_core
- 2_core
- plugins_tests
- plugins/prithvi_io_processor_plugin
- prithvi_io_processor
- quantization
- runai_model_streamer_test
- samplers
- tensorizer_loader
- tool_use
- tpu
- transformers_utils
- utils_
- v1
- attention
- core
- cudagraph
- e2e
- engine
- entrypoints
- llm
- openai
- responses
- kv_connector/unit
- logits_processors
- metrics
- sample
- spec_decode
- tpu
- tracing
- tools
- profiler
- vllm
- assets
- attention
- backends
- mla
- layers
- ops
- utils
- benchmarks
- lib
- compilation
- config
- core
- distributed
- device_communicators
- kv_transfer
- kv_connector
- v1
- p2p
- kv_pipe
- engine
- multiprocessing
- entrypoints
- openai
- tool_parsers
- executor
- inputs
- logging_utils
- lora
- layers
- punica_wrapper
- model_executor
- layers
- fla
- ops
- fused_moe
- configs
- mamba
- ops
- quantization
- compressed_tensors
- transform
- kernels/mixed_precision
- quark
- utils
- rotary_embedding
- model_loader
- models
- warmup
- multimodal
- platforms
- reasoning
- third_party
- transformers_utils
- configs
- triton_utils
- utils
- v1
- attention/backends
- mla
- core
- sched
- engine
- executor
- metrics
- sample
- logits_processor
- ops
- tpu
- spec_decode
- structured_output
- worker
- worker