1 file changed: +11 −4 lines

@@ -11,7 +11,7 @@ This documentation includes information for running the popular Llama 3.1 series
 The pre-built image includes:
 
 - ROCm™ 6.3.1
-- vLLM 0.6.6
+- vLLM 0.7.3
 - PyTorch 2.7dev (nightly)
 
 ## Pull latest Docker Image
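The next hunk's context line gives the pull command inline; as a standalone command it is simply:

```shell
# Pull the most recent validated ROCm vLLM development image
# (image name and tag as stated in the README).
docker pull rocm/vllm-dev:main
```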
@@ -20,18 +20,25 @@ Pull the most recent validated docker image with `docker pull rocm/vllm-dev:main
 
 ## What is New
 
-nightly_fixed_aiter_integration_final_20250305:
-- Performance improvement
+20250305_aiter:
+- vLLM 0.7.3
+- hipBLASLt 0.13
+- AITER improvements
+- Support for FP8 skinny GEMM
+
 20250207_aiter:
 - More performant AITER
 - Bug fixes
+
 20250205_aiter:
 - [AITER](https://github.com/ROCm/aiter) support
 - Performance improvement for custom paged attention
 - Reduced memory overhead bug fix
+
 20250124:
 - Fixed accuracy issue with 405B FP8 Triton FA
 - Fixed accuracy issue with TP8
+
 20250117:
 - [Experimental DeepSeek-V3 and DeepSeek-R1 support](#running-deepseek-v3-and-deepseek-r1)
 
@@ -359,7 +366,7 @@ docker run -it --rm --ipc=host --network=host --group-add render \
     --cap-add=CAP_SYS_ADMIN --cap-add=SYS_PTRACE \
     --device=/dev/kfd --device=/dev/dri --device=/dev/mem \
     -e VLLM_USE_TRITON_FLASH_ATTN=0 \
-    -e VLLM_FP8_PADDING=0 \
+    -e VLLM_MLA_DISABLE=1 \
     rocm/vllm-dev:main
 # Online serving
 vllm serve deepseek-ai/DeepSeek-V3 \
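This hunk swaps `-e VLLM_FP8_PADDING=0` for `-e VLLM_MLA_DISABLE=1` in the DeepSeek-V3 launch command. A sketch of the full updated invocation follows; the surrounding flags come from the diff context, and any `vllm serve` options beyond what the hunk shows are omitted rather than guessed:

```shell
# Sketch of the updated DeepSeek-V3 launch after this change.
# VLLM_MLA_DISABLE=1 replaces the previous VLLM_FP8_PADDING=0 setting.
docker run -it --rm --ipc=host --network=host --group-add render \
    --cap-add=CAP_SYS_ADMIN --cap-add=SYS_PTRACE \
    --device=/dev/kfd --device=/dev/dri --device=/dev/mem \
    -e VLLM_USE_TRITON_FLASH_ATTN=0 \
    -e VLLM_MLA_DISABLE=1 \
    rocm/vllm-dev:main

# Online serving (inside the container); the README's serve command
# continues with further options not shown in this hunk.
vllm serve deepseek-ai/DeepSeek-V3
```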