
[CI/Build] CMakeLists: build all extensions' cmake targets at the same time #5034

Merged: 1 commit into vllm-project:main, Jun 1, 2024

Conversation

dtrifiro
Contributor

Instead of building one target at a time, we can pass multiple --target flags to a single cmake --build invocation; this should speed up the build process a little.
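A minimal sketch of the idea (target names and job count are illustrative; CMake >= 3.15 accepts multiple --target flags in one build invocation):

import subprocess

targets = ["_C", "_moe_C"]  # example extension target names

# Before: one `cmake --build` invocation per target, with idle time
# between finishing one extension and starting the next.
for target in targets:
    subprocess.check_call(
        ["cmake", "--build", ".", "-j", "8", "--target", target])

# After: a single invocation that knows about every target, so the
# build system can keep all cores busy across target boundaries.
cmd = ["cmake", "--build", ".", "-j", "8"]
for target in targets:
    cmd += ["--target", target]
subprocess.check_call(cmd)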

@mgoin
Member

mgoin commented May 24, 2024

This seems straightforward to me, thanks! My only concern is that build memory requirements could increase, since more targets are in flight. Any thoughts @bnellnm?

Edit: make sure to run format.sh so CI can run

@youkaichao
Member

I don't think this should be turned on by default. We tried very hard not to overwhelm users' machines:

vllm/setup.py (lines 71 to 105 at commit 9197709):

#
# Determine number of compilation jobs and optionally nvcc compile threads.
#
def compute_num_jobs(self):
    # `num_jobs` is either the value of the MAX_JOBS environment variable
    # (if defined) or the number of CPUs available.
    num_jobs = envs.MAX_JOBS
    if num_jobs is not None:
        num_jobs = int(num_jobs)
        logger.info("Using MAX_JOBS=%d as the number of jobs.", num_jobs)
    else:
        try:
            # os.sched_getaffinity() isn't universally available, so fall
            # back to os.cpu_count() if we get an error here.
            num_jobs = len(os.sched_getaffinity(0))
        except AttributeError:
            num_jobs = os.cpu_count()

    nvcc_threads = None
    if _is_cuda() and get_nvcc_cuda_version() >= Version("11.2"):
        # `nvcc_threads` is either the value of the NVCC_THREADS
        # environment variable (if defined) or 1.
        # when it is set, we reduce `num_jobs` to avoid
        # overloading the system.
        nvcc_threads = envs.NVCC_THREADS
        if nvcc_threads is not None:
            nvcc_threads = int(nvcc_threads)
            logger.info(
                "Using NVCC_THREADS=%d as the number of nvcc threads.",
                nvcc_threads)
        else:
            nvcc_threads = 1
        num_jobs = max(1, num_jobs // nvcc_threads)

    return num_jobs, nvcc_threads

If we compile all the targets by default, it will violate this.

We can make this an opt-in feature, and users can turn it on if they know what they are doing.
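
If this were made opt-in, it could look roughly like the sketch below (the environment variable name is hypothetical, not something vllm defines):

import os
import subprocess

TARGETS = ["_C", "_moe_C"]  # example extension target names

def cmake_build(targets: list[str], jobs: int = 8) -> None:
    # One `cmake --build` invocation covering the given targets.
    cmd = ["cmake", "--build", ".", "-j", str(jobs)]
    for target in targets:
        cmd += ["--target", target]
    subprocess.check_call(cmd)

# Hypothetical opt-in flag; not an actual vllm environment variable.
if os.environ.get("VLLM_BUILD_ALL_TARGETS", "0") == "1":
    cmake_build(TARGETS)  # everything in one build invocation
else:
    for target in TARGETS:
        cmake_build([target])  # default: one target at a time, as before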

@bnellnm
Contributor

bnellnm commented May 25, 2024

> This seems straightforward to me, thanks! My only concern is that build memory requirements could increase, since more targets are in flight. Any thoughts @bnellnm?
>
> Edit: make sure to run format.sh so CI can run

I'm not too worried about overloading machines. Even with multiple targets, the maximum number of jobs should still be honored. The only problem I see is that each extension/target could conceivably have a different configuration. Currently that is not the case, but this change could cause problems if targets ever had different configs (see the sketch below).
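
To illustrate that concern: a single cmake --build call operates on one configured build tree, so targets needing different configure-time settings could no longer share an invocation. A sketch, with a made-up -DSOME_OPTION flag:

import subprocess

# Today: all extensions share one build tree and one configuration,
# so a single build invocation can cover every target.
subprocess.check_call(
    ["cmake", "--build", "build", "--target", "_C", "--target", "_moe_C"])

# If targets ever needed different configurations, each would need its
# own configure + build pass again, reintroducing the idle gaps:
subprocess.check_call(["cmake", "-S", ".", "-B", "build_a", "-DSOME_OPTION=ON"])
subprocess.check_call(["cmake", "--build", "build_a", "--target", "_C"])
subprocess.check_call(["cmake", "-S", ".", "-B", "build_b", "-DSOME_OPTION=OFF"])
subprocess.check_call(["cmake", "--build", "build_b", "--target", "_moe_C"])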

@dtrifiro
Contributor Author

dtrifiro commented May 27, 2024

@youkaichao

I'm not sure I follow. This still uses at most max_jobs processes with the given number of nvcc threads; the only difference is that cmake --build is invoked only once, aware of all the targets that need to be built, so there are no "idle times" between finishing all the pieces of one extension and starting to compile the next.

@mgoin

Regarding memory usage: I noticed a very slight increase in peak memory usage, but overall I think it's negligible, and it is still bounded by setting proper values for nvcc_threads and max_jobs. Empirically, I saw at most ~3.5GB of memory per nvcc thread, so for something like MAX_JOBS=8 and NVCC_THREADS=2 (which compute_num_jobs reduces to -j 4), I saw a peak of ~21GB of memory:

Here is an example of what a build looks like with these settings:
[screenshot: CPU and memory usage graph during the build]

Without this patch, CPU usage (and memory as well) rises, stays at the maximum for a while, then dwindles until one extension finishes compiling, then ramps up again for the next extension, and so on (this looks like the tail of the CPU usage graph above), resulting in intervals during which not all cores are used. If I remember correctly, this patch improves build times by up to 20 minutes in some cases.
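
A rough sanity check of those numbers, treating the ~3.5GB-per-nvcc-thread figure as an empirical worst case rather than a guarantee:

# Back-of-the-envelope peak-memory estimate from the figures above.
GB_PER_NVCC_THREAD = 3.5  # empirical observation from this discussion

def estimate_peak_gb(max_jobs: int, nvcc_threads: int) -> float:
    # Mirrors compute_num_jobs: job count is reduced by the thread count.
    num_jobs = max(1, max_jobs // nvcc_threads)
    # Worst case: every job's nvcc threads peak at the same time.
    return num_jobs * nvcc_threads * GB_PER_NVCC_THREAD

print(estimate_peak_gb(8, 2))  # 28.0 GB worst case; ~21GB was observed,
                               # since not all jobs peak simultaneously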

dtrifiro force-pushed the setup-build-all-targets branch from 05f5e78 to 7249850 on May 27, 2024.
@dtrifiro
Contributor Author

Any updates on this?


mgoin (Member) left a comment:


Thank you @dtrifiro for the clear justification on why this patch improves CPU utilization. It is clear that if we run into peak memory consumption issues, we should continue to tune MAX_JOBS and NVCC_THREADS independently of compiling all targets at once.

mgoin merged commit a360ff8 into vllm-project:main on Jun 1, 2024 (63 checks passed).
dtrifiro deleted the setup-build-all-targets branch on June 3, 2024.