Build integrated OpenCL Linux wheels #5252

jgiannuzzi · 2022-05-30T12:35:33Z

This PR builds on @tpboudreau's excellent work in #3144 to add support for building integrated OpenCL wheels on Linux too. It also changes the CI to build this wheel instead of the regular CPU wheel, offering a better out-of-the-box experience for Linux users who want to use LightGBM with GPU support.

Fixes #4684

jameslamb

Thank you very much for working on this!

I personally am not very familiar with OpenCL and not qualified to review this. Hopefully @shiyu1994 @guolinke and @StrikerRUS will be able to provide you some feedback. I'm also pinging @huanzhang12 to possibly help.

tests/python_package_test/conftest.py

StrikerRUS · 2022-06-04T23:05:02Z

@jgiannuzzi
Thank you so much for working on this and for the amazing PR!

> I personally am not very familiar with OpenCL and not qualified to review this.

Unfortunately, me neither. But fortunately, I have easy access to an Ubuntu machine with an NVIDIA GPU, so I'll be able to independently test the generated artifacts.

StrikerRUS · 2022-06-04T23:05:43Z

.vsts-ci.yml

+        # on Ubuntu 14.04, test_dual.py fails with newer version of Python
+        PYTHON_VERSION: '3.7'


Could you please share the exact error message?

jgiannuzzi · 2022-06-08

This is the same error that we get when trying to do gpu-source with Python 3.8 on Ubuntu 14.04.

CMakeLists.txt

StrikerRUS · 2022-06-04T23:17:39Z

@tpboudreau @itamarst We'd really appreciate your input on this PR!

StrikerRUS · 2022-06-04T23:24:54Z

@jgiannuzzi Please update docs: https://github.com/microsoft/LightGBM/pull/3660/files#diff-e14b183376b1177323f6e7245d8ad64f2cd26f9638c129769d6b3c0ba4698dd5R26.

.ci/test.sh

jgiannuzzi · 2022-06-15T18:03:28Z

@StrikerRUS I have addressed your remaining comments.

@tpboudreau Could you please take a look? For context, #5282 was merged before this one, switching the OpenCL CPU implementation to PoCL on Linux.

tests/python_package_test/test_dual.py

StrikerRUS · 2022-06-19T14:09:47Z

We've recently moved PoCL installation from the CI runtime (.ci/setup.sh) to the Docker image creation phase, with the aim of reducing overall CI time: guolinke/lightgbm-ci-docker#26 and #5286. That process introduced some merge conflicts in this PR. Sorry about that! Could you please resolve those conflicts?

StrikerRUS · 2022-06-19T20:37:56Z

Good news!

I just checked the wheel file generated as a CI artifact on my Ubuntu machine with an NVIDIA GPU. I simply installed it with pip install *.whl, and after that the device='gpu' parameter allowed me to utilize my GPU.

# Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
# [GCC 9.4.0] on linux

import numpy as np
import lightgbm as lgb

X = np.random.random((10_000, 200))
y = np.random.random(10_000)

# Default (CPU) training
est = lgb.LGBMRegressor(n_estimators=5000).fit(X, y)
# [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.009667 seconds.
# You can set `force_col_wise=true` to remove the overhead.
# [LightGBM] [Info] Total Bins 51000
# [LightGBM] [Info] Number of data points in the train set: 10000, number of used features: 200
# [LightGBM] [Info] Start training from score 0.502931

# GPU (OpenCL) training with the same wheel
est = lgb.LGBMRegressor(n_estimators=5000, device='gpu').fit(X, y)
# [LightGBM] [Info] This is the GPU trainer!!
# [LightGBM] [Info] Total Bins 51000
# [LightGBM] [Info] Number of data points in the train set: 10000, number of used features: 200
# [LightGBM] [Info] Using GPU Device: Tesla V100-SXM2-32GB, Vendor: NVIDIA Corporation
# [LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
# [LightGBM] [Info] GPU programs have been built
# [LightGBM] [Info] Size of histogram bin entry: 8
# [LightGBM] [Info] 200 dense feature groups (1.91 MB) transferred to GPU in 0.004207 secs. 0 sparse feature groups
# [LightGBM] [Info] Start training from score 0.502931
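
For anyone who wants to confirm from captured logs that the GPU path was actually taken, here is a minimal sketch that extracts the device and vendor from the "Using GPU Device" line shown above. The function name `find_gpu_device` is hypothetical, and the regular expression is modelled only on the log output in this comment, so it is an assumption that other LightGBM versions print the same format.

```python
import re

# Pattern modelled on the "Using GPU Device" log line shown above; the exact
# log format is an assumption and may differ between LightGBM versions.
GPU_DEVICE_RE = re.compile(
    r"\[LightGBM\] \[Info\] Using GPU Device: (?P<device>.+?), Vendor: (?P<vendor>.+)$"
)


def find_gpu_device(log_text):
    """Return (device, vendor) from the first matching log line, or None."""
    for line in log_text.splitlines():
        match = GPU_DEVICE_RE.search(line.strip())
        if match:
            return match.group("device"), match.group("vendor")
    return None


# Sample text copied from the log output above.
sample_log = """\
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Using GPU Device: Tesla V100-SXM2-32GB, Vendor: NVIDIA Corporation
[LightGBM] [Info] GPU programs have been built
"""
print(find_gpu_device(sample_log))
```

A CPU-only run prints no such line, so the function returns `None` for it.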

StrikerRUS · 2022-07-24T14:02:39Z

@jgiannuzzi Hey! Maybe we can merge this PR without the support for aarch64 and add it later in a separate PR?

jgiannuzzi · 2022-07-26T10:13:11Z

Hey @StrikerRUS, I'm very sorry for not having updated this PR yet. I have had a fix for aarch64 for a while but never had the time to update the PR. I will try to do it this week so we can finally have those integrated OpenCL Linux wheels!

StrikerRUS · 2022-07-30T20:28:33Z

@jgiannuzzi No problem. Thanks a lot for all your hard work!

jameslamb · 2022-11-20T17:20:57Z

I tried building this in CI last night (just pushing this branch from a fork to a LightGBM branch so it'd produce artifacts I could download and test with), and unfortunately I saw one failure on the QEMU_multiarch bdist build.

The test_cpu_and_gpu_work() test failed with the following.

lightgbm.basic.LightGBMError: No OpenCL device found

(build link)

I just pushed a9f02d7 updating this to latest master. Let's see if that happens here.

I'm so sorry @jgiannuzzi , I'm trying to test this but LightGBM's CI is not in a good state right now. I'm trying to fix it as fast as I can.
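
The failure mode described above (CPU training works, GPU training raises "No OpenCL device found") can be reproduced in isolation with a small probe. This is only a sketch and not the actual test_cpu_and_gpu_work() test: `probe_devices`, `train_with_lightgbm`, and `fake_train` are hypothetical names, and `fake_train` is a stand-in that mimics the CI error so the example runs without LightGBM installed.

```python
def probe_devices(train, devices=("cpu", "gpu")):
    """Call train(device) for each device and record "ok" or the error text."""
    results = {}
    for device in devices:
        try:
            train(device)
            results[device] = "ok"
        except Exception as exc:  # e.g. lightgbm.basic.LightGBMError
            results[device] = f"{type(exc).__name__}: {exc}"
    return results


def train_with_lightgbm(device):
    """Fit a tiny model on the given device (imports are deferred on purpose)."""
    import numpy as np
    import lightgbm as lgb

    X = np.random.random((1_000, 20))
    y = np.random.random(1_000)
    lgb.LGBMRegressor(n_estimators=5, device=device).fit(X, y)


def fake_train(device):
    """Stand-in that mimics the CI failure, so the sketch runs without LightGBM."""
    if device == "gpu":
        raise RuntimeError("No OpenCL device found")


print(probe_devices(fake_train))
```

On a machine with the wheel installed, `probe_devices(train_with_lightgbm)` would exercise both code paths with the same package.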

jameslamb · 2022-11-29T16:45:25Z

I realized that in the new manylinux_2_28_x86_64 image used to build x86_64 linux wheels, a too-old OpenCL was being found, and as a result pocl didn't register itself in /etc/OpenCL/vendors.

Fixed that in guolinke/lightgbm-ci-docker#29, which looks like it fixed #5252 (comment).

(ci build link)

I'll try fixing the other two CI failures tonight, they're almost certainly my fault. Sorry about that @jgiannuzzi , I'm doing my best to get this merged.
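
To check whether an OpenCL implementation such as PoCL has registered itself, one option is to list the ICD files in /etc/OpenCL/vendors. The sketch below assumes the usual ICD loader convention, in which each `*.icd` file is a small text file naming the implementation's shared library; the function name `list_opencl_icds` is hypothetical.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name in vendors_dir to the library it names."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    icds = {}
    for icd_file in sorted(directory.glob("*.icd")):
        # Assumption: the file content is a library name or path on one line.
        icds[icd_file.name] = icd_file.read_text().strip()
    return icds


if __name__ == "__main__":
    registered = list_opencl_icds()
    if registered:
        for name, library in registered.items():
            print(f"{name}: {library}")
    else:
        print("no OpenCL ICD files found")
```

An empty result inside the build image would be consistent with the "No OpenCL device found" error reported earlier in this thread.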

guolinke · 2022-11-30T14:09:16Z

I am not sure why the "guolinke/lightgbm-ci-docker#29" will auto close this

jameslamb · 2022-11-30T15:00:02Z

> I am not sure why the "guolinke/lightgbm-ci-docker#29" will auto close this

Oh, strange! It was probably the way I phrased "fix" in that PR's description. Sorry about that, and thanks for re-opening it.
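
For reference, GitHub treats keywords such as "fix", "fixes", "closes", and "resolves" followed by an issue or PR reference as a request to close that item on merge, including cross-repository references of the form owner/repo#number. Here is a rough sketch of how a description could be scanned for such phrases before merging. The regular expression is a simplified approximation rather than GitHub's actual parser, and the sample description is hypothetical, not the real wording of the other PR.

```python
import re

# Closing keywords as listed in GitHub's documentation; the regex itself is a
# simplified approximation, not GitHub's actual parser.
CLOSING_REF_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s+"
    r"(?P<repo>[\w.-]+/[\w.-]+)?#(?P<number>\d+)",
    re.IGNORECASE,
)


def closing_references(text):
    """Return (repo or None, number) pairs that a description may auto-close."""
    return [
        (m.group("repo"), int(m.group("number")))
        for m in CLOSING_REF_RE.finditer(text)
    ]


# Hypothetical description text, not the actual wording of the other PR.
description = "This should fix microsoft/LightGBM#5252 once the image is rebuilt."
print(closing_references(description))
```

Rephrasing to something like "related to" avoids the unintended close.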

jameslamb

Alright I've delayed this long enough, going to approve based on the passing CI and all I've learned about PoCL and this PR's other additions while working through #5580.

I'd still like to try these wheels again on different GPUs from a cloud provider, but that can be done later.

@guolinke can you look one more time? If you approve, please merge this (I'll be traveling for the next few days).

@jgiannuzzi thank you SO MUCH for this awesome contribution, and the other excellent contributions you've made to LightGBM along the way. I've learned a lot from you and it's been great working with you. I hope you'll consider contributing more to LightGBM in the future 😁

guolinke

Thank you!

jgiannuzzi · 2022-12-01T08:57:34Z

Thank you @jameslamb and @guolinke, I'm looking forward to having daily builds with GPU support!

I'm sorry I wasn't available to help earlier with the many CI woes, and I'm certainly looking forward to contributing again to LightGBM in the future!

github-actions · 2023-08-15T20:36:28Z

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

jgiannuzzi requested review from StrikerRUS, jameslamb, guolinke and shiyu1994 as code owners May 30, 2022 12:35

jgiannuzzi mentioned this pull request May 30, 2022

Build Python wheels that support both GPU and CPU versions out of the box for non-Windows #4684

Closed

jgiannuzzi force-pushed the linux-gpu-wheel branch from c4e6354 to dadd2d3 Compare May 30, 2022 17:11

jgiannuzzi marked this pull request as draft May 31, 2022 08:15

jgiannuzzi force-pushed the linux-gpu-wheel branch 4 times, most recently from bcc37dc to 87afe60 Compare June 1, 2022 18:02

jgiannuzzi marked this pull request as ready for review June 1, 2022 19:14

jgiannuzzi requested a review from jmoralez as a code owner June 1, 2022 19:14

jameslamb added the feature label Jun 4, 2022

jameslamb reviewed Jun 4, 2022

tests/python_package_test/conftest.py (outdated, resolved)

StrikerRUS reviewed Jun 4, 2022

jgiannuzzi mentioned this pull request Jun 12, 2022

[ci] Run Linux OpenCL tests against POCL instead of the AMD App SDK #5282

Merged

jgiannuzzi force-pushed the linux-gpu-wheel branch from 87afe60 to a326244 Compare June 13, 2022 17:59

StrikerRUS reviewed Jun 13, 2022

.ci/test.sh (two resolved threads)

jgiannuzzi force-pushed the linux-gpu-wheel branch from 7b11941 to c4e0533 Compare June 14, 2022 16:13

StrikerRUS reviewed Jun 19, 2022

tests/python_package_test/test_dual.py (resolved)

This was referenced Nov 20, 2022

[ci] [python-package] Use LightGBM-custom manylinux image for building aarch64 Linux wheels #5595

Closed

[gpu] upgrade to PoCL v3.0 for building Linux integrated OpenCL wheels #5596

Open

jameslamb added 4 commits November 20, 2022 23:45

merge master

3eb5a47

add missing fi dropped in merge conflict resolution

1a4d569

install opencl-headers on bdist task

8c5e47e

Merge branch 'master' into linux-gpu-wheel

466322e

jameslamb mentioned this pull request Nov 29, 2022

ensure PoCL finds newer OpenCL guolinke/lightgbm-ci-docker#29

Merged

use new CI image for x86_64

d5777ea

Merge branch 'master' into linux-gpu-wheel

d0ae211

jameslamb mentioned this pull request Nov 30, 2022

[ci] detect non-default dynamic symbols in check_dynamic_dependencies.py #5610

Merged

update check_dynamic_dependencies script

f36563d

guolinke closed this in guolinke/lightgbm-ci-docker#29 Nov 30, 2022

guolinke reopened this Nov 30, 2022

jameslamb added 2 commits November 30, 2022 13:09

Merge branch 'master' into linux-gpu-wheel

0906621

use main CI image

8dedbe7

jameslamb self-requested a review December 1, 2022 01:23

jameslamb approved these changes Dec 1, 2022

guolinke approved these changes Dec 1, 2022

guolinke merged commit 38a1f58 into microsoft:master Dec 2, 2022

jameslamb mentioned this pull request Dec 7, 2022

[ci] use LightGBM CI image for building aarch64 wheels (fixes #5595) #5622

Merged

jameslamb removed the awaiting review label Mar 16, 2023

jameslamb mentioned this pull request Apr 19, 2023

[ci] [python-package] check distributions with pydistcheck #5838

Merged

github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023

jgiannuzzi deleted the linux-gpu-wheel branch August 16, 2023 08:57
