-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write tests for parallel code #3841
Comments
OK, I see "thumb-up" reactions and based on them marking this issue with "feature request" label 🙂 |
Closed in favor of being in #2302. We decided to keep all feature requests in one place. Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature. |
Hi. I recently used the CLI interface to do "distributed" training on my local machine, so I just set the machine list to 127.0.0.1 with different ports, two config files and ran the lightgbm binary in two terminals. Would this be something along those lines? |
Yep exactly! We'd want a way to do that automatically in CI, but would support a PR that even just adds the tests and proposes a way to easily run and extend them. |
I've been trying this out and currently I have a python script that uses threads to spawn the machines from the CLI. I'm using the data and configurations provided in
|
It's fine to install the Python library as part of such a test.
I think it would be good to target Azure DevOps for such tests. Currently, this project's Python package tests on GitHub Actions (https://github.com/microsoft/LightGBM/blob/d517ba12f2e7862ac533908304dddbd770655d2b/.github/workflows/python_package.yml) are only for Mac. This was done because of some limitations with Mac builds on Azure. Please only add a single job, in the Linux-latest section ( Line 76 in d517ba1
This project's Azure capacity is limited, so I'd appreciate it if you start by testing on your own fork while you develop this. You can comment out most of https://github.com/microsoft/LightGBM/blob/d517ba12f2e7862ac533908304dddbd770655d2b/.vsts-ci.yml when testing on your own fork. You can also take advantage of the fact that all of those jobs run in containers using the public If we find out through your eventual submission that the tests take a very long time to run or are flaky, then we might ask you to move them to GitHub Actions and introduce them as tests that only run when manually triggered by a comment, instead of on every commit.
It sounds reasonable to me, but it's hard to say without seeing the code. If you prepare a pull request, please consider how to make the tests extensible. So, for example, the data in |
Hi @jameslamb. I've been working on this and have ran a simple test in my fork. The relevant file is here and the workflow run is here The workflow right now builds the lightgbm binary, installs the python package and uses pytest to run the tests in that file. I'd appreciate your feedback when you have time. |
Thanks so much for doing this, @jmoralez ! Really, really nice implementation. The way you set up the tests looks like it would be very manageable to extend in the future. I'm going to re-open this issue since you're doing work on this. Could you open a draft PR with your changes? I can help there with other things like how to integrate with the rest of our CI structure. |
oh wow, I never came back and closed this after #4254!! Thanks for going through these old issues @StrikerRUS . I've been trying to keep up with them (and with closing features in favor of putting them in #2302) but it's a lot 😅 |
No problem! Your contribution in issues' management cannot be overestimated! |
I'm marking this issue with
question
label but hope to convert it tofeature request
after some discussions.Right now LightGBM lacks any tests related to parallel (in located in
network
folder) code.MPI
job which run on some our CIs simply execute ordinary tests withserial
single machine tree learner but with dynamic library compiled with MPI support. In other words, we just check that MPI code can be compiled.Recently added Dask module are getting more and more tests and this is very good but they are
I believe it will be good to write some basic tests that will cover low-level LightGBM cpp code. Ideally it should cover both socket and MPI implementations.
Adding such tests will help to improve standalone LightGBM parallel library and in consequence Dask-package built on top of it.
Linking #261.
Refer to #3839 (comment) for test example.
The text was updated successfully, but these errors were encountered: