[python] Changing n_jobs through scikit-learn interface has no effect #4706

Closed
david-cortes opened this issue Oct 22, 2021 · 4 comments · Fixed by #4822

@david-cortes
Contributor

If I set n_jobs in a fitted lightgbm model through the scikit-learn interface, the change has no effect.

Example:

import numpy as np
from lightgbm import LGBMRegressor
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True)

# Build a much larger matrix with the same number of columns,
# so that prediction runs long enough to observe CPU usage.
X_long = np.repeat(X, 1000).reshape((-1, X.shape[1]))

model = LGBMRegressor(n_jobs=4).fit(X, y)

Now start watching the resource usage for the process and run this:

model.set_params(n_jobs=1)
pred = model.predict(X_long)
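
Instead of watching a system monitor, the CPU usage of a call can be estimated from inside Python. The sketch below is not part of the original report: `cpu_to_wall_ratio` and `busy_sum` are hypothetical helper names, and the approach assumes that process CPU time summed over all threads, divided by wall-clock time, roughly approximates the number of busy cores.

```python
import time


def cpu_to_wall_ratio(fn, *args, **kwargs):
    """Run fn once and return (result, CPU seconds / wall-clock seconds).

    time.process_time() counts CPU time for the whole process, so a ratio
    near 1 suggests one busy core and a larger ratio suggests several.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = fn(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # Guard against a zero-length interval on extremely fast calls.
    ratio = cpu_used / wall_used if wall_used > 0 else float("nan")
    return result, ratio


def busy_sum(n):
    """Single-threaded stand-in workload (hypothetical, for illustration)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# With the model from the example above (assumed to be in scope):
# pred, ratio = cpu_to_wall_ratio(model.predict, X_long)
```

If the ratio for `model.predict(X_long)` stays well above 1 after `set_params(n_jobs=1)`, that is consistent with the behavior described in this report. The measurement is coarse and can be skewed by other threads in the same process.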
@StrikerRUS
Collaborator

set_params() affects only the fit() stage of the estimator.

Example:

# ... beginning is the same as in the example above

model.set_params(n_jobs=1)
model = model.fit(X_long, np.arange(X_long.shape[0]))  # note that `n_jobs=1` is respected here

For the prediction stage, pass the n_jobs parameter directly to the predict() method:

# ... beginning is the same as in the example above

pred = model.predict(X_long, n_jobs=1)

Duplicate of #1723 (comment).

@david-cortes
Contributor Author

@StrikerRUS By the way, this is not mentioned in the docs. The section on core parameters, which are mostly about training, mentions n_jobs, but nothing in the prediction parameters says that n_jobs is also a parameter to be passed there in the other language interfaces.

It would be helpful if the scikit-learn class docs themselves said that the prediction n_jobs is controlled separately, for example by changing the description from "Number of parallel threads." to "Number of parallel threads to use for training (can be changed at prediction time)." or similar. As the docs read now, I would assume that n_jobs controls both.

It would also be ideal for the scikit-learn interface to mimic scikit-learn itself, where changing n_jobs on the model object also affects n_jobs for predictions.

@StrikerRUS
Collaborator

StrikerRUS commented Oct 22, 2021

I think I agree with you.
Given that even an advanced user like you was confused by the docs, there is clearly something wrong with them.
I'll try to improve them and the sklearn wrapper as well. Thanks a lot for the tips on what should be mentioned explicitly!

@StrikerRUS StrikerRUS reopened this Oct 22, 2021
@jameslamb jameslamb changed the title from "Changing n_jobs through scikit-learn interface has no effect" to "[python] Changing n_jobs through scikit-learn interface has no effect" Oct 25, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023