
custom objective function for pyspark #1171

Open
kiminsigne opened this issue Aug 25, 2021 · 9 comments

Comments

@kiminsigne

Hi, I can see that the custom objective function for the Scala API was recently added in this PR, which is really exciting! Is there any idea of when this functionality will be added to pyspark (perhaps it already has been and I haven't found the PR yet)?

I'm very interested in implementing a custom objective function for the LightGBMRanker model using mean average precision (trying to follow the approach in this paper), which is suited for binary relevance, as the current 'lambdarank' objective uses NDCG, which is best suited for graded relevance measures. It would be nice to have this feature, as the xgboost python package has the option to use the rank:map objective in addition to the default rank:ndcg.

Thanks so much! We've been using your model at our company for the past year, but our training data is binary not graded, and I'd love to use something better suited to our data!
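For context on the metric being requested here: mean average precision averages per-query average precision, which rewards placing the (binary) relevant items near the top of the ranking. A minimal pure-Python sketch of average precision for one ranked list is below; the function name is mine for illustration and is not part of SynapseML, LightGBM, or xgboost.

```python
def average_precision(labels):
    """Average precision for a ranked list of binary relevance labels.

    `labels` is ordered by model score, best first; each entry is 0 or 1.
    AP averages precision@k over the positions k where a relevant item sits.
    """
    hits = 0
    precision_sum = 0.0
    for k, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precision_sum += hits / k
    return precision_sum / hits if hits else 0.0

# Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6
print(average_precision([1, 0, 1, 0]))
```

Unlike NDCG, this quantity has no notion of relevance grades, which is why a rank:map-style objective fits binary training data better.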

@EthanRock

We are using LightGBM on pyspark to build an ML pipeline as well. Wondering if a custom loss function is supported for pyspark?

@imatiach-msft
Contributor

@kiminsigne @EthanRock Yes, pyspark support for custom objective functions is one of the top things I am looking into adding. We just recently added custom objective function support in the Scala API in May of this year (#1054). However, pyspark is more complicated because the function has to run on each of the worker nodes. I only know that UDF transformations can do this, so I may have to look into how to communicate with the python processes on the workers to understand how to make this work.

@andrew-arkhipov

+1 on this. Native LightGBM has cross_entropy as an available objective function, but unfortunately it doesn't seem to be supported in MMLSpark LightGBM. Being able to write a custom objective function that implements cross_entropy would be very helpful.
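For reference, a LightGBM-style custom objective returns a (gradient, hessian) pair of the loss with respect to the raw scores. A plain-Python sketch for binary cross-entropy with a sigmoid link is below; the derivation is standard, but the function names are illustrative and not part of the SynapseML API.

```python
import math

def sigmoid(z):
    """Logistic link mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_objective(preds, labels):
    """Gradient and hessian of binary cross-entropy w.r.t. raw scores.

    For loss L = -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(score):
      dL/dscore   = p - y
      d2L/dscore2 = p * (1 - p)
    This (grad, hess) pair is the shape a custom objective returns.
    """
    probs = [sigmoid(s) for s in preds]
    grad = [p - y for p, y in zip(probs, labels)]
    hess = [p * (1.0 - p) for p in probs]
    return grad, hess

grad, hess = cross_entropy_objective([0.0, 2.0], [1, 0])
# At score 0.0, p = 0.5: grad = -0.5 for a positive label, hess = 0.25
```

This is the same pair that native LightGBM's built-in cross_entropy objective computes internally; the blocker discussed in this thread is only about shipping such a python callable to the Scala workers.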

@imatiach-msft
Contributor

imatiach-msft commented Sep 2, 2021

@andrew-arkhipov it is supported in the scala API, see param here:
https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/main/scala/com/microsoft/ml/spark/lightgbm/params/LightGBMParams.scala#L305
see here for param definition:
https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/main/scala/com/microsoft/ml/spark/lightgbm/params/FObjParam.scala
see example here in scala:
https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/test/scala/com/microsoft/ml/spark/lightgbm/split1/VerifyLightGBMClassifier.scala#L338

It's not yet supported in pyspark because there is no easy way to call the python process from the Scala workers for an arbitrary function like this. I think I will have to look into the interprocess communication code in Apache Spark to figure out how to enable this scenario.

@mhamilton723
Collaborator

Thank you @imatiach-msft for answering this

@sinnfashen
Contributor

Hi there!
Just asking: are there any thoughts or plans on implementing this? I assumed that fobj in the python API worked and spent a lot of time debugging before I got here. I believe this would be a very useful feature to add. (Perhaps it's also worth noting in the docs, until it's actually implemented, that this parameter isn't supported yet, so that people know not to try it.)

@ssabzevari-antuit

Any update on PySpark implementation?

@github-actions github-actions bot reopened this Dec 7, 2022
@yukihiro123

Is there already an implemented way to use a custom objective function in pyspark?
When I set a python custom objective function as the fobj argument of LightGBMClassifier, the following error was output:

def fobj(pred, label):
    ...
    return grad, hess
lgbm = LightGBMClassifier(fobj=fobj)
model = lgbm.fit(train_sdf)

java.lang.ClassCastException: class net.razorvine.pickle.objects.ClassDictConstructor cannot be cast to class com.microsoft.azure.synapseml.lightgbm.params.FObjTrait

I understand that the error occurred when converting the python object to the FObjTrait type.

If there is a way to use your own objective function in pyspark, I would appreciate a specific example. Thank you!

@morulaus

Is there any progress on this? I have failed to find any documentation on the subject.
