I noticed that when I train on Databricks several times with the same parameters on the same data, the resulting models don't give the same predictions, as evidenced by a different NDCG on a separate test set each time.
Here is my training function; my training set has 400K examples in 5K lists, with 60 features:
import com.microsoft.azure.synapse.ml.lightgbm.LightGBMRanker  // package prefix differs on older MMLSpark builds

def train(): Unit = {
  val lgbm = new LightGBMRanker()
    .setCategoricalSlotIndexes(Array(0, 2, 3, 4, 6, 7, 8, 59))
    .setFeaturesCol("features")
    .setGroupCol("query_id")
    .setLabelCol("label")
    .setMaxPosition(10)        // optimize NDCG at this position
    .setParallelism("voting")  // tree learner mode used for distributed training
    .setNumIterations(15)
    .setMaxDepth(4)
    .setNumLeaves(12)

  val training = table("training")  // plain table name, no interpolation needed
  val model = lgbm.fit(training)
}
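For reference, the difference shows up with a check along these lines (a rough sketch rather than my exact evaluation code; the "testing" table and the sum-of-scores comparison are only for illustration, the real metric is NDCG@10 per query):

import org.apache.spark.sql.functions.sum

// Two fits with identical parameters, compared on the held-out set
// (lgbm configured exactly as in train() above).
val training = table("training")
val testing = table("testing")
val modelA = lgbm.fit(training)
val modelB = lgbm.fit(training)
val sumA = modelA.transform(testing).agg(sum("prediction")).first.getDouble(0)
val sumB = modelB.transform(testing).agg(sum("prediction")).first.getDouble(0)
// sumA and sumB differ from run to run, and so do the per-query rankings, hence the NDCG gap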
Is this inherent to distributed training (on 5 executors), or should I change some parameters of my LightGBMRanker instance?
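For what it's worth, this is the kind of change I was wondering about, sketched under the assumption that a global seed / deterministic setter is exposed in the version I'm running (I haven't verified that), with the repartition count simply mirroring my 5 executors:

import org.apache.spark.sql.functions.col

// Pin how rows are spread across the 5 executors and fix LightGBM's random seeds,
// so two runs see the same partitions and the same random draws.
val pinnedTraining = table("training")
  .repartition(5, col("query_id"))

val lgbmPinned = new LightGBMRanker()
  .setFeaturesCol("features")
  .setGroupCol("query_id")
  .setLabelCol("label")
  // ...same categorical slots, maxPosition, parallelism, iterations, depth and leaves as in train()...
  .setSeed(42)             // assumption: a seed setter exists in my version
  .setDeterministic(true)  // assumption: a deterministic flag exists in my version

val pinnedModel = lgbmPinned.fit(pinnedTraining)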