oneDAL Binary Classification LBFGS - Index was outside the bounds of the array. #6533

@luisquintanilla

System Information (please complete the following information):

  • OS & Version: Windows 11
  • ML.NET Version: ML.NET 3.0 prerelease
  • .NET Version: .NET 6

Describe the bug

Training a binary classification model using the Lbfgs trainer and setting the MLNET_BACKEND environment variable to ONEDAL produces the following error:

Unhandled exception. System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainCoreOneDal(IChannel ch, RoleMappedData data)
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Fit(IDataView input)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at Program.<Main>$(String[] args) in C:\Dev\OneDalTest\OneDalTest\Program.cs:line 31

To Reproduce

  1. Create a C# console application
  2. Install the latest prerelease versions of Microsoft.ML and Microsoft.ML.OneDAL packages.
  3. Paste the following code into the Program.cs file.
using Microsoft.ML;

// Initialize MLContext
var ctx = new MLContext();

// Define data
var trainingData = new [] 
{
    new {Arch="ARM", Trainer="LightGBM", oneDALSupport=false},
    new {Arch="x86", Trainer="FastTree", oneDALSupport=true},
    new {Arch="x86", Trainer="LbfgsLogisticRegression", oneDALSupport=true},
    new {Arch="ARM", Trainer="FastTree", oneDALSupport=false}
};

// Load data into IDataView
var trainingDv = ctx.Data.LoadFromEnumerable(trainingData);

// Define data processing pipeline & trainer
var pipeline = 
    ctx.Transforms.Categorical.OneHotEncoding(new [] {
            new InputOutputColumnPair("ArchEncoded", "Arch"),
            new InputOutputColumnPair("TrainerEncoded", "Trainer")})
        .Append(ctx.Transforms.Concatenate("Features", "ArchEncoded", "TrainerEncoded"))
        .Append(ctx.BinaryClassification.Trainers.LbfgsLogisticRegression(labelColumnName:"oneDALSupport"));

// Train model
var model = pipeline.Fit(trainingDv);
  4. Set the MLNET_BACKEND environment variable to ONEDAL
  5. Run the application.
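
For anyone reproducing this, the environment variable step above can also be done from inside the program rather than in the shell. This is a minimal sketch and not part of the original repro: it assumes ML.NET reads MLNET_BACKEND from the process environment when training starts, so the call has to run before pipeline.Fit(...). That assumption is not confirmed in this issue.

```csharp
using System;

// Assumption: ML.NET checks MLNET_BACKEND in the current process environment
// at training time, so setting it here is equivalent to setting it in the shell.
Environment.SetEnvironmentVariable("MLNET_BACKEND", "ONEDAL");

// Read the value back to confirm what the process will see.
var backend = Environment.GetEnvironmentVariable("MLNET_BACKEND");
Console.WriteLine($"MLNET_BACKEND={backend}");
```

Placing these lines at the top of Program.cs, ahead of the MLContext creation, keeps the repro self-contained and avoids depending on how the shell or IDE passes environment variables to the process.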

Expected behavior
The model trains successfully.

Additional context

The same code using the FastTree trainer trains the model successfully.

Labels: bug
