Many exception messages thrown are unclear - as a result, when an exception occurs, it's challenging to identify whether the issue is with the ML.NET code, with the underlying data, with how the algorithm is being applied, etc. Often it takes stepping through the ML.NET framework in an attempt to get further context.
I logged this as a single issue because I think there would be benefit in looking at all places where exceptions are being thrown\rethrown to ensure that default exception messages aren't provided and that the messages are as clear\rich as possible. Let me know if you would like these broken into separate issues rather than having them combined in one.
Here are some specific examples:
1.
Trainer: N/A
Scenario: Occurs when an invalid field index is provided to the LoadColumn attribute. For example:
    [LoadColumn(100)] public uint Label { get; set; }
In the above code, 100 is an invalid index because the underlying data has fewer than 100 columns.
Actual Message: System.ArgumentNullException: 'Value cannot be null. Parameter name: items'
Suggested Message: The message should indicate which column has the issue; the reference to parameter ‘items’ is unclear.

2.
Trainer: N/A
Scenario: Occurs when a Feature column is of a type other than float\Single. For example:
    [ColumnName("Test"), LoadColumn(135)] public uint Test { get; set; }
Actual Message: System.InvalidOperationException: 'Column ‘Test’ has values of UInt32, which is not the same as earlier observed type of Single.'
Suggested Message: It’s unclear what “same as earlier observed type” means. Consider rewording to state that the Feature columns must all be of a certain type (e.g. Single).

3.
Trainer: LightGbm
Scenario: Occurs when custom gains are specified without providing a group id column. For example:
    var customGains = new LightGbmRankingTrainer.Options();
    customGains.CustomGains = new int[] { 0, 1, 2, 3 };
    IEstimator<ITransformer> trainer = mlContext.Ranking.Trainers.LightGbm(customGains);
    IEstimator<ITransformer> trainerPipeline = dataPipeline.Append(trainer);
Notice that in the above code, the Group Id isn’t explicitly set, which would be done as follows:
    customGains.RowGroupColumnName = "GroupId";
Actual Message: System.ArgumentOutOfRangeException: 'Need a group column. Parameter name: data'
Suggested Message: ArgumentOutOfRangeException is confusing; instead, throw ArgumentNullException or InvalidOperationException. The message should also indicate that the ‘Group Id’ column is missing\null; the reference to parameter ‘data’ is unclear.

4.
Trainer: LightGbm
Scenario: Occurs when the cardinality of the custom gains doesn’t match the cardinality of the relevance label values. For example:
    var customGains = new LightGbmRankingTrainer.Options();
    customGains.CustomGains = new int[] { 0, 1, 2 };
    customGains.RowGroupColumnName = "GroupId";
In the underlying data, the relevance label values are {0, 1, 2, 3, 4}; in other words, the cardinality of the relevance label values is greater than that of the specified custom gains.
Actual Message: System.InvalidOperationException: 'LightGBM Error, code is -1, error message is 'label (0) excel the max range 3'.'
Suggested Message: There appears to be a typo: “excel” should say “exceeds”. Also, the message should state that the cardinality of the relevance label values must be less than or equal to the cardinality of the custom gains. Note: Refer to the similar issue logged directly against LightGBM: microsoft/LightGBM#1090
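
To make the suggestions above more concrete, here is a rough sketch of the kind of checks and messages I have in mind for scenarios 1, 3 and 4. It is plain C# with no ML.NET dependency. The class TrainingInputChecks and its methods are hypothetical names made up for illustration; they are not part of ML.NET, and I'm not claiming this is how the framework validates inputs internally. The gains check also assumes that each relevance label value is used as an index into CustomGains, which is my reading of the LightGBM error rather than something I have confirmed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not an ML.NET API.
public static class TrainingInputChecks
{
    // Scenario 1: name the offending member and state the actual column count.
    public static void CheckLoadColumnIndex(string memberName, int index, int columnCount)
    {
        if (index < 0 || index >= columnCount)
        {
            throw new ArgumentOutOfRangeException(nameof(index), index,
                $"LoadColumn({index}) on member '{memberName}' is out of range: " +
                $"the underlying data has only {columnCount} column(s).");
        }
    }

    // Scenario 3: say which option is missing instead of pointing at 'data'.
    public static void CheckRowGroupColumn(string rowGroupColumnName, IEnumerable<string> schemaColumns)
    {
        if (string.IsNullOrEmpty(rowGroupColumnName))
        {
            throw new InvalidOperationException(
                "Ranking requires a group id column, but RowGroupColumnName is not set.");
        }

        if (!schemaColumns.Contains(rowGroupColumnName))
        {
            throw new InvalidOperationException(
                $"Group id column '{rowGroupColumnName}' was not found. " +
                $"Available columns: {string.Join(", ", schemaColumns)}.");
        }
    }

    // Scenario 4: assumes a label value is used as an index into the gains array.
    public static void CheckCustomGains(int[] customGains, IEnumerable<uint> labels)
    {
        uint maxLabel = labels.DefaultIfEmpty(0u).Max();
        if (maxLabel >= customGains.Length)
        {
            throw new InvalidOperationException(
                $"Relevance label value {maxLabel} exceeds the range covered by CustomGains, " +
                $"which has {customGains.Length} value(s). Provide one gain per label value.");
        }
    }
}
```

With checks like these, calling TrainingInputChecks.CheckLoadColumnIndex("Label", 100, 20) would fail with a message that names the Label member and the real column count, rather than a null 'items' parameter.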