Description
System Information (please complete the following information):
- OS & Version: Win8, latest version as of this bug entry
- ML.NET Version: 16.13.9
- .NET Version: 6.0.303
Describe the bug
When I start ML.NET from the CLI, I get an OutOfMemoryException.
I have 64GB of RAM and a 330GB CSV file of data.
I tried with
To Reproduce
Steps to reproduce the behavior:
- Generate a 330GB CSV file with 4209 columns of random data
- Open a command prompt
- Type in the command line:
mlnet classification --train-time 75600 --name SampleClassification --log-file-path c:\Log_data.txt --has-header true --label-col 4209 --ignore-cols 0,1,4206,4207,4208 --dataset "c:\data.csv" --test-dataset "c:\test_data.csv"
- See the error log at the end of this message with the OutOfMemoryException
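For anyone trying to reproduce this, a file of that shape can be produced with a short script along these lines. This is only a sketch: the function name `write_random_csv`, the `colN` header names, the float formatting, and the output file name are all assumptions, since the report does not say how the random data was generated.

```python
import csv
import random


def write_random_csv(path, n_rows, n_cols=4209, seed=0):
    """Write a header row plus n_rows rows of random floats, one row at a time."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Hypothetical column names; the original file's header is unknown.
        writer.writerow([f"col{i}" for i in range(n_cols)])
        for _ in range(n_rows):
            # Only one row is held in memory at any time.
            writer.writerow([f"{rng.random():.6f}" for _ in range(n_cols)])


if __name__ == "__main__":
    # Small size for a quick check; raise n_rows to approach the 330GB case.
    write_random_csv("data_small.csv", n_rows=100)
```

Starting with a small `n_rows` and increasing it may help find the size at which the CLI starts to fail.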
Expected behavior
I expect ML.NET to continue and feed the data in as it streams it, so there should be no OutOfMemoryException.
When I monitor the mlnet.exe process with Task Manager, its memory usage doesn't go high at all, staying below ~14GB. So something is not right, as I have 64GB of RAM, and the amount of RAM shouldn't matter anyway, should it?
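One thing that may be worth ruling out: in the log below, the exception is thrown inside StreamReader.ReadLine while it builds a single string, which would be consistent with one extremely long line in the file rather than with the total file size. That is only a guess, not a confirmed cause. The sketch below (the function name `scan_lines` and the chunk size are assumptions) reads the file in fixed-size chunks, so memory use stays constant, and reports the number of lines and the length in bytes of the longest one.

```python
def scan_lines(path, chunk_size=1 << 20):
    """Return (line_count, longest_line_bytes) while using constant memory.

    Lengths exclude the trailing "\n" but include a "\r" if the file
    uses Windows line endings.
    """
    line_count = 0
    longest = 0
    current = 0  # bytes seen so far in the line being scanned
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                nl = chunk.find(b"\n", start)
                if nl == -1:
                    # Line continues into the next chunk.
                    current += len(chunk) - start
                    break
                current += nl - start
                longest = max(longest, current)
                line_count += 1
                current = 0
                start = nl + 1
    if current > 0:
        # Final line without a trailing newline.
        longest = max(longest, current)
        line_count += 1
    return line_count, longest


if __name__ == "__main__":
    print(scan_lines(r"c:\data.csv"))
```

If the longest line turns out to be far larger than a normal row of 4209 values, the file's line endings or quoting would be the first place to look.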
Screenshots, Code, Sample Projects
Additional context
Here is the log
Start Training
start nni training
Experiment output folder: C:\Users\W\AppData\Local\Temp\AutoML-NNI\Experiment-GET3JS
System.FormatException: Parsing failed with an exception: Stream reading encountered exception
---> System.FormatException: Stream reading encountered exception
---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at System.Text.StringBuilder.ToString()
at System.IO.StreamReader.ReadLine()
at Microsoft.ML.Data.TextLoader.Cursor.LineReader.ThreadProc()
--- End of inner exception stack trace ---
at Microsoft.ML.Data.TextLoader.Cursor.LineReader.GetBatch()
at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.Parse(Int32 tid)
at Microsoft.ML.Data.TextLoader.Cursor.ParallelState.ThreadProc(Object obj)
--- End of inner exception stack trace ---
at Microsoft.ML.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext()
at Microsoft.ML.Data.TextLoader.Cursor.MoveNextCore()
at Microsoft.ML.Data.RootCursorBase.MoveNext()
at Microsoft.ML.ModelBuilder.AutoMLService.Proposer.Controller.CountRows(IDataView data, Int64 maxRows) in //src/Microsoft.ML.ModelBuilder.AutoMLService/Proposer/Controller.cs:line 174
at Microsoft.ML.ModelBuilder.AutoMLService.Proposer.Controller.Initialize() in //src/Microsoft.ML.ModelBuilder.AutoMLService/Proposer/Controller.cs:line 111
at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.LocalAutoMLExperiment.ExecuteAsync(IDataView trainData, IDataView validateData, ColumnInformation columnInformation, CancellationToken cancellationToken, CancellationToken timeout) in //src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/LocalAutoMLExperiment.cs:line 138
at Microsoft.ML.ModelBuilder.AutoMLEngine.StartTrainingAsync(TrainingConfiguration config, PathConfiguration pathConfig, CancellationToken userCancellationToken) in //src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 160
at Microsoft.ML.CLI.Runners.AutoMLRunner.ExecuteAsync() in //src/mlnet/Runners/AutoMLRunner.cs:line 88
at Microsoft.ML.CLI.Program.TrainAsync(TrainingConfiguration trainingConfiguration, PathConfiguration pathConfig, AutoMLServiceLogLevel logLevel) in //src/mlnet/Program.cs:line 348
at Microsoft.ML.CLI.Program.AutoMLCommandRunner(AutoMLCommand command, Boolean skipGenerateConsoleApp) in //src/mlnet/Program.cs:line 329
at Microsoft.ML.CLI.Program.<>c.<b__4_0>d.MoveNext() in //src/mlnet/Program.cs:line 89
--- End of stack trace from previous location ---
at System.CommandLine.Invocation.CommandHandler.GetExitCodeAsync(Object value, InvocationContext context)
at System.CommandLine.Invocation.ModelBindingCommandHandler.InvokeAsync(InvocationContext context)
at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass23_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at Microsoft.ML.CLI.Program.<>c__DisplayClass4_0.<b__9>d.MoveNext() in /_/src/mlnet/Program.cs:line 290
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<b__24_0>d.MoveNext()
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass22_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass11_0.<b__0>d.MoveNext()
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<b__10_0>d.MoveNext()
--- End of stack trace from previous location ---
at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass14_0.<b__0>d.MoveNext()
Check out log file for more information: c:\Log_data.txt
Exiting ...
C:\Users\W>