[Issue, ML.net C#] 330GB csv file of data cause a OutOfMemoryException (2/2) #6297
Comments
This looks like the old version of AutoML. @wil70 can you please try to use the latest preview version of AutoML. https://www.nuget.org/packages/Microsoft.ML.AutoML/0.20.0-preview.22313.1 Here are some samples that use the new API. |
@wil70 let us know if @luisquintanilla's comments resolve your issue, and a tip:
|
This issue has been marked |
Hello, Sorry for the delay, it takes time to reproduce the issue and get the error message with big files.
Thanks Wilhelm cc: @dakersnar @LittleLittleCloud @luisquintanilla @michaelgsharp |
Thanks for that detailed report @wil70. @LittleLittleCloud can you please take a look. |
@wil70
and hopefully it can resolve the OOM exception you have. I launched an experiment yesterday to verify that and it's still running; I'll get back to this thread if the run succeeds. |
Nice, exciting! Let me know when it's done and I can run a test. I'm wondering if that will also solve #6288. Note: my next step after 330GB will be to aim for a 2TB+ file. TY! |
@LittleLittleCloud after your other PR goes in, is this issue good to be closed? |
@michaelgsharp Nope, still looking into that. @luisquintanilla maybe we should provide a memory-saving AutoML solution, as we have a lot of similar issues about OOM errors in both Model Builder and AutoML.NET, like this one: dotnet/machinelearning-modelbuilder#2328 |
@LittleLittleCloud more control over resources, or more efficient resource management, is something we definitely want to address. Not sure if I'm misremembering, but I know that you're able to inspect the amount of resources used by each of the trials. Is there a setting when you configure an experiment where you're able to cap the amount of resources used? As far as I know, today the options for managing resources are:
Are there any others? |
Suggestions (guesses) for future features:
Maybe the docs could include a suggestion to add a fixed-size paging file. I had problems when I allowed Windows to manage it, but after adding a fixed 500-600 GB paging file I no longer run out of physical memory with tree-based algorithms. With my datasets, usage could go as high as 650 GB. I now always add a paging file saved to 2 physical NVMe drives, and it has been OK. The paging file can be striped between the drives, so putting it on 2 drives is faster. I also tried servers with 1TB and 2TB of RAM for a few months, but that did not have a big impact compared to NVMe paging files, especially considering the cost of such servers. I mostly used Microsoft.ML but also tried Model Builder. |
System Information (please complete the following information):
Describe the bug
When I start AutoML from C#, I get an OutOfMemoryException once memory usage reaches the maximum of 64GB.
I have 64GB of RAM and a 330GB CSV file of data.
Note: I couldn't do it with the ML.NET CLI due to bug #6288, so I tried the C# AutoML package instead. I'm totally new to ML.NET, so sorry in advance for the code quality.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect to be able to handle 2TB files and 100K columns without any issue, both with the ML.NET CLI and with C#, on a computer with 64GB of RAM, by streaming the data instead of loading it all into memory.
Screenshots, Code, Sample Projects
If applicable, add screenshots, code snippets, or sample projects to help explain your problem.
Additional context
Add any other context about the problem here.
I have a 330GB file (64GB RAM). I tried the ML.NET CLI but hit a bug (the one noted above).
So I'm now trying with C#. This bug is different from the ML.NET CLI issue, as it seems to try to load everything into memory.
IDataView trainingData = mlContext.Data.LoadFromTextFile<ModelInput>(
@"c:\data.csv",
separatorChar: ',', hasHeader: true, trimWhitespace: true);
.....
public class ModelInput
{
[LoadColumn(0), NoColumn]
public string _data0 { get; set; }
The run fails with: Exception of type 'System.OutOfMemoryException' was thrown.
(new System.Collections.Generic.Mscorlib_CollectionDebugView<Microsoft.ML.AutoML.RunDetail<Microsoft.ML.Data.MulticlassClassificationMetrics>>(experimentResult.RunDetails).Items[0]).Exception.StackTrace
at Microsoft.ML.Internal.Utilities.OrderedWaiter.Wait(Int64 position, CancellationToken token)
at Microsoft.ML.Data.CacheDataView.GetPermutationOrNull(Random rand)
at Microsoft.ML.Data.CacheDataView.GetRowCursorSetWaiterCore[TWaiter](TWaiter waiter, Func`2 predicate, Int32 n, Random rand)
at Microsoft.ML.Data.CacheDataView.GetRowCursorSet(IEnumerable`1 columnsNeeded, Int32 n, Random rand)
at Microsoft.ML.Data.OneToOneTransformBase.GetRowCursorSet(IEnumerable`1 columnsNeeded, Int32 n, Random rand)
at Microsoft.ML.Data.DataViewUtils.TryCreateConsolidatingCursor(DataViewRowCursor& curs, IDataView view, IEnumerable`1 columnsNeeded, IHost host, Random rand)
at Microsoft.ML.Data.TransformBase.GetRowCursor(IEnumerable`1 columnsNeeded, Random rand)
at Microsoft.ML.Trainers.TrainingCursorBase.FactoryBase`1.Create(Random rand, Int32[] extraCols)
at Microsoft.ML.Trainers.OnlineLinearTrainer`2.TrainCore(IChannel ch, RoleMappedData data, TrainStateBase state)
at Microsoft.ML.Trainers.OnlineLinearTrainer`2.TrainModelCore(TrainContext context)
at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.OneVersusAllTrainer.TrainOne(IChannel ch, ITrainerEstimator`2 trainer, RoleMappedData data, Int32 cls)
at Microsoft.ML.Trainers.OneVersusAllTrainer.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at Microsoft.ML.AutoML.RunnerUtil.TrainAndScorePipeline[TMetrics](MLContext context, SuggestedPipeline pipeline, IDataView trainData, IDataView validData, String groupId, String labelColumn, IMetricsAgent`1 metricsAgent, ITransformer preprocessorTransform, FileInfo modelFileInfo, DataViewSchema modelInputSchema, IChannel logger)
The inner exception is also: Exception of type 'System.OutOfMemoryException' was thrown.
(new System.Collections.Generic.Mscorlib_CollectionDebugView<Microsoft.ML.AutoML.RunDetail<Microsoft.ML.Data.MulticlassClassificationMetrics>>(experimentResult.RunDetails).Items[0]).Exception.InnerException.StackTrace
at Microsoft.ML.Internal.Utilities.ArrayUtils.EnsureSize[T](T[]& array, Int32 min, Int32 max, Boolean keepOld, Boolean& resized)
at Microsoft.ML.Internal.Utilities.BigArray`1.AddRange(ReadOnlySpan`1 src)
at Microsoft.ML.Data.CacheDataView.ColumnCache.ImplVec`1.CacheCurrent()
at Microsoft.ML.Data.CacheDataView.Filler(DataViewRowCursor cursor, ColumnCache[] caches, OrderedWaiter waiter)
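Both stack traces go through CacheDataView, which suggests the memory is consumed while the trainer's input is being cached in RAM rather than while the CSV is being read. Below is a minimal sketch of one thing that might be worth trying with the older AutoML experiment API: turning off caching before the trainer. The ModelInput schema, the column layout, and the one-hour time budget are hypothetical placeholders, not taken from the real dataset, and whether this avoids the OutOfMemoryException for a 330GB file is an assumption that has not been verified here.

```csharp
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Hypothetical schema for illustration only; the real file has many more columns.
public class ModelInput
{
    [LoadColumn(0)]
    public string Label { get; set; }

    [LoadColumn(1, 10), VectorType(10)]
    public float[] Features { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // LoadFromTextFile only defines the view; rows are read from disk
        // when a trainer or transform opens a cursor over it.
        IDataView trainingData = mlContext.Data.LoadFromTextFile<ModelInput>(
            @"c:\data.csv",
            separatorChar: ',', hasHeader: true, trimWhitespace: true);

        var settings = new MulticlassExperimentSettings
        {
            // Placeholder budget, chosen arbitrarily for this sketch.
            MaxExperimentTimeInSeconds = 3600,
            // Assumption: with caching off, AutoML does not wrap the data in
            // the in-memory cache that appears in the stack traces above.
            CacheBeforeTrainer = CacheBeforeTrainer.Off,
        };

        var result = mlContext.Auto()
            .CreateMulticlassClassificationExperiment(settings)
            .Execute(trainingData, labelColumnName: "Label");
    }
}
```

Without the cache, trainers that make several passes over the data re-read the file on each pass, so this trades memory for disk I/O and longer trials.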