Releases: dotnet/machinelearning

ML.NET 4.0.1

14 Jan 23:35
5a76805

ML.NET Servicing Release 4.0.1

Bug Fixes

Build / Test updates

  • Update System.Numerics.Tensors version (#7322) (#7355) - Thanks @asmirnov82!
  • [release/4.0] Update dependencies from dotnet/arcade (#7327)
  • Update MicrosoftExtensionsDependencyModelVersion (#7342)

Documentation Updates

  • [release/4.0] Some tweaks to the Microsoft.ML.Tokenizers PACKAGE.md (#7364)
  • [release/4.0] Fix up docs for MLContext (#7363)

v4.0.0-preview1

13 Mar 06:06
86c11e1
v4.0.0-preview1 Pre-release

What's Changed

New Contributors

Full Changelog: v3.0.1...v4.0.0-preview1

ML.NET 3.0.1

13 Mar 02:52
9e500a5

ML.NET 3.0.1

New Features

  • Add support for Apache.Arrow.Types.TimestampType to DataFrame (#6871) - Thanks @asmirnov82!

Enhancements

  • Update TorchSharp to latest version (#6954)
  • Reorganize dataframe files (#6872) - Thanks @asmirnov82!
  • Add sample variance and standard deviation to NormalizeMeanVariance (#6885) - Thanks @tearlant!
  • Fixes NER to correctly expand/shrink the labels (#6928)

Bug Fixes

  • Fix SearchSpace reference not being included (#6951)
  • Rename NameEntity to NamedEntity (#6917)
  • Fix assert by only accessing idx (#6924)

Build / Test updates

  • Add Backport github workflow (#6944)
  • Branding for 3.0.1 (#6943)
  • Only use semi-colons for NoWarn - fixes build break (#6935)
  • Update dependencies from dotnet/arcade (#6703)
  • Update dependencies from dotnet/arcade (#6957)
  • Migrate to the 'locker' GitHub action for locking closed/stale issues/PRs (#6896)
  • Make double assertions compare with tolerance instead of precision (#6923)
  • Don't include the SDK in our helix payload (#6918)

Documentation Updates

Breaking changes

  • Rename NameEntity to NamedEntity (#6917)

ML.NET 3.0.0

22 Nov 19:45
d96d7b7

ML.NET 3.0.0

New Features

  • Add the ability to use Object Detection using TorchSharp (#6605) - We have added a new deep learning model backed by TorchSharp that lets you fine-tune your own Object Detection model!
  • Add SamplingKeyColumnName to AutoMLExperiment API (#6649) - You can now set the SamplingKeyColumnName when you are using AutoML. Thanks @torronen!
  • Add Object Detection to AutoML Sweeper (#6633) - Added Object Detection to the AutoML Sweeper so now they can be used together.
  • Add String Vector support to DataFrame (#6628) - Adds support for String Vectors in DataFrame. This also allows for better IDataView <-> DataFrame conversions.
  • Add AutoZero tuner to BinaryClassification (#6615) - Can now use AutoZero tuner in AutoML Binary Classification experiments.
  • Added in fairness assessment and mitigation (#6539) - Adds support for a fairness assessment and mitigation tool.
  • Added in Support for some Intel OneDal Algorithms (#6521) - You can now use Intel's OneDal for some algorithms. This gives you access to some accelerated versions of these algorithms. The models are fully interoperable between ML.NET's normal models and these, so you can train with OneDal and then still run on machines where OneDal is not supported. Thanks @rgesteve!
  • Add in ability to have pre-defined weights for ngrams (#6458) - If you know the weights of your NGrams already you can now directly provide that.
  • Add SentenceSimilarity sweepable estimator in AutoML (#6445) - Can now use SentenceSimilarity with the sweepable estimator.
  • Add VBufferDataFrameColumn to DataFrame (#6409) - Now DataFrame can support the VBuffer from ML.NET so the IDataView <-> DataFrame conversion can work with those types.
  • Added ADO.NET importing/exporting functionality to DataFrame (#5975) - Can now use ADO.NET import/export with DataFrames. Thanks @andrei-faber!
  • Added native binaries for Windows Arm64 (#6813) - This allows certain native transforms that were previously disabled to run on Windows Arm.
  • Switches some computational code to use the new Tensor Primitives package (#6875)
  • Add QA sweepable estimator in AutoML (#6781)
  • Add NameEntityRecognition and Q&A deep learning tasks. (#6760)
  • Adds the ability to load a pre-trained LightGBM file and import it into ML.NET. (#6569)

Enhancements

  • Expose ExperimentSettings.MaxModel as public (#6663) - Exposes ExperimentSettings.MaxModel as public so now you can set the number of Max Models you want for an AutoML experiment.
  • Update to latest version of TorchSharp (#6636) - Updated to the latest version of TorchSharp and fixed any breaking changes so we can take advantage of their new features and bug fixes.
  • Update to latest version of Onnx Runtime (#6624) - Updated to the latest version of Onnx Runtime and fixed any breaking changes so we can take advantage of their new features and bug fixes.
  • Update ML.NET to compile with .NET8 (#6641) - Removed some deprecated code that now throws errors on .NET8, along with other minor fixes to allow working/building with .NET8.
  • Added more logging to Object Detection (#6646) - Added more logging while Object Detection is training so even if epochs take a long time you can be sure things are still moving.
  • Update timeout error message in AutoMLExperiment (#6613) - Updated the error message so it is clearer what happened.
  • Add batchsize and arch to imageClassification SweepableTrainer (#6597) - Added batchsize and arch to the ImageClassification SweepableTrainer so those can now be trained on.
  • Update max_model when trial fails (#6596)
  • Add default search space for standard trainers (#6576) - Added a default search space for all standard trainers so users have reasonable default values.
  • Adding more metrics to BinaryClassification Experiment (#6571)
  • Add checkAlive in NasBertTrainer (#6546) - Now we check between batches if cancellation was requested and stop processing if so.
  • OneDAL - Fallback to default implementation (#6538) - If you specify you want to use OneDal but something happens that prevents you from using it, like it can't find the binaries/etc, it will auto default back to the normal implementation instead of crashing.
  • Add addKeyValueAnnotationsAsText flag in AutoML (#6535)
  • Add continuous resource monitoring to AutoML.IMonitor (#6520) - Thanks @andrasfuchs!
  • Update WebClient to HttpClient implementations (#6476) - Update the usage of WebClient to HttpClient since WebClient is now deprecated. Thanks @rgesteve!
  • Set AutoML trial to unsuccess if trial loss is nan/inf (#6430) - A trial will now be marked as unsuccessful if the loss is an invalid number.
  • Add diskConvert option in fast tree search space (#6316)
  • Avoid Boxing/Unboxing on accessing elements of VBufferDataFrameColumn (#6867) and (#6865) - Thanks @asmirnov82!
  • Update LightGBM to version 3.X.X from 2.X.X (#6880)
  • Implement vectorized binary arithmetic operations for DataFrames (#6854) - Thanks @asmirnov82!
  • Upgrade .NET Interactive (#6857) - Thanks @colombod!
  • Improve performance of column cloning inside DataFrame arithmetics (#6814) - Thanks @asmirnov82!
  • Add performance benchmarks for dataframe arithmetic operations (#6827) - Thanks @asmirnov82!
  • Simplify tt files for PrimitiveDataFrameColumnAritmetics (#6830) - Thanks @asmirnov82!
  • Improve performance of DataFrame binary comparison operations (#6869) - Thanks @asmirnov82!
  • Allow a CultureInfo to be used for parsing CSV values into DataFrame (#6782) - Thanks @asmirnov82!
  • File-scoped namespaces in files under Prediction (Microsoft.ML.Core) (#6792) - Thanks @Lehonti!
  • File-scoped namespaces in files under ComponentModel (Microsoft.ML.Core) (#6788) - Thanks @Lehonti!
  • File-scoped namespaces in files under Data (Microsoft.ML.Core) (#6789) - Thanks @Lehonti!
  • File-scoped namespaces in files under EntryPoints (Microsoft.ML.Core) (#6790) - Thanks @Lehonti!
  • File-scoped namespaces in files under Environment (Microsoft.ML.Core) (#6791) - Thanks @Lehonti!
  • Add TargetType to Type_convert (#6785)
  • Modernized some argument checks that still used string literals for parameter names (#6766) - Thanks @Lehonti!
  • Improve DataFrame Arithmetics implementation (#6763) - Thanks @asmirnov82!
  • Fixed mac build and minor torch sharp changes (#6776)
  • Clean DataFrame meaningless code (#6761) - Thanks @asmirnov82!
  • Provide ability to filter dataframe column by null via ElementWise Methods (#6723)
  • Add missing implementation for datetime relevant arrow type into dataframe (#6675) - Thanks @asmirnov82!
  • Fix DataFrame to allow to store columns with size more than 2 Gb (#6710) - Thanks @A...

ML.NET 3.0.0 Preview 2

16 May 16:55
f93ab25
Pre-release

ML.NET 3.0.0 Preview

New Features

  • Add the ability to use Object Detection using TorchSharp (#6605) - We have added a new deep learning model backed by TorchSharp that lets you fine-tune your own Object Detection model!
  • Add SamplingKeyColumnName to AutoMLExperiment API (#6649) - You can now set the SamplingKeyColumnName when you are using AutoML. Thanks @torronen!
  • Add Object Detection to AutoML Sweeper (#6633) - Added Object Detection to the AutoML Sweeper so now they can be used together.
  • Add String Vector support to DataFrame (#6628) - Adds support for String Vectors in DataFrame. This also allows for better IDataView <-> DataFrame conversions.
  • Add AutoZero tuner to BinaryClassification (#6615) - Can now use AutoZero tuner in AutoML Binary Classification experiments.
  • Added in fairness assessment and mitigation (#6539) - Adds support for a fairness assessment and mitigation tool.
  • Added in Support for some Intel OneDal Algorithms (#6521) - You can now use Intel's OneDal for some algorithms. This gives you access to some accelerated versions of these algorithms. The models are fully interoperable between ML.NET's normal models and these, so you can train with OneDal and then still run on machines where OneDal is not supported. Thanks @rgesteve!
  • Add in ability to have pre-defined weights for ngrams (#6458) - If you know the weights of your NGrams already you can now directly provide that.
  • Add SentenceSimilarity sweepable estimator in AutoML (#6445) - Can now use SentenceSimilarity with the sweepable estimator.
  • Add VBufferDataFrameColumn to DataFrame (#6409) - Now DataFrame can support the VBuffer from ML.NET so the IDataView <-> DataFrame conversion can work with those types.
  • Added ADO.NET importing/exporting functionality to DataFrame (#5975) - Can now use ADO.NET import/export with DataFrames. Thanks @andrei-faber!

Enhancements

  • Expose ExperimentSettings.MaxModel as public (#6663) - Exposes ExperimentSettings.MaxModel as public so now you can set the number of Max Models you want for an AutoML experiment.
  • Update to latest version of TorchSharp (#6636) - Updated to the latest version of TorchSharp and fixed any breaking changes so we can take advantage of their new features and bug fixes.
  • Update to latest version of Onnx Runtime (#6624) - Updated to the latest version of Onnx Runtime and fixed any breaking changes so we can take advantage of their new features and bug fixes.
  • Update ML.NET to compile with .NET8 (#6641) - Removed some deprecated code that now throws errors on .NET8, along with other minor fixes to allow working/building with .NET8.
  • Added more logging to Object Detection (#6646) - Added more logging while Object Detection is training so even if epochs take a long time you can be sure things are still moving.
  • Update timeout error message in AutoMLExperiment (#6613) - Updated the error message so it is clearer what happened.
  • Add batchsize and arch to imageClassification SweepableTrainer (#6597) - Added batchsize and arch to the ImageClassification SweepableTrainer so those can now be trained on.
  • Update max_model when trial fails (#6596)
  • Add default search space for standard trainers (#6576) - Added a default search space for all standard trainers so users have reasonable default values.
  • Adding more metrics to BinaryClassification Experiment (#6571)
  • Add checkAlive in NasBertTrainer (#6546) - Now we check between batches if cancellation was requested and stop processing if so.
  • OneDAL - Fallback to default implementation (#6538) - If you specify you want to use OneDal but something happens that prevents you from using it, like it can't find the binaries/etc, it will auto default back to the normal implementation instead of crashing.
  • Add addKeyValueAnnotationsAsText flag in AutoML (#6535)
  • Add continuous resource monitoring to AutoML.IMonitor (#6520) - Thanks @andrasfuchs!
  • Update WebClient to HttpClient implementations (#6476) - Update a usage of WebClient to HttpClient since WebClient is now deprecated. Thanks @rgesteve!
  • Set AutoML trial to unsuccess if trial loss is nan/inf (#6430) - A trial will now be marked as unsuccessful if the loss is an invalid number.
  • Add diskConvert option in fast tree search space (#6316)

Bug Fixes

  • Fix DataFrame ToString (#6673) - Use correct alignment for columns to produce readable output when columns have longer names. Thanks @asmirnov82!
  • Fix DataFrame null math (#6661) - Fixes max in DataFrame columns when there are null values to match what Pandas does.
  • Clean up PrimitiveColumnContainer (#6656) - Cleaned up the code in PrimitiveColumnContainer so it's more correct and easier to use.
  • Fix Apply in PrimitiveColumnContainer (#6642) - Fixes the Apply method so it no longer changes the source column. Thanks @janholo!
  • Fix datetime null error (#6627) - Fixes loading a null datetime from a database so it now returns correctly instead of throwing an error.
  • Fix AggregateTrainingStopManager is trying to cancel disposed tokens (#6612) - Will no longer try to cancel already-disposed tokens.
  • Fix tostring bug for sweepable pipeline (#6610)
  • Change Test to Validate in Dataset manager (#6599)
  • Fixed System.OperationCanceledException when calling experimentResult.BestRun.Estimator.Fit (#6572)
  • Fixed cancellation bug in SweepablePipelineRunner && Fixed object null exception in AutoML v1.0 regression API (#6560)
  • Fixed one dal dispatching issues (#6547) - OneDal now dispatches correctly.
  • Fixed Multi-threaded access issue (#6537) - Fixed a multi-threaded access issue for variable length string arrays in ONNX models.
  • Fixed AutoML experiments in non declarative style not working (#6447)

Build / Test updates

  • Remove MSIL Check for TorchSharp (#6658) - Removes the MSIL check for TorchSharp while we figure out how we want to correctly handle this.
  • Change code coverage build pool (#6647) - Changed codecoverage build pool so the builds are faster and more stable.
  • Update AutoMLExperimentTests.cs to fix timeout error (#6638)
  • Update FabricBot config (#6619)
  • Libraries area pod updates March 2023 (#6607)
  • Update dependencies from dotnet/arcade (#6566 & #6518 & #6451 & #6439)
  • Mac python fix (#6549)
  • Moving onedal nuget download from onedal to native where it's needed for building (#6527)
  • New os image for official builds (#6467)

Documentation Updates


ML.NET 2.0.1 Preview

28 Nov 19:36
afb4b8b
ML.NET 2.0.1 Preview Pre-release

Minor update to 2.0.0 that introduces a new API for ProduceWordBag so you can pass in a column that already has the weights set.

ML.NET 1.7.1

09 Mar 18:41
4cf622e

Minor servicing update with dependency updates and a PFI bug fix so that the correct transformer is found.

ML.NET 1.7.0 RC 1

21 Oct 18:12
7f013a2
ML.NET 1.7.0 RC 1 Pre-release

ML.NET 1.7.0 RC 1

Going forward, we will align more closely with the overall .NET release schedule. As a result, this is a smaller release, since we had a larger one about three months ago, but it aligns us with the release of .NET 6.

New Features

ML.NET

  • Switched to getting version from assembly custom attributes (#4512) Removes the reliance on FileVersionInfo for getting the product version for model.zip/version.txt and uses assembly custom attributes instead. This will help in supporting single file applications. (Thanks @r0ss88)
  • Can now optionally not dispose of the underlying model when you dispose a prediction engine. (#5964) A new prediction engine options class has been added that lets you determine if the underlying model should be disposed of or not when the prediction engine itself is disposed of.
  • Can now set the number of threads that ONNX Runtime uses (#5962) This lets you specify the number of parallel threads ONNX Runtime will use to execute the graph and run the model. (Thanks @yaeldekel)
  • The PFI API has been completely reworked and is now much more user-friendly (#5934) You can now get the output from PFI as a dictionary mapping the column name (or the slot name) to its PFI result.

DataFrame

  • Can now merge using multiple columns in a JOIN condition (#5838) (Thanks @asmirnov82)

Enhancements

ML.NET

  • Run formatting on all src projects (#5937) (Thanks @jwood803)
  • Added BufferedStream for reading from DeflateStream - reduces loading time for .NET Core (#5924) (Thanks @martintomasek)
  • Update editor config to match Roslyn and format samples (#5893) (Thanks @jwood803)
  • Few more minor editor config changes (#5933)

DataFrame

  • Use Equals and = operator for DataViewType comparison (#5942) (Thanks @thoron)

Bug Fixes

Build / Test updates

  • Changed the queues used for building/testing from Ubuntu 16.04 to 18.04 (#5970)
  • Add in support for building with VS 2022. (#5956)
  • Codecov yml token was added (#5950)
  • Move from XliffTasks to Microsoft.DotNet.XliffTasks (#5887)

Documentation Updates

  • Fixed up Readme, updated the roadmap, and new doc detailing some platform limitations. (#5892)

Breaking Changes

  • None

ML.NET 1.6.0

15 Jul 17:26
eb9cee6

ML.NET 1.6.0

New Features

  • Support for Arm/Arm64/Apple Silicon has been added. (#5789) You can now use most of ML.NET on Arm/Arm64/Apple Silicon devices. Anything without a hard dependency on x86 SIMD instructions or Intel MKL is supported.
  • Support for specifying a temp path ML.NET will use. (#5782) You can now set the TempFilePath in the MLContext to control which temp path ML.NET uses.
  • Support for specifying the recursion limit to use when loading an ONNX model (#5840) The recursion limit defaults to 100, but you can now specify the value in case you need to use a larger number. (Thanks @Crabzmatic)
  • Support for saving TensorFlow models in the SavedModel format added (#5797) You can now save models that use the TensorFlow SavedModel format instead of just the frozen graph format. (Thanks @darth-vader-lg)
  • DataFrame Specific enhancements
    • Extended DataFrame GroupBy operation (#5821) Extends the DataFrame GroupBy operation by adding a new property, Groupings. This property returns a collection of IGrouping objects (the same way the LINQ GroupBy operation does). (Thanks @asmirnov82)

Enhancements

  • Switched from using a fork of SharpZipLib to using the official package (#5735)
  • Let user specify a temp path location (#5782)
  • Clean up ONNX temp models by opening with a "Delete on close" flag (#5782)
  • Ensures the named model is loaded in a PredictionEnginePool before use (#5833) (Thanks @feiyun0112)
  • Use indentation for 'if' (#5825) (Thanks @feiyun0112)
  • Use Append instead of AppendFormat if we don't need formatting (#5826) (Thanks @feiyun0112)
  • Cast by using is operator (#5829) (Thanks @feiyun0112)
  • Removed unnecessary return statements (#5828) (Thanks @feiyun0112)
  • Removed code that could never be executed (#5808) (Thanks @feiyun0112)
  • Remove some empty statements (#5827) (Thanks @feiyun0112)
  • Added in short-circuit logic for conditionals (#5824) (Thanks @feiyun0112)
  • Update LightGBM to v2.3.1 (#5851)
  • Raised the default recursion limit for ONNX models from 10 to 100. (#5796) (Thanks @darth-vader-lg)
  • Speed up the inference of the TensorFlow saved_models. (#5848) (Thanks @darth-vader-lg)
  • Speed-up bitmap operations on images. (#5857) (Thanks @darth-vader-lg)
  • Updated to latest version of Intel MKL. (#5867)
  • AutoML.NET specific enhancements
    • Offer suggestions for possibly mistyped label column names in AutoML (#5624) (Thanks @Crabzmatic)
  • DataFrame Specific enhancements
    • Improve csv parsing (#5711)
    • IDataView to DataFrame (#5712)
    • Update to the latest Microsoft.DotNet.Interactive (#5710)
    • Move DataFrame to machinelearning repo (#5641)
    • Improvements to the sort routine (#5776)
    • Improvements to the Merge routine (#5778)
    • Improve DataFrame exception text (#5819) (Thanks @asmirnov82)
    • DataFrame csv DateTime enhancements (#5834)

Bug Fixes

  • Fix erroneous use of TaskContinuationOptions in ThreadUtils.cs (#5753)
  • Fix a few locations that can try to access a null object (#5804) (Thanks @feiyun0112)
  • Use return value of method (#5818) (Thanks @feiyun0112)
  • Adding throw to some exceptions that weren't throwing them originally (#5823) (Thanks @feiyun0112)
  • Fixed a situation in the CountTargetEncodingTransformer where it never reached the stop condition (#5822) (Thanks @feiyun0112)
  • DataFrame Specific bug fixes
    • Fix issue with DataFrame Merge method (#5768) (Thanks @asmirnov82)

Build / Test updates

  • Changed default branch from master to main (#5715) (#5717) (#5719)
  • Fix for libomp in the CI process for MacOS 11 (#5771)
  • Minor code cleanup. (#5770)
  • Updated arcade to the latest version (#5783)
  • Switched signing certificate to use dotnet certificate (#5794)
  • Building natively and cross targeting for Arm/Arm64/Apple Silicon is now supported. (#5789)
  • Upload classic pdb to symweb (#5816)
  • Fix MacOS CI issue (#5854)
  • Added in a Helix Integration for testing. (#5837)
  • Added in Helix Integration for arm/arm64/Apple Silicon for testing (#5860)

Documentation Updates

  • Fixed markdown issues in MulticlassClassificationMetrics and CalibratedBinaryClassificationMetrics (#5732) (Thanks @R0Wi)
  • Update unix instructions for x-compiling on ARM (#5811)
  • Update Contribution.MD with description of help wanted tags (#5815)
  • Add Korean translation for repo readme.md (#5780) (Thanks @metr0jw)
  • Fix spelling error in MLContext class summary (#5832) (Thanks @Crabzmatic)
  • Update issue templates (#5846)

Breaking Changes

  • None

ML.NET 1.5.5

03 Mar 19:20
58450d4

New Features

  • New API allowing confidence parameter to be a double. (#5623) A new API has been added to accept double type for the confidence level. This helps when you need to have higher precision than an int will allow for. (Thank you @esso23)
  • Support to export ValueMapping estimator to ONNX was added (#5577)
  • New API to treat TensorFlow output as batched/not-batched (#5634) A new API has been added so you can specify if the output from TensorFlow is batched or not.

Enhancements

  • Make ColumnInference serializable (#5611)

Bug Fixes

  • AutoML.NET specific fixes.
    • Fixed an AutoML aggregate timeout exception (#5631)
    • Offer suggestions for possibly mistyped label column names in AutoML (#5624) (Thank you @Crabzmatic)
  • Update some ToString conversions (#5627) (Thanks @4201104140)
  • Fixed an issue in SRCnnEntireAnomalyDetector (#5579)
  • Fixed nuget.config multi-feed issue (#5614)
  • Remove references to Microsoft.ML.Scoring (#5602)
  • Fixed Averaged Perceptron default value (#5586)

Build / Test updates

  • Fixing official build by adding homebrew bug workaround (#5596)
  • Nuget.config url fix for roslyn compilers (#5584)
  • Add SymSgdNative reference to AutoML.Tests.csproj (#5559)

Documentation Updates

  • Updated documentation for the correct version of CUDA for TensorFlow. (#5635)
  • Updates documentation for an issue with brew and installing libomp. (#5635)
  • Updated an ONNX url to the correct url. (#5635)
  • Added a note in the documentation that the PredictionEngine is not thread safe. (#5583)

Breaking Changes

  • None