Merge master into release/preview branch for v0.3 #426

shauheen · 2018-06-27T04:46:43Z

This PR merges the accumulated changes from master into the release/preview branch in preparation for the v0.3 release.

* Changes to RocketEngine to fix take top k logic. * Add namespace information to allow file to reference correct version of Formatting object.

* make class partial so I can add constructor in separate file. add constructors for testing * formatting

…rics. Made the private const strings in two classes public. (#276)

* add missing subcomponents * right one * more cleanup

* first attempt * add comments * specify seed for random. make constructor internal.

* Fix for SupportedMetric.ByName() method. Include new unit test for function. * Removed unnecessary field filter, per review comment.

When training a FastTreeRanker using the `testFrequency` parameter, NDCG is expected to be printed every `testFrequency` iterations. However, instead of NDCG, only empty strings are printed. The root cause was that the MaxDCG property of the dataset was never calculated, so the NDCG calculation was aborted, leaving an empty string as the result. This PR fixes the problem by computing the MaxDCG for the dataset when the Tests are defined (so that if the tests are not defined, the MaxDCG will never be calculated). Closes #242
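The failure mode described above can be illustrated with a minimal Python sketch. It is not the FastTree code: the function names, the `2^rel - 1` gain and the `log2` discount are assumptions chosen because they are a common NDCG definition, and `cached_max_dcg` is a hypothetical stand-in for the dataset's MaxDCG property.

```python
import math

def dcg(labels, k):
    """Discounted cumulative gain over the first k labels, in the given order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(labels[:k]))

def max_dcg(labels, k):
    """Best achievable DCG: the same labels sorted from most to least relevant."""
    return dcg(sorted(labels, reverse=True), k)

def ndcg(ranked_labels, k, cached_max_dcg):
    """Return NDCG@k, or None when the normalizer was never computed (or is zero)."""
    if not cached_max_dcg:
        return None
    return dcg(ranked_labels, k) / cached_max_dcg

def format_ndcg(value):
    """A missing value is rendered as an empty string, mirroring the reported symptom."""
    return "" if value is None else f"{value:.4f}"

labels = [3, 2, 3, 0, 1]  # relevance labels in the order the model ranked them

# Normalizer never computed: the metric is skipped and an empty string is printed.
before_fix = format_ndcg(ndcg(labels, 3, cached_max_dcg=None))

# Normalizer computed up front: the metric is a number in (0, 1].
after_fix = format_ndcg(ndcg(labels, 3, cached_max_dcg=max_dcg(labels, 3)))
```

Computing the normalizer only when tests are requested, as the description says, keeps the extra sort off the path of runs that never report NDCG.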

Our release notes link is broken because `Documentation` was renamed to `docs`. Use a redirection link so that the link does not break again in the future.

* Add release notes for ML.NET 0.2 * Adding release note about TextLoader changes and additional issue/PR references * Addressing comments: fixing typos, changing formatting, and adding references

…291) * Add label/group/weight column name arguments to CV and train-test macros * Fix unit test. * Merge. * Update CSharp API. * Fix EntryPointCatalog test. * Address PR comments.

* update sample with new text loader API. * update with 0.2 stuff.

* Respect normalization in OVA. * some cleanup * fix copypaste issues

…training and inference (#248) * Export to ONNX and Maml cross-platform executable.

* Add Cluster evaluator * fix copypaste * address comments * formatting

The tests do not pass on systems with a locale other than en-US. The error happens because the results are written to files and the contents of the files are compared to a set of correct results produced under the en-US locale. The fix is to imbue the en-US culture to the test thread so that results are output in a format comparable with the expected results. This patch fixes only the tests; it does not guarantee that calculations will be correct in production systems using a locale other than en-US. In particular, there can be problems in reading data and then converting it from characters to numeric format. Fixes #74
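To make the mismatch concrete, here is a small Python sketch. It does not call any real locale or culture API; `format_metric` and `parse_metric` are hypothetical helpers that only imitate a culture whose decimal separator is a comma.

```python
def format_metric(value, decimal_sep="."):
    """Format with four decimals, imitating a culture-specific decimal separator."""
    return f"{value:.4f}".replace(".", decimal_sep)

def parse_metric(text, decimal_sep="."):
    """Parse text that was written with the given decimal separator."""
    return float(text.replace(decimal_sep, "."))

baseline = "0.9312"  # expected result, produced under an en-US style format

# Output written under a comma-decimal culture no longer matches the baseline text.
written_with_comma = format_metric(0.9312, decimal_sep=",")

# Pinning the separator (the analogue of fixing the test thread's culture) restores the match.
written_with_point = format_metric(0.9312, decimal_sep=".")

# Reading is affected too: the value only round-trips if the reader
# assumes the same separator as the writer.
round_tripped = parse_metric(written_with_comma, decimal_sep=",")
```

The second half of the sketch corresponds to the caveat in the description: pinning the format in tests does nothing for production code that reads or parses numbers under a different locale.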

…185) * Implement `ICanGetSummaryAsIDataView` on `PcaPredictor` class * Implement `ICanGetSummaryAsIRow` on `LinearPredictor` class

* Disable ols by temporarily removing the entry point. It may be added again once we figure out how to ship MKL as part of this project.

Add `Append` function to pipeline for more fluent API than that allowed by `Add`

fix namespace issue and refactoring

… readable. (#324)

…or (#338) `CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now create `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`.

…osal. (#369) * Subclasses of `Stream` now have `Close` call `base.Close` to ensure disposal. * Add DeleteOnClose to File opening. * Remove explicit delete of file. * Remove explicit close of substream. * Since no longer deleting explicitly, no longer need `_overflowPath` member.

* Changed List to HashSet to ensure that there are no duplicates
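The idea behind this change can be sketched in Python as follows. The function name, the dictionary representation of a parameter set, and the order-preserving behaviour are all assumptions made for illustration; the actual sweeper types are not shown in this PR.

```python
def propose_distinct(candidates, max_count):
    """Return up to max_count parameter sets, skipping any repeated configuration."""
    seen = set()
    distinct = []
    for params in candidates:
        # A frozenset of (name, value) pairs is hashable, so it can act as a set key.
        key = frozenset(params.items())
        if key in seen:
            continue
        seen.add(key)
        distinct.append(params)
        if len(distinct) == max_count:
            break
    return distinct

candidates = [
    {"learning_rate": 0.1, "num_leaves": 20},
    {"num_leaves": 20, "learning_rate": 0.1},  # same configuration, different key order
    {"learning_rate": 0.2, "num_leaves": 20},
]
proposals = propose_distinct(candidates, max_count=5)
```

Keeping a separate list alongside the set preserves the proposal order, which a plain hash set would not guarantee.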

* Update fast tree argument help text * Update wording * Update API to fix test * Update core manifest JSON to update help text

* Add a way to create a single tree ensemble model from multiple tree ensemble models. * Address PR comments, and fix bugs in serializing/deserializing RegressionTrees. * Address PR comments.
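One way such a combination could work is sketched below in Python. The description does not say how the outputs of the source models are combined, so the averaging used here is purely an assumption, as are the function names and the representation of an ensemble as a list of `(weight, tree)` pairs.

```python
def predict(ensemble, x):
    """Score an input as the weighted sum of the outputs of all trees."""
    return sum(weight * tree(x) for weight, tree in ensemble)

def combine(ensembles):
    """Flatten several ensembles into one whose output is the mean of their outputs.

    Assumption: averaging is the desired combination rule. Each tree keeps its
    original weight, scaled by 1 / (number of source ensembles).
    """
    scale = 1.0 / len(ensembles)
    return [(weight * scale, tree)
            for ensemble in ensembles
            for weight, tree in ensemble]

# Toy "trees": plain functions of a single numeric feature.
model_a = [(1.0, lambda x: 1.0 if x > 0 else -1.0), (0.5, lambda x: x)]
model_b = [(2.0, lambda x: 0.25 * x)]

combined = combine([model_a, model_b])
```

Because the result is just another list of weighted trees, it can be stored and scored like any single ensemble.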

add pipelineitem for Ova

…ryPoints.md and GraphRunner.md (#295) * Adding EntryPoints.md and GraphRunner.md * addressing PR feedback * Updating the title of the GraphRunner.md file * addressing Tom's feedback * addressing feedback * code formatting for class names * Addressing Gal's comments * Adding an example of an entry point. Fixing casing on ML.NET * fixing link

Corrects an unintentional "typo" in FastTreeRanking.cs where there was mistakenly a USE_FASTTREENATIVE2 instead of USE_FASTTREENATIVE. This resulted in some obscure hidden ranking options (distance weighting, normalize query lambdas, and a few others) being unavailable. These are important for some applications.

* LightGBM and test. * add test baselines and nuget source for lightGBM binaries. * Add entrypoint for lightGBM. * add unsafe flag for release build. * update nuget version. * make lightgbm test single threaded. * install gcc on OS machines to resolve dependencies on openmp that is needed by lightgbm native code. * PR comments. Leave BREW and GCC in bash script to verify macOS tests work. * remove brew and gcc from build script. * PR feedback. * disable test on macOS. * disable test on macOS. * PR feedback.

* Adding Factorization Machines

* ONNX API documentation.

Introduce Ensemble codebase

Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file.
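The lock-file idea can be sketched in Python as below. This is not the ML.NET implementation; the function name, the eight-character name length, and the `.lock`/`.tmp` suffixes are assumptions. The point it shows is that creating the lock file with an exclusive-create flag is atomic, so two concurrent callers cannot both claim the same short name.

```python
import os
import tempfile
import uuid

def reserve_temp_path(directory, attempts=100):
    """Reserve a short, unique temp path by atomically creating a lock file.

    Returns (lock_path, temp_path). O_CREAT | O_EXCL makes the creation fail
    if the lock file already exists, so a name can be claimed only once.
    """
    for _ in range(attempts):
        name = uuid.uuid4().hex[:8]  # short random name (length is an assumption)
        lock_path = os.path.join(directory, name + ".lock")
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another caller already holds this name; try a new one
        os.close(fd)
        return lock_path, os.path.join(directory, name + ".tmp")
    raise RuntimeError("could not reserve a temp file name")

with tempfile.TemporaryDirectory() as workdir:
    first_lock, first_tmp = reserve_temp_path(workdir)
    second_lock, second_tmp = reserve_temp_path(workdir)
    both_locks_exist = os.path.exists(first_lock) and os.path.exists(second_lock)
    try:
        # Trying to claim an already-held name fails instead of silently sharing it.
        os.open(first_lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        collision_detected = False
    except FileExistsError:
        collision_detected = True
```

A check-then-create sequence (test whether the name exists, then open it) would leave a window between the two steps; folding both into one exclusive create is what removes the race.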

… unnecessary cmake version requirement (#425)

…lues (#394) * Fix EvaluatorUtils to handle label column of type key without text key values.

TomFinley · 2018-06-27T05:03:05Z

More of a question of procedure, what is the role of this "preview" branch? I would have expected that we'd just take a branch off master at a given point, and call that the release, rather than try to merge into some other branch named "preview."

Also I don't see that there was ever a 0.2 branch. Did someone forget to do that, or was that just not part of the plan?

shauheen · 2018-06-27T05:10:37Z

@TomFinley We did not forget. The plan was to move away from individual branches per release, since we will have many releases and the engineering system needed to support all of those branches would become complex. Instead, it was proposed that the release/preview branch would be updated at each release with the latest from master and tagged for each release.

TomFinley · 2018-06-27T05:24:34Z

OK. Don't really get it but that's fine. Is the divergence of master and preview intentional? Edit: Whoops, I'm just wrong.

TomFinley

Thanks for doing this @shauheen !

eerhardt

Looks good to me.

eerhardt and others added 30 commits May 30, 2018 15:17

Bump master to v0.3 (#269)

10508f8

RocketEngine fix for selecting top learners (#270)

c259863

* Changes to RocketEngine to fix take top k logic. * Add namespace information to allow file to reference correct version of Formatting object.

small code cleanup (#271)

9d19d0e

Preparation for syncing sources with internal repo (#275)

fbd4de0

* make class partial so I can add constructor in separate file. add constructors for testing * formatting

Changes to use evaluator metrics names in PipelineSweeperSupportedMet…

ba9c0f6

…rics. Made the private const strings in two classes public. (#276)

add missing subcomponents to sweepers (#278)

5dc7848

* add missing subcomponents * right one * more cleanup

remove lotus references. (#252)

71e7ff3

Random seed and concurrency for tests (#277)

fb06f38

* first attempt * add comments * specify seed for random. make constructor internal.

Fixed typo in the method summary (#296)

3730336

Remove stale line of code from test. (#297)

7881056

Update release notes link to use aka.ms. (#294)

ae31bbb

Our release notes link is broken because `Documentation` was renamed to `docs`. Use a redirection link so that the link does not break again in the future.

Add release notes for ML.NET 0.2 (#301)

d54869e

* Add release notes for ML.NET 0.2 * Adding release note about TextLoader changes and additional issue/PR references * Addressing comments: fixing typos, changing formatting, and adding references

Get the cross validation macro to work with non-default column names (#…

a5faca6

…291) * Add label/group/weight column name arguments to CV and train-test macros * Fix unit test. * Merge. * Update CSharp API. * Fix EntryPointCatalog test. * Address PR comments.

update sample in README.MD with 0.2 features. (#304)

ab4108d

* update sample with new text loader API. * update with 0.2 stuff.

OVA should respect normalization in underlying learner (#310)

5730685

* Respect normalization in OVA. * some cleanup * fix copypaste issues

Export to ONNX and cross-platform command-line tool to script ML.NET …

1bb1249

…training and inference (#248) * Export to ONNX and Maml cross-platform executable.

Add Cluster evaluator (#316)

53b748d

* Add Cluster evaluator * fix copypaste * address comments * formatting

Add PartitionedFileLoader (#61)

03fade6

Remove unexisting project from solution (#335)

bcce64e

GetSummaryDataView/Row implementation for Pca and Linear Predictors (#…

20099c3

…185) * Implement `ICanGetSummaryAsIDataView` on `PcaPredictor` class * Implement `ICanGetSummaryAsIRow` on `LinearPredictor` class

Disable ordinary least squares by removing the entry point (#286)

28c0709

* Disable ols by temporarily removing the entry point. It may be added again once we figure out how to ship MKL as part of this project.

add append function to pipeline (#284)

d1a350f

Add `Append` function to pipeline for more fluent API than that allowed by `Add`

Removed field/column name checking of input type in TextLoader. (#327)

45ced36

fix namespace issue in CSharpGenerator and some refactoring (#339)

1fc3069

fix namespace issue and refactoring

Using named-tuple in OneToOneTransforms' constructor to make API more…

81d40a9

… readable. (#324)

Minor formatting in CollectionDataSourceTests.cs (#348)

f6c6f5b

Create CalibratedPredictor instead of SchemaBindableCalibratedPredict…

f2888be

…or (#338) `CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now create `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`.

TomFinley and others added 20 commits June 18, 2018 12:43

Return distinct array of ParameterSet when ProposeSweep is called (#368)

7f8caf7

* Changed List to HashSet to ensure that there are no duplicates

Update fast tree argument help text (#372)

09f7c66

* Update fast tree argument help text * Update wording * Update API to fix test * Update core manifest JSON to update help text

Combine multiple tree ensemble models into a single tree ensemble (#364)

8b01fc5

* Add a way to create a single tree ensemble model from multiple tree ensemble models. * Address PR comments, and fix bugs in serializing/deserializing RegressionTrees. * Address PR comments.

add pipelineitem for Ova (#363)

e5de547

add pipelineitem for Ova

Fix CV macro to output the warnings data view properly. (#385)

ead943e

Link to an example of converting an ML.NET model to ONNX. (#386)

496d3b9

Adding LDA Transform (#377)

0d5e317

Adding Factorization Machines (#383)

31ae678

* Adding Factorization Machines

ONNX API documentation. (#419)

17f944c

* ONNX API documentation.

Bring ensembles into codebase (#379)

dbbc69e

Introduce Ensemble codebase

enable macOS tests for LightGBM. (#422)

f94203e

Create a shorter temp file name for model loading. (#397)

211c043

Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file.

removing extraneous character that broke the linux build, and with it…

6c4470f

… unnecessary cmake version requirement (#425)

EvaluatorUtils to handle label column of type key without text key va…

bca008b

…lues (#394) * Fix EvaluatorUtils to handle label column of type key without text key values.

Removing non source files from solution (#362)

36b5bb1

Merge branch 'master' into release/preview

6652320

shauheen requested review from eerhardt and glebuk June 27, 2018 04:48

TomFinley approved these changes Jun 27, 2018

shauheen requested a review from danmoseley June 27, 2018 11:54

eerhardt approved these changes Jun 27, 2018

eerhardt merged commit 185912b into dotnet:release/preview Jun 27, 2018

ghost locked as resolved and limited conversation to collaborators Mar 30, 2022


20 participants
