Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some tests (eight) fail in Microsoft.ML.Predictor.Tests #74

Closed
veikkoeeva opened this issue May 8, 2018 · 11 comments
Closed

Some tests (eight) fail in Microsoft.ML.Predictor.Tests #74

veikkoeeva opened this issue May 8, 2018 · 11 comments
Assignees

Comments

@veikkoeeva
Copy link
Contributor

System information

.NET Command Line Tools (2.1.200)

Product Information:
Version: 2.1.200
Commit SHA-1 hash: 2edba8d7f1

Runtime Environment:
OS Name: Windows
OS Version: 10.0.17134
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\2.1.200\

Microsoft .NET Core Shared Framework Host

Version : 2.0.7
Build : 2d61d0b043915bc948ebf98836fefe9ba942be11

Issue

  1. I checkout the project,
  2. Ran .\build.cmd on PowerShell prompt
  3. Opened the project in VS 2017 15.7
  4. Ran all tests, some of which failed (see attached image).

I did expect all tests to pass.

fail line
The assert here fails. I didn't go further to check if the cause is a programming error or something else as this was more of curiosity, but noted here. :)

These are all of the eight tests that fail due to the assert in the previous image:
failed tests

@Anipik
Copy link
Contributor

Anipik commented May 9, 2018

@veikkoeeva can you please paste the error output ?

@veikkoeeva
Copy link
Contributor Author

@Anipik Hmm, indeed... It failed on the assert, but there might be something written to stdout before that (a refactoring idea, might be good to assert the terms directly). I'm not on that code currently, but I'll check in about 14 hours.

@Anipik
Copy link
Contributor

Anipik commented May 9, 2018

@veikkoeeva generally it mentions the line and the file name where the matching of the outputs failed.

veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue May 10, 2018
The tests do not pass on machines that have different
formatting than English language. The error happens since
the results are written in different than expected format.

Fixes dotnet#74
@veikkoeeva
Copy link
Contributor Author

veikkoeeva commented May 10, 2018

@Anipik I took a look and tried to fix a bit, but I could lend a hand a bit. Do you happen to know where the results to LogisticRegression-bin-norm-CV-breast-cancer-rp.txt are written? This eludes me somehow. The result file looks like this

LogisticRegression
AUC	Accuracy	Positive precision	Positive recall	Negative precision	Negative recall	Log-loss	Log-loss reduction	F1 Score	AUPRC	/l2	/ot	/nt	Learner Name	Train Dataset	Test Dataset	Results File	Run Time	Physical Memory	Virtual Memory	Command Line	Settings	
0,9945	0,969373	0,959559	0,952772	0,975316	0,977394	0,134393	85,57494	0,956039	0,988987	0,1	0,001	1	LogisticRegression	%Data%		%Output%	99	0	0	maml.exe CV tr=LogisticRegression{l1=1.0 l2=0.1 ot=1e-3 nt=1} threads=- dout=%Output% data=%Data% seed=1 xf=BinNormalizer{col=Features numBins=5}	/l2:0,1;/ot:0,001;/nt:1	

and I suppose the problem is visible. In the PR I have gone through the places the code uses one-by-one and the results in some other files, such as in LogisticRegression-bin-norm-CV-breast-cancer-out.txt look consistent with the assertion data currently (though the runner still reports it as a failure, I'll check that later).

@TomFinley
Copy link
Contributor

Ah that's interesting @veikkoeeva , thanks for bringing this up.

The rp files are written by the so-called ResultProcessor, the code for which lies in the src/Microsoft.ML.ResultProcessor project... so this LogisticRegression-bin-norm-CV-breast-cancer-rp.txt file, I would expect is the result of running ResultProcessor on top of the LogisticRegression-bin-norm-CV-breast-cancer-out.txt I'd expect to see alongside it.

@Anipik
Copy link
Contributor

Anipik commented May 10, 2018

Yeah the problem is that the decimal separator in your language pack is comma instead of decimal.
and we just match the rp files using string matching

The matching is being done here https://github.com/Anipik/machinelearning/blob/master/test/Microsoft.ML.TestFramework/BaseTestPredictorsMaml.cs#L211

The fix could be https://github.com/Anipik/machinelearning/blob/master/test/Microsoft.ML.TestFramework/BaseTestBaseline.cs#L618
instead of directly using . as a data separator you can dynamically obtain the decimal separator

@Anipik
Copy link
Contributor

Anipik commented May 10, 2018

@danmosemsft can you add a non-english queue for this repo too ?

@veikkoeeva
Copy link
Contributor Author

@TomFinley Thanks, I'll see if get the rest fixed today (it's 19:00 here).

@Anipik Hmm, good to know. As you can see, I've tried to fix all instances that print numbers in non-invariant way. It occurred I could fix the comparison, but then one would have files that aren't easy to diff, say, when asking help here or comparing oneself. If this approach is OK, I think I should add a note to the commit about this.

My locale is fi-FI, by the way.

veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue May 10, 2018
The tests do not pass on machines that have different formatting than
English language. The error happens since the results are written in different
than expected format.

The purpose here is not to fix just comparison, but also printing to keep the
result files diffable and consistent with the baseline. This also keeps the code
more straight in that human readable output should be printed with invariant culture
if the purpose is culture insensitive machine comparison. This particular aspect could
perhaps be refactored so that the culture is chosen when printing results.

Fixes dotnet#74
veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue May 10, 2018
The tests do not pass on machines that have different formatting than
English language. The error happens since the results are written in different
than expected format.

The purpose here is not to fix just comparison, but also printing to keep the
result files diffable and consistent with the baseline. This also keeps the code
more straight in that human readable output should be printed with invariant culture
if the purpose is culture insensitive machine comparison. This particular aspect could
perhaps be refactored so that the culture is chosen when printing results.

Fixes dotnet#74
@danmoseley
Copy link
Member

@danmosemsft can you add a non-english queue for this repo too ?

@Anipik feel free to open an issue, and make the addition, if it's analogous to corefx's.

veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue May 11, 2018
The tests do not pass on machines that have different formatting than
English language. The error happens since the results are written in different
than expected format.

1. The main fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

2. A secondary fix is to make comparisons between culture sensitive data type
representations invariant when they do not have human readable dimensions. In
OptimizationMonitor.cs case the cast between culture sensitive floating point
and string will cause orders of magnitudes of error in output results.

The intention of this path is not to offer a robust solution and remove future
issues. There is room for refactoring where, for instance, locale information
would be applied to input and output and logging/tracing would be clearly
separated from another kind of locale sensitive handling. This way culture
sensitive parts would be separated and particular output formats could be
tested as separate cases if so desired.

Fixes dotnet#74
veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue May 11, 2018
The tests do not pass on machines that have different formatting than
English language. The error happens since the results are written in different
than expected format.

1. The main fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

2. A secondary fix is to make comparisons between culture sensitive data type
representations invariant when they do not have human readable dimensions. In
OptimizationMonitor.cs case the cast between culture sensitive floating point
and string will cause orders of magnitudes of error in output results.

The intention of this path is not to offer a robust solution and remove future
issues. There is room for refactoring where, for instance, locale information
would be applied to input and output and logging/tracing would be clearly
separated from another kind of locale sensitive handling. This way culture
sensitive parts would be separated and particular output formats could be
tested as separate cases if so desired.

Fixes dotnet#74
@Anipik
Copy link
Contributor

Anipik commented May 29, 2018

@veikkoeeva is this issue resolved ?

@veikkoeeva
Copy link
Contributor Author

@Anipik Ah, sorry, I got delayed. I'm currently in www.arctic15.com and rather tied. See the linked PR for further discussion and the last comment. I should be able to get onto it the next weekend, knock on the tree. Naturally if you need to do something in the meantime, feel free to do so. :)

veikkoeeva added a commit to veikkoeeva/machinelearning that referenced this issue Jun 1, 2018
The tests do not pass on systems with locale other than en-US.
The error happens since the results are written to files and the
contents of the files are compared to set of correct results produced
under en-US locale.

The fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

This patch fixes only tests, but do not guarantee calculation will be
correct in production systems using a locale different than en-US. In
particular, there can be problems in reading data and then conversing
data from characters to numeric format.

Fixes dotnet#74
shauheen pushed a commit that referenced this issue Jun 7, 2018
The tests do not pass on systems with locale other than en-US.
The error happens since the results are written to files and the
contents of the files are compared to set of correct results produced
under en-US locale.

The fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

This patch fixes only tests, but do not guarantee calculation will be
correct in production systems using a locale different than en-US. In
particular, there can be problems in reading data and then conversing
data from characters to numeric format.

Fixes #74
eerhardt pushed a commit that referenced this issue Jun 27, 2018
* Bump master to v0.3 (#269)

* RocketEngine fix for selecting top learners (#270)

* Changes to RocketEngine to fix take top k logic.

* Add namespace information to allow file to reference correct version of Formatting object.

* small code cleanup (#271)

* Preparation for syncing sources with internal repo (#275)

* make class partial so I can add constuctor in separate file. add constructros for testing

* formatting

* Changes to use evaluator metrics names in PipelineSweeperSupportedMetrics. Made the private const strings in two classes public. (#276)

* add missing subcomponents to sweepers (#278)

* add missing subcomponents

* right one

* more cleanup

* remove lotus references. (#252)

* Random seed and concurrency for tests (#277)

* first attempt

* add comments

* specify seed for random.
make constructor internal.

* Fix SupportedMetric.ByName() method (#280)

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Fix for SupportedMetric.ByName() method. Include new unit test for function.

* Removed unnecessary field filter, per review comment.

* ML.NET-242: FastTreeRanking per-iteration loss metrics are empty (#289)

When training a FastTreeRanker using the `testFrequency` parameter, it is expected that NDCG is prented every testFrequency iterations. However, instead of NDCG, only empty strings are printed.

The root cause was that the MaxDCG property of the dataset was never calculated, so the NDCG calculation is aborted, leaving an empty string as a result.

This PR fixes the problem by computing the MaxDCG for the dataset when the Tests are defined (so that if the tests are not defined, the MaxDCG will never be calculated).

Closes #242

* Fixed typo in the method summary (#296)

* Remove stale line of code from test. (#297)

* Update release notes link to use aka.ms. (#294)

Our release notes link is broken because the `Documentation` was renamed to `docs`. Fix this for the future to use a redirection link.

* Add release notes for ML.NET 0.2 (#301)

* Add release notes for ML.NET 0.2

* Adding release note about TextLoader changes and additional issue/PR references

* Addressing comments: fixing typos, changing formatting, and adding references

* Get the cross validation macro to work with non-default column names (#291)

* Add label/grou/weight column name arguments to CV and train-test macros

* Fix unit test.

* Merge.

* Update CSharp API.

* Fix EntryPointCatalog test.

* Address PR comments.

* update sample in README.MD with 0.2 features. (#304)

* update sample with new text loader API.

* update with 0.2 stuff.

* OVA should respect normalization in underlying learner (#310)

* Respect normalization in OVA.

* some cleanup

* fix copypaste issues

* Export to ONNX and cross-platform command-line tool to script ML.NET training and inference (#248)

* Export to ONNX and Maml cross-platform executable.

* Add Cluster evaluator (#316)

* Add Cluster evaluator

* fix copypaste

* address comments

* formatting

* Fixes locale dependent test output comparisons (#109)

The tests do not pass on systems with locale other than en-US.
The error happens since the results are written to files and the
contents of the files are compared to set of correct results produced
under en-US locale.

The fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

This patch fixes only tests, but do not guarantee calculation will be
correct in production systems using a locale different than en-US. In
particular, there can be problems in reading data and then conversing
data from characters to numeric format.

Fixes #74

* Add PartitionedFileLoader (#61)

* Remove unexisting project from solution (#335)

* GetSummaryDataView/Row implementation for Pca and Linear Predictors (#185)

* Implement `ICanGetSummaryAsIDataView` on `PcaPredictor` class
* Implement `ICanGetSummaryAsIRow` on `LinearPredictor` class

* Disable ordinary least squares by removing the entry point (#286)

* Disable ols by temporarily removing the entry point. It may be added again once we figure out how to ship MKL as part of this project.

* add append function to pipeline (#284)

Add `Append` function to pipeline for more fluent API than that allowed by `Add`

* Removed field/column name checking of input type in TextLoader.  (#327)

* fix namespace issue in CSharpGenerator and some refactoring (#339)

fix namespace issue and refactoring

* Using named-tuple in OneToOneTransforms' constructor to make API more readable. (#324)

* Minor formatting in CollectionDataSourceTests.cs (#348)

* Create CalibratedPredictor instead of SchemaBindableCalibratedPredictor (#338)

`CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now creates `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`.

* Remove reference and dependency on System.ValueTuple (#351)

* Add link to samples (#355)

* Use HideEnumValueAttribute for both manifest and C# API generation. (#356)

* Use HideEnumValueAttribute for both manifest and C# API generation.
* Unhide NAReplaceTransform.ReplacementKind.SpecifiedValue. This may require some other PR to resolve the corresponding issues.

* Move the NuGet package build files into a TFM specific directory. (#370)

When installing Microsoft.ML on an unsupported framework (like net452), it is currently getting installed successfully. However, users should be getting an error stating that net452 is not supported by this package.

The cause is the build files exist for any TFM, which NuGet interprets as this package supports any TFM. Moving the build files to be consistent with the 'lib' folder support.

Fix #357

* `Stream` subclasses now have `Close` call `base.Close` to ensure disposal. (#369)

* Subclasses of `Stream` now have `Close` call `base.Close` to ensure disposal.
* Add DeleteOnClose to File opening.
* Remove explicit delete of file.
* Remove explicit close of substream.
* Since no longer deleting explicitly, no longer need `_overflowPath` member.

* Return distinct array of ParameterSet when ProposeSweep is called (#368)

* Changed List to HashSet to ensure that there are no duplicates

* Update fast tree argument help text (#372)

* Update fast tree argument help text

* Update wording

* Update API to fix test

* Update core manifest JSON to update help text

* Combine multiple tree ensemble models into a single tree ensemble (#364)

* Add a way to create a single tree ensemble model from multiple tree ensemble models.

* Address PR comments, and fix bugs in serializing/deserializing RegressionTrees.

* Address PR comments.

* add pipelineitem for Ova (#363)

add pipelineitem for Ova

* Fix CV macro to output the warnings data view properly. (#385)

* Link to an example on using converting ML.NET model to ONNX. (#386)

* Adding documentation about entry points, and entry points graphs: EntryPoints.md and GraphRunner.md (#295)

* Adding EntryPoints.md and GraphRunner.md

* addressing PR feedback

* Updating the title of the GraphRunner.md file

* adressing Tom's feedback

* adressing feedback

* code formatting for class names

* Addressing Gal's comments

* Adding an example of an entry point. Fixing casing on ML.NET

* fixing link

* Adding LDA Transform (#377)

* Revert to using the native code (#413)

Corrects an unintentional "typo" in FastTreeRanking.cs where there was mistakenly a USE_FASTTREENATIVE2 instead of USE_FASTTREENATIVE. This resulted in some obscure hidden ranking options (distance weighting, normalize query lambdas, and a few others) being unavailable. These are important for some applications.

* LightGBM  (#392)

* LightGBM and test.

* add test baselines and nuget source for lightGBM binaries.

* Add entrypoint for lightGBM.

* add unsafe flag for release build.

* update nuget version.

* make lightgbm test single threaded.

* install gcc on OS machines to resolve dependencies on openmp thatis needed by lightgbm native code.

* PR comments. Leave BREW and GCC in bash script to verify macOS tests work.

* remove brew and gcc from build script.

* PR feedback.

* disable test on macOS.

* disable test on macOS.

* PR feedback.

* Adding Factorization Machines  (#383)

* Adding Factorization Machines

* ONNX API documentation. (#419)

* ONNX API documentation.

* Bring ensembles into codebase (#379)

Introduce Ensemble codebase

* enable macOS tests for LightGBM. (#422)

* Create a shorter temp file name for model loading. (#397)

Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file.

* removing extraneous character that broke the linux build, and with it unecessary cmake version requirement (#425)

* EvaluatorUtils to handle label column of type key without text key values (#394)

* Fix EvaluatorUtils to handle label column of type key without text key values.

* Removing non source files from solution (#362)
eerhardt pushed a commit to eerhardt/machinelearning that referenced this issue Jul 27, 2018
The tests do not pass on systems with locale other than en-US.
The error happens since the results are written to files and the
contents of the files are compared to set of correct results produced
under en-US locale.

The fix is to imbue en-US culture to the test thread so that results
will be output in format that is comparable with the test format.

This patch fixes only tests, but do not guarantee calculation will be
correct in production systems using a locale different than en-US. In
particular, there can be problems in reading data and then conversing
data from characters to numeric format.

Fixes dotnet#74
@ghost ghost locked as resolved and limited conversation to collaborators Mar 31, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants