Update scikit learn to 0.24 #1831
Checking if outlier_label exists before looking at it.
Just check that all the means are the same.
@joshua-cogliati-inl I think the best bet is to merge this first, before fixing HERON, since the ARMA files I update would otherwise break when used with the devel version of RAVEN.
Job Test qsubs sawtooth on acf5594 : invalidated by @joshua-cogliati-inl FAILED: Diff tests/cluster_tests/AdaptiveSobol/test_parallel_adaptive_sobol
FYI: Github is confused. The tests have passed: https://civet.inl.gov/pr/15929/
I have some minor comments for your consideration.
@@ -35,7 +35,6 @@
<labels>0,0,0,1,1,1,1,2,2,2</labels>
<n_splits>2</n_splits>
<shuffle>False</shuffle>
<random_state>10</random_state>
Why was this node removed?
Because setting random_state now raises an error when shuffle is False (random_state is only used when shuffle is True).
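To illustrate the kind of input cleanup this implies, here is a minimal sketch that drops `<random_state>` from a cross-validation node whenever `<shuffle>` is False. The helper name `drop_unused_random_state` and the wrapper element `<cv>` are hypothetical and not part of RAVEN; only the four child nodes come from the test input above.

```python
import xml.etree.ElementTree as ET


def drop_unused_random_state(node):
    """Remove <random_state> from a node when <shuffle> is not True.

    Hypothetical helper for illustration only; it is not RAVEN code.
    """
    shuffle_text = node.findtext("shuffle", default="False")
    shuffle = shuffle_text.strip().lower() == "true"
    random_state = node.find("random_state")
    if not shuffle and random_state is not None:
        # random_state has no effect without shuffling, so drop it
        node.remove(random_state)
    return node


# <cv> is an assumed wrapper element; the children mirror the test input.
xml_text = """
<cv>
  <labels>0,0,0,1,1,1,1,2,2,2</labels>
  <n_splits>2</n_splits>
  <shuffle>False</shuffle>
  <random_state>10</random_state>
</cv>
""".strip()

node = drop_unused_random_state(ET.fromstring(xml_text))
remaining = [child.tag for child in node]
print(remaining)
```

When `<shuffle>` is True the node is left untouched, since the seed is meaningful in that case.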
@@ -64,6 +64,7 @@
<Print name="info">
<type>csv</type>
<source>clusterInfo</source>
<what>Output</what>
Is there a specific reason for this modification?
Yes. There are three clusters, and which label is assigned to which cluster is semi-random. If we output only the centers of the clusters, rather than the centers and the labels, the test no longer fails at random. See: tests/framework/PostProcessors/DataMiningPostProcessor/Clustering/gold/KMeans/info.csv
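As a rough sketch of why checking only the centers is robust, the comparison below ignores which label each center received. The function `same_centers` and the sample coordinates are made up for illustration; this is not how the RAVEN test harness compares CSV files.

```python
import math


def same_centers(centers_a, centers_b, rel_tol=1e-6):
    """Compare two collections of cluster centers, ignoring label order.

    Illustrative only: sorting assumes the centers are well separated,
    so near-ties in the first coordinate could be ordered differently.
    """
    if len(centers_a) != len(centers_b):
        return False
    # Sorting puts both collections in the same order regardless of labels.
    sorted_a = sorted(centers_a)
    sorted_b = sorted(centers_b)
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
        for center_a, center_b in zip(sorted_a, sorted_b)
        for x, y in zip(center_a, center_b)
    )


# Two runs that find the same clusters but assign different labels.
run_1 = {0: (1.0, 2.0), 1: (5.0, 5.0), 2: (9.0, 1.0)}
run_2 = {0: (9.0, 1.0), 1: (1.0, 2.0), 2: (5.0, 5.0)}

labels_match = run_1 == run_2
centers_match = same_centers(list(run_1.values()), list(run_2.values()))
print(labels_match, centers_match)
```

A label-by-label comparison reports a difference between the two runs, while the label-free comparison treats them as the same clustering.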
@@ -1,151 +1,151 @@
x2,x3,labels,x1,x4,Output,component1,component2
Could you provide some information about the tests regolded in this pull request? A brief description of the causes for the regolding would help.
Hm, updating to sklearn 0.24 resulted in the following diffs (each followed by the sklearn version whose release notes mention the model, or ? where none was identified):
Diff tests/framework/ROM/SKLearn/linearElasticNetCV 0.23
Diff tests/framework/ROM/SKLearn/linearLARSCV 0.23
Diff tests/framework/ROM/SKLearn/linearLassoCV 0.23
Diff tests/framework/ROM/SKLearn/linearLassoLARSCV 0.23
Diff tests/framework/ROM/SKLearn/linearOMPCV 0.23
Diff tests/framework/ROM/TimeSeries/SyntheticHistory/ARMA ?
Diff tests/framework/PostProcessors/TemporalDataMiningPostProcessor/DimensionalityReduction/SparsePCA 0.22
Diff tests/framework/PostProcessors/TSACharacterizer/Basic ?
Diff tests/framework/PostProcessors/CrossValidations/stratifiedKFold 0.22
Diff tests/framework/PostProcessors/DataMiningPostProcessor/DimensionalityReduction/SparsePCA 0.22
Diff tests/framework/PostProcessors/TemporalDataMiningPostProcessor/DimensionalityReduction/MiniBatchSparsePCA 0.22
Release notes (note that all of them have a Changed Models section):
specs.addSub(InputData.parameterInputFactory("alpha_init", contentType=InputTypes.FloatType,
descr=r"""Initial value for alpha (precision of the noise).
If not set, alpha_init is $1/Var(y)$.""", default=None))
specs.addSub(InputData.parameterInputFactory("lambda_init", contentType=InputTypes.FloatType,
descr=r"""Initial value for lambda (precision of the weights).""", default=1.0))
I think you need to regenerate the ROM document to reflect these changes.
Done.
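For reference, the documented defaults in the hunk above can be sketched as follows. This assumes the population variance for $Var(y)$; scikit-learn's own implementation may differ in details such as a small regularizing constant, and the names `default_alpha_init` and `DEFAULT_LAMBDA_INIT` are illustrative only.

```python
import statistics

# Default stated in the description of lambda_init.
DEFAULT_LAMBDA_INIT = 1.0


def default_alpha_init(y):
    """Return 1/Var(y), the documented fallback when alpha_init is not set.

    Assumes the population variance; this is a sketch, not library code.
    """
    return 1.0 / statistics.pvariance(y)


# Made-up training targets: the population variance of this list is 1.25.
y = [1.0, 2.0, 3.0, 4.0]
alpha_init = default_alpha_init(y)
print(alpha_init, DEFAULT_LAMBDA_INIT)
```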
Job Mingw Test on 7a7f0f2 : invalidated by @joshua-cogliati-inl changed civet
7a7f0f2 to d9991df
Changes are good.
Job Test qsubs sawtooth on 6995de8 : invalidated by @joshua-cogliati-inl FAILED: Diff tests/cluster_tests/AdaptiveSobol/test_parallel_adaptive_sobol
Checklist is satisfied, and PR has been reviewed and approved.
Retrain ARMAs for idaholab/raven#1831
* Remove old pre-0.17 compatibility code.
* Properly handling 'None' and 'most_frequent' outlier_label
  Checking if outlier_label exists before looking at it.
* Fixing default to be float.
* changing sklearn to 0.24
* Regolding and test modification for sklearn 0.24
* Doubling relative error.
* Labels are somewhat random, so don't print them.
  Just check that all the means are the same.
* Updating generated documents.
* Removing check for version for StackingRegressor since 0.24 is now minimum.
* Adding StackingRegressor
* idiot-proof getEntity method
* splitting and cleaning XMLread
* fix missing f-statement
* update Simulation to allow subsequent run() calls
* formatting updates
* create flushOutputDataObject in DataObject and DataSet to reinitialize output DataObjects for successive workflow runs
* fixing assembler issue
* removing getMessageHandler from builtins
* removing duplicate import statement
* fixing flushOutputDataObject for when 'prefix' is not a part of the object
* restart time each time a simulation is run
* ensure time resets when simulation reinitialized
* plots generate when RAVEN workflow rerun
* fixing DeprecationWarning message
* fixing printing issue for HistorySet
* reset warning messages if workflow ran previously
* fixing colorMap specified does not exist error
* fixing 'Tried to add new data to cNDarray' for PostProcessor tests
* fixing 'Tried to add new data to cNDarray' for ARMA tests
* fixing h5py 'Unable to create group' issue
* fix issue with controlFunction in LogicalModel
* HybridModel can be re-run in workflow
* fixing GeneralPlot.py for re-running OutStreams tests
* fix ['prefix'] not in index error
* fixing issue seen with GeneticAlgorithm
* reset Steps, SolutionExport, and GradientDescent Optimizer for rerunning workflows
* finish up flushing GradientDescent Optimizer
* flushing GeneticAlgorithm Optimizer
* finish flushing SimulatedAnnealing
* flushing AdaptiveMonteCarlo and other Samplers
* improving flushDataObject
* flushing Sobol Sampler
* flushing AdaptiveSparseGrid
* fixing NetCDF issue where values were appended instead of overwritten
* adding flushSampler to Stratified
* updating _inputMetaVars for HistorySet
* flush Databases using (mostly) existing initializeDatabase
* flushing additional Samplers
* fixing re-initialization of HDF5 databases for re-running workflows
* simplifying Sampler inheritance
* formatting touched files
* flushing metadataKeys for Samplers and Optimizers
* adjust spacing for Run complete! message
* updating tests to check for re-running RAVEN workflow
* fixing f-string issue
* fix new test for re-running RAVEN workflow
* vargroups in rrr dataobjects (#1823)
* Farm submodule update (#1826)
* update FARM submodule version
* Parallel improvements (#1825)
* If ray instatiated outside, use it. Basically, before, ray was only used if nodes was setup, such as with MPI mode.
* No longer automatically adds mpi mode to inner ravens.
* Make it a bigger test, and switch to internalParallel
* Adding useful debugging information.
* Wait for servers to finish starting. Otherwise they become zombies.
* Only print out changed part.
* Adding debugging info, and force status to disk.
* Setting port to 0 so ray chooses an available port. This adds port as a parameter to the starting ray function, and sends in 0. This tells ray to choose an available port, instead of erroring if port 6379 is not available.
* Find correct ray start `ray start sometimes appears, so add a space before it.
* Don't start JobHandler if running remotely. If we are going to be running remotely, then we should not start job handler. Otherwise, we have to start ray or threads, only to seconds later shutdown them down.
* Update scikit learn to 0.24 (#1831)
* Remove old pre-0.17 compatibility code.
* Properly handling 'None' and 'most_frequent' outlier_label
  Checking if outlier_label exists before looking at it.
* Fixing default to be float.
* changing sklearn to 0.24
* Regolding and test modification for sklearn 0.24
* Doubling relative error.
* Labels are somewhat random, so don't print them.
  Just check that all the means are the same.
* Updating generated documents.
* Removing check for version for StackingRegressor since 0.24 is now minimum.
* Adding StackingRegressor
* Setup changes (#1748)
* Converting to ravenframework
* Fix pluginhandler.
* Making pyDOE use relative imports.
* Adding __init__.py so lazy is found.
* pyDOE now really in contrib.
* Make checking libraries optional.
* Updating for raven package.
* Fixing library_report.
* There was both a CodeInterfaces directory and a CodeInterfaces.py This caused problems.
* Support python 2.0 for utils.
* Skip library check if library handler not found.
* Adds ability to generate setup.cfg requirements section. Usage: python3 ./scripts/library_handler.py pip --action=setup.cfg > setup.cfg
* Updating things from review.
* Increasing rel_err due to test failures. (#1836)
* update LOGOS submodule (#1835)
* HERON Submodule Update (#1837)
* reverting PythonRaven to address issues caused by recent changes
* reverting PythonRaven again due to circular imports
* reverting getMessageHandler
* fixing formatting in DataSet.py
* adding resetHybridModel method
* formatting GradientDescent
* restoring commented code block and fixing variable name
* formatting comments
* updating to common 'flush' method
* adding resetSimulation functionality
* formatting f-string in MultiRun

Co-authored-by: Paul Talbot <paul.talbot@inl.gov>
Co-authored-by: Haoyu Wang <63424217+wanghy-anl@users.noreply.github.com>
Co-authored-by: Joshua J. Cogliati <joshua-cogliati-inl@users.noreply.github.com>
Co-authored-by: Congjian Wang - INL <congjian.wang@inl.gov>
Co-authored-by: Dylan McDowell <dylanjm@users.noreply.github.com>
Pull Request Description
What issue does this change request address?
#1679
(pysensors requires scikit-learn 0.24 or newer)
What are the significant changes in functionality due to this change request?
Updates scikit learn to 0.24 in preparation for adding pysensors.
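A minimum-version requirement like this one can be expressed as a simple tuple comparison. The sketch below is a simplified, hypothetical check (`parse_version`, `meets_minimum`, and `MINIMUM_SKLEARN` are made-up names) and is not how RAVEN's library handler works; it also ignores pre-release suffixes such as `rc1`.

```python
# Minimum version taken from the pull request description.
MINIMUM_SKLEARN = (0, 24)


def parse_version(text):
    """Turn '0.24.2' into (0, 24, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum=MINIMUM_SKLEARN):
    """Return True if the installed version string satisfies the minimum."""
    return parse_version(installed) >= minimum


# Example version strings, chosen for illustration.
for version in ("0.23.1", "0.24", "0.24.2"):
    print(version, meets_minimum(version))
```

Once the minimum is enforced at install time, per-feature version checks such as the one for StackingRegressor become unnecessary.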
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
[...] <internalParallel> to True.
[...] raven/tests/framework/user_guide and raven/docs/workshop) have been changed, the associated documentation must be reviewed and assured the text matches the example.