
Dataobject rework - future RAVEN v 1.1 #598

Merged
merged 269 commits into devel from dataobject-rework on May 17, 2018

Conversation

@alfoa alfoa (Collaborator) commented Mar 21, 2018


Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, e.g., #42.)

Closes #182
Closes #363
Closes #225
Closes #568 (obsolete with respect to the new DataObject)
Closes #56 (obsolete with respect to the new DataObject)
Closes #112
Closes #589
Closes #319
Closes #305
Closes #252
Closes #573
Closes #551 (overcome by establish_conda_env.sh script)
Closes #73
Closes #627
Closes #258
Closes #129
Closes #93
Closes #91
Closes #83
Closes #77
Closes #58
Closes #43
Closes #68

What are the significant changes in functionality due to this change request?

This PR addresses multiple Issues.
The following new features have been added:

  • New DataObject structure
  • Addition of the DataSet class
  • Standardization of PostProcessor outputs
  • New ROMs (time-dependent)
  • Ability to handle vector input spaces (not just scalars)
  • Solved degradation problems with printing
  • EnsembleModel support for unstructured input handling
  • Reached 100% test documentation
  • New code interfaces (e.g., SCALE)
  • Library updates

THIS PULL REQUEST (AND THE NEW DEVEL) WILL BE THE RAVEN V 1.1 RELEASE


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.); see the wiki for details.
  • 4. Automated tests should pass, including run_tests, pylint, manual building, and xsd tests. If there are changes to Simulation.py or JobHandler.py, the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added that sets the <internalParallel> node to True in the XML block.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added, is the analytic documentation updated/added?

alfoa and others added 30 commits November 27, 2017 12:26
* dummy, external model both nominally working

* test_Lorentz needs to pass for externalModel; however, we need hdf5 to work as output first

* test_Lorentz runs, using ExternalModel, so external model and dummy both running for hist and point set
outstreams and performance -> dataobject-rework
* works, but ProbabilityWeight is not always provided by the sampler, like in test_Lorentz using MC

* merging in developments

* stash

* addMetaKeys works, proved with Samplers adding ProbabilityWeight

* cleanup
fixed write...but the history set does not work yet
alfoa and others added 9 commits April 27, 2018 00:47
…dered and consequently a spurious diff can happen
…m_exponential

PolyExponential, Spline and DMD ROMs
* expand install script for conda 4.4 and beyond

* added explanatory comments
* Optimizer inherits from Sampler

* first implementation: by default copy value to all entries in vector variable, works

* finished test and implementation of simple repeat-value vector variable sampling

* added InputSpecs for optimizer, tests pass

* got input params working for optimizer

* first implementation: by default copy value to all entries in vector variable, works

* finished test and implementation of simple repeat-value vector variable sampling

* added InputSpecs for optimizer, tests pass

* got input params working for optimizer

* stash

* fixed gradient calculation to include vectors, all non-vector tests passing

* fixed gradient calculation to include vectors, all non-vector tests passing, conditional sizing for vector grad

* boundary condition checking, all passing

* redundant trajectories, all passing

* same coordinate check

* dot product step sizing

* stochastic engine is incorrectly sized; currently each entry in vector is being perturbed identically.  Needs work.

* working on constraints, convergence is really poor, needs more help

* first boundary conditions (internal) working, although type change in precond test

* constraints fully done, only precond has a problem still, vector still not converging well

* debugging difference between all scalars and vector

* vector

* time parabola model

* fixed initial step size

* working, although as a vector is a bit slower than all scalars

* vector is faster than scalar, reduced scale of tests (and better solution)

* all passing, but precond, which is having the type error still

* cleaned up, removed scalar comparison test, fixed precond test

* cleanup

* last bit of cleanup, all tests passing

* stash, it appears customsampler and datasets are not yet compatible

* xsd

* stash, <what> cannot handle specific requests

* reloading from dataset csv works by default

* fixed unit test, vector test

* xsd

* CustomSampler handles Point,History,Data sets

* cleanup

* cleanup

* updated custom sampler description docs

* Optimizer uses Custom sampler with vector variables for initial points

* unnecessarily-tested file

* initial round of review comments

* script for disclaimer adding, also added to models in optimizing test dir

* increased verbosity for test debug

* more verbosity for debugging

* gold standard agrees with all test machines, personal cluster profile (my desktop finds the minimum in traj 1 of 36 instead of 220ish)

* new golds
* exposed RNG to RAVEN python...swig

* fixed for now dist stoch environment
* shape from node to attribute

* constants can now be vectors too

* necessary Sampler and Optimizer changes

* extracted common constant reading for sampler, optimizer

* including string custom vector vars

* vector constant works in rrr with optimizer
@derekstucki derekstucki (Collaborator) left a comment

Finishing part of a review to make sure it's placing comments in the right places.

try:
  interpolationND = utils.findCrowModule('randomENG')
  print("randomENG","\n",randomENG)
except:
Collaborator

Does the "no except without an exception" rule apply to this file?

Collaborator Author

fixed
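
A minimal sketch (not the actual change) of what the "no except without an exception" rule asks for: name the exception class instead of using a bare except. The exception type and message below are assumptions for illustration.

from utils import utils   # as imported elsewhere in the framework

try:
  randomENG = utils.findCrowModule('randomENG')   # variable name unified here for clarity
  print("randomENG", "\n", randomENG)
except ImportError as e:                          # assumed exception type; the original used a bare except
  print('randomENG crow module not available:', e)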

else:
  # need the sampleTag meta to load histories
  # BY DEFAULT keep everything needed to reload this entity. Inheritors can define _neededForReload to specify what that is.
  keep = set(self._inputs + self._outputs + self._metavars + self._neededForReload
@derekstucki derekstucki (Collaborator) May 3, 2018

1804 1840: Is this [:] needed?

Collaborator

Is this noted on the right line?

Collaborator

Nope, 1840 instead of 1804. That's what I get for trusting my memory.

Collaborator

I don't think it's needed. Removed.
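
For context, a minimal generic sketch of what a trailing [:] does in Python and when it matters; the names and values here are illustrative, not taken from DataSet.py.

original = ['x', 'y']
alias = original        # same list object: follows later mutations
snapshot = original[:]  # independent shallow copy
original.append('z')
print(alias)     # ['x', 'y', 'z']
print(snapshot)  # ['x', 'y']  -- staying unaffected is the only reason to keep [:]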

@derekstucki (Collaborator)

Any comment I leave in DataSet.py after about line 1400 is placed near line 1400 on this page and not placed at all when viewing the files changed view. I will put line numbers in all my comments, but that is going to be really annoying for anyone reading my review. Any suggestions on how to fix this would be helpful.

@derekstucki derekstucki (Collaborator) left a comment

Completing partial review to make sure comments in other files go where they should.

else:
  # need the sampleTag meta to load histories
  # BY DEFAULT keep everything needed to reload this entity. Inheritors can define _neededForReload to specify what that is.
  keep = set(self._inputs + self._outputs + self._metavars + self._neededForReload
Collaborator

1842: This seems like a lot of conversion steps. Can it be reduced?

Collaborator

Line in question, for reference:

fromColl = list( dict(zip(self._orderedVars,c)) for c in fromColl )

I need a list of realizations that contain hierarchical path endings, where each realization is a dictionary mapping variable names to values for that ending. I'm not sure if there's a more abbreviated way to do this here, but I'm open to ideas.
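
For illustration, a minimal generic sketch (with made-up variable names and data, not the DataSet.py internals) of what that line does: pair each tuple of values with the ordered variable names and build one dictionary per realization.

orderedVars = ['x', 'y', 'time']            # stand-in for self._orderedVars
fromColl = [(1.0, 2.0, 0.1),                # stand-in for the collector rows
            (3.0, 4.0, 0.2)]

# same pattern as the line under review: one dict per realization,
# mapping variable name -> value for that hierarchy path ending
realizations = list(dict(zip(orderedVars, c)) for c in fromColl)
print(realizations)
# [{'x': 1.0, 'y': 2.0, 'time': 0.1}, {'x': 3.0, 'y': 4.0, 'time': 0.2}]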

@@ -26,7 +26,8 @@

################################################################################
from utils import utils
-from DataObjects.Data import Data
+from DataObjects.DataObject import DataObject as Data
Collaborator

I imagine this is imported as Data in order to avoid a bunch of name changes throughout the file. Would it be better long term to just make all those changes? A search and replace would make very quick work of it.

Collaborator

All references were already removed, so I trivially removed the alias now. Fixed.


#External Modules------------------------------------------------------------------------------------
import sys,os
Collaborator

Better separated.

Collaborator

fixed
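
Presumably "fixed" here means splitting the combined import; a minimal sketch of one plausible reading of "better separated" (an assumption, not the actual change):

#External Modules------------------------------------------------------------------------------------
import sys
import os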

  __builtin__.profile
except AttributeError:
  # profiler not present, so pass through
  def profile(func):
Collaborator

This try/except seems off. If __builtin__.profile exists, then it still needs to be called as __builtin__.profile, while if it doesn't, profile will exist in the main namespace. If profile is already in the main namespace from __builtin__, why bother importing __builtin__ to test it instead of just testing the one in the main namespace? Am I missing something?

Collaborator

See the accepted answer to this inquiry, where I got it from.

@derekstucki derekstucki (Collaborator) May 4, 2018

What is in that answer is exactly what I would expect here, but isn't what is here. Either the try needs to read "profile = __builtin__.profile", or the except needs to have "__builtin__.profile = profile". As written, profile is in one of two different namespaces depending on whether the try fails or not. Unless I'm missing something.

Collaborator

I see. Fixed using your first suggestion.
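
For reference, a minimal sketch of the pattern after the reviewer's first suggestion (profile = __builtin__.profile), following the usual kernprof/line_profiler convention of injecting profile into builtins; the exact RAVEN code may differ.

import __builtin__   # Python 2 spelling, as in the file under review

try:
  profile = __builtin__.profile   # present when running under a line profiler such as kernprof
except AttributeError:
  # profiler not present, so pass functions through unchanged
  def profile(func):
    return func

@profile
def expensiveCalculation(n):
  return sum(i * i for i in range(n))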

PaulTalbot-INL and others added 4 commits May 16, 2018 09:33
* pre-merge review comments addressed for modules in framework/DataObjects, with the exception of merging DataObject into DataSet

* removed unnecessary hierarchical use of [:]

* remainder of comments addressed
@alfoa alfoa changed the title Dataobject rework - preparation of merge Dataobject rework - future RAVEN v 1.1 May 17, 2018
@alfoa alfoa (Collaborator Author) commented May 17, 2018


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code. THE REVIEW HAS BEEN PERFORMED BY ALL DEVELOPERS
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts). DONE
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.); see the wiki for details. OK
  • 4. Automated tests should pass, including run_tests, pylint, manual building, and xsd tests. If there are changes to Simulation.py or JobHandler.py, the qsub tests must pass. OK
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added that sets the <internalParallel> node to True in the XML block. ALL THE TESTS HAVE BEEN REWORKED
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync. OK. ALFOA APPROVES!
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done. SEVERAL ISSUES ARE ADDRESSED HERE
  • 8. If an analytic test is changed/added, is the analytic documentation updated/added? NO OUTCOME DIFFERENCES

@alfoa alfoa dismissed derekstucki’s stale review May 17, 2018 21:53

We addressed what he asked for or postponed the comments to a subsequent PR.

@PaulTalbot-INL PaulTalbot-INL (Collaborator) left a comment

Targets met, and both internal and external reviews passed. Tests passing. Approving for merge.

@alfoa alfoa merged commit 688cdd1 into devel May 17, 2018
@alfoa alfoa deleted the dataobject-rework branch July 20, 2018 17:24