Adding an option for target normalization in SVR #1853

Open
wants to merge 17 commits into
base: devel

Jimmy-INL
Collaborator


Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)
What are the significant changes in functionality due to this change request?

For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added setting, in XML block, the node <internalParallel> to True.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added, is the analytic documentation updated/added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) has been changed, the associated documentation must be reviewed to ensure the text matches the example.

Jimmy-INL changed the title from "WIP: Adding an option for target normalization in SVR" to "Adding an option for target normalization in SVR" on Jun 15, 2022
Collaborator

@wangcj05 wangcj05 left a comment

Hi @Jimmy-INL, I have a couple of comments for you to consider. Please let me know if you have any questions.

ravenframework/SupervisedLearning/ScikitLearn/SVM/SVR.py (outdated)
@@ -195,5 +192,6 @@ def _localNormalizeData(self,values,names,feat):
    """
    if not self.info['normalize']:
      self.muAndSigmaFeatures[feat] = (0.0,1.0)
      self.muAndSigmaTargets[self.target[0]] = (0.0,1.0)

I think we can move the "if" check into the SupervisedLearning class.

@@ -45,6 +45,7 @@ class SupervisedLearning(BaseInterface):
  # 'boolean', 'integer', 'float'
  qualityEstType = [] # this describe the type of estimator returned known type are 'distance', 'probability'.
  # The values are returned by the self.__confidenceLocal__(Features)
  info = {'normalize':None, 'normalizeTargets':None}

Let's be consistent here: please add "problemtype" to the "info" dict.

@@ -264,9 +266,11 @@ def train(self, tdict, indexMap=None):
    # valueToUse can be either a matrix (for who can handle time-dep data) or a vector (for who can not)
    if self.dynamicFeatures:
      featureValues[:, :, cnt] = (valueToUse[:, :]- self.muAndSigmaFeatures[feat][0])/self.muAndSigmaFeatures[feat][1]
      # targetValues[:,cnt] = (targetValues[:]- self.muAndSigmaFeatures[self.target[0]][0])/self.muAndSigmaFeatures[self.target[0]][1]

Remove this commented-out line?

Comment on lines 272 to 273
if 'normalizeTargets' in self.info.keys() and self.info['normalizeTargets']==True:
  targetValues = (targetValues - self.muAndSigmaTargets[self.target[0]][0])/self.muAndSigmaTargets[self.target[0]][1]

I think this only works for a single target; it needs to be extended to the multi-target case.
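
A minimal sketch of what the multi-target extension could look like, assuming the target values are stored as rows with one entry per target. The helper name `normalizeTargetColumns` and the sample data are hypothetical and are not part of this PR; only the `muAndSigmaTargets` layout (name mapped to a `(mu, sigma)` pair) is taken from the diff.

```python
def normalizeTargetColumns(targetValues, targetNames, muAndSigmaTargets):
    """
      Scale every target column with its own (mu, sigma) pair.
      @ In, targetValues, list of rows, one entry per target, ordered as targetNames
      @ In, targetNames, list of target names
      @ In, muAndSigmaTargets, dict mapping target name to (mu, sigma)
      @ Out, scaled, list of rows with normalized entries
    """
    scaled = []
    for row in targetValues:
        scaled.append([(value - muAndSigmaTargets[name][0]) / muAndSigmaTargets[name][1]
                       for name, value in zip(targetNames, row)])
    return scaled

# hypothetical factors and data for two targets
factors = {'y1': (2.0, 1.0), 'y2': (20.0, 10.0)}
rows = [[1.0, 10.0], [3.0, 30.0]]
scaledRows = normalizeTargetColumns(rows, ['y1', 'y2'], factors)
```

Looking the factors up by target name, rather than through `self.target[0]`, is what keeps each column on its own scale.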

@@ -280,6 +284,7 @@ def _localNormalizeData(self,values,names,feat):
      @ Out, None
    """
    self.muAndSigmaFeatures[feat] = mathUtils.normalizationFactors(values[names.index(feat)])
    self.muAndSigmaTargets[self.target[0]] = mathUtils.normalizationFactors(values[names.index(self.target[0])])

Two comments here:
First, the target normalization is based only on the first target; I think we want to normalize each target.
Second, I think we either need to rewrite _localNormalizeData to handle both features and targets correctly, or have separate methods for features and targets.
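
The second option (separate methods) could be sketched as follows. This is an illustration under stated assumptions, not the RAVEN implementation: `zScoreFactors` is a hypothetical stand-in for `mathUtils.normalizationFactors` (assumed here to return a mean and a standard deviation), and `NormalizingModel`, `_localNormalizeFeature`, and `_localNormalizeTargets` are made-up names.

```python
import statistics

def zScoreFactors(values):
    """Hypothetical stand-in for mathUtils.normalizationFactors: returns (mean, std)."""
    sigma = statistics.pstdev(values)
    # fall back to 1.0 so constant data does not cause a division by zero later
    return statistics.mean(values), (sigma if sigma > 0.0 else 1.0)

class NormalizingModel:
    """Toy holder for the two factor dictionaries; not the RAVEN class."""
    def __init__(self, features, target):
        self.features = features
        self.target = target
        self.muAndSigmaFeatures = {}
        self.muAndSigmaTargets = {}

    def _localNormalizeFeature(self, values, names, feat):
        """Store the factors for a single feature."""
        self.muAndSigmaFeatures[feat] = zScoreFactors(values[names.index(feat)])

    def _localNormalizeTargets(self, values, names):
        """Store the factors for every target, not only self.target[0]."""
        for targ in self.target:
            self.muAndSigmaTargets[targ] = zScoreFactors(values[names.index(targ)])

# hypothetical training data: one list of samples per variable, ordered as names
model = NormalizingModel(features=['x'], target=['y1', 'y2'])
names = ['x', 'y1', 'y2']
values = [[0.0, 2.0], [1.0, 3.0], [10.0, 30.0]]
model._localNormalizeFeature(values, names, 'x')
model._localNormalizeTargets(values, names)
```

Splitting the methods means the per-feature call no longer rewrites the target factors on every invocation, which the current single method does.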

Comment on lines 368 to 370
# if self.target[0] in self.muAndSigmaFeatures.keys():
if ('normalizeTargets' in self.info.keys()) and self.info['normalizeTargets']:
  target.update((x, y * self.muAndSigmaTargets[self.target[0]][1] + self.muAndSigmaTargets[self.target[0]][0]) for x, y in target.items())

Similar comment here: only the first target's factors are used to undo the normalization. I think we need to extend this to every target.
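
A possible shape for the extended back-transformation, assuming the prediction is a dict keyed by target name and that factors are stored per target. The function name `denormalizeTargets` and the numbers are hypothetical.

```python
def denormalizeTargets(prediction, muAndSigmaTargets):
    """
      Map each predicted target back to its original scale with its own (mu, sigma).
      @ In, prediction, dict mapping target name to normalized value
      @ In, muAndSigmaTargets, dict mapping target name to (mu, sigma)
      @ Out, dict mapping target name to denormalized value
    """
    return {name: value * muAndSigmaTargets[name][1] + muAndSigmaTargets[name][0]
            for name, value in prediction.items()}

# hypothetical factors and normalized predictions for two targets
factors = {'y1': (2.0, 1.0), 'y2': (20.0, 10.0)}
restored = denormalizeTargets({'y1': -1.0, 'y2': 1.0}, factors)
```

The only change relative to the line under review is that the factors are looked up with the loop key instead of `self.target[0]`.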

Comment on lines +119 to +120
setting,_ = paramInput.findNodesAndExtractValues(['normalizeTargets'])
self.info['normalizeTargets'] = setting['normalizeTargets']

I think we can create a method that decides whether normalization should be performed. For SVR, we can compute a ratio (basically, between the normalized parameters and the default parameters); if the normalized parameters are too large, we can apply normalization to the targets.
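
The comment does not spell out which parameters enter the ratio. One possible reading, offered purely as an assumption, is to compare the spread of the targets with the width of SVR's epsilon tube (scikit-learn's default is `epsilon=0.1`) and to normalize when the two differ by a large factor. The function name `shouldNormalizeTargets` and the `maxRatio` threshold are hypothetical.

```python
import statistics

# scikit-learn's default epsilon for SVR
SVR_DEFAULT_EPSILON = 0.1

def shouldNormalizeTargets(targetValues, epsilon=SVR_DEFAULT_EPSILON, maxRatio=100.0):
    """
      Hypothetical check: flag targets whose spread is much larger than the epsilon tube.
      @ In, targetValues, list of float, training values of one target
      @ In, epsilon, float, epsilon used by the regressor
      @ In, maxRatio, float, assumed threshold on sigma/epsilon
      @ Out, bool, True if normalizing this target looks advisable
    """
    sigma = statistics.pstdev(targetValues)
    if sigma == 0.0:
        # constant target: scaling cannot help
        return False
    return sigma / epsilon > maxRatio

largeScale = shouldNormalizeTargets([0.0, 1.0e4])
unitScale = shouldNormalizeTargets([0.0, 1.0])
```

With these inputs the first target (standard deviation 5000) is flagged and the second (standard deviation 0.5) is not. The threshold would need tuning against real cases before being used as a default.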

Labels
None yet
Projects
None yet

2 participants