Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cost._evaluate methods to accept precomputed signals #436

Closed
BradyPlanden opened this issue Aug 5, 2024 · 3 comments · Fixed by #455
Closed

Refactor cost._evaluate methods to accept precomputed signals #436

BradyPlanden opened this issue Aug 5, 2024 · 3 comments · Fixed by #455
Assignees
Labels
enhancement New feature or request

Comments

@BradyPlanden
Copy link
Member

BradyPlanden commented Aug 5, 2024

Feature description

At the moment, the private cost evaluate methods (cost._evaluate/evaluateS1) complete their computation via side-effects i.e. inputs is passed, but not used, with self.y & self.dy being the objects to operate on:

def _evaluate(self, inputs: Inputs):

This issue is to update these methods as follows:

  • Rename to cost.compute and cost.computeS1
  • Change the method args to be an object of signals that the cost is calculated on
  • Add object checks to the precomputed signal values

Motivation

Opens up an additional API for users to compute a cost directly without triggering and problem.evaluate creating an advanced workflow that can be used for custom cost methods. An example is shown below, although the logic is meaningless.

for i in range(10):
    eval = problem.evaluate([val1, val2])
    for i in range(100):
        eval += np.random.normal(0, sigma, len(t_eval))
        cost.compute(eval)

Possible implementation

The MAP method may need a large refactor.

Additional context

Initial discussion on #435

@BradyPlanden BradyPlanden added the enhancement New feature or request label Aug 5, 2024
@NicolaCourtier
Copy link
Member

NicolaCourtier commented Aug 5, 2024

I think this is a great direction to go in! 💯

Consider compute_for(y) to indicate that the computation is being applied to a given array and distinguish it from the synonym evaluate().

EDIT: After some more thought, I think using any synonym here (such as "compute" is confusing and I would instead vote for something like evaluate_on_prediction(y).

@BradyPlanden
Copy link
Member Author

BradyPlanden commented Aug 7, 2024

Thanks for the comment @NicolaCourtier!

On further thought, I think we can make better use of the __call__ method within the cost class to simplify these methods. Here's what I'm thinking:

  • The __call__ method is updated to implement the current evaluate and evaluateS1 functionality, with an optional arg added as calculate_grad with default as False. This aligns with PyBaMM's solver interface, which should simplify this change for users. I think this is a nice solution as it remains non-breaking for users, with added functionality for obtaining gradients. This results in evaluate and evaluateS1 being redundant and removed. For v24.9, these should be deprecated with full removal in 24.12.
  • _evaluate and _evaluateS1 are refactored into cost.compute with optional arg for returning gradients. This method computes the cost between two vectors without performing a problem.evaluate. The default would be for the ground truth and the user supplied vector, but an optional arg will be added to replace the ground truth in the computation.

To summarise, the API looks like this:

phi = cost([0.5,0.5]) # stays the same
phi, grad = cost([0.5, 0.5], calculate_grad=True) # returns cost and gradient
phi = cost.compute(np.ones(100)*4.0) # compute the cost against the ground truth and the user supplied vector
phi, grad = cost.compute(np.ones(100)*4.0, calculate_grad=True) # compute the cost and gradient against the ground truth and the user supplied vector

This is a large change to the internal cost API, as the S1 methods are removed and replaced with an optional arg in the non-gradient implementation. I think this is worth doing, as the current naming convention isn't clear to users, and would we would end up with less duplicate code. I'm be interested to hear others' thoughts on this!

@NicolaCourtier
Copy link
Member

This sounds good to me- the function names are not direct synonyms and the amount of duplicate code will be reduced!

@BradyPlanden BradyPlanden self-assigned this Aug 7, 2024
@BradyPlanden BradyPlanden linked a pull request Aug 9, 2024 that will close this issue
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants