Convergence monitors #5590

Merged
merged 14 commits into from
Oct 4, 2024

jakobtorben
Contributor

Implement Convergence Monitoring

This PR introduces a new convergence monitoring feature to improve the robustness and efficiency of the simulator. It is based on the following publication:

Lie, K., Moyner, O., Klemetsdal, Ø., Skaflestad, B., Moncorgé, A., & Kippe, V. (2024). Enhancing Performance of Complex Reservoir Models via Convergence Monitors. ECMOR, 2024(1), 1-9. (https://doi.org/10.3997/2214-4609.202437057)

The convergence monitoring system tracks the convergence behaviour across iterations, applying penalties for non-convergence. If the total penalty count exceeds the specified cut-off limit, the simulator will cut the timestep.

This feature allows early-exiting for steps that are not converging, saving wasted iterations and assembly.

This is a first version that will be iterated on before it is considered for merging.

@jakobtorben
Contributor Author

This first version implements the following convergence monitors:

  • Distance decay: Define the distance from convergence as a vector with entries d_i = max(log(r_i), 0), one for each of the reservoir convergence metrics r_i. Compute the L1 norm of the distance vector, and add a penalty card if the current distance norm is greater than the previous distance norm multiplied by a decay factor σ (default 0.75): d^k > σ d^(k−1).
  • Degradation of reservoir metrics: Add a penalty card for each of the metrics (here CNV and MB) that has increased from the previous iteration.
  • Unconverged wells: Add a penalty card if there are unconverged wells.

If the total number of penalty cards is above a given cut-off limit (default 30), the timestep is cut.
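
The three monitors and the cut-off described above can be sketched roughly as follows. This is a minimal illustration, not the code added by this PR: `MonitorParams`, `PenaltyMonitor` and all member names are hypothetical, and the sketch assumes a base-10 logarithm and that each metric r_i has already been scaled by its tolerance (so r_i <= 1 means that metric has converged).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; these are not the classes added by this PR.
struct MonitorParams {
    double decayFactor = 0.75; // sigma in d^k > sigma * d^(k-1)
    int cutOff = 30;           // cut the timestep when the card count exceeds this
};

class PenaltyMonitor {
public:
    explicit PenaltyMonitor(const MonitorParams& params) : params_(params) {}

    // L1 norm of d_i = max(log10(r_i), 0).
    // Assumption: r_i is scaled by its tolerance, so r_i <= 1 contributes 0.
    static double distance(const std::vector<double>& metrics)
    {
        double sum = 0.0;
        for (const double r : metrics) {
            if (r > 1.0) {
                sum += std::log10(r);
            }
        }
        return sum;
    }

    // Feed one nonlinear iteration; returns true if the timestep should be cut.
    bool update(const std::vector<double>& metrics, bool hasUnconvergedWells)
    {
        assert(!hasPrevious_ || metrics.size() == prevMetrics_.size());
        const double dist = distance(metrics);
        if (hasPrevious_) {
            if (dist > params_.decayFactor * prevDistance_) {
                ++cards_; // distance did not decay fast enough
            }
            for (std::size_t i = 0; i < metrics.size(); ++i) {
                if (metrics[i] > prevMetrics_[i]) {
                    ++cards_; // this metric got worse since the last iteration
                }
            }
        }
        if (hasUnconvergedWells) {
            ++cards_; // at least one well is unconverged
        }
        prevMetrics_ = metrics;
        prevDistance_ = dist;
        hasPrevious_ = true;
        return cards_ > params_.cutOff;
    }

    int cards() const { return cards_; }

private:
    MonitorParams params_;
    std::vector<double> prevMetrics_;
    double prevDistance_ = 0.0;
    bool hasPrevious_ = false;
    int cards_ = 0;
};
```

In this sketch a fresh `PenaltyMonitor` would be created for each timestep attempt, and the nonlinear loop would stop and cut the timestep as soon as `update` returns true.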

I tested this on Norne, where we observe a slight decrease in nonlinear and linear iterations. Other cases are probably better suited, since Norne does not fail much to begin with. (Ignore the zero wasted iterations; I am not yet exiting the timestep cut gracefully.)

image

}

template <typename Serializer>
void serializeOp(Serializer& serializer)
Member

maybe this will change, but I don't see any situation where this needs to be serialized, obviously not in the initial eclipse state broadcast, and as restart-serialization happens at report steps, there is also no reason to serialize as afaict this is reset at each nonlinear iteration, right?

Contributor Author

You are perhaps correct. I just followed the structure of the other classes but was unsure whether I actually needed this part. Subject to change, I am planning for each report to carry a penalty card, so that at the end of the simulation I can add the counts to the INFOITER file (or similar) for analysis. Do I need the serializer for that?

Member

Just picking up this point now, sorry for the delayed response.

> I don't see any situation where this needs to be serialized,

I switched to using the serializeOp() protocol for the ConvergenceReport object in commit 0b40277 (PR #5338). As long as this information is intended for output to the .INFOITER file, it needs to have a serializeOp() member function.
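
For readers unfamiliar with the protocol, a serializeOp() member typically just hands each data member to the serializer in a fixed order. The sketch below illustrates that shape only: `PenaltyCardCounts`, its fields and `RecordingSerializer` are hypothetical stand-ins, not the types used in this PR or in OPM's serializer.

```cpp
#include <vector>

// Hypothetical stand-in for a serializer: it records every value it visits.
struct RecordingSerializer {
    std::vector<int> values;
    void operator()(const int& v) { values.push_back(v); }
};

// Hypothetical penalty-card counters; the field names are illustrative only.
struct PenaltyCardCounts {
    int nonConverged = 0;
    int distanceDecay = 0;
    int largeWellResiduals = 0;

    // Visit every member in a fixed order so that packing and unpacking agree.
    template <typename Serializer>
    void serializeOp(Serializer& serializer)
    {
        serializer(nonConverged);
        serializer(distanceDecay);
        serializer(largeWellResiduals);
    }
};
```

Any new counter added to such a struct would also have to be added to serializeOp(), otherwise it would be silently dropped when the object is packed.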

@jakobtorben
Contributor Author

In the second version, the implementation should match the paper, provided that the tolerance for adding a card for a too-large well residual is the same as the OPM default tolerance at which a well counts as unconverged. The reporting has also been fixed, so that the wasted iterations coming from convergence monitoring are counted. (This fix also involved making sure that failed iterations from NaN and too-large-residual errors are counted.)

When tested on Norne, the results are currently similar to those obtained without convergence monitoring.

image

@jakobtorben
Contributor Author

Fixed some bugs and added the penalty counts to the INFOITER file for analysis.

The INFOITER file was used to analyse the convergence behaviour and cut-off values, using a tool similar to the paper.

Norne_step_275

Using this tool, we can also estimate the number of iterations saved by convergence monitoring, which can be used to find the optimal parameters for a specific case:
NORNE_fraction_of_iterations_remaining_after_early_exit

And the number of incorrectly aborted timesteps:

NORNE_number_of_incorrectly_aborted_timesteps

The optimal parameters were found at a cut-off of 14 and a distance decay factor of 0.60, which should reduce the estimated number of Newton iterations to a factor of 0.989 of the original. The small reduction is likely because Norne does not fail much to begin with in OPM.
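
The kind of estimate described here can be sketched as a replay over recorded penalty counts. This is not the analysis tool itself: `StepRecord`, `ReplayResult` and `replay` are hypothetical names, the input is assumed to be the cumulative card count after each Newton iteration of every timestep attempt, and only the cut-off is varied (changing the decay factor would require recomputing the cards). The sketch also ignores the cost of redoing an incorrectly aborted step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one timestep attempt, e.g. rebuilt from INFOITER data.
struct StepRecord {
    std::vector<int> cumulativeCards; // total cards after each Newton iteration
    bool converged;                   // did the attempt converge without monitoring?
};

struct ReplayResult {
    long totalIterations = 0;   // iterations actually spent in the recorded run
    long keptIterations = 0;    // iterations left if we exit at the cut-off
    int incorrectlyAborted = 0; // converged attempts that would have been cut

    double fractionRemaining() const
    {
        return totalIterations > 0
            ? static_cast<double>(keptIterations) / static_cast<double>(totalIterations)
            : 1.0;
    }
};

// Replay the recorded attempts as if the timestep were cut at the first
// iteration where the cumulative card count exceeds cutOff.
ReplayResult replay(const std::vector<StepRecord>& steps, int cutOff)
{
    assert(cutOff >= 0);
    ReplayResult result;
    for (const StepRecord& step : steps) {
        const std::size_t n = step.cumulativeCards.size();
        std::size_t used = n;
        for (std::size_t i = 0; i < n; ++i) {
            if (step.cumulativeCards[i] > cutOff) {
                used = i + 1; // the iteration that triggers the cut still counts
                break;
            }
        }
        result.totalIterations += static_cast<long>(n);
        result.keptIterations += static_cast<long>(used);
        if (used < n && step.converged) {
            ++result.incorrectlyAborted;
        }
    }
    return result;
}
```

Calling `replay` for a range of cut-off values and comparing `fractionRemaining()` against `incorrectlyAborted` gives the kind of trade-off the two plots above show.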

Using these optimal parameters, we can run OPM with convergence monitoring to see if we get any improvements. This run sets the relaxed CNV and MB tolerances equal to the original tolerances, to match the format used in the analysis tool.

image

From the results, we see a small reduction in Newton iterations and runtime, as expected from our analysis. Better results are likely achieved on cases with more failed timesteps.

@jakobtorben
Contributor Author

Depends on OPM/opm-common#4244.

@jakobtorben
Contributor Author

Note that the well convergence metric is not used at the moment, but the plan is to include well convergence metrics in the convergence monitoring as well. This requires two things:

  • WellConvergenceMetric must be extended to multi-segment wells.
  • Logic must be added to handle the number of wells increasing during the simulation, since this affects the comparison of the number of unconverged residuals with the previous iteration.

@jakobtorben
Contributor Author

jenkins build this opm-common=4244 please

@jakobtorben jakobtorben marked this pull request as ready for review October 3, 2024 07:07
@atgeirr atgeirr self-assigned this Oct 3, 2024
Member

@atgeirr atgeirr left a comment

Nice, just some small fry to fix.

opm/simulators/timestepping/ConvergenceReport.hpp (outdated, resolved)
opm/simulators/timestepping/AdaptiveTimeStepping.hpp (outdated, resolved)
opm/simulators/flow/BlackoilModel.hpp (outdated, resolved)
opm/simulators/flow/BlackoilModel.hpp (resolved)
@jakobtorben
Contributor Author

jenkins build this please

@atgeirr
Member

atgeirr commented Oct 4, 2024

All good, all green! Merging.

@atgeirr atgeirr merged commit 9654215 into OPM:master Oct 4, 2024
1 check passed
4 participants