Merged
33 commits
80f05a4
Config changes for adding a second model dev run
mo-nikosbaltas Dec 17, 2025
6cf6121
Reformatted rose-*.conf files after CI failed
mo-nikosbaltas Dec 17, 2025
dd6140b
changes related to #286 and standardise model data
mo-nikosbaltas Dec 19, 2025
4ed330c
#286 after merging with #285 and main
mo-nikosbaltas Dec 23, 2025
23f5754
added CDDS/standardisation support for two models
mo-nikosbaltas Dec 23, 2025
3200651
Update CMEW/flow.cylc
mo-nikosbaltas Dec 24, 2025
4900f99
Update CMEW/app/standardise_model_data/rose-app.conf
mo-nikosbaltas Dec 24, 2025
52fe013
Update CMEW/app/configure_standardise/bin/configure_standardise.sh
mo-nikosbaltas Dec 24, 2025
2988ecb
changes implemented as suggested in PR 305 review
mo-nikosbaltas Dec 24, 2025
1fd6a96
used the env variable VARIANT_LABEL
mo-nikosbaltas Dec 29, 2025
cc0ab8e
Merge branch 'main' into 287-feed-model_id-and-variant_label-to-recipe
mo-nikosbaltas Dec 30, 2025
71beadb
changes to support dual model runs with recipes containing models_id …
mo-nikosbaltas Dec 30, 2025
7fbc3d9
ensures two datasets entries are present
mo-nikosbaltas Dec 31, 2025
7d0d99b
Update CMEW/app/configure_for/bin/test_update_recipe_file.py
mo-nikosbaltas Dec 31, 2025
b492145
Update CMEW/app/configure_for/bin/test_update_recipe_file.py
mo-nikosbaltas Dec 31, 2025
adc40d1
Update CMEW/app/configure_for/bin/test_update_recipe_file.py
mo-nikosbaltas Dec 31, 2025
96cc418
added/edited comments suggested
mo-nikosbaltas Dec 31, 2025
d83d0cb
Merge branch '287-feed-model_id-and-variant_label-to-recipe' of githu…
mo-nikosbaltas Dec 31, 2025
2067f71
comments refinement
mo-nikosbaltas Dec 31, 2025
704cdb7
documentation for dual runs
mo-nikosbaltas Jan 2, 2026
eec1dee
Revert "documentation for dual runs"
mo-nikosbaltas Jan 5, 2026
a040aaf
updated for dual-model run
mo-nikosbaltas Jan 5, 2026
cc0ee6f
Merge branch 'main' into 287-feed-model_id-and-variant_label-to-recipe
alistairsellar Jan 5, 2026
aecaeaf
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 5, 2026
26bd40e
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 5, 2026
13327eb
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 5, 2026
8cf02ca
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 5, 2026
1f5076e
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 5, 2026
6c37de2
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 6, 2026
745927e
Update doc/source/user_guide/workflow.rst
mo-nikosbaltas Jan 6, 2026
9e423bc
some typo corrections
mo-nikosbaltas Jan 7, 2026
8a1d710
changed some confusing comments
mo-nikosbaltas Jan 7, 2026
160f614
edited comments
mo-nikosbaltas Jan 8, 2026
22 changes: 19 additions & 3 deletions CMEW/app/configure_for/bin/test_update_recipe_file.py
@@ -9,9 +9,18 @@

 @pytest.fixture
 def mock_env_vars(monkeypatch):
+    # Time window
     monkeypatch.setenv("START_YEAR", "1993")
     monkeypatch.setenv("NUMBER_OF_YEARS", "1")
+
+    # Reference run metadata
+    monkeypatch.setenv("REF_MODEL_ID", "HadGEM3-GC31-LL")
+    monkeypatch.setenv("REF_VARIANT_LABEL", "r1i1p1f3")
+
+    # Evaluation run metadata
+    monkeypatch.setenv("MODEL_ID", "UKESM1-0-LL")
+    monkeypatch.setenv("VARIANT_LABEL", "r1i1p1f1")


 @pytest.fixture
 def path_to_updated_recipe_kgo():
@@ -38,6 +47,12 @@ def path_to_mock_original_recipe():
 def test_update_recipe(
     mock_env_vars, path_to_updated_recipe_kgo, path_to_mock_original_recipe
 ):
+    """update_recipe should produce the KGO with both datasets updated.
+
+    - Dataset[0] uses REF_MODEL_ID / REF_VARIANT_LABEL
+    - Dataset[1] uses MODEL_ID / VARIANT_LABEL
+    - start_year and end_year are set from START_YEAR / NUMBER_OF_YEARS
+    """
     with open(path_to_updated_recipe_kgo, "r") as file_handle:
         expected = yaml.safe_load(file_handle)
     actual = update_recipe(path_to_mock_original_recipe)
@@ -51,19 +66,20 @@ def test_main(
     path_to_mock_original_recipe,
     tmp_path,
 ):
+    """main() should overwrite the recipe in-place with the updated content."""
     # Copy the original recipe to a tmp_path location to allow it to be
     # overwritten.
     path_to_temp_recipe = tmp_path / "tmp_recipe.yml"
     shutil.copy(path_to_mock_original_recipe, path_to_temp_recipe)

     # Mock the environmental variable 'RECIPE PATH' to the tmp_path location
     # where the original recipe is stored.
-    monkeypatch.setenv("RECIPE_PATH", path_to_temp_recipe)
+    monkeypatch.setenv("RECIPE_PATH", str(path_to_temp_recipe))

     main()

     with open(path_to_temp_recipe, "r") as file_handle_1:
-        actual = file_handle_1.readlines()
+        actual_lines = file_handle_1.readlines()

     with open(path_to_updated_recipe_kgo, "r") as file_handle_2:
         kgo_with_comment = file_handle_2.readlines()
@@ -72,4 +88,4 @@ def test_main(
     # 'test_updated_radiation_budget_recipe.yml'.
     kgo_without_comment = kgo_with_comment[5:]

-    assert actual == kgo_without_comment
+    assert actual_lines == kgo_without_comment
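
The change to `str(path_to_temp_recipe)` in `test_main` matters because environment values have to be strings. Below is a minimal sketch of that behaviour using `os.environ` directly. It does not exercise pytest's `monkeypatch`, and the helper names `set_recipe_path` and `raw_assignment_fails` are made up for illustration.

```python
import os
from pathlib import PurePosixPath


def set_recipe_path(path):
    """Store ``path`` in RECIPE_PATH as a string and return the stored value."""
    os.environ["RECIPE_PATH"] = str(path)
    return os.environ["RECIPE_PATH"]


def raw_assignment_fails(path):
    """Return True if assigning a path object to os.environ raises TypeError."""
    try:
        os.environ["RECIPE_PATH"] = path  # no str() conversion
    except TypeError:
        return True
    return False


recipe = PurePosixPath("/tmp/tmp_recipe.yml")
print(set_recipe_path(recipe))
print(raw_assignment_fails(recipe))
```

Converting at the call site keeps the path object available for `shutil.copy` and `open`, while the environment only ever sees a plain string.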
63 changes: 49 additions & 14 deletions CMEW/app/configure_for/bin/update_recipe_file.py
@@ -15,8 +15,10 @@ def update_recipe(recipe_path):
     """Update the ESMValTool recipe.

     * Read the ESMValTool recipe YAML file from the provided ``recipe_path``
-    * Update the dataset section of the recipe with a) CMEW required key/values
-      and b) user configurables values from the Rose suite configuration.
+    * Update the datasets section of the recipe with:
+        - CMEW required key/values
+        - User configurable values from the Rose suite configuration
+          for both the reference and evaluation model runs.

     Recipe file/datasets section snippet (human written YAML)::

@@ -31,16 +33,17 @@ def update_recipe(recipe_path):
     Updated recipe file/datasets section snippet (machine written YAML)::

         datasets:
-        - {dataset: <dataset>, end_year: <end_year>, ensemble: <ensemble>,
-          end_year: <end_year>, exp: <exp>, grid: <grid>, project: <project>,
-          start_year: <start_year>}
-        - {activity: <activity>, dataset: <dataset>, end_year: <end_year>,
-          ensemble: <ensemble>, exp: <exp>, grid: <grid>, project: <project>,
+        - {dataset: <ref_model_id>, end_year: <end_year>, ensemble: <ref_variant>,
+          exp: <exp>, grid: <grid>, project: <project>, start_year: <start_year>}
+        - {activity: <activity>, dataset: <eval_model_id>, end_year: <end_year>,
+          ensemble: <eval_variant>, exp: <exp>, grid: <grid>, project: <project>,
           start_year: <start_year>}

     Notes
     -----
-    The updated recipe includes one additional CMEW required key: "Activity".
+    The updated recipe includes:
+    * Reference dataset (index 0) using REF_MODEL_ID and REF_VARIANT_LABEL
+    * Evaluation dataset (index 1) using MODEL_ID and VARIANT_LABEL

     Parameters
     ----------
@@ -49,28 +52,60 @@ def update_recipe(recipe_path):

     Returns
     -------
-    recipe: dict[str, union[str, int]]
+    recipe: dict
         The content of the ESMValTool recipe with updated datasets section.
     """
+    # Time window from environment
     start_year = int(os.environ["START_YEAR"])
     end_year = (
         int(os.environ["START_YEAR"]) + int(os.environ["NUMBER_OF_YEARS"]) - 1
     )
+
+    # Model metadata from environment
+    ref_model_id = os.environ["REF_MODEL_ID"]
+    ref_variant = os.environ["REF_VARIANT_LABEL"]
+    eval_model_id = os.environ["MODEL_ID"]
+    eval_variant = os.environ["VARIANT_LABEL"]
+
     with open(recipe_path, "r") as file_handle:
         recipe = yaml.safe_load(file_handle)
-    first_dataset = recipe["datasets"][0]
-    second_dataset = recipe["datasets"][1]
-    first_dataset.update({"start_year": start_year, "end_year": end_year})
-    second_dataset.update(
+
+    datasets = recipe.get("datasets", [])
+    if len(datasets) < 2:
+        raise ValueError(
+            "Expected at least two datasets in the recipe, "
+            "one for the reference and one for the evaluation run."
+        )
+
+    # Reference dataset: treat as a GCModelDev / ESMVal / amip run,
+    # using REF_MODEL_ID & REF_VARIANT_LABEL, with the configured time window.
+    ref_dataset = datasets[0]
+    ref_dataset.update(
+        {
+            "dataset": ref_model_id,
+            "project": "ESMVal",
+            "exp": "amip",
Collaborator

Here, I think we're changing this from "historical". Is it deliberate? If so, the comment should be updated.

Collaborator

Ooh, I should have picked this up in first review. Actually I think that experiment might be wrong for both runs, including the original. For this recipe (radiation budget) the choice of experiment doesn't make a difference, but it will for some recipes, so experiment should be something that the user defines as part of the model run definition. I've just opened an issue to add that: #316.

For this PR, I propose that we accept that the second run is no more wrong than the first, and that having them consistently called "amip" is as good as any choice. I.e. I propose that we keep "exp": "amip" for both.
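
As a rough sketch only, this is one way the user-defined experiment proposed above could be read. The variable names `EXPERIMENT_ID` and `REF_EXPERIMENT_ID` and the helper `experiment_for` are hypothetical: they are not part of this PR and are not taken from #316. The `amip` fallback mirrors the value currently hard-coded for both runs.

```python
import os


def experiment_for(run, env=None, default="amip"):
    """Return the experiment for a run ("ref" or "eval").

    Falls back to ``default`` when the (hypothetical) variable is unset.
    """
    env = os.environ if env is None else env
    key = {"ref": "REF_EXPERIMENT_ID", "eval": "EXPERIMENT_ID"}[run]
    return env.get(key, default)


# With nothing configured, both runs keep the current "amip" behaviour.
print(experiment_for("ref", env={}))
# A user-supplied value takes precedence over the default.
print(experiment_for("eval", env={"EXPERIMENT_ID": "historical"}))
```

Keeping a default means existing suite configurations would continue to work unchanged if such a setting were introduced.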

Collaborator

I still think the comment should reflect what's going on, but I will take note to pay more attention to the unchanged code in a review next time.

Collaborator Author

mo-nikosbaltas Jan 7, 2026

I spent some time debugging this failure last week while implementing #287, and I have dug out the logs I kept, so here is the explanation for completeness.
There was a failure because the reference dataset was treated as a CMIP6 “historical” run in the recipe, while CDDS had standardised it as a GCModelDev / ESMVal / amip run.
From the ESMValTool log for the reference dataset (dataset index 0, HadGEM3):
'dataset': 'HadGEM3-GC31-LL',
'project': 'ESMVal',
'mip': 'Amon',
'short_name': 'hfls',
'activity': 'CMIP',
'alias': 'None',
'ensemble': 'r5i1p1f3',
'exp': 'historical',
...
So, after executing update_recipe_file.py:
• ‘project’ has been changed to ESMVal
• ‘ensemble’ is r5i1p1f3 (from REF_VARIANT_LABEL)
• But:
o ‘exp’ was still historical
o ‘activity’ was still CMIP
Now looking at where ESMValTool is searching for files (in the logs):
Looked for files matching
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/MOHC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/NERC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
Key bits:
• Path includes GCModelDev/CMIP/.../historical/r5i1p1f3/...
• This is driven by ‘activity: CMIP’ and ‘exp: historical’.
However, the CDDS request (from create_request_file.py) uses:
"experiment_id": "amip",
and is run twice (REF and EVAL) via standardise_model_data. So CDDS is standardising:
• GCModelDev/ESMVal//amip//...
for both runs.
That means:
• CDDS has produced ‘amip’ data
• ESMValTool is still looking for ‘historical’ data for the reference dataset
• Hence: “No input files found for Dataset ... historical ...”
The evaluation dataset works fine because we had explicitly set:
'project': 'ESMVal',
'activity': 'ESMVal',
'exp': 'amip',
'ensemble': 'r1i1p1f1',
...
and ESMValTool finds (from logs):
Found input files for Dataset: hfls, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
So the fix was to make the reference dataset use the GCModelDev/ESMVal “amip” semantics too.
We also need to override ‘exp’ and ‘activity’ for the reference dataset, in the same way as we do for the evaluation dataset.
I updated that block to:
# Reference dataset: treat as a GCModelDev / ESMVal / amip run,
ref_dataset = datasets[0]
ref_dataset.update(
    {
        "dataset": ref_model_id,
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "ensemble": ref_variant,
        "start_year": start_year,
        "end_year": end_year,
    }
)
# Evaluation dataset: ESMVal / amip run using MODEL_ID + VARIANT_LABEL
eval_dataset = datasets[1]
eval_dataset.update(
    {
        "dataset": eval_model_id,
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "ensemble": eval_variant,
        "start_year": start_year,
        "end_year": end_year,
    }
)
That aligns both datasets with:
• project: ESMVal
• activity: ESMVal
• exp: amip
which matches what CDDS is actually producing from create_request_file.py.

I hope this explains why ‘project’ and ‘exp’ are overridden. Whether this is the correct approach still needs further investigation.
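
To make the mismatch concrete, here is a small sketch that rebuilds a relative path from the facets, following the pattern visible in the log lines above. It is an illustration only and not ESMValTool's own path logic. `drs_relative_path` and the facet dictionary layout are assumptions, the doubled slash seen in the log is collapsed to a single one, and the standardised layout is assumed to follow the same pattern with `activity: ESMVal` and `exp: amip`.

```python
def drs_relative_path(facets):
    """Build a relative path shaped like the ones in the log above."""
    directory = "/".join(
        [
            "GCModelDev",
            facets["activity"],
            facets["institute"],
            facets["dataset"],
            facets["exp"],
            facets["ensemble"],
            facets["mip"],
            facets["short_name"],
            facets["grid"],
        ]
    )
    filename = "_".join(
        [
            facets["short_name"],
            facets["mip"],
            facets["dataset"],
            facets["exp"],
            facets["ensemble"],
            facets["grid"],
        ]
    )
    return f"{directory}/{filename}.nc"


# Facets as the recipe had them for the reference dataset before the fix.
reference = {
    "activity": "CMIP",
    "institute": "MOHC",
    "dataset": "HadGEM3-GC31-LL",
    "exp": "historical",
    "ensemble": "r5i1p1f3",
    "mip": "Amon",
    "short_name": "hfls",
    "grid": "gn",
}
searched = drs_relative_path(reference)
# Same run, but with the overrides applied by the fix.
standardised = drs_relative_path(
    {**reference, "activity": "ESMVal", "exp": "amip"}
)

print(searched)
print(standardised)
print("paths match:", searched == standardised)
```

Because `activity` and `exp` both appear in the directory part and `exp` also appears in the file name, leaving either one at its CMIP6 value points the search at a location where no standardised data was written.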

Collaborator

I am happy that this comment is resolved from my perspective, but will leave open for @mo-nikosbaltas to close if he and @alistairsellar are satisfied that the "investigate further" aspect has been / is elsewhere addressed.

+            "activity": "ESMVal",
+            "ensemble": ref_variant,
+            "start_year": start_year,
+            "end_year": end_year,
+        }
+    )
+
+    # Evaluation dataset: ESMVal / amip run using MODEL_ID and VARIANT_LABEL
+    eval_dataset = datasets[1]
+    eval_dataset.update(
         {
+            "dataset": eval_model_id,
             "project": "ESMVal",
             "exp": "amip",
             "activity": "ESMVal",
-            "ensemble": "r1i1p1f1",
+            "ensemble": eval_variant,
             "start_year": start_year,
             "end_year": end_year,
         }
     )
+
     return recipe
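
The logic of `update_recipe` can be sketched without file I/O or PyYAML, which makes the two-dataset behaviour easy to try out. This is a hedged illustration and not the PR's implementation: `update_datasets` and its `env` argument are hypothetical, and collecting the common overrides in one `shared` dictionary is a refactoring choice of this sketch.

```python
def update_datasets(recipe, env):
    """Apply the reference/evaluation overrides to an in-memory recipe dict."""
    start_year = int(env["START_YEAR"])
    end_year = start_year + int(env["NUMBER_OF_YEARS"]) - 1

    datasets = recipe.get("datasets", [])
    if len(datasets) < 2:
        raise ValueError(
            "Expected at least two datasets in the recipe, "
            "one for the reference and one for the evaluation run."
        )

    # Values applied identically to both runs.
    shared = {
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "start_year": start_year,
        "end_year": end_year,
    }
    # (model id variable, variant label variable) for dataset index 0 and 1.
    run_keys = [
        ("REF_MODEL_ID", "REF_VARIANT_LABEL"),
        ("MODEL_ID", "VARIANT_LABEL"),
    ]
    # zip stops after two entries, so any further datasets are left untouched.
    for dataset, (model_key, variant_key) in zip(datasets, run_keys):
        dataset.update(
            {"dataset": env[model_key], "ensemble": env[variant_key], **shared}
        )
    return recipe


# Same values as the mock_env_vars fixture in the tests.
env = {
    "START_YEAR": "1993",
    "NUMBER_OF_YEARS": "1",
    "REF_MODEL_ID": "HadGEM3-GC31-LL",
    "REF_VARIANT_LABEL": "r1i1p1f3",
    "MODEL_ID": "UKESM1-0-LL",
    "VARIANT_LABEL": "r1i1p1f1",
}
recipe = update_datasets({"datasets": [{"grid": "gn"}, {"grid": "gn"}]}, env)
for dataset in recipe["datasets"]:
    print(dataset)
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, lets a test supply values without `monkeypatch`.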


@@ -16,6 +16,7 @@ def test_create_request(monkeypatch):
     monkeypatch.setenv("ROOT_DATA_DIR", "/path/to/data/dir/")
     monkeypatch.setenv("SUITE_ID", "u-az513")
     monkeypatch.setenv("VARIABLES_PATH", "/path/to/variables.txt")
+    monkeypatch.setenv("VARIANT_LABEL", "r1i1p1f1")

     config = create_request()
     actual = {
@@ -4,12 +4,13 @@
 # To be used for testing purposes. Represents KGO for `update_recipe`.
 ---
 datasets:
-- dataset: HadGEM3-GC31-LL
+- activity: ESMVal
+  dataset: HadGEM3-GC31-LL
   end_year: 1993
   ensemble: r1i1p1f3
-  exp: historical
+  exp: amip
   grid: gn
-  project: CMIP6
+  project: ESMVal
   start_year: 1993
 - activity: ESMVal
   dataset: UKESM1-0-LL
54 changes: 32 additions & 22 deletions doc/source/user_guide/workflow.rst
@@ -12,74 +12,84 @@ An overview of the workflow

 ``install_env_file``
 :Description:
-Activates the environment for |ESMValTool|, based on the ``SITE`` provided
+Activates the environment for |ESMValTool|, based on the ``SITE`` provided.
 :Runs on:
 Localhost
 :Executes:
-The ``install_env_file.sh`` script from the |Rose| app
+The ``install_env_file.sh`` script from the |Rose| app.
 :Details:
-Runs once at the start of the workflow
+Runs once at the start of the workflow.

 ``configure_recipe``
 :Description:
 Creates and modifies the |ESMValTool| user configuration file,
-and writes it to the cylc workflow ``share/etc`` directory
+and writes it to the cylc workflow ``share/etc`` directory.
 :Runs on:
 Localhost
 :Executes:
-The ``configure_recipe.py`` script from the |Rose| app
+The ``configure_recipe.py`` script from the |Rose| app.
 :Details:
 Runs immediately after the successful completion of the ``install_env_file`` job.
 Temporarily, the modified ESMValTool developer configuration file is copied from
-the ``configure_recipe`` app to the ``share/etc`` directory in the installed workflow
+the ``configure_recipe`` app to the ``share/etc`` directory in the installed workflow.

 ``configure_for``
 :Description:
 Copies an updated version of the |ESMValTool| recipe
-into the cylc workflow ``share/etc`` directory
-in the installed workflow
+into the Cylc workflow ``share/etc`` directory
+in the installed workflow and configures it
+to use standardised model data.
 :Runs on:
 Localhost
 :Executes:
 For the required recipe,
 executes the ``esmvaltool recipes get`` command
-followed by the ``update_recipe_file.py`` script from the |Rose| app
+followed by the ``update_recipe_file.py`` script from the |Rose| app.
 :Details:
 Runs once for each recipe,
 immediately after the successful completion
 of the ``install_env_file`` job.
 The recipe is updated with CMEW required variables
 (e.g. "Activity": "ESMVal")
 and also with user configurable variables
-from the |Rose Edit GUI|_/``rose-suite.conf``
+from the |Rose Edit GUI|_/``rose-suite.conf``,
+for both model runs.
 :Families:
 ``RECIPE``

 ``configure_standardise``
 :Description:
-Creates the ``request.json`` file and variables list which are needed to run
-|CDDS| and creates the |CDDS| directory structure.
+Creates the |CDDS| request metadata
+and variables list required to standardise two model development runs,
+then prepares the |CDDS| directory structure.
 :Runs on:
 Localhost
 :Executes:
-The ``configure_standardise.sh`` script from the |Rose| app
+The ``configure_standardise.sh`` script from the |Rose| app.
 :Details:
 Runs once for each recipe, immediately after the successful
-completion of the ``configure_for`` job
+completion of the ``configure_for`` job.
+Generates |CDDS| request metadata for each model run (reference and evaluation):
+``request_ref.json``, ``request_eval.json``.
+Reads model-specific values from the workflow environment.
+Creates the required directory structure to support
+multiple |CDDS| standardisation workflows
+within the same |CMEW| cycle.

 ``standardise_model_data``
 :Description:
-Launches the |CDDS| workflow and converts the data into a |CMIP| compliant
-format for |ESMValTool|
+Launches the CDDS workflow and converts both model runs into |CMIP|-compliant
+datasets suitable for |ESMValTool| evaluation.
 :Runs on:
 Localhost
 :Executes:
 The ``cdds_convert`` command and the ``restructure_dirs.sh`` script
-from the |Rose| app
+from the |Rose| app.
 :Details:
 Runs after the successful completion of the ``configure_standardise`` job.
-The ``restructure_dirs.sh`` script moves the standardised data into
-a directory with a BADC DRS structure so that |ESMValTool| can find the data
+Executes |CDDS| standardisation for both the reference and evaluation model runs
+to produce |CMIP|-compliant output for each.
+Uses ``restructure_dirs.sh`` to move standardised data into a BADC DRS structure.

 ``housekeeping``
 :Description:
@@ -97,13 +107,13 @@ An overview of the workflow
 Runs the requested recipes using |ESMValTool|
 :Runs on:
 ``COMPUTE``, which depends on the ``SITE``; at the Met Office, the
-``run_recipe`` jobs will run on SPICE
+``run_recipe`` jobs will run on SPICE.
 :Executes:
 The |ESMValTool| command line script
 :Details:
 Runs once for each recipe,
 after the successful completion of the ``standardise_model_data``
-and the ``configure_recipe`` jobs
+and the ``configure_recipe`` jobs.
 :Families:
 ``COMPUTE``, ``RECIPE``

@@ -127,7 +137,7 @@ An overview of the workflow
 ``pytest`` from the |Rose| app
 :Details:
 Runs on its own when ``-O unittest`` command is invoked, or runs alongside the
-full workflow when running with ``-O test``
+full workflow when running with ``-O test``.

 Design considerations
 ---------------------
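
Returning to the ``configure_standardise`` step documented in ``workflow.rst``, the idea of writing one request file per model run can be sketched as follows. This is an assumption-laden illustration, not the code behind ``configure_standardise.sh``: the function `write_requests` is hypothetical, the JSON keys `model_id` and `variant_label` are guesses, and a real |CDDS| request contains many more fields. Only the file names and the `experiment_id` value of `amip` come from the PR discussion.

```python
import json
import tempfile
from pathlib import Path


def write_requests(output_dir, env):
    """Write request_ref.json and request_eval.json and return their paths."""
    runs = {
        "ref": ("REF_MODEL_ID", "REF_VARIANT_LABEL"),
        "eval": ("MODEL_ID", "VARIANT_LABEL"),
    }
    paths = []
    for label, (model_key, variant_key) in runs.items():
        request = {
            "model_id": env[model_key],  # assumed key name
            "variant_label": env[variant_key],  # assumed key name
            "experiment_id": "amip",
        }
        path = Path(output_dir) / f"request_{label}.json"
        path.write_text(json.dumps(request, indent=2))
        paths.append(path)
    return paths


env = {
    "REF_MODEL_ID": "HadGEM3-GC31-LL",
    "REF_VARIANT_LABEL": "r1i1p1f3",
    "MODEL_ID": "UKESM1-0-LL",
    "VARIANT_LABEL": "r1i1p1f1",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    for path in write_requests(tmp_dir, env):
        print(path.name, json.loads(path.read_text())["model_id"])
```

Driving both files from one table of variable names keeps the reference and evaluation requests structurally identical, so a difference between them can only come from the environment values.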