
A bug in calculating accumulated fields (24/240 hours averaged) when using a smaller timestep #1789

Closed
Duseong opened this issue Jun 29, 2022 · 18 comments · Fixed by #1802
Labels: bug (something is working incorrectly), science (Enhancement to or bug impacting science)

Duseong commented Jun 29, 2022

Brief summary of bug

It looks like there's a bug in calculating accumulated fields (e.g., TV24, TV240, PAR24_sun, PAR240_sun, etc.) when using a smaller timestep.

General bug information

CTSM version you are using:
ctsm5.1.dev019 (cam6_3_018)

Does this bug cause significantly incorrect results in the model's science?
Yes

Configurations affected:
Regional refinement simulations or any configurations that use a much smaller timestep than the global 1-degree model.

Details of bug

Please see the attached file for time series of the variables (TV, TV24, TV240, PAR_sun, PAR24_sun, PAR240_sun) at the Manaus point over the Amazon. Black dots are from the ne30 simulation, and red dots are from the same ne30 simulation but with a different timestep (3.75 minutes -> ATM_NCPL = LND_NCPL of 384).
There are unexpectedly large fluctuations in the 24/240-hour averaged fields in the smaller-timestep run, so I suspect a bug in the calculation of those fields.
Since regional refinement simulations use a smaller timestep, their results are heavily affected by these fields. I found that global biogenic emissions, OH, and other chemical fields such as CO changed substantially. Other atmospheric fields may also be affected.
Accumulated_fields_time_series.pdf

Important details of your setup / configuration so we can reproduce the bug

I changed ATM_NCPL (=LND_NCPL) from 48 to 384 to see the effects of the timestep on those fields. The spatial resolution was the same for both (ne30).

billsacks added the "bug (something is working incorrectly)", "tag: bug - impacts science", and "next (this should get some attention in the next week or two)" labels Jun 29, 2022
billsacks (Member) commented Jun 29, 2022

Thank you for opening this issue. I agree that this appears to be a significant bug!

From looking through the code, I think I see what's happening here: the PERIOD of accumulated fields is read from the initial conditions file, but I don't think it needs to be, and doing so is problematic if the run you're doing uses a different time step than the time step used in creating the initial conditions file originally.

Can you try redoing your test after rebuilding the code with this block of code deleted:

if (accum(nf)%old_name /= "") then
   varname = trim(accum(nf)%name) // '_PERIOD:' // trim(accum(nf)%old_name) // '_PERIOD'
else
   varname = trim(accum(nf)%name) // '_PERIOD'
end if
call restartvar(ncid=ncid, flag=flag, varname=varname, xtype=ncd_int, &
     long_name='', units='time steps', &
     imissing_value=ispval, ifill_value=huge(1), &
     interpinic_flag='copy', &
     data=accum(nf)%period, readvar=readvar)

If that gives other problems, then a simpler but safer experiment would be to redo your test after setting ./xmlchange CLM_FORCE_COLDSTART=on so that the model doesn't use a restart file at all. (That won't be good for science, but if my hypothesis is right, then it should get around this issue. I'd like to see if that's true.)
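
To make the failure mode concrete, here is a minimal sketch of the arithmetic (plain standalone Fortran, not CTSM code; all names are illustrative) showing why a PERIOD stored in time steps goes wrong once dtime changes:

program period_demo
  implicit none
  integer, parameter :: secs_per_day = 86400
  integer :: dtime_old, dtime_new, period_steps
  real :: effective_days

  ! A 10-day average at the original 30-minute step is stored as 480 steps.
  dtime_old = 1800
  period_steps = 10 * secs_per_day / dtime_old

  ! Reusing those 480 steps at a 3.75-minute step (ATM_NCPL = 384)
  ! shrinks the averaging window from 10 days to 1.25 days.
  dtime_new = 225
  effective_days = real(period_steps * dtime_new) / real(secs_per_day)

  print *, 'PERIOD read from restart (time steps):', period_steps
  print *, 'effective window at new dtime (days): ', effective_days
end program period_demo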

billsacks (Member) commented:

(Note to self and @samsrabin: usually my experience is that it's a bad idea to get side-tracked going down tangential rabbit holes, but apparently this one #1684 (comment) was worth following up on...)

Duseong (Author) commented Jun 29, 2022

> Can you try redoing your test after rebuilding the code with this block of code deleted: [...] If that gives other problems, then a simpler but safer experiment would be to redo your test after setting ./xmlchange CLM_FORCE_COLDSTART=on so that the model doesn't use a restart file at all.

Thanks a lot for the quick response! I will give it a spin and will let you know if it's corrected!

Duseong (Author) commented Jun 30, 2022

Thanks again for your comment!
Accumulated_fields_time_series_bugfix.pdf

Please see the attached file, which now includes another simulation (blue dots) with your suggested bug fix applied.

I think it solved the problem; the accumulated fields are now much closer to those from the base simulation.

These are only 5-day simulation results, so I will do a longer run for final confirmation, but I don't expect a longer run to show different results.

billsacks removed the "next" label Jun 30, 2022
billsacks (Member) commented:

Thanks a lot for trying that out and posting the results! I'm not too surprised to see that it takes some time for the accumulation fields to adjust to the new time step. It looks to me like after this initial period they appear to be doing the right thing. I'm planning to move ahead with this fix, but please do let us know if you see other problems.

billsacks self-assigned this Jun 30, 2022
adamrher (Contributor) commented Jul 5, 2022

@Duseong says

> Since regional refinement simulations use a smaller timestep, their results are heavily affected by these fields. I found that global biogenic emissions, OH, and other chemical fields such as CO changed substantially. Other atmospheric fields may also be affected.

I just want to verify that this issue is purely a problem with diagnostic output, and not something that impacts other fields, as the comment above indicates.

lkemmons commented Jul 5, 2022 via email

adamrher (Contributor) commented Jul 5, 2022

oy vey. OK thanks.

Duseong (Author) commented Jul 5, 2022

> These are only 5-day simulation results, so I will do a longer run for final confirmation, but I don't expect a longer run to show different results.

Louisa and I have since conducted several-month simulations, and again we confirmed that Bill's method solved the problem.

lkemmons commented Jul 6, 2022

The accumulMod routines are used in many other parts of CLM as well. We have not investigated the impact on other CLM fields.

adamrher (Contributor) commented Jul 8, 2022

I'd be interested to know whether our standard CAM6 low-top configuration (not CAM-CHEM) is impacted by this bug. Which species have MEGAN emissions? Does that refer to aerosol species or chemical species?

In the CAM configuration I'm referring to, these chemical species are dynamically active (i.e., dycore tracers):

DMS, H2O2, H2SO4, SO2, SOAG

... and the aerosol species:

bc_aX, dst_aX, ncl_aX, num_aX, pom_aX, so4_aX, soa_aX

lkemmons commented Jul 8, 2022

CAM alone does not use MEGAN. However, these accumulated averages are used throughout CLM for a variety of parameters. I have not yet had a chance to look at the impact on physical parameters in CLM, nor the impact on T, Q, etc. in CAM.

billsacks (Member) commented:

@adamrher - just to echo what @lkemmons said, I would (unfortunately) expect this bug to have some impact on any simulation with a non-30-minute time step, but I don't have a sense of how large the impact will be.

dlawrenncar (Contributor) commented Jul 11, 2022 via email

billsacks added a commit to billsacks/ctsm that referenced this issue Jul 11, 2022
This doesn't need to be on the restart file, and having it read from the
restart file causes wrong behavior when changing the model time step.

See ESCOMP#1789 for details

Resolves ESCOMP#1789
billsacks (Member) commented:

I have done some more investigation of this. Based on a quick search through the code (i.e., I may have missed something), it looks like, in an SP (i.e., non-BGC) case, the only accumulation field that impacts aspects of the model other than VOC emissions is T10 - the 10-day running mean air temperature - which appears in a few places in the photosynthesis calculations. In a run whose time step isn't too different from 30 minutes – e.g., a 15-minute time step – I wouldn't expect this bug to make much difference in an SP case. But in a run with a much shorter time step, the differences would become larger. I don't have a feeling for how important this T10 variable is in the photosynthesis calculation, but I do see that it has some impact on the evolution of the model.

More accumulation fields come into play in BGC simulations, e.g., impacting various phenology calculations.

billsacks added a commit that referenced this issue Jul 13, 2022
Fix accumulation variables when changing model time step

Accumulation variables (e.g., 1-day or 10-day averages) were writing and
reading their accumulation period (expressed in time steps) to the
restart file. This caused incorrect behavior when changing the model
time step relative to what was used to create the initial conditions
file (typically a 30-minute time step). So, for example, if you are
using a 15-minute time step with an initial conditions file that
originated from a run with a 30-minute time step (at some point in its
history), then an average that was supposed to be 10-day instead becomes
5-day; an average that was supposed to be 1-day becomes 12-hour, etc.
(The issue is that the number of time steps in the averaging period was
staying fixed rather than the actual amount of time staying fixed.)

For our out-of-the-box initial conditions files, this only impacts runs
that use something other than a 30-minute time step. Typically this
situation arises in configurations with an active atmospheric model that
is running at higher resolution than approximately 1 degree. It appears
that the biggest impacts are on VOC emissions and in BGC runs; we expect
the impact to be small (but still non-zero) in prescribed phenology (SP)
runs that don't use VOC emissions.

This tag fixes this issue by no longer writing or reading accumulation
variables' PERIOD to / from the restart file: this isn't actually needed
on the restart file.

See some discussion in #1789 for
more details, and see
#1802 (comment) for
some discussion of outstanding weirdness that can result for
accumulation variables when changing the model time step. The summary of
that comment is: There could be some weirdness at the start of a run,
but at least for a startup or hybrid run, that weirdness should work
itself out within about the first averaging period. A branch or restart
run could have some longer-term potential weirdness, so for now I think
we should recommend that people NOT change the time step on a branch or
restart run. With (significant?) additional work, we could probably
avoid this additional weirdness, but my feeling is that it isn't worth
the effort right now. In any case, I feel like my proposed fix will
bring things much closer to being correct than they currently are when
changing the time step.

Resolves #1789 (A bug in calculating accumulated fields
(24/240 hours averaged) when using a smaller timestep)
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this issue Jul 14, 2022
Fix accumulation variables when changing model time step
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this issue Jul 14, 2022
Fix accumulation variables when changing model time step
ekluzek added a commit to ekluzek/CTSM that referenced this issue Jul 14, 2022
Fix accumulation variables when changing model time step
ekluzek added a commit to ekluzek/CTSM that referenced this issue Jul 14, 2022
Fix accumulation variables when changing model time step
glemieux added a commit to glemieux/ctsm that referenced this issue Jul 15, 2022
Fix accumulation variables when changing model time step
ManYue07 commented:

> From looking through the code, I think I see what's happening here: the PERIOD of accumulated fields is read from the initial conditions file, but I don't think it needs to be, and doing so is problematic if the run you're doing uses a different time step than the time step used in creating the initial conditions file originally. [...]

Hi Bill (@billsacks), I have a quick question about this bug. Can it be interpreted as a time step of less than 30 minutes resulting in inconsistent time steps between CLM and CAM, which then affects the simulation results? If so, how should the impact of this inconsistency be understood?

billsacks (Member) commented:

> Can this bug be interpreted as a time step of less than 30 minutes resulting in inconsistent time steps between CLM and CAM?

Not exactly. The issue is more subtle: CTSM has a number of accumulation fields that accumulate averages over some period. These accumulation fields weren't properly handling a change in time step (relative to what was used to generate the initial conditions file). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions.
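
For intuition on the direction of the fix (an illustrative sketch only, not the actual CTSM change): the period in time steps should be derived from the run's own dtime at initialization, so the averaging window stays fixed in wall-clock time rather than in step count:

program period_fix_demo
  implicit none
  integer, parameter :: secs_per_day = 86400
  integer :: dtime(2), period_steps, i

  dtime = [1800, 900]   ! a 30-minute and a 15-minute time step
  do i = 1, 2
     ! Deriving the period from the current dtime keeps the window at 10 days
     ! regardless of which time step the run uses.
     period_steps = 10 * secs_per_day / dtime(i)
     print *, 'dtime =', dtime(i), 's ->', period_steps, 'steps per 10-day window'
  end do
end program period_fix_demo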

ManYue07 commented:
> Not exactly. The issue is more subtle: CTSM has a number of accumulation fields that accumulate averages over some period. These accumulation fields weren't properly handling a change in time step (relative to what was used to generate the initial conditions file). [...]

Thank you so much for the explanation, it helped a lot.

samsrabin added the "science (Enhancement to or bug impacting science)" label Aug 8, 2024
ekluzek moved this to Done (non release/external) in CTSM: Upcoming tags Aug 21, 2024