floating invalid and atm and ice out of sync in IAF run #22
Comments
Not sure if it helps, but
so they're 1 day out of sync |
Here are the changes to the config files in 383b27b we did for the latest run which got us the error:
|
Restart 120 of The simplest thing to do is to keep the start date as 2200-01-01T00:00:00, which is what was done in If you copy Apologies if this misled you |
You may also need to copy |
Yes I can successfully run and complete a 9 month run by linking directly to the unmodified Now to step through and find the floating invalid cause... |
OK thanks for confirming. I think this issue can now be closed, but feel free to re-open if you see this problem again. |
Ok thanks Andrew. It would be nice to be able to shift the dates back for the new cycle - but this seems to mean that we can't? |
There would certainly be a way, but it would require careful setup along the lines of the 2nd method in |
Thanks, I also have a run now which worked until 2200-10-27. I am now looking into the floating invalid message I get. |
A quick update as we are still getting 'out of sync' errors: We are trying to run a 1-year simulation with 383b27d by using the restart 120 folder in
We also looked at the output for the day 2200/10/27 when the above run (first bullet point) gave an error and couldn't find anything out of the ordinary. Our reason for initially changing |
It is strange that you can get past the floating invalid with one option but not the other... To address the time syncing problem, I have just tried to get a run going with correct dates by following exactly Andrew's 2nd method in It fails on initialization no matter the length of run with a I think this is because in
(where I have changed the It seems thus that the only option is to "fake it" by shifting all the dates forward by a day and missing a day in the forcing. Does this sound right @nichannah @aekiss? Any help would be appreciated... |
Is there an If the former then we really need a standard way to restart a cycle cleanly. |
@rmholmes MOM/FMS will always fail at some point if you start runs after the 28th of the month and have an increment of 'months' You have to start your runs on |
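To illustrate why a start date after the 28th eventually breaks a monthly increment, here is a minimal Python sketch. It is not how MOM/FMS implements its time manager: `add_month_naive` and `months_until_failure` are hypothetical helpers that keep the day of the month fixed while stepping the month, using Python's proleptic Gregorian calendar.

```python
from datetime import date
from typing import Optional


def add_month_naive(d: date) -> date:
    """Advance by one calendar month, keeping the day of month unchanged.

    Raises ValueError when the target month has no such day (e.g. 31 Feb).
    """
    year = d.year + (d.month // 12)   # roll the year over after December
    month = d.month % 12 + 1
    return d.replace(year=year, month=month)


def months_until_failure(start: date, max_steps: int = 24) -> Optional[int]:
    """Return the first monthly step that fails, or None if none fails."""
    d = start
    for step in range(1, max_steps + 1):
        try:
            d = add_month_naive(d)
        except ValueError:
            return step
    return None


if __name__ == "__main__":
    for start in (date(2199, 12, 31), date(2200, 1, 1), date(2200, 1, 28)):
        print(start, months_until_failure(start))
```

Under these assumptions a run starting on 2199-12-31 survives the first step (to 2200-01-31) and fails on the second, while runs starting on the 1st or the 28th never hit an invalid date.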
Yes I figured. I tried intervals of years, months or seconds. It failed in every case. Our restart is dated 12-31 so the only option seems to be to fake it somehow. |
I think the best solution would be to modify the global attributes in the ice restart file to reflect the date that you wish to start from. I would rename the original attributes and store them in the modified file so that everything is tracked correctly. |
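The attribute edit suggested here could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe for CICE restarts: `retime_restart` and the `orig_` prefix are hypothetical, the attribute names `istep1`, `time` and `nyr` are the ones mentioned later in this thread, and the numbers in the demo are made up. The function only uses `ncattrs`, `getncattr`, `setncattr` and `renameAttribute`, which `netCDF4.Dataset` provides; `FakeRestart` is an in-memory stand-in so the sketch runs without a real file.

```python
class FakeRestart:
    """In-memory stand-in exposing the attribute methods used below."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)

    def ncattrs(self):
        return list(self._attrs)

    def getncattr(self, name):
        return self._attrs[name]

    def setncattr(self, name, value):
        self._attrs[name] = value

    def renameAttribute(self, old, new):
        self._attrs[new] = self._attrs.pop(old)


def retime_restart(ds, new_values, prefix="orig_"):
    """Overwrite global attributes, keeping each original under a prefixed name."""
    for name, value in new_values.items():
        if name in ds.ncattrs():
            backup = prefix + name
            if backup in ds.ncattrs():
                raise ValueError(f"{backup} already exists; refusing to overwrite it")
            ds.renameAttribute(name, backup)  # preserve the original value
        ds.setncattr(name, value)
    return ds


if __name__ == "__main__":
    # Made-up numbers purely for demonstration; take the real values from a
    # restart that already carries the date you want to start from.
    demo = FakeRestart(istep1=1000, time=5.0e6, nyr=121)
    retime_restart(demo, {"istep1": 0, "time": 0.0, "nyr": 1})
    print({k: demo.getncattr(k) for k in demo.ncattrs()})
```

With a real file, the same function could presumably be called on a dataset opened for modification, for example `netCDF4.Dataset(path, "r+")`, after making a backup copy of the restart.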
So if I "fake it", by changing in
(similarly for the date in the ocean file
Why the two years? I only shifted the dates by 1 day and I'm using Thanks @russfiedler . So do I need to change the |
Here is the code I mentioned in the MOM meeting that I wrote for Rishav, so he could edit his CICE restart files and change the dates https://gist.github.com/aidanheerdegen/203af6f6e0a87d1d82704eae9608f099 There is some description in the comments on how to use it. Is that useful? |
Thanks @aidanheerdegen, that looks like what I need, although it fails with:
I've overwritten the istep1, time and nyr attributes using matlab for the restarts, back to the 1960-01-01 values that are in |
Oh right, these are netCDF files so ignore me. The thing I wrote was for the binary CICE restart files. So as you have done edit the global attributes:
and you're laughing |
Ok I can get around the syncing problems and use proper dates (starting from 1960) if I change the ice restart netCDF file attributes as above and use @mauricehuguenin I would suggest going through as I have and trying to run for a year. The syncing problems should be solved, but the blow-up is probably still there. |
Hi @rmholmes, @aidanheerdegen glad you found a recipe that works, apologies if the tutorial was less than helpful. I've added a quick note to the tutorial linking to this discussion, but if you have a clearer way to present a reliable method let me know. Tutorial step 10 ( |
Thanks all for helping! I am now going through the year with 1-month simulations by using Ryan's modified restart files. |
Hi @aekiss the problem is that with I had a quick look through the cice5 code but couldn't figure it out. |
Hmm, interesting. libaccessom2/libcouple/src/accessom2.F90 Line 179 in 78b6a45
I presume the start time is supposed to be passed on to cice, to be used whenever use_restart_time=.false. , but for some reason this isn't working.
@nichannah can you shed any light on this? |
This looks like the same error as in your email on 5 July (it's the same line of the same source file). I don't know what version of MOM you are using (what is the first hash in the ocean (fms) exe name in config.yaml?) but if it's f8967b1 that line is @russfiedler emailed some suggestions back in July as to what might be going wrong there (assuming your version has a similar |
Apologies for not joining this conversation sooner, I've updated my notification settings. Firstly I tried to reproduce this problem using the information given above and @aekiss instructions under "Simple case: no date change" (See https://github.com/COSIMA/access-om2/wiki/Tutorials#starting-a-new-experiment-using-restarts-from-a-previous-experiment). I came across two instances of what appears to be COSIMA/access-om2#149. As @aekiss mentioned the dates in the restart (/g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_iaf_gmredi6/restart120/accessom2_restart.nml) do not look right:
So after copying the restart120 and output120 dirs into my new archive I modified this to be:
I moved the forcing date forward rather than changing the experiment date because it looked like the other models also thought the date should be 2200-01-01T00:00:00. e.g. in /g/data/hh5/tmp/cosima/access-om2-025/025deg_jra55v13_iaf_gmredi6/restart120/ocean/ocean_solo.res
Then, during runtime, I hit the same bug again (COSIMA/access-om2#149) because 1960 is a leap year but 2200 is not. I then tried a run using the latest yatm (I have copied it to /short/public/access-om2/bin/yatm_7cfdd5dc.exe) which has this bug fixed and this appears to be working. Note that the fix is only about 2 weeks old so is younger than this issue. More comments to come. |
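The leap-year mismatch behind this failure can be checked quickly in Python. This is only a sketch assuming both the forcing and experiment dates follow the Gregorian leap-year rule; `days_in_year` and `has_date` are hypothetical helper names.

```python
import calendar
from datetime import date


def days_in_year(year: int) -> int:
    """Number of days in a Gregorian year."""
    return 366 if calendar.isleap(year) else 365


def has_date(year: int, month: int, day: int) -> bool:
    """True if the given calendar date exists."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Forcing year 1960 versus experiment year 2200, as in this thread.
    for year in (1960, 2200):
        print(year, calendar.isleap(year), days_in_year(year), has_date(year, 2, 29))
```

The forcing year 1960 has a 29 February and 366 days, whereas the experiment year 2200 does not, so the two calendars drift apart by a day unless the coupler handles that day explicitly.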
I also tried using the "More complicated cases" instructions to restart at 1960-01-01. These instructions look OK but I ran into the same problem as @rmholmes mentioned yesterday with the CICE starting on 1958 when it should be on 1960. It looks like CICE is pulling out the forcing date from libaccessom2 rather than the experiment/model date. It is possible that I have created an issue for this COSIMA/cice5#38. I think the problem has been fixed and a new CICE executable can be found at: /short/public/access-om2/bin/cice_auscom_1440x1080_480p_ab473434_libaccessom2_7cfdd5dc.exe Using this new executable Andrew's instructions for modifying the restart date appear to be working. |
The executables I used thus far are from June:
I now switched to the latest versions from August in
|
With the latest executables, namely:
I can successfully run a full year starting with WOA initial conditions. I have a small issue with collating the archive files but I think this is a minor issue from my side. Running a simulation with the same executables and Ryan's modified restart from #22 (comment) I unfortunately still get the out of sync error despite the new CICE version:
In both cases, I am running with I copied the error logs over to
Would you recommend going back to the older executables from #22 (comment)? |
@mauricehuguenin, @rmholmes - Now that COSIMA/access-om2#159 has been resolved, can this issue be closed, or are there some remaining problems? |
The floating invalid error has been fixed. I think @mauricehuguenin still had some syncing problems with Nic's new code when |
We would like to be able to do a restart by simply setting |
Please feel free to re-open if there are still problems with this. |
With this config: https://github.com/COSIMA/1deg_jra55_iaf.git
If I run for 1 year and then restart but run for only 3 months, I get this error:
Ran at this location:
This suggests this is not a solved problem. |
sounds like we should revisit this fix: |
ping @nichannah |
Thanks @aidanheerdegen, will take a look. |
Fix to repeat forcing day when exp is on leap day. #22
Fixed. This was a bug that occurred when the experiment date was a leap day and the forcing date was not. In that case we repeat the forcing for the whole day, but the code was not skipping the 'out of sync' check properly during this day. |
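The condition described in this fix could be expressed roughly as below. This is a hedged sketch of the idea only, not the libaccessom2 code: `forcing_repeats_today` is a hypothetical name, and it assumes the Gregorian leap-year rule for the forcing year.

```python
import calendar
from datetime import date


def forcing_repeats_today(exp_date: date, forcing_year: int) -> bool:
    """True when the experiment is on 29 Feb but the forcing year has no 29 Feb.

    On such a day the forcing is repeated, so a strict 'out of sync'
    comparison between the two clocks would need to be skipped.
    """
    on_leap_day = (exp_date.month, exp_date.day) == (2, 29)
    return on_leap_day and not calendar.isleap(forcing_year)


if __name__ == "__main__":
    # Example years chosen only for illustration.
    print(forcing_repeats_today(date(2000, 2, 29), 1999))
    print(forcing_repeats_today(date(2000, 2, 29), 1960))
    print(forcing_repeats_today(date(2000, 2, 28), 1999))
```

Only the first case, an experiment leap day paired with a non-leap forcing year, would qualify for skipping the check under this sketch.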
This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there: https://forum.access-hive.org.au/t/access-om2-control-runs/258/4 |
Maurice and Ryan have been getting an "atm and ice models out of sync" error in 0.25 deg IAF runs using COSIMA/025deg_jra55_iaf@383b27b
This config uses the latest libaccessom2 (b6caeab) but I don't know of any IAF runs that used anything newer than e8ad372 (from Aug 31, 2018) and there are many differences between them: e8ad372...b6caeab