ALM crashing with "Forcing height is below canopy height for pft index" #1259

Open
bishtgautam opened this issue Feb 8, 2017 · 7 comments
@bishtgautam
Contributor

This bug was first reported by email by @polunma and has recently been reported again by @PeterCaldwell.

In cesm.log:
---------------------------------
4759:  ccm kohlerc - no real(r8) solution found (quartic)
4759:  roots = (-4.460734210126336E-003,4.689603410910435E-003)
4759:  (-3.527252795449640E-002,3.677168593260870E-004)
4759:  (8.007924511427858E-003,-1.027471514914559E-004)
4759:  (-4.498022041789295E-003,-4.954573118745066E-003)
4759:  p0-p3 = -1.223533480500435E-008 -1.351080467777005E-006  0.000000000000000E+000
4759:   3.622335969498418E-002
4759:  rh=  0.967215515496280    
4759:  setting radius to dry radius=  6.964271759508640E-003
4880:  Error: Forcing height is below canopy height for pft index
4878:  BalanceCheck: soil balance error (mm)
4878:  nstep         =      3455688
4878:  errsoi_col    =  -1.162823516679623E-003
4878:  clm model is stopping
4880:  calling getglobalwrite with decomp_index=        68654  and clmlevel= pft
4880:  local  pft      index =        68654
4880:  global pft      index =       320504
4880:  global column   index =       160825
4880:  global landunit index =        41893
4880:  global gridcell index =         9980
0001:  pionfput_mod.F90         128           1        2125          64           1
0001:  0198-03-30_12:00:00
4880:  gridcell longitude    =    272.170820393250    
4880:  gridcell latitude     =    41.9795465246935    
4880:  pft      type         =           15
4880:  column   type         =            1
4880:  landunit type         =            1
4880:  ENDRUN:
4880:  ERROR in CanopyFluxesMod.F90 at line 706      
@bishtgautam
Contributor Author

Background on the relevant code in ALM:

  • The error occurs when ALM solves for canopy-level fluxes.
  • The atmosphere model passes its lowest-level boundary conditions (BCs) to the land model. These BCs are used to solve for canopy-level latent and sensible heat fluxes in CanopyFluxesMod.F90 via Monin-Obukhov theory.
  • A Newton-Raphson (NR) iteration is used to solve the nonlinear equation for surface fluxes, and the error is triggered in the NR solve.
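
For readers unfamiliar with the NR step mentioned above, here is a minimal, generic sketch in Python. It is not the ALM code: the function name `newton_raphson`, the tolerance, the iteration cap, and the toy equation are all assumptions chosen for illustration. The point it shows is that an NR solve can fail to converge, so it helps to return a convergence flag instead of silently using the last iterate.

```python
import math
from typing import Callable, Tuple


def newton_raphson(
    f: Callable[[float], float],
    dfdx: Callable[[float], float],
    x0: float,
    tol: float = 1e-8,
    max_iter: int = 40,
) -> Tuple[float, int, bool]:
    """Generic Newton-Raphson solve of f(x) = 0 (hypothetical helper, not ALM code).

    Returns (last iterate, iterations used, converged flag).
    """
    x = x0
    for i in range(1, max_iter + 1):
        slope = dfdx(x)
        if slope == 0.0:
            # A zero derivative makes the Newton step undefined: report failure.
            return x, i, False
        step = f(x) / slope
        x -= step
        if abs(step) < tol:
            return x, i, True
    # Iteration cap reached without meeting the tolerance.
    return x, max_iter, False


# Toy nonlinear equation (illustrative only): x**3 - 2*x - 5 = 0
root, n_iter, converged = newton_raphson(
    lambda x: x**3 - 2.0 * x - 5.0,
    lambda x: 3.0 * x**2 - 2.0,
    x0=2.0,
)

# A poor starting point can make the same method diverge: atan(x) = 0 from x0 = 2
_, _, converged_bad = newton_raphson(
    math.atan,
    lambda x: 1.0 / (1.0 + x * x),
    x0=2.0,
)
```

In the toy case the iteration converges to a root near 2.0946, while the second call reports `converged_bad = False`. Whether the ALM solve has a comparable failure path is exactly what would need to be checked in the Fortran source.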

A few possible explanations for the error:

  • The NR iteration is incorrectly implemented in the land model.
  • There is a large jump between two time steps in the BCs that the atmosphere model sends to the land model.
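
One way to probe the second explanation would be to dump the forcing fields at the failing gridcell for the steps before the crash and flag unusually large step-to-step changes. The sketch below is only an illustration of that idea: `flag_large_jumps`, the threshold, and the sample values are all made up, and nothing here reads actual model output.

```python
from typing import List, Sequence


def flag_large_jumps(values: Sequence[float], max_abs_change: float) -> List[int]:
    """Return each index i where |values[i] - values[i-1]| exceeds max_abs_change.

    Hypothetical diagnostic helper; the threshold must be chosen per variable.
    """
    return [
        i
        for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > max_abs_change
    ]


# Made-up series of a lowest-level atmospheric temperature (K) at one gridcell
temps = [287.1, 287.3, 287.2, 301.9, 287.4]
jumps = flag_large_jumps(temps, max_abs_change=5.0)
print(jumps)  # [3, 4]: the spike at index 3 and the return to normal at index 4
```

If no such jump shows up in the real forcing near the failure, that would point back toward the NR solve itself.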

@bishtgautam
Contributor Author

@PeterCaldwell : How can I reproduce this error?

@PeterCaldwell
Contributor

Hi Gautam - I was afraid you would ask this question. If you continue the beta0 coupled simulation from 0196-01-01 you should be able to reproduce the crash. We can probably give you a run script for this (because changing anything at roundoff level will make the problem go away). My inclination, however, is to wait until the crash happens again (once we do our next round of coupled simulations) and start debugging from there. The question I was trying to ask today was just whether you'd seen this sort of thing before and whether you had any ideas about simple ways to fix this... It sounds like the answer is "no", which I think is fine until next time this comes up.

@bishtgautam
Contributor Author

whether you had any ideas about simple ways to fix this

I'm not aware of a simple way to fix this issue. The fix would require figuring out why the code is crashing, which will be nontrivial since reproducing this bug requires running a coupled compset.

I agree that we should wait till the code crashes again with this error.

@PeterCaldwell
Contributor

Ok, sounds good. I'll make sure to report here next time we encounter this kind of crash. I'll also give a restart file which reproduces the crash after a simulated day of run time.

@PeterCaldwell
Contributor

We now have a case exhibiting this behavior and @jonbob has been looking at it. Any progress on this, Jon?

@jonbob
Contributor

jonbob commented Mar 27, 2017

I've got a case set up and now have restart files from just 6-7 hours before the failure -- which I can replicate consistently. My failures have so far given up even less information than Chris' -- so I'm compiling a debug version now and hope to have more information soon.

agsalin pushed a commit that referenced this issue Apr 13, 2017
Index in namelist
Adds support for namelist entries of the form foo(3) = 'a'

Test suite: scripts_regression_tests.py, --namelist-only tests for cesm prealpha and aux_clm45
Note there are some expected failures here due to bugs found in the process. One bug was that
for multi-instance cases all namelists were using the 0001 stream file and they each should have
a separate file. Another bug was that only changes in user_nl_d***_0001 were being used in all
instances (user_nl_d***_000n where n>1 was being ignored)
Test baseline: cesm2_0_alpha06g
Test namelist changes:
Test status: bit for bit

Fixes #1248
Fixes #1274

User interface changes?:

Code review: @jgfouca @mvertens @gold2718
4 participants