Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with cstack error in inlabru #261

Open
tildagasslander opened this issue Feb 12, 2025 · 18 comments
Open

Problem with cstack error in inlabru #261

tildagasslander opened this issue Feb 12, 2025 · 18 comments

Comments

@tildagasslander
Copy link

Hi!

This problem concerns our project in which we are trying to model and predict wildfires using inlabru. Right now we are testing our code on a smaller test set, using only population density as covariate. We are currently attempting on modeling wildfires as an ar1 process in time.

I have encountered a problem where I don't seem to get past the problem with cstack error. I am trying to to a joint model fitting like so:

cmp_ar1_joint <-
  ~ -1 + ## remove intercept as each likelihood has its own intercept
  point_field(geometry, ## keyword to refer to coordinates of SpatialPointsDataFrame
              model = matern) + ## refer back to SPDE defined above
  mark_field(geometry, ## keyword to refer to coordinates of SpatialPointsDataFrame
             group = month, ## group field by temporal index (named ti in data column) 
             group_mapper = bru_mapper_index(12), ## there are 11 levels in our temporal index 
             model = matern, ## refer back to SPDE defined above
             control.group = list(model="ar1", hyper=prior.rho)) + ## AR1 temporal structure with prior on rho
  Inter_point(1) + Inter_mark(1) + ## Intercepts for each likelihood
  scaling(main = 1, model = "linear", mean.linear = 0, prec.linear = 1, n=1, values=1) + ## scaling parameter beta with priors 
  +density ## covariates (column names) 

lik1_ar1 <- bru_obs("cp", ## cox process likelihood
                    formula = geometry ~ point_field + Inter_point, ## model formula for this likelihood
                    include = c("point_field","Inter_point"), ## which model components to include
                    data = data, ## fitting to 'point' dataset
                    domain = list(geometry=mesh), ## spatial domain for area of interest
                    samplers = boundary ## where did sampling take place?
) 
lik2_ar1 <- bru_obs("poisson", ## poisson likelihood
                    formula = mark ~ Inter_mark + point_field*qexppnorm(scaling_latent, rate = prior_rate) + mark_field +
                      density, ## model formula for this likelihood 
                    include = c("Inter_mark","point_field","mark_field","density", "scaling"), 
                    data = data) ## fitting to 'mark' dataset

fit_ar1_joint <- bru(cmp_ar1_joint, lik1_ar1, lik2_ar1) ## bring together model components & likelihoods and fit model

When trying to run the last row we get this error message:

Error: C stack usage 20832033 is too close to the limit

We are using the fm_mesh_2d_inla for creating the mesh and we have tried to install the latest version like so:

pak::pkg_install("inlabru-org/fmesher")

As we get a cstack issue and thus do not get any output (fit_ar1_joint), we have not been able to use the bru_timings() command.

The data we use is an sf object with 11 values for 1000 observations. The lik1_ar1 is 12.4 mb and the lik2_ar1 is 0.5 mb. We are wondering what might be causing the cstack error.

@finnlindgren
Copy link
Collaborator

I thought I had fixed the cstack issue in the fmesher devel version, but clearly not. Since it happens as a part of the reporting of an actual error, I think the key here is to figure out the real error. Please run with verbose options:

fit_ar1_joint <- bru(cmp_ar1_joint, lik1_ar1, lik2_ar1, options = list(verbose = TRUE, bru_verbose = 4))

You can safely remove the include = ... arguments; they are no longer needed, as inlabru automatically detects which components are needed/used.
In case the issue is with numerics in qexppnorm, use the marginal feature instead, which has more stable numerics than a plain qexp(pnorm()) call:

cmp_ar1_joint <-
  ~ ... +
  scaling(1, model = "linear", mean.linear = 0, prec.linear = 1,
       marginal = bru_mapper_marginal(qexp, pexp, dexp, rate = prior_rate) + ## scaling parameter beta with priors 
  +density ## covariates (column names) 

lik2_ar1 <- bru_obs("poisson", ## poisson likelihood
                    formula = mark ~ Inter_mark + point_field * scaling + mark_field + ...

@tildagasslander
Copy link
Author

Hi Finn!

Thank you so much for helping us. We tried adjusting our code according to your recommendations. Once again we got cstack errors but this time we got some additional information as well:

iinla: Evaluate component inputs
iinla: Evaluate component linearisations
Linearise component 'point_field'
Linearise component 'Inter_point'
Linearise component 'point_field'
Linearise component 'mark_field'
Linearise component 'Inter_mark'
Linearise component 'scaling'
Linearise component 'density'
iinla: Evaluate component simplifications
Simplify component 'point_field'
Simplify component 'Inter_point'
Simplify component 'point_field'
Simplify component 'mark_field'
Simplify component 'Inter_mark'
Simplify component 'scaling'
Simplify component 'density'
iinla: Evaluate predictor linearisation
Linearise with respect to component 'point_field'
Linearise with respect to component 'Inter_point'
Linearise with respect to component 'point_field'
Linearise with respect to component 'mark_field'
Linearise with respect to component 'Inter_mark'
Linearise with respect to component 'scaling'
Linearise with respect to component 'density'
iinla: Construct inla stack
iinla: Model initialisation completed
iinla: Iteration 1 [max:10]

 *** inla.core.safe:  The inla program failed, but will rerun in case better initial values may help. try=1/1 
Error: C stack usage  20839446 is too close to the limit

We find it a little hard to understand, but maybe you can help us?

/Tilda and Hanna

@finnlindgren
Copy link
Collaborator

finnlindgren commented Feb 12, 2025

OK, that shows that inlabru completed the model initialisation, but the inla called failed without producing any output at all. Have you been able to run any (simpler) models, to make sure inla can run at all?

For example,

fit <- inla(y~1+x,data=data.frame(y=rnorm(10),x=1:10))

and

fit <- bru(y~1+x,data=data.frame(y=rnorm(10),x=1:10))

@tildagasslander
Copy link
Author

We tried to make a simpler model using that code, and we now got a new error message

> fit <- bru(y~1+x,data=data.frame(y=rnorm(10),x=1:10))

 *** inla.core.safe:  The inla program failed, but will rerun in case better initial values may help. try=1/1 
Warning in iinla(model = info[["model"]], lhoods = info[["lhoods"]], options = info[["options"]]) :
  iinla: Problem in inla: Error in inla.core.safe(formula = formula, family = family, contrasts = contrasts,  : 
  The inla program call crashed.
The inla program failed and the maximum number of tries has been reached.
iinla: Problem in inla: 1: bru(y ~ 1 + x, data = data.frame(y = rnorm(10), x = 1:10))
2: iinla(model = info[["model"]], lhoods = info[["lhoods"]], options = info[[ [...]
3: fm_try_callstack(...)
4: do.call(INLA::inla, inla.options.merged, envir = environment(model$effects))
5: (function (formula = NULL, family = "gaussian", contrasts = NULL, 
       data = NULL, quantiles = c(0.025, 0.5, 0.975), E = NULL, 
       offset = NULL, scale = NULL, weights = NULL, Ntrials = NULL, 
       strata = NULL, lp.scale = NULL, link.covariates = NULL, verbose = inla [...]
       lincomb = NULL, selection = NULL, control.compute = list(), 
       control.predictor = list(), control.family = list(), control.inla = list(), 
       control.fixed = list(), control.mode = list(), control.exper [... truncated]
iinla: Giving up and returning last successfully obtained result for diagnostic purposes.

It seems that indeed inla does not work - we didn't realize this could be the case, what do you think is the problem? We have tried the code on different computers and got the same cstack error. We have installed inla by using

install.packages("INLA",repos=c(getOption("repos"),INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)

Thanks for your help!

@finnlindgren
Copy link
Collaborator

Install from "testing", not "stable", then try the inla() basic call again. Also please provide your sessionInfo() output.

@tildagasslander
Copy link
Author

This is the output we get from sessionInfo() after installing from testing instead of stable and trying the inla() basic call. It gave the same error as earlier. Maybe you can spot the bug?

sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default


locale:
[1] LC_COLLATE=Swedish_Sweden.utf8  LC_CTYPE=Swedish_Sweden.utf8    LC_MONETARY=Swedish_Sweden.utf8
[4] LC_NUMERIC=C                    LC_TIME=Swedish_Sweden.utf8    

time zone: Europe/Stockholm
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sp_2.2-0       dplyr_1.1.4    sf_1.0-19      inlabru_2.12.0 fmesher_0.2.0  INLA_25.02.10  Matrix_1.7-0  

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5        cli_3.6.3          rlang_1.1.4        DBI_1.2.3          KernSmooth_2.23-24 MatrixModels_0.5-3
 [7] generics_0.1.3     glue_1.7.0         plyr_1.8.9         e1071_1.7-16       pak_0.8.0.1        fansi_1.0.6       
[13] grid_4.4.1         tibble_3.2.1       classInt_0.4-11    lifecycle_1.0.4    compiler_4.4.1     pkgconfig_2.0.3   
[19] Rcpp_1.0.13        rstudioapi_0.16.0  lattice_0.22-6     R6_2.5.1           tidyselect_1.2.1   utf8_1.2.4        
[25] class_7.3-22       parallel_4.4.1     pillar_1.9.0       splines_4.4.1      magrittr_2.0.3     tools_4.4.1       
[31] proxy_0.4-27       withr_3.0.2        units_0.8-5

@finnlindgren
Copy link
Collaborator

I don't think this is the issue, but can you upgrade R to 4.4.2, just in case?

@finnlindgren
Copy link
Collaborator

It appears you still have the CRAN version of fmesher, and not the development version? What does packageVersion("fmesher") say? Although this should have no effect on the basic inla() call, it will have an effect on the error reporting issue in inlabru.

@finnlindgren
Copy link
Collaborator

Can you also show the complete information output you get when running the install.packages() command for INLA?

@tildagasslander
Copy link
Author

Sorry for the confusion, we just started using R three weeks ago, and wanted to use the school computers for extra power but apparently this meant a number of new problems with version updates we are not authorized for....
We now tried on our laptops with the fmesher version 0.2.0.90.// and now it seems to work better (but we suspect we will run out of memory before the code has finished running)- the basic command at least works fine.
You have been really helpful, thanks again!

@finnlindgren
Copy link
Collaborator

Is it some executable code/virus checker issue perhaps? In general, Linux tends to work better than Windows (the windows memory management has some strange behaviours).

@finnlindgren
Copy link
Collaborator

R 4.4.1 should be ok though. But if the system restricts what can run too much, then inla can't run.

@hdovestone
Copy link

Hello Finn!

Now that we have run the code again with proper package versions it starts spinning on the following errror/warning code

Function: GMRFLib_factorise_sparse_matrix_TAUCS(), Line: 932, Thread: 1
	Failed to factorize Q. I will try to fix it...


	Function: GMRFLib_facto *** WARNING *** GMRFLib_2order_approx: rescue NAN/INF values in logl for idx=1212
 *** WARNING *** GMRFLib_2order_approx: reset counter for 2 NAN/INF values in logl
 *** WARNING *** GMRFLib_2order_approx: rescue NAN/INF values in logl for idx=1212

How should we interpret this?

@finnlindgren
Copy link
Collaborator

Please provide more log context if possible; what happens before the first warning can contain useful information.
For example, if it happens immediately, or after some optimisation iterations.
Sometimes setting initial values for the latent variables helps (in particular the scaling component).

@hdovestone
Copy link

bru_options_set(control.compute = list(dic = TRUE, waic = TRUE, cpo=TRUE)) ## compute model assessment scores during model fitting
qexppnorm <- function(x, rate) {
  qexp(pnorm(x, lower.tail = FALSE, log.p = TRUE), rate = rate, log.p = TRUE, lower.tail = FALSE)
} ## Prior for beta
prior_rate <- 1 ## prior for beta
prior.rho = list(theta = list(prior='pccor1', param = c(0, 0.9))) ## prior for rho
matern <- inla.spde2.pcmatern(mesh,
                              prior.range = c(0.2, 0.01),
                              prior.sigma = c(2, 0.01)) ## SPDE priors
## (note: we use the same priors for both fields, but these could easily be specified separately)

## Marked Point Process AR1 Model
cmp_ar1_joint <-
  ~ -1 + ## remove intercept as each likelihood has its own intercept
  point_field(geometry, ## keyword to refer to coordinates of SpatialPointsDataFrame
              model = matern) + ## refer back to SPDE defined above
  mark_field(geometry, ## keyword to refer to coordinates of SpatialPointsDataFrame
             group = month, ## group field by temporal index (named month in data column) 
             group_mapper = bru_mapper_index(12), ## there are 11 levels in our temporal index 
             model = matern, ## refer back to SPDE defined above
             control.group = list(model="ar1", hyper=prior.rho)) + ## AR1 temporal structure with prior on rho
  Inter_point(1) + Inter_mark(1) + ## Intercepts for each likelihood
  scaling(1, model = "linear", mean.linear = 0, prec.linear = 1,
          marginal = bru_mapper_marginal(qexp, pexp, dexp, rate = prior_rate)) + #scaling parameter with beta
  +density
lik1_ar1 <- bru_obs("cp", ## cox process likelihood
                    formula = geometry ~ point_field + Inter_point, ## model formula for this likelihood
                    data = data, ## fitting to 'point' dataset
                    domain = list(geometry=mesh), ## spatial domain for area of interest
                    samplers = boundary ## where did sampling take place?
) #for debug
## (note: here we just use the boundary for samplers but this can also be used for subplots or transects)
lik2_ar1 <- bru_obs("poisson", ## poisson likelihood
                    formula = mark ~ Inter_mark + point_field*scaling + mark_field +
                    density, ## model formula for this likelihood 
                    data = data) ## fitting to 'mark' dataset
fit_ar1_joint <- bru(cmp_ar1_joint, lik1_ar1, lik2_ar1, options = list(verbose = TRUE, bru_verbose = 4)) ## bring together model components & likelihoods and fit model

This is the full code with the priors (except now we dont use qexppnorm anymore as per your suggestion) and below is some of the output when running the last line (there is a lot of output) - it seems to initiate right and then go into the first optimization run but then stops.

iinla: Model initialisation completed
iinla: Iteration 1 [max:10]
\...\
Compute initial values...
   Iter[0] RMS(err) = 1.000, update with step-size = 0.442
   Iter[1] RMS(err) = 0.803, update with step-size = 0.614
   Iter[2] RMS(err) = 0.932, update with step-size = 0.490
   Iter[3] RMS(err) = 0.975, update with step-size = 0.677
   Initial values computed in 0.0171 seconds
   	x[0] = 0.0699
   	x[1] = 0.0699
   	x[2] = 0.0699
   	x[3] = 0.0698
   	x[4] = 0.0699
   	x[31030] = 0.0000
   	x[31031] = -0.2690
   	x[31032] = 1.1952
   	x[31033] = 0.0000
   	x[31034] = 0.0000

Optimise using DEFAULT METHOD
Smart optimise part I: estimate gradient using forward differences

   Function: GMRFLib_factorise_sparse_matrix_TAUCS(), Line: 932, Thread: 0
   Failed to factorize Q. I will try to fix it...

@finnlindgren
Copy link
Collaborator

That does seem to be an initialisation issue. Does it work with just lik1_ar1? If it does, you can also check with just lik2_ar1, although that one will have extra difficulties in the absence of lik1_ar1 (the point field and scaling will be mutually non-identifiable).

Also, is density a variable in the data input? There seems to be a syntax error in your component definition with double "+"; there shouldn't be a + at the start of the density definition line as there is already one at the end of the previous line (which needs to be there for it not to terminate the expression).

@hdovestone
Copy link

Thanks for catching that syntax error!
No, running lik1_ar1 does not work, if i run the following

fit_ar1_joint <- bru(cmp_ar1_joint, lik1_ar1, options = list(verbose = TRUE, bru_verbose = 4))

it very quickly throws itself into the same error loop as above.

@finnlindgren
Copy link
Collaborator

OK, that at least makes it easier to debug/fix, as it's not related to the joint modelling!

Check your spatial field prior; if your CRS is in km, then 0.2 for the range is just 200 metres, which is very small on the scale I think you're modelling. What is your max.edge value? The prior.range value should be larger.
Also, the 2 for sigma is quite large when the field is applied on the log-intensity model; 0.5 or 1 is likely more reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants