Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create setting to fit models with 0 or 100% encounters in any given year #386

Open
Lewis-Barnett-NOAA opened this issue Nov 26, 2024 · 7 comments

Comments

@Lewis-Barnett-NOAA
Copy link
Collaborator

In these cases, the probability of occurrence can go to -inf or inf.

In VAST, these cases were dealt with by specifying a different ObsModel that would deal with this by fixing the encounter probability to 1 or 0 for any year with 0 or 100% encounters, by specifying:

ObsModel_ez[e,2]=4

or

ObsModel = c(2,4)

Is there a way we can easily add this as an option in sdmTMB? Would it make sense to make this a new family? Or should we just add an example of how to hack TMB to do this?

@seananderson
Copy link
Member

Somewhere (that I've now lost track of) Jim explained that in VAST this maps off those coefficients in TMB and then sets the output in the index to 0 where needed to match a design-based estimator. This makes more sense as a default given VAST was set up primarily for such a use case. I don't think I'd want another set of delta families or that would balloon in number. However, perhaps we could hide such an option in the control list. It's just an issue of setting up the map list appropriately and a bit of book-keeping for output. I think this would mess up the absolute log likelihood if comparing to other distributions that have no such problem (like the Tweedie), but so it is.

@seananderson
Copy link
Member

I worked through some examples of 0% encounters and 100% encounters in these files:
https://github.com/pbs-assess/sdmTMB/blob/allzeros/scratch/all-zeros.R
https://github.com/pbs-assess/sdmTMB/blob/allzeros/scratch/all-ones.R

Until something canned gets baked into the package you can always follow these examples. It's not obvious to me if there's a way to fix parameters in the case of 100% encounters for the Poisson link because the first linear predictor (log numbers) affects both encounter probability and positive catch rate. Maybe it's fine to let it struggle to estimate on its own.

I picked 20 as the fixed value here, but I think any sufficiently large value should do. plogis(20), plogis(-20), and exp(-20) are pretty close to 1, 0, and 0.

@Lewis-Barnett-NOAA
Copy link
Collaborator Author

Thanks Sean! It is kind of an annoying edge case

@Lewis-Barnett-NOAA
Copy link
Collaborator Author

Reopening to get @James-Thorson to weigh in on the 100% positives case to see if he has a trick. Unfortunately we do run into that situation for Bering cod where we also use the poisson-link model currently.

@seananderson
Copy link
Member

@Lewis-Barnett-NOAA @James-Thorson Regarding the Poisson-link, I just added an example at the bottom of the 'all-ones.R' script: https://github.com/pbs-assess/sdmTMB/blob/allzeros/scratch/all-ones.R

I think you similarly want to map off the first linear predictor at a large 'n' for that year since the large n implies a low probability $p$ of observing a zero. The only 'catch' is the value you pick affects interpretation of the matching 2nd linear predictor coefficient since both are combined to create the positive rate. As $n$ goes up, $w$ must go down to achieve the same $r$. But for sufficiently large n, the predictions and index will be effectively identical.

$$ \begin{aligned} \log (n) &= \boldsymbol{X_1} \boldsymbol{\beta_1} + \ldots,\\ \log (w) &= \boldsymbol{X_2} \boldsymbol{\beta_2} + \ldots, \end{aligned} $$

$$ \begin{aligned} p &= 1 - \exp(-n),\\ r &= \frac{n w}{p}, \end{aligned} $$

For now, we can leave this open until some internal handling of this is added.

@Lewis-Barnett-NOAA
Copy link
Collaborator Author

Lewis-Barnett-NOAA commented Dec 4, 2024 via email

@ericward-noaa
Copy link
Collaborator

I've been dealing with this for another index issue -- it's maybe not exactly the same as the pcod case, but wanted to share anyway. Basically, we're working on an index for yellowtail on the west coast. The stock assessment is split in 2, and we're trying to use all data to generate an index for the north region. The issue is that data is sparse in the south, and there's one year with no encounters. As a result, a model that includes 'fyear * region' doesn't converge -- so I re-coded the interactions to turn off the estimate in the year with no encounters (2007). Repo is here:

https://github.com/ericward-noaa/yellowtail-index

and place this is dealt with is here:
https://github.com/ericward-noaa/yellowtail-index/blob/4303806524575e56ee3e0fec2ef9218b5a2896be/example_yt2.Rmd#L93

  • pulling in the data here is a bit ugly because I was pulling that into the indexwc package -- I'll clean this up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants