Many false absences come up in new filter #392

Closed
gbif-portal opened this issue Sep 11, 2020 · 11 comments

@gbif-portal

Many false absences come up in new filter

It's great to be able to filter for presence or absence records, but currently the absence filter is picking up many records that clearly do not relate to absence, for example specimens or observations with images. It looks like this is due to inferring absence in cases where 'individual count' is wrongly set at 0 in the original record. Is there some way to avoid this, e.g. not inferring absence in such cases when the basis of record is a specimen, or where there is an image/audio associated with an observation?


User: See in registry
System: Chrome 85.0.4183 / Mac OS X 10.15.6
Referer: https://www.gbif.org/occurrence/gallery?basis_of_record=PRESERVED_SPECIMEN&occurrence_status=absent
Window size: width 2032 - height 1306
API log
Site log
System health at time of feedback: OPERATIONAL

@timrobertson100 timrobertson100 transferred this issue from gbif/portal-feedback Sep 17, 2020
@timrobertson100
Member

@ahahn-gbif what do you prefer here, please?

From clicking around, the proposal to infer present for the case of individualCount=0 AND occurrenceStatus=NULL AND basisOfRecord=PreservedSpecimen looks like it would catch many false absences. Intuitively, a specimen record with an individualCount of 0 is more likely to reflect the common "0 for null numbers" mistake than a true absence.
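To make the condition concrete, here is a minimal sketch of the check described above. It assumes records are plain dictionaries keyed by Darwin Core term names and that basisOfRecord uses the `PRESERVED_SPECIMEN` value seen in the portal URL; the function name `is_likely_false_absence` is hypothetical and this is not the pipeline code.

```python
def is_likely_false_absence(record: dict) -> bool:
    """True when a record matches the proposed condition:
    individualCount=0 AND occurrenceStatus=NULL AND basisOfRecord=PreservedSpecimen.
    """
    return (
        record.get("individualCount") == 0
        and record.get("occurrenceStatus") is None
        and record.get("basisOfRecord") == "PRESERVED_SPECIMEN"
    )


# Illustrative records only, not real GBIF data.
records = [
    {"basisOfRecord": "PRESERVED_SPECIMEN", "individualCount": 0},
    {"basisOfRecord": "PRESERVED_SPECIMEN", "individualCount": 0,
     "occurrenceStatus": "ABSENT"},  # explicit absence, left alone
    {"basisOfRecord": "HUMAN_OBSERVATION", "individualCount": 0},
]
suspects = [r for r in records if is_likely_false_absence(r)]
```

Only the first record is selected; the explicitly absent specimen and the zero-count observation are left untouched.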

@timrobertson100
Member

Alternatively, most could be fixed from one provider - perhaps we could confirm they have only presence data and set a default on the occurrenceStatus

image

@ahahn-gbif

It sounds plausible to interpret "Specimen" as present; I would not assume the same for "occurrence" or "observation", as this again may be a field default (mapping). The earlier, extensive discussion (see #268) did not consider BoR, so we probably do want to extend the interpretation here for "Specimen", as @timrobertson100 suggests.

The best scenario is to get data sources updated at their origin, but this can be slow progress. The mail shoot in early 2020 focused on inconsistencies between occurrence status and count. @jlegind, how many of the datasets in the list above were already notified in the first round? Would it make sense to write to at least the top ten in the list displayed above, and set the default for occurrence status (where NULL) if there is no reaction in reasonable time?
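As a rough illustration of what "set the default for occurrence status (where NULL)" could mean per dataset, here is a small sketch. The function name `apply_dataset_default` and the dictionary record shape are assumptions for the example; how the default is actually configured for a dataset is not shown in this thread.

```python
from typing import Optional


def apply_dataset_default(record: dict, default_status: Optional[str]) -> dict:
    """Fill occurrenceStatus from a dataset-level default, but only when the
    record has no value of its own. Returns a new dict; the input is not mutated.
    """
    if default_status and not record.get("occurrenceStatus"):
        return {**record, "occurrenceStatus": default_status}
    return record


# A presence-only dataset: zero counts no longer lead to inferred absence,
# because the status is already filled in before any inference runs.
fixed = apply_dataset_default({"individualCount": 0}, "PRESENT")
```

An explicit value supplied by the publisher is never overwritten, so a genuinely recorded absence would survive the default.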

@jlegind

jlegind commented Sep 17, 2020

I will contact Vince at the NHM London and get an explanation. Hopefully this can provide a basis for an updated filter.

@timrobertson100
Member

Update: UK NHM requested that a default value be added (done), so only 258k specimens interpreted as absent remain.

@ahahn-gbif

ahahn-gbif commented Sep 22, 2020

I would still be in support of the proposal to change the logic so that count=0 and BoR=Specimen -> PRESENT (unless explicitly stated as absent).

@MattBlissett
Member

Does this require another issue, OCCURRENCE_STATUS_INFERRED_FROM_BASIS_OF_RECORD?

muttcg added a commit to gbif/gbif-api that referenced this issue Sep 24, 2020
@muttcg
Member

muttcg commented Sep 24, 2020

Should I add the same logic for FOSSIL_SPECIMEN and LIVING_SPECIMEN?

@MattBlissett
Member

Yes, fossil, living and preserved specimens.

@ahahn-gbif

I was wondering the same. Yes, I think, especially if there is a specimenID of some kind.
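Putting the pieces of this discussion together, the agreed behaviour for zero counts might look roughly like the sketch below. It is an illustration only: the function name `interpret_zero_count`, the dictionary record shape, and the ABSENT fallback for non-specimen records are assumptions, and the real change lives in the referenced commits. Only the flag name and the three specimen types come from the comments above.

```python
from typing import List, Optional, Tuple

# Basis-of-record values named in this thread.
SPECIMEN_TYPES = {"PRESERVED_SPECIMEN", "FOSSIL_SPECIMEN", "LIVING_SPECIMEN"}
FLAG = "OCCURRENCE_STATUS_INFERRED_FROM_BASIS_OF_RECORD"


def interpret_zero_count(record: dict) -> Tuple[Optional[str], List[str]]:
    """Return (occurrence status, issue flags) for the individualCount=0 case.

    Records with other counts return (None, []) and are left to other rules.
    """
    issues: List[str] = []
    status = record.get("occurrenceStatus")
    if status:
        # An explicitly stated status, including ABSENT, always wins.
        return status.upper(), issues
    if record.get("individualCount") == 0:
        if record.get("basisOfRecord") in SPECIMEN_TYPES:
            # A specimen exists, so the zero is treated as a null-number mistake.
            issues.append(FLAG)
            return "PRESENT", issues
        # Assumed fallback: non-specimen zero counts are still read as absence.
        return "ABSENT", issues
    return None, issues
```

For example, `interpret_zero_count({"basisOfRecord": "FOSSIL_SPECIMEN", "individualCount": 0})` yields PRESENT with the flag attached, so anyone using the data can still see that the status was inferred rather than supplied.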

muttcg added a commit to gbif/download-query-tools that referenced this issue Sep 24, 2020
muttcg added a commit to gbif/registry that referenced this issue Sep 24, 2020
muttcg added a commit that referenced this issue Oct 1, 2020
* #392 add OCCURRENCE_STATUS_INFERRED_FROM_BASIS_OF_RECORD logic
muttcg added a commit that referenced this issue Oct 6, 2020
@muttcg
Member

muttcg commented Oct 8, 2020

Deployed to PROD


6 participants