Annotations are not being used #1679

dustymc · 2018-09-12T15:48:11Z

Arctos has a sophisticated system for annotating various data objects. Submitting an annotation alerts anyone who might have an interest. They're still mostly ignored.

UAM@ARCTOS> select REVIEWED_FG, count(*) from annotations group by REVIEWED_FG;

REVIEWED_FG   COUNT(*)
----------- ----------
	  1	   378
	  0	  4788

2 rows selected.

Most everyone has a few unreviewed annotations hanging around:

 select
  guid_prefix,
  count(*)
from
  annotations,
  cataloged_item,
  collection
where
  annotations.collection_object_id=cataloged_item.collection_object_id and
  cataloged_item.collection_id=collection.collection_id and
  REVIEWED_FG=0
group by 
  guid_prefix
order by
  guid_prefix
;

GUID_PREFIX						       COUNT(*)
------------------------------------------------------------ ----------
APSU:Herp							      1
CHAS:Bird							     28
CHAS:Egg							      2
CHAS:Ento							      2
CHAS:Mamm							      6
DMNS:Bird							     40
DMNS:Egg							      3
DMNS:Inv							     43
DMNS:Mamm							     10
HWML:Para							    185
KNWR:Herb							      1
KWP:Ento							    301
MLZ:Bird							     10
MSB:Bird							     49
MSB:Fish							      3
MSB:Herp							     26
MSB:Host							      4
MSB:Mamm							    877
MSB:Para							      9
MVZ:Bird							    272
MVZ:Egg 							     12
MVZ:Herp							    490
MVZ:Hild							      1
MVZ:Mamm							    614
MVZObs:Bird							      4
UAM:Alg 							     18
UAM:Bird							    128
UAM:ES								     22
UAM:Ento							     10
UAM:Fish							     34
UAM:Herb							     46
UAM:Herp							      2
UAM:Inv 							     19
UAM:Mamm							    229
UAMObs:Ento							     63
UAMb:Herb							     31
UCM:Bird							    107
UCM:Egg 							     12
UCM:Fish							      1
UCM:Herp							    314
UCM:Obs 							      1
UMNH:Mamm							     48
UNR:Herp							      1
UTEP:Herb							      9
UTEP:Herp							      2
UTEP:Mamm							      4
UWBM:Herp							    496
UWBM:Mamm							     72
UWYMV:Mamm							     16
WNMU:Mamm							      2

50 rows selected.

Is there some way I can help, or something we could be doing differently? I don't imagine that our current response inspires users to keep submitting annotations.

(And maybe this should be a section of that publication. We built annotations when a major project to do about the same thing never happened, it's a form of crowdsourcing, grouping multiple annotations is a novel thing that essentially allows annotating arbitrary queries, etc. - AFAIK no other system has anything remotely similar.)


Jegelewicz · 2018-09-12T16:33:57Z

Who gets notified when an annotation is made? I don't recall getting any emails. If email is too intrusive, how about a notification upon sign-in as you are doing with the "other stuff in bulkloaders"?

dustymc · 2018-09-12T16:40:58Z

> notified

That depends on what's being annotated, but primarily the specimen's collection's data quality contact. And that part is easy to adjust.

> notification

#1419

Jegelewicz · 2018-09-12T16:47:43Z

If I go to review annotations, I have to search on every collection individually (or the search returns too many) and for many of the collections there are no annotations - I spend a lot of time for no results going collection by collection. Could we search by institution OR collection? At times, I'd like to be able to see just the UTEP annotations or just MSB. Other times I may be more focused on a single collection.

Also, we need a way to mark them resolved, need research, etc. or the number will just continue to increase forever. They should never go away (we had a looooong discussion about this at SPNHC/TDWG) but we need a way to sort them better so that we aren't constantly reviewing them over and over.

dustymc · 2018-09-12T17:09:06Z

> search by institution OR collection?

Yea, sure - I'll do that now.
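One way such a filter could work, sketched in Python against an in-memory SQLite table. This is only an illustration, not how Arctos implements its search: the `unreviewed` table, the `unreviewed_for` function, and the assumption that the institution is whatever precedes the first colon in `guid_prefix` are all hypothetical. The counts are copied from the listing earlier in this thread.

```python
import sqlite3

# Hypothetical, simplified table: one row per collection with its
# unreviewed-annotation count (values copied from the listing in this thread).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unreviewed (guid_prefix TEXT PRIMARY KEY, n INTEGER)")
conn.executemany(
    "INSERT INTO unreviewed VALUES (?, ?)",
    [("MSB:Bird", 49), ("MSB:Mamm", 877), ("UTEP:Herb", 9),
     ("UTEP:Herp", 2), ("UTEP:Mamm", 4), ("MVZObs:Bird", 4)],
)


def unreviewed_for(conn, institution=None, collection=None):
    """Return (guid_prefix, n) rows for one collection or one institution.

    If both arguments are given, the more specific collection filter wins.
    """
    base = "SELECT guid_prefix, n FROM unreviewed"
    if collection is not None:
        sql, args = base + " WHERE guid_prefix = ?", (collection,)
    elif institution is not None:
        # Assumption: the institution is the text before the first colon.
        sql = (base + " WHERE substr(guid_prefix, 1, "
                      "instr(guid_prefix, ':') - 1) = ?")
        args = (institution,)
    else:
        sql, args = base, ()
    return conn.execute(sql + " ORDER BY guid_prefix", args).fetchall()


utep = unreviewed_for(conn, institution="UTEP")       # all UTEP collections
msb_mamm = unreviewed_for(conn, collection="MSB:Mamm")  # a single collection
```

Comparing the extracted prefix for equality, rather than using a `LIKE 'UTEP%'` pattern, keeps look-alike prefixes such as `MVZ` and `MVZObs` apart.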

The search form is more or less an afterthought - the original intention was that we'd respond to the emails (eg, instant gratification for the user who's bothered to leave us information). That's obviously not happening to the degree it should, I'm happy to consider WHATEVER.

> mark them resolved

We can talk about that too, but there is a way to mark them as "reviewed" and filter on that. Not all will be resolvable, and I'm not really convinced we can categorize that sort of thing - my initial reaction is that "someone's looked at this and responded to the degree current data and resources allow" is good enough.

> never go away

They don't and won't - annotations become a part of the specimen record.

> looooong discussion about this at SPNHC/TDWG

Lots of those discussions tend to be about "problems" we've long since solved....

Jegelewicz · 2018-09-12T17:53:29Z

> there is a way to mark them as "reviewed" and filter on that

I think it would be helpful if this were more obvious. Having to enter "NOT NULL" is not very intuitive. Could there just be a check box where this happens in the background?

[ ] "Check here to see only annotations which have no review comments"
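A minimal sketch of how a check box like this could stand in for the NULL test, so that nobody has to type it. The cut-down `annotations` table, the `find_annotations` function, and the `only_unreviewed` flag are hypothetical names for illustration; the only thing taken from Arctos is the idea of filtering on whether `reviewer_comment` is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down version of the annotations table.
conn.execute(
    "CREATE TABLE annotations ("
    " annotation_id INTEGER PRIMARY KEY,"
    " reviewer_comment TEXT)"
)
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?)",
    [(1, None), (2, "MVZ:Bird:183392 is fixed."), (3, None)],
)


def find_annotations(conn, only_unreviewed=False):
    """Return annotation ids; a checked box adds the NULL filter for the user."""
    sql = "SELECT annotation_id FROM annotations"
    if only_unreviewed:
        # The check box state becomes the NULL test behind the scenes.
        sql += " WHERE reviewer_comment IS NULL"
    sql += " ORDER BY annotation_id"
    return [row[0] for row in conn.execute(sql)]


everything = find_annotations(conn)
needs_review = find_annotations(conn, only_unreviewed=True)
```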

We should think about adding to our annotations so that they can be more easily sorted. Just as we have templates for GitHub issues, templates for coordinate, taxon, other annotations would help those making them as well as those reviewing them.

dustymc · 2018-09-12T18:08:21Z

> check box

You can just click the link, but I can do whatever in the UI.

There are also things like...

which have been reviewed but may need work anyway; that's why the default view shows more than you may need.

> templates

I don't see a technical problem with that, but I'm not sure I understand what/how that'd work. Demo/example?

Jegelewicz · 2018-09-12T22:13:17Z

> You can just click the link

I had no idea, and I'd like to make it as easy to understand as possible. NULL isn't a word that would resonate with a lot of people.

Jegelewicz · 2018-09-12T22:19:36Z

> There are also things like...

[Screenshot: 2018-09-12 at 10:59:51 AM]

Yep and those are scary. Will what I do to fix mine affect anyone else? Do all of these share a set of coordinates or are they a bunch of separate coords that don't map to Washington? It is overwhelming and probably means that no matter how many times I see it I am just going to move along to something I feel more comfortable with. And if I was the one who fixed mine, it's going to frustrate me that I have to keep seeing it. I know what you are getting at, but I think we need to be as focused as possible, especially when these come from within Arctos. Remember, we have limited resources and time. We'd like to be perfect, but we are really just scrambling to get last year's (decade's?) data in the system in the first place.

Jegelewicz · 2018-09-12T22:20:15Z

I'll see if I can set up a template example in the next week or so...

dustymc · 2018-09-12T23:26:46Z

> easy to understand as possible NULL

Yea, the parenthetical bits are meant to help address that.

> Will what I do to fix mine affect anyone else?

Hopefully! Ideally the one locality would be fixed, everyone gets emails, yay everybody, yay shared data, all done. It seems like reality doesn't always quite get there.

The data are stored denormalized as multiple annotations. The screenshot above is built from...

UAM@ARCTOS> select * from annotations where annotation_group_id=1862;

ANNOTATION_ID ANNOTATE_D
------------- ----------
CF_USERNAME
------------------------------------------------------------------------------------------------------------------------
COLLECTION_OBJECT_ID TAXON_NAME_ID PROJECT_ID PUBLICATION_ID
-------------------- ------------- ---------- --------------
ANNOTATION
------------------------------------------------------------------------------------------------------------------------
REVIEWER_AGENT_ID REVIEWED_FG
----------------- -----------
REVIEWER_COMMENT
------------------------------------------------------------------------------------------------------------------------
ANNOTATION_GROUP_ID
-------------------
EMAIL
------------------------------------------------------------------------------------------------------------------------
  MEDIA_ID
----------
	 4230 2017-04-24
dlm
	    24427596
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4231 2017-04-24
dlm
	    21339142
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4232 2017-04-24
dlm
	      364987
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4233 2017-04-24
dlm
	    24458363
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4234 2017-04-24
dlm
	    24427602
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4235 2017-04-24
dlm
	    24266463
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4236 2017-04-24
dlm
	    24473289
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4237 2017-04-24
dlm
	    23740987
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4238 2017-04-24
dlm
	    24427600
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4239 2017-04-24
dlm
	    24427598
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com


	 4240 2017-04-24
dlm
	    24433379
Check georeference; does not map to Washington.
	 10014199	    1
MVZ:Bird:183392 is fixed.
	       1862
dustymc@gmail.com



11 rows selected.

... with the expectation that multi-specimen annotations would usually involve data shared between specimens. (That one's almost certainly from the "find not in the shape" link on edit geography.) It wouldn't be much of a problem for me to allow resolving annotations for individual specimens, but that would require you doing something for each of the 472 specimens involved in https://arctos.database.museum/info/reviewAnnotation.cfm?ANNOTATION_GROUP_ID=701 rather than just doing something for the group.
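The trade-off between resolving a group and resolving specimens one at a time can be sketched as follows. This is an illustration against an in-memory SQLite table, not Arctos code: the table keeps only the columns needed here, and `review_group` and `review_one` are hypothetical helper names. The eleven rows mirror group 1862 from the query output in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE annotations (
        annotation_id       INTEGER PRIMARY KEY,
        annotation_group_id INTEGER,
        reviewed_fg         INTEGER DEFAULT 0,
        reviewer_comment    TEXT
    )""")
# Eleven unreviewed rows sharing one group, mirroring group 1862.
conn.executemany(
    "INSERT INTO annotations (annotation_id, annotation_group_id) VALUES (?, ?)",
    [(i, 1862) for i in range(4230, 4241)],
)


def review_group(conn, group_id, comment):
    """Mark every annotation in a group as reviewed with one statement."""
    cur = conn.execute(
        "UPDATE annotations SET reviewed_fg = 1, reviewer_comment = ? "
        "WHERE annotation_group_id = ?",
        (comment, group_id),
    )
    conn.commit()
    return cur.rowcount  # number of rows touched


def review_one(conn, annotation_id, comment):
    """Per-specimen alternative: one call (and one decision) per annotation."""
    cur = conn.execute(
        "UPDATE annotations SET reviewed_fg = 1, reviewer_comment = ? "
        "WHERE annotation_id = ?",
        (comment, annotation_id),
    )
    conn.commit()
    return cur.rowcount


updated = review_group(conn, 1862, "MVZ:Bird:183392 is fixed.")
```

With the group form, a single action covers all eleven rows; the per-specimen form would need one call per row, which is the cost described above for a 472-specimen group.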

dustymc · 2018-11-19T18:57:14Z

Update:


UAM@ARCTOS> select REVIEWED_FG, count(*) from annotations group by REVIEWED_FG;

REVIEWED_FG   COUNT(*)
----------- ----------
	  1	   420
	  0	  8993

dustymc · 2019-02-12T15:49:52Z

select REVIEWED_FG, count(*) from annotations group by REVIEWED_FG;

REVIEWED_FG   COUNT(*)
----------- ----------
	  1	   434
	  0	 10013

Added annotations to the report (currently in Random).

dustymc · 2019-08-08T15:26:12Z


UAM@ARCTOS> select REVIEWED_FG, count(*) from annotations group by REVIEWED_FG;

REVIEWED_FG   COUNT(*)
----------- ----------
	  1	   980
	  0	 13769

dustymc · 2019-10-09T14:50:06Z

UAM@ARCTOS> select REVIEWED_FG, count(*) from annotations group by REVIEWED_FG;

REVIEWED_FG   COUNT(*)
----------- ----------
	  1	   988
	  0	 14190

campmlc · 2019-10-09T16:46:12Z

This is a staff time issue. We do see the annotations come through email, but have to find chunks of time to address them. Obviously time is a limitation.


dustymc · 2019-10-09T16:55:21Z

Yep, no surprise there.

The "any..." fields may help mitigate known-conflicting data a bit - I was going to mention this in that thread - but they may also make it worse depending if eg, the coordinates or description is actually the problem.

I've been looking for a way to fund this at the level of Arctos, but I'm not sure that's possible. Figuring out the problem may involve digging through your paper files, which probably means this is something each collection will have to deal with individually. If we can get around that, "these are known problems, fixing them and developing the infrastructure to detect/prevent/fix similar problems (and maybe digitizing those paper files) would make our data more capable of doing more things, but we don't have the resources to do that" seems like a sell-able proposal.

mbprondzinski · 2020-09-17T13:07:09Z

Why is there no documentation on this? I can't find any information on what exactly I am supposed to do with these records.

campmlc · 2020-09-17T13:12:45Z

If anyone is poking around in there and wants to write up a quick "this is what I did" as a shared google doc, we can write up something quickly to add to the documentation.


Jegelewicz · 2020-09-17T13:34:06Z

Clearly we need better documentation and probably still a bit more tweaking to this system. I am assigning this to myself, with no promise to tackle it immediately, but maybe by the end of the year....

dustymc · 2020-09-17T15:04:40Z

Latest stats, if anyone's keeping track:

arctosprod@arctos>> select case when reviewer_comment is null then 'unreviewed' else 'reviewed' end sts, count(*) from annotations group by sts;
    sts     | count 
------------+-------
 reviewed   |  1291
 unreviewed | 21631

The first notifications went out last night, I'm not sure we can do more, this can probably be closed or moved to docs.

I did notice that several collections have no active data quality contact so aren't getting the emails; perhaps this is a communication problem as much as anything.

@campmlc are you volunteering? If not I will throw together something minimal for the handbook - I agree this should have basic documentation NOW, we can expand and clean up later.

dustymc · 2020-10-20T15:37:22Z

arctosprod@arctos>> select case when reviewer_comment is null then 'unreviewed' else 'reviewed' end sts, count(*) from annotations group by sts;
    sts     | count 
------------+-------
 reviewed   |  1317
 unreviewed | 22512

dustymc · 2020-12-10T16:50:22Z

arctosprod@arctos>> select case when reviewer_comment is null then 'unreviewed' else 'reviewed' end sts, count(*) from annotations group by sts;
    sts     | count 
------------+-------
 reviewed   |  1332
 unreviewed | 23587

dustymc · 2022-02-15T17:02:06Z

------------+-------
 reviewed   |  2124
 unreviewed | 31112

I don't think there's more to be done here; getting collections to not ignore bad data does not seem to have a technical solution. Closing.
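For anyone who wants the counts posted in this thread in one place, here is a small sketch that turns them into a reviewed percentage per snapshot. The numbers are copied from the comments above; the `reviewed_share` helper is a hypothetical name. Note that the earlier snapshots count on REVIEWED_FG and the later ones on whether reviewer_comment is null, so the two series may not be strictly comparable.

```python
# (date, reviewed, unreviewed) as posted in this thread.
snapshots = [
    ("2018-09-12", 378, 4788),
    ("2018-11-19", 420, 8993),
    ("2019-02-12", 434, 10013),
    ("2019-08-08", 980, 13769),
    ("2019-10-09", 988, 14190),
    ("2020-09-17", 1291, 21631),
    ("2020-10-20", 1317, 22512),
    ("2020-12-10", 1332, 23587),
    ("2022-02-15", 2124, 31112),
]


def reviewed_share(reviewed, unreviewed):
    """Percentage of all annotations that have been reviewed."""
    return 100.0 * reviewed / (reviewed + unreviewed)


for date, reviewed, unreviewed in snapshots:
    total = reviewed + unreviewed
    share = reviewed_share(reviewed, unreviewed)
    print(f"{date}  total={total:6d}  reviewed={share:4.1f}%")
```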

dustymc added the "Help wanted" label (I have a question on how to use Arctos) Sep 12, 2018

dustymc added this to the Needs Discussion milestone Sep 12, 2018

Jegelewicz added the "Display/Interface" label (I don't like the way Arctos looks or it isn't working for me aesthetically.) Sep 12, 2018

dustymc mentioned this issue Sep 24, 2018

classification bulkloader notifications #1696

Closed

dustymc mentioned this issue Nov 19, 2018

Add Wiki links to Agents #1800

Closed

dustymc mentioned this issue Aug 8, 2019

Low Quality Dashboard ArctosDB/arctos-webinars#29

Closed

Jegelewicz self-assigned this Sep 17, 2020

Jegelewicz added the "NeedsDocumentation" label (When the issue is resolved in Arctos repository, this should be moved to the Documentation-wiki repo) Sep 17, 2020

Jegelewicz removed this from the Needs Discussion milestone Sep 17, 2020

dustymc mentioned this issue Dec 10, 2020

Geography Proposal #3272

Closed

dustymc added this to the Needs Discussion milestone Jan 21, 2021

dustymc closed this as completed Feb 15, 2022
