geography: start/stop date? #3018
Comments
I'm very interested and researching solutions to this and a plethora of Locality Services that have a long history stemming from the first georeferencing forays in 1999. In the Darwin Core Hour and associated BBQs earlier this year there was a call for an "Event Gazetteer". The idea as presented was actually to make something like an Event Backbone to parallel the Taxonomic Backbone concept, rather than to just do a Location Backbone. But at the core, either way, could be a time-sensitive gazetteer to facilitate georeferencing, the results of which would be the most unambiguous way to find things spatially ("I mean here on the map" where here is a geometry). I would love to hear any further ideas, suggestions, use cases, but would prefer to get them here if they aren't already in there, for the whole world interested in the subject to see. |
For this, if there were a webservice that would minimally accept coordinates-plus-date and return "geography" (whatever that means...), we could determine if the place was called Whatever County when the Event occurred, whatever it's called now, and use that to flag records (or in this case to avoid flagging them - we have plenty of actual problems to worry about!). Accepting shapes and returning a list of intersecting shapes would be even better - it's currently difficult to avoid "3 pixels outside of Cibola County" when we're really looking for "says NM, maps to CN."
Using the service to pull current (or standardized - I guess I don't actually care how it's standardized) geography would allow us to provide a consistent search environment; everything from (X,Y) would come back in a search for {shapename}. That's currently a huge embarrassing gap in our capabilities.
Using the service to pull historical geography would allow us to provide a comprehensive search environment. ("Find stuff from everything that's ever been called ...")
String-->shape ("facilitate georeferencing") functionality would be pretty cool, but it's almost secondary from my POV - we have tools for that now. Curators and CMs might have a very different outlook on that, and being able to pull coordinates from multiple services and compare them would be pretty huge.
tl;dr: Super cool, how can the Arctos Community help make it a reality? |
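To make the coordinates-plus-date request concrete, here is a minimal sketch of what such a lookup could do. Everything in it is an assumption for illustration: the `Feature` record, the `geography_at` function, and the bounding boxes (crude stand-ins for real shapes, not actual county borders). The only fact borrowed from the real world is that Cibola County was split from Valencia County in 1981.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    name: str
    # (min_lon, min_lat, max_lon, max_lat); a crude stand-in for a real shape.
    bbox: Tuple[float, float, float, float]
    start_year: Optional[int]  # None means unknown / open-ended
    end_year: Optional[int]    # None means still current


# Hypothetical data: the boxes are illustrative only, not real borders.
FEATURES = [
    Feature("Valencia County (pre-1981 extent)", (-109.05, 34.2, -106.4, 35.4), None, 1981),
    Feature("Cibola County", (-109.05, 34.5, -107.0, 35.4), 1981, None),
    Feature("Valencia County", (-107.2, 34.2, -106.4, 35.0), 1981, None),
]


def in_bbox(lon: float, lat: float, bbox: Tuple[float, float, float, float]) -> bool:
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def geography_at(lon: float, lat: float, year: Optional[int] = None) -> List[str]:
    """Return names of features containing the point.

    With a year, return features whose date span covers that year.
    Without a year, return only current features (no stop date).
    """
    hits = []
    for feature in FEATURES:
        if not in_bbox(lon, lat, feature.bbox):
            continue
        if year is None:
            if feature.end_year is None:
                hits.append(feature.name)
        else:
            started = feature.start_year is None or feature.start_year <= year
            not_ended = feature.end_year is None or year <= feature.end_year
            if started and not_ended:
                hits.append(feature.name)
    return hits


# A point near Grants, NM, asked about in 1950 and then with no date at all.
historical = geography_at(-107.85, 35.15, year=1950)
current = geography_at(-107.85, 35.15)
```

With year-precision dates, a query for 1981 itself matches both the old and the new features; a real service would have to decide how to report that ambiguity.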
Can you elaborate on this one? |
I've added the use cases to the Technical:needs tab of the Imagining a Global Gazetteer Google sheet.
Give me things to test or that you would like statistics for. Have a look at the principles of Higher Geography standardization for VertNet and opine as GitHub issues in that repository. Suggest viable funding options. |
Grab a seal, moose, mouse, plant, fish, and crab from the same place on some beach. The seal will get entered with State, Sea (they swim, the state issues permits), the moose will go to whatever the hunting regulations use, the mouse will get a quad (better to sort them into jars), the plant will get a Feature (NPS likes to pay botanists), the fish will get a drainage, the crab will get a marine designation. It doesn't seem likely that we'll all agree on a single name for that point on the planet, and without that the descriptive data doesn't converge - no search finds everything. (The moose and mouse might converge at the state level, but they never share terminology with the crab.) A service that accepts the point (or better yet, the shape) and returns something (or a set of somethings, or whatever) predictable would allow users to find them all with one search term, without trying to determine just exactly where "Alaska" stops and "Beaufort Sea" begins or "requiring" (we can't) Curators to use (or avoid) anything that we might consider geography, etc. Yay everybody. plus finds ferexample |
Do you mean you find it useful to be able to get a textual search term as a proxy for the spatial term you used to find it? That seems odd, but hey. Would this be covered by something like an S2 cell Identifier? Otherwise, why not just use the spatial search directly? |
Yup.
Not so much. It's entered as "Some Country" because whatever reasons (and georeferenced), I think the collections (researchers, county invasive species dept., ...) want to find it by "Some County" or similar no matter what's been asserted/used.
As above I believe. We can ask, I'd certainly be fine without... |
OK, let's see if I got it this time. Data are entered textually with
whatever. The location is georeferenced however. The service can tell you
all the things that place might be called based on what's intersected by
the georeference (backed by a spatial feature data set and an adjoined
thesaurus for the synonyms for the names of the features). Then, a user
searches on whatever and will get results for all features that have that
searched value as a synonym. Solves the issue of "my whatever" isn't the
same as yours and I have no idea what you would call it, and frankly, I
don't care, I just want stuff from there.
Close?
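
The flow summarized above (georeference intersects features, features carry synonyms, a search on any synonym finds everything) might look roughly like the following. This is a sketch under assumptions: the feature identifiers, the `THESAURUS` and `RECORDS` structures, and the `search` function are all hypothetical, and the spatial intersection step is taken as already done.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical thesaurus: feature id -> every name the feature is known by.
THESAURUS: Dict[str, List[str]] = {
    "feat:alameda": ["Alameda", "Alameda County", "The County Of Alameda", "Alameda Co."],
    "feat:cibola": ["Cibola County", "Cibola Co."],
}

# Hypothetical records. "features" holds the ids intersected by each record's
# georeference; the verbatim text is whatever the collection entered.
RECORDS = [
    {"id": "rec1", "verbatim": "Alameda Co.", "features": ["feat:alameda"]},
    {"id": "rec2", "verbatim": "near Oakland", "features": ["feat:alameda"]},
    {"id": "rec3", "verbatim": "Valencia County", "features": ["feat:cibola"]},
]


def normalize(term: str) -> str:
    return term.strip().casefold()


def build_name_index(thesaurus: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each normalized name to the set of feature ids that use it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for feature_id, names in thesaurus.items():
        for name in names:
            index[normalize(name)].add(feature_id)
    return index


NAME_INDEX = build_name_index(THESAURUS)


def search(term: str) -> List[str]:
    """Return ids of records whose georeference intersects any feature
    that has `term` as a synonym, whatever text the record was entered with."""
    wanted = NAME_INDEX.get(normalize(term), set())
    return [r["id"] for r in RECORDS if wanted.intersection(r["features"])]


alameda_hits = search("alameda county")
cibola_hits = search("Cibola County")
```

Note that `rec3` is found by "Cibola County" even though it was entered as "Valencia County", because the match goes through the georeference rather than the entered text.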
|
Yes. A "you should prefer this" flag on exactly one of those whatevers would be icing, but I don't think it changes functionality. (It does for anything bigger than Arctos.) I think the "just want stuff from there" and "this was in fact Bla County, but now we call it...." services (which might be different views of the same thing) would fundamentally change the nature of the questions the data can answer. I think we'd totally use the "georeferenced however...." (and all the cool stuff we haven't thought of yet too!) but I also think that would be closer to "neato" than "transformational." I don't mean to be overly dismissive of that aspect, even though I probably was.... Also one of the curators involved in this pointed out that some of his stuff was getting flagged because our WKTs have fairly low resolution, which has been a problem before. MAYBE we'll be able to fix that ourselves when/if we get postgis running, but as of right now access to higher resolution shapes than we're capable of dealing with would be a significant reason for Arctos folks to find a way to get behind this effort. |
I'm not sure about this. For example, should everyone prefer an English label? There are surely personal/situational/institutional/national preferences that differ.
A lot like taxonomy, only different - a thesaurus linked to a spatial data store. Probably a LOT more tractable than taxonomy, for me at least.
Can I get an example of something that was flagged? I'm interested in where, how and why. Being able to refer to fully spatially-enabled representations of places with metadata from a URL in a spreadsheet was one of the fundamental driving forces for locality services - to level the spatial data playing field in occurrence data management. |
Yea maybe, but somehow avoiding Alameda/Alameda County/The County Of Alameda/Alameda Co./etc. seems useful, especially in eg DWC where there's only room for one THING.
We seem to have two underlying currents.
Hosts and parasites are strong evidence that we do need some capability to share events, no matter how precisely we measure. I have little idea about how to reconcile those viewpoints - maybe fleshing it out and providing data organization guidance would be a valuable part of a service.
Valencia County according to Arctos looks like this: The link opens records that say they're from Valencia County but don't map to Valencia County. (And from there Annotations are just a click, so I clicked.) ... and I can't find a good example in that so let's go to Idaho. https://arctos.database.museum/guid/UAMObs:Ento:233962 has a red map border (there's spatial authority data, the record isn't in it), zoom in a bit and.... The WKT only approximates the border, the map point is in fact where it claims to be, the spatial data we have is just wrong. I'm running everything through javascript, so more-precise spatial data (especially for something like Alaska with its eleventy-bajillion islands and long complex coastline) would probably just melt something. A service that could provide both lightweight spatial data for maps and an "in/out" determination with some precision behind it would be valuable. I can only deal with point-in-poly, so something capable of considering the error associated with the point would be even better. |
I think you can do more than that. You can use a ST_DISTANCE function to find things that are as close or closer than the radius. Chalk one up to the point-radius method. ;-) |
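As a rough illustration of the point-radius idea, the check below treats a record as acceptable when its point is inside the shape or when the distance from the point to the shape is no more than the uncertainty radius. It is a plain-Python, planar sketch with hypothetical function names (`distance_to_polygon`, `classify`), standing in for what a spatial database distance function would do; real longitude/latitude data would need geodesic or projected distances.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, poly: List[Point]) -> bool:
    """Ray-casting test; poly is a list of vertices, implicitly closed."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt: Point, a: Point, b: Point) -> float:
    """Planar distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polygon(pt: Point, poly: List[Point]) -> float:
    """Zero if pt is inside poly, else distance to the nearest edge."""
    if point_in_polygon(pt, poly):
        return 0.0
    n = len(poly)
    return min(dist_to_segment(pt, poly[i], poly[(i + 1) % n]) for i in range(n))


def classify(pt: Point, radius: float, poly: List[Point]) -> str:
    """'inside', 'within uncertainty', or 'outside' for a point-radius record."""
    if point_in_polygon(pt, poly):
        return "inside"
    if distance_to_polygon(pt, poly) <= radius:
        return "within uncertainty"
    return "outside"


square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
status = classify((10.5, 5.0), 1.0, square)
```

Under this rule, a record that sits "3 pixels outside" a low-resolution border would only be flagged if its uncertainty radius cannot reach the shape.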
If we're going to do this, we should find a path and implement. If we're not, we should delete non-current entries. https://arctos.database.museum/place.cfm?action=detail&geog_auth_rec_id=38 |
I'm calling this a technical problem and going next task, but on the off chance that something radical happens with #4836 will try to delay until after AWG discussion. |
PROPOSAL:
EXPLANATION: ISO8601 datatype because these will generally be year-precision. The both-or-neither rule will help prevent duplicates and unnecessary assertions, and keep the concatenation consistent and predictable (eg, users will never see one year, always nothing or a span). Including the dates in the concatenation will further help prevent duplicates (there's a unique index), and allow these to function, especially in string-based applications (like data entry), just like any other geography. Examples:
BEST PRACTICES
Noncurrent geography should not be used in accepted Events. When such usage is unavoidable, the goal should be to "upgrade" to current geography when resources allow. (That is, nobody has time to preemptively go figure out what 'Yugoslavia' is supposed to mean, but if that is resolved - maybe when the records are used - then a new Event using current geography should be added, and the 'Yugoslavia' Event should become unaccepted.)
DOCUMENTATION NEEDED
For admin geography:
For data entry:
HELP! Is this a good/workable solution? I need to implement (or find an alternative or whatever) relatively soon so I'm not wasting time scribbling in remarks. |
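One possible reading of the both-or-neither rule and the date-bearing concatenation, sketched for discussion. The function names, the "(start/stop)" display format, and the sample values are all assumptions and not the Arctos implementation; the slash follows the ISO 8601 convention for writing an interval.

```python
import re
from typing import Optional

# Accepts year, year-month, or full date (shape check only, not calendar validity).
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def validate_span(start: Optional[str], stop: Optional[str]) -> None:
    """Enforce both-or-neither, ISO 8601 shape, and start year <= stop year."""
    if (start is None) != (stop is None):
        raise ValueError("start and stop dates must both be given or both be omitted")
    if start is None:
        return
    for value in (start, stop):
        if not ISO_DATE.match(value):
            raise ValueError(f"not an ISO 8601 date: {value!r}")
    # Only the years are compared, since most values are year-precision.
    if int(start[:4]) > int(stop[:4]):
        raise ValueError("start date is after stop date")


def display_name(geography: str, start: Optional[str], stop: Optional[str]) -> str:
    """Build the concatenated name: either no dates at all, or a full span."""
    validate_span(start, stop)
    if start is None:
        return geography
    return f"{geography} ({start}/{stop})"


# Placeholder values, not real dates for any real place.
noncurrent = display_name("Old Name County", "1900", "1981")
current = display_name("New Name County", None, None)
```

Because a half-specified span is rejected outright, a unique index on the concatenated string would never have to distinguish "1900 to unknown" from "1900 to 1981".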
Seems OK - it would be nice if we had "begin" dates for all the current stuff - but there is zero chance of that? It seems better than what we are currently doing (nothing) to delineate between current and "old" geography. |
@mkoo @sharpphyl who else cares a lot about geography? |
Issue Documentation is http://handbook.arctosdb.org/how_to/How-to-Use-Issues-in-Arctos.html
Is your feature request related to a problem? Please describe.
Geography isn't temporally stable.
Describe what you're trying to accomplish
Avoid mixing actual spatial/geography problems with "some entity moved some border" problems.
Describe the solution you'd like
Describe alternatives you've considered
Keep publishing cruddy data which isn't what users think it is.
Additional context
#1889 (comment) spawned a comment
https://en.wikipedia.org/wiki/Cibola_County,_New_Mexico
We have similar data in several contexts - Yugoslavia, Soviet Union, Kenya, etc.
Option One would allow us to say "Valencia County, before Cibola County was carved out." That model is capable of accurately modeling the data, but I don't think it's usable - it would essentially require data entry personnel to know which map the collector was looking at, and would likely require us to review thousands of records per year.
Option Two would perhaps only require shifting our viewpoint from seeing geography as authoritative to viewing it as curatorial assertions which might be useful in determining coordinates, which could be used to pull current and actual geography from some webservice.
Option Three: Get @tucotuco to fix this for us....