Change term - MaterialSample #314
Comments
I do not agree with this proposal. I think a better approach is to embrace I do agree that the definition of I feel strongly that the DwC class In summary, I agree with the need and justification for a change in DwC to reconcile these terms, but I think the main change should be in the terms |
Agree with @deepreef here. The DINA consortium is in the midst of modelling this and have come to the realization that a catalogued object (= Physical Specimen, Physical Entity) is an instance of a |
@dshorthouse : 100% agreement on all of this. We likewise came to the exact same conclusions (including the move of Obviously it must be true what they say: Great minds think alike. (Or, perhaps, feeble minds think alike? Probably both, and the challenge is figuring out which this represents...) Incidentally, I would add to the list Another conundrum is how to apply So many questions.... |
And yet another conundrum. What is going on with |
If we extend these realizations to their logical conclusion, we have a problem in how we expect our specimen-based data to be interpreted in the context of an The exchange networks of duplicates distributed among herbaria is a concrete example of this. One plant clipped into five pieces, prepared and mounted, each sheet then shipped 'round the world to 5 herbaria. In reality, that's one |
This is a long-standing problem and not just for herbaria. Mammal occurrences end up at different institutions or collections when skins, skeletons and genetic material get separated over the years. |
See ArctosDB/arctos#1966 for another side of the story |
I believe Arctos has all of the "pigeonholing problems" mentioned in this thread. https://arctos.database.museum/guid/UAM:ES:4588 seems to meet some definitions of "FossilSpecimen" and PreservedSpecimen, and is also cataloged as https://arctos.database.museum/guid/UAM:Mamm:53942. Many things in herbaria are "LivingSpecimen" pending a little water and sunlight. catalogNumber and otherCatalogNumbers seem closer to Occurrence than MaterialSample to me, but we could easily map through one more denormalization. (We do have "MaterialSample otherCatalogNumbers" but I don't think they're exposed via DWC.) https://arctos.database.museum/guid/MVZ:Egg:10460 is more or less another example of "rocks with multiple embedded fossils."
We have "there was never a physical part" and "someone says there were physical parts, but they are permanently unavailable for various reasons." I do not see much functional distinction.
Yep! |
Nit-picky, but by "occurrence" here, you mean |
Nit-picky, but correct. |
So eDNA are Personally I think a larger community discussion needs to happen around For the Machine Observations TDWG group, especially for biologging data we are using basisOfRecord to distinguish between observations of an animal where the animal is in hand and having a tag placed on it ( |
Another issue we have grappled with - ArctosDB/arctos#2075 or not finished grappling with.... |
@albenson-usgs If we're strict about the definition of an As for the practicality of where terms are placed in the DwC classes, it has to do with the operational identifiers we attach to these items and what is their cardinality within our collection management systems. If |
Agree with @dshorthouse. This is highly relevant, as my institution is in the process of setting up an environmental sample/eDNA repository in Arctos, similar to an existing repository at the University of Alaska Museum of the North (https://arctos.database.museum/SpecimenSearch.cfm?guid_prefix=UAM%3AEnv). We are considering including all derived taxonomic IDs and genetic sequences under a single catalog number, as having all been derived from the same occurrence (water sample, soil sample). Alternatively, we could catalog each unique taxonomic OTU separately, and link it back to the original source catalog item via URL relationships. The latter is entirely feasible but much more complex, especially if there are hundreds of OTUs that result from a single eDNA sample. What we really need is a way to designate the original source sample, e.g. the water or soil, with a unique source identifier similar to a dwc:organism ID. |
Hokay... where to begin? (Note to @timrobertson100: Now is the time to go get that cup of tea...) So, I first climbed into this rabbit hole several years ago, when I started minting DarwinCore began as a way for the Museum community to share data about preserved specimens (fun fact: the term is credited to Allen Allison, who apparently blurted it out by mistake when he meant to say "Dublin Core" at a ZBIG meeting - or so he tells me). Thus, the original implied Somewhere along the way, what we used to think of as "specimens" now became "occurrences", as if they were congruent concepts. But of course, specimens are physical entities with all sorts of properties important to the people who care for them (such as (I trust @tucotuco or @stanblum or someone else active in early DwC activities will correct any errors in this historical synopsis...) I've continued to stare at my ceiling late at night (more often than I should probably admit) pondering the essence and meaning of A lot of the discussion above focuses on the boundary between Much more challenging (for me, at least), is defining the boundary between This distinction (death vs. disintegration) comes into play when trying to understand the boundary between an instance of This post is already too long (even by my standards), so rather than regurgitate all my thinking on this, I'll close by providing a use case, and some follow-up questions. Use case: I think most would agree that the living bird flying across the field is an instance of an That's the easy part. But here are the questions to consider:
I have my own thoughts on answers to these (and other) questions, but obviously this post is already WAY too long! Note: several more posts came in as I was writing this, and I continue to agree 100% with the assertions of @dshorthouse. |
@deepreef Regardless of the persistence of the Organism, the "identifier" associated with this organism absolutely has to persist as a linking parent identifier with all subsequent derived parts and preservations, material sample or otherwise, including and especially, parasites and tissues and sequences and media that are deposited in other collections and institutions and repositories, to track these back to the source organism and occurrence. This is also true for source/parent material such as soil/water etc. for eDNA, which technically is not an "organism" but which also has the same need to track parent/child relationships from a source collection object and occurrence. |
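As a rough illustration of the parent-identifier idea in the comment above, here is a minimal Python sketch of tracing a derived item back to its source. Everything in it is an assumption for illustration only: the `Material` class, the `parent_id` field, and the identifiers are hypothetical, and nothing here is an existing Darwin Core term or an Arctos structure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Material:
    """Hypothetical record for an organism, source sample, or derived part."""
    material_id: str           # persistent identifier for this item
    parent_id: Optional[str]   # hypothetical link to the item it was derived from
    kind: str                  # free-text label, for readability only


def trace_to_source(material_id: str, records: Dict[str, Material]) -> List[str]:
    """Follow parent links upward until an item with no parent is reached."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = material_id
    while current is not None:
        if current in seen:
            # Guard against accidental loops in the parent links.
            raise ValueError(f"cycle in parent links at {current}")
        seen.add(current)
        record = records[current]
        chain.append(record.material_id)
        current = record.parent_id
    return chain


# Made-up example data: one organism with derived parts, and one water
# sample (not an organism) with a derived eDNA extract.
records = {
    m.material_id: m
    for m in [
        Material("org-1", None, "organism"),
        Material("skin-1", "org-1", "skin"),
        Material("tissue-1", "org-1", "tissue"),
        Material("seq-1", "tissue-1", "sequence"),
        Material("water-1", None, "water sample"),
        Material("extract-1", "water-1", "eDNA extract"),
    ]
}

print(trace_to_source("seq-1", records))
print(trace_to_source("extract-1", records))
```

The same lookup works whether the root is an organism or a soil/water sample, which is the point made above: the need to track parent/child relationships does not depend on the source being an "organism".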
@campmlc :
I ABSOLUTELY agree! I was focused more on the conceptual entity of the |
I'm not following this. An occurrence is the observation of a taxon at a place and time. What you are talking about here (to me) is an event. The occurrences are the OTUs or taxa that you detected in the event. For me I can't understand how this is one occurrence. This is an event with many occurrences.
I don't understand why you wouldn't use an eventID for this. The event being a sample of water collected at a place and time. |
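To make the "one event, many occurrences" reading concrete, here is a minimal Python sketch. The keys `eventID`, `eventDate`, `samplingProtocol`, `occurrenceID`, and `scientificName` are Darwin Core term names; the identifier values, the date, the taxa, and the `group_by_event` helper are made up for illustration and do not come from any real dataset.

```python
from collections import defaultdict
from typing import Dict, List

# One sampling event: a water sample collected at a place and time.
event = {
    "eventID": "evt-water-001",
    "eventDate": "2021-06-01",
    "samplingProtocol": "eDNA water sample",
}

# Each taxon detected in that sample is its own occurrence,
# pointing back to the shared event through eventID.
occurrences = [
    {"occurrenceID": "occ-1", "eventID": "evt-water-001", "scientificName": "Salmo trutta"},
    {"occurrenceID": "occ-2", "eventID": "evt-water-001", "scientificName": "Puma concolor"},
    {"occurrenceID": "occ-3", "eventID": "evt-water-001", "scientificName": "Daphnia pulex"},
]


def group_by_event(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Collect occurrenceIDs under the eventID they refer to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        grouped[row["eventID"]].append(row["occurrenceID"])
    return dict(grouped)


print(group_by_event(occurrences))
```

Under this reading the water sample needs no catalog number per OTU; the shared `eventID` is what ties the detections together.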
Ok but the |
Aha! What I think you mean here is an event with many "things". You can't have an occurrence without an event - they are inextricably linked. It is equally incongruous to imagine an occurrence with many events unless we invoke quantum entanglement. The nut @deepreef is getting us to crack is what are these "things"? We could call them |
But the things are not |
Yep - https://dwc.tdwg.org/terms/#occurrence
I agree - the trouble is, we don't do a good job of this. Events don't have identifiers that are shared by everyone and it is VERY easy to end up with multiple interpretations of a single event. |
Yes -- this is another one of the conundrums. Per DwC definition of
This is another example of a term currently nested within the But that does not address your point, which brings in
I don't view samples of water or soil as "Events", any more than I view specimens as "Events". This diagram of Darwin-SW is very helpful, I think, in showing the semantic relationships among many of the core DwC classes. Unfortunately, it doesn't include a node for In summary, I would definitely conceptualize Food for thought: take my use-case of bird, and imagine that prior to flying across the field and being hit by a car, it lived in a Zoo. Was it a What I'm trying to get at is the "essence" of an instance of |
This is an interesting one & permit my adventurous thought experiment. What if your eDNA sample came from a river? And, after the data are worked-up, you get "cougar" as a hit among all the other microorganisms. What the heck? Turns out, through radio collar data, you discover that there's another |
Indeed! And I guess that would start with someone stepping up to lead it (...he writes, as he quickly crouches down behind the desk and starts crawling towards the back exit... ;-) ) Seriously, though -- I would be delighted to get up at 3am (or whatever time it is in Hawaii) every single week to actively participate in such a Task Group (yes, seriously!), but I am absolutely not the right person to lead it (unless the intention is to guarantee that it languishes). |
Given that I (pretty much necessarily must) participate in every Darwin Core-related Task Group that gets chartered, I do not have the bandwidth to lead it either.
…On Wed, Jun 9, 2021 at 11:03 PM Richard L. Pyle ***@***.***> wrote:
With all this agreement it might be a good time to start thinking about
the scope of a Task Group. ;-)
|
Sigh, says the person who started this whole thing..... I have never led a TDWG task group, so first I have to figure out the rules/process.... |
A MaterialSample is a physical object, intended to be representative of some physical thing in the world of interest to someone, on which observations can be made. In this view, a physical object is a thing composed of matter that has some defined boundaries. |
Arriving very late in this discussion ... I wonder if it would be helpful to be more clear about the use of, and distinction between, the terms sample and specimen.
Some material samples are preserved and curated, so they are also specimens. Some samples are not. @smrgeoinfo has alluded to these issues a couple of times above. |
@dr-shorthair -- that distinction between sample and specimen is new to me, but I think it works (sorry... what is GLAM context?). If I understand, the idea is there can be samples that are not specimens, and specimens that are not samples. Some specific examples would help, particularly for a 'specimen' whose utility is not dependent on its relationship to some context (feature of interest) in the world. |
GLAM - Galleries, Libraries, Archives and Museums. This clarification of 'specimen' was given to me by Dimitris Koureas. |
In my mind, many of these words do not have clear definitions, and so I lump them (including "specimen" and "sample") as 1:1 synonyms with First of all, we'd need to add a third subclass along the lines of 'aggregate', to represent those instances of Second, there are other axes around which instances of Third, the word 'specimen' has different meanings that run counter to the distinction of 'specimen |
I suppose pretty much every 'specimen' in a museum or gallery is representative of an artist's body-of-work, or a school-of-art, or a civilization, or similar, and is likely to be catalogued in this way. In SSN/SOSA the predicate is sosa:isSampleOf and this is pretty much the only required property of a sosa:Sample. Of course in RDF under the OWA a property instance may be missing from a specific dataset or graph, even though the Ontology says there must be one. That might be because the thing that it is a sample of is not yet clear. I'm not at all saying that you should abandon your current terminology - I know that sample/material-sample/specimen are often used synonymously, and I used the word 'Specimen' as a synonym for 'material sample' in ISO 19156 (O&M v2) (I would change that if I had a do-over). I'm just suggesting that it is worth recognising that sampling and curation are separate concerns, and that thinking about these concerns separately might help with some of the definitional challenges. |
I wholeheartedly agree with this, and I think this is in many ways the crux of the discussion that needs to follow with respect to the Task Group. I had written a summary of my take on the key issues related to the Task Group scope and agenda (#358), but I apparently either posted it in the wrong place, or perhaps failed to post it entirely. |
From usage point of view, the questions might be
Is there something else (besides sampling procedures, responsible parties, preparation procedures, locations) that we need to identify in this domain? There is a different (but overlapping) information model for each of these 'resources' (kinds of things). |
Having (eventually) read (most) of the contributions to this discussion, and agreeing with many of them, I'd just like to come back to users and what will result at the front end of aggregators where these data wind up.
I'd like to bring us back to some real questions we get from users of aggregators or staff at our institutions Discussions like these (whilst interesting) can easily drift into such complexity that we lose sight of the fairly straightforward ways in which users may want to interact with the data. The "I just want to ... " usecase. Regardless of how this discussion ends up, can I put in a plea for the least-skilled user here? Or even for the user who is, perhaps, an ecologist or a population modeller or a government bureaucrat who "just wants some data". They might not know to look for "material sample" when they just want to get data about some specimens in a particular collection; or might not be too concerned about whether there's a boundary between an organism and a taxon because they just want to find that feather, and they know the record is there somewhere. Or for the collection manager who isn't going to sit and read 128 comments to find out how to map the things they call "specimens" out of their collection management system and into DarwinCore? We might not be there yet, but could we please eventually circle back to how will real-life users navigate these concepts in GBIF, or ALA, or wherever, so that users can intuitively find what they're looking for? |
Very well said, @elywallis !!! I completely agree. Indeed, a large part of my interest in this topic ( While I have probably been more guilty than anyone else in terms of endlessly waxing philosophical/conceptual, my real motivation here is driven by very practical needs. We at Bishop Museum are in the early stages of an informatics renaissance, including the harmonization of digitization and data management among several major natural history collections as well as cultural and library/archives collections. A great deal of what I hope the Task Group will accomplish is to help me get to a place where our data management system can easily answer exactly these kinds of real-world questions. One final note in defense of the philosophical/conceptual perspective, though: I have learned over the years that constructing solutions optimized for solving real-world problems that are right in front of us sometimes (often?) accomplishes short-term gain in exchange for long-term pain. Having endured a great deal of that pain, I can see how thinking this stuff through carefully can lead to the development of systems that not only easily answer the questions that are right in front of us, but also the many countless questions we haven't even thought to ask (...yet). [That is...short-term pain in exchange for long-term gain...] |
I woke up in the middle of the night thinking about how DarwinCore is NOT structured for collection management or sample discovery. The two comments above make me feel that even more. |
Very true! And the irony is, when creating a collections management system from scratch, one of the (many) considerations designers try to accommodate is, "How does this translate to Darwin Core?" Rolling-up relational concepts into a philosophical |
Darwin Core is not structured. When we give courses on Darwin Core we try to hammer this in. It is a bag of terms that we hope to define well enough so that it can be reused in lots of contexts (including to define fields in databases). The confusion arises because the most popular (but not only) way to share data "in Darwin Core" is through Darwin Core Archives and the very few structures supported through "cores" and "extensions" defined by XML files on rs.gbif.org. |
I think I need one of these.... |
I would begin at Chapter 0 on https://github.com/tdwg/dwc-qa/wiki/Webinars. |
With the caveat from @tucotuco that DwC is not intended to be structured (not entirely true, or it wouldn't have defined classes)...
So... even though DwC was born in the context of sharing specimen data, I think the community at the time felt that the most valuable output from sharing/aggregating specimen data was the ability to document the occurrence of organisms in space and time (specifically, via collecting event data associated with those specimens). In that context, the large body of unvouchered observational records (especially birds) represented an opportunity to add additional "meat" to these patterns of organisms occurring in space and time. Thus, the notion of a "specimen" as the basis of record became I think that was a step in the right direction. However, the part that never sat well with me was the notion of Specimen=Occurrence, which seemed to pervade TDWG-land for a number of years, and underpins the point raised by @dshorthouse quoted above. Later, with the near-simultaneous introduction of Unfortunately (but predictably), there was a period of several years when most TDWG-folk weren't quite sure how best to implement the concepts represented by these two new DwC classes/concepts (especially given that one of them -- This very unusual year (for a few reasons) in DwC is, I think, in part a result of some sort of "awakening" among a critical mass of data content providers in how the "power" of these new classes (
Indeed, So, in the example from @dshorthouse above, a single phew... OK, I obviously allowed my philosophical/conceptual waxer out of his cage for longer than I probably should have, but there you go. |
The previous post was way too long, so I decided to break this out as a follow-up. I wanted to call out a different "tension" that I think underpins a lot of the conflict/confusion on this issue: The tension between the needs of collection object managers, and the needs of biodiversity researchers. Obviously, many of us are primarily focused on managing physical objects in a collection (
My sense is that these and other recent discussions are getting us close to (1), which is what all the philosophical/conceptual mumbo jumbo is about. After that reaches some sort of stable(ish) asymptote, then the real challenge of evolving our information management software (2) begins. My sincere hope is that we can make real progress on (1) without breaking existing software and protocols -- but I've often been accused of being overly optimistic. |
@deepreef this is exactly what I have been thinking - see ArctosDB/arctos#3630 (comment) |
A great summary of the issue :) |
This issue has been superseded by #451 |
Change term
From https://dwc.tdwg.org/terms/#materialsample
From https://dwc.tdwg.org/terms/#livingspecimen
Given the above, we propose that MaterialSample should be more specific to something less than what might be considered a "voucher" in order to delineate it from PreservedSpecimen.
Proposed new attributes of the term:
Note: all of the above is my interpretation of the Arctos Working Group conversation.