How can/should one introduce Context into FAIR data? #16

Open
hrzepa opened this issue Jul 3, 2017 · 8 comments

Comments

hrzepa commented Jul 3, 2017

I have heard criticism of FAIR as representing only four of the five essential attributes of data. The missing component is “context”. Without the back story associated with the data, it is impoverished.

Arguably, the FAIR metadata can provide link(s) to such back stories, but is this sufficient, and should other mechanisms, such as perhaps EventData, be promoted as well?

dr-shorthair commented Jul 12, 2017

Yes. Context could also be generalized as 'linked' or 'connected'. This is a weakness or gap in the current FAIR gamut.

@micheldumontier

The FAIR principles indicate that reuse is enabled with detailed provenance (R1.2), and this encompasses context.

@CaroleGoble

Is context just provenance? IMHO, no.

As soon as one is working with the datasets and OTHER assets arising from a range of studies and experiments, you are not just talking about provenance, nor are you just talking about data as one dataset. Or even data at all. In the FAIRDOM Systems Biology asset management platform we link together data, models, SOPs, workflows, samples, publications, etc., all around the ISA model. The entire compound "Research Object" is FAIR, as well as the individual components within.
See http://www.fair-dom.org, and

Wolstencroft K, Krebs O, Snoep JL, Stanford NJ, Bacall F, Golebiewski M, Kuzyakiv R, Nguyen Q, Owen S, Soiland-Reyes S, Straszewski J, van Niekerk DD, Williams AR, Malmström L, Rinn B, Müller W, Goble C. FAIRDOMHub: a repository and collaboration environment for sharing systems biology research. Nucleic Acids Res, 45(D1): D404-D407. DOI: 10.1093/nar/gkw1032 (2016)

The Research Object (http://www.researchobject.org) approach is all about metadata manifests that retain context and relate components, whether those components are scattered across external resources or packaged in containers like Docker or even zip files. By having FAIR Research Objects, rather than just "data", we get context.

Belhajjame K, Zhao J, Garijo D, Gamble M, Hettne K, Palma R, Mina E, Corcho O, Gómez-Pérez JM, Bechhofer S, Klyne G, Goble C. Using a suite of ontologies for preserving workflow-centric research objects. J. Web Sem. 32: 16-42. doi:10.1016/j.websem.2015.01.003 (2015)
and
Chard K, D'Arcy M, Heavner B, Foster I, Kesselman C, Madduri R, Rodriguez A, Soiland-Reyes S, Goble C, Clark K, Deutsch EW, Dinov I, Price N, Toga A. I'll Take That to Go: Big Data Bags and Minimal Identifiers for Exchange of Large, Complex Datasets. IEEE Intl Conf on Big Data. doi:10.1109/BigData.2016.7840618 (2016)
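
To make the "metadata manifest" idea concrete, here is a minimal Python sketch. It is not the Research Object specification or the FAIRDOM data model: the field names (`aggregates`, `relations`), the predicates, and the example.org locations are all assumptions invented for illustration. It only shows how one record can aggregate different kinds of assets and state how they relate, so that the context travels with each component.

```python
def build_manifest(study_id, assets, relations):
    """Bundle assets and the relations between them into one manifest.

    assets:    dict mapping asset id -> {"type": ..., "location": ...}
    relations: list of (subject_id, predicate, object_id) tuples
    All field names and predicates here are hypothetical.
    """
    known = set(assets)
    for subject, _, obj in relations:
        missing = {subject, obj} - known
        if missing:
            raise ValueError(f"relation refers to unknown asset(s): {sorted(missing)}")
    return {
        "id": study_id,
        "aggregates": [{"id": aid, **info} for aid, info in sorted(assets.items())],
        "relations": [
            {"subject": s, "predicate": p, "object": o} for s, p, o in relations
        ],
    }


def context_of(manifest, asset_id):
    """Return every relation in the manifest that mentions asset_id."""
    return [
        r for r in manifest["relations"]
        if asset_id in (r["subject"], r["object"])
    ]


# Hypothetical study with three assets of different types.
manifest = build_manifest(
    "study-1",
    assets={
        "data-1": {"type": "dataset", "location": "https://example.org/data/1"},
        "model-1": {"type": "model", "location": "https://example.org/models/1"},
        "sop-1": {"type": "SOP", "location": "https://example.org/sops/1"},
    },
    relations=[
        ("data-1", "producedUsing", "sop-1"),
        ("model-1", "fittedTo", "data-1"),
    ],
)

for relation in context_of(manifest, "data-1"):
    print(relation["subject"], relation["predicate"], relation["object"])
```

Asking for the context of `data-1` returns both the SOP that produced it and the model fitted to it, which is information that per-dataset provenance alone would only partly capture.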

@CaroleGoble

Another comment - in Systems approaches in particular we are crossing the boundaries of different types of data, as is the case in, say, polyomic studies. Thus we very much need to retain the context of how data are related to each other in a study. In project data management workflows, this linkage is too often broken when the sub-datasets are split up and deposited into type-specific, siloed, public archives. FAIRDOM (above) tackles this from the start for poly-asset projects through a FAIR metadata layer. BioStudies kind of does this too. The DTL FAIRification platform (DataFAIRPoints and Fairifier) attempts to recover this retrospectively.

@micheldumontier

The research object approach is a perfectly fine way to bundle things together and provide the metadata that you need to understand what those objects are, and, as you say, the context for those objects. While we might disagree on whether the provenance of a digital object fully encompasses the context from which it was produced, we should agree that context is covered by R1. meta(data) have a plurality of accurate and relevant attributes.

dr-shorthair commented Jul 31, 2017

we should agree that context is covered by R1. meta(data) have a plurality of accurate and relevant attributes.

I agree that you could shoehorn context here, but it would be helpful to have it made more explicit.

I'm looking for a bridge from FAIR to the 5th star of the W3C's Linked Open Data principles here - e.g. http://5stardata.info/en/
If your data is linked into a bigger 'graph' then it is more useful. This requires cross-references and (hyper-)links, not just 'a plurality of attributes'.
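
The difference between 'a plurality of attributes' and links into a bigger graph can be sketched in a few lines of Python. Everything here is hypothetical: the identifiers, the record layout, and the `links` field are assumptions made for illustration, not part of FAIR or of the 5-star scheme. The point is only that attributes describe a record in isolation, while cross-references let you walk from one record to everything related to it.

```python
from collections import deque

# Hypothetical metadata records. "attributes" describe a record on its own;
# "links" are cross-references to other records by identifier.
records = {
    "dataset:A": {"attributes": {"title": "Transcriptomics run"},
                  "links": ["sample:S1", "study:X"]},
    "dataset:B": {"attributes": {"title": "Proteomics run"},
                  "links": ["sample:S1"]},
    "sample:S1": {"attributes": {"label": "Sample 1"},
                  "links": ["study:X"]},
    "study:X": {"attributes": {"title": "Polyomic study"},
                "links": []},
    "dataset:C": {"attributes": {"title": "Unlinked table"},
                  "links": []},
}


def connected(records, start):
    """Return the ids of all records reachable from start via links.

    Links are treated as undirected, so a record is also connected to
    anything that points at it. Links to unknown ids are ignored.
    """
    neighbours = {rid: set() for rid in records}
    for rid, record in records.items():
        for target in record["links"]:
            if target in neighbours:
                neighbours[rid].add(target)
                neighbours[target].add(rid)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


print(sorted(connected(records, "dataset:A")))
print(sorted(connected(records, "dataset:C")))
```

`dataset:C` has attributes just like the others, but with no links it has no recoverable context: nothing can be reached from it.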

micheldumontier commented Jul 31, 2017 via email

dr-shorthair commented Aug 1, 2017

Yes. I had temporarily overlooked that (though had already slotted it into our rating framework - https://confluence.csiro.au/display/OZNOME/Data+ratings ).

Unfortunately I find the groupings in FAIR to be less than ideal, so in some cases the chief concern is smeared over more than one FAIR principle - for example, 'findable' overlaps with R1, F2, F3, and 'useable' with I2 and R1.3.

Maybe it's because our focus is on data, rather than metadata?
