This repository has been archived by the owner on Oct 21, 2024. It is now read-only.
index.html
<!DOCTYPE html>
<html lang="en" style="padding-top: 3.25rem; scroll-behavior: smooth;">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/src/favicon.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Vite + Lit App</title>
<script type="module" src="/src/index.ts"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bulma@0.9.3/css/bulma.min.css">
<style>
body {
margin: 0px;
}
/* nfdi-navbar, nfdi-footer, nfdi-toc, nfdi-body {
--element-background-color: rgb(10, 12, 16);
--element-text-color: white;
--link-color: #4FB3D9;
--link-hover-color: #84cae4;
--header-color: white;
--outside-background-color: #191919;
--accent-text-color: #1FC2A7
} */
nfdi-toc, nfdi-body {
--outside-background-color: rgb(240, 243, 246);
--element-background-color: #ffffff;
--element-text-color: #0E1116;
--header-color: rgb(10, 12, 16);
--accent-text-color: rgb(31, 194, 167);
--link-color: #4FB3D9;
--link-hover-color: #8ad3ee;
}
/* nfdi-toc, nfdi-body {
--element-background-color: rgb(46,62,80);
--element-text-color: white;
--link-color: rgb(79, 179, 217);
--link-hover-color: #8ad3ee;
--header-color: white;
--outside-background-color: whitesmoke;
--accent-text-color: rgb(31, 194, 167)
} */
thead tr th, strong {
color: var(--accent-text-color) !important
}
a {
color: var(--link-color, #4FB3D9) !important;
}
a:hover {
color: var(--link-hover-color, #3A3A3A) !important;
}
thead {
font-size: 1.2rem;
}
</style>
</head>
<body>
<nfdi-navbar>Test</nfdi-navbar>
<nfdi-body class="content" hasSidebar="true">
<!-- sidebar -->
<div slot="searchbar">
<link href="/_pagefind/pagefind-ui.css" rel="stylesheet">
<script src="/_pagefind/pagefind-ui.js" type="text/javascript"></script>
<div id="search"></div>
<script>
window.addEventListener('DOMContentLoaded', (event) => {
new PagefindUI({ element: "#search" });
});
</script>
</div>
<nfdi-sidebar-eleneo slot="sidebar">
<a href="/">Theory</a>
<a href="/#" slot="child">Metadata</a>
<a href="/#what-is-metadata-amp-more" slot="child">What is metadata?</a>
<a href="/#where-does-metadata-come-from" slot="child">Where does metadata come from?</a>
<a href="/#why-do-i-benefit-from-metadata" slot="child">Why do I benefit from metadata?</a>
<nfdi-sidebar-eleneo slot="child">
<a href="/#what-tasks-are-important-for-rich-metadata">What tasks are important for rich metadata?</a>
<a href="/#collection" slot="child">Collection</a>
<a href="/#structuring" slot="child">Structuring</a>
<a href="/#sharing-and-curation" slot="child">Sharing and curation</a>
</nfdi-sidebar-eleneo>
</nfdi-sidebar-eleneo>
<nfdi-sidebar-eleneo slot="sidebar">
<a href="/src/nested/CodeTest.html">Tests</a>
<a slot="child" href="/src/nested/CodeTest.html" >CodeTest</a>
<a slot="child" href="/src/nested/TableTest.html">TableTest</a>
<nfdi-sidebar-eleneo slot="child">
<a href="/src/nested/HeaderTest.html">HeaderTests</a>
<a slot="child" href="/src/nested/HeaderTest.html#ipsum-1">Ipsum 1</a>
<a slot="child" href="/src/nested/HeaderTest.html#ipsum-2">Ipsum 2</a>
</nfdi-sidebar-eleneo>
<a slot="child">Emojis! ✨</a>
</nfdi-sidebar-eleneo>
<nfdi-sidebar-element slot="sidebar" isActive=true>
<div slot="title">Theory</div>
<h1 slot="inner" href="/">Metadata</h1>
<h2 slot="inner" href="/#what-is-metadata">What is metadata?</h2>
<h2 slot="inner" href="/#where-does-metadata-come-from">Where does metadata come from?</h2>
<h2 slot="inner" href="/#why-do-i-benefit-from-metadata">Why do I benefit from metadata?</h2>
<h2 slot="inner" href="/#what-tasks-are-important-for-rich-metadata">What tasks are important for rich metadata?</h2>
<h3 slot="inner" href="/#collection">Collection</h3>
<h3 slot="inner" href="/#structuring">Structuring</h3>
<h3 slot="inner" href="/#sharing-and-curation">Sharing and curation</h3>
</nfdi-sidebar-element>
<!-- <nfdi-sidebar-element slot="sidebar" isActive=true>
<div slot="title">Tests</div>
<h1 slot="inner" href="/src/nested/CodeTest.html">CodeTest</h1>
<h1 slot="inner" href="/src/nested/TableTest.html">TableTest</h1>
<h1 slot="inner" href="/src/nested/HeaderTest.html">HeaderTests</h1>
<h2 slot="inner" href="/src/nested/HeaderTest.html#ipsum-1">Ipsum 1</h2>
<h2 slot="inner" href="/src/nested/HeaderTest.html#ipsum-2">Ipsum 2</h2>
<h1 slot="inner">Emojis! ✨</h1>
<h1 slot="inner" href="/src/nested/FAIR.html">FAIR</h1>
</nfdi-sidebar-element> -->
<!-- content -->
<h1>Metadata</h1>
<!-- TOC -->
<nfdi-toc></nfdi-toc>
<!-- /TOC -->
<nfdi-h1>What is metadata & more!</nfdi-h1>
<p>Metadata is "data that provides information about other data"<sup><a href="https://www.merriam-webster.com/dictionary/metadata" title="Merriam-Webster definition of metadata">1</a></sup>. To put some (plant) life into this dictionary definition, let us explore metadata with a plant biology example:</p>
<blockquote>
Viola investigates the effect of the plant circadian clock on sugar metabolism in <em>W. mirabilis</em>. For her PhD project, which is part of an EU-funded consortium in Prof. Beetroot's lab, she acquires seeds from a South African botanical society. Viola grows the plants under different light regimes, harvests leaves from a two-day time series experiment, extracts polar metabolites as well as RNA, and submits the samples to nearby core facilities for metabolomics and transcriptomics measurements, respectively. After a few weeks of iterative consultation with the facility heads as well as the technicians and computational biologists involved, Viola receives a wealth of raw and processed data. From the data she produces figures and wraps everything up to publish the results in the <em>Journal of Wonderful Plant Sciences</em>.
</blockquote>
<p>Although oversimplified, every sentence in this example is packed with metadata. All data that describe the acquisition, transformation, analysis or handling of data "from greenhouse to publication", as well as before (where did the plants originate from?) and beyond (what else was deduced from the same dataset?), are considered metadata.
This simple example already illustrates many different types and levels of granularity, as well as common sources and the stakeholders in charge of the different tasks acting on metadata.</p>
<nfdi-h2 id="where-does-metadata-come-from-">Where does metadata come from?</nfdi-h2>
<p>Metadata arises during common research tasks, including</p>
<ul>
<li>project design (e.g., researcher, institute and project, biological context and research question, purpose of data collection),</li>
<li>experimental processes (e.g., origin and nature of the biological material, lab protocols), and</li>
<li>data-analytical processes (e.g., software, versions and dependencies employed).</li>
</ul>
<p>However, every project typically also produces</p>
<ul>
<li>technical (e.g., expected data volume, storage location and file formats),</li>
<li>bibliographic (e.g., author, publication date and title), and</li>
<li>legal or administrative (e.g., data origin and ownership, licensing, ethical aspects) metadata.</li>
</ul>
<p>The different types of metadata are collected from various analog and digital sources, including project grants, electronic lab notebooks, machines used for data acquisition, and software used for data analysis. They are provided by different contributors (stakeholders), including wet-lab biologists, core facilities, computational biologists, librarians and infrastructure providers.</p>
<nfdi-h2 id="why-do-i-benefit-from-metadata-">Why do I benefit from metadata?</nfdi-h2>
<p>Before going any deeper into the what and how, let us understand why we benefit from annotating our data with rich metadata. It helps to recap, along the [FAIR principles][KB-FAIR], how metadata makes your data comprehensible and usable for yourself and your peers.</p>
<ul>
<li><strong>Findable</strong>: As metadata describes the content of the data (e.g., what was examined, for what purpose, with what method), it is the basis for search engines and ideally makes the data categorizable for people and machines, and thus easier to search for and find.</li>
<li><strong>Accessible</strong>: Metadata provides information about the origin (e.g., persons, institutions) and the location of storage (e.g., repository) as well as access rights.</li>
<li><strong>Interoperable</strong>: Metadata identifies the software and file formats used for collection and processing, as well as any required conversions between file formats (e.g., from proprietary to open file formats).</li>
<li><strong>Reusable</strong>: If the three points above (F, A, I) are met, research data can be found, obtained and reused according to clear rules described in licenses.</li>
</ul>
<nfdi-h2 id="what-tasks-are-important-for-rich-metadata-">What tasks are important for rich metadata?</nfdi-h2>
<p>The diversity of metadata types, sources and stakeholders highlights that collecting metadata is rarely a linear one-person or one-time event, but occurs continuously and requires constant updates paralleling a project's development. Here we address a few tasks typically acting on metadata.</p>
<nfdi-h3 id="collection">Collection</nfdi-h3>
<p>Metadata stakeholders from different environments have different understandings of what metadata is required for comprehension of the annotated data. As plant biologists we probably agree that, when retrieving data from a public repository or publication, it is beneficial to know what type of measurement was performed on what species of plants. By contrast, a computational biologist and a librarian might emphasize the importance of the programming environment required to interpret a script or the contributing authors and licenses, respectively.<br>The more metadata the merrier: wouldn't it be great to capture <em>all</em> metadata about a project? Realistically, we can only collect a portion of it. To guide users on which metadata to collect, data experts from different domains have formulated these requirements into what is often referred to as "metadata standards" or "minimum information standards".
Examples of bibliographic and administrative metadata standards include <a href="https://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DublinCore</a> and <a href="https://schema.datacite.org">DataCite</a>. Prominent standards to annotate data relevant to different plant science domains are grouped under the "Minimum Information for Biological and Biomedical Investigations" (<a href="https://fairsharing.org/3518" title="MIBBI">MIBBI</a>) and define, e.g., minimum information about a high-throughput SEQuencing Experiment (<a href="https://www.fged.org/projects/minseqe" title="MINSEQE">MINSEQE</a>), a Proteomics Experiment (<a href="http://www.psidev.info/miape" title="MIAPE">MIAPE</a>) or a Plant Phenotyping Experiment (<a href="https://www.miappe.org" title="MIAPPE">MIAPPE</a>). Many more metadata standards are available and can be explored at <a href="https://fairsharing.org/search?fairsharingRegistry=Standard" title="Standards at fairsharing.org">fairsharing.org</a>.<br>The metadata standards can be regarded as "checklists" which, when followed, ensure that the data is annotated with the metadata attributes required to make it comprehensible, at least in the current context.</p>
<nfdi-h3 id="structuring">Structuring</nfdi-h3>
<p>Comprehensibility also strongly depends on the use of language. And not just the spoken language itself (English, German, Swahili, etc.), but also the different meanings of chosen words. While for a librarian the "title" usually refers to a publication item (e.g. manuscript or book) with "contributors" being "authors", "title" in bio-laboratory routines may refer to the name of a funded project with multiple publications (and titles). In said project, "contributors" could include authors, but also other experimenters, data analysts and more. Likewise, the processes that come to a plant biologist's mind when thinking about "protocols" (growing plants, collecting material, extracting RNA) will be of little help to a computer scientist trying to retrieve data (via HTTP or FTP) from a database.
Furthermore, the librarian, the computer scientist and the plant biologist would probably have a very different approach to collect and represent the required metadata according to the intuition or conventions of their respective domain.
Two technical solutions, schemas and ontologies, help to overcome these ambiguities. Briefly, schemas are structured documents that formulate a clear representation of the (metadata) information, thereby making it both human-readable and machine-processable. Ontologies are controlled vocabularies or thesauri with defined relations between the terms that can be used to fill the schema. In this way, schemas define where to find the information and ontologies define how to interpret it.
The use of schemas combined with ontologies is well established for bibliographic and administrative metadata, where high-level and oftentimes generic metadata relevant to most scientific domains is described. In fact, the two examples of <a href="https://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DublinCore</a> and <a href="https://schema.datacite.org">DataCite</a> are not just checklists, but well-defined metadata schemas. In more specialized scientific domains the use of schemas is much less established.
However, the metadata schema ISA (for investigation – study – assay) can be employed for the most versatile data types relevant in plant sciences. ISA allows intuitive, flexible and yet structured and conclusive data annotation with biological as well as bibliographic and administrative metadata. For more details, see [ISA][KB-ISA].</p>
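<p>The division of labour between a schema and an ontology described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the field names, the tiny vocabulary and the "EX:" term identifiers are hypothetical and are not taken from ISA or from any real ontology.</p>

```python
# Hypothetical schema: names the fields a metadata record must contain.
SCHEMA_FIELDS = ("organism", "measurement_type", "contributor_role")

# Hypothetical controlled vocabulary: maps free-text synonyms to one term ID.
# The "EX:" identifiers are made up for illustration.
VOCABULARY = {
    "rna-seq": "EX:0001",
    "transcriptomics": "EX:0001",
    "metabolomics": "EX:0002",
}


def validate(record):
    """Return the schema fields that are missing or empty in the record."""
    return [field for field in SCHEMA_FIELDS if not record.get(field)]


def annotate(record):
    """Return a copy of the record with the measurement type resolved to a term ID."""
    annotated = dict(record)
    key = record["measurement_type"].strip().lower()
    # None signals that the word is not part of the controlled vocabulary.
    annotated["measurement_term"] = VOCABULARY.get(key)
    return annotated


record = {
    "organism": "W. mirabilis",
    "measurement_type": "Transcriptomics",
    "contributor_role": "experimenter",
}
print(validate(record))
print(annotate(record)["measurement_term"])
```

<p>In this sketch the schema answers "where is the information?" (which fields exist), while the vocabulary answers "what does the entry mean?" (two different words resolve to the same term).</p>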
<nfdi-h3 id="sharing-and-curation">Sharing and curation</nfdi-h3>
<p>In order to make your data FAIR, it needs to be packaged with metadata and shared in a findable and accessible repository.
There are cases where metadata needs to be made accessible independently of the annotated data, e.g. for legal reasons such as dual-use, intellectual property rights or simply since the data is not yet published. In any case, there are at least as many routines for sharing or publishing metadata as there are for data. For more information please see [DataSharing][KB-DataSharing] and [Repositories][KB-Repositories].
During a project's lifetime, data and metadata continuously develop and require adaptations and updates. Proper versioning helps to keep an overview of the different stakeholders and tasks acting on metadata. Easy and traceable documentation of these processes can be achieved with general-purpose version control through [git][git].
As data and metadata develop, the need to harmonize metadata outputs may arise. If metadata is properly collected in a schema, converters can help to migrate between different metadata schemas or securely export information from one schema to another.</p>
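<p>What such a converter does can be sketched roughly as a field mapping. The following Python snippet is a minimal sketch, not a DataPLANT tool: the function <code>convert</code>, the mapping <code>FIELD_MAP</code> and all field names and values are hypothetical and chosen only to mirror the "title" and "contributors" example from the Structuring section.</p>

```python
# Hypothetical mapping from field names in a lab-style schema to field names
# in a bibliographic-style schema; both sets of names are illustrative only.
FIELD_MAP = {
    "project_title": "title",
    "contributors": "creator",
    "license": "rights",
}


def convert(record, field_map):
    """Split a record into converted fields and fields without a target."""
    converted = {}
    unmapped = {}
    for name, value in record.items():
        if name in field_map:
            converted[field_map[name]] = value
        else:
            # Keep what cannot be mapped so no information is lost silently.
            unmapped[name] = value
    return converted, unmapped


lab_record = {
    "project_title": "Circadian clock and sugar metabolism",
    "contributors": ["Viola"],
    "light_regime": "long day",
}
converted, unmapped = convert(lab_record, FIELD_MAP)
print(converted)
print(unmapped)
```

<p>Returning the unmapped fields separately is a design choice of this sketch: it makes visible which information the target schema cannot hold, instead of dropping it during migration.</p>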
<div style="page-break-after: always;"></div>
<nfdi-h1>Testing</nfdi-h1>
<nfdi-h2 id="how-does-dataplant-support-me-in-metadata-annotation-">How does DataPLANT support me in metadata annotation?</nfdi-h2>
<p>The following table gives an overview of DataPLANT tools and services related to metadata. Follow the link in the first column for details.</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Tasks on metadata </th>
</tr>
</thead>
<tbody>
<tr>
<td><strong><a href="https://github.com/nfdi4plants/ARC" title="ARC specifications">ARC</a></strong> <br> (Annotated Research Context)</td>
<td>Standard</td>
<td><strong>Structure:</strong> <ul><li>Package data with metadata</li></ul></td>
</tr>
<tr>
<td><strong><a href="https://github.com/nfdi4plants/Swate/wiki" title="Swate Wiki">Swate</a></strong> <br> (Swate Workflow Annotation Tool for Excel)</td>
<td>Tool</td>
<td><strong>Collect and structure:</strong> <ul><li>Annotate experimental and computational workflows with ISA metadata schema</li><li>Easy use of ontologies and controlled vocabularies</li><li>Metadata templates for versatile data types</li></ul></td>
</tr>
<tr>
<td><strong><a href="https://github.com/nfdi4plants/arcCommander/wiki" title="ArcCommander Wiki">ArcCommander</a></strong></td>
<td>Tool</td>
<td><strong>Collect, structure and share:</strong> <ul><li>Add bibliographical metadata to your ARC</li><li>ARC version control and sharing via DataPLANT's DataHUB</li><li>Automated metadata referencing and version control as your ARC grows</li></ul></td>
</tr>
<tr>
<td><strong><a href="https://git.nfdi4plants.org" title="ARC DataHUB">DataHUB</a></strong></td>
<td>Service</td>
<td><strong>Share:</strong> <ul><li>Federated system to share ARCs</li><li>Manage who can view or access your ARC</li></ul></td>
</tr>
<tr>
<td><del>Converter</del></td>
<td>Tool under construction</td>
<td><strong>Curate:</strong> <ul><li>Harmonize and migrate between metadata schemas</li><li>Manage who can view or access your ARC</li></ul></td>
</tr>
<tr>
<td><strong>Metadata registry</strong></td>
<td>Service</td>
<td><strong>Share:</strong> <ul><li>Find ARC (meta)data</li></ul></td>
</tr>
</tbody>
</table>
<nfdi-h3 id="dataplant-support">DataPLANT Support</nfdi-h3>
<p>Besides these technical solutions, DataPLANT supports you with community-engaged data stewardship. For further assistance, feel free to reach out via our <a href="https://support.nfdi4plants.org">helpdesk</a> or by contacting us <a href="mailto:dataplant@uni-kl.de?subject=DataPLANT%20Metadata">directly</a>.</p>
<div style="page-break-after: always;"></div>
<!-- Knowledgebase Cross-references -->
<ol>
<li>[KB-FAIR]: Link to knowledgebase article "FAIR principles"</li>
<li>[KB-ISA]: Link to knowledgebase article "ISA Model"</li>
<li>[KB-Repositories]: Link to knowledgebase article "Repositories"</li>
<li>[KB-DataSharing]: Link to knowledgebase article "Data Sharing"</li>
</ol>
<nfdi-h1>The Minimalist's ARC-QuickStart</nfdi-h1>
<!-- -->
<!-- Reference links -->
<!-- [EMBL-EBI]: https://www.ebi.ac.uk/services/all "EMBL-EBI repositories"
[NCBI]: https://www.ncbi.nlm.nih.gov/guide/sitemap/ "NCBI repositories" -->
</nfdi-body>
<nfdi-footer></nfdi-footer>
</body>
</html>