small changes to text #10

Merged
merged 1 commit into from
Jul 4, 2024
36 changes: 20 additions & 16 deletions doc/sdms_iog.html
@@ -4,7 +4,7 @@
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 2.0.20">
<meta name="generator" content="Asciidoctor 2.0.16">
<title>Technical documentation: Guidance for data centres contributing to SDMS</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700">
<style>
@@ -84,10 +84,10 @@
ul,ol,dl{line-height:1.6;margin-bottom:1.25em;list-style-position:outside;font-family:inherit}
ul,ol{margin-left:1.5em}
ul li ul,ul li ol{margin-left:1.25em;margin-bottom:0}
ul.square li ul,ul.circle li ul,ul.disc li ul{list-style:inherit}
ul.square{list-style-type:square}
ul.circle{list-style-type:circle}
ul.disc{list-style-type:disc}
ul.square{list-style-type:square}
ul.circle ul:not([class]),ul.disc ul:not([class]),ul.square ul:not([class]){list-style:inherit}
ol li ul,ol li ol{margin-left:1.25em;margin-bottom:0}
dl dt{margin-bottom:.3125em;font-weight:bold}
dl dd{margin-bottom:1.25em}
@@ -193,8 +193,7 @@
#content h1>a.link:hover,h2>a.link:hover,h3>a.link:hover,#toctitle>a.link:hover,.sidebarblock>.content>.title>a.link:hover,h4>a.link:hover,h5>a.link:hover,h6>a.link:hover{color:#a53221}
details,.audioblock,.imageblock,.literalblock,.listingblock,.stemblock,.videoblock{margin-bottom:1.25em}
details{margin-left:1.25rem}
details>summary{cursor:pointer;display:block;position:relative;line-height:1.6;margin-bottom:.625rem;outline:none;-webkit-tap-highlight-color:transparent}
details>summary::-webkit-details-marker{display:none}
details>summary{cursor:pointer;display:block;position:relative;line-height:1.6;margin-bottom:.625rem;-webkit-tap-highlight-color:transparent}
details>summary::before{content:"";border:solid transparent;border-left:solid;border-width:.3em 0 .3em .5em;position:absolute;top:.5em;left:-1.25rem;transform:translateX(15%)}
details[open]>summary::before{border:solid transparent;border-top:solid;border-width:.5em .3em 0;transform:translateY(15%)}
details>summary::after{content:"";width:1.25rem;height:1em;position:absolute;top:.3em;left:-1.25rem}
@@ -208,10 +207,13 @@
.admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere}
.admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0}
.exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px}
.exampleblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child{margin-bottom:0}
.sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px}
.sidebarblock>:first-child{margin-top:0}
.sidebarblock>:last-child{margin-bottom:0}
.sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center}
.exampleblock>.content>:first-child,.sidebarblock>.content>:first-child{margin-top:0}
.exampleblock>.content>:last-child,.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0}
.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0}
.literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em}
@media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}}
@media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}}
@@ -233,8 +235,9 @@
table.linenotable{border-collapse:separate;border:0;margin-bottom:0;background:none}
table.linenotable td[class]{color:inherit;vertical-align:top;padding:0;line-height:inherit;white-space:normal}
table.linenotable td.code{padding-left:.75em}
table.linenotable td.linenos,pre.pygments .linenos{border-right:1px solid;opacity:.35;padding-right:.5em;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none}
pre.pygments span.linenos{display:inline-block;margin-right:.75em}
table.linenotable td.linenos{border-right:1px solid;opacity:.35;padding-right:.5em}
pre.pygments .lineno{border-right:1px solid;opacity:.35;display:inline-block;margin-right:.75em}
pre.pygments .lineno::before{content:"";margin-right:-.125em}
.quoteblock{margin:0 1em 1.25em 1.5em;display:table}
.quoteblock:not(.excerpt)>.title{margin-left:-1.5em;margin-bottom:.75em}
.quoteblock blockquote,.quoteblock p{color:rgba(0,0,0,.85);font-size:1.15rem;line-height:1.75;word-spacing:.1em;letter-spacing:0;font-style:italic;text-align:justify}
@@ -271,7 +274,7 @@
table.frame-none>:last-child>:last-child>*,table.frame-sides>:last-child>:last-child>*{border-bottom-width:0}
table.frame-none>*>tr>:first-child,table.frame-ends>*>tr>:first-child{border-left-width:0}
table.frame-none>*>tr>:last-child,table.frame-ends>*>tr>:last-child{border-right-width:0}
table.stripes-all>*>tr,table.stripes-odd>*>tr:nth-of-type(odd),table.stripes-even>*>tr:nth-of-type(even),table.stripes-hover>*>tr:hover{background:#f8f8f7}
table.stripes-all tr,table.stripes-odd tr:nth-of-type(odd),table.stripes-even tr:nth-of-type(even),table.stripes-hover tr:hover{background:#f8f8f7}
th.halign-left,td.halign-left{text-align:left}
th.halign-right,td.halign-right{text-align:right}
th.halign-center,td.halign-center{text-align:center}
@@ -287,11 +290,10 @@
ul li ol{margin-left:1.5em}
dl dd{margin-left:1.125em}
dl dd:last-child,dl dd:last-child>:last-child{margin-bottom:0}
li p,ul dd,ol dd,.olist .olist,.ulist .ulist,.ulist .olist,.olist .ulist{margin-bottom:.625em}
ol>li p,ul>li p,ul dd,ol dd,.olist .olist,.ulist .ulist,.ulist .olist,.olist .ulist{margin-bottom:.625em}
ul.checklist,ul.none,ol.none,ul.no-bullet,ol.no-bullet,ol.unnumbered,ul.unstyled,ol.unstyled{list-style-type:none}
ul.no-bullet,ol.no-bullet,ol.unnumbered{margin-left:.625em}
ul.unstyled,ol.unstyled{margin-left:0}
li>p:empty:only-child::before{content:"";display:inline-block}
ul.checklist>li>p:first-child{margin-left:-1em}
ul.checklist>li>p:first-child>.fa-square-o:first-child,ul.checklist>li>p:first-child>.fa-check-square-o:first-child{width:1.25em;font-size:.8em;position:relative;bottom:.125em}
ul.checklist>li>p:first-child>input[type=checkbox]:first-child{margin-right:.25em}
@@ -334,6 +336,8 @@
#footnotes .footnote a:first-of-type{font-weight:bold;text-decoration:none;margin-left:-1.05em}
#footnotes .footnote:last-of-type{margin-bottom:0}
#content #footnotes{margin-top:-.625em;margin-bottom:0;padding:.75em 0}
.gist .file-data>table{border:0;background:#fff;width:100%;margin-bottom:0}
.gist .file-data>table td.line-data{width:99%}
div.unbreakable{page-break-inside:avoid}
.big{font-size:larger}
.small{font-size:smaller}
@@ -390,7 +394,7 @@
dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility}
h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em}
p strong,td.content strong,div.footnote strong{letter-spacing:-.005em}
p,blockquote,dt,td.content,td.hdlist1,span.alt,summary{font-size:1.0625rem}
p,blockquote,dt,td.content,span.alt,summary{font-size:1.0625rem}
p{margin-bottom:1.25rem}
.sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em}
.exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc}
@@ -437,7 +441,7 @@
<div id="header">
<h1>Technical documentation: Guidance for data centres contributing to SDMS</h1>
<div class="details">
<span id="revdate">2023-11-29</span>
<span id="revdate">2024-04-15</span>
<br><span id="revremark">Consolidated version following internal review.</span>
</div>
</div>
@@ -1807,7 +1811,7 @@ <h5 id="darwincorearchive"><a class="anchor" href="#darwincorearchive"></a>2.2.3
Recommendation
</td>
<td class="hdlist2">
<p>Darwin Core Archive is recommended for use for biodiversity data within SDMS.</p>
<p>Darwin Core Archive is recommended for use for biodiversity and associated data (e.g. experimental, measurements/traits, data derived from metabarcoding) within SDMS.</p>
</td>
</tr>
</table>
@@ -2139,7 +2143,7 @@ <h3 id="_outline"><a class="anchor" href="#_outline"></a>A.2. Outline</h3>
</div>
<div id="footer">
<div id="footer-text">
Last updated 2023-11-29 12:01:48 UTC
Last updated 2024-04-15 09:33:28 +0200
</div>
</div>
</body>
Binary file modified doc/sdms_iog.pdf
Binary file not shown.
52 changes: 26 additions & 26 deletions doc/sdms_iog_ch02-data-formats.adoc
@@ -4,48 +4,48 @@
[[introduction-3]]
===== Introduction

Most of the exchange mechanisms mentioned above transfer files. In order to properly understand the content of a file some use metadata is usually necessary.
File formats that embed use metadata (and also discovery metadata) are preferred (e.g. NetCDF/CF and Darwin Core Archive).
NetCDF in itself is not self describing, but NetCDF following the Climate and Forecast Convention is self describing.
Most of the exchange mechanisms mentioned above transfer files. In order to properly understand the content of a file some use metadata is usually necessary.
File formats that embed use metadata (and also discovery metadata) are preferred (e.g. NetCDF/CF and Darwin Core Archive).
NetCDF in itself is not self describing, but NetCDF following the Climate and Forecast Convention is self describing.
Adding the http://wiki.esipfed.org/index.php?title=Category:Attribute_Conventions_Dataset_Discovery[NetCDF Attribute Convention for Dataset Discovery] embeds full discovery metadata (e.g. originator/PI, constraints etc.) in the file.

When it is not possible to encode data as NetCDF-CF or Darwin Core Archive, data can be uploaded in a non-proprietary file format that is easy to consume for users (without specific software) accompanied by a detailed product manualfootnote:[There is currently no template for product manuals available. This is to be developed.] (in PDF format).
When it is not possible to encode data as NetCDF-CF or Darwin Core Archive, data can be uploaded in a non-proprietary file format that is easy to consume for users (without specific software) accompanied by a detailed product manualfootnote:[There is currently no template for product manuals available. This is to be developed.] (in PDF format).
_This approach cannot be used for SIOS Core Data_.

[[darwincorearchive]]
===== Darwin Core Archive
Darwin Core Archive is a file format much used within the biological community and in particular within biodiversity.
It is the backbone of the Global Biodiversity Information facility (GBIF) and the Ocean Biodiversity Information System (OBIS).
In essence it is a set of comma separated files (CSV) bundled with a metadata file (meta.xml) and using controlled vocabularies to describe the content.
A second file (eml.xml) describes the content and strucuture of the CSVs and links the column headers to the Darwin Core terms.
SDMS cannot do much on top of Darwin Core Archives (the diversity of types if data is too large), but the format is more or less FAIR compliant and is recommended for use within SDMS.
Further information is available at https://dwc.tdwg.org/terms/.
Darwin Core Archive is a file format much used within the biological community and in particular within biodiversity.
It is the backbone of the Global Biodiversity Information Facility (GBIF) and the Ocean Biodiversity Information System (OBIS).
In essence it is a set of comma-separated value (CSV) files bundled with a descriptor file (meta.xml) that describes the content and structure of the CSVs and links their columns to Darwin Core terms, with controlled vocabularies used to describe the content.
A second file (eml.xml) holds the dataset-level metadata, encoded in Ecological Metadata Language (EML).
SDMS cannot do much on top of Darwin Core Archives (the diversity of types of data is too large), but the format is more or less FAIR compliant and is recommended for use within SDMS.
Further information is available at https://dwc.tdwg.org/terms/.

[horizontal]
Recommendation::
Darwin Core Archive is recommended for use for biodiversity data within SDMS.
Darwin Core Archive is recommended for use for biodiversity and associated data (e.g. experimental, measurements/traits, data derived from metabarcoding) within SDMS.
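
The archive layout described above can be illustrated with a minimal sketch. It assumes the standard Darwin Core text layout, in which `meta.xml` names the core CSV file and maps each column index to a Darwin Core term. The archive contents, record values and helper names (`build_demo_archive`, `read_core_records`) are hypothetical and exist only for this illustration; extensions, `eml.xml` and non-default delimiters are deliberately ignored.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the Darwin Core text descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Hypothetical descriptor and data, kept inline so the sketch is self-contained.
META_XML = """
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="," ignoreHeaderLines="1"
        rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/decimalLatitude"/>
  </core>
</archive>
"""

OCCURRENCE_CSV = (
    "id,scientificName,decimalLatitude\n"
    "occ-1,Salvelinus alpinus,78.22\n"
    "occ-2,Gammarus setosus,78.93\n"
)


def build_demo_archive() -> bytes:
    """Zip the descriptor and the CSV together, as an archive would be shipped."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", META_XML)
        archive.writestr("occurrence.csv", OCCURRENCE_CSV)
    return buffer.getvalue()


def read_core_records(archive_bytes: bytes) -> list[dict[str, str]]:
    """Return the core rows keyed by the short Darwin Core term names."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        core = ET.fromstring(archive.read("meta.xml")).find(f"{DWC_TEXT_NS}core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        # Map column index -> last path segment of the term URI.
        columns = {
            int(field.get("index")): field.get("term").rsplit("/", 1)[-1]
            for field in core.findall(f"{DWC_TEXT_NS}field")
        }
        header_lines = int(core.get("ignoreHeaderLines", "0"))
        text = archive.read(location).decode(core.get("encoding", "UTF-8"))
    rows = list(csv.reader(io.StringIO(text)))[header_lines:]
    return [{name: row[index] for index, name in columns.items()} for row in rows]


records = read_core_records(build_demo_archive())
print(records)
```

Because the column meaning comes from the descriptor rather than from the CSV header, a consumer can interpret the data without prior knowledge of the provider's column naming.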

[[jsongeojsonjson-ld]]
===== JSON/GeoJSON/JSON-LD

JavaScript Object Notation (http://www.json.org/[JSON]) and the geographical extension http://geojson.org/[GeoJSON] of this is similar to NetCDF in that it is a container lacking standardised metadata.
JavaScript Object Notation (http://www.json.org/[JSON]) and its geographical extension http://geojson.org/[GeoJSON] are similar to NetCDF in that they are containers lacking standardised metadata.
http://json-ld.org/[JSON-LD] (JavaScript Object Notation for Linked Data) enables encoding of Linked Data using JSON.

There is currently no standardised FAIR compliant implementation of JSON for the types of data SDMS is handling.
There is currently no standardised FAIR compliant implementation of JSON for the types of data SDMS is handling.
The CF convention could be implemented in JSON, and there is international work pushing in this direction, but it is not yet mature enough.

IMPORTANT: SDMS is currently *not* able to consume JSON files as they are not sufficiently standardised across providers.

NOTE: SDMS should work to enable ACDD and CF elements in GeoJSON files when the new OGC API's emerge.
NOTE: SDMS should work to enable ACDD and CF elements in GeoJSON files when the new OGC APIs emerge.

[[netcdfcf]]
===== NetCDF/CF
NetCDF is a container like JSON and XML, and as such not a recommended file format for data within SDMS.
However, the Climate and Forecast convention constrains the degrees of freedom within NetCDF and enforces structures and application of controlled vocabularies to describe the content of the data.
CF-NetCDF is thus a FAIR compliant file format and recommended for use within SDMS.
However, even NetCDF/CF have too many degrees of freedom to allow higher orders services to be established for datasets.
Thus some further constraints on granularity and structures are recommended. NetCDF/CF is the backbone of the Earth System Grid Federation serving IPCC data, Copernicus Marine Environmental Monitoring Service (CMEMS), SeaDataNet and several other services.
The file format is recommended for meteorological, oceanographic, hydrological and glaciological data (although exceptions exist).
NetCDF is a container like JSON and XML, and as such not a recommended file format for data within SDMS.
However, the Climate and Forecast convention constrains the degrees of freedom within NetCDF and enforces structures and application of controlled vocabularies to describe the content of the data.
CF-NetCDF is thus a FAIR compliant file format and recommended for use within SDMS.
However, even NetCDF/CF has too many degrees of freedom to allow higher-order services to be established for datasets.
Thus some further constraints on granularity and structures are recommended. NetCDF/CF is the backbone of the Earth System Grid Federation serving IPCC data, Copernicus Marine Environmental Monitoring Service (CMEMS), SeaDataNet and several other services.
The file format is recommended for meteorological, oceanographic, hydrological and glaciological data (although exceptions exist).
Work is in progress within WMO to identify specific CF profiles for NetCDF for use within WMO.
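
As a rough illustration of what "self describing" means in practice, the sketch below checks a set of attributes for a few items that the CF and ACDD conventions ask for. It is an assumption-laden simplification: the attributes are held in plain dictionaries rather than read from a NetCDF file, the function name `check_metadata` and the demo values are hypothetical, and only a handful of rules are covered (the ACDD 1.3 "highly recommended" global attributes, a CF version in `Conventions`, and `units` plus a `standard_name` or `long_name` on each variable). It is not a substitute for a real compliance checker.

```python
# Global attributes that ACDD 1.3 lists as "highly recommended".
ACDD_HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")


def check_metadata(global_attrs: dict, variables: dict) -> list[str]:
    """Return human-readable problems; an empty list means no problem was found."""
    problems = []
    for name in ACDD_HIGHLY_RECOMMENDED:
        if not str(global_attrs.get(name, "")).strip():
            problems.append(f"global attribute '{name}' is missing or empty")
    if "CF-" not in str(global_attrs.get("Conventions", "")):
        problems.append("'Conventions' does not name a CF version")
    for var_name, attrs in variables.items():
        # Simplification: CF allows dimensionless variables without units.
        if "units" not in attrs:
            problems.append(f"variable '{var_name}' has no 'units'")
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(
                f"variable '{var_name}' has neither 'standard_name' nor 'long_name'"
            )
    return problems


# Hypothetical attributes, as they might be read from a file header.
demo_globals = {
    "title": "Hypothetical weather station series",
    "summary": "Hourly air temperature from an example station.",
    "Conventions": "CF-1.8, ACDD-1.3",
}
demo_variables = {
    "air_temperature": {"standard_name": "air_temperature", "units": "K"},
    "wind_speed": {"long_name": "wind speed"},
}

problems = check_metadata(demo_globals, demo_variables)
for problem in problems:
    print(problem)
```

In this demo the check reports the missing `keywords` attribute and the missing `units` on `wind_speed`, which is the kind of gap that prevents a file from being understood without asking the provider.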

[horizontal]
@@ -65,9 +65,9 @@ You can check your NetCDF file against the CF and ACDD conventions at https://si
[[wmo-bufr]]
===== WMO BUFR

Binary Universal Form for the Representation of meteorological data (BUFR) is a binary data format maintained by WMO.
Its main purpose is operational exchange of real time data and it is adapted for robust transfer on varying bandwidth connections.
Data that are supposed to be exchanged using WMO Global Telecommunication System (GTS) should be encoded in WMO BUFR.
Binary Universal Form for the Representation of meteorological data (BUFR) is a binary data format maintained by WMO.
Its main purpose is operational exchange of real time data and it is adapted for robust transfer on varying bandwidth connections.
Data that are supposed to be exchanged using WMO Global Telecommunication System (GTS) should be encoded in WMO BUFR.
BUFR is a table driven file format, implying that the format is not self explaining and the user has to have the correct table to understand the content.
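
The consequence of a table-driven format can be sketched as follows. This is a conceptual illustration only, not a BUFR decoder: real messages are bit-packed and use the official WMO tables. The two entries below are written in the style of BUFR Table B and should be treated as assumptions, and the function name `decode_value` is hypothetical. The scaling rule used, value = (raw + reference) / 10^scale, is the usual BUFR convention.

```python
from decimal import Decimal

# Illustrative excerpt in the style of BUFR Table B (descriptor -> meaning).
# The entries are assumptions for this sketch; consult the WMO tables for real use.
TABLE_B = {
    "012101": {"name": "air temperature", "unit": "K", "scale": 2, "reference": 0},
    "010004": {"name": "pressure", "unit": "Pa", "scale": -1, "reference": 0},
}


def decode_value(descriptor: str, raw: int, table: dict) -> tuple[str, Decimal, str]:
    """Turn a raw integer into a physical value using the table entry."""
    entry = table.get(descriptor)
    if entry is None:
        # Without the matching table entry the number has no meaning.
        raise KeyError(f"descriptor {descriptor} is not in the table")
    # value = (raw + reference) / 10**scale, computed exactly with Decimal.
    value = Decimal(raw + entry["reference"]).scaleb(-entry["scale"])
    return entry["name"], value, entry["unit"]


name, value, unit = decode_value("012101", 27315, TABLE_B)
print(name, value, unit)
```

The same raw integer decoded against a different or outdated table would yield a different quantity, unit or magnitude, which is why the correct table version must accompany the data.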

IMPORTANT: BUFR is, although being a standardised format, not recommended for data sharing within SDMS.
@@ -77,16 +77,16 @@ NOTE: SDMS will extract information in BUFR format and convert this to CF-NetCDF
[[wmo-grib]]
===== WMO GRIB

GRIdded Binary (GRIB) is a binary format maintained by WMO.
As BUFR, this format is best suited for real time exchange over WMO GTS.
GRIdded Binary (GRIB) is a binary format maintained by WMO.
Like BUFR, this format is best suited for real time exchange over WMO GTS.
It is also a table driven format like BUFR, having the same limitations.

IMPORTANT: GRIB is, although being a standardised format, not recommended for data sharing within SDMS.

[[xml]]
===== XML

Extensible Markup Language (XML) is similar to NetCDF in that it is a container lacking standardised metadata describing its contents.
Extensible Markup Language (XML) is similar to NetCDF in that it is a container lacking standardised metadata describing its contents.
There are many variants of XML and the overhead is large, as the format is text-based.

NOTE: XML is more or less fully replaced by various flavours of JSON now. The new OGC APIs also include CF-NetCDF as an information container.