
SEO documentation

Jo Cook edited this page Feb 24, 2021 · 1 revision

Robots.txt

The robots.txt file sits at the root of the web server container (nginx). It is copied from https://github.com/AstunTechnology/os-custom-geonetwork/blob/main/nginx/root/robots.txt when the containers are spun up.

It is then accessible at https://osmetadata.astuntechnology.com/robots.txt

At present it is set to disallow all crawling while the site is in development mode. See https://www.robotstxt.org/ for more information.
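As a rough illustration of what "disallow all crawling" means to a well-behaved crawler, the sketch below feeds a robots.txt to Python's standard-library parser and asks whether a URL may be fetched. The file contents shown are an assumption (the conventional disallow-everything form), not a copy of the deployed file, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a "disallow everything" robots.txt.
# The file actually deployed on the server may differ.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://osmetadata.astuntechnology.com/geonetwork/srv/api/sitemap"
    print(is_allowed(DISALLOW_ALL, url))
```

Once the site leaves development mode, the same check can be run against the updated file to confirm that record pages have become crawlable.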

Site Map

The site map is generated automatically by GeoNetwork using https://github.com/geonetwork/core-geonetwork/blob/3.10.x/web/src/main/webapp/xslt/services/sitemap/sitemap.xsl and can be found at https://osmetadata.astuntechnology.com/geonetwork/srv/api/sitemap (robots.txt, described above, links to this location).
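To check which record URLs the site map exposes, something like the following sketch could be used. It assumes the output follows the standard sitemaps.org `urlset` format; the sample document is hand-written for illustration rather than captured from the server, and `list_sitemap_urls` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed to be what the XSL emits).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hand-written sample, not real server output.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://osmetadata.astuntechnology.com/geonetwork/srv/api/records/d442b64c-c8c8-11e4-8731-1681e6b88ec1?language=all</loc>
  </url>
</urlset>
"""


def list_sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in list_sitemap_urls(SAMPLE_SITEMAP):
        print(url)
```

If the service returns a sitemap index rather than a single `urlset`, the same `.//sm:loc` lookup would return the child sitemap locations instead of record URLs.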

Canonical Pages

The links in the site map point to the "canonical" versions of records, e.g. https://osmetadata.astuntechnology.com/geonetwork/srv/api/records/d442b64c-c8c8-11e4-8731-1681e6b88ec1?language=all. This is the version of the page that includes linked data (view the page source and look for the <script type="application/ld+json"> block).
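Instead of viewing the source by hand, the linked data block can be pulled out of a page programmatically. The sketch below uses only the Python standard library; the sample HTML and its JSON-LD values are invented for illustration, and `JsonLdExtractor` and `extract_jsonld` are hypothetical names.

```python
import json
from html.parser import HTMLParser

# Invented sample page; a real record page will contain far more metadata.
SAMPLE_HTML = """\
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org/", "@type": "Dataset", "name": "Example record"}
</script>
</head><body></body></html>
"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script bodies arrive here as raw text; keep them only inside JSON-LD.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    """Return a list of decoded JSON-LD objects found in html."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    return extractor.blocks


if __name__ == "__main__":
    print(extract_jsonld(SAMPLE_HTML))
```

An empty result for a canonical record URL would suggest that the page was served without its linked data block.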

The linked data is schema-dependent, so it is provided by the Gemini 2.3 schema plugin. It is generated from https://github.com/AstunTechnology/iso19139.gemini23/blob/3.10.x/src/main/plugin/iso19139.gemini23/formatter/jsonld/iso19139.gemini23-to-jsonld.xsl