This repository has been archived by the owner on Jul 8, 2024. It is now read-only.

Tagging information resources and linking content

sklarman edited this page Oct 31, 2019 · 2 revisions

Subject tags

Different types of digital information resources (e.g., datasets, documents, SDG entities) can be described (annotated or tagged) with concepts from relevant shared taxonomies, such as UNBIS and EuroVoc, both of which are used in the examples on this page.

Such annotations or subject tags facilitate content cataloging, semantic search, and content linking (see: Content linking).

The following example demonstrates a sample of UNBIS and EuroVoc annotations over the SDG target 1.1:

{
    "@context": {
        "subject": "http://purl.org/dc/terms/subject",
        "prefLabel": "http://www.w3.org/2004/02/skos/core#prefLabel"
    },
    "@id": "http://metadata.un.org/sdg/1.1",
    "@type": [
        "http://metadata.un.org/sdg/ontology#Target",
        "http://www.w3.org/2004/02/skos/core#Concept"
    ],
    "prefLabel": "By 2030, eradicate extreme poverty for all people everywhere, currently measured as people living on less than $1.25 a day",
    "subject": [
        {
            "@id": "http://metadata.un.org/thesaurus/1005064",
            "prefLabel": "POVERTY"
        },
        {
            "@id": "http://eurovoc.europa.eu/2281",
            "prefLabel": "poverty"
        },
        {
            "@id": "http://eurovoc.europa.eu/2062",
            "prefLabel": "standard of living"
        },
        {
            "@id": "http://metadata.un.org/thesaurus/1000544",
            "prefLabel": "BASIC NEEDS"
        },
        {
            "@id": "http://eurovoc.europa.eu/6781",
            "prefLabel": "basic needs"
        },
        {
            "@id": "http://metadata.un.org/thesaurus/1006166",
            "prefLabel": "STANDARD OF LIVING"
        }
    ]
}
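
As a minimal sketch of how such an annotation might be consumed, the following Python snippet reads a trimmed copy of the document above as plain JSON (no JSON-LD processor is involved) and groups its subject URIs by vocabulary. The function name `subjects_by_vocabulary` and the prefix table are illustrative assumptions, not part of the SDG KOS tooling.

```python
import json
from collections import defaultdict

# Trimmed copy of the annotation above (two of the six subjects).
DOC = json.loads("""
{
    "@id": "http://metadata.un.org/sdg/1.1",
    "subject": [
        {"@id": "http://metadata.un.org/thesaurus/1005064", "prefLabel": "POVERTY"},
        {"@id": "http://eurovoc.europa.eu/2281", "prefLabel": "poverty"}
    ]
}
""")

# Hypothetical prefix table, used only for this illustration.
VOCABULARIES = {
    "UNBIS": "http://metadata.un.org/thesaurus",
    "EuroVoc": "http://eurovoc.europa.eu/",
}


def subjects_by_vocabulary(doc):
    """Group the subject URIs of one node by the vocabulary they belong to."""
    grouped = defaultdict(list)
    for subject in doc.get("subject", []):
        uri = subject["@id"]
        for name, prefix in VOCABULARIES.items():
            if uri.startswith(prefix):
                grouped[name].append(uri)
    return dict(grouped)


if __name__ == "__main__":
    print(subjects_by_vocabulary(DOC))
```

Because the tags are URIs rather than bare strings, the vocabulary of each tag can be recovered from the identifier itself, which is not possible with labels such as "POVERTY" and "poverty" alone.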

The proposed set of subject tags over the SDG goals, targets, indicators and series (still under development) is included in sdg-kos-subject-mappings.ttl.

Tag management recommendation

In the current practice of publishing SDG series, subject tags are used merely as strings (originating from the UNBIS taxonomy). To ease and scale the management of SDG entity tagging, and to increase the potential benefits of the resulting annotation layer, we strongly recommend applying linked data principles to this process, as reflected in the following rules:

  1. Use the URIs of the subject concepts from dedicated taxonomies, rather than just their labels, when describing SDG entities.
  2. Link SDG entities' URIs with the subject concepts' URIs via a dedicated predicate from a standard vocabulary (e.g., dct:subject).
  3. Whenever possible, associate subject concepts with the most generic entities (e.g., goals and targets) and use inference to propagate them to more specific ones (e.g., indicators and series).
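
The first two rules can be illustrated with a short Python sketch that serializes tags as N-Triples statements, linking an entity URI to subject concept URIs via dct:subject. The helper name `subject_triples` is hypothetical, and a real pipeline would more likely use an RDF library or the VocBench editor than string formatting.

```python
# Full IRI of the dct:subject predicate (Dublin Core terms).
DCT_SUBJECT = "http://purl.org/dc/terms/subject"


def subject_triples(entity_uri, subject_uris):
    """Return N-Triples lines linking one entity to its subject concepts."""
    return [
        f"<{entity_uri}> <{DCT_SUBJECT}> <{subject_uri}> ."
        for subject_uri in subject_uris
    ]


if __name__ == "__main__":
    # Rule 3: tag the generic entity (here, target 1.1) and leave the
    # propagation to indicators and series to inference.
    for line in subject_triples(
        "http://metadata.un.org/sdg/1.1",
        [
            "http://metadata.un.org/thesaurus/1005064",  # POVERTY (UNBIS)
            "http://eurovoc.europa.eu/2281",  # poverty (EuroVoc)
        ],
    ):
        print(line)
```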

Inference

To ease curation of such annotations, automated inference can be used in order to propagate relevant tags from more generic to more specific content (e.g., from goals, to targets, to indicators, to series). Similarly, textual keywords can be inferred from the labels of annotation concepts. These basic inference patterns are depicted in the following figure:

Figure: Tagging information resources.

In the following example, all subject values of the indicator 01.03.01 are inferred to be subjects of every series associated with that indicator as well. This is justified because the indicator semantically subsumes its subsidiary series. The series can, however, carry more specific subject annotations, which are not propagated in the opposite direction, to the indicator. Analogously, all keywords values are inferred from the prefLabel values of the associated subjects.

{
    "@context": {
        "hasSeries": "http://metadata.un.org/sdg/ontology#hasSeries",
        "keywords": "http://schema.org/keywords",
        "subject": "http://purl.org/dc/terms/subject",
        "prefLabel": "http://www.w3.org/2004/02/skos/core#prefLabel"
    },
    "@id": "http://metadata.un.org/sdg/C010301",
    "prefLabel": "01.03.01 Proportion of population covered by social protection floors/systems, by sex, distinguishing children, unemployed persons, older persons, persons with disabilities, pregnant women, newborns, work-injury victims and the poor and the vulnerable",
    "subject": [
        {
            "@id": "http://metadata.un.org/thesaurus#1005064",
            "prefLabel": "POVERTY"
        },
        {
            "@id": "http://eurovoc.europa.eu/2281",
            "prefLabel": "poverty"
        },
        {
            "@id": "http://eurovoc.europa.eu/2062",
            "prefLabel": "standard of living"
        },
        {
            "@id": "http://metadata.un.org/thesaurus#1000544",
            "prefLabel": "BASIC NEEDS"
        },
        {
            "@id": "http://eurovoc.europa.eu/6781",
            "prefLabel": "basic needs"
        },
        {
            "@id": "http://metadata.un.org/thesaurus#1005995",
            "prefLabel": "SOCIAL WELFARE"
        },
        {
            "@id": "http://metadata.un.org/thesaurus#1006166",
            "prefLabel": "STANDARD OF LIVING"
        }
    ],
    "keywords": [
        "standard of living",
        "basic needs",
        "STANDARD OF LIVING",
        "SOCIAL WELFARE",
        "POVERTY",
        "BASIC NEEDS",
        "poverty"
    ],
    "hasSeries": [
        {
            "@id": "http://metadata.un.org/sdg/SI_COV_WKINJRY",
            "prefLabel": "Proportion of employed population covered in the event of work injury (%)",
            "subject": [
                {
                    "@id": "http://metadata.un.org/thesaurus#1003238",
                    "prefLabel": "INSURANCE"
                },
                {
                    "@id": "http://eurovoc.europa.eu/4039",
                    "prefLabel": "occupational safety"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#050800",
                    "prefLabel": "INSURANCE"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1005995",
                    "prefLabel": "SOCIAL WELFARE"
                },
                {
                    "@id": "http://eurovoc.europa.eu/3151",
                    "prefLabel": "insurance"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1005064",
                    "prefLabel": "POVERTY"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1004552",
                    "prefLabel": "OCCUPATIONAL ACCIDENTS"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2281",
                    "prefLabel": "poverty"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2062",
                    "prefLabel": "standard of living"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1004561",
                    "prefLabel": "OCCUPATIONAL SAFETY"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1000544",
                    "prefLabel": "BASIC NEEDS"
                },
                {
                    "@id": "http://eurovoc.europa.eu/6781",
                    "prefLabel": "basic needs"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1006166",
                    "prefLabel": "STANDARD OF LIVING"
                }
            ],
            "keywords": [
                "OCCUPATIONAL SAFETY",
                "standard of living",
                "basic needs",
                "INSURANCE",
                "STANDARD OF LIVING",
                "occupational safety",
                "SOCIAL WELFARE",
                "OCCUPATIONAL ACCIDENTS",
                "POVERTY",
                "BASIC NEEDS",
                "insurance",
                "poverty"
            ]
        },
        {
            "@id": "http://metadata.un.org/sdg/SI_COV_PENSN",
            "prefLabel": "Proportion of population above statutory pensionable age receiving a pension, by sex (%)",
            "subject": [
                {
                    "@id": "http://metadata.un.org/thesaurus#1005064",
                    "prefLabel": "POVERTY"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2062",
                    "prefLabel": "standard of living"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2281",
                    "prefLabel": "poverty"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1000544",
                    "prefLabel": "BASIC NEEDS"
                },
                {
                    "@id": "http://eurovoc.europa.eu/6781",
                    "prefLabel": "basic needs"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1000115",
                    "prefLabel": "AGEING PERSONS"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1005995",
                    "prefLabel": "SOCIAL WELFARE"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1004610",
                    "prefLabel": "OLD AGE BENEFITS"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1006166",
                    "prefLabel": "STANDARD OF LIVING"
                }
            ],
            "keywords": [
                "standard of living",
                "basic needs",
                "AGEING PERSONS",
                "STANDARD OF LIVING",
                "SOCIAL WELFARE",
                "OLD AGE BENEFITS",
                "POVERTY",
                "BASIC NEEDS",
                "poverty"
            ]
        },
        {
            "@id": "http://metadata.un.org/sdg/SI_COV_UEMP",
            "prefLabel": "Proportion of unemployed persons receiving unemployment cash benefit, by sex (%)",
            "subject": [
                {
                    "@id": "http://metadata.un.org/thesaurus#1003238",
                    "prefLabel": "INSURANCE"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#050800",
                    "prefLabel": "INSURANCE"
                },
                {
                    "@id": "http://eurovoc.europa.eu/3151",
                    "prefLabel": "insurance"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1005995",
                    "prefLabel": "SOCIAL WELFARE"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1005064",
                    "prefLabel": "POVERTY"
                },
                {
                    "@id": "http://eurovoc.europa.eu/3332",
                    "prefLabel": "unemployment insurance"
                },
                {
                    "@id": "http://eurovoc.europa.eu/5974",
                    "prefLabel": "unemployment"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2281",
                    "prefLabel": "poverty"
                },
                {
                    "@id": "http://eurovoc.europa.eu/2062",
                    "prefLabel": "standard of living"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1000544",
                    "prefLabel": "BASIC NEEDS"
                },
                {
                    "@id": "http://eurovoc.europa.eu/6781",
                    "prefLabel": "basic needs"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1006773",
                    "prefLabel": "UNEMPLOYMENT"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1006166",
                    "prefLabel": "STANDARD OF LIVING"
                },
                {
                    "@id": "http://metadata.un.org/thesaurus#1006774",
                    "prefLabel": "UNEMPLOYMENT INSURANCE"
                }
            ],
            "keywords": [
                "INSURANCE",
                "STANDARD OF LIVING",
                "UNEMPLOYMENT INSURANCE",
                "POVERTY",
                "poverty",
                "standard of living",
                "basic needs",
                "SOCIAL WELFARE",
                "unemployment",
                "UNEMPLOYMENT",
                "BASIC NEEDS",
                "unemployment insurance",
                "insurance"
            ]
        }
    ]
}
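
The two inference patterns shown in the example above can be sketched in plain Python. In the setup described on this page the inference is performed by the triple store, so the function below (`propagate`, a hypothetical name) only illustrates the intended result on dictionaries shaped like the JSON-LD example; it is not the actual rule set.

```python
def propagate(indicator):
    """Copy the indicator's subjects to each series, then derive keywords.

    Works on plain dicts shaped like the JSON-LD example and returns a new
    structure; the input is left untouched.
    """

    def with_keywords(node, subjects):
        # Keywords are the distinct prefLabel values of the subjects.
        labels = sorted({s["prefLabel"] for s in subjects})
        return {**node, "subject": subjects, "keywords": labels}

    inherited = indicator.get("subject", [])
    series_out = []
    for series in indicator.get("hasSeries", []):
        own = series.get("subject", [])
        seen = {s["@id"] for s in own}
        # A series keeps its own, more specific tags and gains the inherited
        # ones; nothing flows back from the series to the indicator.
        merged = own + [s for s in inherited if s["@id"] not in seen]
        series_out.append(with_keywords(series, merged))

    result = with_keywords(indicator, inherited)
    result["hasSeries"] = series_out
    return result


if __name__ == "__main__":
    example = {
        "@id": "http://metadata.un.org/sdg/C010301",
        "subject": [
            {"@id": "http://eurovoc.europa.eu/2281", "prefLabel": "poverty"}
        ],
        "hasSeries": [
            {
                "@id": "http://metadata.un.org/sdg/SI_COV_UEMP",
                "subject": [
                    {"@id": "http://eurovoc.europa.eu/5974", "prefLabel": "unemployment"}
                ],
            }
        ],
    }
    print(propagate(example))
```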

Editing subject concepts using VocBench instance

Subject tags can be conveniently added and edited using the VocBench editor instance, described in section SDG knowledge organization system. The following screen recording demonstrates the entire process: adding a subject tag; exporting the SDG KOS repository from VocBench to the related triple store, which performs inference on the SDG KOS graph (as explained above); and querying and reviewing the resulting data using the GraphQL endpoint, described in section Query endpoints.

https://drive.google.com/file/d/1VYpiPIlsT_ufa0XzYtLLO2uYZHHGFQUK/view

Content linking

Different types of content described via shared taxonomy subjects can be linked together to facilitate content discovery and semantic search. One such content linking mechanism is implemented in the live demo application Sustainable Development Links.

The adopted approach measures (and ranks) the semantic similarity between two resources by evaluating the path distances between their annotations within the same taxonomy. For instance, in the following example, a document and a series are semantically similar because they share one annotation and each carries one further annotation, and these two further annotations are closely related within their original taxonomy (one skos:broader "hop" apart).

Figure: Linking documents with datasets via taxonomy concepts.
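
The idea of path-based similarity can be sketched as follows. This page does not specify the scoring used by Sustainable Development Links, so the formula below (summing 1 / (1 + distance) over all connected annotation pairs) is an assumption chosen for illustration, as are the function names and the `ex:` placeholder identifiers.

```python
from collections import deque


def path_distance(broader, start, goal):
    """Shortest number of skos:broader hops between two concepts.

    `broader` maps a concept URI to a list of its broader concepts; edges are
    followed in both directions. Returns None if no path exists.
    """
    neighbours = {}
    for narrow, broads in broader.items():
        for broad in broads:
            neighbours.setdefault(narrow, set()).add(broad)
            neighbours.setdefault(broad, set()).add(narrow)

    # Breadth-first search guarantees the first hit is the shortest path.
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def similarity(broader, tags_a, tags_b):
    """Sum of 1 / (1 + distance) over all connected annotation pairs."""
    score = 0.0
    for a in tags_a:
        for b in tags_b:
            dist = path_distance(broader, a, b)
            if dist is not None:
                score += 1.0 / (1 + dist)
    return score


if __name__ == "__main__":
    # Placeholder taxonomy fragment: one concept with one broader concept.
    taxonomy = {"ex:unemployment-insurance": ["ex:insurance"]}
    document_tags = ["ex:poverty", "ex:insurance"]
    series_tags = ["ex:poverty", "ex:unemployment-insurance"]
    print(similarity(taxonomy, document_tags, series_tags))
```

In this toy case the shared annotation contributes 1.0 and the pair that is one hop apart contributes 0.5, mirroring the situation described above.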

Concept extraction

In many cases, (semi-)automated concept extraction and text annotation tools can simplify the discovery of relevant concepts with which to annotate the content.

A basic example of a service of this type is exposed at the URL http://sdg-links.org:5000. It extracts UNBIS and EuroVoc concepts from snippets of text, e.g.:

curl -X POST \
  http://sdg-links.org:5000 \
  -H 'Content-Type: application/json' \
  -d '[
	{
		"text":"This chapter introduces newly developed concepts about sustainable hazardous waste management and treatment and describes the key elements of sustainable hazardous waste management in the context of current broader issues (e.g., renewable energy and climate change). It also discusses fundamentals and basic components of sustainable hazardous waste and presents technical options for sustainable hazardous waste treatment and remediation. Microorganisms play an important role in wastewater treatment because of their immense potential for immobilization and bio-accumulative properties. Next, the chapter explains disposal of hazardous waste and reviews adjustments to meet global challenges. The choice of disposal should be based on evaluation of economics and potential pollution risks. Finally, the chapter identifies future trends and challenges for sustainable hazardous waste management/treatment, providing research recommendations to help achieve the broader goals of sustainability.",
		"@id": "/url/document"
		
	}
]'

The service responds with the input text and the list of matched concepts:

{
    "text": "This chapter introduces newly developed concepts about sustainable hazardous waste management and treatment and describes the key elements of sustainable hazardous waste management in the context of current broader issues (e.g., renewable energy and climate change). It also discusses fundamentals and basic components of sustainable hazardous waste and presents technical options for sustainable hazardous waste treatment and remediation. Microorganisms play an important role in wastewater treatment because of their immense potential for immobilization and bio-accumulative properties. Next, the chapter explains disposal of hazardous waste and reviews adjustments to meet global challenges. The choice of disposal should be based on evaluation of economics and potential pollution risks. Finally, the chapter identifies future trends and challenges for sustainable hazardous waste management/treatment, providing research recommendations to help achieve the broader goals of sustainability.",
    "@id": "/url/document",
    "matches": [
        {
            "url": "http://eurovoc.europa.eu/6103",
            "label": "hazardous waste",
            "start": 8,
            "end": 10
        },
        {
            "url": "http://metadata.un.org/thesaurus#1006571",
            "label": "hazardous waste management",
            "start": 8,
            "end": 11
        },
        {
            "url": "http://eurovoc.europa.eu/1158",
            "label": "waste management",
            "start": 9,
            "end": 11
        },
        {
            "url": "http://eurovoc.europa.eu/6103",
            "label": "hazardous waste",
            "start": 20,
            "end": 22
        },
        {
            "url": "http://eurovoc.europa.eu/1158",
            "label": "waste management",
            "start": 21,
            "end": 23
        },
        {
            "url": "http://eurovoc.europa.eu/754",
            "label": "renewable energy",
            "start": 32,
            "end": 34
        },
        {
            "url": "http://metadata.un.org/thesaurus#1002035",
            "label": "energy",
            "start": 33,
            "end": 34
        },
        {
            "url": "http://metadata.un.org/thesaurus#1001030",
            "label": "climate change",
            "start": 35,
            "end": 37
        },
        {
            "url": "http://eurovoc.europa.eu/5482",
            "label": "climate change",
            "start": 35,
            "end": 37
        },
        {
            "url": "http://metadata.un.org/thesaurus#1006935",
            "label": "waste treatment",
            "start": 55,
            "end": 57
        },
        {
            "url": "http://eurovoc.europa.eu/1158",
            "label": "waste treatment",
            "start": 55,
            "end": 57
        },
        {
            "url": "http://eurovoc.europa.eu/5740",
            "label": "microorganism",
            "start": 59,
            "end": 60
        },
        {
            "url": "http://eurovoc.europa.eu/612",
            "label": "wastewater",
            "start": 65,
            "end": 66
        },
        {
            "url": "http://metadata.un.org/thesaurus#1005837",
            "label": "wastewater",
            "start": 65,
            "end": 66
        },
        {
            "url": "http://metadata.un.org/thesaurus#030400",
            "label": "pollution",
            "start": 106,
            "end": 107
        },
        {
            "url": "http://metadata.un.org/thesaurus#1005003",
            "label": "pollution",
            "start": 106,
            "end": 107
        },
        {
            "url": "http://eurovoc.europa.eu/2524",
            "label": "pollution",
            "start": 106,
            "end": 107
        }
        ...
    ]
}
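
Since the same concept can be matched at several positions, a client will often want the distinct concepts rather than every occurrence. The following sketch collapses a response of the shape shown above; the function name `distinct_concepts` is hypothetical, and the sample is a three-match excerpt of the response. When a URI appears with more than one label, the first label seen is kept.

```python
def distinct_concepts(response):
    """Map each matched concept URI to its label and number of occurrences."""
    concepts = {}
    for match in response.get("matches", []):
        entry = concepts.setdefault(
            match["url"], {"label": match["label"], "count": 0}
        )
        entry["count"] += 1
    return concepts


# Excerpt of the response above (first, second and fourth match).
SAMPLE = {
    "@id": "/url/document",
    "matches": [
        {"url": "http://eurovoc.europa.eu/6103", "label": "hazardous waste", "start": 8, "end": 10},
        {"url": "http://metadata.un.org/thesaurus#1006571", "label": "hazardous waste management", "start": 8, "end": 11},
        {"url": "http://eurovoc.europa.eu/6103", "label": "hazardous waste", "start": 20, "end": 22},
    ],
}

if __name__ == "__main__":
    for url, info in distinct_concepts(SAMPLE).items():
        print(url, info["label"], info["count"])
```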

As the URIs of the extracted concepts originate in the same vocabularies as those associated with the SDG entities, the approach generates indirect links between different types of content.
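
A minimal sketch of such indirect linking, assuming the extracted concept URIs and the subject URIs of the SDG entities are already available as plain Python collections: an entity is linked to a document whenever the two share at least one concept URI. The function name `linked_entities` and the input data are illustrative; the series subjects are a subset of those listed for SI_COV_UEMP above, and the extracted URIs stand for a hypothetical document about poverty and unemployment.

```python
def linked_entities(extracted_uris, entity_subjects):
    """Return SDG entities sharing at least one concept URI with a document.

    `entity_subjects` maps an entity URI to an iterable of its subject URIs.
    The result maps each linked entity to the sorted list of shared concepts.
    """
    extracted = set(extracted_uris)
    links = {}
    for entity, subjects in entity_subjects.items():
        shared = extracted & set(subjects)
        if shared:
            links[entity] = sorted(shared)
    return links


if __name__ == "__main__":
    # Hypothetical extraction result for some document.
    extracted = [
        "http://eurovoc.europa.eu/2281",  # poverty
        "http://eurovoc.europa.eu/5974",  # unemployment
    ]
    sdg_subjects = {
        "http://metadata.un.org/sdg/SI_COV_UEMP": [
            "http://eurovoc.europa.eu/2281",
            "http://eurovoc.europa.eu/5974",
            "http://eurovoc.europa.eu/3332",
        ],
        "http://metadata.un.org/sdg/SI_COV_PENSN": [
            "http://metadata.un.org/thesaurus#1004610",
        ],
    }
    print(linked_entities(extracted, sdg_subjects))
```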