Rahien/RDFazer

RDFazer

Introduction

RDFazer is a Chrome browser plugin for tagging a web page with RDFa concepts. It only tags the static page and does not store the content anywhere. It was conceived for taking an existing (static) web page and tagging it with concepts from some taxonomy (e.g. the EC ESCO taxonomy). The tagged page can then be stored elsewhere.

This code makes heavy use of the Rangy library, jQuery and jQuery UI.

RDFazer is currently a work in progress and should only be used if you know what you are doing.

Usage

RDFazer is a lightweight Chrome browser extension that allows tagging a web page with RDFa concepts from any SPARQL endpoint.

Interface

The interface of RDFazer is shown in the image below.

doc-img/interface.png

  • start RDFazer: RDFazer is started by clicking the browser extension button on the top right of the screen. At this point, all RDFazer resources are loaded. As long as this button is not clicked, only a bare minimum of resources is loaded. Once RDFazer is ready, the RDFazer side bar will become visible on the screen.
  • close RDFazer: This button closes RDFazer and removes all RDFazer interface elements from the web page. The tagged concepts remain.
  • switch side: This button switches the interface between the left and the right of the screen.
  • show highlight: This button opens the RDFazer interface up, so it shows the highlighted concepts.
  • save html: This button first removes all RDFazer interface elements and then downloads the web page as an HTML file with all the highlighted concepts. Until this button is pressed, none of the changes made by RDFazer are persisted. When the web page is loaded later and RDFazer is activated, it will recognize all highlighted concepts from before.
  • settings: This button allows the modification of the RDFazer settings, see the configuration section. The settings are persisted on the local machine for later reuse.
  • highlight selected: This button opens a dialog that allows the user to tag the text currently selected on the web page, see below. The selection can span multiple HTML elements and cross element boundaries (have a look at the rangy library if you’re interested in how this is done).

To tag a concept, select the part of text you want to tag and click the ‘highlight selected’ button.

Search SPARQL endpoint

RDFazer allows a search of any SPARQL endpoint (see the configuration section for how to configure the search). The interface for the search is shown below.

doc-img/search.png

Enter the search term in the search field and press Enter while the field has focus.

The results are shown by their labels. To get more information on a result, expand it by clicking the toggle button on its left.

Any number of results can be added to the concept. If you need to do multiple searches to find all the matches you want, simply check the results of the first set and do a new search. It will keep the checked results of the previous search and add the new search results.

Clear all (checked and unchecked) results using the ‘clear’ button.

Tag the selection with all checked concepts using the ‘highlight’ button.

Manual tagging

The manual tagging interface is less powerful than the search interface. It only allows you to provide a URI, URL and label for a selection. Add URIs using the ‘+’ button and remove the last URI using the ‘-’ button.

doc-img/manual.png

Removing tagged concepts

To remove a tag, click it; a dialog will pop up that allows you to remove the tag. This removes the highlight, the stored information on the highlighted concept and the highlight’s tag in RDFazer’s panel. A removed tag cannot be retrieved.

Installation

RDFazer is a Chrome browser extension. To install:

  1. clone this git repository anywhere on your file system.
  2. open Chromium/Chrome
  3. go to Tools > Extensions
  4. make sure the ‘Developer mode’ is checked
  5. click ‘Load unpacked extension’ and select the folder with the RDFazer git repository

RDFazer can work with information in any SPARQL endpoint; just configure the URL of the endpoint and the query correctly.

I will look into packing RDFazer and making it available on the Chrome Web Store, probably once the ESCO SPARQL endpoint is online so a showcase is readily available.

Configuration

RDFazer offers many configuration options. Through the settings dialog, the current configuration can be inspected and modified. The configuration can also be downloaded as a JSON file, or a new JSON configuration file can be uploaded.

Below is an example of such a JSON file, interspersed with comments on the meaning of the properties.

SPARQL endpoint

First comes the location of the SPARQL endpoint that is used for searches. By default, this is an endpoint on the localhost system, but it can be any endpoint at all.

{
    "sparql":"http://localhost:8890/sparql",

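As a rough illustration of how a search could reach the configured endpoint, the sketch below builds a request URL following the SPARQL 1.1 Protocol, which passes the query text in a `query` parameter. The function name `buildSparqlRequestUrl` and the sample query are assumptions made for this example; how RDFazer itself issues the request is not described here.

```javascript
// Hypothetical helper (not part of RDFazer): builds a SPARQL Protocol GET URL.
function buildSparqlRequestUrl(endpoint, query) {
  const url = new URL(endpoint);
  // The SPARQL 1.1 Protocol carries the query text in the "query" parameter.
  url.searchParams.set("query", query);
  return url.toString();
}

const requestUrl = buildSparqlRequestUrl(
  "http://localhost:8890/sparql", // the "sparql" setting from the configuration
  "select ?target where { ?target a ?type } limit 1" // made-up sample query
);
console.log(requestUrl);
```

The resulting URL could then be requested with an `Accept: application/sparql-results+json` header to ask the endpoint for JSON results.
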
File URI

Next is the fileURI property, which defines the base URI of the current file. It is optional: if it is not set or left blank, the URL of the current document is used.

"fileURI":"",

Profiles

RDFazer works with profiles. The idea is that one can quickly switch between profiles, so it becomes easy to tag elements of different types in a single document. The currently active profile is set in the ‘profile’ property.

The different profiles are listed in an object, working as a hash, from profile name to the profile description, holding all the profile properties. All the following properties are part of one profile.

"profile":"esco",
"profiles": {
    "esco": {

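The lookup from the ‘profile’ property into the ‘profiles’ hash can be pictured as follows. This is a minimal sketch, not RDFazer's own code: the helper name `getActiveProfile`, the error it throws and the second profile called "other" are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of RDFazer): returns the active profile object.
function getActiveProfile(config) {
  const profile = config.profiles && config.profiles[config.profile];
  if (!profile) {
    throw new Error("Unknown profile: " + config.profile);
  }
  return profile;
}

const exampleConfig = {
  profile: "esco",
  profiles: {
    esco: { labelProperty: "label" },
    other: { labelProperty: "name" } // made-up second profile
  }
};
console.log(getActiveProfile(exampleConfig).labelProperty); // prints "label"
```

Switching profiles then amounts to changing the single ‘profile’ value while leaving the ‘profiles’ hash untouched.
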
Query

The query defines what is being searched for when the user searches for a concept to highlight. This query is a regular SPARQL query, but there are some key things to note:

  • the query MUST contain the text $searchTerm, which will be replaced with the text the user has typed in the search box
  • the query MUST return a ?target variable, which MUST hold the URI of the concept to highlight
  • the query MAY return other variables at will
  • the query SHOULD return only one result per concept, as no duplicate checking is performed
"query": "select ?target ?label (group_concat(distinct(?labels),\"| \") as ?altLabels) (group_concat(distinct(?types), \"| \") as ?types)\n where { \n{ ?target a <http://ec.europa.eu/esco/model#Occupation> . } \nUNION\n { ?target a <http://ec.europa.eu/esco/model#Skill> . } \n?target <http://www.w3.org/2008/05/skos-xl#prefLabel> ?thing3. ?thing3 <http://www.w3.org/2008/05/skos-xl#literalForm> ?label .\n ?target <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?types .\n{ ?target <http://www.w3.org/2008/05/skos-xl#prefLabel> ?thing1. \n?thing1 <http://www.w3.org/2008/05/skos-xl#literalForm> ?plabels . \nFILTER (bif:contains(?plabels,\"'$searchTerm*'\")) . \nFILTER (lang(?plabels) = \"en\") . } \nUNION\n { ?target <http://www.w3.org/2008/05/skos-xl#altLabel> ?thing2.\n ?thing2 <http://www.w3.org/2008/05/skos-xl#literalForm> ?plabels .\n FILTER (bif:contains(?plabels,\"'$searchTerm*'\")) . \nFILTER (lang(?plabels)= \"en\") . \n} \nOPTIONAL {?target <http://www.w3.org/2008/05/skos-xl#altLabel> ?thing4\n. ?thing4 <http://www.w3.org/2008/05/skos-xl#literalForm> ?labels\n. FILTER (lang (?labels) = \"en\") \n}\nFILTER (lang (?label) = \"en\") \n} GROUP BY ?target ?label",

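The `$searchTerm` rule above can be sketched as a plain text substitution. The helper name `fillSearchTerm` and the shortened query template are assumptions for this example, and whether RDFazer escapes the search term before substituting it is not stated in this document.

```javascript
// Hypothetical helper (not part of RDFazer): fills in the user's search term.
function fillSearchTerm(queryTemplate, searchTerm) {
  if (!queryTemplate.includes("$searchTerm")) {
    throw new Error("query must contain $searchTerm");
  }
  // split/join replaces every occurrence and, unlike String.replace with a
  // string replacement, does not treat "$" sequences in the term specially.
  return queryTemplate.split("$searchTerm").join(searchTerm);
}

// Shortened, made-up template in the style of the ESCO query above.
const template =
  "select ?target where { ?target ?p ?label . " +
  "FILTER (bif:contains(?label, \"'$searchTerm*'\")) }";
console.log(fillSearchTerm(template, "baker"));
```

Because this sketch performs no escaping, a search term containing quotes could produce an invalid query; a real implementation would likely need to guard against that.
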
URI to URL

The URI of the concept to highlight is not necessarily backed by a Linked Data architecture that allows the URI to be used as a URL. In that case, the ‘uriToUrl’ property holds a string with a JavaScript expression that transforms the URI into a URL with information on the concept. If this is left blank, the URI is used as the URL.

"uriToUrl":"'https://ec.europa.eu/esco/web/guest/concept/-/concept/thing/en/' +uri",

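One way to picture how such an expression string could be applied is shown below. This is only a sketch under the assumption that the expression is evaluated with a variable named `uri` in scope, as the example value suggests; the helper name `uriToUrl` and the use of `new Function` are choices made for this illustration, not a description of RDFazer's internals.

```javascript
// Hypothetical sketch (not RDFazer's code): evaluates a 'uriToUrl' expression.
function uriToUrl(expression, uri) {
  if (!expression) {
    return uri; // blank setting: the URI doubles as the URL
  }
  // new Function compiles the expression with one parameter named "uri".
  return new Function("uri", "return (" + expression + ");")(uri);
}

const escoExpression =
  "'https://ec.europa.eu/esco/web/guest/concept/-/concept/thing/en/' +uri";
// http://example.org/concept/1 is a made-up concept URI.
console.log(uriToUrl(escoExpression, "http://example.org/concept/1"));
console.log(uriToUrl("", "http://example.org/concept/1"));
```

Since the setting is executed as code, a configuration file should only be loaded from a source you trust.
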
Label Property

The query MAY return a human-readable label concisely describing the returned concept. If so, ‘labelProperty’ can point to the variable holding that label in the query result. This property is optional; if it is not set, it is assumed to be “label”. If no such variable is found in the query result set, the URI of the concept is used instead.

"labelProperty":"label",

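The fallback order described above (configured variable, then “label”, then the concept URI) can be sketched like this. The helper name `displayLabel` is hypothetical, and the example assumes result rows shaped like the standard SPARQL JSON results format, where each variable maps to an object with a `value` field.

```javascript
// Hypothetical helper (not part of RDFazer): picks the label for one result row.
// Rows are assumed to look like SPARQL JSON bindings: { varName: { type, value } }.
function displayLabel(row, profile) {
  const variable = profile.labelProperty || "label";
  const binding = row[variable];
  // Fall back to the concept URI held in ?target when no label is bound.
  return binding ? binding.value : row.target.value;
}

const rowWithLabel = {
  target: { type: "uri", value: "http://example.org/concept/1" },
  label: { type: "literal", value: "baker" }
};
const rowWithoutLabel = {
  target: { type: "uri", value: "http://example.org/concept/2" }
};
console.log(displayLabel(rowWithLabel, { labelProperty: "label" })); // prints "baker"
console.log(displayLabel(rowWithoutLabel, {})); // prints "http://example.org/concept/2"
```
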
Label Predicate

RDFazer also reads the highlights already present in a file from an earlier session. It must therefore know which predicate to use as a label to show to the user. The ‘labelPredicate’ defines the value of this predicate. If no such predicate is found for a highlighted concept, the concept’s URI is shown instead.

"labelPredicate":"http://www.w3.org/2004/02/skos/core#prefLabel",

Stored Information

RDFazer can store other information returned by the query directly in the annotated file. For every variable in the query result set, a key-value pair MAY be present in the ‘storedInfo’ property. The key MUST be the name of the variable that is returned. The value MUST have the following structure:

  • predicate: the URI of the RDF predicate to connect the value of the variable to the concept being highlighted
  • type: either “property” or “relation”. A relation signals that a link to another concept (with a URI) is made.
  • csv: if a query result groups multiple values, separated by some character, the csv property defines this separator character, so the values are stored separately in the annotated file.
  • decorate: a JSON object with key-value pairs defining extra attributes to be set on the value of the stored property. This is useful for defining the language of labels, for instance.
        "storedInfo": {
            "label": {
                "predicate":"http://www.w3.org/2004/02/skos/core#prefLabel", 
                "type":"property", 
                "decorate":{"xml:lang":"en"}
            },
            "altLabels": {
                "predicate":"http://www.w3.org/2004/02/skos/core#altLabel", 
                "type":"property", 
                "csv":"|", 
                "decorate":{"xml:lang":"en"}
            },
            "types": {
                "predicate":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
                "type": "relation", 
                "csv":"|"
            }
        }
    }
}
}
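To make the role of ‘csv’ and ‘decorate’ more concrete, the sketch below expands one result row into separate statements. It is an illustration only: the helper name `expandStoredInfo`, the shape of the returned objects, the trimming of whitespace around split values and the sample row are all assumptions, and the RDFa markup that RDFazer actually writes is not shown.

```javascript
// Hypothetical helper (not part of RDFazer): expands a row using 'storedInfo'.
function expandStoredInfo(row, storedInfo) {
  const statements = [];
  for (const [variable, info] of Object.entries(storedInfo)) {
    const binding = row[variable];
    if (!binding) continue; // variable absent from this result row
    // With 'csv' set, one grouped value becomes several separate values.
    // Trimming and dropping empty pieces is an assumption of this sketch.
    const values = info.csv
      ? binding.value.split(info.csv).map((v) => v.trim()).filter(Boolean)
      : [binding.value];
    for (const value of values) {
      statements.push({
        predicate: info.predicate,
        type: info.type,
        value: value,
        attributes: info.decorate || {}
      });
    }
  }
  return statements;
}

const exampleStoredInfo = {
  label: {
    predicate: "http://www.w3.org/2004/02/skos/core#prefLabel",
    type: "property",
    decorate: { "xml:lang": "en" }
  },
  altLabels: {
    predicate: "http://www.w3.org/2004/02/skos/core#altLabel",
    type: "property",
    csv: "|",
    decorate: { "xml:lang": "en" }
  }
};
// Made-up result row in SPARQL JSON binding style.
const exampleRow = {
  label: { type: "literal", value: "baker" },
  altLabels: { type: "literal", value: "bread baker| pastry baker" }
};
console.log(expandStoredInfo(exampleRow, exampleStoredInfo));
```

For this sample row the helper yields three statements: one prefLabel and two altLabels, each carrying the `xml:lang` attribute from ‘decorate’.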

Full example

The previous snippets combine into the following JSON configuration file:

{
    "sparql":"http://localhost:8890/sparql",
    "fileURI":"",
    "profile":"esco",
    "profiles": {
        "esco": {
            "query": "select ?target ?label (group_concat(distinct(?labels),\"| \") as ?altLabels) (group_concat(distinct(?types), \"| \") as ?types)\n where { \n{ ?target a <http://ec.europa.eu/esco/model#Occupation> . } \nUNION\n { ?target a <http://ec.europa.eu/esco/model#Skill> . } \n?target <http://www.w3.org/2008/05/skos-xl#prefLabel> ?thing3. ?thing3 <http://www.w3.org/2008/05/skos-xl#literalForm> ?label .\n ?target <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?types .\n{ ?target <http://www.w3.org/2008/05/skos-xl#prefLabel> ?thing1. \n?thing1 <http://www.w3.org/2008/05/skos-xl#literalForm> ?plabels . \nFILTER (bif:contains(?plabels,\"'$searchTerm*'\")) . \nFILTER (lang(?plabels) = \"en\") . } \nUNION\n { ?target <http://www.w3.org/2008/05/skos-xl#altLabel> ?thing2.\n ?thing2 <http://www.w3.org/2008/05/skos-xl#literalForm> ?plabels .\n FILTER (bif:contains(?plabels,\"'$searchTerm*'\")) . \nFILTER (lang(?plabels)= \"en\") . \n} \nOPTIONAL {?target <http://www.w3.org/2008/05/skos-xl#altLabel> ?thing4\n. ?thing4 <http://www.w3.org/2008/05/skos-xl#literalForm> ?labels\n. FILTER (lang (?labels) = \"en\") \n}\nFILTER (lang (?label) = \"en\") \n} GROUP BY ?target ?label",
            "uriToUrl":"'https://ec.europa.eu/esco/web/guest/concept/-/concept/thing/en/' +uri",
            "labelProperty":"label",
            "labelPredicate":"http://www.w3.org/2004/02/skos/core#prefLabel",
            "storedInfo": {
                "label": {
                    "predicate":"http://www.w3.org/2004/02/skos/core#prefLabel", 
                    "type":"property", 
                    "decorate":{"xml:lang":"en"}
                },
                "altLabels": {
                    "predicate":"http://www.w3.org/2004/02/skos/core#altLabel", 
                    "type":"property", 
                    "csv":"|", 
                    "decorate":{"xml:lang":"en"}
                },
                "types": {
                    "predicate":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 
                    "type": "relation", 
                    "csv":"|"
                }
            }
        }
    }
}

Checklist

  • [X] Allow search through SPARQL endpoint
  • [X] Add settings panel
  • [X] Build a better CSS style…
  • [X] allow direct save of RDFazed html file
  • [X] allow storing query result properties as RDFa, apart from just the URI
  • [X] allow showing other property than URI in side bar
  • [X] add SPARQL endpoint provenance information
  • [X] allow setting of storedInfo properties
  • [X] add actual readme
  • [X] Build final CSS style
  • [X] Allow removal of results of an earlier session.
  • [?] Add paging to search results
