RDF Format

okram edited this page Jan 11, 2013 · 9 revisions

  • InputFormat: com.thinkaurelius.faunus.formats.edgelist.rdf.RDFInputFormat

The Semantic Web community is one of the original promoters of the graph as an approach to data modeling. Their efforts have led to the development of the RDF format. While there are many RDF formats, an RDF file is (conceptually) composed of triples whereby a subject is connected to an object by a predicate. For instance:

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .

In this way, RDF is an edge list format. Faunus, on the other hand, uses an adjacency list representation. As such, the RDFInputFormat provided by Faunus is a MapReduce job that converts an edge list into an adjacency list.
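The edge-list-to-adjacency-list idea can be illustrated with a small, single-machine sketch. This is not Faunus's MapReduce implementation; the sample triples (including the `alcmene` and `neptune` vertices), the function name `to_adjacency_list`, and the choice to record both outgoing and incoming edges per vertex are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical edge list: one (subject, predicate, object) tuple per triple.
triples = [
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
    ("jupiter", "brother", "neptune"),
]


def to_adjacency_list(edges):
    """Group edges by vertex so that each vertex carries its incident edges."""
    adjacency = defaultdict(lambda: {"out": [], "in": []})
    for subject, predicate, obj in edges:
        # The subject sees the edge as outgoing, the object as incoming.
        adjacency[subject]["out"].append((predicate, obj))
        adjacency[obj]["in"].append((predicate, subject))
    return dict(adjacency)


adjacency = to_adjacency_list(triples)
for vertex, incident in adjacency.items():
    print(vertex, incident)
```

In an edge list, finding all edges of a vertex requires scanning every record; in the adjacency list each vertex record is self-contained, which is what makes per-vertex processing straightforward.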

Conversion Parameters

RDF Format

faunus.input.format.rdf.format

There are numerous RDF formats. Faunus currently supports the following:

  • rdf-xml
  • n-triples
  • turtle
  • n3
  • trix
  • trig

Literal as Property

faunus.input.format.rdf.literal-as-property

There are two types of triples to be aware of: one in which a URI connects to a URI, and one in which a URI connects to a literal.

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "32"^^<http://www.w3.org/2001/XMLSchema#int> .

If the above Faunus property is set to true, then the Hercules vertex has an age property with an integer value of 32.
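The distinction between the two triple types can be sketched as follows. This is a simplified illustration, not Faunus's parser: the regular expression only handles URI objects and plain or datatyped literals, and the names `TRIPLE`, `XSD_INT`, and `classify` are hypothetical.

```python
import re

# Simplified N-Triples pattern (assumption: no blank nodes, language tags,
# or escaped quotes). The object is either <uri> or "lexical"^^<datatype>.
TRIPLE = re.compile(
    r'^<(?P<s>[^>]+)>\s+<(?P<p>[^>]+)>\s+'
    r'(?:<(?P<o>[^>]+)>|"(?P<lex>[^"]*)"(?:\^\^<(?P<dt>[^>]+)>)?)\s*\.$'
)

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def classify(line):
    """Return ('edge', s, p, o) for URI objects, ('property', s, p, value) for literals."""
    match = TRIPLE.match(line.strip())
    if match is None:
        raise ValueError("unsupported triple: " + line)
    if match.group("o") is not None:
        return ("edge", match.group("s"), match.group("p"), match.group("o"))
    value = match.group("lex")
    if match.group("dt") == XSD_INT:
        # Typed integer literals become native integers.
        value = int(value)
    return ("property", match.group("s"), match.group("p"), value)


uri_triple = ('<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> '
              '<http://thinkaurelius.com#jupiter> .')
literal_triple = ('<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> '
                  '"32"^^<http://www.w3.org/2001/XMLSchema#int> .')

print(classify(uri_triple))
print(classify(literal_triple))
```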

Use Local Name

faunus.input.format.rdf.use-localname

The theoretically infinite RDF graph is embedded within the infinite address space of URIs. To leverage this space, a vertex is identified by a URI. In many situations the full URI is not required, so if the above property is set to true, then the triple

<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .

generates vertices named hercules and jupiter connected by a father edge.
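A minimal sketch of local-name extraction is shown below. It assumes the local name is whatever follows the last `#` or `/` in the URI; the RDF library Faunus relies on may split URIs by slightly different rules, and the function name `local_name` is hypothetical.

```python
def local_name(uri):
    """Return the part of the URI after its last '#' or '/' (assumed rule)."""
    index = max(uri.rfind("#"), uri.rfind("/"))
    return uri[index + 1:]


subject = local_name("http://thinkaurelius.com#hercules")
predicate = local_name("http://thinkaurelius.com#father")
obj = local_name("http://thinkaurelius.com#jupiter")
print(subject, predicate, obj)
```

Under this rule, URIs from different namespaces that share a local name map to the same string, so shortening trades global uniqueness for readability.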

As Properties

faunus.input.format.rdf.as-properties

RDF is a triple format. As such, there are no properties, only vertices and edges. In some situations, an object URI should be treated as a property of the subject vertex rather than as a vertex of its own. For instance, when http://www.w3.org/1999/02/22-rdf-syntax-ns#type is included in the comma-separated list assigned to the property above, then the triple

<http://thinkaurelius.com#hercules> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://thinkaurelius.com#demigod> .

yields a Hercules vertex with a type property of demigod. A typical setting for this property is shown below.

faunus.input.format.rdf.as-properties=http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2000/01/rdf-schema#label
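How such a setting might be applied can be sketched as follows. This is an illustration only, not Faunus code: the helper names `parse_as_properties` and `apply_triple`, the dictionary layout of a vertex, and the `mother` triple are assumptions, and full URIs are kept as keys for simplicity.

```python
# The value from the setting above, as it would appear in a configuration file.
AS_PROPERTIES = (
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type,"
    "http://www.w3.org/2000/01/rdf-schema#label"
)


def parse_as_properties(value):
    """Split the comma-separated setting into a set of predicate URIs."""
    return {item.strip() for item in value.split(",") if item.strip()}


def apply_triple(vertex, predicate, obj, property_predicates):
    """Store the object as a property if the predicate is listed, else as an edge."""
    if predicate in property_predicates:
        vertex["properties"][predicate] = obj
    else:
        vertex["edges"].append((predicate, obj))
    return vertex


property_predicates = parse_as_properties(AS_PROPERTIES)
hercules = {"properties": {}, "edges": []}

apply_triple(hercules,
             "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
             "http://thinkaurelius.com#demigod",
             property_predicates)
# Hypothetical second triple whose predicate is not in the list.
apply_triple(hercules,
             "http://thinkaurelius.com#mother",
             "http://thinkaurelius.com#alcmene",
             property_predicates)
print(hercules)
```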