Is it possible to generate BlankNodes from data references? #271

dachafra · 2022-07-13T12:47:28Z

The behavior should be similar to the one in RML:

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix ex: <http://example/> .
@prefix : <http://example.org/> .
@base <http://example.org/> .

:firstTM a rr:TriplesMap ;
    rml:logicalSource [
        rml:source "data.csv";
        rml:referenceFormulation ql:CSV
    ];
    rr:subjectMap [
        rml:reference "c1" ;
        rr:termType rr:BlankNode
    ];
    rr:predicateObjectMap [
        rr:predicate ex:p ;
        rr:objectMap [
            rr:template "http://example/{c2}"
        ]
    ] .

Input:

c1,c2
b0,A

Output:

 _:b0 ex:p ex:A .

enridaga · 2022-07-13T15:23:35Z

You can just construct bnodes:

PREFIX ex: <http://example/> 
PREFIX fx:  <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

CONSTRUCT {
 [] ex:p ?A
} WHERE {
 SERVICE <x-sparql-anything:> {
	fx:properties fx:location "./data.csv" ; fx:csv.headers true .
 	[] xyz:c2 ?A
 }
}

or, if you want to control the bnode identifier for some reason:

PREFIX ex: <http://example/> 
PREFIX fx:  <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

CONSTRUCT {
 ?bnode ex:p ?A
} WHERE {
 SERVICE <x-sparql-anything:> {
	fx:properties fx:location "./data.csv" ; fx:csv.headers true .
 	[] xyz:c1 ?b0 ; xyz:c2 ?A
 }
 BIND ( BNODE ( ?b0 ) as ?bnode ) 
}

dachafra · 2022-07-14T13:44:02Z

I've arrived at this point, yes, but you cannot take the blank node identifier from the input source, right?

enridaga · 2022-07-14T16:55:49Z

> I've arrived at this point, yes, but you cannot take the blank node identifier from the input source, right?

You can take it from there, as you see in the second query. I am not sure I get the use case here.
Do you mean that you want to keep the blank node identifier in the generated graph?
The generated blank node ids depend on the serialiser. BNode identifiers are supposed to be local and are usually generated during serialisation or during data loading. So, what's the point of forcing them?
If you want to mint an identifier, you probably want an IRI instead. Am I getting it right?

justin2004 · 2022-07-14T23:02:56Z

you could do this:

curl --silent 'http://localhost:3000/sparql.anything'  \
--header "Accept: text/csv" \
--data-urlencode 'query=
PREFIX  fx:   <http://sparql.xyz/facade-x/ns/>
SELECT  *
WHERE
  { SERVICE <x-sparql-anything:>
      { fx:properties
                  fx:location     "/app/input.csv" ;
                  fx:csv.headers  true .
        ?s        ?p              ?o
        BIND(iri(?s) AS ?s_iri)
      }
  }
'

yielding:

s	p	o	s_iri
_:b0	http://sparql.xyz/facade-x/data/c1	b0	_:file:/app/input.csv##row1
_:b0	http://sparql.xyz/facade-x/data/c2	A	_:file:/app/input.csv##row1
_:b1	http://www.w3.org/1999/02/22-rdf-syntax-ns#type	http://sparql.xyz/facade-x/ns/root	_:file:/app/input.csv#
_:b1	http://www.w3.org/1999/02/22-rdf-syntax-ns#_1	_:b0	_:file:/app/input.csv#

justin2004 · 2022-07-14T23:08:27Z

oh, i know what you want now.
one minute.

justin2004 · 2022-07-14T23:24:25Z

it appears that apache jena does not let you synthesize a bnode identifier manually.
this is as close as i can get but neither quad is what you are looking for (one isn't a well formed quad and i'm not sure about the other).
though i think an actual IRI is what i would use in practice.

curl --silent 'http://localhost:3000/sparql.anything'  \
--header "Accept: application/n-quads" \
--data-urlencode 'query=
PREFIX  :     <http://example.com/>
PREFIX  xyz:  <http://sparql.xyz/facade-x/data/>
PREFIX  fx:   <http://sparql.xyz/facade-x/ns/>
CONSTRUCT 
  { 
    ?new_s_iri :p ?new_c2 .
    ?new_s_str :p ?new_c2 .
  }
WHERE
  { SERVICE <x-sparql-anything:>
      { fx:properties
                  fx:location     "/app/input.csv" ;
                  fx:csv.headers  true .
        ?s        xyz:c1          ?c1 ;
                  xyz:c2          ?c2
        BIND(iri(concat("_:", ?c1)) AS ?new_s_iri)
        BIND(concat("_:", ?c1) AS ?new_s_str)
        BIND(iri(concat(str(:), ?c2)) AS ?new_c2)
      }
  }
'

yields:

"_:b0" <http://example.com/p> <http://example.com/A> .
<_:b0> <http://example.com/p> <http://example.com/A> .

dachafra · 2022-07-15T08:26:28Z

@justin2004 yeah, exactly! I was able to obtain the same results, but I don't think that any of the results are valid RDF, right?

Just to let you know, this comes from one of the R2RML test cases: https://www.w3.org/2001/sw/rdb2rdf/test-cases/#R2RMLTC0002b. It is not that I specifically want this feature in the engine; it is more for comparing both solutions. One of the main benefits of having this feature is that identifiers do not have to be maintained in memory during the execution.

enridaga · 2022-07-15T08:53:47Z

I don't think it is possible to control the blank nodes that are generated by the serializer, but this is probably a question for users@jena.apache.org.

However, while playing with this use case I found an interesting issue when one wants to generate multiple triples with the same bnode on different construct template projections. At the moment, a new bnode is generated for every projection, even if we use the BNODE function. This is reproducible by adding more rows to the example CSV. A new bnode is created for each one of them. I will open a separate issue for that.

justin2004 · 2022-07-15T12:28:55Z

> At the moment, a new bnode is generated for every projection, even if we use the BNODE function.

I thought I just wasn't understanding how to use bnode() with an argument but since you might have also expected different behavior I opened an issue:
https://issues.apache.org/jira/browse/JENA-2340

enridaga · 2022-07-18T08:23:51Z

> Just to let you know, this comes from one of the R2RML test cases: https://www.w3.org/2001/sw/rdb2rdf/test-cases/#R2RMLTC0002b. It is not that I specifically want this feature in the engine; it is more for comparing both solutions.

Considering they are bnodes, the comparison can be done via graph isomorphism (there are some useful utils for this in Jena).
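
To make the isomorphism point concrete, here is a minimal Python sketch. It is not Jena's implementation: the `isomorphic` and `is_bnode` helpers and the encoding of terms as N-Triples-style strings are assumptions made only for this example. It compares two small graphs while ignoring the actual blank node labels. It tries every possible relabelling, so it is only reasonable for tiny test-case outputs.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption for this sketch: terms are N-Triples-style strings,
    # so blank nodes are the ones starting with "_:".
    return term.startswith("_:")


def isomorphic(g1, g2):
    """Return True if two sets of triples are equal up to blank node renaming."""
    if len(g1) != len(g2):
        return False
    b1 = sorted({t for triple in g1 for t in triple if is_bnode(t)})
    b2 = sorted({t for triple in g2 for t in triple if is_bnode(t)})
    if len(b1) != len(b2):
        return False
    # Brute force: try every one-to-one mapping of g1 labels onto g2 labels.
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False


# Expected output of the test case vs. what an engine might serialise.
expected = {("_:b0", "<http://example/p>", "<http://example/A>")}
actual = {("_:genid1", "<http://example/p>", "<http://example/A>")}
print(isomorphic(expected, actual))
```

With this kind of comparison the label `b0` from the CSV does not need to survive serialisation: only the shape of the graph is checked.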

enridaga added the Question Further information is requested label Jul 13, 2022

enridaga mentioned this issue Jul 15, 2022

Generate triples with the same bnode on different construct template projections #273

Closed
