Statements w/ Blank Nodes can be duplicated in serialized output #261

no-reply · 2016-01-17T19:41:15Z

The definition of blank node equality runs afoul of the key semantics of Ruby Hashes. This leads to repeated inserts of blank nodes that satisfy #== and have the same #id and #hash, but are not exactly the same object.
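A minimal sketch of the Hash behaviour in question, using a hypothetical `SketchNode` class rather than RDF.rb's actual `RDF::Node` (whose method definitions are not reproduced here). It assumes only that `#==` and `#hash` follow the id while `#eql?` stays at object identity; Ruby's Hash decides key equality with `#hash` plus `#eql?`, not `#==`:

```ruby
# Hypothetical stand-in for a blank node; NOT RDF.rb's RDF::Node.
class SketchNode
  attr_reader :id

  def initialize(id)
    @id = id.to_s
  end

  # Equal whenever the ids match.
  def ==(other)
    other.is_a?(SketchNode) && id == other.id
  end

  # Same id gives the same hash value.
  def hash
    id.hash
  end

  # #eql? is deliberately left as Object#eql? (object identity).
end

a = SketchNode.new('s')
b = SketchNode.new('s')

index = {}
index[a] = true
index[b] = true

puts a == b            # true
puts a.hash == b.hash  # true
puts a.eql?(b)         # false
puts index.size        # 2
```

Both objects land in the same hash bucket, but because `#eql?` reports them as different they are stored as two separate keys.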

require 'rdf'
require 'rdf/ntriples'

repo = RDF::Repository.new

repo.insert([RDF::Node.new('s'), RDF::URI('http://ex.org/p'), 'o'])
repo.insert([RDF::Node.new('s'), RDF::URI('http://ex.org/p'), 'o'])

repo.dump :ntriples
# => "_:s <http://ex.org/p> \"o\" .\n_:s <http://ex.org/p> \"o\" .\n"

A (proposed) failing test for this is at rdf-spec. An alternate solution is to continue to fail that test, but check that internally unique blank nodes are reflected as such in serialized output.


gkellogg · 2016-01-17T20:55:34Z

This is a serializer issue, isn't it? There's a :unique_bnodes option to Writer that re-issues BNode identifiers for this reason; most people expect to see the same BNode identifier used when re-serializing, so it's not the default.

You are inserting two different nodes, so they should be two different statements.

Alternatively, Writer could scan the graph/repo to look for BNodes with the same id which are different nodes and modify the id; however, Writer doesn't operate on graph/repo, but on each statement coming in, so it would probably need to keep a local memory and this could get ugly.
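The "local memory" idea could look roughly like the sketch below. `SketchRelabeler` and `label_for` are hypothetical names, not part of RDF.rb's Writer; the sketch assumes the writer sees each node object together with its requested id as statements stream in, and it tracks nodes by object identity:

```ruby
# Hypothetical helper; NOT part of RDF.rb's Writer API.
class SketchRelabeler
  def initialize
    @labels = {}.compare_by_identity # node object => issued label
    @used   = {}                     # issued label => true
  end

  # Returns a stable label for this node object. If another object
  # already took the requested label, a numeric suffix is appended.
  def label_for(node, requested)
    return @labels[node] if @labels.key?(node)

    label = requested
    n = 0
    while @used.key?(label)
      n += 1
      label = "#{requested}_#{n}"
    end
    @used[label] = true
    @labels[node] = label
  end
end

relabeler = SketchRelabeler.new
first  = Object.new # stands in for one blank node object
second = Object.new # a different object with the same requested id

puts relabeler.label_for(first, 's')   # s
puts relabeler.label_for(second, 's')  # s_1
puts relabeler.label_for(first, 's')   # s
```

The memory grows with the number of distinct blank nodes seen, which is part of why this approach could get ugly for large streams.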

no-reply · 2016-01-17T21:40:50Z

The proposed solutions to this are broken down into #262 and #263. The issue is represented correctly by those tickets.

Closing.

gkellogg mentioned this issue Jan 17, 2016

[for consideration] Hamster Repository #260

Merged

no-reply closed this as completed Jan 17, 2016
