VG RDF uses local non standard URIs for its own concepts #109

Closed
JervenBolleman opened this issue Oct 1, 2015 · 3 comments
@JervenBolleman
Contributor

We should mint stable URIs for the concepts of a node in the graph, a path, a step in a path, and "before",
i.e. a mini ontology covering the following concepts.

Something like this

The following RDFS classes are present in the data (they can be left implicit via an RDFS construct, as they are not that interesting or needed for likely queries).

A :Node in a Variation Graph is a resource that represents a stretch of continuous DNA. It has an identity in the graph, as well as an associated IUPAC representation of the DNA molecule.

A :Path in the Variation Graph links nodes in the graph into an order (via steps) that represents a linear DNA sequence as found in a single person/assembly.

A :Step belongs to a single :Path and is ordered by the value of its :step predicate.

The RDF representation depends on the following predicates:

:before links :Nodes together in the directed graph, flowing in the general 5' to 3' direction of the consensus assembly. Its rdfs:domain and rdfs:range are both :Node.

:step has :Step as its domain and xsd:positiveInteger as its range. Within each :Path, every step has a unique value.

:node links a :Step in a :Path to the specific :Node. It has domain :Step and range :Node.
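
To make the proposed classes and predicates concrete, here is a minimal Python sketch of the same model. It is only an illustration under stated assumptions: the class and field names (Node, Step, rank, path_sequence) and the example data are hypothetical and are not taken from the vg code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a :Node: an identity plus an IUPAC DNA string."""
    node_id: int
    sequence: str


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for a :Step."""
    path: str   # the :Path this step belongs to
    rank: int   # value of the :step predicate, unique within a path
    node: Node  # target of the :node predicate


def path_sequence(steps, path):
    """Concatenate node sequences of one path, ordered by the :step value."""
    ordered = sorted((s for s in steps if s.path == path), key=lambda s: s.rank)
    return "".join(s.node.sequence for s in ordered)


# Illustrative data only: three nodes chained by :before, one path over them.
n1, n2, n3 = Node(1, "ACTG"), Node(2, "A"), Node(3, "GA")
before = {(1, 2), (2, 3)}  # pairs (x, y) meaning: node x :before node y
steps = [Step("ref", 2, n2), Step("ref", 1, n1), Step("ref", 3, n3)]

print(path_sequence(steps, "ref"))
```

The steps are deliberately given out of order to show that only the :step value, not storage order, determines the path sequence.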

@ekg
Member

ekg commented Oct 1, 2015

I think this is a good start. We will also need to represent entities and paths on the reverse strand. This is necessary to support concise representations of inversions, translocations, and assembly graphs. So instead of linking nodes our edges link :Sides, which correspond to a node end, and each step in a path can be on the forward or reverse complement (:Traversal maybe). Does this make sense?
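
One possible reading of the :Side and :Traversal idea, sketched in Python. Everything here is an assumption made for illustration: the names End, Side, Traversal, entry_side, exit_side and is_connected are hypothetical, and edges are modelled as unordered pairs of sides.

```python
from dataclasses import dataclass
from enum import Enum


class End(Enum):
    START = "start"  # 5' end of the node's forward strand
    END = "end"      # 3' end of the node's forward strand


@dataclass(frozen=True)
class Side:
    node_id: int
    end: End


@dataclass(frozen=True)
class Traversal:
    node_id: int
    reverse: bool = False  # True: the node is read as its reverse complement


def entry_side(t):
    """Side through which a traversal enters its node."""
    return Side(t.node_id, End.END if t.reverse else End.START)


def exit_side(t):
    """Side through which a traversal leaves its node."""
    return Side(t.node_id, End.START if t.reverse else End.END)


def is_connected(edges, a, b):
    """True if some edge joins the exit side of a to the entry side of b."""
    return frozenset((exit_side(a), entry_side(b))) in edges


edges = {
    frozenset((Side(1, End.END), Side(2, End.START))),  # ordinary forward link
    frozenset((Side(1, End.END), Side(2, End.END))),    # inversion: node 2 entered from its end
}

print(is_connected(edges, Traversal(1), Traversal(2)))
print(is_connected(edges, Traversal(1), Traversal(2, reverse=True)))
```

Because edges join sides rather than nodes, the same edge can be walked in either direction: traversing node 2 in reverse and then node 1 in reverse uses the first edge above.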

@JervenBolleman
Contributor Author

@ekg Something like this

So instead of having a :before predicate we should maybe have

linksOn35to53 i.e. what was :before

linksOn35to53ofComplementOf and use this predicate to say that the end instead of

not attached here
|
v
actgaga
tctcagt
      /\
       | but here

We can then also split the :step predicate into two subproperties, :stepOnForward and :stepOnReverse.

This makes the query example a bit more complicated.

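PREFIX : <http://base/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
# The dna: namespace below is a placeholder; no namespace has been agreed on yet.
PREFIX dna: <http://example.org/dna#>
SELECT ?path (GROUP_CONCAT(?sequence; separator='') AS ?pathSeq)
WHERE {
  {
    # Order the steps in a subquery. Note that SPARQL does not guarantee
    # GROUP_CONCAT follows this order; many endpoints do in practice.
    SELECT ?path ?sequence
    WHERE {
      ?step :path ?path ;
            :node ?node .
      {
        ?step :stepOnForward ?order .
        ?node rdf:value ?sequence .
      } UNION {
        ?step :stepOnReverse ?order .
        ?node rdf:value ?toComplement .
        BIND(dna:reverseComplementOf(?toComplement) AS ?sequence)
      }
    }
    ORDER BY ?path ?order
  }
}
GROUP BY ?path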

The dna:reverseComplementOf would be a custom function or, in the case that a SPARQL endpoint does not support custom functions, a bunch of nested REPLACE operators.
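
For reference, this is roughly what such a function and the query as a whole would compute, sketched in Python rather than SPARQL. The names reverse_complement and path_sequences and the tuple layout of a step are assumptions made for this illustration; the complement table covers the standard IUPAC nucleotide codes.

```python
from collections import defaultdict

# IUPAC nucleotide codes and their complements, upper and lower case.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def path_sequences(steps):
    """Rebuild one sequence per path.

    steps: iterable of (path, order, node_sequence, on_reverse) tuples,
    mirroring the :stepOnForward / :stepOnReverse split.
    """
    by_path = defaultdict(list)
    for path, order, seq, on_reverse in steps:
        by_path[path].append((order, reverse_complement(seq) if on_reverse else seq))
    return {
        path: "".join(seq for _, seq in sorted(items, key=lambda item: item[0]))
        for path, items in by_path.items()
    }


# Illustrative data: the second step walks its node on the reverse strand.
steps = [("p1", 2, "actgaga", True), ("p1", 1, "gg", False)]
print(reverse_complement("actgaga"))
print(path_sequences(steps))
```

The example node sequence is the one from the diagram above, so its reverse complement is the lower strand shown there.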

JervenBolleman added a commit to JervenBolleman/vg that referenced this issue Feb 21, 2016
…concepts. For now I put them in the biohackathon.org namespace, but this needs to be discussed with all stakeholders.
JervenBolleman added a commit to JervenBolleman/vg that referenced this issue Feb 21, 2016
ekg added a commit that referenced this issue Mar 17, 2016
#109 Start with making public stable URIs for the VG graph RDF concepts
@adamnovak
Member

I'm going to say that #217 fixed this. This can be re-opened if we want to further improve our URLs.
