Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign graph view and graph query #269

Closed
benjamingeer opened this issue Sep 21, 2016 · 6 comments
Closed

Redesign graph view and graph query #269

benjamingeer opened this issue Sep 21, 2016 · 6 comments
Assignees
Labels
enhancement improve existing code or new feature
Milestone

Comments

@benjamingeer
Copy link

benjamingeer commented Sep 21, 2016

This concerns the API operation to get a graph of resources that are reachable from a certain resource. It has these requirements:

  • The depth of the search should be configurable in the request.
  • Include any edge that's a subproperty of knora-base:hasLinkTo.
  • Exclude any edge that's a subproperty of knora-base:isPartOf.
  • Include links that go in either direction.
  • Exclude deleted resources, and any resources that are reachable only via a deleted resource.
  • Exclude resources that the user doesn't have permission to see, resources that are reachable only via links that the user doesn't have permission to see, and any other resources reachable only via such paths.

The last time we tried to do this, we attempted to implement exactly what the old API did, but we didn't find a good SPARQL design. Also, it was difficult to make it efficient, because the old API provides an excessive amount of data, including all the properties of every node in the graph.

Thanks to Damyan Ognyanov at Ontotext, we have a proof-of-concept of a better SPARQL design (using a CONSTRUCT query with nested OPTIONAL blocks up to the requested maximum depth). It looks like we can get all the necessary data in one SPARQL query.

@lrosenth and I talked this over, and we're going to redesign the API operation, do a new design in the Knora API server, and change the implementation in the SALSAH GUI accordingly.

To deal with both large and small graphs, the GUI could support two ways of displaying the graph:

  1. If it's a small graph, the whole graph could be shown all at once, as in the current GUI.
  2. If it's a large graph, we could show just the initial node and its immediate neighbours. The user could then click on a neighbour to get its neighbours.

The API operation will return the minimum information needed to draw the graph, which is probably just the adjacency list, along with the label of each node and each edge, and the label of each node's resource class.

cc @milchkannen

@benjamingeer benjamingeer added enhancement improve existing code or new feature triplestore labels Sep 21, 2016
@benjamingeer benjamingeer added this to the On Deck milestone Sep 21, 2016
@benjamingeer benjamingeer self-assigned this Sep 21, 2016
@benjamingeer
Copy link
Author

benjamingeer commented Oct 18, 2016

Since the permission checking has to be done in Scala, it will need to work on an in-memory representation of the graph. This looks doable using Graph for Scala. After getting the SPARQL query results, we can remove any nodes that the user doesn't have permission to see, as well as any edges containing those nodes, and any edges whose link values the user doesn't have permission to see. To remove nodes that were only reachable via removed nodes, we can generate a new graph containing only what's still reachable from the initial node, like this:

scala> import scalax.collection.Graph
import scalax.collection.Graph

scala> import scalax.collection.GraphPredef._, scalax.collection.GraphEdge._
import scalax.collection.GraphPredef._
import scalax.collection.GraphEdge._

scala> import scalax.collection.edge._
import scalax.collection.edge._

scala> val nodes = List("http://foo/1", "http://foo/2", "http://foo/3", "http://foo/4")
nodes: List[String] = List(http://foo/1, http://foo/2, http://foo/3, http://foo/4)

scala> val e1 = LDiEdge("http://foo/1", "http://foo/2")("hasAuthor")
e1: scalax.collection.edge.LDiEdge[String] with scalax.collection.GraphEdge.EdgeCopy[scalax.collection.edge.LDiEdge]{type L1 = String} = http://foo/1~>http://foo/2 'hasAuthor

scala> val e2 = LDiEdge("http://foo/2", "http://foo/3")("hasName")
e2: scalax.collection.edge.LDiEdge[String] with scalax.collection.GraphEdge.EdgeCopy[scalax.collection.edge.LDiEdge]{type L1 = String} = http://foo/2~>http://foo/3 'hasName

scala> val edges = List(e1, e2)
edges: List[scalax.collection.edge.LDiEdge[String] with scalax.collection.GraphEdge.EdgeCopy[scalax.collection.edge.LDiEdge]{type L1 = String}] = List(http://foo/1~>http://foo/2 'hasAuthor, http://foo/2~>http://foo/3 'hasName)

scala> val g = Graph.from(nodes, edges)
g: scalax.collection.Graph[String,scalax.collection.edge.LDiEdge] = Graph(http://foo/1, http://foo/2, http://foo/3, http://foo/4, http://foo/1~>http://foo/2 'hasAuthor, http://foo/2~>http://foo/3 'hasName)

scala> val initial = g.get("http://foo/1")
initial: g.NodeT = http://foo/1

scala> initial.outerEdgeTraverser.toGraph
res0: scalax.collection.Graph[String,scalax.collection.edge.LDiEdge] = Graph(http://foo/1, http://foo/2, http://foo/3, http://foo/1~>http://foo/2 'hasAuthor, http://foo/2~>http://foo/3 'hasName)

Since two nodes can be connected by more than one edge, we'll need a multigraph with keyed edges.

@benjamingeer
Copy link
Author

Since arbor.js hasn't been updated in 5 years, I'm looking for something else.

These libraries look good and well-maintained:

d3

Uses SVG.

https://d3js.org/

Examples:

Cytoscape

Uses canvas.

http://js.cytoscape.org/

Example: http://js.cytoscape.org/demos/2ebdc40f1c2540de6cf0/

I didn't find a good WebGL option. My inclination is to try D3.

@benjamingeer
Copy link
Author

benjamingeer commented Nov 11, 2016

I spent some time trying to get d3 to work, but it's difficult because most of the graph examples I found use d3 version 3, but now there's d3 version 4 and it's very different.

I'm going to try Cytoscape, which looks easier because it's specifically for graphs.

@benjamingeer
Copy link
Author

Cytoscape seems to work, now I just need to integrate it into SALSAH.

@benjamingeer
Copy link
Author

The good news:

  • This works end-to-end with GraphDB.

The bad news:

  • When there's a realistic amount of data in the triplestore, the query is slow (3 seconds) with GraphDB.
  • It doesn't work at all with Fuseki.

Options:

  • Ask Ontotext for help (perhaps consulting) to optimise it.
  • Rewrite it using simple recursive SPARQL queries.

@benjamingeer
Copy link
Author

I've rewritten it using simple recursive SPARQL queries. Performance seems OK, and it works on Fuseki as well as GraphDB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement improve existing code or new feature
Projects
None yet
Development

No branches or pull requests

2 participants