Batch neighbour retrieval in traversals #21809
Draft · +311 −137
We found that if a traversal has a limit but one of the visited nodes is a supernode, we first retrieve all of that supernode's neighbours and then discard most of them again because of the limit. This unnecessary retrieval can be very expensive.
Therefore we decided to batch the neighbour retrieval for each node into batches of 1000. For the `OneSidedEnumerator`, the neighbourhood retrieval is done in `computeNeighbourhoodOfNextVertex()`. On a single server, the `expand` method on the provider is the one that saves all the neighbours into memory. In the cluster case, already the fetching of the neighbour edges (via the provider) can be expensive and should be batched.
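The following is not code from this PR, just a minimal sketch of the intended behaviour: fetch neighbours in batches of 1000 and stop as soon as the limit is reached, so the rest of a supernode's neighbourhood is never materialised. All names and types here (`EdgeRef`, `NeighbourCursor`, `expandBatched`) are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using EdgeRef = std::string;  // hypothetical stand-in for the real edge type

// Hypothetical cursor over the neighbours of one vertex: yields at most
// n neighbours per call and an empty batch once exhausted.
struct NeighbourCursor {
  std::vector<EdgeRef> edges;
  std::size_t pos = 0;

  std::vector<EdgeRef> next(std::size_t n) {
    std::vector<EdgeRef> batch;
    while (pos < edges.size() && batch.size() < n) {
      batch.push_back(edges[pos++]);
    }
    return batch;
  }
};

constexpr std::size_t kBatchSize = 1000;

// Expand one vertex in batches and stop as soon as the limit is satisfied,
// so the remaining neighbours of a supernode are never fetched at all.
void expandBatched(NeighbourCursor& cursor, std::size_t limit,
                   std::function<void(EdgeRef const&)> const& callback) {
  std::size_t produced = 0;
  while (produced < limit) {
    auto batch = cursor.next(kBatchSize);
    if (batch.empty()) {
      break;  // neighbourhood exhausted
    }
    for (auto const& edge : batch) {
      callback(edge);
      if (++produced >= limit) {
        return;  // limit reached: skip all remaining batches
      }
    }
  }
}
```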
The current changes include a `SingleServerNeighbourProvider`, an iterator over the neighbours of one vertex. The vertex can be set (which resets the internal cursors) via `rearm`. The `next` method is supposed to return the next 1000 neighbours; currently it still returns all of them, so this needs more work. The behaviour of the `SingleServerProvider` has also not changed so far: it iterates over all batches. This has to change so that it executes the callback only on the next batch.
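A hypothetical sketch of the interface just described: only the class name, `rearm`, `next`, and the batch size of 1000 come from this description; all signatures and members below are guesses for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

class SingleServerNeighbourProvider {
 public:
  static constexpr std::size_t kBatchSize = 1000;

  // Point the iterator at a new vertex; this resets the internal cursors.
  void rearm(std::string vertexId) {
    _vertexId = std::move(vertexId);
    _offset = 0;  // stand-in for resetting the real edge cursors
  }

  // Intended behaviour: return the next batch of up to 1000 neighbours, or
  // std::nullopt once the neighbourhood is exhausted. (Per the description
  // above, the current draft still returns all neighbours at once.)
  std::optional<std::vector<std::string>> next() {
    auto batch = readEdges(_vertexId, _offset, kBatchSize);
    if (batch.empty()) {
      return std::nullopt;
    }
    _offset += batch.size();
    return batch;
  }

 private:
  // Placeholder for the real storage-engine cursor access; stubbed out so
  // the sketch stays self-contained.
  std::vector<std::string> readEdges(std::string const& /*vertex*/,
                                     std::size_t /*offset*/,
                                     std::size_t /*n*/) {
    return {};
  }

  std::string _vertexId;
  std::size_t _offset = 0;
};
```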
Another thing that has to be investigated: the `TraversalStats` have to be given to the `SingleServerNeighbourProvider`, possibly as a `shared_ptr` (?); using just a bare reference to the traversal stats of the `SingleServerProvider` crashes arangod (a sketch of the shared-ownership idea follows below). Also, a similar provider has to be implemented for the cluster case, where we fetch edges.
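On the stats question, a hedged sketch of why a `shared_ptr` might help, assuming the crash is a lifetime issue (the bare reference dangling once the owning object is moved or destroyed); the types and members below are invented for illustration.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical minimal stand-in for the real TraversalStats in arangod.
struct TraversalStats {
  std::uint64_t scannedIndexEntries = 0;
};

// If the neighbour provider can outlive (or be used after a move of) the
// owning provider's stats object, a bare reference dangles, which would
// explain the observed crash. Shared ownership keeps the stats alive for
// as long as either side still holds them.
class NeighbourProviderSketch {
 public:
  explicit NeighbourProviderSketch(std::shared_ptr<TraversalStats> stats)
      : _stats(std::move(stats)) {}

  void countScan() { ++_stats->scannedIndexEntries; }

 private:
  std::shared_ptr<TraversalStats> _stats;  // shared, not a bare reference
};
```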