support path semantic in non-recursive-path #4405
base: master
Hi @excaliburwyj,
Thanks for the contribution. I think the overall idea makes sense, though I'm in the middle of migrating recursive join to a different parallel computation framework (#4404). I'll review this in detail over the weekend. I hope that works for you.
51401b7 to aaa2bc4
Whoops, forgot to review this PR. Let me do it today. Sorry about this.
Hi @excaliburwyj,
As I'm reviewing this PR, another idea came to mind. Instead of first finding a join operator on top of the recursive join and then appending a filter (which is the logic you implemented in `PathSemanticRewriter`), I wonder if the following approach would be simpler.
First, we add a scalar function `hasDuplication` to the system. It takes an arbitrary number of nodes or relationships (a relationship can be recursive), e.g.
MATCH (a)-[e]->(b)-[]->(c) WHERE NOT hasDuplication(a, b, c)
The above is an equivalent form of
MATCH p = (a)-[e]->(b)-[]->(c) WHERE is_acyclic(p)
The difference is that it doesn't require a path variable.
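To make the intended behaviour concrete, here is a minimal sketch of what such a function could compute. It is not the engine's implementation: the name `has_duplication`, the use of plain hashable IDs for nodes and relationships, and the choice to model a recursive relationship as a list of relationship IDs are all assumptions made for illustration.

```python
def has_duplication(*elements):
    """Return True if any ID occurs more than once among the arguments.

    Assumption: a node or a single relationship is a hashable ID, and a
    recursive relationship is a list of relationship IDs that is flattened.
    """
    seen = set()
    for element in elements:
        # Flatten a recursive relationship into its individual IDs.
        ids = element if isinstance(element, list) else [element]
        for id_ in ids:
            if id_ in seen:
                return True
            seen.add(id_)
    return False


# Nodes a, b, c bound to IDs 1, 2, 1: the match revisits node 1.
print(has_duplication(1, 2, 1))
# A recursive relationship [10, 11] followed by relationship 12: no repeat.
print(has_duplication([10, 11], 12))
```

With this shape, `NOT hasDuplication(a, b, c)` keeps exactly the matches whose bound elements are pairwise distinct.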
Once this is done, for a given MATCH query, we can check whether the current semantic is `trail` or `acyclic`. If so, at the binding stage we directly add a `hasDuplication` predicate expression over either the nodes or the relationships, depending on the semantic.
With this approach, we can limit the changes to the function module and the binder module, and we don't need to worry about planning and optimization.
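A rough sketch of that binder step, under stated assumptions: the function name `bind_semantic_predicate`, the string-based predicate, and the mapping of `trail` to relationship variables and `acyclic` to node variables are illustrative only and not taken from the codebase. A real binder would build an expression tree rather than a string.

```python
def bind_semantic_predicate(semantic, node_vars, rel_vars):
    """Return the extra predicate to attach to a MATCH pattern, or None.

    Assumption: "trail" forbids repeated relationships, "acyclic" forbids
    repeated nodes, and any other semantic (e.g. "walk") adds no filter.
    """
    if semantic == "trail":
        args = rel_vars
    elif semantic == "acyclic":
        args = node_vars
    else:
        return None
    # Fewer than two variables can never contain a duplicate.
    if len(args) < 2:
        return None
    return "NOT hasDuplication(" + ", ".join(args) + ")"


# Pattern (a)-[e]->(b)-[]->(c); "_r1" is a hypothetical internal name
# for the anonymous relationship.
print(bind_semantic_predicate("acyclic", ["a", "b", "c"], ["e", "_r1"]))
print(bind_semantic_predicate("trail", ["a", "b", "c"], ["e", "_r1"]))
```

Because the predicate is added during binding, the planner sees an ordinary filter and needs no special handling.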
Let me know what you think.
Hi @andyfengHKU,
Hi @excaliburwyj,
I totally agree with the idea that we should generate a different result when the semantic is set to
What I'm suggesting is that, at the implementation level, we construct an additional filter in the binder, and this is hidden from the user. For example, the user inputs are
Internally we perform a rewrite to the second query as
The user is not aware of this rewrite, but we can achieve a
Hi @andyfengHKU
Description
For non-recursive paths, such as "match (a)-[b]-(c)-[d]-(e) return e;", path semantics currently have no effect.
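As an illustration of why the semantic matters for such a pattern, the sketch below enumerates matches of (a)-[b]-(c)-[d]-(e) on a tiny undirected graph. The graph, the function names, and the definitions used (trail: no repeated relationship; acyclic: no repeated node; walk: no restriction) are assumptions for the example, not output from the engine.

```python
# Hypothetical toy graph: relationship ID -> (endpoint, endpoint).
# r1 and r2 are parallel relationships between nodes 1 and 2.
EDGES = {"r1": (1, 2), "r2": (1, 2), "r3": (2, 3)}


def neighbors(node):
    """Yield (relationship, other endpoint) pairs, ignoring direction."""
    for rel, (u, v) in EDGES.items():
        if u == node:
            yield rel, v
        elif v == node:
            yield rel, u


def two_hop_matches(semantic):
    """Enumerate (a, b, c, d, e) bindings for (a)-[b]-(c)-[d]-(e)."""
    nodes = sorted({n for pair in EDGES.values() for n in pair})
    results = []
    for a in nodes:
        for b, c in neighbors(a):
            for d, e in neighbors(c):
                if semantic == "trail" and b == d:
                    continue  # the same relationship is used twice
                if semantic == "acyclic" and len({a, c, e}) < 3:
                    continue  # some node is visited twice
                results.append((a, b, c, d, e))
    return results


for semantic in ("walk", "trail", "acyclic"):
    print(semantic, len(two_hop_matches(semantic)))
```

On this graph the three semantics return different numbers of matches, which is the behaviour that is lost when the semantic is ignored for non-recursive patterns.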
Support has been added:
Fixes # (issue)