-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
path.Order() #796
path.Order() #796
Conversation
Works: g.V().Order().All() {
"result": [
{
"id": "smart_person"
},
{
"id": "<alice>"
},
{
"id": "<are>"
},
{
"id": "<bob>"
},
{
"id": "<charlie>"
},
{
"id": "<dani>"
},
{
"id": "<emily>"
},
{
"id": "<follows>"
},
{
"id": "<fred>"
},
{
"id": "<greg>"
},
{
"id": "<predicates>"
},
{
"id": "<smart_graph>"
},
{
"id": "<status>"
}
]
} |
g.V("<bob>").Out().Order().All() Retruns: {
"result": [
{
"id": "cool_person"
},
{
"id": "<fred>"
}
]
} |
Works: g.V().Tag("entity").Out("<status>").Order().All() {
"result": [
{
"entity": "<bob>",
"id": "cool_person"
},
{
"entity": "<dani>",
"id": "cool_person"
},
{
"entity": "<greg>",
"id": "cool_person"
},
{
"entity": "<emily>",
"id": "smart_person"
},
{
"entity": "<greg>",
"id": "smart_person"
}
]
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks good, I don't see any obvious flaws in it. Well done! :)
As for the tests, I propose to add them to Gizmo first. Just add a few cases to the test table and we should be good.
The By()
implementation can be in a separate PR, by the way, as well as any backend-specific optimizations.
Reviewed 6 of 7 files at r1, 1 of 1 files at r3.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @iddan)
graph/iterator/order.go, line 32 at r3 (raw file):
} func getOrderedValues(it *Order) *values {
Maybe accept a (qs QuadStore, it Iterator)
and pass (it.qs, it.subIt)
when calling it? This may be useful for other iterators later.
graph/iterator/order.go, line 34 at r3 (raw file):
func getOrderedValues(it *Order) *values { var qs, subIt = it.qs, it.subIt var results = make([]result, 0)
Nit: var results []result
graph/iterator/order.go, line 42 at r3 (raw file):
var tags = make(map[string]graph.Value) subIt.TagResults(tags) vals.results = append(vals.results, result{id, tags})
I think it would be better to also call qs.NameOf(id)
here and store it in the structure similar to the result
. This way you won't need to call it while sorting which will be much faster. Maybe even call .String()
as well to reduce allocations.
Also, this will allow removing qs
from values
.
graph/iterator/order.go, line 44 at r3 (raw file):
vals.results = append(vals.results, result{id, tags}) }
You should also check the error from the sub-iterator here:
err := subIt.Err()
and store it in Order.err
.
So the function should return an error as well.
graph/iterator/order.go, line 49 at r3 (raw file):
subIt.Reset() return &vals
Assuming my suggestion above, the function can be changed to return a []result
instead of *values
.
graph/iterator/order.go, line 52 at r3 (raw file):
} func NewOrder(qs graph.QuadStore, subIt graph.Iterator) *Order {
It seems like you could minimize the interface from graph.QuadStore
to graph.Namer
. This will make it easier to test.
graph/iterator/order.go, line 79 at r3 (raw file):
} // SubIterators returns a slice of the sub iterators. The first iterator is the
I believe the second sentence is not relevant to this iterator.
graph/iterator/order.go, line 82 at r3 (raw file):
// primary iterator, for which the complement is generated. func (it *Order) SubIterators() []graph.Iterator { return nil
The function should return a it.subIt
. I noticed you removed it in one of the commits. I wonder why.
graph/iterator/order.go, line 88 at r3 (raw file):
// has not previously seen. func (it *Order) Next(ctx context.Context) bool { it.runstats.Next += 1
++
, same in other places
graph/iterator/order.go, line 139 at r3 (raw file):
subStats := it.subIt.Stats() return graph.IteratorStats{ NextCost: 1,
subStats.NextCost
graph/iterator/order.go, line 141 at r3 (raw file):
NextCost: 1, ContainsCost: subStats.ContainsCost, Size: int64(len(it.ordered.results)),
It may not be populated yet (Next
not called).
graph/iterator/order.go, line 155 at r3 (raw file):
func (it *Order) String() string { return "Order"
"Order{"+it.subIt.String()+"}"
graph/kv/quad_iterator.go, line 223 at r3 (raw file):
} func (it *QuadIterator) Ordered() bool { return true }
I would actually prefer to name it Sorted
.
graph/shape/shape.go, line 1427 at r3 (raw file):
} type Order struct {
Can we name it Sort
? It works exactly like stdlib's sort.Sort
, so makes sense to name it the same way internally.
The Path.Order
and Gizmo's path.Order
can be named as-is, because those need to match Gremlin.
I am going to fix all the issues once this feature gets priority in my company's projects. Probably in less than a month. |
Updated the code to the new Iterator interface. Ready for rereview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 9 files at r4, 2 of 2 files at r5.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @iddan)
graph/iterator/sort.go, line 26 at r5 (raw file):
func (v values) Swap(i, j int) { v[i], v[j] = v[j], v[i] } type sortNext struct {
Minor, but can you please order things the same way as in other iterators? E.g. first, all the exported methods/types, so a public API will be more visible, than the sortIt
(v2 implementation), than sortNext
and sortContains
(specific implementations of the iterator depending on the query plan).
graph/iterator/sort.go, line 29 at r5 (raw file):
namer graph.Namer subIt graph.Scanner ordered *values
You can safely remove pointer here. Slices in Go can be nil as well.
graph/iterator/sort.go, line 44 at r5 (raw file):
func (it *sortNext) TagResults(dst map[string]graph.Value) { if it.subIt != nil { it.subIt.TagResults(dst)
This won't work. You drain it.subIt
in getSortedValues
, so TagResults
will return tags related to the last result in the iterator. You should recall tags from value.result.tags
.
graph/iterator/sort.go, line 57 at r5 (raw file):
func (it *sortNext) Next(ctx context.Context) bool { if it.ordered == nil {
This should be after the if it.err != nil
. If getSortedValues
fails, v
may be nil, so calling Next
multiple times will try fetching values again, probably leading to the same error. Instead we can first check the saved error and return earlier without touching the datastore.
graph/iterator/sort.go, line 75 at r5 (raw file):
func (it *sortNext) NextPath(ctx context.Context) bool { return false
I think this is incorrect. NextPath
is not implemented on Sort
itself (which is correct), but it may be implemented on the subIt
. So you should call it in getSortedValues
as well (since we don't know if it will be called by the user or not).
graph/iterator/sort.go, line 109 at r5 (raw file):
func (it *sortIt) Lookup() graph.Index { return newSortContains(it.subIt.Lookup())
It's may not be obvious, but:
// we can ignore the order when doing Contains, so we return a subiterator directly
return it.subIt.Lookup()
And we can remove sortContains
completely. One of the benefits of a new iterator interface.
graph/iterator/sort.go, line 123 at r5 (raw file):
subStats, err := it.subIt.Stats(ctx) return graph.IteratorCosts{ NextCost: subStats.NextCost,
subStats.NextCost*2, // TODO(dennwc): better cost calculation
The real cost is still subStats.NextCost
, but we need to somehow make the cost appear larger, since we will pull all values into memory which may be expensive for large datasets.
graph/iterator/sort.go, line 147 at r5 (raw file):
} // NewSort creates a new Sort iterator
// NewSort creates a new Sort iterator.
// TODO(dennwc): This iterator must not be used inside And: in this cases it may be moved to a Contains branch and won't do anything. We should make And account for this.
graph/iterator/sort.go, line 161 at r5 (raw file):
func getSortedValues(namer graph.Namer, it graph.Scanner) (*values, error) { var v values var ctx = context.TODO()
Please don't fake the context. From the code, I see that you can pass a real one it in Next
.
query/gizmo/gizmo_test.go, line 656 at r5 (raw file):
"smart_person", }, },
Another test case worth considering:
{
message: "Use order tags",
query: `
g.V().Tag("target").order().all()
`,
tag: "target",
expect: []string{
"<alice>",
"<are>",
"<bob>",
"<charlie>",
"<dani>",
"<emily>",
"<follows>",
"<fred>",
"<greg>",
"<predicates>",
"<smart_graph>",
"<status>",
"cool_person",
"smart_person",
},
},
It checks if the iterator can correctly propagate tags.
Done everything except:
|
@dennwc I noticed a bug that after calling |
It might be. You can test it with |
Then I'll check if the bug persists after we fix |
@iddan I rebased the branch and implemented Note that ordering may dissappear in some circumstances when using Paths. So changes to the optimizer must be made to ensure that it works in all cases. |
Looks like the issue I reported was solved. |
This change is