path.Order() #796

iddan · 2019-07-08T21:55:30Z

Basic implementation
Tests and bug fixes

This change is

iddan · 2019-07-08T22:03:31Z

Works:

g.V().Order().All()

{
	"result": [
		{
			"id": "smart_person"
		},
		{
			"id": "<alice>"
		},
		{
			"id": "<are>"
		},
		{
			"id": "<bob>"
		},
		{
			"id": "<charlie>"
		},
		{
			"id": "<dani>"
		},
		{
			"id": "<emily>"
		},
		{
			"id": "<follows>"
		},
		{
			"id": "<fred>"
		},
		{
			"id": "<greg>"
		},
		{
			"id": "<predicates>"
		},
		{
			"id": "<smart_graph>"
		},
		{
			"id": "<status>"
		}
	]
}

iddan · 2019-07-08T22:04:48Z

g.V("<bob>").Out().Order().All()

Retruns:

{
	"result": [
                {
			"id": "cool_person"
		},
		{
			"id": "<fred>"
		}
	]
}

iddan · 2019-07-08T22:48:48Z

Works:

g.V().Tag("entity").Out("<status>").Order().All()

{
	"result": [
		{
			"entity": "<bob>",
			"id": "cool_person"
		},
		{
			"entity": "<dani>",
			"id": "cool_person"
		},
		{
			"entity": "<greg>",
			"id": "cool_person"
		},
		{
			"entity": "<emily>",
			"id": "smart_person"
		},
		{
			"entity": "<greg>",
			"id": "smart_person"
		}
	]
}

dennwc

The implementation looks good, I don't see any obvious flaws in it. Well done! :)

As for the tests, I propose to add them to Gizmo first. Just add a few cases to the test table and we should be good.

The By() implementation can be in a separate PR, by the way, as well as any backend-specific optimizations.

Reviewed 6 of 7 files at r1, 1 of 1 files at r3.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @iddan)

graph/iterator/order.go, line 32 at r3 (raw file):

}

func getOrderedValues(it *Order) *values {

Maybe accept a (qs QuadStore, it Iterator) and pass (it.qs, it.subIt) when calling it? This may be useful for other iterators later.

graph/iterator/order.go, line 34 at r3 (raw file):

func getOrderedValues(it *Order) *values {
	var qs, subIt = it.qs, it.subIt
	var results = make([]result, 0)

Nit: var results []result

graph/iterator/order.go, line 42 at r3 (raw file):

		var tags = make(map[string]graph.Value)
		subIt.TagResults(tags)
		vals.results = append(vals.results, result{id, tags})

I think it would be better to also call qs.NameOf(id) here and store it in the structure similar to the result. This way you won't need to call it while sorting which will be much faster. Maybe even call .String() as well to reduce allocations.

Also, this will allow removing qs from values.

graph/iterator/order.go, line 44 at r3 (raw file):

		vals.results = append(vals.results, result{id, tags})
	}

You should also check the error from the sub-iterator here:

err := subIt.Err()

and store it in Order.err.

So the function should return an error as well.

graph/iterator/order.go, line 49 at r3 (raw file):

	subIt.Reset()

	return &vals

Assuming my suggestion above, the function can be changed to return a []result instead of *values.

graph/iterator/order.go, line 52 at r3 (raw file):

}

func NewOrder(qs graph.QuadStore, subIt graph.Iterator) *Order {

It seems like you could minimize the interface from graph.QuadStore to graph.Namer. This will make it easier to test.

graph/iterator/order.go, line 79 at r3 (raw file):

}

// SubIterators returns a slice of the sub iterators. The first iterator is the

I believe the second sentence is not relevant to this iterator.

graph/iterator/order.go, line 82 at r3 (raw file):

// primary iterator, for which the complement is generated.
func (it *Order) SubIterators() []graph.Iterator {
	return nil

The function should return a it.subIt. I noticed you removed it in one of the commits. I wonder why.

graph/iterator/order.go, line 88 at r3 (raw file):

// has not previously seen.
func (it *Order) Next(ctx context.Context) bool {
	it.runstats.Next += 1

++, same in other places

graph/iterator/order.go, line 139 at r3 (raw file):

	subStats := it.subIt.Stats()
	return graph.IteratorStats{
		NextCost:     1,

subStats.NextCost

graph/iterator/order.go, line 141 at r3 (raw file):

		NextCost:     1,
		ContainsCost: subStats.ContainsCost,
		Size:         int64(len(it.ordered.results)),

It may not be populated yet (Next not called).

graph/iterator/order.go, line 155 at r3 (raw file):

func (it *Order) String() string {
	return "Order"

"Order{"+it.subIt.String()+"}"

graph/kv/quad_iterator.go, line 223 at r3 (raw file):

}

func (it *QuadIterator) Ordered() bool { return true }

I would actually prefer to name it Sorted.

graph/shape/shape.go, line 1427 at r3 (raw file):

}

type Order struct {

Can we name it Sort? It works exactly like stdlib's sort.Sort, so makes sense to name it the same way internally.

The Path.Order and Gizmo's path.Order can be named as-is, because those need to match Gremlin.

iddan · 2019-09-11T09:42:33Z

I am going to fix all the issues once this feature gets priority in my company's projects. Probably in less than a month.

iddan · 2019-10-05T14:58:49Z

Updated the code to the new Iterator interface. Ready for rereview

dennwc

Reviewed 7 of 9 files at r4, 2 of 2 files at r5.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @iddan)

graph/iterator/sort.go, line 26 at r5 (raw file):

func (v values) Swap(i, j int) { v[i], v[j] = v[j], v[i] }

type sortNext struct {

Minor, but can you please order things the same way as in other iterators? E.g. first, all the exported methods/types, so a public API will be more visible, than the sortIt (v2 implementation), than sortNext and sortContains (specific implementations of the iterator depending on the query plan).

graph/iterator/sort.go, line 29 at r5 (raw file):

	namer   graph.Namer
	subIt   graph.Scanner
	ordered *values

You can safely remove pointer here. Slices in Go can be nil as well.

graph/iterator/sort.go, line 44 at r5 (raw file):

func (it *sortNext) TagResults(dst map[string]graph.Value) {
	if it.subIt != nil {
		it.subIt.TagResults(dst)

This won't work. You drain it.subIt in getSortedValues, so TagResults will return tags related to the last result in the iterator. You should recall tags from value.result.tags.

graph/iterator/sort.go, line 57 at r5 (raw file):

func (it *sortNext) Next(ctx context.Context) bool {
	if it.ordered == nil {

This should be after the if it.err != nil. If getSortedValues fails, v may be nil, so calling Next multiple times will try fetching values again, probably leading to the same error. Instead we can first check the saved error and return earlier without touching the datastore.

graph/iterator/sort.go, line 75 at r5 (raw file):

func (it *sortNext) NextPath(ctx context.Context) bool {
	return false

I think this is incorrect. NextPath is not implemented on Sort itself (which is correct), but it may be implemented on the subIt. So you should call it in getSortedValues as well (since we don't know if it will be called by the user or not).

graph/iterator/sort.go, line 109 at r5 (raw file):

func (it *sortIt) Lookup() graph.Index {
	return newSortContains(it.subIt.Lookup())

It's may not be obvious, but:

// we can ignore the order when doing Contains, so we return a subiterator directly
return it.subIt.Lookup()

And we can remove sortContains completely. One of the benefits of a new iterator interface.

graph/iterator/sort.go, line 123 at r5 (raw file):

	subStats, err := it.subIt.Stats(ctx)
	return graph.IteratorCosts{
		NextCost:     subStats.NextCost,

subStats.NextCost*2, // TODO(dennwc): better cost calculation

The real cost is still subStats.NextCost, but we need to somehow make the cost appear larger, since we will pull all values into memory which may be expensive for large datasets.

graph/iterator/sort.go, line 147 at r5 (raw file):

}

// NewSort creates a new Sort iterator

// NewSort creates a new Sort iterator.
// TODO(dennwc): This iterator must not be used inside And: in this cases it may be moved to a Contains branch and won't do anything. We should make And account for this.

graph/iterator/sort.go, line 161 at r5 (raw file):

func getSortedValues(namer graph.Namer, it graph.Scanner) (*values, error) {
	var v values
	var ctx = context.TODO()

Please don't fake the context. From the code, I see that you can pass a real one it in Next.

query/gizmo/gizmo_test.go, line 656 at r5 (raw file):

			"smart_person",
		},
	},

Another test case worth considering:

	{
		message: "Use order tags",
		query: `
			g.V().Tag("target").order().all()
		`,
		tag: "target",
		expect: []string{
			"<alice>",
			"<are>",
			"<bob>",
			"<charlie>",
			"<dani>",
			"<emily>",
			"<follows>",
			"<fred>",
			"<greg>",
			"<predicates>",
			"<smart_graph>",
			"<status>",
			"cool_person",
			"smart_person",
		},
	},

It checks if the iterator can correctly propagate tags.

iddan · 2019-10-06T07:43:41Z

Done everything except:

graph/iterator/sort.go, line 75 at r5 (raw file):

func (it *sortNext) NextPath(ctx context.Context) bool {
return false
I think this is incorrect. NextPath is not implemented on Sort itself (which is correct), but it may be implemented on the subIt. So you should call it in getSortedValues as well (since we don't know if it will be called by the user or not).
Which I don't understand

iddan · 2019-10-07T08:14:01Z

@dennwc I noticed a bug that after calling order() back() isn't working as expected. Does it has to do with NextPath()?

dennwc · 2019-10-07T12:12:47Z

It might be. You can test it with CAYLEY_DEBUG_OPTIMIZER=true, and if you see it inside Quads and then NodesFrom at some point, than it is the case. As a rule of thumb, right now it should always be at the top level.

iddan · 2019-10-07T13:57:55Z

Then I'll check if the bug persists after we fix NextPath()

dennwc · 2019-10-07T21:19:00Z

@iddan I rebased the branch and implemented NextPath in it. Please test if it solves the issue you mentioned.

Note that ordering may dissappear in some circumstances when using Paths. So changes to the optimizer must be made to ensure that it works in all cases.

iddan · 2019-10-08T11:40:29Z

Looks like the issue I reported was solved.

iddan mentioned this pull request Jul 8, 2019

Gizmo: path.Order() #792

Closed

iddan changed the title ~~Path Order~~ path.Order() Jul 8, 2019

dennwc reviewed Jul 9, 2019

View reviewed changes

iddan self-assigned this Oct 4, 2019

iddan added the enhancement label Oct 4, 2019

iddan marked this pull request as ready for review October 4, 2019 19:56

dennwc requested changes Oct 5, 2019

View reviewed changes

iddan and others added 11 commits October 8, 2019 15:00

iterator: implement Sort

d8db590

docs: document order in GizmoAPI

f9439d7

iterator: use underlying iterator instead of sortContains

1a7147d

iterator: bigger NextCost in Sort

d34992a

iterator: add TODO comments for NewSort

a14ced2

iterator: implement TagResults for Sort iterator

bb92911

iterator: add comments for Sort iterator

6854fd0

path: add tests for sorting

e625926

iterator: implement NextPath for Sort

a4c8e9f

break natural sorting in the test dataset

fcaed7f

path: add tests for sorting optimization

7ce2071

iddan merged commit f449116 into cayleygraph:master Oct 8, 2019

iddan deleted the order branch October 8, 2019 13:27

dennwc added this to the v0.7.6 milestone Oct 14, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

path.Order() #796

path.Order() #796

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading

dennwc left a comment

iddan commented Sep 11, 2019

iddan commented Oct 5, 2019

dennwc left a comment

iddan commented Oct 6, 2019

iddan commented Oct 7, 2019

dennwc commented Oct 7, 2019

iddan commented Oct 7, 2019

dennwc commented Oct 7, 2019

iddan commented Oct 8, 2019

path.Order() #796

path.Order() #796

Conversation

iddan commented Jul 8, 2019 • edited Loading

iddan commented Jul 8, 2019 • edited Loading

iddan commented Jul 8, 2019 • edited Loading

iddan commented Jul 8, 2019 • edited Loading

dennwc left a comment

Choose a reason for hiding this comment

iddan commented Sep 11, 2019

iddan commented Oct 5, 2019

dennwc left a comment

Choose a reason for hiding this comment

iddan commented Oct 6, 2019

iddan commented Oct 7, 2019

dennwc commented Oct 7, 2019

iddan commented Oct 7, 2019

dennwc commented Oct 7, 2019

iddan commented Oct 8, 2019

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading

iddan commented Jul 8, 2019 •

edited

Loading