fix(DQL): optimize query for has function with offset. #7727

minhaj-shakeel · 2021-04-15T08:52:25Z

Fixes DGRAPH-574.

This PR slightly reduces the latency of the `has` function when it is used with `first` and `offset`. For example, the following query:

{
    me(func: has(actor.film), first: 10, offset: 300000){
       name@en
    }
}

when run on the 21million dataset, takes around 80 milliseconds to execute, compared to around 93 milliseconds before this change (roughly a 14% reduction).

Machine specs:

  Model Name:             MacBook Pro
  Model Identifier:       MacBookPro16,1
  Processor Name:         6-Core Intel Core i7
  Processor Speed:        2.6 GHz
  Number of Processors:   1
  Total Number of Cores:  6
  L2 Cache (per Core):    256 KB
  L3 Cache:               12 MB
  Memory:                 16 GB

Note: This does not optimize the `has` function on predicates with the `@lang` tag.

pawanrawal

The change looks good. I have a couple of comments.

Are there any tests that already cover this? If not, could we add some?
Also, add some benchmark numbers based on your testing to the description, and add a note saying that this doesn't improve latency for predicates with a lang tag.

Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @minhaj-shakeel, and @vvbalaji-dgraph)

protos/pb.proto, line 77 at r1 (raw file):

	int32 first = 15; // used to limit the number of result. Typically, the count is value of first
	// field. Now, It's been used only for has query.
	int32 offset = 16;

Add a comment that first and offset help fetch fewer results for the has query when there is no filter and no order.
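To make the point about fetching fewer results concrete, here is a minimal sketch of pagination applied during iteration rather than after it. The helper `hasPage` and its `visited` counter are hypothetical and only illustrate the idea; this is not Dgraph's actual implementation, which iterates over keys on disk.

```go
package main

import "fmt"

// hasPage is a hypothetical helper, not Dgraph code. It walks uids in order,
// skips the first `offset` matches, and stops as soon as `first` results
// have been collected, so later entries are never visited.
// A value of first <= 0 is treated here as "no limit" (an assumption).
func hasPage(uids []uint64, first, offset int) (page []uint64, visited int) {
	for _, uid := range uids {
		visited++
		if offset > 0 {
			// Still inside the skipped prefix.
			offset--
			continue
		}
		page = append(page, uid)
		if first > 0 && len(page) == first {
			// Enough results collected; stop iterating early.
			break
		}
	}
	return page, visited
}

func main() {
	uids := make([]uint64, 1000)
	for i := range uids {
		uids[i] = uint64(i + 1)
	}
	page, visited := hasPage(uids, 10, 300)
	fmt.Println(page, visited)
}
```

With 1000 candidates, `first: 10` and `offset: 300`, the loop in this sketch stops after visiting 310 entries instead of collecting all 1000 and slicing afterwards.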

query/query.go, line 2291 at r1 (raw file):

	if len(sg.Params.Order) == 0 && len(sg.Params.FacetsOrder) == 0 {
		// There is no ordering. Just apply pagination and return.
		if !(len(sg.Filters) == 0 && sg.SrcFunc != nil && sg.SrcFunc.Name == "has") {

Add a comment here that, for the has function with no filtering and no sorting, we already fetch the correctly paginated results from disk, so there is no need to apply pagination again.
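The condition under discussion can be read as a single predicate. The sketch below restates it with hypothetical stand-in types (`subGraph`, `function`) that carry only the fields named in the quoted snippet; the real structures in query/query.go are larger, and `alreadyPaginated` is an illustrative name rather than a function from the PR.

```go
package main

import "fmt"

// function and subGraph are hypothetical stand-ins that mirror only the
// fields referenced in the quoted condition.
type function struct{ Name string }

type subGraph struct {
	Order       []string
	FacetsOrder []string
	Filters     []string
	SrcFunc     *function
}

// alreadyPaginated reports whether the results were paginated while being
// fetched, in which case pagination must not be applied a second time.
// This holds only for a has() root function with no ordering and no filters.
func alreadyPaginated(sg *subGraph) bool {
	noOrder := len(sg.Order) == 0 && len(sg.FacetsOrder) == 0
	noFilter := len(sg.Filters) == 0
	isHas := sg.SrcFunc != nil && sg.SrcFunc.Name == "has"
	return noOrder && noFilter && isHas
}

func main() {
	plain := &subGraph{SrcFunc: &function{Name: "has"}}
	filtered := &subGraph{SrcFunc: &function{Name: "has"}, Filters: []string{"eq"}}
	fmt.Println(alreadyPaginated(plain), alreadyPaginated(filtered))
}
```

Applying `first` and `offset` twice would skip another `offset` entries from an already trimmed list, which is why the guard matters.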

pawanrawal

Reviewed 3 of 3 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @minhaj-shakeel, and @vvbalaji-dgraph)

minhaj-shakeel

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @pawanrawal, and @vvbalaji-dgraph)

protos/pb.proto, line 77 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Add a comment that first and offset help fetch fewer results for the has query when there is no filter and no order.

Done.

query/query.go, line 2291 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Add a comment here that, for the has function with no filtering and no sorting, we already fetch the correctly paginated results from disk, so there is no need to apply pagination again.

Done.

(cherry picked from commit 9ba15b7)
Co-authored-by: minhaj-shakeel <minhaj@dgraph.io>

optimize query for has function

bb84ce7

minhaj-shakeel requested review from manishrjain, pawanrawal and vvbalaji-dgraph as code owners April 15, 2021 08:52

minhaj-shakeel added 2 commits April 16, 2021 09:59

add comments

ef28bbd

fix comment

0c918e0

pawanrawal suggested changes Apr 16, 2021

minhaj-shakeel added 2 commits April 16, 2021 11:38

add more comments

35fc593

add test case for has function with first and offset

5fc4044

pawanrawal approved these changes Apr 20, 2021

minhaj-shakeel commented Apr 20, 2021

minhaj-shakeel merged commit 214c5df into master Apr 20, 2021

minhaj-shakeel deleted the minhaj/offset-pagination branch April 20, 2021 03:56
