Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(DQL): optimize query for has function with offset. #7727

Merged
merged 5 commits into from
Apr 20, 2021

Conversation

minhaj-shakeel
Copy link
Contributor

@minhaj-shakeel minhaj-shakeel commented Apr 15, 2021

Fixes DGRAPH-574.

This PR reduces the latency of the has function slightly. For example the given query:-

{
    me(func: has(actor.film), first: 10, offset: 300000){
       name@en
    }
}

when run on the 21million dataset takes around 80 milliseconds to execute as compared to earlier around 93 milliseconds.

Machine Specs:-

Model Name:	MacBook Pro
  Model Identifier:	MacBookPro16,1
  Processor Name:	6-Core Intel Core i7
  Processor Speed:	2.6 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Memory:	16 GB

Note:-
This does not optimize for has function on predicates with @lang tag.


This change is Reviewable

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. I have a couple of comments.

  1. Are there any tests that already cover this or else could we add some?
  2. Also add some benchmark numbers based on your testing in the description. Add a comment saying that this doesn't improve latency for predicates with lang tag.

Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @minhaj-shakeel, and @vvbalaji-dgraph)


protos/pb.proto, line 77 at r1 (raw file):

	int32 first = 15; // used to limit the number of result. Typically, the count is value of first
	// field. Now, It's been used only for has query.
	int32 offset = 16;

add a comment that first and offset help fetch lesser results for the has query when there is no filter and order.


query/query.go, line 2291 at r1 (raw file):

	if len(sg.Params.Order) == 0 && len(sg.Params.FacetsOrder) == 0 {
		// There is no ordering. Just apply pagination and return.
		if !(len(sg.Filters) == 0 && sg.SrcFunc != nil && sg.SrcFunc.Name == "has") {

Add a comment here that for has function when there is no filtering and sorting we fetch correct paginated results from disk, so no need to apply pagination again.

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @minhaj-shakeel, and @vvbalaji-dgraph)

Copy link
Contributor Author

@minhaj-shakeel minhaj-shakeel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @pawanrawal, and @vvbalaji-dgraph)


protos/pb.proto, line 77 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

add a comment that first and offset help fetch lesser results for the has query when there is no filter and order.

Done.


query/query.go, line 2291 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Add a comment here that for has function when there is no filtering and sorting we fetch correct paginated results from disk, so no need to apply pagination again.

Done.

@minhaj-shakeel minhaj-shakeel merged commit 214c5df into master Apr 20, 2021
@minhaj-shakeel minhaj-shakeel deleted the minhaj/offset-pagination branch April 20, 2021 03:56
all-seeing-code pushed a commit that referenced this pull request Nov 14, 2022
Fixes DGRAPH-574.

This PR reduces the latency of the `has` function slightly. For example the given query:-
```
{
    me(func: has(actor.film), first: 10, offset: 300000){
       name@en
    }
}
```
when run on the `21million` dataset takes around `80` milliseconds to execute as compared to earlier around `93` milliseconds.

Machine Specs:-
```
Model Name:	MacBook Pro
  Model Identifier:	MacBookPro16,1
  Processor Name:	6-Core Intel Core i7
  Processor Speed:	2.6 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Memory:	16 GB
```

Note:-
This does not optimize for the `has` function on predicates with the `@lang` tag.
all-seeing-code pushed a commit that referenced this pull request Dec 1, 2022
Fixes DGRAPH-574.

This PR reduces the latency of the `has` function slightly. For example the given query:-
```
{
    me(func: has(actor.film), first: 10, offset: 300000){
       name@en
    }
}
```
when run on the `21million` dataset takes around `80` milliseconds to execute as compared to earlier around `93` milliseconds.

Machine Specs:-
```
Model Name:	MacBook Pro
  Model Identifier:	MacBookPro16,1
  Processor Name:	6-Core Intel Core i7
  Processor Speed:	2.6 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Memory:	16 GB
```

Note:-
This does not optimize for the `has` function on predicates with the `@lang` tag.
all-seeing-code added a commit that referenced this pull request Dec 2, 2022
Fixes DGRAPH-574.

This PR reduces the latency of the `has` function slightly. For example
the given query:-
```
{
    me(func: has(actor.film), first: 10, offset: 300000){
       name@en
    }
}
```
when run on the `21million` dataset takes around `80` milliseconds to
execute as compared to earlier around `93` milliseconds.

Machine Specs:-
```
Model Name:	MacBook Pro
  Model Identifier:	MacBookPro16,1
  Processor Name:	6-Core Intel Core i7
  Processor Speed:	2.6 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Memory:	16 GB
```

Note:-
This does not optimize for the `has` function on predicates with the
`@lang` tag.



(cherry picked from commit 9ba15b7)

Co-authored-by: minhaj-shakeel <minhaj@dgraph.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants