Allow filtering on non-indexed predicates #4305

mboxall · 2019-11-21T21:09:48Z

Currently functions can only be applied to predicates that have been indexed.

I'm requesting the ability filter on non-indexed predicates too.

In some cases it could be more efficient to iterate over an intermediate result set and apply the filter function than it would be to use an index.

For example, this query returns a count of 252 countries with 3.6 ms of processing:

{
    countries(func: type(Country)) {
        count(uid)
    }
}

Adding a filter on sys_start_time (indexed as hour) returns the same result but requires 358 ms of processing:

{
    countries(func: type(Country)) @filter(lt(sys_start_time, "2019-11-21")) {
        count(uid)
    }
}

Our database has 145 million nodes, and the distribution of sys_start_time values is uneven, with potentially millions of nodes sharing the same value.

For this example it would be advantageous not to index sys_start_time and to simply scan through a short list of 252 countries, eliminating any results that fail the filter function.

The text was updated successfully, but these errors were encountered:

MichelDiz · 2019-11-21T21:49:38Z

Actually you can do it https://docs.dgraph.io/tips/#search-on-non-indexed-predicates

e.g:

{
  var(func: type(Country)) {
    p as sys_start_time
  }
  query(func: lt(val(p), "2019-11-21")) {
      count(uid)
  }
}

robsws · 2019-11-22T12:01:08Z

Thanks @MichelDiz for highlighting that. Would this approach still work if we were to have several expanded edges in our query as well? Something along the lines of this:

{
    countries(func: type(Country)) @filter(lt(sys_start_time, "2019-11-21")) {
        hasCity @filter(lt(sys_start_time, "2019-11-21")) {
            ~livesIn @filter(lt(sys_start_time, "2019-11-21")) {
                Name
            }
        }
    }
}

I'm concerned even if that is possible that it might be a bit unwieldy.

manishrjain · 2019-12-02T18:56:39Z

I think it should still work. @MichelDiz can confirm.

MichelDiz · 2019-12-03T01:39:53Z

yeah, still works. I did this small test of concept

{
   A as var (func: type(Country)) {
# OR #  var(func: has(hasCity)) @filter(NOT has(~hasCity)) {
      p as sys_start_time
      hasCity {
          p2 as sys_start_time
          ~livesIn {
              p3 as sys_start_time
          }
      }
 }
    countries(func: uid(A)) @filter(lt(val(p), "2019-11-21")) {
      uid
      sys_start_time
      hasCity @filter(lt(val(p2), "2019-11-21")) {
          sys_start_time
          uid
          Name
          ~livesIn @filter(lt(val(p3), "2019-11-21")) {
                    uid
                    sys_start_time
                    Name
             }
        }
    }
}

{
  "data": {
    "countries": [
      {
        "uid": "0x111b5",
        "sys_start_time": "2019-10-21",
        "hasCity": [
          {
            "sys_start_time": "2019-10-21",
            "uid": "0x111b3",
            "Name": "Some City",
            "~livesIn": [
              {
                "uid": "0x111b4",
                "sys_start_time": "2019-10-21",
                "Name": "Red Light"
              }
            ]
          }
        ]
      }
    ]
  }
}

Mutation

{
	"set": [{
			"dgraph.type": "Country",
			"Name": "USA",
			"sys_start_time": "2019-10-21",
			"hasCity": [{
				"uid": "_:SomeCity",
				"Name": "Some City",
				"sys_start_time": "2019-10-21"
			}]
		},
		{
			"dgraph.type": "District",
			"Name": "Red Light",
			"sys_start_time": "2019-10-21",
			"livesIn": {
				"uid": "_:SomeCity"
			}
		}
	]
}

manishrjain · 2020-01-02T18:47:44Z

If @MichelDiz 's solution works, can we close this issue?

mboxall · 2020-01-03T14:43:57Z

@MichelDiz 's work-around does work, but we would prefer a more general solution. Our application re-writes the user query to automatically add point-in-time @filter clauses to all nodes involved in the query. Automatically applying the proposed work-around to any possible user query presents a challenge and would add more complexity to our application than ideal.

manishrjain · 2020-01-03T18:00:19Z

While these functions (lt, gt, etc.) won't work at root without an index (because that entails scanning through the entire predicate), we can make them work for filters. No promises, it would entirely depend upon how much complexity we add to the database. But, we'll definitely give it a shot.

sleto-it · 2020-01-14T11:08:26Z

Hi all,
This ticket was fixed by PR #4531, which has now been merged

The fix will be delivered in v1.2

Please comment if more help is needed, and we will reopen

Many thanks,

manishrjain self-assigned this Nov 22, 2019

manishrjain added the kind/question Something requiring a response. label Dec 2, 2019

manishrjain assigned MichelDiz and unassigned manishrjain Dec 2, 2019

mangalaman93 mentioned this issue Jan 9, 2020

Support for equality and inequality in filter when a predicate doens't have index #4077

Closed

sleto-it added kind/enhancement Something could be better. status/accepted We accept to investigate/work on it. and removed kind/question Something requiring a response. labels Jan 9, 2020

sleto-it assigned animesh2049 and unassigned MichelDiz Jan 9, 2020

animesh2049 mentioned this issue Jan 13, 2020

Support filtering on nonindexed predicate #4531

Merged

animesh2049 closed this as completed in #4531 Jan 14, 2020

sleto-it added this to the Dgraph v1.2 milestone Jan 14, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow filtering on non-indexed predicates #4305

Allow filtering on non-indexed predicates #4305

mboxall commented Nov 21, 2019

MichelDiz commented Nov 21, 2019

robsws commented Nov 22, 2019 •

edited

Loading

manishrjain commented Dec 2, 2019

MichelDiz commented Dec 3, 2019 •

edited

Loading

manishrjain commented Jan 2, 2020

mboxall commented Jan 3, 2020

manishrjain commented Jan 3, 2020

sleto-it commented Jan 14, 2020

Allow filtering on non-indexed predicates #4305

Allow filtering on non-indexed predicates #4305

Comments

mboxall commented Nov 21, 2019

MichelDiz commented Nov 21, 2019

robsws commented Nov 22, 2019 • edited Loading

manishrjain commented Dec 2, 2019

MichelDiz commented Dec 3, 2019 • edited Loading

Mutation

manishrjain commented Jan 2, 2020

mboxall commented Jan 3, 2020

manishrjain commented Jan 3, 2020

sleto-it commented Jan 14, 2020

robsws commented Nov 22, 2019 •

edited

Loading

MichelDiz commented Dec 3, 2019 •

edited

Loading