-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for filtering on value predicate facets #4034
Comments
That is something we already discussed internally. This has some logical barriers. We could not tell if the filter would have to be applied to the node itself or the intrinsic value to the predicate. I believe that if the filter should be applied to the node itself, we should allow using facet filters in Root rather than in the query body. @robsws are you mutating via RDF or JSON? What is your use case? Filter the node with the car facet or show the car value facet only if the facet value filter is true? links just for internal context
|
I think we should definitely implement this. Since a value predicate can hold facets, we should be able to filter by them as in other predicates. |
@campoy it is linked to the order of concepts in the syntax. Facets were created for the relationship between edges, the ability to create facets in values and allow functions at this level to affect the query itself, it slightly changes its concept a priori. I don't know what the whole design implications of this are. If any of the approaches below would affect other use cases. Although Facets are not first-class citizen I think at the root would make more sense depending on the context. But perhaps in the body, it is the most simple and straightforward. If we will support this, we must determine whether facets value filtering in this context will be executed
or in the query body.
If we choose in the query body, what happens if we have 3 Alices in the DB and only one has a car? (assuming the "since" value of one is 2006) The intention is: 1 - Display the three Alices and not show the value "since"(or even "car")? because Alice has the car since 2006 and not 2019 - as it is in the query - And you are determining that facet filter directly on the predicate/value. So, do you want to apply this filtering to the value or the query as a whole? 2 - Do not display any Alice, because all other Alices do not have the "since" facet or a value greater than 2019? If you wanna apply this to the query as a whole. Now, assuming the query is with the value 1 - Would we display only one Alice? because all the others do not have the "since" facet. In short term isDo we apply this to the value or the query? If we apply to value, I cannot understand the immediate usefulness of it. It would be like "skip" this value for now, but show me the result anyway. |
For my use case, I would like to be able to filter multiples of a particular value edge down to one or more based on the value of the facet. For example, maybe if Alice had several cars like this:
I want to be able to fetch the car (or cars) she had where since > "2019-01-01" with my query above and ignore the other entries. I am concerned that if the extra clause appears at the root, then we won't be able to do facet filtering on deeper levels of the query like we can with uid predicates. Consider an example where people can change the registration plates for their car, then we might want to do something like this:
With regards to how to deal with cases where the facet isn't specified, this is inconsequential to my use case since the facet will exist on all edges of that type, but I understand that it's something that needs to be decided upon. |
I'd say we should unify how we filter by facets and do the same for predicates pointing to other nodes as well as predicates pointing to values. If car was pointing to a node with field {
data(func: eq(name, "Alice")) {
name
mobile
car @filter(gt(since, "2019-01-01"))
}
} The same way, if you have a predicate that points to a string value and you want to filter by a face attached to it, you should be able to write: {
data(func: eq(name, "Alice")) {
name
mobile
car @facets(gt(since, "2019-01-01"))
}
} Let's see with a sample dataset created by this mutation: {
set {
_:a <name> "Alice" .
_:a <dgraph.type> "Person" .
_:b <name> "Bob" .
_:b <dgraph.type> "Person" .
_:c <name> "Carla" .
_:c <dgraph.type> "Person" .
_:b <car> _:bc .
_:bc <name> "Bugatti" .
_:bc <since> "2015-01-01"^^<xs:dateTime> .
_:c <car> _:cc .
_:cc <name> "Corolla" .
_:cc <since> "2019-02-01"^^<xs:dateTime> .
}
} And schema: <car>: [uid] .
<name>: string .
<since>: datetime @index(day) . Then we can query who owns a car since after 2019-01-01 with: {
p(func: type(Person)) {
name
car @filter(gt(since, "2019-01-01")) {
name
}
}
} And we obtain: {
"data": {
"p": [
{
"name": "Alice"
},
{
"name": "Bob"
},
{
"name": "Carla",
"car": [
{
"name": "Corolla"
}
]
}
]
},
"extensions": {...}
} In the same way, if we held the {
set {
_:a <name> "Alice" .
_:a <dgraph.type> "Person" .
_:b <name> "Bob" .
_:b <dgraph.type> "Person" .
_:c <name> "Carla" .
_:c <dgraph.type> "Person" .
_:b <car> _:bc (since="2015-01-01") .
_:bc <name> "Bugatti" .
_:c <car> _:cc (since="2019-02-01") .
_:cc <name> "Corolla" .
}
} We can query in a very similar way by using filters on facets: {
p(func: type(Person)) {
name
car @facets(gt(since, "2019-01-01")) {
name
}
}
} Finally, since the car node only holds a name now, we can make it so {
set {
_:a <name> "Alice" .
_:a <dgraph.type> "Person" .
_:b <name> "Bob" .
_:b <dgraph.type> "Person" .
_:c <name> "Carla" .
_:c <dgraph.type> "Person" .
_:b <car> "Bugatti" (since="2015-01-01") .
_:c <car> "Corolla" (since="2019-02-01") .
}
} I would now expect this query, which is still very similar to the previous ones, to work: {
p(func: type(Person)) {
name
car @facets(gt(since, "2019-01-01"))
}
} Currently, it fails with |
@robsws there is a small issue with what you want. _:alice <car> "MA0123" (since=2006-02-02T13:01:09, first=true) .
_:alice <car> "XY0123" (since=2008-02-02T13:01:09, first=false) .
_:alice <car> "GU0123" (since=2019-02-02T13:01:09, first=false) .
So instead of that, is better to do: {
set {
_:alice <name> "Alice" .
_:alice <mobile> "040123456" (since=2006-01-02T15:04:05) .
_:alice <cars> _:MA0123 (since=2006-02-02T13:01:09, first=true) .
_:alice <cars> _:XY0123 (since=2008-02-02T13:01:09, first=false) .
_:alice <cars> _:GU0123 (since=2019-02-02T13:01:09, first=false) .
_:MA0123 <model> "MA0123" .
_:XY0123 <model> "XY0123" .
_:GU0123 <model> "GU0123" .
}
} Query
Result{
"data": {
"data": [
{
"uid": "0x50954",
"name": "Alice",
"mobile": "040123456",
"cars": [
{
"cars|first": false,
"cars|since": "2019-02-02T13:01:09Z"
}
]
}
]
}
} |
If I understand correctly, @robsws is not saying that a given facet is applied to each value in a list predicate, but rather in his application all predicates will be created with their own facets. This looks like a pretty straight forward addition at this point, would you agree with this @MichelDiz |
Not sure if I get you right. In his last example, the sample will be inevitably a list. The only way to avoid creating a list is by giving a predicate for each car. e.g:
That's the only way to have a value facet for now. But, I don't think this is ideal. This can eventually lead to the creation of numerous predicates, which goes beyond control. Despite this point, functions in facet values are in my view welcome. But it is necessary to understand the implications.
Another detail with this sample: _:alice <car> _:car .
_:car1 <registration> "AA1111" (since=2017-03-02T13:01:09Z) .
_:car1 <model> "MA0123" .
_:car2 <registration> "AA222" (since=2018-01-01T13:01:09Z) .
_:car2 <model> "XY0123" .
_:car3 <registration> "BB1111" (since=2019-01-01T13:01:09Z) .
_:car3 <model> "GU0123" . This syntax is okay for this specific case of a nested block. {
data(func: eq(name, "Alice")) {
name
mobile
car {
registration @facets(gt(since, "2019-01-01"))
model
}
}
} But the result is:
OR ?
If the response is the first result, I think it would be better to have this value facet in root/edge, like bellow. As I mentioned from my first comment. It would look like this if in the root query/edge.
{
data(func: eq(name, "Alice")) {
name
mobile
car @facets @facets(in "registration" gt(since, "2019-01-01")){
registration
model
}
}
}
|
Hi, thanks for continuing to look into this. I have a few comments based on the discussion above:
I think this is correct yes - each individual predicate in the list has it's own facets, rather than the list itself having facets. As if each predicate were individual, different predicates rather than a single list predicate. Hopefully this means it should be possible to implement? It's important for us to be able to filter several value edges out of the same type by their facet, as essentially our use case will involve switching between different "versions" of a predicate based on the timestamp in the facet. Modifying the example a little to include start and end times:
By using a timestamp variable in our query, we can determine what car Alice had on a particular date, say 2018-12-01, by doing something like this:
If it is impossible to do value facet filtering like I've demonstrated in the example above, then the functionality I don't think is much good for my use case unfortunately.
Thanks for clarifying this - I thought that you were suggesting that the syntax appear in the ultimate root of the query rather than just the subquery in which it should apply. I'm intrigued by that "lookahead" syntax actually (i.e. filter a uid predicate based on the facets of a value predicate on the other side of the uid predicate). If maybe Alice owned multiple cars at the same time, and I wanted to see what cars she owned at a particular time that have registrations beginning with "AA", would it be possible to do something like this?:
|
@robsws About the last question, yeah, it definitely would be possible. About having facets on individual values of a list. I'm not sure, at some time ago a user asked about this functionality, and the conclusion among the core team was that it would not be possible. I don't know about today, but this is another feature. It would not be part of this issue here. |
Hey @MichelDiz, I totally missed your comment above, sorry about that. The result to the query you propose should be:
If you wanted to also remove the nodes above you would use something else. At this point I'd say we should simply implement this fix as is and in the same way we filter for facets on predicates of type If I'm not mistaken @ashish-goswami is already working on this. |
Okay, that behave makes sense. But thinking a little more. I think one thing still missing. Are we gonna skip only the value facet or the predicate/value and value facet? The example I gave it does not return the registration predicate. So, it is skipping both. the other alternative:
|
I want to be able to filter value predicates on their facets as well as uid predicates. For example, I want to be able to execute something like the following query on the toy graph specified in the facets documentation:
Running that query right now provides the following error response:
The text was updated successfully, but these errors were encountered: