Pre-RFC Live queries #386
I'm discovering more writings related to live queries: #284 -- A long discussion of how live queries relate to subscriptions. I’ve put more thought into this, and I’ve concluded that while it would be possible to build live queries on top of event-based subscriptions, perhaps as a proof-of-concept, there's really a need for some dedicated support for live queries for the concept to truly be effective.
Live query semantics
I'd suggest the following semantics:
This is loose enough to allow for many implementations to meet this standard, from the standpoints of a GraphQL client library API and client-server communication. In the worst case, the library could simply poll an entire ordinary query and unconditionally invoke a callback with the response. This would provide suboptimal latency, would likely overpush, would trigger client updates too often, and could miss some rapid changes. But, it could satisfy the above points only using standard HTTP approaches, proving that there’s nothing terribly exotic about unoptimized live queries.
However, a better client library solution would be to represent the result of a query as some sort of observable data structure. Likewise, a better client-server communication paradigm would be to use a server push technology to stream minimal updates. Such optimizations would likely rely on more advanced technology, like reactive systems and stateful server push mechanisms.
Representing updates
From what I can tell, the biggest novelty from a GraphQL perspective would be a protocol for representing incremental updates to a response. It would be highly desirable to model both the initial result and the updates using one type schema. As the spec for executing selection sets mandates that every selected field appear in the result, I believe this would be possible by loosening this restriction for updates. Absence in the response map would indicate that no change occurred. The trickiest part would be concisely representing changes in GraphQL’s only unbounded type, the list. One option would be to use something like JavaScript’s Array splice API as a concise way of describing edits to a list.
Coexistence with subscriptions
To achieve desirable user experience, sometimes it’s best to sync the UI with the server-side data model, which is the use case live queries serve. Other times, it’s best to depict to the user what happened, rather than showing a live view of the data. 
In these cases event subscriptions are more appropriate. But often, there is a blend of these needs, and it would be best for a client to have both the update and the reason. So then it would be ideal for a client to be able to subscribe to live queries and events in one operation, over a single connection. It seems like the existing subscription mechanism could be expanded to accommodate live queries. It could be as simple as adding a
To tie the two types of subscriptions together, it would be nice to mandate that a source event that results in a response event and a change to the selection of the live query should package that live query data in the same response. That way, clients could count on depicting the latest at the time of the event. |
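The splice idea for list updates mentioned above can be sketched roughly as follows. This is only an illustration, not a proposed wire format: the `ListSplice` shape and the `applySplices` helper are hypothetical names, and the only real API relied on is JavaScript's `Array.prototype.splice`.

```typescript
// Hypothetical shape for one list edit, modeled on Array.prototype.splice.
interface ListSplice<T> {
  start: number;       // index at which to begin editing
  deleteCount: number; // number of existing items to remove
  items: T[];          // items to insert at `start`
}

// Apply a batch of splices in order, returning a new array and leaving
// the previous list untouched.
function applySplices<T>(list: readonly T[], edits: readonly ListSplice<T>[]): T[] {
  const next = [...list];
  for (const { start, deleteCount, items } of edits) {
    next.splice(start, deleteCount, ...items);
  }
  return next;
}

const titles = ["a", "b", "c"];
const updated = applySplices(titles, [
  { start: 1, deleteCount: 1, items: ["x", "y"] }, // replace "b" with "x", "y"
  { start: 4, deleteCount: 0, items: ["z"] },      // append "z" at the end
]);
// updated is ["a", "x", "y", "c", "z"]
```

Each edit is interpreted against the list as it stands after the previous edit, so a server following this sketch would need to emit edits in a well-defined order.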
As an FYI, I think that when you hear about ‘live queries’ it refers to a polling based approach which Relay calls ‘live queries’. You can read more about it in this discussion and the linked to API. |
@alloy Ah, thanks! Yeah, so Relay is literally implementing the naive live query approach. Biggest problem is that it doesn't seem to leave room for partial updates. So, I actually think the scope of this is much smaller than I would have guessed. It's really more like, we should enhance current subscriptions model with the concept of partial updates and a schema directive that specifies that a given root field in the |
Apollo Client has polling built in, but we don't call it live queries. I think when people refer to live queries they are usually talking about the server pushing updates to a query, possibly combined with some kind of directive. Usually it's something that's quite hard to do with just normal queries and subscriptions, at least in a generic way. For example, you can get this today with packages built by @DxCx:
These aren't quite prepared for public presentation but I think Hagai would be interested in collaborating with more people about it! |
Yeah, my idea here was to present something that would encompass polling and efficient reactive push updates. I'll check those out and see if there's a good place to collaborate! |
See also the JSON patch spec: http://jsonpatch.com/ In practice it's not actually that hard to build out something live-query-like on top of subscriptions. We've done it – it's not so bad. The semantics get tricky if you have something like – suppose you have a connection that's searchable. You probably don't want to allow live queries when you have a search query specified, since they're not really efficiently implementable (arbitrary sort/filter in general – if a new element is inserted, can the server efficiently compute the insertion index?). The other issue implementation-wise is – how do you even follow the spec? In principle live queries should be fully recursive, right? Step down the query, provide updates on everything down the path... but what happens if a portion of the query isn't (or shouldn't be) "live"? |
Interesting. So looking at that spec, one implementation of this concept would simply be a standard event subscription with a schema that would look something like:
scalar GraphQLQuery
scalar JSON
type Subscription {
liveQuery(query: GraphQLQuery): LiveQuery
}
type LiveQuery {
op: String!
from: String
path: String
value: JSON
}
A couple thoughts:
The idea would be that it's up to developer judgment what makes sense to live query. As mentioned in my proposed semantics, a library could always trivially fall back to polling the standard resolvers to implement the bare minimum requirements. Libraries could simply choose to offer better guarantees by offering a way for some data to be pushed by application code. The way I see it, you would only put types in your I guess it would be tricky to share types and resolvers between normal queries and live queries, if you want the latter to be more restrictive. For recursion and live updated searches, my first thought is I don't see why to outlaw these things, but I'd probably leave it to people who have the need to figure out how to do it efficiently for their use case. |
Actually, maybe a better way to apply this idea would be:
scalar JSON
type Subscription {
liveQuery: LiveQuery
}
type LiveQuery {
query: Query # or an alternate root query type
update: Patch # or an alternate type with different update semantics
}
type Patch {
op: String!
from: String
path: String
value: JSON
}
This would make the live query shape itself checkable and returned as an immediate initial subscription push. All subsequent updates would come back in patch format in the |
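To make the patch-based flow concrete, here is a rough client-side sketch that applies `Patch` payloads to the initial result. It assumes the `op`/`path`/`value` fields follow JSON Patch (RFC 6902) with JSON Pointer paths, and it handles only `add`, `remove`, and `replace`. The `applyPatch` and `parsePointer` helpers are hypothetical and not taken from any library.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface Patch {
  op: "add" | "remove" | "replace";
  path: string; // JSON Pointer, e.g. "/posts/0/title"
  value?: Json;
}

// Split a JSON Pointer into unescaped reference tokens.
function parsePointer(path: string): string[] {
  if (path === "") return [];
  return path.slice(1).split("/").map((t) => t.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// Step one level down into an array or object.
function child(node: Json, token: string): Json {
  if (Array.isArray(node)) return node[Number(token)];
  if (node !== null && typeof node === "object") return node[token];
  throw new Error(`cannot descend into a scalar at "${token}"`);
}

// Apply one patch to a deep copy of `doc` and return the copy.
function applyPatch(doc: Json, patch: Patch): Json {
  const tokens = parsePointer(patch.path);
  if (tokens.length === 0) {
    if (patch.op === "remove") throw new Error("cannot remove the document root");
    return patch.value ?? null;
  }
  const result = JSON.parse(JSON.stringify(doc)) as Json;
  let parent: Json = result;
  for (const token of tokens.slice(0, -1)) parent = child(parent, token);
  const last = tokens[tokens.length - 1];
  if (Array.isArray(parent)) {
    // "-" addresses the position just past the end of the array.
    const index = last === "-" ? parent.length : Number(last);
    if (patch.op === "add") parent.splice(index, 0, patch.value ?? null);
    else if (patch.op === "remove") parent.splice(index, 1);
    else parent[index] = patch.value ?? null;
  } else if (parent !== null && typeof parent === "object") {
    if (patch.op === "remove") delete parent[last];
    else parent[last] = patch.value ?? null;
  } else {
    throw new Error(`cannot patch a scalar at "${patch.path}"`);
  }
  return result;
}

// The initial push carries the query result; later pushes carry patches.
const initial = { posts: [{ id: "1", title: "Hello" }] };
let state: Json = initial;
state = applyPatch(state, { op: "replace", path: "/posts/0/title", value: "Hi" });
state = applyPatch(state, { op: "add", path: "/posts/-", value: { id: "2", title: "New" } });
state = applyPatch(state, { op: "remove", path: "/posts/0" });
```

Copying the whole document on every patch keeps the sketch simple. A real client would more likely update an observable store in place.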
@robzhu linked https://www.youtube.com/watch?v=BSw05rJaCpA in #284 (comment), which is super helpful. Adding it here in case anyone missed it on #284. Someone should write up the video content into a blog post! It's really great. |
Following up on our discussion in #284, I think there's really two questions here with live queries:
I entirely agree that "the full query response" is an inefficient answer for (2), but it is nevertheless an answer. I think, however, that @rodmk's talk mostly touches on (1). And just to note earlier, with subscriptions, the equivalent of (2) is really obvious. Here we do in fact need to spec it out. |
For live query semantics, I would note:
This causes some drawbacks though... I've run into issues with deciding when to sync the partial result state to the UI. I'm thinking it may be a good idea to build some "response gates" that allow pieces of the result to surface to the UI as they are completed - by depth, perhaps? @taion Easy answers:
I maintain that it's unnecessary and in fact a detriment to attempt to build something like this on top of a classic GraphQL engine. The performance losses and development work-arounds necessary defeat the purpose. You may as well start from scratch with something efficient and correct. |
@taion posed the key questions. Here's what I would propose:
Lastly, where (1) and (2) connect is that it's possible that some fields will have arguments per subscriber and need to calculate diffs. I'd suggest that servers can account for this by having the previous value of a field be made available to their resolver. @paralin I've got some thoughts on the points you raised, but I'll follow up with another comment shortly, since this one's long enough... |
@paralin I think you may be thinking of a fairly different concept. I want to be clear here, what I've got in mind is a system that would allow people to architect systems with "live view" semantics on top of GraphQL's paradigm of client-specified API traversal. While it should be performant, the goal isn't to squeeze every drop of performance on day one (which is really an implementation concern, anyway), but rather to be flexible enough to admit optimization. That said, you've brought up two concerns worth considering:
You're right that a live query system could be built around the basic I'm beginning to feel confident that I can put together either an RFC for Live Query Subscriptions or a PR to the current subscriptions RFC. I think that at this point, the principles are more or less clear and some examples would be a good next step. |
@acjay I too am talking about making a live view on top of existing apis. You're misunderstanding the intent behind subscriptions. The intent is to subscribe to a set of events of a single type. Each event emitted has the same type. There are no semantics for updating a result, it is purely events. This is called "event based state" - one event type emitted for each change to the state. What this lacks in order to make it a "live query system":
The subscriptions system is not and was not intended to be a live queries system. It solves a different set of problems. Live queries are a different thing altogether. Live queries are not a question or language or spec. The existing language and spec works just fine for them. I did not have to modify the underlying Go and JavaScript graphql parsers at all. The spec has room for server declared directives, so the server can specify The difficult parts of live queries are dealing with change detection, computation sharding, the inefficiencies of keeping a connection open with all connected clients, and result encoding. Otherwise, it is possible to prototype a live query system in 10 minutes flat, because computing a diff and sending it over the wire is an easy task - it's making this scale, and allowing the query to be changed without restarting the entire operation, that makes something like this take some planning. It is for these reasons that I think the best way forward for implementing these things is to make real practical proof of concept implementations that do not stray from the existing spec, aside from custom directives. We can keep the base code in place for the parser, there's no need to modify it. Let's find ways to make this scale, prove that they work, and then consider making some sort of Request For Comment from the Facebook team to initiate a standard. |
@paralin Sorry for the long delay -- holiday break. I think you misunderstand what I'm saying. I totally get that the intention of subscriptions was an event-push model. My point is that the mechanism--even if inadvertently--is actually suitable for live queries in a fairly generic way, as well. Yes, you are right that a subscription has one field. But that field has the full power of the GraphQL type system behind it, and it doesn't have to conceptually represent a domain event. There's no reason it can't be the entry point to an entire schema of live data, such that every event effectively triggers a push of data of the requested shape. Similar patterns are often used in mutations to allow the mutation request to respond with mutation outcome data (intuitively) or arbitrary data, in any combination, as the schema designer and client see fit. A more concrete idea is forthcoming... |
This is unfortunately not true. You can of course make a generic message type and send it over the subscription but this doesn't get you anywhere. |
I just reread the subscription spec, but I'm not seeing how this is disallowed. Could you explain? |
@acjay I'm guessing you want to do something like this? Schema:
Rerunning all of the resolvers every time something changes is going to kill your server. I don't see the point. It's possible I'm completely out of sync with what you are imagining, in which case, I look forward to seeing your designs and prototypes. Otherwise, I can't see how something like this could be useful for more than just a toy implementation. |
That's not what I would literally do, but I would argue that that schema should be possible. So, yes: types used as queries should be reusable in subscriptions. Realistically, the root fields present in the subscription type would be limited to those that the server is architected to produce efficiently. The resolvers for types shared between queries and subscriptions would have to be written to take advantage of data pushed in the event payload or reactive data wherever possible, instead of pulling from services, as a normal query would. I'd expect techniques like caching and memoization to be used--scoped to the processing cycle of an event--to prevent duplicate work from being done when many clients need the same data. This is not unlike the way data loaders are used to make resolution more efficient in ordinary queries. |
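A rough sketch of the event-scoped memoization described above, assuming a single cache is created per source event and shared across all subscribers while that event is processed. `EventScopedCache`, `loadPostTitle`, and `handleEvent` are hypothetical names for illustration only.

```typescript
// Hypothetical per-event cache: one instance is created for each source event
// and shared by every subscriber's resolvers while that event is processed.
class EventScopedCache {
  private readonly values = new Map<string, unknown>();

  // Run `compute` at most once per key for the lifetime of this cache.
  memo<T>(key: string, compute: () => T): T {
    if (!this.values.has(key)) {
      this.values.set(key, compute());
    }
    return this.values.get(key) as T;
  }
}

let loads = 0;

// Stand-in for an expensive service call (an assumption for illustration).
function loadPostTitle(postId: string): string {
  loads += 1;
  return `title of ${postId}`;
}

// Fan one event out to many subscribers, sharing work through the cache.
function handleEvent(postId: string, subscriberIds: string[]): Map<string, string> {
  const cache = new EventScopedCache();
  const payloads = new Map<string, string>();
  for (const id of subscriberIds) {
    payloads.set(id, cache.memo(`post:${postId}:title`, () => loadPostTitle(postId)));
  }
  return payloads;
}

const payloads = handleEvent("42", ["alice", "bob", "carol"]);
// loadPostTitle ran once even though three subscribers received the value.
```

Because the cache is discarded when `handleEvent` returns, nothing stale carries over to the next event. That is the same scoping usually applied to data loaders within a single request.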
@acjay - We implemented a prototype of live queries on top of subscriptions for Graphcool back when we first implemented subscriptions. We found that exposing diffs or changes in the schema was overly complicated. Our prototype ended up simply exposing the field
2 is a fairly limiting constraint. If the query depends on the active session, it can be difficult or impossible to deduplicate between sessions, forcing you to evaluate the query for each subscriber. We never shipped live queries, but I am excited to see that interest is picking up around this topic. See also the notes from the latest working group meeting: https://github.com/graphql/graphql-wg/blob/master/notes/2018-02-01.md#discuss-subscriptions-for-live-queries |
@sorenbs The issues you describe were already solved in the rgraphql proof of concept:
|
@sorenbs That was me who brought up the topic at graphql-wg :) So, for 2 and 3, I think those would be "left up to the implementation". I'm actually at a point right now where I think I want to pursue something that requires no actual changes to the GraphQL spec at all. The schema would look something like:
We probably won't actually attempt this for a while yet, but thought I'd report this idea back, since I started this ticket. |
@acjay I would be interested to see this implemented in a GraphqlJS library and could help build that prototype. Am I understanding you correctly that you are creating a subscription that returns a LiveQueryInitialRespone immediately and a LiveQueryUpdate as needed after that? Also https://tools.ietf.org/html/rfc6902 format is already an existing standard so I would lean towards your 2nd December 11 suggestion for the Patch type. Also I expect having an update and a query field will make client side implementation easier (eg. |
One more note: if possible I would prefer to keep the benefits of GraphQL typing for the initial query. Even though the patch needs to be JSON (or a minimum of a union of every type of field at any level of nesting in the query) we could at some point use the type information from the query to determine server-side and client-side if the patch JSON is valid and effectively implement typed patches in the library. So based on this and my previous comment I would suggest the following schema (scroll to the bottom for example usage).
Example Schema
scalar GraphQLRequest
scalar JSON
# As an example we are live querying all posts
type Post {
id: ID!
title: String!
postedAt: String!
}
type Subscription {
allPosts(resumptionCursor: ID): PostLiveQuery!
}
type PostLiveQuery {
query: [Post!]
# Patches may be batched
patch: [RFC4627Patch!]
resumptionCursor: ID!
# TODO: Should resumptionCursor be optional for servers that are unable to
# resume a live query?
}
union RFC4627Patch =
| RFC4627Add
| RFC4627Remove
| RFC4627Replace
| RFC4627Move
| RFC4627Copy
| RFC4627Test
# Note the op fields below are redundant with __typename but are included for
# compatibility with JSON Patch (RFC 6902).
type RFC4627Add {
op: String! # Always returns "add"
path: String!
value: JSON!
}
type RFC4627Remove {
op: String! # Always returns "remove"
path: String!
}
type RFC4627Replace {
op: String! # Always returns "replace"
path: String!
value: JSON!
}
type RFC4627Move {
op: String! # Always returns "move"
from: String!
path: String!
}
type RFC4627Copy {
op: String! # Always returns "copy"
from: String!
path: String!
}
type RFC4627Test {
op: String! # Always returns "test"
path: String!
value: JSON!
}
Example Subscription
In this example:
subscription {
allPosts {
query {
id
title
}
patch {
... on RFC4627Add { op, path, value }
... on RFC4627Remove { op, path }
... on RFC4627Replace { op, path, value }
... on RFC4627Move { op, from, path }
... on RFC4627Copy { op, from, path }
... on RFC4627Test { op, path, value }
}
}
}
Edit: fixed some mistakes |
I've published an npm package based on this thread for subscription-based live queries: https://github.com/D1plo1d/graphql-live-subscriptions |
This is really awesome @D1plo1d! I think that would work for my use case, and it seems conceptually simpler than what I had described. Especially so in your library, where all the patch subtypes are flattened into one type. For the resumption question, yeah, optional. In the case I'm considering, it would be easy to implement, but I can definitely imagine that not being the case for all applications. |
@D1plo1d I like your schema, I just modified it a little bit, because this makes more sense for me. Example Schema
Explanation: You send only the query or the patches not both. And I dont think we need a resumptionCursor, because the Server knows the "initialquery" of this subscription, and calculates the patches in relation to this initial query. |
@acjay thanks! Resumptions are the biggest unknown for me atm. I haven't given them much thought tbh but please experiment away and let me know how it goes! If you need to change anything to the library to get them working I'd really appreciate the Pull Requests. |
@nkordulla great idea. I like that returning a union would allow both the Post and the RFC4627Patch to be wrapped in NonNullable - I like that type safety. Any reason not to wrap the patch part of the union in
I'm presently optimising the patch generation because re-serializing queries on every change proved to be prohibitively slow for my usage (we're running our node servers on Raspberry Pis though, to be fair). If you want to take a stab at refactoring the schema to be Union based on the current master branch that would be great (just shoot me a PR) or I could let you know when I'm finished with my optimisation work to save us a nasty merge.
re: resumptionCursor: Not that any of this is implemented yet but I think the idea here is that if your client momentarily loses its connection to the server it could send its resumptionCursor and start receiving patches again from where it left off without the need for another initial query response. As I understand it the resumptionCursor would prevent race conditions in the case where first the client disconnects, then a change is made, and then the client reconnects (and wishes to optimise the amount of data over the wire by skipping a re-send of the initial query payload). It's unnecessary AFAIK in any scenario where you re-send the initial query on re-connect like subscribeToLiveData does presently. |
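One way the resumption idea could work on the server is a bounded log of recent patches keyed by cursor. The sketch below is not how graphql-live-subscriptions behaves (resumption is described above as unimplemented). `PatchLog` is a hypothetical name, and cursors are plain numbers here rather than the `ID` values in the schema.

```typescript
interface LoggedPatch {
  cursor: number;
  patch: unknown;
}

// Hypothetical bounded log of the most recent patches for one live query.
class PatchLog {
  private readonly entries: LoggedPatch[] = [];
  private readonly capacity: number;
  private nextCursor = 1;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Record a patch and return the cursor the client should remember.
  append(patch: unknown): number {
    const cursor = this.nextCursor++;
    this.entries.push({ cursor, patch });
    if (this.entries.length > this.capacity) this.entries.shift();
    return cursor;
  }

  // Patches made after `cursor`, or null when the log no longer reaches back
  // that far and the client needs a fresh initial query instead.
  since(cursor: number): unknown[] | null {
    const oldest = this.entries.length > 0 ? this.entries[0].cursor : this.nextCursor;
    if (cursor < oldest - 1) return null;
    return this.entries.filter((e) => e.cursor > cursor).map((e) => e.patch);
  }
}

const log = new PatchLog(2);
const first = log.append({ op: "replace", path: "/posts/0/title", value: "A" });
log.append({ op: "replace", path: "/posts/0/title", value: "B" });
const latest = log.append({ op: "replace", path: "/posts/0/title", value: "C" });

const missed = log.since(first);  // the two patches made while disconnected
const tooOld = log.since(0);      // null: fall back to re-sending the query
const caughtUp = log.since(latest); // []: nothing was missed
```

The `null` result covers the race described above when the gap is too large to replay: the server falls back to the initial query rather than silently skipping changes.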
Also a note on usage with graphql-live-subscriptions v0.1.0: I've found in practice that I prefer to nest all of my live data under one "live query root". It simplifies my client-side code because I can create one live subscription for all of my live data. Eg:
Schema
type Post {
id: ID!
title: String!
postedAt: String!
}
type Jedi {
id: ID!
name: String!
}
type LiveQueryRoot {
jedis: [Jedi]
posts: [Post]
}
type LiveSubscription {
query: LiveQueryRoot
patch: [RFC4627Patch]
}
type Subscription {
live: LiveSubscription!
}
Client Side
subscription {
live {
query {
jedis { name }
posts { title }
}
patch {
# .. usual patch query here
}
}
} |
A quick update, I've released You can check it out at https://github.com/D1plo1d/graphql-live-subscriptions My apologies for the less-than-great README. If you have any questions feel free to ask me directly. |
Any progress? |
@Bessonov @D1plo1d interesting work on graphql-live-subscriptions, congrats. I'm actually still using rgraphql / magellan's binpacked protocol in a few projects, and have rewritten it around static code generation in Go for better performance, compiler type safety, and removal of the "reflect" package. The prototype of this approach is here: https://github.com/rgraphql/nion with the TypeScript client here: https://github.com/rgraphql/soyuz/tree/nion |
We announced support for Live Queries at Grafbase yesterday. Check it out: https://grafbase.com/blog/simplify-building-realtime-applications-with-graphql-live-queries Let's get this spec approved:) |
Congrats @fbjork! I like the JSON patch over Server-Sent Events format. I've been playing with WebTransport lately and I suspect that this format could also be applied to WebTransport Unidirectional Streams equally well. |
I've heard live queries alluded to on various podcast episodes and in the RFC for subscriptions, but it's unclear to me whether there's a repository of ideas on what this feature would look like.
Purpose
GraphQL supports an event-based real-time mechanism called subscriptions. These are useful for many use cases, such as when you need a UI to explicitly reflect domain events to the user.
However, these aren't the only type of real-time semantics. Sometimes, the UX is designed around depicting the real-time state of a domain, rather than discrete domain events. It is possible to serve this need with GraphQL subscriptions. Each event could effectively expose the entire root query schema, or the applicable slice of it. That way, clients could update arbitrary local state for every event. This brings up a couple of challenges:
Less naive implementation on subscriptions
I think it may be possible to achieve this on top of subscriptions with some machinery.
Change types
For issue (1), there would need to be a way to represent changes to a schema. This could be done by representing every type A in a schema with a box type:
The box would be necessary for times when A is nullable, to differentiate between a change to null and no change.
The subscription representing a live query update would effectively be one giant "data model changed" event. It would need to provide a field that serves as the root of the live-updatable data. If the data is represented by type Data, this field would have a type of Change[Data]. Change[Data] would have a field of type Change[A] for every field of type A in Data.
This generic Change type doesn't need to literally exist, but it could be synthesized.
Server-side client model
To determine what changes to send, the server would either need to store the current client-side state (in the same way that it needs to store subscriptions queries) or its business logic would need to natively be capable of deriving diffs.
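The box type and the server-side diffing can be sketched together. This is a minimal illustration under assumptions: `Data` is a made-up flat type, `Box` and `Change` are hypothetical TypeScript stand-ins for the synthesized `Change[A]` types, and the server is assumed to keep a copy of the last state it sent to each client.

```typescript
// A box around the new value. A missing box means "no change", while
// { value: null } means "changed to null".
type Box<T> = { value: T };

// Change[Data]: one optional box per field of Data.
type Change<T> = { [K in keyof T]?: Box<T[K]> };

// Hypothetical flat data model, for illustration only.
interface Data {
  title: string;
  subtitle: string | null;
  likes: number;
}

// Server side: compare the state last sent to the client with the current one.
function diff(previous: Data, current: Data): Change<Data> {
  const change: Change<Data> = {};
  if (previous.title !== current.title) change.title = { value: current.title };
  if (previous.subtitle !== current.subtitle) change.subtitle = { value: current.subtitle };
  if (previous.likes !== current.likes) change.likes = { value: current.likes };
  return change;
}

// Client side: fold a change into local state, keeping unboxed fields as-is.
function applyChange(state: Data, change: Change<Data>): Data {
  return {
    title: change.title ? change.title.value : state.title,
    subtitle: change.subtitle ? change.subtitle.value : state.subtitle,
    likes: change.likes ? change.likes.value : state.likes,
  };
}

const clientState: Data = { title: "Hello", subtitle: "draft", likes: 1 };
const serverState: Data = { title: "Hello", subtitle: null, likes: 2 };

const change = diff(clientState, serverState);
// change carries boxes for subtitle (now null) and likes, but not title.
const synced = applyChange(clientState, change);
```

Nested object fields would need `Change` applied recursively, and lists are left whole here, which is the over-pushing concern raised under "Array changes".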
Array changes
This leaves out the question of efficient updates to arrays, which would naively be wrapped whole in a box. But that could still result in over-pushing and it might not play well with abstractions like Relay's connections.
Bootstrap events and consistency
For issue (2), there could be a synthetic event that the server fires immediately upon subscription success, which would have a field containing the unaugmented payload of Data, which would allow the clients to bootstrap their local state. If the transport guarantees that either all messages are received or the connection fails (like Web Sockets), consistency would be guaranteed. Other transports would need some other system for ensuring consistency. Either way, applications would likely want to be able to depict interruptions of the connection.
So, how to do it?
The above approach is appealing in that it can conceivably be built on primitives that exist today, albeit with some schema augmentation and some GraphQL library machinery on the client and server sides. On the pro side, this would allow for event-based subscription payloads and live query updates to coexist in one protocol. On the con side, not having a first-class representation could make the pieces needed to achieve live queries feel disjointed.
I wanted to throw this out there as a rough proposal to get some comments and suggestions. If there's any interest, I can try to make it a proper RFC. My company is looking at adopting GraphQL for one of our real-time products, and I'm trying to look ahead to whether we'd be able to replace some bespoke Web Sockets stuff with something that's more of a standard. Maybe there's an opportunity for us to help push such a standard forward.