[RFC] Initial pass at adding subscription to executor #189

Merged on Oct 20, 2015 (4 commits)

Contributor

@skevy skevy commented Oct 1, 2015

So, in talking to @taion and @KyleAMathews about subscriptions in the GraphQL Slack chatroom, I felt like I should pass my work on this along.

TL;DR: there's a lot of work to be done on supporting subscriptions in the Relay/GraphQL ecosystem, and lots to figure out. This RFC doesn't do much, except allow "subscription" operations to be executed through the existing executor. Also, I haven't fully tested this. Tests pass (or they did when I wrote it), but if we go forward with something like this, I will of course update the code as needed.

Longer version:

In thinking about subscriptions in GraphQL land, and looking at what GraphQL (specifically graphql-js) does now...it seems like graphql-js really doesn't have to prescribe much about subscriptions. It seems like all that really needs to happen is that the server UNDERSTANDS subscription operations, and is able to execute them just like normal queries. The implementor of the GraphQL resolver would take the extra step to actually cause the side effect of the subscription query being stored, so that it can be executed later. It wouldn't make sense to have this functionality be part of GraphQL, because GraphQL is server agnostic.

My perceived flow of how subscriptions would work in GraphQL/Relay land would be this:

  1. Client submits subscription operation to the GraphQL backend.
  2. GraphQL executes the query in parallel, just like a normal query and returns the payload back to the client. This piece could be defined by a subscriptionWithClientId method, such as the mutationWithClientMutationId in graphql-relay-js (this would be out of scope of graphql-js).
  3. When the resolve occurs, the GraphQL server would do SOMETHING to store this query somewhere (could be in-memory or Redis), and would return back a subscriptionId that the client could use to listen for updates. You could either store the whole query (as is discussed here: [Feature request] Returning fieldASTs to be able to cache them for performance imporvements #158 (comment)) or store just the ASTs. Again this would be up to the implementor to build in user-land.
  4. When a change happens in the backend, it would figure out (in some user-land described way) which subscriptions needed to be notified. The process that's listening for these changes would take the subscriber query and execute it, either against the original resolver backend from the original query, or against a fat payload that the server received from the subscription backend (such as Redis pub/sub), or maybe even a combination of the two.
  5. The client receives this payload and does with it what it needs (in the case of Relay, I would think it would do something similar to what happens when a mutation occurs).
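
The five steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `subscriptions`, `subscribe`, `notifyChange`, and the toy `execute` function are hypothetical names invented here, and `execute` merely stands in for the graphql-js executor rather than calling it.

```javascript
// Hypothetical user-land registry: subscriptionId -> { query, send }.
const subscriptions = new Map();
let nextId = 1;

// Stand-in for the real executor (assumption): reads one field from a data store.
function execute(query, data) {
  return { data: { [query.field]: data[query.field] } };
}

// Steps 1-3: execute once, store the query, hand back a subscriptionId.
function subscribe(query, data, send) {
  const subscriptionId = String(nextId++);
  subscriptions.set(subscriptionId, { query, send });
  return { subscriptionId, initial: execute(query, data) };
}

// Step 4: when the backend changes, re-execute every affected stored query.
function notifyChange(changedField, data) {
  for (const [subscriptionId, { query, send }] of subscriptions) {
    if (query.field === changedField) {
      // Step 5: the client receives the fresh payload.
      send(subscriptionId, execute(query, data));
    }
  }
}

// Usage: one subscriber to a like counter.
const store = { likeCount: 1 };
const received = [];
const { subscriptionId, initial } = subscribe(
  { field: 'likeCount' },
  store,
  (id, payload) => received.push({ id, payload })
);
store.likeCount = 2;
notifyChange('likeCount', store);
```

How the registry is stored (in memory, Redis, or elsewhere) and how affected subscriptions are found are exactly the parts left to user-land in the flow above.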

In this whole process, I feel like the only thing graphql-js needs to care about is actually executing the subscription queries, which is what this PR addresses.

I'm curious as to everyone's thoughts on all of this. It seems like, from talking to @taion and @KyleAMathews, and from things that I've heard from @dschafer, that I'm on the right track. But regardless of whether I am going down the right implementation path or not, I wanted to open this PR so that there could be a discussion that didn't occur over Slack, since it loses history constantly. :)

@KyleAMathews

Here's an issue in Relay for adding an API for user-land mutations facebook/relay#411

@dschafer
Contributor

dschafer commented Oct 1, 2015

CC @laneyk

Yep, this all matches my thoughts almost exactly; graphql-js probably doesn't have much to say about subscriptions, it just knows how to execute them, and a lot of the complexity lives in the userland.

Client submits subscription operation to the GraphQL backend.

When the resolve occurs, the GraphQL server would do SOMETHING to store this query somewhere (could be in-memory or Redis), and would return back a subscriptionId that the client could use to listen for updates.

Yep, we do the submission over MQTT; we have the client submit a clientSubscriptionId (akin to clientMutationId), so that could probably be used as the subscription Id you mention (I would imagine you could use some client identifier combined with the ID they submit as the unique key so that clients can't cause weirdness by smashing other people's IDs).
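
A minimal sketch of that composite key, assuming the server already has some authenticated client identifier. `subscriptionKey` and the `active` map are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: namespace the client-chosen id by the client's own identifier.
function subscriptionKey(clientId, clientSubscriptionId) {
  // Serializing the pair as a JSON array keeps the two parts unambiguous,
  // even if either one contains a separator character.
  return JSON.stringify([clientId, clientSubscriptionId]);
}

const active = new Map();
// Two clients pick the same clientSubscriptionId without colliding.
active.set(subscriptionKey('client-a', 'sub-1'), { owner: 'client-a' });
active.set(subscriptionKey('client-b', 'sub-1'), { owner: 'client-b' });
```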

The client receives this payload and does with it what it needs (in the case of Relay, I would think it would do something similar to what happens when a mutation occurs).

Yep, it's quite similar.

Overall, this looks like the right approach to me. CC @leebyron, @schrockn for their thoughts.

@OlegIlyenko
Contributor

The proposed solution and a discussion in the Slack channel inspired me to write up my thoughts on a GraphQL subscription mechanism. It turned out to be a bit bigger than I expected, so I put it in a gist:

https://gist.github.com/OlegIlyenko/a5a9ab1b000ba0b5b1ad

I think it's a bit different from the solution described in this issue, but I hope it would be helpful for a final decision.

@dschafer
Contributor

dschafer commented Oct 5, 2015

@OlegIlyenko That write-up is fantastic! I think "subscription field results as well as their meaning are completely user-defined" is key there; it means we can probably introduce the core functionality needed to build subscriptions as a pretty minimal set of changes to graphql-js, since most of the complexity (by necessity) is user-defined.

It seems like this PR is basically the minimal set of changes needed to allow graphql-js to do anything with subscriptions. The open question seems to be how execution works: does the executor for a subscription expect resolve to return an observable object, or is there a layer atop the executor that is observable and, when it gets pushed a new event, runs the executor, with resolve returning an object or a promise as usual? The subscriptions work we've been doing at Facebook uses the latter approach, but I think either one would probably work.

@taion
Contributor

taion commented Oct 5, 2015

@dschafer I don't think they're quite symmetric, though. One advantage of the "resolve returns object or promise" approach is that it seems like this could let me directly share resolve between mutations and subscriptions, which seems like an important ergonomic advantage in cases where subscriptions mirror mutations.

@OlegIlyenko
Contributor

@dschafer thanks for looking into it and describing the approach you are using at FB! I agree, it's very important to keep the semantics as small as possible. Putting too many constraints on subscriptions, or explicitly supporting particular patterns, can drastically reduce their usefulness for many people.

I guess what I described in the gist was more an example of a subscription use case in server-to-server communication. I'm really excited to explore this area; I think it has a lot of potential.

The approach that you have described (where the observable sits atop the executor) is definitely a viable solution. One thing concerns me, though: it assumes there is only one type/kind of observable. One way to look at GraphQL is to treat it as an integration point for different backend services. This means that SubscriptionType can potentially expose streams of events/updates/data from different sources, which may have different capabilities (like support for lastEventId, advanced and efficient filtering, etc.). By making the stream itself first-class, in the form of a field in the SubscriptionType, it becomes much easier to expose the capabilities of every individual stream with standard GraphQL features like field arguments and aliases. So I guess the question is more: should it be one implicit stream of data, or an arbitrary number of explicit first-class streams in the form of fields, which are merged into one by the executor (the executor can also encapsulate some of the complexity associated with streams, just like it does with promises)?

As far as I understand, in the case of an observable sitting atop the executor, every field of SubscriptionType represents an event/update itself. Is that correct? @taion if my assumption is correct, then it should be possible to cover the use case you've described with both solutions. It will just be one level deeper in the subscription query. So, for instance, instead of writing it like this:
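
To make the "explicit first-class streams merged by the executor" option concrete, here is a rough sketch. Everything in it is an assumption made for illustration: `createStream` is a toy observable, `mergeFieldStreams` is a hypothetical executor step, and `humanEvents` is an invented second field next to `droidEvents`. None of it is graphql-js API.

```javascript
// Toy observable: subscribe(listener) returns an unsubscribe function.
function createStream() {
  const listeners = new Set();
  return {
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
    emit(value) {
      for (const listener of listeners) listener(value);
    },
  };
}

// Hypothetical executor step: one stream per requested field, merged into a
// single stream of payloads keyed by the field's response name (or alias).
function mergeFieldStreams(fieldStreams, onPayload) {
  const unsubscribers = Object.entries(fieldStreams).map(([responseKey, stream]) =>
    stream.subscribe((value) => onPayload({ data: { [responseKey]: value } }))
  );
  // Closing the merged subscription closes every underlying stream subscription.
  return () => unsubscribers.forEach((unsubscribe) => unsubscribe());
}

// Usage: two independent sources exposed as two subscription fields.
const droidEvents = createStream();
const humanEvents = createStream();
const payloads = [];
const stop = mergeFieldStreams({ droidEvents, humanEvents }, (p) => payloads.push(p));
droidEvents.emit({ id: 1 });
humanEvents.emit({ id: 2 });
stop();
droidEvents.emit({ id: 3 }); // ignored: the merged subscription was closed
```

Because each field owns its stream, per-stream capabilities such as a last-seen event id could be passed as ordinary field arguments when the stream is created.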

mutation {
  setName(name: "foo") {
    oldName
    newName
  }
}

....

subscription {
  setName {
    oldName
    newName
  }  
}

one can write subscription query like this:

subscription {
  mutationUpdates {
    setName {
      oldName
      newName
    }
  }  
}

Now I actually get a feeling that what we are discussing here is more about the execution mechanics/implementation. I wonder how this will affect the actual specification and whether it can be possible to describe subscription semantics in a way that leaves room for both of these implementation approaches. Maybe it would be a good idea to explore this in graphql/graphql-spec#86? :)

@taion
Contributor

taion commented Oct 5, 2015

Consider how we might want to define a subscription in graphql-js. Take as an example:

var GraphQLRenameTodoMutation = mutationWithClientMutationId({
  name: 'RenameTodo',
  inputFields: {
    id: { type: new GraphQLNonNull(GraphQLID) },
    text: { type: new GraphQLNonNull(GraphQLString) },
  },
  outputFields: {
    todo: {
      type: GraphQLTodo,
      resolve: ({localTodoId}) => getTodo(localTodoId),
    }
  },
  mutateAndGetPayload: ({id, text}) => {
    var localTodoId = fromGlobalId(id).id;
    renameTodo(localTodoId, text);
    return {localTodoId};
  },
});

What I would like to have would be something like

var GraphQLRenameTodoSubscription = subscriptionWithClientId({
  name: 'RenameTodo',
  outputFields: {
    todo: {
      type: GraphQLTodo,
      resolve: ({localTodoId}) => getTodo(localTodoId),
    }
  },
  subscribeAndGetPayload: (callback, {id}) => {
    var localTodoId = fromGlobalId(id).id;
    emitter.on(`renameTodo:${localTodoId}`, () => callback({localTodoId}));
    // Or return an RxJS observable.
  },
});

The symmetry here is that outputFields is exactly identical between my mutation and my subscription declaration.

}

var schema = new GraphQLSchema(schemaBody);
Contributor

You could combine this line with the one below and return the schema directly.

@leebyron
Contributor

leebyron commented Oct 6, 2015

This is looking great - I agree that this is the right step forward for enabling experimentation around subscriptions.

Could you add some test cases, perhaps within some of the validation rules, to ensure these additions you made are correct?

@skevy
Contributor Author

skevy commented Oct 6, 2015

Awesome. Yah I'll put that together tonight or tomorrow.

@grydstedt

This is looking great @skevy

@skevy
Contributor Author

skevy commented Oct 16, 2015

Guess I better get my butt in gear on this given this blog post: http://graphql.org/blog/subscriptions-in-graphql-and-relay/ :)

@skevy
Contributor Author

skevy commented Oct 16, 2015

Random aside...the hardest thing about writing tests for this is keeping the line lengths for the test descriptions under 80 characters :-p

@skevy
Contributor Author

skevy commented Oct 16, 2015

Updated tests @leebyron. One note: since subscriptions, from an executor perspective, execute the same as queries, I didn't create a separate subscriptions.js test file in execution/__tests__. There are many tests for query in the execution/__tests__ folder, and I'm not sure if you'd like more tests to cover subscription execution. Let me know how you'd like to proceed there.

leebyron added a commit that referenced this pull request Oct 20, 2015
[RFC] Initial pass at adding `subscription` to executor
@leebyron leebyron merged commit 39a2ebc into graphql:master Oct 20, 2015
@leebyron
Contributor

This is really solid. Thanks for your hard work and for your patience with my review!

@grydstedt

Awesome stuff

@arunoda

arunoda commented Nov 7, 2015

One question: how do you unsubscribe, or close the subscription?
Don't we need to have something for that in the GraphQL spec itself?

@Matthias247

Hi,
are there any updates or decisions for this? We are currently investigating if GraphQL could be a viable data query solution for some of our [realtime] systems, but subscriptions are an absolutely necessary thing for us.

Some input and feedback from my side:
I think @OlegIlyenko did a really good writeup.

I totally agree that GraphQL should not prescribe either of the following:

  • how subscriptions should be delivered to the client
  • whether subscriptions are live-query-like or event-like (which also means they either have or don't have an initial value)

The event-based solution which is described in @OlegIlyenko's further examples (and also similarly in this post: http://graphql.org/blog/subscriptions-in-graphql-and-relay/ ) makes a lot of sense if you have already organized your backend in an event-processing form.

If not (you only store the latest state), it might not be the best way for you:
When you use events in addition to queries to transfer the state of some specific thing, you put the responsibility on the client to reconstruct the true state from both query results and an event stream, which will be harder than simply getting the latest state through a new query.
If you separate subscribing from getting the initial value, then you might also run into race conditions: in the Facebook blog post example you might never see the correct likers for a story, e.g. under this condition:

  • you query the story and get some initial result (initial number of likers)
  • in parallel you subscribe (which triggers a subscription through another system, like MQTT in this example)
  • before this succeeds the number of likers changes.
  • You won't see that reflected in the initial result (which was already sent) and also not through the subscription (which will only trigger if some other person also likes the story and triggers the event, which might never happen).

This means that if you want to get reliable updates in that fashion, you also need to store some history of events and provide it on demand to the client (like @OlegIlyenko showed with droidEvents(lastSeenEventId: xyz)), and you need logic in the client that merges query and event results.
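
A small sketch of that replay-and-merge idea, under the assumption that the server keeps an ordered event log and that a query result carries the id of the last event it reflects. `eventLog`, `queryStory`, and `eventsSince` are hypothetical names, not part of any GraphQL library.

```javascript
// Hypothetical server state: a counter plus an append-only event log.
const eventLog = [];
let likers = 0;

function like() {
  likers += 1;
  eventLog.push({ id: eventLog.length + 1, likers });
}

// The query result carries the id of the last event it already reflects.
function queryStory() {
  return { likers, lastEventId: eventLog.length };
}

// Subscribing replays every event after lastSeenEventId, so a change that
// lands between the query and the subscription is not lost.
function eventsSince(lastSeenEventId) {
  return eventLog.filter((event) => event.id > lastSeenEventId);
}

like();                        // one liker exists before the client connects
const snapshot = queryStory(); // client sees the initial state
like();                        // change lands before the subscription is active
const missed = eventsSince(snapshot.lastEventId);

// Client-side merge: apply the replayed events on top of the snapshot.
const current = missed.reduce(
  (state, event) => ({ ...state, likers: event.likers, lastEventId: event.id }),
  snapshot
);
```

Without the replay step, the client in this scenario would stay at the snapshot value until some later event happened to arrive.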

Live queries make that easier, but as the FB blog post discusses, they might be hard to implement.

I think what is needed for all scenarios is a way to get some subscription context information inside the resolve function:

  • I need to know that I was subscribed to (and by whom)
  • I need to tell that component to rerun the query execution if something changes.
  • I need to know when an unsubscribe happens (to be able to also handle internal unsubscriptions)
  • I might need a way to tell the executor that no immediate resolve result is available (in the case that I want to use the subscriptions for events that might only happen in the future).

In the beginning I thought that probably every field could return an observable for subscriptions, but I don't really know how this would work out with subscriptions. The interesting thing about GraphQL is that we have nested queries, and therefore even while a client is subscribed to the same query internal subscriptions might need to be cancelled and recreated.

Another option that I thought about was that the resolve function could be passed some kind of notify callback function, which the implementer can store and call later to signal the executor (or the thing that triggered the executor) to rerun the query. This could then return something that signals that the subscription is no longer in use. But this would delay an unsubscribe until an event must be sent (probably never -> resource leak). Delivering an EventEmitter or observable in parallel to the notify function could solve this. But nested queries would still be hard. If you want to be totally agnostic, then only an opaque subscription object can be forwarded into the resolve function, which only needs to be understood by the server (which serves subscriptions and needs to rerun queries) and the user implementation.
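
One possible shape for such a subscription object is sketched below. It is purely hypothetical: `createSubscriptionHandle`, `notify`, `onUnsubscribe`, and `resolveScore` are names invented for this example, and graphql-js passes nothing like this to resolve functions.

```javascript
// Hypothetical handle that a server could forward into resolve functions.
function createSubscriptionHandle(rerunQuery) {
  const cleanups = [];
  let active = true;
  return {
    // Resolvers call notify() when their underlying data changes.
    notify() {
      if (active) rerunQuery();
    },
    // Resolvers register cleanup, so resources are released as soon as the
    // client unsubscribes instead of on the next event.
    onUnsubscribe(cleanup) {
      cleanups.push(cleanup);
    },
    unsubscribe() {
      active = false;
      cleanups.splice(0).forEach((cleanup) => cleanup());
    },
  };
}

// Example resolver watching a fake data source.
const watchers = new Set();
function resolveScore(handle) {
  const watcher = () => handle.notify();
  watchers.add(watcher);
  handle.onUnsubscribe(() => watchers.delete(watcher));
  return 42;
}

let reruns = 0;
const handle = createSubscriptionHandle(() => { reruns += 1; });
resolveScore(handle);
watchers.forEach((watcher) => watcher()); // data changed, so the query is rerun
handle.unsubscribe();                     // the watcher is removed immediately
```

This covers the notify and unsubscribe signals, but it does not address the nested-query problem: a rerun would still need to tear down and recreate the watchers of inner fields.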

On how to get subscription data from the server to the client:
There are lots of possibilities. @OlegIlyenko outlined one using an SSE stream. It is probably not the best solution for HTTP/1.1, because the number of parallel connections that you can make is very limited, but at least GraphQL allows you to compose all kinds of needed data, so you probably won't run into limitations there. HTTP/2 solves this.
With HTTP/2 and response body streaming (fetch API) you could also use other serializations to get events in the correct order through a single channel from server to client (e.g. send JSON serialized events with length-prefixing in the response body).
And of course you can also use websockets to deliver subscriptions: You send a well-defined subscription message including the query and some kind of client-subscription-id to the server and it will answer you with a series of subscription event messages until you unsubscribe.
Or you store all events for a given subscription on the server and rely on the client to fetch it with HTTP polling somehow.
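
As an illustration of the length-prefixing idea mentioned above, here is one possible framing: a 4-byte big-endian length followed by UTF-8 JSON. The frame layout and the `encodeFrame`/`decodeFrames` helpers are assumptions made for this sketch, not a standardized GraphQL transport.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Frame = 4-byte big-endian body length, then the UTF-8 encoded JSON body.
function encodeFrame(event) {
  const body = encoder.encode(JSON.stringify(event));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length);
  frame.set(body, 4);
  return frame;
}

// Decodes every complete frame in `bytes`; returns the events in order plus
// the leftover bytes of a frame that has not fully arrived yet.
function decodeFrames(bytes) {
  const events = [];
  let offset = 0;
  while (bytes.length - offset >= 4) {
    const length = new DataView(bytes.buffer, bytes.byteOffset + offset, 4).getUint32(0);
    if (bytes.length - offset - 4 < length) break; // incomplete frame
    const start = offset + 4;
    events.push(JSON.parse(decoder.decode(bytes.subarray(start, start + length))));
    offset = start + length;
  }
  return { events, rest: bytes.subarray(offset) };
}

// Usage: two events on one channel, with the last byte still in flight.
const first = encodeFrame({ subscriptionId: 's1', data: { likers: 2 } });
const second = encodeFrame({ subscriptionId: 's1', data: { likers: 3 } });
const stream = new Uint8Array(first.length + second.length);
stream.set(first, 0);
stream.set(second, first.length);
const decoded = decodeFrames(stream.subarray(0, stream.length - 1));
```

A receiver would keep `decoded.rest`, append the next chunk of the response body to it, and call `decodeFrames` again.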

Although not strictly necessary, it might make sense to standardize some of those mechanisms so that GraphQL server implementations are compatible with each other (and with tooling like GraphiQL).

@Matthias247

In the meantime I came to the conclusion that internal subscriptions (the things that need to be monitored in order to provide updates to a client) can not be valid for longer than 1 update in the general case with GraphQL.

As an example, let's look at the following subscription query:

subscription TopPlayerWithFriends {
    players(sortBy: score, length: 1) {
        friends(length: 5) {
            name
        }
    }
}

In order to provide live updates for this subscription, we would internally need to have a subscription on the list of players on the server. Additionally, we need to have a subscription on the list of friends for the top player. As soon as one of those internal dependencies changes, we need to rerun the query in order to deliver an update to the client. During this update our internal subscriptions would be partly invalidated and must be updated -> We still need to track the list of players, but we might need to track another player's friends list.

Depending on how you model your interface and data, updating internal subscriptions is not necessary, but for the general case it seems to be. To me this means that returning observables from the first query execution and monitoring them for delivering subscriptions would probably not work out too well.

It also means that @arunoda & Co might need to check how their invalidation server approach for reactive GraphQL covers these scenarios. I guess it won't if the dependencies are only sent during the initial subscription and not updated along with the individual subscription results?
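
The point can be illustrated with a toy version of the TopPlayerWithFriends query that records its internal dependencies on every run. The data, the `runTopPlayerQuery` function, and the dependency naming scheme are all made up for this sketch.

```javascript
// Hypothetical in-memory data; none of this is part of graphql-js.
const players = {
  alice: { score: 10, friends: ['carol'] },
  bob: { score: 5, friends: ['dave'] },
};

// Runs the "top player with friends" query and records which internal
// resources the result depended on during this particular run.
function runTopPlayerQuery() {
  const dependencies = new Set(['players']); // the player list is always watched
  const [topName] = Object.keys(players).sort(
    (a, b) => players[b].score - players[a].score
  );
  dependencies.add(`friends:${topName}`); // only the current top player's friends
  return {
    result: { top: topName, friends: players[topName].friends },
    dependencies,
  };
}

let run = runTopPlayerQuery();
const before = [...run.dependencies]; // depends on alice's friends list

players.bob.score = 20;    // a watched dependency ('players') changes
run = runTopPlayerQuery(); // the rerun replaces the whole dependency set
const after = [...run.dependencies]; // now depends on bob's friends list
```

The set of things to watch is therefore an output of each execution, which is why internal subscriptions created during the first run cannot simply be kept for the lifetime of the client subscription.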
