Instrumentation #27
Yes, that code example tracks the start and end of resolving a field, not the time it spends actually executing those fields, so no other integration is needed with graphql-ruby. You would see a similar problem any time fields are able to be resolved completely concurrently, such as from thread/connection pools, locking, or CPU-intensive tasks in a parallel executor. The problem really is about instrumenting batch loaders.
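For concreteness, here is a minimal sketch of that kind of timing, assuming graphql-ruby 1.x field instrumentation (this is not the exact code example referenced above); it only brackets the resolve call itself:

```ruby
class FieldTimerInstrumentation
  TIMINGS = [] # collected in-process purely for illustration

  def instrument(type, field)
    old_resolve = field.resolve_proc
    timed_resolve = ->(obj, args, ctx) do
      started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      result = old_resolve.call(obj, args, ctx)
      # For a batch-loaded field, `result` is still a pending promise at this point,
      # so this interval covers neither the batched fetch nor any callbacks that run
      # after the promise is fulfilled.
      TIMINGS << {
        field: "#{type.name}.#{field.name}",
        duration: Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at,
      }
      result
    end
    field.redefine { resolve(timed_resolve) }
  end
end

# MySchema = GraphQL::Schema.define do
#   # ...
#   instrument(:field, FieldTimerInstrumentation.new)
# end
```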
Although the resolving is mostly happening in the batch loader, the resolver could still take significant time. I don't think this fact should be ignored, since the resolver might end up doing an unbatched load or non-trivial computation before doing the batch load, so assuming the start of execution happens in the loader would be misleading in a very important case. If you actually want to visualize the batch execution, then you will need to make sure that you can collect more than just the start and end time for resolving a field. The other thing that will complicate the instrumentation that you didn't mention is that GraphQL::Batch::Promise callbacks get called immediately after they are fulfilled, so a callback that does non-trivial computation or an unbatched query could make it look like that time was spent in the batch loader itself.
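A hypothetical resolver illustrating that last caveat, in graphql-ruby 1.x style; `RecordLoader`, `Comment`, and `render_markdown` are illustrative names rather than anything the gem provides:

```ruby
PostType = GraphQL::ObjectType.define do
  name "Post"
  field :featuredCommentHtml, types.String do
    resolve ->(post, _args, _ctx) do
      RecordLoader.for(Comment).load(post.featured_comment_id).then do |comment|
        # This block runs synchronously as soon as the batch promise is fulfilled, so
        # instrumentation that only brackets the batch load would charge this work,
        # and any unbatched query made here, to the loader itself.
        render_markdown(comment.body)
      end
    end
  end
end
```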
One possibility here is to allow the application to configure the executor so it can mix instrumentation into it. I think this would also be useful to support parallel or concurrent execution. Another would be to have a subscription and notification system that would allow code to run around the batch load, which would let it capture the start and end time.
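To make that concrete, one possible shape of such a hook, prepending a timing module around the loader's resolve step and publishing through ActiveSupport::Notifications; the method name being wrapped and the event name are assumptions, not an API the gem currently promises:

```ruby
require "active_support/notifications"

module BatchLoadTiming
  def resolve
    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    super
  ensure
    # Publish an event for each batch that actually ran, so subscribers can see
    # when the batched work started and how long it took.
    ActiveSupport::Notifications.instrument(
      "graphql_batch.loader.resolve",
      loader: self.class.name,
      duration: Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at,
    )
  end
end

GraphQL::Batch::Loader.prepend(BatchLoadTiming)

# A subscriber could then collect timings, e.g.:
# ActiveSupport::Notifications.subscribe("graphql_batch.loader.resolve") do |*, payload|
#   puts "#{payload[:loader]} took #{payload[:duration]}s"
# end
```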
Ideally I would like to move the async execution strategy support into graphql-ruby and to decouple the batch loaders from graphql, since I could see batch loading being useful outside of graphql (e.g. batch loading data accessed during liquid rendering). As such, I don't plan on having the batch loaders track these fields, but I do think it should allow this to be done. However, by having a way to run code around field resolving and batch loading, it would already be possible to keep track of the context in which a batch load is requested. So I think the only thing missing would be instrumentation for when GraphQL::Batch::Loader#load is called. I was already planning on making it so that GraphQL::Batch::Loader#load would make sure the loader was registered with the executor, so that the GraphQL::Batch::Loader could be used to cache loads across a GraphQL request. That means this instrumentation could also be handled by the executor.
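A sketch of that missing piece, wrapping GraphQL::Batch::Loader#load to record who asked for each key; the thread-local keys are made-up conventions standing in for whatever the field-level hooks would actually provide:

```ruby
module LoadCallTracking
  def load(key)
    requests = Thread.current[:graphql_batch_load_requests] ||= []
    requests << {
      loader: self.class.name,
      key: key,
      # Assumes the field-resolution hooks described above have stashed the
      # currently-resolving field somewhere visible to the loader.
      field: Thread.current[:graphql_current_field],
      requested_at: Process.clock_gettime(Process::CLOCK_MONOTONIC),
    }
    super
  end
end

GraphQL::Batch::Loader.prepend(LoadCallTracking)
```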
I assume you mean "any time fields are not able to be resolved..."?
You're right, of course. I think the current line of thought is to measure times for field resolution. Although thinking about this more (and considering your post-batch processing time point), we may need to get even more nuanced if we want to be completely correct here. It's certainly conceivable that a field may end up executing over multiple non-contiguous blocks of time before resolving in some execution models. I'm not sure if we want to fudge this fact or not.
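If we did want to represent that rather than fudge it, the per-field timing data would need to be a list of spans instead of a single start/end pair. A made-up illustration of what that might look like:

```ruby
# Illustrative shape only; the field path and numbers are invented.
field_timings = {
  "feed.0.repository.full_name" => [
    { start_ms: 0.10, duration_ms: 1.85 }, # resolver work before the batch load
    { start_ms: 5.20, duration_ms: 0.40 }, # promise callback after the batch resolves
  ],
}
```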
This makes a lot of sense to me also.
Some expanded current thinking on the data to collect: https://github.com/apollostack/optics-agent/blob/pcarrier/proposals/timing/proposals/timing.md
@dylanahsmith looking at this again, and working through the new […]. Then when the first promise in the batch calls […]. Am I making sense here?
Following up from rmosolgo/graphql-ruby#354 (comment):
@dylanahsmith :
Actually, the above does not work (or at least not in a particularly useful way). The problem is that the executor does not necessarily run the loader associated with the field immediately; instead it may wait for some other loaders (or fields, I suppose) to execute first. So the "start time" logged above (when the field is added to the loader) is not really "correct" in terms of the work done to fetch the field.
As an example, if I run a query against our example GitHunt server that looks something like:
You end up seeing traces that look like:
Note in the above that the start time of all the fields is more or less the same (as the executor runs over the set of entries, and they are all added to the various loaders), and the total time is "cumulative". In actuality the `vote` loader is more or less instantaneous (running against a local sqlite db in this case), and a correct start time for those fields should really be at the end of the `repository` loader (so I guess 2.03ms in this screenshot). This is why I think proper instrumentation of the batch loader needs to know two things:
The first part is trivial I suppose, but the second seems tricky.