Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use DataLoader in GraphQL logic to compose In clause for SQL query #125

Open
landonreed opened this issue Jun 8, 2018 · 5 comments
Open

Comments

@landonreed
Copy link
Contributor

Currently, nested objects in a GraphQL request are fetched in separate SQL queries. For example, if requesting the trips for a pattern with nested stop_times, one query will be made for the n trips and then n SQL queries will be made to fetch the stop_times.

We should use a BatchLoader to make this more efficient.

cc @abyrd.

@abyrd
Copy link
Member

abyrd commented Jul 10, 2018

Oh nice, I always cringed a little at cascading hundreds of SQL queries, it's good to know there's a more efficient way to handle that.

Example:

SELECT * FROM stop_times WHERE trip_id = 'a';
SELECT * FROM stop_times WHERE trip_id = 'b';
SELECT * FROM stop_times WHERE trip_id = 'c';

can be combined to

SELECT * FROM stop_times WHERE trip_id in ('a', 'b', 'c');

There are also query simplifications that can be done by joining tables rather than cascading queries, but the above is one of the more straightforward simplifications.

@landonreed
Copy link
Contributor Author

Yes, I'd like to fast track this because the way we're doing it now is super inefficient. Every time I see those redundant SQL queries stack up in the logs I also cringe.

@evansiroky
Copy link
Contributor

I'm not sure BatchLoader would help in this case. The BatchLoader works well for when the same entity is repeatedly being requested as a result of being seen in different levels of the graph. However, another tool may exist that can do SQL joins. I'll try to look into that.

@abyrd abyrd changed the title Use BatchLoader in GraphQL logic to compose In clause for SQL query Use DataLoader in GraphQL logic to compose In clause for SQL query Sep 15, 2018
@abyrd
Copy link
Member

abyrd commented Sep 15, 2018

If I'm not mistaken the device we're talking about is called a DataLoader rather than a BatchLoader (https://github.com/graphql-java/java-dataloader). I changed the name of the issue, but please correct me if I'm wrong.

@evansiroky I think the cacheing is just one of two optimizations, the other of which is batching requests. Most of the articles I've looked at on the subject mention batching requests together.

https://spin.atomicobject.com/2017/05/15/optimize-graphql-queries/
https://medium.com/@gajus/using-dataloader-to-batch-requests-c345f4b23433
http://www.petecorey.com/blog/2017/08/14/batching-graphql-queries-with-dataloader/

@landonreed
Copy link
Contributor Author

#144 begins to address this. However, the original intent of this ticket was to reduce the number of SQL queries for a GraphQL query that has nested entities (e.g., n trips with nested stop_times) from n individual queries for the nested entity to one query that has an "in clause" containing all of the parent join values. We could then add logic to the DataLoader that is able to filter out the results for each JdbcQuery.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants