Skip to content
This repository has been archived by the owner on Apr 17, 2023. It is now read-only.

Query structure, session pinning #1980

Open
lastmjs opened this issue Sep 2, 2020 · 13 comments
Open

Query structure, session pinning #1980

lastmjs opened this issue Sep 2, 2020 · 13 comments
Labels
question Further information is requested

Comments

@lastmjs
Copy link

lastmjs commented Sep 2, 2020

Hey, I'm attempting to deploy Graphback into a serverless environment (AWS Lambda) that uses AWS RDS Proxy to manage connections to the database. The proxy is supposed to multiplex connections, but it seems there is not multiplexing occuring whatsoever. This is leading to extremely bad performance.

I'm just wondering if there is anything in the PostgreSQL queries being created by Graphback that might cause session pinning in AWS RDS Proxy.

From the AWS docs linked to below:

Multiplexing is more efficient when database requests don't rely on state information from previous requests. In that case, RDS Proxy can reuse a connection at the conclusion of each transaction. Examples of such state information include most variables and configuration parameters that you can change through SET or SELECT statements. SQL transactions on a client connection can multiplex between underlying database connections by default.

Does Graphback set any kind of state information needed across requests? I can't believe how poorly my proxy is performing, and I hope to have a little insight if anyone here knows. Thank you!

More information on pinning: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy.html#rds-proxy-pinning

@lastmjs lastmjs added the question Further information is requested label Sep 2, 2020
@machi1990
Copy link
Contributor

Automatically generated comment to notify maintainers
/cc @craicoverflow, @machi1990, @wtrocki

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

I think I've tracked down the most likely candidate...prepared statements. Does Graphback use prepared statements? Is there any way to turn them off if so?

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

Digging into my logs more I get to the following message:

2020-09-02T05:36:40.995Z [WARN] [clientConnection=3710375619] The client session was pinned to the database connection [dbConnection=1366424556] for the remainder of the session. The proxy can't reuse this connection until the session ends. Reason: A parse message was detected.

A parse message was detected.. Seems this might be related to extended queries? Any insight here? Is there anyway I can get rid of parse messages from within Graphback or Knex?

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

Seems this is going to be a problem with pg itself

@wtrocki
Copy link
Contributor

wtrocki commented Sep 2, 2020

We do not use prepared statements. Executed queries are very vanilla (anything that knex would do), however I believe that problem is related to your app configuration and unrelated to how we do queries.

I do think that if you use serverless and proxy, knex needs to be properly configured to not create connection pool (the part inside your app that setup knex database, similar to our templates).

https://github.com/aerogear/graphback/blob/master/templates/ts-apollo-postgres-backend/src/db.ts#L17

Pinning cannot be avoided anyway in some situations but if you have an external pool source we need to configure knex to initiate a connection every time it makes request. I would also see this as possible problem with the slowness of lambda as creating connection pool slows them.

To confirm my thesis I have configured proxy on AWS and used pooling:

How this causing issues

If I setup knex to have pool of 50 connections it takes the entire proxy connection pool on each lamba startup.

This is not only going to take lots of time for the startup but also going to block any other functions from accessing the pool. I do see connections being released after lambda is finished but with some delay etc.

What our team can get from this

Setting up graphback as a serverless function took me quite some time (without even touching subscription support) so it will be cool if we build some community template that graphback can use.

@wtrocki
Copy link
Contributor

wtrocki commented Sep 2, 2020

@lastmjs let me know if that helps you. As team we are devoted to your success with graphback. It was really nice to play with AWS capabilities - I never used lambdas and proxy before.

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

Thanks for getting back to me, you guys still rock!

So, as for your suggestion here:

I do think that if you use serverless and proxy, knex needs to be properly configured to not create connection pool (the part inside your app that setup knex database, similar to our templates).

How would I do this? I already have knex configured to have a min pool of 0 and max pool of 1. I'm not sure this is the issue, though I would like to try your suggestion.

From what I understand, knex makes a prepared statement every time. So all graphback queries are prepared statements. I found that out here: knex/knex#3636 (comment)

From all of my research, it seems that prepared statements are not supported in postgres proxies that do transaction pooling. This is true for AWS RDS and for pgBouncer. I've linked to the AWS documentation that says this above, and here is what pgBouncer has to say about it: https://www.pgbouncer.org/faq.html#how-to-use-prepared-statements-with-transaction-pooling

So really it seems to be the prepared statements. Not supported by transaction pooling proxies. Knex does prepared statements for everything, and there is no way to turn them off.

I've collected my research on the matter here on the node-postgres repo: brianc/node-postgres#2327

One avenue that I have not gone down yet is attempting to use the libpq native bindings in node-postgres to access their binary mode or their single roundtrip mode. I'll link to the issues on this below. They seem to be having good results with transaction pooling on pgBouncer, so it seems likely this would also work for AWS RDS proxy. I am not yet sure how I would configure this on node-postgres, but the first step is to get the native libpq compiled on an Amazon Linux instance.

More about the native libpq possible solution:

@wtrocki
Copy link
Contributor

wtrocki commented Sep 3, 2020

It is still maintaining connection especially that AWS lambdas.
Check knex/knex#3464

There is sample app (broken) that knex maintainer explains how to fix.

@wtrocki
Copy link
Contributor

wtrocki commented Sep 3, 2020

Alternatively, it will be better to create fastify middleware to disconnect database (that will be quick and easy) if pooling tricks do not work - I see they do.

@lastmjs
Copy link
Author

lastmjs commented Sep 4, 2020

There is sample app (broken) that knex maintainer explains how to fix.

I've been doing quite bit of testing in my staging and production environments. I assume the fix you mentioned from knex/knex#3464 has to do with letting the connection pool die. I fiddled with the idleTimeoutMillis, at 1000 and 0...my database was still overwhelmed. I also manually initialized a knex pool and destroyed it before and after calling the handler in my lambda. The database was still overwhelmed. It seems if I have requests that take any sort of time (simple requests that are extremely short seem to be handled fine), then the database is overwhelmed. I'm not sure what to do now but try pgBouncer (perhaps transaction pooling will magically work based on some comments I've seen), and maybe to consider a postgres-compatible database built for serverless loads like Aurora.

Have you had success with the pooling tricks you're mentioning?

@wtrocki
Copy link
Contributor

wtrocki commented Sep 7, 2020

I moved out and no longer have access to the env. Using Aurora seems like the best choice in terms of efficiency and cost. Let us know how this works.

@lastmjs
Copy link
Author

lastmjs commented Sep 8, 2020

I will let you know. I'm looking into optimizing my queries with indexes, VACUUM and ANALYZE. That doesn't seem to be having any affect. It looks like scaling the database with read replicas or going with Aurora Serverless might be the next options. I'll keep you updated.

@lastmjs
Copy link
Author

lastmjs commented Sep 9, 2020

After all of my testing, the best path forward seems to be Aurora Serverless with the Data API. The Data API claims to abstract away connection handling. I'll be extensively testing it in my staging environment. If it looks promising, I'll most likely be moving forward with it.

The problem now is how to hook up Graphback to use the Data API. It is an HTTP API that accepts SQL queries directly. I believe I would need to somehow intercept the SQL queries from knex or node-postgres to get this to work. Unfortunately, that path does not look very promising. I would love to know if you know of a way that this can be done.

Unfortunately, this might be the end of the road for me and Graphback on the project I've been using it on. I love Graphback, but we need more stability and conservatism moving forward. Graphback not being easily compatible with the Data API seems to be the nail in the coffin. We have to have direct control over our SQL queries.

I appreciate all of the help, and again if there is any simple way to allow Graphback to use the Data API, I would love to know soon.

@machi1990 machi1990 removed the triage label Sep 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants