Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS RDS and Extended Query Protocol #2326

Closed
lastmjs opened this issue Sep 2, 2020 · 19 comments
Closed

AWS RDS and Extended Query Protocol #2326

lastmjs opened this issue Sep 2, 2020 · 19 comments

Comments

@lastmjs
Copy link

lastmjs commented Sep 2, 2020

Hi! Amazing library, thanks for all of the hard work.

TLDR: The extended query protocol makes the AWS RDS Proxy entirely useless. Is there any way to turn off the extended query protocol in pg?

I have a serious issue I'm facing in a serverless environment. I'm using AWS Lambda, and directly connecting to postgres has caused horrific performance and stability issues because of the vast numbers of connections being opened and closed. So, the solution from AWS seems to be to use their RDS Proxy. It sits in front of postgres and handles a warm connection pool. I've set it all up properly. Unfortunately, all of my connections are being session pinned, essentially rendering the proxy useless, and it even seems to be performing worse than without the proxy. I've looked at my error logs and found this error repeatedly:

The client session was pinned to the database connection [dbConnection=251482716] for the remainder of the session. The proxy can't reuse this connection until the session ends. Reason: A parse message was detected.

Reason: A parse message was detected.. This seems to come because of the extended query protocol. Apparently the extended query protocol causes pinning, as documented here: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy.html#rds-proxy-pinning

So my real question is, is there any way to turn off the extended query protocol in pg? If not, does anyone have any insight into how I can properly manage connections to my database in a serverless environment? This is a major issue for us at this time, and I'm not sure how to proceed

@boromisp
Copy link
Contributor

boromisp commented Sep 2, 2020

You probably won't be able to use prepared statements, so you should try to use a query building library.

https://github.com/datalanche/node-pg-format
https://github.com/felixfbecker/node-sql-template-strings
https://github.com/vitaly-t/pg-promise#formatting-filters

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

Thank you. I am already using Knex, and it uses prepared statements and it doesn't look like you can turn those off: knex/knex#3636 (comment)

Do you suggest it will be impossible to use AWS RDS Proxy with prepared statements in any way? What are the other common proxies that might support this? I'm currently looking into pgbouncer

@lastmjs
Copy link
Author

lastmjs commented Sep 2, 2020

This #2266 is looking very related to my problem. I might be able to get the RDS proxy to work if I can do something like binary parameters as described in this post: https://blog.bullgare.com/2019/06/pgbouncer-and-prepared-statements/

Seems that the RDS Proxy uses transaction pooling only, and pgbouncer uses session and transaction pooling. I would love to avoid setting up pgbouncer if possible, and I'm not even sure that session pooling would work well.

Is there something like binary parameters in pg?

@boromisp
Copy link
Contributor

boromisp commented Sep 3, 2020

From the linked page:

Prepared statements cause the proxy to pin the session. This rule applies whether the prepared statement uses SQL text or the binary protocol.

Even the single round-trip, unnamed prepared statements will pin the session, there doesn't seem to be a way around that.
The pgBouncer might support these in transaction pooling mode, but I haven't personally used either.

@eugene-kim
Copy link

If not, does anyone have any insight into how I can properly manage connections to my database in a serverless environment? This is a major issue for us at this time, and I'm not sure how to proceed

@lastmjs I'm using Knex with node-postgres as well and I believe we have the same problem.

What do you think about using Knex as a querybuilder and using another Postgres client with the extended query protocol off to make the requests to the RDS Proxy? It would be hacky and you'd want a way to sanitize input as much as possible before sending it over to the DB but it may help get the ball rolling until a better solution is found. I'm wondering if I'm missing something, though.

@lastmjs
Copy link
Author

lastmjs commented Oct 3, 2020

Essentially you're creating another proxy to use the RDS proxy, and yes sanitization is a huge problem. I don't believe there is a way to do it and be perfectly safe, unless you use the postgres parameterized queries. I searched as many libraries as I could, and didn't find anything that would let me safely sanitize SQL queries that I can just send as raw.

We've come to the conclusion that RDS proxy will not work for us. If they can somehow implement transaction pooling, it may work in the future, but not now. We have stopped using knex, and are just using the node-postgres API with parameterized queries. We have set up our own pool on a server, essentially just a node-postgres pool exposed with a simple express server. I've written some code to handle transactions from our clients, and we're getting close to pushing it to production.

I'm not sure there's another way to get around these problems. I've been intensely trying to solve this for the past month or so. I'll let you know if our current solution works.

Something else we tried was using Aurora Serverless with the Data API. From my benchmarking, that should also have worked. But the Data API has too many limitations, the biggest one being that responses are limited to 1MB. That was a big problem, so we decided against it.

@eugene-kim
Copy link

Many thanks for the details response, @lastmjs

Would love to hear how it goes!

We've come to the conclusion that RDS proxy will not work for us. If they can somehow implement transaction pooling, it may work in the future, but not now.

Did you mean session pooling? It was my understanding that RDS Proxy already used transaction pooling

@lastmjs
Copy link
Author

lastmjs commented Oct 4, 2020 via email

@lastmjs
Copy link
Author

lastmjs commented Oct 4, 2020 via email

@brianc
Copy link
Owner

brianc commented Oct 4, 2020

But in node-postgres and I imagine other clients, parameterized queries are generally implemented as prepared statements.

There is no difference at the protocol level, actually. Any query with parameters (whether or not you call it a prepared statement) requires a parse/bind/describe/execute series of messages. Either way it "prepares" the query during the parse phase. Sorry the RDS proxy is so picky about parameterized queries...that's quite limiting as doing parameters in queries with string concatenation and client side sanitization is not ideal. Let me know if there's anything you can think of from this libraries side to help...but sounds like an RDS issue for now.

@brianc brianc closed this as completed Oct 4, 2020
@eugene-kim
Copy link

I also have official AWS support responses confirming that the RDS Proxy will not work for my use case, it will not pool prepared statements. And I think I made a mistake in the previous comment. I don't believe it's parameterized queries that are the problem per se, but prepared statements. But in node-postgres and I imagine other clients, parameterized queries are generally implemented as prepared statements.

This has provided me closure as well. Thanks for keeping me in the loop @lastmjs , much appreciated

@wesleyabbey
Copy link

You probably won't be able to use prepared statements, so you should try to use a query building library.

https://github.com/datalanche/node-pg-format
https://github.com/felixfbecker/node-sql-template-strings
https://github.com/vitaly-t/pg-promise#formatting-filters

Thank you.
Sanitizing our queries with something like pg-format helped prevent RDS Proxy from session pinning.

@lastmjs
Copy link
Author

lastmjs commented Feb 12, 2021

I should have updated you all a while ago...I'm not sure how much of what I have said in this thread can be relied upon, because I found the source of my issues. There were 1 or 2 extremely unoptimized queries that were taking down the performance of our entire system. Once I cleaned those up, the problems resolved. We've been using Lambdas with a pool size set to 1 for months now, and all is working very well.

What I discussed with the AWS people should be useful still, because the benchmarking I did I believe still showed that prepared statements cannot be proxied efficiently. But just know that my actual problems came from my own queries...haha, good times right?

@eugene-kim
Copy link

Awesome! Good to hear that you got to the bottom of it @lastmjs

Is your takeaway that even with session pinning the proxy can handle your serverless loads sufficiently? Or something else?

@Prophet32j
Copy link

we were experiencing this issue with version 7.18. We upgraded to 8.6 and were successfully connecting with prepared statements through the proxy. Yes, you will have session pinning, but RDS Proxy can handle it.

@AtherBilal
Copy link

I'm curious what ratio people are between DatabaseConnectionsCurrentlySessionPinned and ClientConnections. I'm using Knex and I've tried to optimize my queries as much as I possible can but my ratio is still very low

@picosam
Copy link

picosam commented Feb 4, 2022

@lastmjs thanks a lot for providing so much insight! If I may ask, do you create your pg client outside of lambda handlers; and do you limit it to a single connection or do you also allow it a pool size? I'm assuming you are still using RDS Proxy.

@lastmjs
Copy link
Author

lastmjs commented Feb 4, 2022

@picosam I don't really remember, I don't work for that employer now and I've transitioned entirely into Web3. I think we created it outside of the handler, and I think we used a single connection, but really I don't remember. I'm sure I did the most sensible way after trying out or researching many ways of doing it.

@timothymathison
Copy link

AWS may have finally fixed the issue with RDS: https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-rds-proxy-postgresql-extended-query-protocol/
So I believe unnamed prepared statements used by the extended query protocol should no longer trigger pinning in RDS proxy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants