This repository contains a full implementation of a PostgreSQL database proxy using HTTP and WebSockets.
Establishing a connection to a PostgreSQL database is a rather expensive operation. This is a widely known issue, which is why connection pools are normally used to alleviate it.
That said, connection pools are best suited for long-lived server processes, and don't play well with "serverless" types of deployment (for example AWS Lambda Functions or CloudFlare Workers).
AWS introduced RDS Proxy to alleviate some of the problems related to Lambda functions, and also introduced a Data API (available only with RDS Serverless v1) that allowed HTTP-based access (a great help with CloudFlare Workers). With that endpoint deprecated, however, no other solutions exist at this time.
This PostgreSQL proxy is a tiny server providing access to pooled PostgreSQL connections over HTTP (for HTTPS, use a load balancing solution like NGINX or AWS Application Load Balancer).
Single queries can be executed with POST, while multiple statements (such as multi-step transactions) can be executed over the WebSocket protocol.
The protocol is extremely trivial: only strings (in the format defined by libpq) are passed over the wire, while conversion from those strings to their JavaScript equivalents is performed by the client.
Clients are available for:
- Node JS: @juit/pgproxy-client-node
  'Nuff said!
- WHATWG + WebCrypto: @juit/pgproxy-client-whatwg
  Although this client can be used in a browser, it is specifically designed to work with CloudFlare Workers (and tested with workerd).
- Direct via LibPQ: @juit/pgproxy-client-libpq
  Defeating the purpose of the client-server model, used normally only for running unit tests or in development environments.
The server component of the proxy is provided as an installable library in @juit/pgproxy-server, and an extremely simple command line interface can be found in the @juit/pgproxy-cli package.
- @juit/pgproxy-cli
  Command line interface to easily run our PGProxy Server.
- @juit/pgproxy-client
  Abstract implementation of the PGProxy Client and registry for actual implementations, somewhat inspired by the way JDBC drivers work in Java-land.
- @juit/pgproxy-client-node
  Concrete implementation of the PGProxy Client for Node JS.
- @juit/pgproxy-client-psql
  Test PGProxy Client implementation using libpq directly.
- @juit/pgproxy-client-whatwg
  Concrete implementation of the PGProxy Client for CloudFlare Workers.
- @juit/pgproxy-persister
  A higher-level interface offering (on top of the usual connection and query interface) a CRUD abstraction over database tables.
- @juit/pgproxy-pool
  Connectivity layer to PostgreSQL via libpq offering connection pooling.
- @juit/pgproxy-server
  Library providing the main implementation of our PGProxy Server.
- @juit/pgproxy-types
  Library providing type conversions between libpq strings and JavaScript objects.
The protocol used by PGProxy is extremely trivial. Both POST and UPGRADE requests are accepted on any path (the request path is completely ignored) as the server, by design, exposes one and only one interface to a single PostgreSQL database.
Load balancers can (and should) be used to group multiple connections, mapping them to different request paths, and to provide SSL.
Authentication is performed by specifying the auth query string parameter with a token as described here. We rely on query string parameters, rather than headers, because, by design, the standard WebSocket client API does not provide a way to set custom headers alongside the UPGRADE request.
As tokens are unique per request (they are tied to their timestamp and can never be reused), we felt this was an acceptable compromise compared to implementing a full ticketing system, as is common in WebSocket scenarios (mainly for performance reasons).
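As a minimal sketch of the query string approach, the snippet below appends a token to an endpoint URL. The `withAuth` helper and the example host are hypothetical, and the token is assumed to have been generated elsewhere, since its format is not covered in this section.

```typescript
// Hypothetical helper (not part of any PGProxy package): adds an
// already-generated token as the `auth` query string parameter.
function withAuth(endpoint: string, token: string): string {
  const url = new URL(endpoint)
  // `searchParams.set` percent-encodes the token where needed
  url.searchParams.set('auth', token)
  return url.href
}

// The same helper works for HTTP(S) POST and for WebSocket endpoints
const postUrl = withAuth('https://proxy.example.com/', 'TOKEN')
// → "https://proxy.example.com/?auth=TOKEN"
```

Because tokens can never be reused, a helper like this would be called with a fresh token for every POST and for every WebSocket connection.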
Requests have the following format:

    {
      // a unique id to correlate requests and responses (normally a random UUID)
      "id": "...",
      // the SQL query to execute
      "query": "SELECT ...",
      // optional parameters to be substituted in lieu of "$n" in the query string
      "params": [ "foo", "bar", ... ],
    }
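A request in this format can be assembled with a few lines of TypeScript. This is only a sketch: the `PGProxyRequest` type and `buildRequest` function are illustrative names, and allowing `null` among the parameters is an assumption not confirmed by this document.

```typescript
// Illustrative type mirroring the request format described above
interface PGProxyRequest {
  id: string
  query: string
  params?: (string | null)[] // assumption: null stands for SQL NULL
}

// Builds a request, omitting "params" when there is nothing to substitute
function buildRequest(
    id: string,
    query: string,
    params: (string | null)[] = [],
): PGProxyRequest {
  return params.length > 0 ? { id, query, params } : { id, query }
}

// The id would normally be a random UUID; a fixed one keeps the example short
const exampleRequest = buildRequest('req-1', 'SELECT $1::text AS foo', [ 'some text' ])
const exampleBody = JSON.stringify(exampleRequest)
```

The serialized `exampleBody` is what would travel over the wire, presumably as the body of a POST or as a WebSocket message.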
Positive responses have the following format:

    {
      // the same ID from the request (copied verbatim)
      "id": "...",
      // the status code, as in HTTP, always 200 for _positive_ responses
      "statusCode": 200,
      // the command associated with the result (e.g. "SELECT", "INSERT", ...)
      "command": "SELECT",
      // the number of rows _affected_ (e.g. the number of added rows in "INSERT")
      "rowCount": 123,
      // the result fields tuples (in column order) indicating name and OID
      "fields": [
        [ "foo", 25 ], // the "foo" column (index 0) is of type "text"
        [ "bar", 16 ], // the "bar" column (index 1) is of type "bool"
      ],
      // the result rows
      "rows": [                 // |_____foo_____|__bar__|
        [ "some text", "t" ],   // | "some text" | true  |
        [ null, "f" ],          // | null        | false |
      ],
    }
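Since rows carry only strings, the client has to combine each value with the OID of its column to rebuild JavaScript values. The sketch below illustrates the idea for the two OIDs used in the example above; it is not the implementation found in @juit/pgproxy-types, and all names in it are hypothetical.

```typescript
type Field = [name: string, oid: number]

// Illustrative type mirroring the positive response format above
interface PGProxyResult {
  id: string
  statusCode: 200
  command: string
  rowCount: number
  fields: Field[]
  rows: (string | null)[][]
}

// Parsers keyed by type OID; unknown OIDs fall back to the raw string
const parsers: Record<number, (value: string) => unknown> = {
  // bool: PostgreSQL's text output is "t" or "f" (compared case-insensitively here)
  16: (value) => value.toLowerCase() === 't',
  // text: already a string
  25: (value) => value,
}

// Converts positional rows into objects keyed by column name
function toObjects(result: PGProxyResult): Record<string, unknown>[] {
  return result.rows.map((row) => {
    const object: Record<string, unknown> = {}
    result.fields.forEach(([ name, oid ], index) => {
      const value = row[index]
      if (value === null || value === undefined) {
        object[name] = null // SQL NULL needs no parsing
      } else {
        const parse = parsers[oid]
        object[name] = parse ? parse(value) : value
      }
    })
    return object
  })
}

const exampleResult: PGProxyResult = {
  id: 'req-1',
  statusCode: 200,
  command: 'SELECT',
  rowCount: 2,
  fields: [ [ 'foo', 25 ], [ 'bar', 16 ] ],
  rows: [ [ 'some text', 't' ], [ null, 'f' ] ],
}

const exampleObjects = toObjects(exampleResult)
// → [ { foo: 'some text', bar: true }, { foo: null, bar: false } ]
```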
Error responses have the following format:

    {
      // the same ID from the request (copied verbatim)
      "id": "...",
      // the status code: 400 for SQL errors, 500 for any other error
      "statusCode": 400,
      // the error message to return to the client
      "error": "... the error message ...",
    }
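On the client side, the `statusCode` field is enough to tell the two kinds of response apart. The following sketch (with hypothetical type and function names) checks that the response matches the request id, returns positive responses untouched, and turns error responses into exceptions.

```typescript
// Illustrative types mirroring the response formats described above
interface PGProxySuccess {
  id: string
  statusCode: 200
  command: string
  rowCount: number
  fields: [string, number][]
  rows: (string | null)[][]
}

interface PGProxyFailure {
  id: string
  statusCode: 400 | 500
  error: string
}

type PGProxyResponse = PGProxySuccess | PGProxyFailure

// Returns the positive response, or throws for a mismatched id or an error
function unwrap(requestId: string, response: PGProxyResponse): PGProxySuccess {
  if (response.id !== requestId) {
    throw new Error(`Unexpected response id "${response.id}"`)
  }
  if (response.statusCode === 200) return response
  // 400 indicates an SQL error, 500 any other error
  const kind = response.statusCode === 400 ? 'SQL error' : 'Proxy error'
  throw new Error(`${kind} (${response.statusCode}): ${response.error}`)
}
```

Checking the id matters most over WebSockets, where several requests may share one connection and responses need to be correlated with them.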
Performance is one of the main points of PGProxy, and while a lot of optimizations can still be done, it already dramatically outperforms a typical installation with pg.
The testing scenario is as follows:
- PostgreSQL 13.8 database deployed on AWS (RDS Serverless V2)
- Load balancing and SSL off-loading are performed by AWS ALBv2
- PGProxy 1.0 is deployed on an EC2 instance (t4g.small)
- Test machine sitting approximately 25ms away from AWS (fiber connection)
- All queries performed are a trivial SELECT now()
- Everywhere, NodeJS v18.17.1
With plain pg the test machine tried two tests:
- without recycling connections: (connect, query, disconnect) 100 times
- using a single connection: (connect, query 100 times, disconnect)
Using the @juit/pgproxy-client-node client, the test machine tried three tests:
- repeating query 100 times directly on the client (using HTTP POST)
- repeating connect then query 100 times (using one WebSocket per query)
- repeating query 100 times within a connect callback (reusing a single WebSocket)
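The scenarios above all boil down to running some asynchronous task 100 times and dividing the elapsed time by the number of runs. A generic harness for that could look like the sketch below; it is not the script used for the measurements, `averageMs` is a hypothetical name, and running the iterations sequentially is an assumption.

```typescript
// Runs `task` sequentially `iterations` times and returns the average
// wall-clock time per run, in milliseconds.
async function averageMs(
    task: () => Promise<unknown>,
    iterations: number = 100,
): Promise<number> {
  const start = Date.now()
  for (let i = 0; i < iterations; i++) {
    await task() // wait for each run to finish before starting the next
  }
  return (Date.now() - start) / iterations
}
```

Each scenario would then supply its own `task`, for example one that connects, queries and disconnects, or one that only queries over an already open connection.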
The results are as follows:
- pg with multiple connections: ~202 ms per query
- pg reusing the same connection: ~45 ms per query
- pgproxy using HTTP POST: ~70 ms per query
- pgproxy executing one query per WebSocket: ~72 ms per query
- pgproxy executing all queries over the same WebSocket: ~30 ms per query
So, overall, and in all scenarios (whether running a single query, or multiple queries over the same connection), pgproxy significantly outperforms the standard pg library.
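To put the measurements in perspective, the approximate figures reported above can be turned into speed-up ratios. The snippet only does arithmetic on those published numbers; the property and function names are illustrative.

```typescript
// Approximate milliseconds per query, as reported in the results above
const msPerQuery = {
  pgNewConnection: 202,
  pgSingleConnection: 45,
  proxyHttpPost: 70,
  proxyWebSocketPerQuery: 72,
  proxySingleWebSocket: 30,
}

// Ratio between two timings, rounded to one decimal place
const speedup = (baseline: number, candidate: number): number =>
  Math.round((baseline / candidate) * 10) / 10

// Single query, no connection reuse
const viaPost = speedup(msPerQuery.pgNewConnection, msPerQuery.proxyHttpPost) // 2.9
const viaWebSocket = speedup(msPerQuery.pgNewConnection, msPerQuery.proxyWebSocketPerQuery) // 2.8

// Multiple queries over the same connection
const reused = speedup(msPerQuery.pgSingleConnection, msPerQuery.proxySingleWebSocket) // 1.5
```

In other words, a single query is roughly 2.8 to 2.9 times faster through the proxy, and repeated queries over one connection are about 1.5 times faster.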
Q.E.D.