Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement prepared statements with LRU cache and statement deduplication #618

Conversation

zainkabani
Copy link
Contributor

@zainkabani zainkabani commented Oct 13, 2023

This PR reimplements prepared statements in PgCat. There was a large latency regression when running PgCat with multiple clients and prepared statements, this is most likely due to cache misses and needing to prepare multiple statements on the server.

This new approach hashes the Parse content (excluding the name) as a key to a map which holds rewritten parse messages, this greatly increased the number of cache hits we get and deduplicates any of the same statements that different clients send with different names.

Prepared statements in action

Running a test against a Baseline PgCat with the extended protocol and a Feature PgCat with prepared statements we were able to get some data. The test runs 12 unique prepared statements against an empty table, this helps us exclude actual PG compute from other parts of the query execution process.

We were able to get a slight latency win, about 5% faster
image

But even more interestingly we saw the prepared statement database instance's CPU usage was less than half of the extended protocol instance. While an extreme case, this helps to show how much time can be saved by pre-planning queries and reducing the amount of work PG has to do.
image

Other notes and features:

  • PgCat will try to detect and invalid statement cache errors by deallocating all prepared statements on the server when it gets the cached plan must not change result type error message. This helps to fix things when there are DDL changes that invalidate cached plans
  • Rewrites the extended protocol buffering logic to better support prepared statements and queue pgcat generated response messages
  • Removes the prepared_statements variable and moves the prepared_statements_cache_size to the pool configuration level, this is because when enabled prepared statements will incur a small packet inspection penalty to determine if a statement is named or not
  • Fixes stats for prepared statements, and adds a eviction stat for when a prepared statement is evicted from a server's LRU

Equivalent latency to pgbouncer, all running with 1000 cache size, 50 max connections and 1 thread

Pgbouncer

➜  ~ pgbench -T 120 -c 20 -j 3 --protocol prepared postgresql://postgres:postgres@127.0.0.1:7432/postgres
pgbench (14.9 (Homebrew), server 15.3 (Debian 15.3-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 3
duration: 120 s
number of transactions actually processed: 36749
latency average = 65.345 ms
initial connection time = 3.273 ms
tps = 306.069304 (without initial connection time)

PgCat

➜  ~ pgbench -T 120 -c 20 -j 3 --protocol prepared postgresql://postgres:postgres@127.0.0.1:6432/postgres
pgbench (14.9 (Homebrew), server 15.3 (Debian 15.3-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 3
duration: 120 s
number of transactions actually processed: 36808
latency average = 65.225 ms
initial connection time = 9.805 ms
tps = 306.632400 (without initial connection time)

Implementation details:

TLDR:

  • Connection pool has an LRU of hash of query to rewritten parse
  • Client has a mapping of the name it's using to the rewritten parse
  • Server has an LRU of the names of rewritten the prepared statements it has

ConnectionPool:
This connection pool has a new attribute which looks like this:
cache: LruCache<u64, Arc<Parse>>

It is an LRU cache where the Key is a hash of the contents of Parse packet excluding the name ie. (query string, number of params, param types). The value is an Arc of the rewritten Parse message (it's an Arc to avoid duplicating data that is given to the clients).

When a Parse comes in, the pool determines if it the query already exists (ignoring the name), if it exists it clones the Arc<Parse> and gives it to the client, if it doesn't exist it creates a new name for the packet that will be used to prepare it against the server. This is based on a global counter that increments every time we need to generate a new name.

If we exceed the capacity of the LRU then the Arc<Parse> is simply dropped. There is no other management to be done since any clients that still need it have a copy of the Parse and can just add it back to the cache if needed.

Client:
The client has a new attribute which looks like this:
prepared_statements: HashMap<String, Arc<Parse>>

This is a mapping of the Prepared Statement names set by the client to the rewritten Parse messages. When a new parse comes in it checks against the connection pool for the rewritten Parse message and stores it in this map. Any statements that come in (Bind, Describe) will be rewritten based on the rewritten Parse messages name.

The main reason we're using an Arc for the Parse messages is because clients that use a prepared statement will all need the original contents of the Parse (especially in the case when the message is evicted from the ConnectionPool) and that can take up memory, by using an Arc we have one copy of the Parse and the client stores a reference to this. The Parse is released from memory when the client is dropped and it no longer exists on the ConnectionPool.

Server:
The server has a new attribute which looks like this:
prepared_statement_cache: LruCache<String, ()>

The server needs to know which prepared statements it has, this mapping stores the rewritten names in an LRU. The client prepares the Parse on the server if needed (it might not already have it) and it updates (adds/promotes) the name of the statement to the ConnecctionPool's prepared_statement_cache.

If a statement needs to be evicted from the cache, pgcat will send a close message to the server to drop it.

@zainkabani zainkabani changed the title Reimplement prepared statements with global LRU cache and statement deduplication Reimplement prepared statements with LRU cache and statement deduplication Oct 16, 2023
@zainkabani zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch 2 times, most recently from 8d15135 to cfe8e9f Compare October 20, 2023 21:27
@@ -568,6 +556,9 @@ pub struct Pool {
#[serde(default)] // False
pub log_client_parameter_status_changes: bool,

#[serde(default = "Pool::default_prepared_statements_cache_size")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to pool instead of global because there is a small packet inspection penalty you pay when this is enabled to determine if packets are using named statements or not

src/server.rs Outdated
@@ -957,6 +970,42 @@ impl Server {
if self.in_copy_mode {
self.in_copy_mode = false;
}
// TODO: consider logging a warning here

if self.prepared_statement_enabled {
Copy link
Contributor Author

@zainkabani zainkabani Oct 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say we have a statement that uses select * on a table but a new column is added after we prepared it, PG will send a cached plan must not change result type error if we try to use this statement.

This change tries to identify this type of error and DEALLOCATE ALL on the server connection to force re-prepares. Clients will still see errors the first time for each server that hasn't deallocated but this will help clean up the pool.

@zainkabani zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from fb0d253 to d791f06 Compare October 21, 2023 17:20
@zainkabani zainkabani marked this pull request as ready for review October 21, 2023 17:20
@zainkabani zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from 1ee3df3 to db70499 Compare October 23, 2023 14:27
src/client.rs Outdated
pool: &ConnectionPool,
server: &mut Server,
address: &Address,
) -> Result<(), Error> {
// We want to update this in the LRU to know this was recently used and add it if it isn't there already
// This could be the case if it was evicted or if doesn't exist (ie. we reloaded and it got removed)
pool.register_parse_to_cache(hash, parse);
if let Some(new_parse) = pool.register_parse_to_cache(hash, &parse) {
// If the pool has renamed this parse, we need to update the client cache with the new name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who's doing this and why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the pool is cleared and we are generating new names for the parse messages then we want to update those within the client. I was thinking about ways to handle DDL changes better, but I think I'll have a follow up PR instead and exclude this change.

@zainkabani zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from 4ac4832 to db70499 Compare October 23, 2023 16:25
Copy link
Contributor

@levkk levkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool, thank you!

@levkk levkk merged commit 7d3003a into postgresml:main Oct 25, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants