Reimplement prepared statements with LRU cache and statement deduplication #618

zainkabani · 2023-10-13T20:10:12Z

This PR reimplements prepared statements in PgCat. There was a large latency regression when running PgCat with multiple clients and prepared statements, most likely caused by cache misses and the need to prepare multiple statements on the server.

This new approach hashes the Parse content (excluding the name) and uses the hash as the key to a map that holds rewritten Parse messages. This greatly increases the number of cache hits and deduplicates identical statements that different clients send under different names.
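As a rough illustration of the hashing idea, here is a minimal sketch that hashes a Parse message while ignoring its name. The `Parse` struct and the `parse_hash` function are simplified, hypothetical stand-ins and not PgCat's actual types; the real message carries more fields.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified stand-in for a Parse message.
#[derive(Debug, Clone)]
pub struct Parse {
    pub name: String,
    pub query: String,
    pub param_types: Vec<i32>,
}

/// Hash everything except the statement name, so that two clients sending
/// the same query under different names produce the same key.
pub fn parse_hash(parse: &Parse) -> u64 {
    let mut hasher = DefaultHasher::new();
    parse.query.hash(&mut hasher);
    // Hashing a Vec also mixes in its length, which covers the parameter count.
    parse.param_types.hash(&mut hasher);
    hasher.finish()
}
```

Under these assumptions, two Parse messages that differ only in `name` map to the same key, while a change in the query text or the parameter types yields a different one.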

Prepared statements in action

We ran a test against a Baseline PgCat using the extended protocol and a Feature PgCat using prepared statements. The test runs 12 unique prepared statements against an empty table, which helps us exclude actual PG compute from the measurement and focus on the other parts of the query execution process.

We were able to get a slight latency win, about 5% faster.

But even more interestingly, we saw that the prepared statement database instance's CPU usage was less than half that of the extended protocol instance. While this is an extreme case, it helps to show how much time can be saved by pre-planning queries and reducing the amount of work PG has to do.

Other notes and features:

PgCat will try to detect and recover from invalid statement cache errors by deallocating all prepared statements on the server when it gets the cached plan must not change result type error message. This helps to fix things when there are DDL changes that invalidate cached plans.
Rewrites the extended protocol buffering logic to better support prepared statements and to queue pgcat-generated response messages.
Removes the prepared_statements variable and moves prepared_statements_cache_size to the pool configuration level. This is because, when enabled, prepared statements incur a small packet inspection penalty to determine whether a statement is named or not.
Fixes stats for prepared statements, and adds an eviction stat for when a prepared statement is evicted from a server's LRU.

Latency is equivalent to pgbouncer, with both running with a cache size of 1000, 50 max connections and 1 thread.

Pgbouncer

➜  ~ pgbench -T 120 -c 20 -j 3 --protocol prepared postgresql://postgres:postgres@127.0.0.1:7432/postgres
pgbench (14.9 (Homebrew), server 15.3 (Debian 15.3-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 3
duration: 120 s
number of transactions actually processed: 36749
latency average = 65.345 ms
initial connection time = 3.273 ms
tps = 306.069304 (without initial connection time)

PgCat

➜  ~ pgbench -T 120 -c 20 -j 3 --protocol prepared postgresql://postgres:postgres@127.0.0.1:6432/postgres
pgbench (14.9 (Homebrew), server 15.3 (Debian 15.3-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 3
duration: 120 s
number of transactions actually processed: 36808
latency average = 65.225 ms
initial connection time = 9.805 ms
tps = 306.632400 (without initial connection time)

Implementation details:

TLDR:

Connection pool has an LRU of hash of query to rewritten parse
Client has a mapping of the name it's using to the rewritten parse
Server has an LRU of the names of the rewritten prepared statements it has

ConnectionPool:
This connection pool has a new attribute which looks like this:
cache: LruCache<u64, Arc<Parse>>

It is an LRU cache where the key is a hash of the contents of the Parse packet excluding the name, i.e. (query string, number of params, param types). The value is an Arc of the rewritten Parse message (it's an Arc to avoid duplicating data that is given to the clients).

When a Parse comes in, the pool determines whether the query already exists (ignoring the name). If it exists, the pool clones the Arc<Parse> and gives it to the client. If it doesn't exist, the pool creates a new name for the packet that will be used to prepare it against the server. The name is based on a global counter that increments every time we need to generate a new name.

If we exceed the capacity of the LRU then the Arc<Parse> is simply dropped. There is no other management to be done since any clients that still need it have a copy of the Parse and can just add it back to the cache if needed.
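The pool-level behaviour described above could be sketched roughly as follows. This is a std-only approximation, not the PR's code: it swaps the `LruCache` for a `HashMap` plus a `VecDeque`, and `PoolCache`, `get_or_insert`, `register`, and the `PGCAT_` name prefix are hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Simplified stand-in for a rewritten Parse message (hypothetical fields).
#[derive(Debug)]
pub struct Parse {
    pub name: String,
    pub query: String,
    pub param_types: Vec<i32>,
}

/// Global counter used to generate unique server-side statement names.
static NAME_COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Minimal LRU keyed by the Parse hash; the front of `order` is the coldest entry.
pub struct PoolCache {
    capacity: usize,
    entries: HashMap<u64, Arc<Parse>>,
    order: VecDeque<u64>,
}

impl PoolCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        PoolCache { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Insert or promote `parse` under `hash`, evicting the coldest entry when full.
    pub fn register(&mut self, hash: u64, parse: &Arc<Parse>) {
        self.order.retain(|h| *h != hash);
        if !self.entries.contains_key(&hash) && self.entries.len() >= self.capacity {
            if let Some(coldest) = self.order.pop_front() {
                // Dropping the Arc is the only cleanup; clients keep their own clones.
                self.entries.remove(&coldest);
            }
        }
        self.entries.entry(hash).or_insert_with(|| Arc::clone(parse));
        self.order.push_back(hash);
    }

    /// Return the cached rewritten Parse, or create one under a freshly generated name.
    pub fn get_or_insert(&mut self, hash: u64, query: &str, param_types: &[i32]) -> Arc<Parse> {
        if let Some(existing) = self.entries.get(&hash).cloned() {
            self.register(hash, &existing); // promote to most recently used
            return existing;
        }
        let id = NAME_COUNTER.fetch_add(1, Ordering::Relaxed);
        let parse = Arc::new(Parse {
            name: format!("PGCAT_{}", id), // hypothetical naming scheme
            query: query.to_string(),
            param_types: param_types.to_vec(),
        });
        self.register(hash, &parse);
        parse
    }
}
```

In this sketch a client that still holds an `Arc<Parse>` after eviction can call `register` to put the same entry back, which mirrors the "add it back to the cache if needed" behaviour without generating a new name.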

Client:
The client has a new attribute which looks like this:
prepared_statements: HashMap<String, Arc<Parse>>

This is a mapping of the prepared statement names set by the client to the rewritten Parse messages. When a new Parse comes in, the client checks against the connection pool for the rewritten Parse message and stores it in this map. Any messages that come in afterwards (Bind, Describe) will be rewritten based on the rewritten Parse message's name.

The main reason we're using an Arc for the Parse messages is that clients that use a prepared statement will all need the original contents of the Parse (especially when the message has been evicted from the ConnectionPool), and that can take up memory. By using an Arc we have one copy of the Parse, and each client stores a reference to it. The Parse is released from memory when the client is dropped and it no longer exists on the ConnectionPool.
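To make the client-side rewrite concrete, here is a small sketch of the name mapping. `ClientStatements`, `remember`, `rewrite_bind`, and the pared-down `Parse` and `Bind` structs are hypothetical and only model the statement-name field.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified rewritten Parse; `name` is the pool-generated name.
pub struct Parse {
    pub name: String,
    pub query: String,
}

/// Simplified Bind carrying only the statement name it refers to.
pub struct Bind {
    pub prepared_statement: String,
}

/// Maps the names chosen by the client to the shared rewritten Parse messages.
#[derive(Default)]
pub struct ClientStatements {
    prepared_statements: HashMap<String, Arc<Parse>>,
}

impl ClientStatements {
    /// Store the rewritten Parse obtained from the pool under the client's own name.
    pub fn remember(&mut self, client_name: &str, rewritten: Arc<Parse>) {
        self.prepared_statements.insert(client_name.to_string(), rewritten);
    }

    /// Replace the client's statement name in a Bind with the rewritten name.
    /// Returns false, leaving the Bind untouched, if the name is unknown.
    pub fn rewrite_bind(&self, bind: &mut Bind) -> bool {
        let rewritten = self
            .prepared_statements
            .get(&bind.prepared_statement)
            .map(|parse| parse.name.clone());
        match rewritten {
            Some(name) => {
                bind.prepared_statement = name;
                true
            }
            None => false,
        }
    }
}
```

A Describe message would be rewritten the same way; it is omitted here to keep the sketch short.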

Server:
The server has a new attribute which looks like this:
prepared_statement_cache: LruCache<String, ()>

The server needs to know which prepared statements it has, so this mapping stores the rewritten names in an LRU. The client prepares the Parse on the server if needed (the server might not already have it) and updates (adds/promotes) the name of the statement in the server's prepared_statement_cache.

If a statement needs to be evicted from the cache, pgcat will send a close message to the server to drop it.
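The server-side bookkeeping might look roughly like the sketch below. It is a std-only approximation that uses a `VecDeque` in place of `LruCache<String, ()>`; `ServerStatements` and the tuple returned by `register` are assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Names of the statements prepared on one server connection.
/// The front of the queue is the least recently used name.
pub struct ServerStatements {
    capacity: usize,
    names: VecDeque<String>,
}

impl ServerStatements {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        ServerStatements { capacity, names: VecDeque::new() }
    }

    /// Record that `name` is about to be used on this server.
    ///
    /// Returns `(needs_prepare, evicted)`: `needs_prepare` is true when the
    /// server does not have the statement yet and a Parse must be sent, and
    /// `evicted` names a statement the caller should close on the server.
    pub fn register(&mut self, name: &str) -> (bool, Option<String>) {
        if let Some(pos) = self.names.iter().position(|n| n == name) {
            // Already prepared: promote it to most recently used.
            let existing = self.names.remove(pos).expect("position is in bounds");
            self.names.push_back(existing);
            return (false, None);
        }
        let evicted = if self.names.len() >= self.capacity {
            self.names.pop_front()
        } else {
            None
        };
        self.names.push_back(name.to_string());
        (true, evicted)
    }
}
```

With a capacity of 2, registering three distinct names evicts the least recently used one, and the caller would then send a Close for the returned name.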

zainkabani · 2023-10-21T16:40:32Z

src/config.rs

@@ -568,6 +556,9 @@ pub struct Pool {
    #[serde(default)] // False
    pub log_client_parameter_status_changes: bool,

+    #[serde(default = "Pool::default_prepared_statements_cache_size")]


Moved to pool instead of global because there is a small packet inspection penalty you pay when this is enabled to determine if packets are using named statements or not

zainkabani · 2023-10-21T17:02:57Z

src/server.rs

@@ -957,6 +970,42 @@ impl Server {
                    if self.in_copy_mode {
                        self.in_copy_mode = false;
                    }
+                    // TODO: consider logging a warning here
+
+                    if self.prepared_statement_enabled {


Let's say we have a statement that uses select * on a table, but a new column is added after we prepared it. PG will send a cached plan must not change result type error if we try to use this statement.

This change tries to identify this type of error and runs DEALLOCATE ALL on the server connection to force re-prepares. Clients will still see errors the first time for each server that hasn't deallocated, but this will help clean up the pool.
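A minimal sketch of that detection step, assuming the error is recognised by matching the message text. `is_stale_plan_error` and `recovery_query` are hypothetical helpers, and the `Vec<String>` stands in for the server's statement cache.

```rust
/// True if the server error indicates a cached plan invalidated by a DDL change.
pub fn is_stale_plan_error(message: &str) -> bool {
    message.contains("cached plan must not change result type")
}

/// Decide how to recover from a server error message.
///
/// On a stale-plan error, forget every statement name tracked for this
/// server and return the query that drops them on the server side, so the
/// next use of each statement triggers a fresh prepare.
pub fn recovery_query(
    error_message: &str,
    cached_names: &mut Vec<String>,
) -> Option<&'static str> {
    if is_stale_plan_error(error_message) {
        cached_names.clear();
        Some("DEALLOCATE ALL")
    } else {
        None
    }
}
```

The client that triggered the error still receives it; only later uses on that server connection benefit from the re-prepare.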

levkk · 2023-10-23T16:06:50Z

src/client.rs

        pool: &ConnectionPool,
        server: &mut Server,
        address: &Address,
    ) -> Result<(), Error> {
        // We want to update this in the LRU to know this was recently used and add it if it isn't there already
        // This could be the case if it was evicted or if doesn't exist (ie. we reloaded and it got removed)
-        pool.register_parse_to_cache(hash, parse);
+        if let Some(new_parse) = pool.register_parse_to_cache(hash, &parse) {
+            // If the pool has renamed this parse, we need to update the client cache with the new name


Who's doing this and why?

If the pool is cleared and we are generating new names for the Parse messages, then we want to update those within the client. I was thinking about ways to handle DDL changes better, but I think I'll have a follow-up PR instead and exclude this change.

levkk

This is really cool, thank you!

zainkabani and others added 25 commits October 13, 2023 10:58

Initial commit

06f7ea5

Cleanup and add stats

1cd957e

Use an arc instead of full clones to store the parse packets

af6c2ea

Use mutex instead

2247f86

Merge branch 'main' into zain/reimplment-prepared-statements-with-global-lru-cache

d4d88c4

fmt

d2927d0

clippy

fb23e33

fmt

2fb3d4a

fix?

a59aa63

fix?

e2963e9

fmt

8111582

typo

6e85fb2

Update docs

46c8f9e

Refactor custom protocol

9a80b47

fmt

19d8478

move custom protocol handling to before parsing

a07874a

Support describe

bf5a39c

Add LRU for server side statement cache

6a87a68

rename variable

b528b95

Refactoring

0177af8

Move docs

cd7942b

Fix test

6205548

fix

9cd675e

Update tests

89e2651

trigger build

d37514f

zainkabani changed the title ~~Reimplement prepared statements with global LRU cache and statement deduplication~~ Reimplement prepared statements with LRU cache and statement deduplication Oct 16, 2023

zainkabani added 4 commits October 16, 2023 17:36

Add more tests

bcce2d5

Reorder handling sync

63aa0c7

Support when a named describe is sent along with Parse (go pgx) and expecting results

e392607

don't talk to client if not needed when client sends Parse

53880f2

zainkabani and others added 6 commits October 17, 2023 15:28

nit

6a1d7f6

Reduce hashing

a5d4bcf

Reducing work done to decode describe and parse messages

dd021c2

minor refactor

116a681

Merge branch 'main' into zain/reimplment-prepared-statements-with-global-lru-cache

72826e6

Merge branch 'main' into zain/reimplment-prepared-statements-with-global-lru-cache

cfe8e9f

zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch 2 times, most recently from 8d15135 to cfe8e9f Compare October 20, 2023 21:27

zainkabani added 3 commits October 21, 2023 02:59

Rewrite extended and prepared protocol message handling to better support mocking response packets and close

b27c918

An attempt to better handle if there are DDL changes that might break cached plans with ideas about how to further improve it

d107bbe

fix

21b9cde

zainkabani commented Oct 21, 2023

Minor stats fixed and cleanup

d791f06

zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from fb0d253 to d791f06 Compare October 21, 2023 17:20

zainkabani marked this pull request as ready for review October 21, 2023 17:20

levkk and others added 2 commits October 23, 2023 10:22

Cosmetic fixes (#64)

a57550d

* Cosmetic fixes * fix test

Change server drop for statement cache error to a deallocate all

db70499

zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from 1ee3df3 to db70499 Compare October 23, 2023 14:27

levkk reviewed Oct 23, 2023

zainkabani force-pushed the zain/reimplment-prepared-statements-with-global-lru-cache branch from 4ac4832 to db70499 Compare October 23, 2023 16:25

zainkabani added 7 commits October 23, 2023 12:29

Updated comments and added new idea for handling DDL changes impacting cached plans

7fa1147

fix test?

005029d

Revert test change

6928d31

trigger build, flakey test

b889b4b

Avoid potential race conditions by changing get_or_insert to promote for pool LRU

0dd5e88

remove ps enabled variable on the server in favor of using an option

962090a

Add close to the Extended Protocol buffer

80aa607

levkk approved these changes Oct 25, 2023

levkk merged commit 7d3003a into postgresml:main Oct 25, 2023
