Create a garbage collected connection pool #157

Open
peschkaj opened this issue Jul 19, 2013 · 0 comments

I've set this up as a series of rules. Unless noted, this should behave as if it were a regular connection pool.

Individual connections will never be GC'd or removed (barring specific circumstances detailed below).

Instead, the connection pool is subdivided into buckets. Individual buckets can be removed from active use over time, but once removed from active use a bucket may never be re-used (even if an active connection remains inside the bucket). This is called reclamation. This lets us safely create blocks of connections and reclaim the resources used by buckets of connections.

When a user attempts to connect, existing buckets are checked in sequence from 0 to MAX. If a bucket contains an unused connection, the user receives that connection. If no connections are available, move to the next bucket. Should we run out of buckets, but total connection count is < max pool size, a new bucket is created and all connections in the bucket are initialized.
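A minimal sketch of that lookup, in Python rather than CorrugatedIron's own C#. The names `Connection`, `Bucket`, `BucketedPool` and `acquire` are hypothetical, and returning `None` when the pool is exhausted is an assumption, since the rules above don't say whether the caller should wait or fail.

```python
class Connection:
    def __init__(self):
        self.in_use = False


class Bucket:
    def __init__(self, size):
        # All connections in a new bucket are initialized up front.
        self.connections = [Connection() for _ in range(size)]
        self.reclaimed = False


class BucketedPool:
    def __init__(self, bucket_size, max_pool_size):
        self.bucket_size = bucket_size
        self.max_pool_size = max_pool_size
        self.buckets = []

    def total_connections(self):
        # Assumption: only connections in non-reclaimed buckets are counted.
        return sum(len(b.connections) for b in self.buckets if not b.reclaimed)

    def acquire(self):
        # Check active buckets in sequence, starting from bucket 0.
        for bucket in self.buckets:
            if bucket.reclaimed:
                continue
            for conn in bucket.connections:
                if not conn.in_use:
                    conn.in_use = True
                    return conn
        # Out of buckets: grow by one bucket if the pool still has room.
        if self.total_connections() < self.max_pool_size:
            bucket = Bucket(self.bucket_size)
            self.buckets.append(bucket)
            conn = bucket.connections[0]
            conn.in_use = True
            return conn
        return None  # pool exhausted (assumed behaviour)
```

Because the size check happens before a whole bucket is added, the pool can overshoot the configured maximum by up to one bucket, which matches the rounding-up behaviour described further down.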

Buckets can be marked for reclamation when:

  • All connections in the bucket are unused.
  • Fewer than 20% of connections in the bucket are in use AND the bucket below the current bucket has unused connections.
    E.g. bucket 7 has 1 active connection being used by a long running query. Buckets 5, 6, 8 have no active connections. In this scenario bucket 7 is eligible for reclamation.
  • Only the top 20% of buckets (rounded up) may be marked for reclamation at any time.
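The rules above can be sketched as a per-bucket check plus a cap. This is one reading, not a settled design: the second rule is taken to mean "fewer than 20% of the bucket's connections are in use", which is what the bucket 7 example suggests, and "the bucket below" is taken to be the next lower-numbered bucket. `is_eligible` and `reclaim_cap` are hypothetical names. Which buckets count as the "top" 20% is left open here; the sketch only computes how many may be marked.

```python
def is_eligible(in_use_counts, index, bucket_size, percent=20):
    """in_use_counts[i] is the number of busy connections in bucket i."""
    busy = in_use_counts[index]
    if busy == 0:
        return True  # rule 1: every connection in the bucket is unused
    lightly_used = busy * 100 < percent * bucket_size
    if lightly_used and index > 0:
        # rule 2: the bucket below still has unused connections
        return in_use_counts[index - 1] < bucket_size
    return False


def reclaim_cap(bucket_count, percent=20):
    # Number of buckets that may be marked at once, rounded up.
    # Integer arithmetic avoids floating point surprises in the rounding.
    return (bucket_count * percent + 99) // 100
```

With buckets of 10 connections and busy counts `[10, 10, 10, 10, 10, 0, 0, 1, 0]`, bucket 7 passes the check while bucket 0 does not.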

Buckets marked for reclamation aren't immediately destroyed, but they are removed from active utilization. That is, once a bucket has been marked for reclamation, it cannot be re-used. New buckets will be created to serve new connections, even if a reclaimed bucket has 1 active connection and many unused connections.

Once a bucket has been placed on the reclamation pile, individual connections can be destroyed as long as they are not busy. This frees up resources. Once the last connection is destroyed, the bucket itself may be destroyed immediately.
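A rough sketch of one drain pass over a reclaimed bucket. `drain_reclaimed` is a hypothetical helper, and connections are modelled as plain dicts with a `busy` flag; actually closing the underlying socket is left out.

```python
def drain_reclaimed(connections):
    """Run one pass over a reclaimed bucket's connections.

    Returns (survivors, destroyed_count, bucket_can_be_destroyed).
    """
    # Busy connections are left alone; idle ones are dropped.
    survivors = [c for c in connections if c["busy"]]
    destroyed = len(connections) - len(survivors)
    # The bucket itself can go once no connections remain.
    return survivors, destroyed, not survivors
```

Running the pass repeatedly (for example on a timer, which is an assumption) eventually empties the bucket once its long running work finishes.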

Only active connections in active buckets count toward max pool size.

Max pool size does not have to be evenly divisible by bucket size. We will, however, have a fudge factor and round up. E.g. max pool size 97 with bucket size 30 means max pool size is actually 120.
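The rounding is a ceiling division by bucket size. A small sketch, with `effective_max_pool_size` as a hypothetical name:

```python
def effective_max_pool_size(max_pool_size, bucket_size):
    # Round the bucket count up, then multiply back out.
    bucket_count = -(-max_pool_size // bucket_size)  # ceiling division
    return bucket_count * bucket_size


# 97 connections in buckets of 30 needs 4 buckets, i.e. 120 connections.
assert effective_max_pool_size(97, 30) == 120
```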

Bucket size and bucket count should be configurable parameters. Both bucket size and count should have hard minimums to minimize overhead - creating 100 buckets of size 1 would be resource intensive and a crazy idea.

Potential Issue: Long Running Connections
It's important to make sure that applications don't maintain long running connections to a single node in the cluster. Long lived connections could interfere with bucket reclamation.

This can be solved with a combination of techniques. First, we can promote buckets. E.g. Buckets 5, 6, 8 have no active connections. Bucket 7 has 1 active connection being used. Buckets 5, 6, 8 can be marked for reclamation and bucket 7 gets promoted to bucket 5. This will prevent connection pool spamming - active connections and buckets will bubble up to the top level buckets and should be re-used more frequently.
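Promotion can be sketched as renumbering: idle buckets are marked for reclamation and the buckets that still hold active connections close the gaps. `promote` is a hypothetical name, and the sketch ignores the cap on how many buckets may be marked at once.

```python
def promote(in_use_counts):
    """in_use_counts[i] is the number of busy connections in bucket i.

    Returns (active, reclaimed): active[new_index] is the old index of the
    bucket now sitting at new_index; reclaimed lists the old indices marked
    for reclamation.
    """
    active, reclaimed = [], []
    for old_index, busy in enumerate(in_use_counts):
        if busy == 0:
            reclaimed.append(old_index)
        else:
            active.append(old_index)
    return active, reclaimed


# Buckets 0-4 are busy, 5, 6 and 8 are idle, 7 holds one connection.
active, reclaimed = promote([3, 3, 3, 3, 3, 0, 0, 1, 0])
assert reclaimed == [5, 6, 8]
assert active.index(7) == 5  # old bucket 7 is now bucket 5
```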

The design of CorrugatedIron should prevent this from becoming a problem. However, a little bit of future proofing never hurt anyone.

Potential Issue: Connection Pool Fragmentation
It's possible that, through the course of normal operations, there are a number of active buckets each containing a small number of active connections to a Riak cluster. In these situations it will be important to defragment the connection pool by moving connections to buckets at the top of the stack.

To avoid the need for any locking mechanism, we should bubble up active connections one at a time. The defragmentation order should look at reclaimed buckets and then walk the stack of active buckets to move active connections between buckets.
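One way to picture "one at a time" is a step function that moves a single active connection per call. This is only a sketch under assumptions: bucket 0 is treated as the top of the stack, buckets are modelled as busy counts, `defrag_step` is a hypothetical name, and how a live connection is actually handed from one bucket to another is not specified here. Reclaimed buckets, which the proposal wants visited first, would simply be passed through the same step before the active ones.

```python
def defrag_step(in_use_counts, bucket_size):
    """Move one busy connection toward the top; return (src, dst) or None."""
    for dst in range(len(in_use_counts)):
        if in_use_counts[dst] >= bucket_size:
            continue  # this bucket is full, try the next one down
        # dst is the topmost bucket with room; take from the lowest busy bucket.
        for src in range(len(in_use_counts) - 1, dst, -1):
            if in_use_counts[src] > 0:
                in_use_counts[src] -= 1
                in_use_counts[dst] += 1
                return (src, dst)
        return None  # nothing busy below dst: already compact
    return None  # every bucket is full
```

Calling the step until it returns `None` leaves the busy connections packed into the lowest-numbered buckets.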
