Auth passthrough (auth_query) #266
8c2544b
to
d0e3081
Compare
@@ -34,6 +34,8 @@ rustls-pemfile = "1"
hyper = { version = "0.14", features = ["full"] }
phf = { version = "0.11.1", features = ["macros"] }
exitcode = "1.1.2"
postgres-protocol = "0.6.4"
This is a great find!
@magec I created this library a while ago with intention to create a PR to use this in PgCat. Wondering if you'd take a look, https://github.com/zain-kabani/postgres-proto-rs
I took this one because it is the one used in the postgres crate. Also, I found it pretty easy to use and it is widely used. I haven't seen yours yet; I will take a look.
Hey, I'm trying this on RDS (Aurora Serverless v2) and it is timing out while connecting, whereas it works with the dockerized Postgres, and RDS works fine on the main branch, so this seems to be a regression from main.
Uhmm, weird, could you provide some logs in debug mode?
After a bit of thought, I'm surprised this works on RDS at all because of pgbouncer/pgbouncer#265 RDS blocks |
I set the auth query to The important thing for this PR is that I was having the connection problem even with the auth_user / auth_query turned off. Logs from debug mode:
pgcat.toml:
[general]
host = "0.0.0.0"
port = 6432
enable_prometheus_exporter = true
prometheus_exporter_port = 9930
connect_timeout = 50000
healthcheck_timeout = 100000
healthcheck_delay = 30000
shutdown_timeout = 60000
ban_time = 60
log_client_connections = false
log_client_disconnections = false
autoreload = false
admin_username = "postgres"
admin_password = "postgres"
[pools.postgres]
pool_mode = "transaction"
default_role = "any"
query_parser_enabled = true
primary_reads_enabled = true
sharding_function = "pg_bigint_hash"
pool_size = 9
statement_timeout = 10000
[pools.postgres.users.0]
username = "postgres"
password = "********"
pool_size = 9
statement_timeout = 0
[pools.postgres.shards.0]
servers = [
[ "*********", 5432, "primary" ]
]
database = "postgres"
Hello! Sorry for the late response. I could not reproduce the problem: using an instance on localhost with user/pass postgres/postgres and the config you provided, it just works. I also checked the code that decides whether to use
My log goes like:
Any ideas what could be different? I used a pgcat compiled from this branch.
I have added some commits to implement config reload. I also detected an issue regarding reload which I think was a bug. The commit is this one.
/cc @levkk |
src/client.rs
Outdated
}
}
if password_hash.is_none() {
warn!("Clien auth is not possible, you either have not set a valid auth_query or a password for {{ username: {:?}, pool_name: {:?}, application_name: {:?} }}", username, pool_name, application_name);
- warn!("Clien auth is not possible, you either have not set a valid auth_query or a password for {{ username: {:?}, pool_name: {:?}, application_name: {:?} }}", username, pool_name, application_name);
+ warn!("Client auth is not possible, you either have not set a valid auth_query or a password for {{ username: {:?}, pool_name: {:?}, application_name: {:?} }}", username, pool_name, application_name);
src/config.rs
Outdated
@@ -448,6 +458,16 @@ impl Config {
    pub fn default_path() -> String {
        String::from("pgcat.toml")
    }

    pub fn overwrite_passwords(&mut self) {
        if let Ok(val) = env::var("PGCAT_ADMIN_PASSWORD") {
We don't allow env-based configuration at the moment, only file based.
I added it because it goes along with the `auth_query` feature. One of the things we get from `auth_query` is being able to stop using cleartext passwords in the config. To really get that done, we need a way of setting these two passwords, so I added a simple mechanism for overwriting them using ENV vars.
That said, the actual reason (for me) is being able to deploy to Kubernetes without losing the auto reload feature, which I find neat.
Normally, secrets are set as ENV vars in k8s. If I have no way of setting that password through an ENV var, I have to create a script that generates the config at boot time. Given a config file with placeholders for those passwords, the script would substitute them using the env vars and write the result to a file that is fed to pgcat.
The problem is that then (apart from this nightmare) I lose the auto reload feature. Well, I don't really lose it, but I would have to deploy with a sidecar that redoes the password rewrite every time the templated config changes, so that pgcat later sees it and does the actual reload.
I find this a bit cumbersome because it complicates containerized deploys of pgcat (the same would apply to other orchestrators), so I was hoping these two variables could be an exception.
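To make the proposal concrete, here is a minimal sketch of such an override; it is not the PR's actual code. `PGCAT_ADMIN_PASSWORD` comes from the diff excerpt above, while the `Passwords` struct, its field names, and the second variable name `PGCAT_AUTH_QUERY_PASSWORD` are assumptions made for illustration. The lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Hypothetical holder for the two passwords discussed above.
#[derive(Debug, Default, PartialEq)]
struct Passwords {
    admin_password: String,
    auth_query_password: Option<String>,
}

impl Passwords {
    /// Overwrite file-based values with whatever `lookup` returns.
    /// Values the lookup does not provide are left untouched.
    fn overwrite_passwords(&mut self, lookup: impl Fn(&str) -> Option<String>) {
        if let Some(val) = lookup("PGCAT_ADMIN_PASSWORD") {
            self.admin_password = val;
        }
        // Assumed variable name; the diff excerpt only shows the admin one.
        if let Some(val) = lookup("PGCAT_AUTH_QUERY_PASSWORD") {
            self.auth_query_password = Some(val);
        }
    }
}
```

In a real process the lookup would presumably be `|key| std::env::var(key).ok()`, called once after the config file has been parsed.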
@@ -211,12 +242,31 @@ impl ConnectionPool {
replica_number += 1;
}

// We assume every server in the pool share user/passwords
They don't :)
What is assumed here is that every server in the connection pool (a connection pool being 1 user, 1 database, n servers) shares the same user/password. This assumption is already made in the current implementation (with cleartext). It is also logical, because in the end you connect to the pool using just one "user/database/password". The only thing I think we can do is report it in the logs so the administrator can spot the issue on the server and update its config, which is already done here
Left a few comments. We're on the right track. |
Hey, so this is only happening on RDS actually. It wasn't a problem for me on localhost either, but I think it's worth investigating specifically on RDS since it's such a common database provider, and this PR definitely seems to break support for it where it works on the main branch.
3d6dddf
to
99dfaa3
Compare
I have made the changes so now |
854ff93
to
870b34d
Compare
92255ad
to
e78b618
Compare
6381136
to
7a55af7
Compare
af2d2e9
to
8c768f1
Compare
There was an issue with scram auth (it was broken), I understand that RDS uses it by default and that's why it failed. This is already solved. |
Ok, so @levkk I did some changes to this PR in the hope to make it to
No documentation is written yet, waiting for validation. |
Thanks!, I think I prefer a review, will come to this again next week. |
50ae46b
to
1d9fdc9
Compare
Ok, I added some documentation on the newly created |
This adds a new `exec_simple_query` method so we can make 'out of band' queries to servers that don't interfere with pools at all. In order to reuse the startup code for these simple queries, we need to make the stats (`Reporter`) optional, so that using these simple queries won't interfere with stats.
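For readers unfamiliar with the term, a "simple query" in the PostgreSQL wire protocol is a single `Query` message: the byte `Q`, a big-endian 32-bit length that counts itself and the payload but not the type byte, and the NUL-terminated SQL text. The sketch below only builds that frame. The function name is hypothetical and this is not pgcat's `exec_simple_query`, which also has to handle startup, authentication, and reading the response.

```rust
/// Build a PostgreSQL simple-protocol `Query` message for `sql`.
/// (Hypothetical helper, for illustration only.)
fn simple_query_message(sql: &str) -> Vec<u8> {
    // Length covers the 4 length bytes, the SQL text, and the trailing NUL.
    let len = (4 + sql.len() + 1) as i32;
    let mut buf = Vec::with_capacity(1 + len as usize);
    buf.push(b'Q'); // message type
    buf.extend_from_slice(&len.to_be_bytes()); // network byte order
    buf.extend_from_slice(sql.as_bytes());
    buf.push(0); // C-string terminator
    buf
}
```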
Adds a feature that allows setting auth passthrough for md5 auth.

It adds 3 new (general and pool) config parameters:
- `auth_query`: A string containing a query that will be executed on boot to obtain the hash of a given user. This query has to use a placeholder `$1`, so pgcat can replace it with the user whose hash it is trying to fetch.
- `auth_query_user`: The user to use for connecting to the server and executing the auth_query.
- `auth_query_password`: The password to use for connecting to the server and executing the auth_query.

The configuration can be done either in the general config (so pools share it) or on a per-pool basis.

The behavior is as follows. At boot time, when validating server connections, a hash is fetched per server and stored in the pool. When new server connections are created and no cleartext password is specified, the obtained hash is used for creating them; if the hash could not be obtained for whatever reason, the fetch is retried.

When a client tries to authenticate, pgcat uses the cleartext password if one is specified. If not, it checks whether `auth_query` is set up and, if so, tries to use the obtained hash for client auth. If there is no hash (we could not obtain one when validating the connection), a new fetch is tried. Once we have a hash, we authenticate using it against whatever the client has sent us; if that fails, we refetch the hash and retry auth (so password changes can be picked up).

The idea of this 'retrial' mechanism is to make it fault tolerant: if for whatever reason the hash could not be obtained during connection validation, or the password has changed, we can still connect later.
Is this ready to merge? I have been using the feature in prod for some time now without any issues. Also, in the end it is opt-in and disabled by default.
Sounds good to me! Can we sync real quick regarding #389. I'm introducing multiple ways to authenticate, and maybe we can rebase your PR on that so we can have clean separation between various authentication mechanisms? I can rebase it myself, if that's ok with you. |
Actually it's pretty straight forward to rebase my changes on yours once yours is in. Good to go!
Auth passthrough (auth_query) feature
Add auth passthrough (auth_query)
Adds a feature that allows setting auth passthrough for md5 auth.
Configuration:
It adds 3 new (general and pool) config parameters:
- `auth_query`: A string containing a query that will be executed on boot to obtain the hash of a given user. This query has to use a placeholder `$1`, so pgcat can replace it with the user whose hash it is trying to fetch.
- `auth_query_user`: The user to use for connecting to the server and executing the auth_query.
- `auth_query_password`: The password to use for connecting to the server and executing the auth_query.
The configuration can be done either in the general config (so pools share it) or on a per-pool basis.
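As an illustration, a pool-level setup might look like the fragment below. The three key names come from the description above; the pool name, the user, and the query text (modeled on the `pg_shadow` lookup commonly used with PgBouncer's `auth_query`) are assumptions, as is quoting the `$1` placeholder. Which columns pgcat expects back from the query is not stated here.

```toml
[pools.postgres]
# Assumed query: return the stored hash for the user substituted into $1.
auth_query = "SELECT usename, passwd FROM pg_shadow WHERE usename = '$1'"
# Hypothetical role; it must be allowed to run the query above.
auth_query_user = "pgcat_auth"
auth_query_password = "********"
```

Placing the same three keys under `[general]` instead would, per the description, share them across all pools.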
Behavior
At boot time, when validating server connections, a hash is fetched per server and stored in the pool. When new server connections are created and no cleartext password is specified, the obtained hash is used for creating them; if the hash could not be obtained for whatever reason, the fetch is retried.
When a client tries to authenticate, pgcat uses the cleartext password if one is specified. If not, it checks whether `auth_query` is set up and, if so, tries to use the obtained hash for client auth. If there is no hash (we could not obtain one when validating the connection), a new fetch is tried.
Once we have a hash, we authenticate using it against whatever the client has sent us; if that fails, we refetch the hash and retry auth (so password changes can be picked up).
The idea of this 'retrial' mechanism is to make it fault tolerant: if for whatever reason the hash could not be obtained during connection validation, or the password has changed, we can still connect later.
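The client-side retry described above can be sketched roughly as follows. This is a simplified model, not the PR's implementation: the `HashSource` trait, the function name, and the `check` closure (standing in for the real MD5 challenge comparison) are all hypothetical, and the cleartext-password branch is left out.

```rust
/// Hypothetical abstraction over "run auth_query against a server".
trait HashSource {
    fn fetch_hash(&mut self, username: &str) -> Option<String>;
}

/// Try to authenticate a client against the cached hash, refetching once
/// if the hash is missing or does not match. `check` stands in for the
/// real MD5 challenge check, which is out of scope for this sketch.
fn authenticate_with_refetch<S: HashSource>(
    cached: &mut Option<String>,
    source: &mut S,
    username: &str,
    check: impl Fn(&str) -> bool,
) -> bool {
    // No hash yet (e.g. the fetch failed during connection validation).
    let mut just_fetched = false;
    if cached.is_none() {
        *cached = source.fetch_hash(username);
        just_fetched = true;
    }
    if cached.as_deref().map_or(false, |h| check(h)) {
        return true;
    }
    // A hash fetched a moment ago cannot be stale, so do not fetch again.
    if just_fetched {
        return false;
    }
    // Mismatch against an older hash: the password may have changed.
    *cached = source.fetch_hash(username);
    cached.as_deref().map_or(false, |h| check(h))
}

/// Stand-in source for demos: hands out a fixed hash and counts fetches.
struct StaticSource {
    hash: Option<String>,
    fetches: usize,
}

impl HashSource for StaticSource {
    fn fetch_hash(&mut self, _username: &str) -> Option<String> {
        self.fetches += 1;
        self.hash.clone()
    }
}
```

With a stale cached hash, one call triggers exactly one refetch and then succeeds; with a good cached hash, no query is sent to the server at all. Bounding the retry to a single refetch per attempt is a design choice of this sketch that keeps a client with a wrong password from causing repeated auth queries within one attempt.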
Testing
This is currently tested using ruby tests:
Considerations
Currently, the hash is not refreshed on reload when no pool has changed.