Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSL for Postgres #2000

Closed
zsoca8711 opened this issue Oct 6, 2017 · 11 comments
Closed

SSL for Postgres #2000

zsoca8711 opened this issue Oct 6, 2017 · 11 comments

Comments

@zsoca8711
Copy link

Hi,

we use a lot of postgres databases and we have some servers in different points of the world. This means some of them would be used over public connections, meaning we are reluctant to setup our datasource to postgres without SSL.
However, it seems redash doesn't support it (yet) for Postgres.
Is it possible to have this enabled somehow? Or added to the roadmap?

@zsoca8711
Copy link
Author

@clayheaton
Copy link

clayheaton commented Apr 19, 2018

When you add a new Postgres Data Source, there should be an additional option to allow the user to enter additional connection parameters. My use case: I need to create multiple Data Sources to different PostgreSQL databases, all of which require SSL. I have different client certificates and keys for each of those databases. However, I have no way to tell Redash where the client certificate and key required for a particular database lives. This usually is handled in the Postgres connection string, which parses sslcert and sslkey parameters. I would like to be able to tell Redash to use those parameters when it connects and where to find the appropriate files:

More info here:
https://discuss.redash.io/t/specify-ssl-certificate-locations-for-postgres-databases/1801

For reference, here's the Metabase panel for adding a new database, which allows for the specification of additional connection parameters.

screen shot 2018-04-19 at 10 08 56 am

@clayheaton
Copy link

@kravets-levko I hacked this into my deployment and it works. Maybe this can help you. Here's what I did:

In redash/query_runner/pg.py, on Line 52, is the definition for PostgreSQL.configuration_schema(). I modified that to be:

    def configuration_schema(cls):
        return {
            "type": "object",
            "properties": {
                "user": {
                    "type": "string"
                },
                "password": {
                    "type": "string"
                },
                "host": {
                    "type": "string",
                    "default": "127.0.0.1"
                },
                "port": {
                    "type": "number",
                    "default": 5432
                },
                "dbname": {
                    "type": "string",
                    "title": "Database Name"
                },
                "sslmode": {
                   "type": "string",
                   "title": "SSL Mode",
                   "default": "prefer"
                },
                "sslkey": {
                   "type": "string",
                   "title": "SSL Client Key Location",
                   "default": ""
                },
                "sslcert": {
                   "type": "string",
                   "title": "SSL Client Certificate Location",
                   "default":""
                },
            },
            "order": ['host', 'port', 'user', 'password', 'dbname', 'sslmode','sslkey','sslcert'],
            "required": ["dbname"],
            "secret": ["password"]
        }

In the same file, on line 134, I modified the definition of PostgreSQL._get_connection() to be the following:

    def _get_connection(self):
        config = self.configuration.to_dict()
        config = {k:v for k,v in config.items() if v != ""}
        connection = psycopg2.connect(async=True, **config)
        return connection

I now can create Data Sources to each of my PostgreSQL databases that require SSL and specify the location of the certificate. The new Data Source form for PostgreSQL now looks like:

screen shot 2018-04-19 at 11 15 13 am

@kravets-levko
Copy link
Collaborator

thanks @clayheaton! @arikfr, WDYT about this? looks like a good solution

@clayheaton
Copy link

clayheaton commented Apr 19, 2018

There's one more issue that comes up... Google just announced that Postgres for Cloud SQL is out of beta. To use SSL with Cloud SQL, you generate a certificate on the VM in Google Cloud Console. It's a self-signed certificate, but they provide the server root certificate.

You can use SSL Mode of verify-ca to instruct psycopg2 to use the sslrootcert parameter to check the validity of the certificate.

To specify the location of the root certificate, I added:

"sslrootcert": {
                   "type": "string",
                   "title": "SSL Server Root Certificate Location",
                   "default":""
                }

According to 32.18.1. Client Verification of Server Certificates in the Postgres documentation:

To allow server certificate verification, one or more root certificates must be placed in the file ~/.postgresql/root.crt in the user's home directory. (On Microsoft Windows the file is named %APPDATA%\postgresql\root.crt.) Intermediate certificates should also be added to the file if they are needed to link the certificate chain sent by the server to the root certificates stored on the client.

Certificate Revocation List (CRL) entries are also checked if the file ~/.postgresql/root.crl exists (%APPDATA%\postgresql\root.crl on Microsoft Windows).

The location of the root certificate file and the CRL can be changed by setting the connection parameters sslrootcert and sslcrl or the environment variables PGSSLROOTCERT and PGSSLCRL.

In the VM where I'm running Redash, I can invoke python and use psycopg2 to connect with sslmode=verify-caand use the sslrootcert parameter to specify the root certificate's location and everything works as expected.

However, when Redash is running (with sslrootcert added), the value of sslrootcert somehow is overwritten by Redash or the environment and is not used by psycopg2.

The odd thing here is that I can set up the Data Source using the information and it tests correctly. It is only when I try to run a query that the error appears. This is from the logs:

[2018-04-19 16:17:26,806][PID:9449][WARNING][Worker-1102] Unexpected error while running query:
Traceback (most recent call last):
  File "/opt/redash/redash.4.0.0.b3948/redash/tasks/queries.py", line 449, in run
    data, error = query_runner.run_query(annotated_query, self.user)
  File "/opt/redash/redash.4.0.0.b3948/redash/query_runner/pg.py", line 147, in run_query
    return schema.values()
  File "/opt/redash/redash.4.0.0.b3948/redash/query_runner/pg.py", line 35, in _wait
    state = conn.poll()
OperationalError: root certificate file "/home/redash/.postgresql/root.crt" does not exist
Either provide the file or change sslmode to disable server certificate verification.

Hence, I'm trying to figure out how to guarantee that Redash and psycopg2 are respecting the sslrootcert everytime psycopg2.connect() is called. Right now, that's not happening.

Basically, it appears that the logic used when testing a new Data Source connection is not the same as the logic used when creating a connection to execute a query. Somewhere along the line, the sslrootcert parameter is being dropped, overwritten, or ignored.

@clayheaton
Copy link

I think what I'm running into is this bug, at least on the Google Compute Environment Image that Redash makes available and/or the Google Cloud SQL databases to which I'm trying to connect:

http://www.postgresql-archive.org/BUG-14329-libpq-doesn-t-send-complete-client-certificate-chain-on-first-SSL-connection-td5921885.html

Somewhere, Postgres is compiled with an old version of OpenSSL that isn't properly handling the certificates.

@arikfr
Copy link
Member

arikfr commented Apr 20, 2018

Thanks, @clayheaton for looking into this and providing the detailed feedback!

Somewhere, Postgres is compiled with an old version of OpenSSL that isn't properly handling the certificates.

How did you setup your Redash instance? Using the GCE image?

I now can create Data Sources to each of my PostgreSQL databases that require SSL and specify the location of the certificate. The new Data Source form for PostgreSQL now looks like:

While this is an easy fix, we prefer not to use file paths, but to rather store the certificate itself (like we do for BigQuery service account JSON file for example). This ensures that the Redash server is stateless and can be scaled, if needed.

If psycopg2/libpq doesn't support accepting a certificate instead of a path, we can store it to a temporary location before usage.

@clayheaton
Copy link

Hi @arikfr. Yes, I used the GCE Image to start and then upgraded Redash (from 2.0, I believe) and Ubuntu to be current. That image could stand to be updated at some point.

I think the SSL issue I'm having may be related to the build of Postgres that Google Cloud SQL is using.

I'm unaware of psycopg2/libpq supporting accepting a certificate instead of a path. However, I'm not a libpq expert, so I very possibly could be incorrect.

I agree that your suggested approach is better. For my current needs, I have to get something working ASAP, so I'll stick with my hack until I'm either able to suggest a better improvement or your team commits a change in line with your suggestion. (The work I'm doing is related to helping my company prepare for the GDPR, so I'm sure you can understand the frenzy.)

@clayheaton
Copy link

FYI, I stood up a new Postgres database on Google Cloud SQL and migrated my data to it. Redash now can connect via SSL with the hacky method that I posted above, likely because Google now is building Postgres on this instances without the bug I mentioned above.

@rasmi
Copy link

rasmi commented Jan 7, 2019

@clayheaton I ran into the same issue -- I needed to restart my local redash instance for the SSL cert changes to take effect properly. It seems the settings are not correctly propagated to db connection objects upon save, only upon restart. I think it's a broader issue than just pg.py.

I've submitted a PR that adds the various SSL parameters (#3259). @kravets-levko @arikfr thoughts?

@arikfr
Copy link
Member

arikfr commented Dec 16, 2021

Implemented in #4871.

@arikfr arikfr closed this as completed Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants