fix(django): improve getting psycopg3 connection info #3580

Merged
merged 2 commits into getsentry:master on Oct 8, 2024

Conversation

nijel
Contributor

@nijel nijel commented Sep 27, 2024

Fetch the few needed parameters manually instead of relying on get_parameters() which adds visible overhead due to excluding default values for parameters.

Fixes #3573
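
A minimal sketch of the idea in the description, not the actual patch: read only the fields the integration needs as attributes of `connection.info` rather than calling `get_parameters()`. `FakeConnectionInfo` and `read_needed_params` are hypothetical names used so the snippet runs without a database; the attribute names `dbname`, `host`, and `port` are the ones discussed in this thread.

```python
from types import SimpleNamespace


class FakeConnectionInfo(SimpleNamespace):
    """Hypothetical stand-in for psycopg3's connection.info object."""

    get_parameters_calls = 0

    def get_parameters(self):
        # Per the description above, the real call excludes default values,
        # which is where the overhead comes from. Here it only counts calls.
        self.get_parameters_calls += 1
        return {"dbname": self.dbname, "host": self.host, "port": str(self.port)}


def read_needed_params(info):
    """Read only the needed fields, one attribute at a time."""
    return {"dbname": info.dbname, "host": info.host, "port": info.port}


info = FakeConnectionInfo(dbname="DATABASE", host="db.example.com", port=5432)
params = read_needed_params(info)
```

In this sketch `get_parameters()` is never invoked, so whatever it costs on a real connection is avoided.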


Thank you for contributing to sentry-python! Please add tests to validate your changes, and lint your code using tox -e linters.

Running the test suite on your PR might require maintainer approval. The AWS Lambda tests additionally require a maintainer to add a special label, and they will fail until this label is added.

@nijel
Contributor Author

nijel commented Sep 27, 2024

This might need additional changes, but I'd like to know if this is a viable approach to address #3573.

I didn't investigate how unix_socket would be handled here (if at all).

@szokeasaurusrex
Member

Thanks for the PR, @nijel! I just approved our test suite to run against it – let's see whether anything fails.

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 84.27%. Comparing base (2d2e548) to head (da26bf4).
Report is 1 commits behind head on master.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
sentry_sdk/integrations/django/__init__.py 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3580      +/-   ##
==========================================
- Coverage   84.29%   84.27%   -0.03%     
==========================================
  Files         133      133              
  Lines       14028    14031       +3     
  Branches     2956     2957       +1     
==========================================
- Hits        11825    11824       -1     
- Misses       1464     1466       +2     
- Partials      739      741       +2     
Files with missing lines Coverage Δ
sentry_sdk/integrations/django/__init__.py 83.83% <50.00%> (-0.42%) ⬇️

... and 2 files with indirect coverage changes

@szokeasaurusrex
Member

Hi @nijel, to be honest I am not super familiar with psycopg3 (someone else from the team wrote this part of the integration).

Could you please explain why you are only including the dbname, host, and port here? From what I can tell, get_parameters could return more information.

@nijel
Contributor Author

nijel commented Sep 28, 2024

These are the only fields used in the code below (plus the Unix socket, which I couldn't figure out how to obtain).

@github-actions github-actions bot removed the Trigger: tests using secrets PR code is safe; run CI label Sep 30, 2024
@nijel
Contributor Author

nijel commented Sep 30, 2024

I've adjusted the code to expose all PostgreSQL connection parameters even when they are currently not used. I've also verified that unix_socket is never present with psycopg3, so now the code is completely consistent with get_parameters.

@nijel nijel marked this pull request as ready for review September 30, 2024 11:58
Member

@szokeasaurusrex szokeasaurusrex left a comment

Can we go back to what you had in c862f55? This pgconn stuff appears to be internal to psycopg3, since as far as I can tell it is not well documented, which makes me worried that this logic could break.

Your original suggestion is fine with me; the only reason I was originally worried about it is because I did not look at the code outside the diff where we actually use the connection_params. As soon as you explained that we are only using some of the parameters, I realized I needed to expand that section, and now I see that you were correct that those are the only parameters we need.

Also, have you already tested this change? Did you notice a performance improvement?

@nijel
Contributor Author

nijel commented Oct 1, 2024

The problem with the implementation in c862f55 is that when using UNIX sockets, it returns part of the UNIX socket path as host (which might be a bug in psycopg3). This implementation behaves the same as the original code in both socket and IP connection cases.

With this change, the profiling overhead shown in #3573 is no longer measurable compared to the original code, so it does address the issue I experienced.

@szokeasaurusrex
Member

The problem with the implementation in c862f55 is that when using UNIX sockets, it returns part of the UNIX socket path as host (which might be a bug in psycopg3).

Could you share an example of how c862f55 changes the host?

Would it be possible for us to trim this UNIX socket path out of the host? I would really prefer to avoid the pgconn stuff, as I could not find any documentation on it.

@nijel
Contributor Author

nijel commented Oct 1, 2024

host is /var/run/postgresql when obtained via cursor_or_db.connection.info.host.

In that case, cursor_or_db.connection.info.get_parameters() returns:

{'user': 'USER', 'dbname': 'DATABASE', 'client_encoding': 'UTF8', 'sslcertmode': 'allow'}

The new code returns the same values (+ some additional settings which are not used).

The host behaves this way because it calls PQhost, which is documented to return the directory where the socket is placed.

The alternate approach to the current patch would be to use cursor_or_db.connection.info.host as host only in case it does not start with /.
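
A rough sketch of that alternate approach, assuming the host value has already been read from `connection.info.host`. The helper names `host_for_span` and `build_connection_params` are hypothetical and are not taken from the patch.

```python
def host_for_span(pg_host):
    """Return pg_host unless it looks like a UNIX socket directory.

    Hypothetical helper: empty values and absolute paths (starting with "/")
    are treated as "no host", matching the behaviour described above.
    """
    if pg_host and not pg_host.startswith("/"):
        return pg_host
    return None


def build_connection_params(dbname, port, pg_host):
    """Collect dbname and port, adding host only when it is a real host."""
    params = {"dbname": dbname, "port": port}
    host = host_for_span(pg_host)
    if host is not None:
        params["host"] = host
    return params


socket_params = build_connection_params("DATABASE", 5432, "/var/run/postgresql")
tcp_params = build_connection_params("DATABASE", 5432, "db.example.com")
```

With the socket directory from the example above, the resulting dict has no `host` key, which is consistent with what `get_parameters()` was reported to return.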

@szokeasaurusrex
Member

@nijel Just making sure I understand correctly:

  1. cursor_or_db.connection.info.host can return an absolute path for the host, but cursor_or_db.connection.info.get_parameters() cannot.
  2. If cursor_or_db.connection.info.host returns an absolute path (starting with /), then the host will be missing from cursor_or_db.connection.info.get_parameters().
  3. If cursor_or_db.connection.info.host returns something that is not an absolute path (not starting with /), then the host will be contained in cursor_or_db.connection.info.get_parameters() and will have the same value as cursor_or_db.connection.info.host.

Are all of those three points correct? If yes, then I think we should go with your suggestion, where we use c862f55 but only add the host if it does not start with /:

The alternate approach to the current patch would be to use cursor_or_db.connection.info.host as host only in case it does not start with /.
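
The three points above could be checked against observed values with something like the sketch below. `check_host_assumptions` is a hypothetical helper. The socket-case dictionary is the one reported earlier in this thread, while the TCP-case values are made up for illustration.

```python
def check_host_assumptions(info_host, parameters):
    """Return True if the three points hold for one observed connection.

    info_host: value read from connection.info.host
    parameters: dict returned by connection.info.get_parameters()
    """
    if info_host.startswith("/"):
        # Points 1 and 2: a socket directory never appears in the parameters.
        return "host" not in parameters
    # Point 3: a real host is present and identical in both views.
    return parameters.get("host") == info_host


socket_case = check_host_assumptions(
    "/var/run/postgresql",
    {"user": "USER", "dbname": "DATABASE", "client_encoding": "UTF8", "sslcertmode": "allow"},
)
# Illustrative values only; not taken from a real connection.
tcp_case = check_host_assumptions(
    "db.example.com",
    {"user": "USER", "dbname": "DATABASE", "host": "db.example.com"},
)
```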

@nijel
Contributor Author

nijel commented Oct 2, 2024

Yes, all points are correct. I've adjusted the code accordingly.

Member

@szokeasaurusrex szokeasaurusrex left a comment

Looks good.

Thanks for the contribution @nijel!

@szokeasaurusrex szokeasaurusrex added the Trigger: tests using secrets PR code is safe; run CI label Oct 8, 2024
@szokeasaurusrex szokeasaurusrex enabled auto-merge (squash) October 8, 2024 08:12
@szokeasaurusrex szokeasaurusrex merged commit 4f79aec into getsentry:master Oct 8, 2024
135 of 138 checks passed
@nijel nijel deleted the patch-1 branch October 8, 2024 08:57
Development

Successfully merging this pull request may close these issues.

PostgreSQL getting connection info slowing down tracing
2 participants