Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection establishment errors introduced with 2.3.12 #149

Open
RobinHolzingerQC-OLD opened this issue Sep 15, 2023 · 12 comments
Open

Connection establishment errors introduced with 2.3.12 #149

RobinHolzingerQC-OLD opened this issue Sep 15, 2023 · 12 comments
Assignees

Comments

@RobinHolzingerQC-OLD
Copy link

I am using SQLAlchemy (2.0) to maintain connections to an MS SQL Server instance in Python. Under the hood unixodbc is used as a driver when the scripts are run under Linux.

With the release of 2.3.12 I noticed some multiprocessed (Python multiprocessing) applications failing with sqlalchemy/unixodbc errors that did not appear to be deterministic. After further debugging, I found out that this problem is also occurring in single-threaded applications that try to establish many connections in a short amount of time via SqlAlchemy having pooling enabled.

After building unixodbc at different commits on main and integrating these builds into my Python environment I was able to narrow down the issue to Fix possible seg faults with SQLAPI and pooling. As this commit also has significant changes with regard to pooling it is very likely that this is indeed the root of my problems.

The error messages that I’m facing look like the following:

E       pyodbc.OperationalError: ('08S01', '[08S01] [Microsoft][ODBC Driver 18 for SQL Server]TCP Provider: Error code 0x2746 (10054) (SQLExecDirectW)')
...
>       raise exc.ResourceClosedError("This Connection is closed")
E       sqlalchemy.exc.ResourceClosedError: This Connection is closed

../../../<…>…/lib/python3.10/site-packages/sqlalchemy/engine/base.py:655: ResourceClosedError

Most of these errors have in common that they contain some TCP Provider error (mostly error code 0x2…).

I also drafted a small example showing the problems.

Before diving deeper into the inner workings of unixodbc and the specifics of commit 2a73009 I wanted to ask if anyone has an idea where this could be coming from. @lurcher Any idea?

In case anyone else is experiencing these issues, please let me know.

@lurcher
Copy link
Owner

lurcher commented Sep 19, 2023 via email

@lurcher
Copy link
Owner

lurcher commented Sep 20, 2023 via email

@lurcher lurcher self-assigned this Sep 20, 2023
@RobinHolzingerQC-OLD
Copy link
Author

Thanks @lurcher for tackling this issue so fast!

To me, it looks as if bb7ed6e already solved a fair chunk of the TCP issues if was encountering. However, there are still a couple of ResourceClosed / TCP related failures in my application (not the minimal example on gist) when using the most recent commit on main.

@v-chojas
Copy link
Contributor

Are you actually forking the process and trying to reuse connections that way? Yes, as @lurcher says, that will cause a bunch of problems. However, multiple threads within the same process taking their turn with a single connection handle (as long as it's not concurrent) shouldn't be a problem.

@lurcher
Copy link
Owner

lurcher commented Sep 20, 2023 via email

@lurcher
Copy link
Owner

lurcher commented Sep 20, 2023 via email

@RobinHolzingerQC-OLD
Copy link
Author

RobinHolzingerQC-OLD commented Sep 20, 2023

I would have saved myself a lot of time if I had just realized it was forking in the first place :-(.

Sorry for the confusion, yes Python multiprocessing Pools fork processes. As Python does not have 'true' multithreading due to the GIL (Global Interpreter Lock), multiple processes are needed to truly parallelize in Python.

Are you actually forking the process and trying to reuse connections that way?

I am forking the processes (or rather Python does), but I do not really intend to reuse the connections by doing that. As I am creating new sqlalchemy engines for every process, I thought that the connections wouldn't be reused.

@lurcher
Copy link
Owner

lurcher commented Sep 20, 2023 via email

@RobinHolzingerQC-OLD
Copy link
Author

RobinHolzingerQC-OLD commented Sep 26, 2023

@lurcher Thanks for the clarification.

Did I understand correctly that currently there's no way of getting the pooling to work on Python and you therefore rolled back the changes that were made with 2.3.12 in that regard?

I am asking as I am still facing TCP 0x20 (ResourceClosedErrors) issues when running the most recent commit of unixobc's main branch (0x9b31cd2).

@lurcher
Copy link
Owner

lurcher commented Sep 26, 2023 via email

@jabbera
Copy link

jabbera commented Oct 3, 2023

I am forking the processes (or rather Python does), but I do not really intend to reuse the connections by doing that. As I am creating new sqlalchemy engines for every process, I thought that the connections wouldn't be reused.

Change your multiprocessing start type to spawn. It's much slower but, it will not copy memory and you should avoid this issue: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

@greatvovan
Copy link

Python does not have 'true' multithreading due to the GIL (Global Interpreter Lock), multiple processes are needed to truly parallelize in Python

Are you doing some heavy computations in your script? While you are correct for CPU-bound tasks, Python multithreading is fine to workaround IO-wait, that is common when working with databases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants