can't register atexit after shutdown thrown in Python 3.9 #3221

Open
relativityboy opened this issue Apr 12, 2022 · 9 comments
Labels
bug (This issue is a confirmed bug.) · p2 (This is a standard priority issue) · resources · s3

Comments

@relativityboy

Describe the bug

When attempting to create an S3 resource from a boto3 session on Python 3.9, the call throws with the message "can't register atexit after shutdown".

It behaves as expected on Python 3.8.

Expected Behavior

When running on Python 3.9, the following code (assuming valid values) results in s3 being an S3 resource:

import boto3

session = boto3.Session(
    aws_access_key_id=public_key,
    aws_secret_access_key=secret_key,
)

s3 = session.resource('s3', region_name=aws_region)

Current Behavior

When running on Python 3.9, the same code (assuming valid values) raises an exception:

session = boto3.Session(
    aws_access_key_id=public_key,
    aws_secret_access_key=secret_key,
)

s3 = session.resource('s3', region_name=aws_region)

Reproduction Steps

Run the following on Python 3.9 with public_key, secret_key, and aws_region set to valid values:

import boto3
from threading import Timer

def x():
    session = boto3.Session(
        aws_access_key_id=public_key,
        aws_secret_access_key=secret_key,
    )

    s3 = session.resource('s3', region_name=aws_region)

Timer(0.5, x).start()

Possible Solution

Run with Python 3.8? The error does not occur there.

Additional Information/Context

No response

SDK version used

1.21.38

Environment details (OS name and version, etc.)

macOS 12.3.1, Python 3.9

@relativityboy added the bug (This issue is a confirmed bug.) and needs-triage (This issue or PR still needs to be triaged.) labels on Apr 12, 2022
@nateprewitt
Contributor

Hi @relativityboy, thanks for bringing this to our attention. This appears to be an issue specific to the S3 resource. I wasn't able to reproduce it on other resources (ec2, dynamodb, etc). There was a change in Python 3.9 to prevent daemon threads in concurrent.futures due to unexpected behaviors with subprocesses.

We'll need to investigate more, but at first glance this is likely an issue with the transfer module or s3transfer itself. It'll need a deep dive to understand what's conflicting.

@kdaily removed the needs-triage (This issue or PR still needs to be triaged.) label on Apr 12, 2022
@medley56

I have another example of this, based on a highly upvoted SO answer that uploads logs to S3 by using atexit.register to run the uploader function during Python shutdown. I suspect the registered hook is being cloned into some sort of multiprocessing under the hood, and it's attempting to call the hook for every subprocess (thread?).

Here is the SO answer that appears to now be a broken pattern: https://stackoverflow.com/a/51070892/2970906

Code sample below to reproduce:

import atexit
import io
import logging
import os

import boto3

os.environ['AWS_PROFILE'] = 'my-aws-profile'


def write_logs(body, bucket, key):
    s3 = boto3.client("s3")
    s3.put_object(Body=body.getvalue(), Bucket=bucket, Key=key)


log = logging.getLogger("some_log_name")
log_stringio = io.StringIO()
handler = logging.StreamHandler(log_stringio)
handler.setLevel('DEBUG')
log.addHandler(handler)

atexit.register(write_logs, body=log_stringio, bucket="2022-04-28-log-bucket1", key="key_name")

log.warning("Hello S3")

Python 3.9.9
boto3 1.21.42

@marcrleonard

Any update on this issue?

@tim-finnigan
Contributor

To address this issue, we recommend joining your child threads before letting the main thread go into shutdown. But as mentioned in an earlier comment, some investigation is still needed into how this might be handled in s3transfer.

@marcrleonard

@tim-finnigan I'm not quite understanding how to implement a workaround in practice. For instance, if you take the original example:

def x():
    session = boto3.Session(
        aws_access_key_id=public_key,
        aws_secret_access_key=secret_key,
    )

    s3 = session.resource('s3', region_name=aws_region)

Timer(0.5, x).start()

If your goal is to just return an S3 client/session, I don't think you would want to add:

atexit.register(x)

... If I understand it, this would call the function again during teardown.

Any help would be much appreciated.

@nateprewitt
Contributor

nateprewitt commented Jun 28, 2022

Hi @marcrleonard,

For this specific example, you have to join your threads.

t = Timer(0.5, x)
t.start()
t.join()

The problem with the current implementation is that you're starting a ThreadPoolExecutor inside a Thread without notifying the parent thread. This results in the timer thread ending before cleanup is performed in s3transfer. The core of the issue stems from changes in 3.9 around how concurrent.futures works with ThreadPoolExecutor. You can find the CPython issue here.

Unfortunately, there isn't a straightforward way to fix this with Python's new behavior at the moment.

Edit: To make sure I conveyed the scope of the issue, this isn't specific to any of our code. It's also reproducible with just concurrent.futures code. It's a fundamental issue with Python's threading paradigm.

from threading import Timer

def x():
    # On 3.9+, importing ThreadPoolExecutor registers an atexit hook via
    # threading._register_atexit; doing so from a thread that runs after the
    # main thread has begun shutdown raises "can't register atexit after shutdown".
    from concurrent.futures import ThreadPoolExecutor

Timer(0.5, x).start()

@pbacellar

pbacellar commented Oct 27, 2022

> [quotes @medley56's earlier comment and code sample in full]

Does anyone have a workaround for this case? I'm currently stuck on Python 3.9.

EDIT: never mind, just found this other thread: python/cpython#86813 (comment)

@aBurmeseDev added the p2 (This is a standard priority issue) label on Nov 9, 2022
@alexwakeman

I managed to get this sorted by importing concurrent.futures (and grabbing a reference to ThreadPoolExecutor) before calling any of my main application code, i.e. at the top of my main.py start-up script.

import concurrent.futures

thread_pool_ref = concurrent.futures.ThreadPoolExecutor

Just the act of importing the module early (before any threads are initialised) was enough to fix the error. There is an ongoing issue around how this module must be present in the main thread before any child threads import or use any related threading library code.

The inspiration for this fix came from this post on the Python bugs site. My issue was specifically around the boto3 library, but the fix is applicable across the board.
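In practice, a rough sketch of that workaround might look something like this (the upload_later function, default-credential Session, and combining it with the join the maintainers recommend are illustrative assumptions, not from the original report):

# main.py -- sketch of the early-import workaround, assuming credentials/region come from the environment
import concurrent.futures  # imported in the main thread, purely for its side effect

from threading import Timer

import boto3


def upload_later():
    # Runs in a child thread; because concurrent.futures was already imported in
    # the main thread, s3transfer's lazy import no longer tries to register an
    # atexit hook after interpreter shutdown has started.
    session = boto3.Session()
    s3 = session.resource('s3')
    # ... use the resource ...


t = Timer(0.5, upload_later)
t.start()
t.join()  # joining the child thread remains the more robust fix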

@jagajeet-kuppala-gilead

(Reference: python/cpython#86813 (comment))

You can make this work by disabling threading in the S3 client's transfer configuration.

import boto3
import boto3.s3.transfer

client = boto3.client("s3")
client.upload_file(
    Filename="path/to/file",
    Bucket="my-bucket",
    Key="my-file",
    Config=boto3.s3.transfer.TransferConfig(use_threads=False),
)
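If you're using the resource API from the original report, the same use_threads=False setting can presumably be passed to the bucket's upload_file method as well; a rough sketch (the bucket, key, and file path are placeholders, and the default-credential Session is an assumption):

import boto3
from boto3.s3.transfer import TransferConfig

session = boto3.Session()  # the original report passed keys and a region explicitly
s3 = session.resource('s3')

# Bucket.upload_file accepts the same Config parameter as the client method, so the
# transfer runs on s3transfer's non-threaded executor instead of starting a thread pool.
s3.Bucket("my-bucket").upload_file(
    Filename="path/to/file",
    Key="my-file",
    Config=TransferConfig(use_threads=False),
)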
