Avoid pickling boto3 client in S3CloudInterface #361

Merged
merged 1 commit into master from fix-barman-cloud-backup-for-python-gte-3.8 on Jul 23, 2021

Conversation

mikewallace1979
Contributor

Excludes the boto3 client from the S3CloudInterface state so that
it is not pickled by multiprocessing.

This fixes barman-cloud-backup with Python >= 3.8. Previously this
would fail with the following error:

    ERROR: Backup failed uploading data (Can't pickle <class 'boto3.resources.factory.s3.ServiceResource'>: attribute lookup s3.ServiceResource on boto3.resources.factory failed)

This is because boto3 cannot be pickled using the default pickle
protocol in Python >= 3.8. See the following boto3 issue:

https://github.com/boto/boto3/issues/678

The workaround of forcing pickle to use an older version of the
pickle protocol is not available here, because it is multiprocessing
that invokes pickle, and multiprocessing does not allow the
protocol version to be specified.

We therefore exclude the boto3 client from the pickle operation by
implementing custom `__getstate__` and `__setstate__` methods as
documented here:

https://docs.python.org/3/library/pickle.html#handling-stateful-objects

This works because the worker processes create their own boto3
session anyway due to race conditions around re-using the boto3
session from the parent process.
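The `__getstate__`/`__setstate__` pattern can be sketched as below. This is not the real `S3CloudInterface` code; `CloudInterfaceSketch`, `bucket_name`, and the `client` attribute are illustrative stand-ins, and a `threading.Lock` plays the role of the unpicklable boto3 client.

```python
import pickle
import threading


class CloudInterfaceSketch:
    """Minimal stand-in for S3CloudInterface (hypothetical, not barman code)."""

    def __init__(self, bucket_name):
        self.bucket_name = bucket_name
        # threading.Lock stands in for the boto3 session/client, which
        # cannot be pickled under the default protocol in Python >= 3.8.
        self.client = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable client so that
        # multiprocessing can pickle this object when spawning workers.
        state = self.__dict__.copy()
        del state["client"]
        return state

    def __setstate__(self, state):
        # Restore the picklable state; the client is left unset because
        # each worker process creates its own session anyway.
        self.__dict__.update(state)
        self.client = None


interface = CloudInterfaceSketch("my-bucket")
restored = pickle.loads(pickle.dumps(interface))
print(restored.bucket_name)  # picklable state survives the round trip
print(restored.client)       # client was excluded and must be re-created
```

Without the custom methods, pickling would fail on the `client` attribute, which is exactly the failure mode `barman-cloud-backup` hit on Python >= 3.8.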

It is also necessary to defer the assignment of the
`worker_processes` list until after all worker processes have been
spawned, as the references to those worker processes also cannot
be pickled with the default pickle protocol in Python >= 3.8. As
with the boto3 client, the `worker_processes` list was not being
used by the worker processes anyway.
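The deferred-assignment pattern can be sketched as follows. This is a hypothetical illustration, not the barman code: `WorkerSpawnSketch` and `FakeProcess` are stand-ins, with `FakeProcess` holding an unpicklable handle to mimic a started `multiprocessing.Process`, and `pickle.dumps(self)` standing in for what multiprocessing does when it spawns each worker.

```python
import pickle
import threading


class FakeProcess:
    """Stand-in for a started multiprocessing.Process: it holds an
    unpicklable handle, just as a live Process reference does."""

    def __init__(self):
        self._handle = threading.Lock()


class WorkerSpawnSketch:
    """Hypothetical sketch of spawning workers that each pickle `self`."""

    def __init__(self, worker_count):
        self.worker_count = worker_count
        self.worker_processes = []

    def spawn_workers(self):
        # Build the list locally: every spawn pickles `self`, so live
        # process references must not be attached to the instance yet.
        processes = []
        for _ in range(self.worker_count):
            pickle.dumps(self)  # stands in for multiprocessing's pickling
            processes.append(FakeProcess())
        # Defer the attribute assignment until all workers are spawned.
        self.worker_processes = processes


interface = WorkerSpawnSketch(2)
interface.spawn_workers()  # succeeds: `self` never holds a process
                           # reference while it is being pickled
```

Had `self.worker_processes` been populated during the loop, the second spawn's pickle of `self` would have failed on the process reference, mirroring the Python >= 3.8 behaviour described above.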

@amenonsen amenonsen merged commit 889830e into master Jul 23, 2021
@amenonsen amenonsen deleted the fix-barman-cloud-backup-for-python-gte-3.8 branch July 23, 2021 13:07