Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't raise an exception on failure to CreateBucket #154

Closed
jmeickle opened this issue Dec 1, 2017 · 5 comments · Fixed by #155
Closed

Don't raise an exception on failure to CreateBucket #154

jmeickle opened this issue Dec 1, 2017 · 5 comments · Fixed by #155

Comments

@jmeickle
Copy link

jmeickle commented Dec 1, 2017

Traceback (most recent call last):
  File "/opt/crash_uploader", line 165, in <module>
    upload_user_coredump(pid, executable)
  File "/opt/crash_uploader", line 123, in upload_user_coredump
    with smart_open(dumpfile, 'wb') as s3out:
  File "/usr/local/lib/python2.7/dist-packages/smart_open/smart_open_lib.py", line 175, in smart_open
    return s3_open_uri(parsed_uri, mode, **kw)
  File "/usr/local/lib/python2.7/dist-packages/smart_open/smart_open_lib.py", line 255, in s3_open_uri
    fobj = smart_open_s3.open(parsed_uri.bucket_id, parsed_uri.key_id, s3_mode, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/smart_open/s3.py", line 60, in open
    fileobj = BufferedOutputBase(bucket_id, key_id, min_part_size=s3_min_part_size, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/smart_open/s3.py", line 255, in __init__
    s3.create_bucket(Bucket=bucket)
  File "/usr/local/lib/python2.7/dist-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the CreateBucket operation: Access Denied

In general implicit bucket creation seems like a bad call, in case of a misconfig. But if it is going to happen, it definitely shouldn't block writing to that bucket if it already exists. (In our case this is running from an IAM box that does not have permission to create buckets, but can write to an existing bucket.)

Testing indicates this broke in 1.5.4

@piskvorky
Copy link
Owner

piskvorky commented Dec 2, 2017

Creating buckets by default would definitely be a bug, that'd be horrible, I agree.

@menshikh-iv what happened in 1.5.4?

@gdoron
Copy link

gdoron commented Dec 4, 2017

Hit it myself now.
Wasted 4 hours 😢 😢 😢 😢 😢

@jerluc
Copy link

jerluc commented Dec 5, 2017

Also ran into this. Is there any toggle for specifiying? Or should we just assume that it will always try to create a bucket? Just afraid of the typo scenario which would auto-create a bucket that shouldn't exist 😨

@piskvorky
Copy link
Owner

@menshikh-iv we need the bug fix release ASAP.

@menshikh-iv
Copy link
Contributor

@piskvorky work on it, will release smart_open today

menshikh-iv pushed a commit that referenced this issue Dec 6, 2017
)

* use boto3 instead of boto in s3 tests

Looks like moto doesn't work with boto anymore.
Works fine with boto3.

* Override .readline() in s3.BufferedInputBase, increase buffer size

The existing BufferedInputBase didn't override the .readline() method,
forcing the superclass implementation to use .read() to read one byte at
a time. This slowed reading down significantly.

Also increased the buffer size to 256kB, this is consistent with
s3transfer.

http://boto3.readthedocs.io/en/latest/_modules/boto3/s3/transfer.html

* rewrite try..except as an if-else, it is faster that way

* get rid of unused and incorrect TEXT_NEWLINE

* Resolve Issue #154: don't create buckets if they don't exist

When writing, check if the bucket exists and raise a ValueError if it
doesn't.

* fixup for d453af2: create the correct bucket in top-level unit tests

* Resolve Issue #153: don't wrap GzipFile in contextlib.closing

This is no longer necessary since we dropped support for Python 2.6.
The GzipFile from Py2.7 and above is already a context manager, so the
closing is not required.

https://docs.python.org/2/library/gzip.html

* Support errors keyword

* add some integration tests, focusing on S3 only for now

* Specify utf-8 encoding explicitly in tests

If we don't specify it, smart_open will use the system encoding.  This
may be ascii on some systems (e.g. Py2.7), which will fail the test
because it contains non-ascii characters.

* Refactored S3 subsystem to disable seeking

Renamed RawReader to SeekableRawReader.
Renamed BufferedInputBase to SeekableBufferedInputBase.
Introduced new, non-seekable RawReader and BufferedInputBase.
Seeking functionality was strictly necessary while we were supporting
Py2.6, because the gzip reader required it back then.  The gzip reader
from 2.7 onwards does not require seeking, so neither do we.

Seeking is convenient, but appears to be slower, so disabling it for now
is the right thing to do.

* Re-enable seeking for S3

It appears Py2.7 gzip still requires seeking.  Py3 gzip does not.
We're still supporting Py2.7, so we need seeking to work if we continue
to use the existing gzip reader.

* fixup for 3e93fb1: point unit tests at seekable S3 object
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants