
Airflow S3hook fails while writing logs in S3 #270

Closed
aliaksandr-d opened this issue Jul 25, 2018 · 3 comments · Fixed by #601
Labels: bug ([Fixed] for any bug fixes)
@aliaksandr-d (Member) commented:

[2018-07-25 21:13:29,956] {{connectionpool.py:383}} DEBUG - "HEAD /<bucketname>None HTTP/1.1" 404 0
[2018-07-25 21:13:29,958] {{parsers.py:234}} DEBUG - Response headers: {'x-amz-id-2': '***', 'content-type': 'application/xml', 'transfer-encoding': 'chunked', 'date': 'Wed, 25 Jul 2018 21:13:29 GMT', 'server': 'AmazonS3', 'x-amz-request-id': '8231679C93605D24'}
[2018-07-25 21:13:29,958] {{parsers.py:235}} DEBUG - Response body:
b''
[2018-07-25 21:13:29,958] {{hooks.py:209}} DEBUG - Event needs-retry.s3.HeadBucket: calling handler <botocore.retryhandler.RetryHandler object at 0x7f75785e39b0>
[2018-07-25 21:13:29,959] {{retryhandler.py:187}} DEBUG - No retry needed.
[2018-07-25 21:13:29,959] {{hooks.py:209}} DEBUG - Event needs-retry.s3.HeadBucket: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7f75785e39e8>>
[2018-07-25 21:13:29,964] {{s3_task_handler.py:162}} ERROR - Could not write logs to s3://legion-data-dev/airflow-logs/example_python_work/sleep_3_seconds/2018-07-25T21:11:00/1.log
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/smart_open/s3.py", line 340, in __init__
    s3.meta.client.head_bucket(Bucket=bucket)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadBucket operation: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/utils/log/s3_task_handler.py", line 159, in s3_write
    encrypt=configuration.getboolean('core', 'ENCRYPT_S3_LOGS'),
  File "/usr/local/lib/python3.5/dist-packages/legion_airflow/hooks/s3_hook.py", line 267, in load_string
    with self.open_file(bucket_name, key, 'w', encoding) as out:
  File "/usr/local/lib/python3.5/dist-packages/legion_airflow/hooks/s3_hook.py", line 79, in open_file
    aws_secret_access_key=self.aws_secret_access_key)
  File "/usr/local/lib/python3.5/dist-packages/smart_open/smart_open_lib.py", line 178, in smart_open
    return s3_open_uri(parsed_uri, mode, **kw)
  File "/usr/local/lib/python3.5/dist-packages/smart_open/smart_open_lib.py", line 260, in s3_open_uri
    fobj = smart_open_s3.open(parsed_uri.bucket_id, parsed_uri.key_id, s3_mode, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/smart_open/s3.py", line 63, in open
    fileobj = BufferedOutputBase(bucket_id, key_id, min_part_size=s3_min_part_size, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/smart_open/s3.py", line 342, in __init__
    raise ValueError('the bucket %r does not exist, or is forbidden for access' % bucket)
ValueError: the bucket '<bucketname>None' does not exist, or is forbidden for access
[2018-07-25 21:13:29,987] {{session.py:289}} DEBUG - Loading variable profile from defaults.
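The request line `HEAD /<bucketname>None` in the log above suggests the failure mode: the log URI is apparently built by concatenating a value that is `None`, so the literal string `None` gets glued onto the bucket name before smart_open parses the URI. A minimal sketch of that string-building bug (hypothetical names, not the actual hook code):

```python
bucket = "legion-data-dev"
base_path = None  # e.g. an unset remote-log-folder style setting

# Naive concatenation silently stringifies None, corrupting the bucket id
# that smart_open later extracts from the URI and passes to head_bucket().
uri = "s3://" + bucket + str(base_path) + "/airflow-logs/1.log"
print(uri)  # s3://legion-data-devNone/airflow-logs/1.log
```

With a bucket id of `legion-data-devNone`, S3 correctly answers HeadBucket with 404, and smart_open raises the `ValueError` seen in the traceback.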
aliaksandr-d added the bug label on Jul 25, 2018
@kirillmakhonin (Member) commented:

HEAD /<bucketname>None?

@aliaksandr-d aliaksandr-d self-assigned this Jul 26, 2018
aliaksandr-d added a commit that referenced this issue Aug 13, 2018
aliaksandr-d added a commit that referenced this issue Aug 16, 2018
@aliaksandr-d (Member, Author) commented:

Fixed and merged.

@aliaksandr-d (Member, Author) commented:

[2018-11-12 20:14:45,646] {{base_task_runner.py:98}} INFO - Subtask: [2018-11-12 20:14:45,645] {{s3_hook.py:102}} INFO - Opening file "s3://.../airflow-logs/model_trainer/predictor_trx/2018-11-12T19:14:25.369133/1.log" with "w" mode
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask: [2018-11-12 20:14:45,748] {{s3_task_handler.py:162}} ERROR - Could not write logs to s3://.../airflow-logs/model_trainer/predictor_trx/2018-11-12T19:14:25.369133/1.log
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask: Traceback (most recent call last):
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/smart_open/s3.py", line 340, in __init__
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask:     s3.meta.client.head_bucket(Bucket=bucket)
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 320, in _api_call
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask:     return self._make_api_call(operation_name, kwargs)
[2018-11-12 20:14:45,750] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/botocore/client.py", line 623, in _make_api_call
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:     raise error_class(parsed_response, operation_name)
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask: botocore.exceptions.ClientError: An error occurred (403) when calling the HeadBucket operation: Forbidden
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask: 
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask: During handling of the above exception, another exception occurred:
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask: 
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask: Traceback (most recent call last):
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/airflow/utils/log/s3_task_handler.py", line 159, in s3_write
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:     encrypt=configuration.getboolean('core', 'ENCRYPT_S3_LOGS'),
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/legion_airflow/hooks/s3_hook.py", line 302, in load_string
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:     with self.open_file(bucket_name, key, 'w', encoding) as out:
[2018-11-12 20:14:45,751] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/legion_airflow/hooks/s3_hook.py", line 106, in open_file
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:     aws_secret_access_key=self.aws_secret_access_key)
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/smart_open/smart_open_lib.py", line 178, in smart_open
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:     return s3_open_uri(parsed_uri, mode, **kw)
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/smart_open/smart_open_lib.py", line 260, in s3_open_uri
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:     fobj = smart_open_s3.open(parsed_uri.bucket_id, parsed_uri.key_id, s3_mode, **kwargs)
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/smart_open/s3.py", line 63, in open
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:     fileobj = BufferedOutputBase(bucket_id, key_id, min_part_size=s3_min_part_size, **kwargs)
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:   File "/usr/local/lib/python3.6/dist-packages/smart_open/s3.py", line 342, in __init__
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask:     raise ValueError('the bucket %r does not exist, or is forbidden for access' % bucket)
[2018-11-12 20:14:45,752] {{base_task_runner.py:98}} INFO - Subtask: ValueError: the bucket '...' does not exist, or is forbidden for access
[2018-11-12 20:14:45,958] {{jobs.py:2521}} INFO - Task exited with return code 1
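Note that this second failure differs from the first: HeadBucket now returns 403 (Forbidden) on a well-formed bucket name, which points at the credentials or IAM policy rather than a malformed URI. A small sketch mapping the two status codes seen in the tracebacks to their likely root causes (`diagnose_head_bucket` is a hypothetical helper, not part of the hook):

```python
def diagnose_head_bucket(status_code):
    """Map the HeadBucket HTTP status codes seen in this issue's
    tracebacks to the most likely root cause."""
    if status_code == 404:
        # First traceback: a stray 'None' in the URI corrupted the bucket
        # name, so S3 reported the bucket as nonexistent.
        return "bucket not found - check how the bucket name/URI is built"
    if status_code == 403:
        # Second traceback: the bucket exists, but the caller's credentials
        # lack permission to HEAD it (s3:ListBucket on the bucket).
        return "access forbidden - check the IAM policy for the credentials"
    return "unexpected status {}".format(status_code)

print(diagnose_head_bucket(403))
```

In both cases smart_open collapses the `ClientError` into the same `ValueError` ("does not exist, or is forbidden for access"), so the underlying HTTP status in the debug log is the more useful diagnostic.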
