Track ECS container exit codes #2

Merged
merged 5 commits into from
Jul 27, 2016
@codingmoose codingmoose commented Jul 12, 2016

Raise Exception to fail ECSTask if essential containers have non-zero exit codes.

The ECS test can be run with tox -e py35 -- -x test/contrib/ecs_test.py

@p7k @bsusensjackson

# Check if container's command returned error
# or if ECS had an error running the command
if 'exitCode' in container and container['exitCode'] != 0 \
or 'reason' in container:

i imagine that reasons are not extraordinarily informative. but something's better than nothing - maybe we should let them bubble up?

Author

What do you mean by bubble up exactly?

I'm collecting all the failures into one exception and then raising that, since one task can have several failing essential containers. ECS does not seem to kill the remaining containers immediately when an essential container fails: there is some delay, and the containers that are subsequently terminated exit with exitCode 137. So multiple containers may have reason/exitCode set.
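A minimal sketch of that collect-then-raise approach, for illustration only. It is not the code from this PR: `ContainerFailures` and `check_containers` are hypothetical names, and the container dicts are assumed to look like the `containers` entries of an ECS `describe_tasks` response (`name`, `exitCode`, `reason`). Filtering down to essential containers is left to the caller here.

```python
class ContainerFailures(Exception):
    """Hypothetical exception that carries every failed container of a task."""

    def __init__(self, failures):
        self.failures = failures
        super().__init__("; ".join(failures))


def check_containers(containers):
    """Raise a single ContainerFailures if any container reports a failure.

    A container counts as failed if it has a non-zero exitCode or if ECS
    attached a reason to it (same condition as the check under review).
    """
    failures = []
    for container in containers:
        failed = (
            "exitCode" in container and container["exitCode"] != 0
        ) or "reason" in container
        if failed:
            failures.append(
                "{}: exitCode={}, reason={}".format(
                    container.get("name", "<unnamed>"),
                    container.get("exitCode"),
                    container.get("reason"),
                )
            )
    # Raise once, after looking at every container, so no failure is hidden.
    if failures:
        raise ContainerFailures(failures)
```

With this shape, a container that was killed after the first failure (exitCode 137, i.e. 128 + SIGKILL) shows up in the same exception message as the container that actually caused the failure.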


p7k commented Jul 13, 2016

(venv) ~/w/c/luigi ❯❯❯ tox -e py35 -- -x test/contrib/ecs_test.py
py35 create: /Users/pavel/workspace/celmatix/luigi/.tox/py35
py35 installdeps: mock<2.0, moto<1.0, HTTPretty==0.8.10, nose<2.0, unittest2<2.0, boto<3.0, boto3>=1.3.1, sqlalchemy<2.0, elasticsearch<2.0.0, psutil<4.0, enum34>1.1.0, coverage>=3.6,<3.999, codecov>=1.4.0, requests<3.0, pygments, hypothesis[datetime]
py35 develop-inst: /Users/pavel/workspace/celmatix/luigi
py35 installed: boto==2.41.0,boto3==1.3.1,botocore==1.4.36,click==6.6,codecov==2.0.5,coverage==3.7.1,docutils==0.12,elasticsearch==1.9.0,enum34==1.1.6,Flask==0.11.1,httpretty==0.8.10,hypothesis==3.4.1,itsdangerous==0.24,Jinja2==2.8,jmespath==0.9.0,linecache2==1.0.0,lockfile==0.12.2,-e git+git@github.com:Celmatix/luigi.git@822965409c03c03032a13d4379e0797ecae7ecfb#egg=luigi,MarkupSafe==0.23,mock==1.3.0,moto==0.4.25,nose==1.3.7,pbr==1.10.0,psutil==3.4.2,Pygments==2.1.3,python-daemon==2.1.1,python-dateutil==2.5.3,pytz==2016.4,requests==2.10.0,six==1.10.0,SQLAlchemy==1.0.14,tornado==4.3,traceback2==1.4.0,unittest2==1.1.0,urllib3==1.16,Werkzeug==0.11.10,xmltodict==0.10.2
py35 runtests: PYTHONHASHSEED='926252669'
py35 runtests: commands[0] | python --version
Python 3.5.1
py35 runtests: commands[1] | coverage run test/runtests.py -v --ignore-files=(^\.|^_|^setup\.py$) --ignore-files=not_imported.py -x test/contrib/ecs_test.py
/Users/pavel/workspace/celmatix/luigi/luigi/scheduler.py:89: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() instead
  fn_args = inspect.getargspec(fn)
Failure: NoRegionError (You must specify a region.) ... ERROR

======================================================================
ERROR: Failure: NoRegionError (You must specify a region.)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/nose/failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/nose/loader.py", line 418, in loadTestsFromName
    addr.filename, addr.module)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/imp.py", line 172, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 693, in _load
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 662, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/Users/pavel/workspace/celmatix/luigi/test/contrib/ecs_test.py", line 37, in <module>
    from luigi.contrib.ecs import ECSTask, _get_task_statuses, _get_task_descriptions
  File "/Users/pavel/workspace/celmatix/luigi/luigi/contrib/ecs.py", line 63, in <module>
    client = boto3.client('ecs')
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/boto3/__init__.py", line 79, in client
    return _get_default_session().client(*args, **kwargs)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/boto3/session.py", line 250, in client
    aws_session_token=aws_session_token, config=config)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/session.py", line 818, in create_client
    client_config=config, api_version=api_version)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/client.py", line 69, in create_client
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/client.py", line 199, in _get_client_args
    service_name, region_name, endpoint_url, is_secure)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/client.py", line 322, in resolve
    service_name, region_name)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/regions.py", line 122, in construct_endpoint
    partition, service_name, region_name)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/regions.py", line 135, in _endpoint_for_partition
    raise NoRegionError()
botocore.exceptions.NoRegionError: You must specify a region.
-------------------- >> begin captured logging << --------------------
botocore.credentials: DEBUG: Skipping environment variable credential check because profile name was explicitly set.
botocore.credentials: DEBUG: Looking for credentials via: env
botocore.credentials: DEBUG: Looking for credentials via: assume-role
botocore.credentials: DEBUG: Looking for credentials via: shared-credentials-file
botocore.credentials: DEBUG: Looking for credentials via: config-file
botocore.credentials: DEBUG: Looking for credentials via: ec2-credentials-file
botocore.credentials: DEBUG: Looking for credentials via: boto-config
botocore.credentials: DEBUG: Looking for credentials via: iam-role
botocore.vendored.requests.packages.urllib3.connectionpool: INFO: Starting new HTTP connection (1): 169.254.169.254
botocore.utils: DEBUG: Caught exception while trying to retrieve credentials: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x1102044e0>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)'))
Traceback (most recent call last):
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 134, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 88, in create_connection
    raise err
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 78, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/awsrequest.py", line 129, in _send_request
    self, method, url, body, headers)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1079, in endheaders
    self._send_output(message_body)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/awsrequest.py", line 156, in _send_output
    self.send(msg)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/awsrequest.py", line 241, in send
    return HTTPConnection.send(self, str)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 854, in send
    self.connect()
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 155, in connect
    conn = self._new_conn()
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 139, in _new_conn
    (self.host, self.timeout))
botocore.vendored.requests.packages.urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPConnection object at 0x1102044e0>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 597, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/util/retry.py", line 271, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
botocore.vendored.requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x1102044e0>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/utils.py", line 156, in _get_request
    response = requests.get(url, timeout=timeout)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/vendored/requests/adapters.py", line 419, in send
    raise ConnectTimeout(e, request=request)
botocore.vendored.requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x1102044e0>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)'))
botocore.utils: DEBUG: Max number of attempts exceeded (1) when attempting to retrieve data from metadata service.
botocore.loaders: DEBUG: Loading JSON file: /Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/data/endpoints.json
botocore.loaders: DEBUG: Loading JSON file: /Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/data/ecs/2014-11-13/service-2.json
botocore.loaders: DEBUG: Loading JSON file: /Users/pavel/workspace/celmatix/luigi/.tox/py35/lib/python3.5/site-packages/botocore/data/_retry.json
botocore.client: DEBUG: Registering retry handlers for service: ecs
botocore.hooks: DEBUG: Event creating-client-class.ecs: calling handler <function add_generate_presigned_url at 0x11017c268>
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)
ERROR: InvocationError: '/Users/pavel/workspace/celmatix/luigi/.tox/py35/bin/coverage run test/runtests.py -v --ignore-files=(^\\.|^_|^setup\\.py$) --ignore-files=not_imported.py -x test/contrib/ecs_test.py'
___________________________________________________________________________________ summary ____________________________________________________________________________________
ERROR:   py35: commands failed

@codingmoose
Author

Do you have AWS_DEFAULT_REGION set? Or the region in your ~/.aws/config file?
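For context: the traceback shows luigi/contrib/ecs.py calling boto3.client('ecs') at import time, so the test module cannot even be imported unless a region can be resolved. The sketch below is a rough way to check the two places asked about above. `find_configured_region` is a hypothetical helper, not part of boto3 or luigi, and it only approximates botocore's lookup (environment variable first, then the `region` key of the active profile in the config file). Note that if a profile is selected via AWS_PROFILE, the region has to be in that profile's section.

```python
import configparser
import os


def find_configured_region(environ=None, config_path=None):
    """Return (region, source), or (None, None) if nothing is configured.

    Simplified approximation of where botocore looks for a region:
    AWS_DEFAULT_REGION first, then the active profile in ~/.aws/config.
    """
    environ = os.environ if environ is None else environ

    region = environ.get("AWS_DEFAULT_REGION")
    if region:
        return region, "AWS_DEFAULT_REGION"

    path = (
        config_path
        or environ.get("AWS_CONFIG_FILE")
        or os.path.expanduser("~/.aws/config")
    )
    profile = environ.get("AWS_PROFILE", "default")
    # In ~/.aws/config, non-default profiles live in "[profile NAME]" sections.
    section = "default" if profile == "default" else "profile " + profile

    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently ignored
    if parser.has_option(section, "region"):
        return parser.get(section, "region"), path
    return None, None


if __name__ == "__main__":
    print(find_configured_region())
```

If this prints (None, None), exporting AWS_DEFAULT_REGION before running tox is probably the quickest way to get past the NoRegionError, assuming tox passes that variable through to the test environment.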

@p7k
Copy link

p7k commented Jul 13, 2016

@codingmoose yessir - but i'll confirm

@p7k p7k merged commit 1302f03 into master Jul 27, 2016
@p7k p7k deleted the track_exit_codes branch July 27, 2016 14:03

2 participants