Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify to work with AWS ECS #300

Merged
merged 3 commits into from
Jan 14, 2018
Merged

Modify to work with AWS ECS #300

merged 3 commits into from
Jan 14, 2018

Conversation

myoung34
Copy link
Contributor

In ECS the /proc/self/cgroup contains the prefix /ecs/ not /docker/ (likely the issue with other hosts such as kubernetes)

The "aws" fix here is to use the ECS_CONTAINER_METADATA_FILE env var provided by ecs

Im open to modifying it as well to instead take an override for the sed:

CGROUP_PREFIX="${CGROUP_PREFIX:-docker}"
export CONTAINER_ID=$(cat /proc/self/cgroup | sed -nE "s/^.+$CGROUP_PREFIX[\/-]([a-f0-9]{64}).*/\1/p" | head -n 1)


set -u
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change was to make it not error on "${DOCKER_PROVIDER,,}" which is possibly unbound (but actually is not because of the -n sanity)

@buchdag
Copy link
Member

buchdag commented Dec 12, 2017

Hi ! Thanks for that PR. I have two remarks:

  1. could you use a parameter substitution instead of removing set -u and testing for a non null variable ? Something like this:
DOCKER_PROVIDER=${DOCKER_PROVIDER:-}
if [[ "${DOCKER_PROVIDER,,}" == "aws" ]]; then
  1. similarly, could you do grep 'ContainerID' directly from the file rather than disabling SC2002 ?

btw 1) is currently lacking on this line and that line could directly sed from the file as suggested on 2). If you don't mind adding that to the PR that would be neat 👍

@buchdag buchdag added the kind/feature-request Issue requesting a new feature label Dec 12, 2017
@myoung34
Copy link
Contributor Author

@buchdag done

Copy link
Member

@buchdag buchdag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops sorry, seems I misdirected you.

My last comment was suggesting that if [[ -z "$CONTAINER_ID" ]]; then be changed to if [[ -z "${CONTAINER_ID:-}" ]]; then, no need to do a separate assignment first for this one.

Also please do not remove the export of CONTAINER_ID even if it's not needed at the moment.

@myoung34
Copy link
Contributor Author

@buchdag makes sense, changed

@DurinMusicspear
Copy link

DurinMusicspear commented Dec 22, 2017

I got the same issue as described here and the proposed fix helped me to get this working again. However i got two problems:

  1. ECS_ENABLE_CONTAINER_METADATA is not enabled by default and the proposed change does not warn you about that.
  2. The check for if [[ -n "${DOCKER_PROVIDER:-}" ]] && [[ "${DOCKER_PROVIDER,,}" == "aws" ]]; then did not seem to work and i had to remove that part to get CONTAINER_ID from ECS_CONTAINER_METADATA_FILE

@myoung34
Copy link
Contributor Author

myoung34 commented Dec 22, 2017

@DurinMusicspear

  1. youre right, but there's no reason to check for ecs.config to do it. If it's not there it will fail the check for -n and behave the same as if the code didnt exist which is good behavior

  2. looks to work as expected:

$ unset DOCKER_PROVIDER
$ [[ -n "${DOCKER_PROVIDER:-}" ]] && [[ "${DOCKER_PROVIDER,,}" == "aws" ]] && echo truthy || echo falsey
falsey
$ export DOCKER_PROVIDER="AwS"
$ [[ -n "${DOCKER_PROVIDER:-}" ]] && [[ "${DOCKER_PROVIDER,,}" == "aws" ]] && echo truthy || echo falsey
truthy

@buchdag
Copy link
Member

buchdag commented Dec 23, 2017

@myoung34 is it possible that in some case AWS ECS does not set the DOCKER_PROVIDER environment variable ?

Is this DOCKER_PROVIDER variable a standard among container management services or unique to ECS ?

If it's not widely used and/or if it is sometimes not set by ECS, maybe the fix should rather directly test [[ -n "${ECS_CONTAINER_METADATA_FILE:-}" ]] instead ?

@buchdag buchdag mentioned this pull request Dec 23, 2017
@myoung34
Copy link
Contributor Author

DOCKER_PROVIDER is custom and meant for here to swap between kubernetes, gcp, aws, etc.
ECS_CONTAINER_METADATA_FILE is standard to AWS in ECS but has to be enabled by ecs.config

I ilke the current way because there's no behavior change existing or across cloud providers, but is "opt in" by setting DOCKER_PROVIDER and enabling the use of "cloud lookups"

@myoung34
Copy link
Contributor Author

@buchdag I pushed up what it would look like to test for AWS ECS.
My only complaint would be without "opting in" by telling it which one to use, if there were 15 providers (15 if()'s) it would be a long time for it to go through the failed checks to hit the original method.

@buchdag
Copy link
Member

buchdag commented Dec 27, 2017

@myoung34 do you think a better approach would be to check in /proc/self/cgroup first, then if it returns nothing either use ECS_CONTAINER_METADATA_FILE or a different regex ?

From what @DurinMusicspear said, if ECS_CONTAINER_METADATA_FILE isn't enabled by default the alternative regex seems to be a bit more reliable.

There is a similar proposition for the same issue in docker-gen: nginx-proxy/docker-gen#263

@myoung34
Copy link
Contributor Author

I think the best way to approach it is with an opt-in flag like DOCKER_PROVIDER so that you can be explicit and not hope the trickle through works honestly. Im not a fan of changing default behavior so either that or having the cat /proc happen first would be my choices

@buchdag
Copy link
Member

buchdag commented Dec 27, 2017

Ok I might have misunderstood something from the beginning, the DOCKER_PROVIDER was supposed to be a new user provided environment variable ?

@myoung34
Copy link
Contributor Author

It is, but it acts like an opt-in. not providing it would be no change in current behavior, manually providing it is a way for the user to set behavior, basically saying "i want this to act under ECS" , so if its ECS, use the ECS_CONTAINER_METADATA_FILE var. Then in the future someone can add a block that does something special to get it to work, say, under Kubernetes by saying DOCKER_PROVIDER==k8s then in that block doing whatever it takes to work under kubernetes, etc. Its faster since its explicit, and its opt in meaning that you have to set it manually to get it to behave differently for you (not changing behavior for anyone else currently using it by default)

@buchdag
Copy link
Member

buchdag commented Dec 27, 2017

Got it. I thought DOCKER_PROVIDER was supposed to be set by ECS itself and I'm under the impression that @DurinMusicspear did not understand that it was supposed to be set by the user either.

If you introduce a new environment variable you'll have to document its use in the README.

With all that in mind i think the sed override might be simpler / more reliable. Something like, let's say:

DOCKER_PROVIDER=${DOCKER_PROVIDER:-docker}

case $DOCKER_PROVIDER in
aws|AWS)
    CGROUP_PREFIX=ecs
    ;;
*)
    CGROUP_PREFIX=docker
    ;;
esac

export CONTAINER_ID=$(sed -nE "s/^.+$CGROUP_PREFIX[\/-]([a-f0-9]{64}).*/\1/p" /proc/self/group | head -n 1)

Unless I've missed something, default behavior is unchanged, it is explicit and it work under ECS even if ECS_CONTAINER_METADATA_FILE hasn't been enabled. We could directly set the CGROUP_PREFIX but I think it's better to abstract away what the container really does to get its own ID under different container management services and just have the user set an environment variable with a self explanatory name and value.

If in the future getting the container ID under Kubernetes or another container management service require something more complex than a regexp change, the export CONTAINER_ID can be moved inside the case.

Your thought ?

@myoung34
Copy link
Contributor Author

/proc/cgroup is not a good choice on ECS. it might work, but it might not as well. The container ID is provided first-hand in the environment variable given via the API (ECS_CONTAINER_METADATA_FILE). I dont think using /proc/cgroup is a good idea. I honestly think my first submission is the best, being opt-in (with DOCKER_PROVIDER) then using the recommended method (the env var ECS_CONTAINER_METADATA_FILE)

@buchdag
Copy link
Member

buchdag commented Dec 28, 2017

Ok so let's go for ECS_CONTAINER_METADATA_FILE. There are still a some changes needed for this to be merged:

  • document the new DOCKER_PROVIDER environment variable use (and clarify the fact that in ECS,
    ECS_CONTAINER_METADATA_FILE has to be manually enabled).
  • use the value ecs rather than aws (you can run containers on AWS without using ECS at all, so I'm afraid it might mislead some people if we stick to aws).
  • use a case rather than if/else, it's cleaner and will be easier for us to maintain in the long term if the plan is to add other container management services. If you're worried about speed, it does not make any measurable difference whether you have two patterns in your case statement or several thousands (tested for real with time and script-generated case).
  • preferably check in the ecs case that ECS_CONTAINER_METADATA_FILE is set, and if it's not exit with a message explaining that it needs to be enabled.
  • no need to check DOCKER_PROVIDER with -n if it is a user set environment variable. Either it's set by the user or it's not, get a default value and is caught by the default case.

proto code:

DOCKER_PROVIDER=${DOCKER_PROVIDER:-default} #or docker or whatever default value seems more relevant to you

case $DOCKER_PROVIDER in
ecs|ECS)
    if [[ -z ${ECS_CONTAINER_METADATA_FILE:-} ]]; then
        echo instructions
        exit 1
    fi
    export CONTAINER_ID=$(ecs method)
    ;;
*)
    export CONTAINER_ID=$(default method)
    ;;
esac

@jschlieber
Copy link

Hi, thanks for addressing this issue. When can users expect this to be merged?

@buchdag
Copy link
Member

buchdag commented Jan 5, 2018

@jschlieber As soon as the requested changes are made.

@myoung34 If you lack time to make the changes or if you think they're problematic, please let me know.

@pbreah
Copy link

pbreah commented Jan 5, 2018

I tested @myoung34 initial fix, and there is another issue on ECS: when the docker API is called on the script it gets this:

{"message":"No such exec instance 'null' found in daemon"}

this is part of the docker-compose for the nginx-proxy container

nginx_proxy:
    image:                      jwilder/nginx-proxy
    container_name:             nginx_proxy
    links:
     - geoserver
    ports:
     - 80:80
     - 443:443
    labels:
      - com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy=true

this also causes this error when trying to reload nginx:

nginx: [emerg] no servers are inside upstream

on the nginx config file /etc/nginx/conf.d/default.conf (I replaced the real domain with example.com):

# example.com
upstream example.com {
}

@pbreah
Copy link

pbreah commented Jan 6, 2018

At the moment this image works on AWS ECS using docker stacks (implementing @myoung34 fix) - manually creating your EC2 instance and going from there.

Also there is a mayor confusion on the docs:

  • if you use -e NGINX_PROXY_CONTAINER=[name] it has hard time finding the nginx-proxy ID, because the function that looks for a container with the label com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy=true is named the same as this environment variable. So by not including this environment variable it works (not getting the {"message":"No such exec instance 'null' found in daemon"} message in the logs)

Hope this helps anyone with the same issue.

@buchdag
Copy link
Member

buchdag commented Jan 6, 2018

@pbreah thanks for reporting the issue, there are a few more instances of functions sharing names with variables, I'll try to fix them asap.

@vfeskov
Copy link

vfeskov commented Jan 14, 2018

@myoung34 it doesn't work for me on elastic beanstalk multi-container docker platform, ECS_CONTAINER_METADATA_FILE env var is not set in containers

@myoung34
Copy link
Contributor Author

@vfeskov I dont think you can use it in beanstalk since that's not the same as ECS

@vfeskov
Copy link

vfeskov commented Jan 14, 2018

@myoung34 elastic beanstalk uses ecs and the companion was working fine before (post)

this change: vfeskov@bb85a95 makes it get container id correctly, but i'm now struggling with "Connection refused" error

@myoung34
Copy link
Contributor Author

myoung34 commented Jan 14, 2018

@vfeskov Im not sure but I dont use beanstalk with containers so feel free to push additional changes if anything has to change to accomodate it. As for the current push:

bash-4.3# cat test.sh
DOCKER_PROVIDER=${DOCKER_PROVIDER:-docker}
case "${DOCKER_PROVIDER}" in
ecs|ECS)
    if [[ -n "${ECS_CONTAINER_METADATA_FILE:-}" ]]; then
      export CONTAINER_ID=$(grep ContainerID "${ECS_CONTAINER_METADATA_FILE}" | sed 's/.*: "\(.*\)",/\1/g')
    else
      echo "${DOCKER_PROVIDER} specified as 'ecs' but not available. See: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-metadata.html"
      exit 1
    fi
    ;;
*)
    export CONTAINER_ID=$(sed -nE 's/^.+docker[\/-]([a-f0-9]{64}).*/\1/p' /proc/self/cgroup | head -n 1)
    ;;
esac

echo "container id: ${CONTAINER_ID}"
echo "docker provider: ${DOCKER_PROVIDER}"


bash-4.3# export DOCKER_PROVIDER=ecs
bash-4.3# sh test.sh
container id: c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
docker provider: ecs

bash-4.3# unset DOCKER_PROVIDER
bash-4.3# sh test.sh
container id: 
docker provider: docker

@buchdag Latest meets criteria and passes manual verify on ECS with ecs.config enabled.

@vfeskov
Copy link

vfeskov commented Jan 14, 2018

@myoung34 Can you check if this command works in your setup? It does work on Elastic Beanstalk ECS, and it doesn't require any additional config

export CONTAINER_ID=$(cat /proc/self/cgroup | sed -nE 's/^.+(docker[\/-]|ecs\/[a-f0-9\-]+\/)([a-f0-9]{64}).*/\2/p' | head -n 1)

@myoung34
Copy link
Contributor Author

myoung34 commented Jan 14, 2018

@vfeskov that wont work in ECS, it does not expose that in the prc file:

bash-4.3# cat /proc/self/cgroup
9:perf_event:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
8:memory:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
7:hugetlb:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
6:freezer:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
5:devices:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
4:cpuset:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
3:cpuacct:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
2:cpu:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261
1:blkio:/ecs/5c2c44db-6d3d-4224-95f8-1a5113e7de38/c16201cdac97ec6cf74c7d1a1e665c6ebfd12c1505483f73667d0edd93987261

@vfeskov
Copy link

vfeskov commented Jan 14, 2018

looks exactly what i see on elastic beanstalk ecs and the regex extracts container id nicely

@myoung34
Copy link
Contributor Author

@vfeskov feel free to submit a regex fix after this merge then, but there are better ways to get container ids in other areas (like ECS) using the metadata instead of relying on the cgroup file.

@vfeskov
Copy link

vfeskov commented Jan 14, 2018

https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker_ecs.html

Elastic Beanstalk uses Amazon Elastic Container Service (Amazon ECS) to coordinate container deployments to multicontainer Docker environments

so something like this then?

case "${DOCKER_PROVIDER}" in
ecs|ECS)
    # AWS ECS. Enabled in /etc/ecs/ecs.config (http://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-metadata.html)
    if [[ -n "${ECS_CONTAINER_METADATA_FILE:-}" ]]; then
      export CONTAINER_ID=$(grep ContainerID "${ECS_CONTAINER_METADATA_FILE}" | sed 's/.*: "\(.*\)",/\1/g')
    else
      echo "${DOCKER_PROVIDER} specified as 'ecs' but not available. See: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-metadata.html"
      exit 1
    fi
    ;;
*)
    export CONTAINER_ID=$(cat /proc/self/cgroup | sed -nE 's/^.+(docker[\/-]|ecs\/[a-f0-9\-]+\/)([a-f0-9]{64}).*/\2/p' | head -n 1)
    ;;
esac

@myoung34
Copy link
Contributor Author

That would work but this is starting to get into scope creep. Can we just get this merged and issue subsequent pr's ? I've already added code not related to ecs here and would like to stop making changes

@buchdag
Copy link
Member

buchdag commented Jan 14, 2018

@myoung34 LGTM, thanks for this PR and the added effort on changes.

Is it ok with you if I squash the commits ?

@buchdag buchdag added the lgtm label Jan 14, 2018
@myoung34
Copy link
Contributor Author

Good with me, I meant to originally. Thanks!

@buchdag buchdag merged commit da5cc2b into nginx-proxy:master Jan 14, 2018
@alepycom
Copy link

Hi, I'm still having the "Error: can't get my container ID !"

i'm following your instructions about "Separate containers" on https://github.com/JrCs/docker-letsencrypt-nginx-proxy-companion

I'm trying to mount that on AWS ECS Task Definition with those 3 containers.
I'm able to do it manually, but the cool thing it's to do it on the console.

Can anyone tell me if i'm missing some new instruction?

thanks a lot

speshak pushed a commit to speshak/docker-letsencrypt-nginx-proxy-companion that referenced this pull request Oct 2, 2018
+ add foundation for future support of other container management services
@sunomie
Copy link

sunomie commented Oct 20, 2018

Getting 'Error: can't get my container ID' with the combined container setup. Verified ECS_ENABLE_CONTAINER_METADATA is available in the container. Verified that the function get_self_cid works fine when I run it manually in the container. Maybe metadata file takes a little bit to establish and this is getting called before before that happens and causing the container blow up? Not sure what else could be going on.

@1awst
Copy link

1awst commented Oct 23, 2018

@sunomie I also started getting the same issue when I used the latest AMI with the latest ECS agent (1.21.0) however previous versions work as per the documentation. I was able to fix it on the latest agent by removing the DOCKER_PROVIDER environment variable completely.

Update:
I initially thought the prefix in /proc/self/cgroup had changed from /ecs/ to /docker/ and that was the reason why things weren't working as expected with the latest AMI/agent. However I'm using jwilder/nginx-proxy and there is a known issue related to the prefix (see here) which I'm working around by setting ECS_ENABLE_TASK_CPU_MEM_LIMIT=false in /etc/ecs/ecs.config. That's the reason the prefix is /docker/ and not because of the most recent AMI or ECS agent. That also explains why I was able to remove the DOCKER_PROVIDER setting from the companion container.

tl;dr I don't know what changed in the latest AMI and/or ECS agent but removing the DOCKER_PROVIDER setting from the companion container and adding ECS_ENABLE_TASK_CPU_MEM_LIMIT=false to the ECS agent config resolved the 'container ID' error when the companion container attempted to start. My other clusters running older AMIs and agents run without removing the DOCKER_PROVIDER setting.

@sunomie
Copy link

sunomie commented Oct 23, 2018 via email

@buchdag
Copy link
Member

buchdag commented Oct 29, 2018

This self container ID issue with AWS ECS bugs me.

The sole purpose of getting the LE container ID are a configuration check during container startup and the ability to automatically get the nginx-proxy container ID with --volumes-from.

None of them are actually required to correctly run the container.

I'm a newbie with ECS but I guess the --volumes-from flag of docker run does not even make sense in this context ?

@buchdag buchdag added Amazon ECS and removed lgtm labels Oct 29, 2018
@buchdag
Copy link
Member

buchdag commented Jan 10, 2019

The self container ID is now retrieved with a way simpler method that has been tested to work on Amazon ECS (and should work on Kubernetes too) + the hard requirement that the container be able to retrieve its self ID during startup has been lifted if you use other methods of getting the nginx/nginx-proxy container ID. See #491 for details.

This should greatly simplify the use of this container on Amazon ECS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature-request Issue requesting a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants