OMS agent dies within docker container, change main.sh #99

paktek123 opened this issue Mar 13, 2018 · 0 comments


Hi

I have been running the OMS agent inside Docker as described in the docs. I soon realised that the agent would die very frequently, even after upgrading to the latest version, and the logs would say nothing. This is the OMS agent running on Ubuntu 16.04 on an Azure VM, with the container started via systemd. Right now the main.sh used by the Dockerfile runs a sleep inf && wait:

https://github.com/Microsoft/OMS-docker/blob/master/1.4.4-210/main.sh#L80

My systemd job looks as follows:

[Unit]
Description=OMS Agent for Linux Docker
Requires=docker.service
After=docker.service

[Service]
Restart=always
RestartSec=10
RestartPreventExitStatus=5
TimeoutStartSec=20
TimeoutStopSec=15
SyslogIdentifier=oms-agent
ExecStop=/usr/bin/docker rm -f omsagent
ExecStartPre=-/usr/bin/docker kill omsagent
ExecStartPre=-/usr/bin/docker rm -f omsagent
ExecStartPre=/usr/bin/docker pull microsoft/oms:1.4.4-210
ExecStart=/usr/bin/docker run --privileged \
    -v /etc/omsagent/main.sh:/opt/main.sh:ro \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/log:/var/log \
    -e WSID="XXX" \
    -e KEY="XXX" \
    -e SERVICE_25225_CHECK_TCP=true \
    -e SERVICE_25225_CHECK_INTERVAL=15s \
    -e SERVICE_25225_CHECK_TIMEOUT=3s \
    -h="XXX" \
    --net=host \
    --restart=always \
    --publish 10.x.x.x:25225:25225 \
    --publish 10.x.x.x:25224:25224/udp \
    --name="omsagent" \
    microsoft/oms:1.4.4-210

[Install]
WantedBy=multi-user.target

Right now, if the OMS service stops responding, the Docker container keeps running, so neither Docker nor systemd ever restarts it. I am suggesting that instead of sleep infinity we change main.sh to something like this:

ERROR_CODE=1
# Loop until is-running exits 0, which here means the agent has stopped.
until [ "$ERROR_CODE" -eq 0 ]; do
    /opt/microsoft/omsagent/bin/service_control is-running; ERROR_CODE=$?
    [ "$ERROR_CODE" -ne 0 ] && echo "OMSAgent Running"
    sleep 15
done

Note that the exit codes of is-running do not follow the standard Unix convention: 0 means the agent is not running and 1 means it is running.
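
To make the inverted exit codes harder to trip over, the same idea could be written as two small functions. This is only a rough sketch, not what main.sh does today: CHECK_CMD, POLL_INTERVAL, agent_is_running and watch_agent are names I made up for illustration, and it assumes is-running really exits 1 while the agent is up, as described above.

```shell
#!/usr/bin/env bash
# Watchdog sketch for main.sh. Assumption taken from this issue:
# `service_control is-running` exits 1 while the agent is up, 0 when it is down.

# CHECK_CMD and POLL_INTERVAL are hypothetical knobs, not part of the OMS image.
CHECK_CMD="${CHECK_CMD:-/opt/microsoft/omsagent/bin/service_control is-running}"
POLL_INTERVAL="${POLL_INTERVAL:-15}"

# Translate the inverted exit code into the usual convention:
# returns 0 if the agent is running, non-zero otherwise.
# Only exit status 1 counts as "running", so a missing binary (127) reads as down.
agent_is_running() {
    local rc=0
    $CHECK_CMD || rc=$?
    [ "$rc" -eq 1 ]
}

# Block while the agent is up; return 1 as soon as it is reported down.
watch_agent() {
    while agent_is_running; do
        sleep "$POLL_INTERVAL"
    done
    echo "OMSAgent is not running, exiting" >&2
    return 1
}
```

In main.sh the sleep could then be replaced by `watch_agent || exit 1`, so the container exits with a non-zero status once the agent stops and the restart policy (Docker's or systemd's) can bring it back.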
