Please add container uptime to docker input #3184

jindov · 2017-08-28T18:39:55Z

Like cAdvisor with Prometheus, I hope the docker input can get a container_uptime metric. I really need to know when a container restarts and how long it has been running.
Thanks, dev team,
Jin

jamesshannon · 2018-08-17T18:00:11Z

Any movement on this? I'd really like to see this done as a way to catch errors that cause reboots.

It's not clear to me which measurement this would go into. There are container-specific metrics for blockio, cpu, mem, and net. Would we need to create a new one for uptime? That seems wasteful (though technically-speaking it probably isn't).

glinton · 2018-08-21T16:54:47Z

@jamesshannon I'm not aware of anyone spending time on this yet. If you'd like to, you're more than welcome to open a PR for it. It looks like adding it as another metric, like blockio, cpu, etc., is the way to go.

danielnelson · 2018-08-21T22:58:23Z

In 1.8.0 we have added a new measurement:

- docker_container_status
  - tags:
    - engine_host
    - server_version
    - container_image
    - container_name
    - container_status
    - container_version
  - fields:
    - oomkilled (boolean)
    - pid (integer)
    - exitcode (integer)
    - started_at (integer)
    - finished_at (integer)

@jindov @jamesshannon Would this meet your requirements?

jamesshannon · 2018-08-21T23:13:23Z

I was thinking of something like uptime_seconds. My goal is to be able to catch when my application crashes (which then causes docker to reboot). I'm just getting started with the influxdb stack, but I would then set up grafana to count instances where uptime < some small number. But I guess I could also track other rows, like exitcode or finished_at not null. Right?

More generally, I am a bit confused about your data model. The fact that there's an exitcode and finished_at suggests that the new version will be logging all containers? Though looking at a recent docker container I see:

"State": {
    "Status": "running",
    "Running": true,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 30344,
    "ExitCode": 0,
    "Error": "",
    "StartedAt": "2018-08-21T18:42:03.079404269Z",
    "FinishedAt": "2018-08-21T18:42:00.546149979Z"
},

So it looks like, for running containers, FinishedAt and OOMKilled will be populated and represent the last time the container died. TBH, I'm not sure how useful that would be in InfluxDB. But I guess I could try to filter for rows where status == running and finished_at is in the last few minutes?
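
A minimal sketch of that filter in Go, assuming the State object has been pulled out of the inspect output as JSON. The names containerState, restartedRecently, mustTime, sampleState, and defaultWindow are hypothetical and only for illustration, and the five-minute window is an arbitrary choice:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sampleState holds a subset of the State object quoted above.
const sampleState = `{
	"Status": "running",
	"OOMKilled": false,
	"ExitCode": 0,
	"StartedAt": "2018-08-21T18:42:03.079404269Z",
	"FinishedAt": "2018-08-21T18:42:00.546149979Z"
}`

// defaultWindow is an arbitrary (hypothetical) look-back period.
const defaultWindow = 5 * time.Minute

// containerState mirrors only the State fields this sketch needs.
type containerState struct {
	Status     string    `json:"Status"`
	StartedAt  time.Time `json:"StartedAt"`
	FinishedAt time.Time `json:"FinishedAt"`
}

// mustTime parses an RFC 3339 timestamp and panics on malformed input.
func mustTime(s string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, s)
	if err != nil {
		panic(err)
	}
	return t
}

// restartedRecently reports whether a running container's previous run
// ended within window before now.
func restartedRecently(stateJSON []byte, now time.Time, window time.Duration) (bool, error) {
	var s containerState
	if err := json.Unmarshal(stateJSON, &s); err != nil {
		return false, err
	}
	// Only running containers that have a recorded previous exit qualify.
	if s.Status != "running" || s.FinishedAt.IsZero() {
		return false, nil
	}
	// Assumption: FinishedAt earlier than StartedAt refers to the previous run.
	if !s.FinishedAt.Before(s.StartedAt) {
		return false, nil
	}
	age := now.Sub(s.FinishedAt)
	return age >= 0 && age <= window, nil
}

func main() {
	now := mustTime("2018-08-21T18:44:00Z")
	recent, err := restartedRecently([]byte(sampleState), now, defaultWindow)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("restarted recently:", recent)
}
```

With the sample above and a clock two minutes after the restart, the check reports true. An hour later it reports false, which is why this only catches crashes if it is evaluated often enough.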

danielnelson · 2018-08-21T23:38:41Z

The containers monitored will be subject to the existing options, such as container_name_include and container_state_include.

To be honest, I was not even aware it was possible to have information about the last time a container died, and I'm not even sure what that means. I was under the assumption that these would not be filled out until the container had stopped. Can a stopped container be restarted? I thought restarting it would mean creating a new container.

By the way, the details of this addition are in #4259

jamesshannon · 2018-08-22T00:24:27Z

I wasn't, either. :) But it's pretty clear from my sample output that FinishedAt is from about 2.5 seconds prior to the StartedAt. I'd think ExitCode would be populated in a similar fashion, but then I would expect that it'd be something other than 0 for the container that had restarted due to an error.

All the more reason that I'm confused about the desired use case for this -- you'll have some containers (state == stopped) with fields that never change, and the others where the fields may be populated or may not, and which will generally be pretty static, until they abruptly change (like the StartedAt). And other than doing math on StartedAt there's no clear way to measure uptime. But, then, I'm still a telegraf / influxdb beginner so I don't have my head around non-traditional metrics.

danielnelson · 2018-08-23T19:47:24Z

I don't think there would be a way to easily display uptime in Grafana or Chronograf with only started_at/finished_at. You would probably need custom code to do anything interesting, or you could use one of these tools to display the value in a table.

So it seems that we should add a new uptime field to the docker_container_status measurement based on started_at and time.Now().

Bursade · 2019-05-01T22:51:51Z

> I don't think there would be a way to easily display uptime in Grafana or Chronograf with the start_time/end_time only. You would probably only be able to do something interesting with custom code or use one of these tools to display the value in a table.
>
> So it seems that we should add a new uptime field to the docker_container_status measurement based on started_at and time.Now().

Hi! Sorry to revive this. Is there any plan to add this feature?

danielnelson · 2019-05-01T23:54:21Z

We would accept a pull request for this. It should probably be done as uptime_ns, since we are normalizing on nanosecond durations. If the container has already stopped, I'm not sure what the best behavior is: we could either cap the uptime at end - start (finished_at minus started_at), or just stop reporting uptime.
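
A minimal sketch of the capping option, assuming started_at and finished_at are available as time.Time values. The names uptimeNS and mustTime are hypothetical, and this is not necessarily how the plugin implements it. It also assumes a finished_at later than started_at means the container has stopped, as the inspect output earlier in the thread suggests:

```go
package main

import (
	"fmt"
	"time"
)

// mustTime parses an RFC 3339 timestamp and panics on malformed input.
func mustTime(s string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, s)
	if err != nil {
		panic(err)
	}
	return t
}

// uptimeNS returns the uptime in nanoseconds and whether it should be reported.
//   - never started (zero startedAt): nothing to report
//   - stopped (finishedAt after startedAt): capped at finishedAt - startedAt
//   - otherwise: measured from startedAt to now
func uptimeNS(startedAt, finishedAt, now time.Time) (int64, bool) {
	if startedAt.IsZero() {
		return 0, false
	}
	end := now
	if !finishedAt.IsZero() && finishedAt.After(startedAt) {
		end = finishedAt // cap at the moment the container stopped
	}
	if end.Before(startedAt) {
		return 0, false // clock skew: avoid reporting a negative uptime
	}
	return end.Sub(startedAt).Nanoseconds(), true
}

func main() {
	started := mustTime("2018-08-21T18:42:03.079404269Z")
	finished := mustTime("2018-08-21T18:42:00.546149979Z") // previous run
	if ns, ok := uptimeNS(started, finished, time.Now()); ok {
		fmt.Println("uptime_ns:", ns)
	}
}
```

The other option discussed, to stop reporting uptime for stopped containers, would return false in the stopped branch instead of capping.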

danielnelson added the area/docker and feature request labels Aug 29, 2017

GeorgeMac mentioned this issue Jun 15, 2019: Container Uptime in Docker Input Plugin #5996 (merged)

danielnelson closed this as completed in #5996 Jun 19, 2019
