Please add container uptime to docker input #3184

Closed
jindov opened this issue Aug 28, 2017 · 9 comments · Fixed by #5996
Labels
area/docker feature request Requests for new plugin and for new features to existing plugins

Comments

@jindov

jindov commented Aug 28, 2017

Like cAdvisor with Prometheus, I hope the docker input will have a container_uptime metric. I really need to know when a container restarts or how long it has been running.
Thanks, dev team,
Jin

@danielnelson danielnelson added area/docker feature request Requests for new plugin and for new features to existing plugins labels Aug 29, 2017
@jamesshannon

Any movement on this? I'd really like to see this done as a way to catch errors that cause reboots.

It's not clear to me which measurement this would go into. There are container-specific metrics for blockio, cpu, mem, and net. Would we need to create a new one for uptime? That seems wasteful (though technically-speaking it probably isn't).

@glinton
Contributor

glinton commented Aug 21, 2018

@jamesshannon I'm not aware of anyone spending time on this yet. If you'd like to, you're more than welcome to open a PR for it. It looks like adding it as another metric, like blockio, cpu, etc., is the way to go.

@danielnelson
Contributor

In 1.8.0 we have added a new measurement:

- docker_container_status
  - tags:
    - engine_host
    - server_version
    - container_image
    - container_name
    - container_status
    - container_version
  - fields:
    - oomkilled (boolean)
    - pid (integer)
    - exitcode (integer)
    - started_at (integer)
    - finished_at (integer)

@jindov @jamesshannon Would this meet your requirements?

@jamesshannon

I was thinking of something like uptime_seconds. My goal is to be able to catch when my application crashes (which then causes docker to reboot). I'm just getting started with the influxdb stack, but I would then set up grafana to count instances where uptime < some small number. But I guess I could also track other rows, like exitcode or finished_at not null. Right?

More generally, I am a bit confused about your data model. The fact that there's an exitcode and finished_at suggests that the new version will be logging all containers? Though looking at a recent docker container I see:

"State": {
    "Status": "running",
    "Running": true,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 30344,
    "ExitCode": 0,
    "Error": "",
    "StartedAt": "2018-08-21T18:42:03.079404269Z",
    "FinishedAt": "2018-08-21T18:42:00.546149979Z"
},

So it looks like, for running containers, finished_at and oomkilled will be populated and represent the last time the container died. TBH, I'm not sure how useful that would be in influxdb. But I guess I could try to filter for rows where status == running and finished_at is in the last few minutes?
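To make that filter idea concrete, here is a rough Go sketch that decodes the State object above and checks whether a running container finished its previous run within a recent window. Everything named here (`containerState`, `parseState`, `restartedRecently`, `recentWindow`, `mustParse`, `sampleState`) is made up for illustration and is not Telegraf code; the five-minute window is an arbitrary assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sampleState is the State object from the inspect output quoted above.
const sampleState = `{
	"Status": "running",
	"Running": true,
	"Paused": false,
	"Restarting": false,
	"OOMKilled": false,
	"Dead": false,
	"Pid": 30344,
	"ExitCode": 0,
	"Error": "",
	"StartedAt": "2018-08-21T18:42:03.079404269Z",
	"FinishedAt": "2018-08-21T18:42:00.546149979Z"
}`

// recentWindow is an arbitrary, assumed threshold for "the last few minutes".
const recentWindow = 5 * time.Minute

// containerState mirrors only the keys this sketch needs (hypothetical type).
type containerState struct {
	Status     string    `json:"Status"`
	OOMKilled  bool      `json:"OOMKilled"`
	ExitCode   int       `json:"ExitCode"`
	StartedAt  time.Time `json:"StartedAt"`
	FinishedAt time.Time `json:"FinishedAt"`
}

// parseState decodes a JSON State object; keys not in the struct are ignored.
func parseState(raw string) (containerState, error) {
	var s containerState
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
func mustParse(ts string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		panic(err)
	}
	return t
}

// restartedRecently reports whether the container is running and its
// FinishedAt (the end of a previous run) falls within window before now.
func restartedRecently(s containerState, now time.Time, window time.Duration) bool {
	if s.Status != "running" || s.FinishedAt.IsZero() {
		return false
	}
	age := now.Sub(s.FinishedAt)
	return age >= 0 && age <= window
}

func main() {
	s, err := parseState(sampleState)
	if err != nil {
		panic(err)
	}
	now := mustParse("2018-08-21T18:44:00Z") // fixed "now" so the example is repeatable
	fmt.Println("gap between finish and start:", s.StartedAt.Sub(s.FinishedAt))
	fmt.Println("restarted recently:", restartedRecently(s, now, recentWindow))
}
```

This only flags a restart while FinishedAt is still inside the window, so it would need to be evaluated on every collection interval to catch short-lived crashes.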

@danielnelson
Contributor

The containers monitored will be subject to the existing options, such as container_name_include and container_state_include.

To be honest, I was not even aware it was possible to have information about the last time a container died, and I'm not even sure what that means. I was under the assumption that these would not be filled out until the container had stopped. Can a container be restarted once it is stopped? I thought restarting it would mean creating a new container.

By the way, the details of this addition are in #4259

@jamesshannon

I wasn't, either. :) But it's pretty clear from my sample output that FinishedAt is from about 2.5 seconds prior to the StartedAt. I'd think ExitCode would be populated in a similar fashion, but then I would expect that it'd be something other than 0 for the container that had restarted due to an error.

All the more reason that I'm confused about the desired use case for this -- you'll have some containers (state == stopped) with fields that never change, and the others where the fields may be populated or may not, and which will generally be pretty static, until they abruptly change (like the StartedAt). And other than doing math on StartedAt there's no clear way to measure uptime. But, then, I'm still a telegraf / influxdb beginner so I don't have my head around non-traditional metrics.

@danielnelson
Contributor

I don't think there would be an easy way to display uptime in Grafana or Chronograf with only started_at/finished_at. You would probably only be able to do something interesting with custom code, or use one of these tools to display the value in a table.

So it seems that we should add a new uptime field to the docker_container_status measurement based on started_at and time.Now().

@Bursade

Bursade commented May 1, 2019

> I don't think there would be an easy way to display uptime in Grafana or Chronograf with only started_at/finished_at. You would probably only be able to do something interesting with custom code, or use one of these tools to display the value in a table.
>
> So it seems that we should add a new uptime field to the docker_container_status measurement based on started_at and time.Now().

Hi! Sorry to revive this. Is there any plan to add this feature?

@danielnelson
Contributor

We would accept a pull request for this. It should probably be done as uptime_ns, since we are normalizing on nanosecond durations. If the container has already stopped, I'm not sure what the best behavior is: we could either cap the uptime at finished_at - started_at, or just stop reporting uptime.
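As a starting point for anyone picking this up, the two behaviors could be sketched roughly as below. This is not the plugin's code: `uptimeNS`, its `capWhenStopped` switch, and the `mustParse` helper are hypothetical names, and the sketch assumes started_at and finished_at are already parsed into `time.Time` values.

```go
package main

import (
	"fmt"
	"time"
)

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
func mustParse(ts string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, ts)
	if err != nil {
		panic(err)
	}
	return t
}

// uptimeNS returns a value for a hypothetical uptime_ns field and whether
// the field should be reported at all.
//
//   - running container: now - startedAt
//   - stopped container, capWhenStopped == true: finishedAt - startedAt
//   - stopped container, capWhenStopped == false: field is not reported
func uptimeNS(startedAt, finishedAt, now time.Time, running, capWhenStopped bool) (int64, bool) {
	if startedAt.IsZero() {
		return 0, false // container was never started
	}
	if running {
		return now.Sub(startedAt).Nanoseconds(), true
	}
	if !capWhenStopped || finishedAt.Before(startedAt) {
		return 0, false
	}
	return finishedAt.Sub(startedAt).Nanoseconds(), true
}

func main() {
	started := mustParse("2018-08-21T18:42:03Z")
	finished := mustParse("2018-08-21T18:42:33Z")
	now := mustParse("2018-08-21T18:43:03Z") // fixed "now"; real code would use time.Now()

	ns, ok := uptimeNS(started, finished, now, true, true)
	fmt.Println("running:", ns, ok)

	ns, ok = uptimeNS(started, finished, now, false, true)
	fmt.Println("stopped, capped:", ns, ok)

	ns, ok = uptimeNS(started, finished, now, false, false)
	fmt.Println("stopped, not reported:", ns, ok)
}
```

Returning a separate boolean keeps "do not report" distinct from an uptime of zero, which matters if dashboards count containers with a very small uptime.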
