Enable prometheus metrics for katib-controller #717

Merged: 3 commits into kubeflow:master, Aug 16, 2019

Conversation

@hougangliu
Member

Commented Aug 13, 2019

Upgrade controller-runtime to 0.1.9 and enable Prometheus metrics for katib-controller. Sample output from the controller's metrics endpoint:

# curl http://10.0.149.224:8080/metrics
...
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 1.8476192e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 1.819165e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.671168e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.671168e+06
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.3072888e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 21
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 5.09
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 12
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.6043136e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.56568718538e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.41164544e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1
...
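
For anyone who wants to check this output programmatically, here is a minimal sketch in Go that parses the Prometheus text exposition format shown above into a map. It is not part of this PR: the `parseMetrics` helper is a hypothetical name, it uses only the standard library, and it deliberately skips comment lines and any sample that carries labels.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMetrics is a hypothetical helper (not from katib or controller-runtime).
// It reads Prometheus text exposition output and returns label-free samples
// keyed by metric name. "# HELP" and "# TYPE" lines, blank lines, elision
// markers such as "...", and samples with labels are skipped.
func parseMetrics(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		// Need at least "name value"; labelled samples are out of scope here.
		if len(fields) < 2 || strings.Contains(fields[0], "{") {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", line, err)
		}
		samples[fields[0]] = v
	}
	return samples, sc.Err()
}

func main() {
	// A few lines copied from the curl output above.
	const sample = `# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 21
# TYPE process_open_fds gauge
process_open_fds 12
process_virtual_memory_max_bytes -1
`
	m, err := parseMetrics(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(m["go_threads"], m["process_open_fds"], m["process_virtual_memory_max_bytes"])
	// prints: 21 12 -1
}
```

In practice the text would come from an HTTP GET against the `/metrics` endpoint rather than a string constant; a production check would more likely use an existing Prometheus parser than this hand-rolled one.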

@johnugeorge
Member

Don't we need Katib-specific metrics to be exposed?

@hougangliu
Member Author

> Don't we need Katib-specific metrics to be exposed?

This PR only exposes the default controller-runtime metrics for the controller. I will submit follow-up PRs for metrics on experiments, trials, etc.

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@hougangliu
Member Author

/cc @gaocegege @johnugeorge

@gaocegege
Member

Left a review comment:

/lgtm
/retest

@johnugeorge
Member

/lgtm

@johnugeorge
Member

/approve

@johnugeorge
Member

/retest

@johnugeorge
Member

/retest

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@johnugeorge
Member

/retest

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@johnugeorge
Member

/retest

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@gaocegege
Member

/test kubeflow-katib-presubmit

@k8s-ci-robot removed the lgtm label Aug 15, 2019

@hougangliu
Member Author

Disabled the UI build so that its build error does not block Katib CI.
@andreyvelich will fix #699 once back.

@hougangliu
Member Author

/test kubeflow-katib-presubmit

@gaocegege
Member

/retest

@hougangliu
Member Author

/retest

@hougangliu
Member Author

/retest

@hougangliu
Member Author

/retest

@johnugeorge
Member

/lgtm

@johnugeorge
Member

/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnugeorge


@k8s-ci-robot merged commit 45124c1 into kubeflow:master on Aug 16, 2019