UCP: support monitor mount external configuration #2294

Merged
merged 72 commits into from
May 29, 2020
Commits
cdc9561
e2e case for clear TiDB failureMembers when scale TiDB to zero
mikechengwei Jan 25, 2020
1b4149b
increase timeout time
mikechengwei Jan 26, 2020
b659c07
optimize comment
mikechengwei Jan 26, 2020
08b16cf
optimize comment
mikechengwei Jan 26, 2020
d9415cc
Merge branch 'master' of github.com:pingcap/tidb-operator
mikechengwei Feb 3, 2020
7c9cc90
support monitor external config map
mikechengwei Apr 26, 2020
8b302f9
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei Apr 26, 2020
090e010
fix fake configmap
mikechengwei Apr 26, 2020
5beecd3
fix fake configmap
mikechengwei Apr 26, 2020
b17dbda
update docs
mikechengwei Apr 26, 2020
4b0be1c
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei Apr 26, 2020
679c19a
add start-command config
mikechengwei Apr 26, 2020
f66642c
fix configMap demo
mikechengwei Apr 26, 2020
4fc8ec1
use ConfigMapRef field
mikechengwei Apr 26, 2020
4ca7926
update docs
mikechengwei Apr 26, 2020
d630a33
Merge branch 'master' into externalConfigMap
mikechengwei Apr 26, 2020
ceca34a
Merge branch 'master' of github.com:tongcheng-elong/tidb-operator int…
mikechengwei May 14, 2020
cdbb7f8
support command
mikechengwei May 14, 2020
85b7ee7
Merge branch 'externalConfigMap' of github.com:tongcheng-elong/tidb-o…
mikechengwei May 14, 2020
72e9715
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 14, 2020
da0589f
support command
mikechengwei May 14, 2020
d010960
support command
mikechengwei May 14, 2020
64ad314
remove invalid code
mikechengwei May 14, 2020
f4adabe
remove invalid code
mikechengwei May 14, 2020
29ad5f3
remove invalid code
mikechengwei May 14, 2020
956ae68
remove invalid code
mikechengwei May 14, 2020
00dbfac
recover not change code
mikechengwei May 14, 2020
6eab03a
add container config
mikechengwei May 14, 2020
dcadeec
optimize code
mikechengwei May 15, 2020
fbccca0
optimize code
mikechengwei May 15, 2020
3f8d3fe
optimize code
mikechengwei May 15, 2020
360c656
add new-line
mikechengwei May 15, 2020
7b8eff3
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 15, 2020
161d730
update doc
mikechengwei May 15, 2020
b3e30a8
optimize code
mikechengwei May 15, 2020
25e46dd
optimize code
mikechengwei May 15, 2020
1d0a247
optimize code
mikechengwei May 15, 2020
5b34f99
update doc
mikechengwei May 15, 2020
977fb31
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 15, 2020
753ab55
optimize code
mikechengwei May 15, 2020
76354e2
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 15, 2020
432ac85
update doc
mikechengwei May 15, 2020
b77cba7
update doc
mikechengwei May 15, 2020
1eca463
update doc
mikechengwei May 15, 2020
1b0581d
update doc
mikechengwei May 15, 2020
bfcc535
update doc
mikechengwei May 18, 2020
cba083f
update example README.md
mikechengwei May 20, 2020
45f74b3
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 20, 2020
2bb0b81
fix new review
mikechengwei May 20, 2020
1a7050c
fix openapi
mikechengwei May 20, 2020
6c56e84
fix doc
mikechengwei May 20, 2020
1dc0508
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 20, 2020
5abddd4
fix type
mikechengwei May 20, 2020
1427471
fix type
mikechengwei May 20, 2020
69b2961
fix type
mikechengwei May 20, 2020
cf244c1
fix type
mikechengwei May 20, 2020
89d659e
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 21, 2020
3d1eb79
fix type
mikechengwei May 21, 2020
e92201e
fix type
mikechengwei May 21, 2020
2b5981c
Merge branch 'master' into externalConfigMap
mikechengwei May 22, 2020
58d984e
Merge branch 'master' into externalConfigMap
mikechengwei May 25, 2020
a63722d
Merge branch 'master' into externalConfigMap
mikechengwei May 26, 2020
8969ddb
Merge branch 'master' into externalConfigMap
mikechengwei May 26, 2020
ac4f2e6
Merge branch 'master' into externalConfigMap
mikechengwei May 26, 2020
a990b4b
Merge branch 'master' into externalConfigMap
mikechengwei May 26, 2020
7792dbf
Merge branch 'master' into externalConfigMap
mikechengwei May 27, 2020
4959d64
Merge branch 'master' into externalConfigMap
mikechengwei May 27, 2020
813acae
Merge branch 'master' into externalConfigMap
mikechengwei May 28, 2020
41f5823
Merge branch 'master' into externalConfigMap
mikechengwei May 28, 2020
364498a
Merge branch 'master' of github.com:pingcap/tidb-operator into extern…
mikechengwei May 29, 2020
c30ef23
Merge branch 'externalConfigMap' of github.com:tongcheng-elong/tidb-o…
mikechengwei May 29, 2020
1793824
Merge branch 'master' into externalConfigMap
mikechengwei May 29, 2020
2 changes: 2 additions & 0 deletions charts/tidb-cluster/values.yaml
@@ -554,6 +554,8 @@ monitor:
      type: NodePort
      portName: http-grafana
  prometheus:
    externalConfigMaps:
      config: external-config
    image: prom/prometheus:v2.11.1
    imagePullPolicy: IfNotPresent
    logLevel: info
190 changes: 190 additions & 0 deletions manifests/monitor/monitor-configMap.yaml
@@ -0,0 +1,190 @@
# Source: tidb-cluster/templates/monitor-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: external-config
  labels:
    app.kubernetes.io/name: tidb-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/instance: test
    app.kubernetes.io/component: monitor
    helm.sh/chart: tidb-cluster-dev
Contributor

What's the difference between this configuration and the configuration auto-generated by TidbMonitor? I don't think we should make mounting an external configuration the default choice for TidbMonitor.

data:
  prometheus-config: |-
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'pd'
        scrape_interval: 15s
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - default
        scheme: http
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: keep
            regex: test
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_component]
            action: keep
            regex: pd
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__meta_kubernetes_pod_name, __meta_kubernetes_pod_label_app_kubernetes_io_instance,
              __meta_kubernetes_pod_annotation_prometheus_io_port]
            regex: (.+);(.+);(.+)
            target_label: __address__
            replacement: $1.$2-pd-peer:$3
            action: replace
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: kubernetes_pod_ip
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: instance
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: replace
            target_label: cluster
      - job_name: 'tidb'
        scrape_interval: 15s
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - default
        scheme: http
        tls_config:
          insecure_skip_verify: true
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: keep
            regex: test
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_component]
            action: keep
            regex: tidb
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__meta_kubernetes_pod_name, __meta_kubernetes_pod_label_app_kubernetes_io_instance,
              __meta_kubernetes_pod_annotation_prometheus_io_port]
            regex: (.+);(.+);(.+)
            target_label: __address__
            replacement: $1.$2-tidb-peer:$3
            action: replace
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: kubernetes_pod_ip
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: instance
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: replace
            target_label: cluster
      - job_name: 'tikv'
        scrape_interval: 15s
        honor_labels: true
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - default
        scheme: http
        tls_config:
          insecure_skip_verify: true
        # TiKV doesn't support scheme https for now.
        # And we should fix it after TiKV fix this issue: https://github.com/tikv/tikv/issues/5340
        #
        # scheme: http
        # tls_config:
        #   insecure_skip_verify: true
        #
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: keep
            regex: test
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_component]
            action: keep
            regex: tikv
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__meta_kubernetes_pod_name, __meta_kubernetes_pod_label_app_kubernetes_io_instance,
              __meta_kubernetes_pod_annotation_prometheus_io_port]
            regex: (.+);(.+);(.+)
            target_label: __address__
            replacement: $1.$2-tikv-peer:$3
            action: replace
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: kubernetes_node
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: kubernetes_pod_ip
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: instance
          - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
            action: replace
            target_label: cluster
    rule_files:
      - '/prometheus-rules/rules/*.rules.yml'
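Editor's note: the `__address__` rewrite in each job above may be easier to follow with a small sketch. Prometheus joins the values of `source_labels` with `;`, matches the fully anchored `regex` against the result, and expands `replacement` with the capture groups. The Go snippet below imitates that one step for the `pd` job; the pod name `test-pd-0` and the helper name `rewriteAddress` are illustrative assumptions, not part of this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rewriteAddress imitates a single "replace" relabel step: the source label
// values are joined with ";", matched against an anchored regex, and the
// replacement template is expanded with the capture groups.
// The boolean is false when the regex does not match (no rewrite happens).
func rewriteAddress(podName, instance, port string) (string, bool) {
	joined := strings.Join([]string{podName, instance, port}, ";")
	// Relabel regexes are anchored, so the whole joined value must match.
	re := regexp.MustCompile(`^(?:(.+);(.+);(.+))$`)
	if !re.MatchString(joined) {
		return "", false
	}
	return re.ReplaceAllString(joined, "${1}.${2}-pd-peer:${3}"), true
}

func main() {
	// "test-pd-0" is a hypothetical pod name for the "test" instance.
	addr, ok := rewriteAddress("test-pd-0", "test", "2379")
	fmt.Println(addr, ok) // prints "test-pd-0.test-pd-peer:2379 true"
}
```

If the port annotation is missing, the third group has nothing to match and the address is left untouched, which is why the sketch returns `false` in that case.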

  dashboard-config: |-
    {
        "apiVersion": 1,
        "providers": [
            {
                "folder": "",
                "name": "0",
                "options": {
                    "path": "/grafana-dashboard-definitions/tidb"
Contributor

@yeya24 yeya24 Apr 26, 2020

How would additional dashboards for TidbMonitor be supported in this PR? We should support this as well.

Contributor

It seems we could also export grafana config in the configmap.

Contributor Author

If we support custom dashboards through a ConfigMap, we will need to append the extra ConfigMap to the volumes and VolumeMounts (reference: `VolumeMounts: []core.VolumeMount{`). Also, I think users can add dashboards dynamically in Grafana, so this feature is not important.

                },
                "orgId": 1,
                "type": "file"
            }
        ]
    }
  start-command: |-
Contributor

@yeya24 yeya24 Apr 27, 2020

IMO the start-command is neither flexible nor scalable. One option is to do something like Prometheus-operator: https://github.com/coreos/prometheus-operator/blob/87a7bea9f028c62396bef665f2887df326fa97f7/pkg/apis/monitoring/v1/types.go#L244. We can add a containers field to the TidbMonitor spec, then patch the generated containers with it during the actual generation phase, like here.

I recommend this approach because it also lets users specify things other than the Prometheus start command, such as env vars and volumes.

Contributor Author

I have only added support for patching the container's command field. I don't think patching other fields is important for now.

    /bin/prometheus
    --web.enable-admin-api
    --web.enable-lifecycle
    --log.level=info
    --config.file=/etc/prometheus/prometheus.yml
    --storage.tsdb.path=/data/prometheus
    --storage.tsdb.retention=12d
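
Editor's note: the review thread above proposes a containers field whose entries are patched onto the generated containers, and the author reports that only the command field is patched. A minimal sketch of that idea follows, assuming a merge by container name. The `Container` type is a trimmed, hypothetical stand-in for `corev1.Container`, `patchCommands` is an illustrative name, and the Grafana image and command are placeholders; none of these come from the PR's actual code.

```go
package main

import "fmt"

// Container is a hypothetical, trimmed-down stand-in for corev1.Container.
type Container struct {
	Name    string
	Image   string
	Command []string
}

// patchCommands returns a copy of generated in which the Command of each
// container is replaced by the Command of the same-named entry in overrides.
// Other fields, and overrides that match no container, are ignored.
func patchCommands(generated, overrides []Container) []Container {
	byName := make(map[string][]string, len(overrides))
	for _, o := range overrides {
		if len(o.Command) > 0 {
			byName[o.Name] = o.Command
		}
	}
	patched := make([]Container, len(generated))
	copy(patched, generated)
	for i := range patched {
		if cmd, ok := byName[patched[i].Name]; ok {
			// Copy the slice so later edits to overrides do not leak in.
			patched[i].Command = append([]string(nil), cmd...)
		}
	}
	return patched
}

func main() {
	generated := []Container{
		{Name: "prometheus", Image: "prom/prometheus:v2.11.1", Command: []string{"/bin/prometheus"}},
		{Name: "grafana", Image: "grafana/grafana", Command: []string{"/run.sh"}}, // placeholder values
	}
	overrides := []Container{{
		Name:    "prometheus",
		Command: []string{"/bin/prometheus", "--config.file=/etc/prometheus/prometheus.yml", "--storage.tsdb.retention=12d"},
	}}
	for _, c := range patchCommands(generated, overrides) {
		fmt.Println(c.Name, c.Command)
	}
}
```

Extending the same loop to env vars or volumes, as the reviewer suggests, would mean merging more fields per matched container.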


2 changes: 2 additions & 0 deletions manifests/monitor/tidb-monitor.yaml
@@ -6,6 +6,8 @@ spec:
  clusters:
  - name: demo
  prometheus:
    config:
      externalConfigMap: external-config
    baseImage: prom/prometheus
    version: v2.11.1
    resources: {}
7 changes: 7 additions & 0 deletions pkg/apis/pingcap/v1alpha1/tidbmonitor_types.go
@@ -79,6 +79,13 @@ type PrometheusSpec struct {
	Service ServiceSpec `json:"service,omitempty"`
	// +optional
	ReserveDays int `json:"reserveDays,omitempty"`
Contributor

@Yisaer Not related to this PR, but I am curious about this field. Do we only support retention in days (unit d), and not other units such as hours (h)?

Contributor

I think you are right. Maybe we will add a proper new property for this in the future.

Contributor Author

I have added support for a custom command, so I don't think reserveDays or another time field is necessary.

	// +optional
	Config *Configuration `json:"config,omitempty"`
}
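
Editor's note: the `reserveDays` thread above turns on the fact that an integer day count can only express retention in days. The sketch below assumes (this is not shown in the diff) that the value is rendered into the `--storage.tsdb.retention` flag seen in the start command; `retentionFromDays` and `retentionFromHours` are hypothetical helper names used only for illustration.

```go
package main

import "fmt"

// retentionFromDays renders a day-based retention flag, the only unit an
// integer "reserveDays" field can express.
func retentionFromDays(days int) string {
	return fmt.Sprintf("--storage.tsdb.retention=%dd", days)
}

// retentionFromHours is a hypothetical finer-grained alternative; a custom
// start command can achieve the same by writing the flag directly.
func retentionFromHours(hours int) string {
	return fmt.Sprintf("--storage.tsdb.retention=%dh", hours)
}

func main() {
	fmt.Println(retentionFromDays(12))  // prints "--storage.tsdb.retention=12d"
	fmt.Println(retentionFromHours(36)) // prints "--storage.tsdb.retention=36h"
}
```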

// Configuration is the desired state of Prometheus Configuration
mikechengwei marked this conversation as resolved.
type Configuration struct {
Contributor

I think Configuration alone is not clear. What about PrometheusConfiguration?

	ExternalConfigMap string `json:"externalConfigMap,omitempty"`
Contributor

@Yisaer Yisaer Apr 26, 2020

What about using ConfigMapRef here?

Contributor Author

ok

}
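
Editor's note: because `Config` is an optional pointer tagged `omitempty`, a TidbMonitor that omits `config` decodes to a nil pointer, so callers need to check for nil before reading `ExternalConfigMap`. The sketch below uses trimmed mirrors of the two types as they appear in this revision of the diff; `externalConfigMapName` is a hypothetical helper and not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down mirrors of the types in the diff; only the fields needed here.
type Configuration struct {
	ExternalConfigMap string `json:"externalConfigMap,omitempty"`
}

type PrometheusSpec struct {
	ReserveDays int            `json:"reserveDays,omitempty"`
	Config      *Configuration `json:"config,omitempty"`
}

// externalConfigMapName decodes a spec and returns the referenced ConfigMap
// name, or "" when no external configuration is requested.
func externalConfigMapName(raw string) (string, error) {
	var spec PrometheusSpec
	if err := json.Unmarshal([]byte(raw), &spec); err != nil {
		return "", err
	}
	if spec.Config == nil {
		// "config" was omitted entirely: nothing to dereference.
		return "", nil
	}
	return spec.Config.ExternalConfigMap, nil
}

func main() {
	for _, raw := range []string{
		`{"reserveDays": 12}`,
		`{"config": {}}`,
		`{"config": {"externalConfigMap": "external-config"}}`,
	} {
		name, err := externalConfigMapName(raw)
		fmt.Printf("%q %v\n", name, err)
	}
}
```

Only the last input yields a non-empty name, which is the case the controller acts on.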

// GrafanaSpec is the desired state of grafana
21 changes: 21 additions & 0 deletions pkg/apis/pingcap/v1alpha1/zz_generated.deepcopy.go

18 changes: 18 additions & 0 deletions pkg/controller/configmap_control.go
@@ -38,6 +38,8 @@ type ConfigMapControlInterface interface {
	UpdateConfigMap(controller runtime.Object, cm *corev1.ConfigMap) (*corev1.ConfigMap, error)
	// DeleteConfigMap delete the given ConfigMap owned by the controller object
	DeleteConfigMap(controller runtime.Object, cm *corev1.ConfigMap) error
	// GetConfigMap get the ConfigMap by configMap name
	GetConfigMap(controller runtime.Object, cm *corev1.ConfigMap) (*corev1.ConfigMap, error)
}

type realConfigMapControl struct {
@@ -95,6 +97,11 @@ func (cc *realConfigMapControl) DeleteConfigMap(owner runtime.Object, cm *corev1
	return err
}

func (cc *realConfigMapControl) GetConfigMap(owner runtime.Object, cm *corev1.ConfigMap) (*corev1.ConfigMap, error) {
	existConfigMap, err := cc.kubeCli.CoreV1().ConfigMaps(cm.Namespace).Get(cm.Name, metav1.GetOptions{})
	return existConfigMap, err
}

func (cc *realConfigMapControl) recordConfigMapEvent(verb string, owner runtime.Object, cm *corev1.ConfigMap, err error) {
	kind := owner.GetObjectKind().GroupVersionKind().Kind
	var name string
@@ -124,6 +131,7 @@ func NewFakeConfigMapControl(cmInformer coreinformers.ConfigMapInformer) *FakeCo
		RequestTracker{},
		RequestTracker{},
		RequestTracker{},
		RequestTracker{},
	}
}

@@ -133,6 +141,7 @@ type FakeConfigMapControl struct {
	createConfigMapTracker RequestTracker
	updateConfigMapTracker RequestTracker
	deleteConfigMapTracker RequestTracker
	getConfigMapTracker    RequestTracker
}

// SetCreateConfigMapError sets the error attributes of createConfigMapTracker
@@ -181,4 +190,13 @@ func (cc *FakeConfigMapControl) DeleteConfigMap(_ runtime.Object, _ *corev1.Conf
	return nil
}

func (cc *FakeConfigMapControl) GetConfigMap(controller runtime.Object, cm *corev1.ConfigMap) (*corev1.ConfigMap, error) {
	defer cc.getConfigMapTracker.Inc()
	if cc.getConfigMapTracker.ErrorReady() {
		defer cc.getConfigMapTracker.Reset()
		return nil, cc.getConfigMapTracker.GetError()
	}
	return cm, nil
}
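
Editor's note: the fake `GetConfigMap` above follows an error-injection pattern: every call is counted, and a pre-set error is returned once and then cleared. The sketch below reproduces that control flow with a simplified, hypothetical `tracker` type standing in for `RequestTracker` and a plain string standing in for `*corev1.ConfigMap`; the real tracker may carry more state than is assumed here.

```go
package main

import (
	"errors"
	"fmt"
)

// errInjected is the error a test would ask the fake to return.
var errInjected = errors.New("injected")

// tracker is a hypothetical, simplified stand-in for RequestTracker.
type tracker struct {
	requests int
	err      error
}

func (t *tracker) Inc()             { t.requests++ }
func (t *tracker) SetError(e error) { t.err = e }
func (t *tracker) ErrorReady() bool { return t.err != nil }
func (t *tracker) GetError() error  { return t.err }
func (t *tracker) Reset()           { t.err = nil }

// fakeControl mirrors the shape of the fake GetConfigMap in the diff.
type fakeControl struct {
	getTracker tracker
}

// Get counts the call and returns the injected error once, if one is set.
func (f *fakeControl) Get(name string) (string, error) {
	defer f.getTracker.Inc()
	if f.getTracker.ErrorReady() {
		// The return value is evaluated before the deferred Reset runs,
		// so the caller still sees the injected error.
		defer f.getTracker.Reset()
		return "", f.getTracker.GetError()
	}
	return name, nil
}

func main() {
	f := &fakeControl{}
	f.getTracker.SetError(errInjected)
	_, err := f.Get("external-config")
	fmt.Println(err) // prints "injected"
	name, err := f.Get("external-config")
	fmt.Println(name, err, f.getTracker.requests) // prints "external-config <nil> 2"
}
```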

var _ ConfigMapControlInterface = &FakeConfigMapControl{}
26 changes: 25 additions & 1 deletion pkg/monitor/monitor/monitor_manager.go
@@ -15,14 +15,14 @@ package monitor

import (
	"fmt"

	"github.com/pingcap/tidb-operator/pkg/apis/pingcap/v1alpha1"
	informers "github.com/pingcap/tidb-operator/pkg/client/informers/externalversions"
	v1alpha1listers "github.com/pingcap/tidb-operator/pkg/client/listers/pingcap/v1alpha1"
	"github.com/pingcap/tidb-operator/pkg/controller"
	utildiscovery "github.com/pingcap/tidb-operator/pkg/util/discovery"
	corev1 "k8s.io/api/core/v1"
	rbac "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/discovery"
	discoverycachedmemory "k8s.io/client-go/discovery/cached/memory"
	kubeinformers "k8s.io/client-go/informers"
@@ -41,6 +41,7 @@ type MonitorManager struct {
	pvLister  corelisters.PersistentVolumeLister
	pvControl controller.PVControlInterface
	recorder  record.EventRecorder
	cmControl controller.ConfigMapControlInterface
}

const (
@@ -207,6 +208,29 @@ func (mm *MonitorManager) syncTidbMonitorConfig(tc *v1alpha1.TidbCluster, monito
	if err != nil {
		return nil, err
	}
	if monitor.Spec.Prometheus.Config != nil && len(monitor.Spec.Prometheus.Config.ExternalConfigMap) > 0 {
		externalCM, err := mm.cmControl.GetConfigMap(monitor, &corev1.ConfigMap{
			ObjectMeta: metav1.ObjectMeta{
				Name:      monitor.Spec.Prometheus.Config.ExternalConfigMap,
				Namespace: monitor.Namespace,
			},
		})
		if err != nil {
			klog.Errorf("tm[%s/%s]'s configMap failed to get,err: %v", monitor.Namespace, monitor.Spec.Prometheus.Config.ExternalConfigMap, err)
			return nil, err
		}
		if externalContent, ok := externalCM.Data["prometheus-config"]; ok {
			newCM.Data["prometheus-config"] = externalContent
		}

		if externalContent, ok := externalCM.Data["dashboard-config"]; ok {
			newCM.Data["dashboard-config"] = externalContent
		}

		if externalContent, ok := externalCM.Data["start-command"]; ok {
			newCM.Data["start-command"] = externalContent
		}
	}
	return mm.typedControl.CreateOrUpdateConfigMap(monitor, newCM)
}
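
Editor's note: the override step in `syncTidbMonitorConfig` copies three known keys from the external ConfigMap over the generated one and leaves everything else alone. The sketch below isolates that behaviour on plain maps; `applyExternal` and `overridableKeys` are illustrative names rather than identifiers from the PR.

```go
package main

import "fmt"

// overridableKeys lists the keys the diff copies from the external ConfigMap.
var overridableKeys = []string{"prometheus-config", "dashboard-config", "start-command"}

// applyExternal overwrites generated entries with external ones for the
// known keys only; keys absent from external keep their generated value.
func applyExternal(generated, external map[string]string) {
	for _, key := range overridableKeys {
		if content, ok := external[key]; ok {
			generated[key] = content
		}
	}
}

func main() {
	generated := map[string]string{
		"prometheus-config": "generated scrape config",
		"dashboard-config":  "generated dashboard provider",
	}
	external := map[string]string{
		"prometheus-config": "custom scrape config",
		"unrelated-key":     "ignored",
	}
	applyExternal(generated, external)
	fmt.Println(generated["prometheus-config"]) // prints "custom scrape config"
	fmt.Println(generated["dashboard-config"])  // prints "generated dashboard provider"
	_, copied := generated["unrelated-key"]
	fmt.Println(copied) // prints "false"
}
```

Note that a key which is present but empty in the external ConfigMap still overwrites the generated value, since only presence is checked.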
