Commit 7136ca2

xeniape, maltesander, and Techassi authored
chore: ensure metrics are correctly exposed (#701)
* chore: ensure metrics are correctly exposed
* add changelog entry
* Update rust/operator-binary/src/crd/mod.rs
  Co-authored-by: Techassi <git@techassi.dev>
* Update rust/operator-binary/src/hbase_controller.rs
  Co-authored-by: Techassi <git@techassi.dev>
* Apply suggestions from code review
  Co-authored-by: Techassi <git@techassi.dev>
* add metrics service to tls cert
* clean up port retrieval methods
* pre commit
* Apply suggestions from code review
  Co-authored-by: Techassi <git@techassi.dev>
* use boolean for https instead of hbase cluster

---------

Co-authored-by: Malte Sander <malte.sander.it@gmail.com>
Co-authored-by: Techassi <git@techassi.dev>
1 parent 232e6be commit 7136ca2

6 files changed: +212 -155 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -10,15 +10,19 @@
 - `EOS_CHECK_MODE` (`--eos-check-mode`) to set the EoS check mode. Currently, only "offline" is supported.
 - `EOS_INTERVAL` (`--eos-interval`) to set the interval in which the operator checks if it is EoS.
 - `EOS_DISABLED` (`--eos-disabled`) to disable the EoS checker completely.
+- Add `metrics` Services ([#701]).

 ### Changed

 - Bump stackable-operator to `0.100.1` ([#705]).
 - Changed env-vars to be consistent with config-utils in the entrypoint script ([#700]).
+- BREAKING: The `prometheus.io/scrape` label moved from the `headless` Service to the `metrics` Service, which
+  uses `metrics` as the port name instead of the previous `ui-http`/`ui-https` port name ([#701]).

 [#691]: https://github.com/stackabletech/hbase-operator/pull/691
 [#697]: https://github.com/stackabletech/hbase-operator/pull/697
 [#700]: https://github.com/stackabletech/hbase-operator/pull/700
+[#701]: https://github.com/stackabletech/hbase-operator/pull/701
 [#705]: https://github.com/stackabletech/hbase-operator/pull/705

 ## [25.7.0] - 2025-07-23

docs/modules/hbase/pages/usage-guide/monitoring.adoc

Lines changed: 17 additions & 2 deletions
@@ -6,5 +6,20 @@ See xref:operators:monitoring.adoc[] for more details.

 Starting with HBase 2.6 the URL for Prometheus metrics has changed.
 This is because HBase offers now a built-in endpoint for this purpose.
-This endpoint is available from the UI service.
-For example, in the case of the master service, the URL is `http://<master-service>:16010/prometheus`.
+This endpoint is available from the `metrics` Services.
+For example, in the case of the master Service, the URL is `http://<hbasecluster-name>-master-<rolegroup-name>-metrics:16010/prometheus`.
+
+== Authentication when using TLS
+
+HBase exposes metrics through the same port as their web UI. Hence, when configuring HBase with TLS the metrics are also secured by TLS,
+and the clients scraping the metrics endpoint need to authenticate against it. This could for example be accomplished by utilizing mTLS
+between Kubernetes Pods with the xref:home:secret-operator:index.adoc[Secret Operator].
+
+When using Prometheus `ServiceMonitor` for scraping, the `address` label needs relabeling to use the `headless` Service instead of the
+`metrics` Service. This is because by default Prometheus targets the Pod IPs as endpoints, but since the Pod IPs are not
+part of the certificate, the authentication will fail. Instead, the FQDN of the Pods, which can be added to the certificate, is used, but
+this FQDN is only available through the `headless` Service.
+
+A more detailed explanation can be found in the xref:home:nifi:usage_guide/monitoring.adoc[NiFi Operator Monitoring Docs] with a similar situation
+and an example of a Prometheus `ServiceMonitor` configured for TLS in the
+https://github.com/stackabletech/demos/blob/main/stacks/monitoring/prometheus-service-monitors.yaml[Monitoring Stack{external-link-icon}^].
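
The URL pattern described in the updated monitoring page can be illustrated with a small Rust sketch. This is not part of the commit. The `Role` enum, the `service_segment` helper, and the `metrics_url` function are hypothetical names introduced here. The `regionserver` and `restserver` Service name segments and the switch to the `https` scheme under TLS are assumptions derived from the docs text, which only shows the plain HTTP master example.

```rust
/// Hypothetical stand-in for the operator's role type (not the real `HbaseRole`).
#[derive(Clone, Copy)]
pub enum Role {
    Master,
    RegionServer,
    RestServer,
}

impl Role {
    /// Role segment of the Service name.
    /// Only "master" appears in the docs; the other two are assumptions.
    fn service_segment(self) -> &'static str {
        match self {
            Role::Master => "master",
            Role::RegionServer => "regionserver",
            Role::RestServer => "restserver",
        }
    }

    /// Metrics port per role, matching the `*_METRICS_PORT` constants in the diff.
    fn metrics_port(self) -> u16 {
        match self {
            Role::Master => 16010,
            Role::RegionServer => 16030,
            Role::RestServer => 8085,
        }
    }
}

/// Builds `<scheme>://<cluster>-<role>-<rolegroup>-metrics:<port>/prometheus`.
/// Assumption: with TLS enabled the same port is served over HTTPS.
pub fn metrics_url(cluster: &str, role: Role, rolegroup: &str, tls: bool) -> String {
    let scheme = if tls { "https" } else { "http" };
    format!(
        "{}://{}-{}-{}-metrics:{}/prometheus",
        scheme,
        cluster,
        role.service_segment(),
        rolegroup,
        role.metrics_port()
    )
}
```

For a cluster named `simple-hbase` with a `default` rolegroup (both made-up names), `metrics_url("simple-hbase", Role::Master, "default", false)` yields the same shape as the URL quoted in the docs.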

rust/operator-binary/src/crd/mod.rs

Lines changed: 72 additions & 50 deletions
@@ -65,20 +65,24 @@ pub const SSL_CLIENT_XML: &str = "ssl-client.xml";
 pub const HBASE_CLUSTER_DISTRIBUTED: &str = "hbase.cluster.distributed";
 pub const HBASE_ROOTDIR: &str = "hbase.rootdir";

-pub const HBASE_UI_PORT_NAME_HTTP: &str = "ui-http";
-pub const HBASE_UI_PORT_NAME_HTTPS: &str = "ui-https";
-pub const HBASE_REST_PORT_NAME_HTTP: &str = "rest-http";
-pub const HBASE_REST_PORT_NAME_HTTPS: &str = "rest-https";
+const HBASE_UI_PORT_NAME_HTTP: &str = "ui-http";
+const HBASE_UI_PORT_NAME_HTTPS: &str = "ui-https";
+const HBASE_REST_PORT_NAME_HTTP: &str = "rest-http";
+const HBASE_REST_PORT_NAME_HTTPS: &str = "rest-https";
+const HBASE_METRICS_PORT_NAME: &str = "metrics";

 pub const HBASE_MASTER_PORT: u16 = 16000;
 // HBase always uses 16010, regardless of http or https. On 2024-01-17 we decided in Arch-meeting that we want to stick
 // the port numbers to what the product is doing, so we get the least surprise for users - even when this means we have
 // inconsistency between Stackable products.
 pub const HBASE_MASTER_UI_PORT: u16 = 16010;
+pub const HBASE_MASTER_METRICS_PORT: u16 = 16010;
 pub const HBASE_REGIONSERVER_PORT: u16 = 16020;
 pub const HBASE_REGIONSERVER_UI_PORT: u16 = 16030;
+pub const HBASE_REGIONSERVER_METRICS_PORT: u16 = 16030;
 pub const HBASE_REST_PORT: u16 = 8080;
 pub const HBASE_REST_UI_PORT: u16 = 8085;
+pub const HBASE_REST_METRICS_PORT: u16 = 8085;
 pub const LISTENER_VOLUME_NAME: &str = "listener";
 pub const LISTENER_VOLUME_DIR: &str = "/stackable/listener";

@@ -513,52 +517,6 @@ impl v1alpha1::HbaseCluster {
             .as_ref()
             .map(|a| a.tls_secret_class.clone())
     }
-
-    /// Returns required port name and port number tuples depending on the role.
-    /// Hbase versions 2.6.* will have two ports for each role. The metrics are available over the
-    /// UI port.
-    pub fn ports(&self, role: &HbaseRole) -> Vec<(String, u16)> {
-        match role {
-            HbaseRole::Master => vec![
-                ("master".to_string(), HBASE_MASTER_PORT),
-                (self.ui_port_name(), HBASE_MASTER_UI_PORT),
-            ],
-            HbaseRole::RegionServer => vec![
-                ("regionserver".to_string(), HBASE_REGIONSERVER_PORT),
-                (self.ui_port_name(), HBASE_REGIONSERVER_UI_PORT),
-            ],
-            HbaseRole::RestServer => vec![
-                (
-                    if self.has_https_enabled() {
-                        HBASE_REST_PORT_NAME_HTTPS
-                    } else {
-                        HBASE_REST_PORT_NAME_HTTP
-                    }
-                    .to_string(),
-                    HBASE_REST_PORT,
-                ),
-                (self.ui_port_name(), HBASE_REST_UI_PORT),
-            ],
-        }
-    }
-
-    pub fn service_port(&self, role: &HbaseRole) -> u16 {
-        match role {
-            HbaseRole::Master => HBASE_MASTER_PORT,
-            HbaseRole::RegionServer => HBASE_REGIONSERVER_PORT,
-            HbaseRole::RestServer => HBASE_REST_PORT,
-        }
-    }
-
-    /// Name of the port used by the Web UI, which depends on HTTPS usage
-    pub fn ui_port_name(&self) -> String {
-        if self.has_https_enabled() {
-            HBASE_UI_PORT_NAME_HTTPS
-        } else {
-            HBASE_UI_PORT_NAME_HTTP
-        }
-        .to_string()
-    }
 }

 pub fn merged_env(rolegroup_config: Option<&BTreeMap<String, String>>) -> Vec<EnvVar> {
@@ -759,6 +717,70 @@ impl HbaseRole {
         };
         Ok(pvc)
     }
+
+    /// Returns required port name and port number tuples depending on the role.
+    ///
+    /// Hbase versions 2.6.* will have two ports for each role. The metrics are available on the
+    /// UI port.
+    pub fn ports(&self, hbase: &v1alpha1::HbaseCluster) -> Vec<(String, u16)> {
+        vec![
+            (self.data_port_name(hbase), self.data_port()),
+            (
+                Self::ui_port_name(hbase.has_https_enabled()).to_string(),
+                self.ui_port(),
+            ),
+        ]
+    }
+
+    pub fn data_port(&self) -> u16 {
+        match self {
+            HbaseRole::Master => HBASE_MASTER_PORT,
+            HbaseRole::RegionServer => HBASE_REGIONSERVER_PORT,
+            HbaseRole::RestServer => HBASE_REST_PORT,
+        }
+    }
+
+    pub fn data_port_name(&self, hbase: &v1alpha1::HbaseCluster) -> String {
+        match self {
+            HbaseRole::Master | HbaseRole::RegionServer => self.to_string(),
+            HbaseRole::RestServer => {
+                if hbase.has_https_enabled() {
+                    HBASE_REST_PORT_NAME_HTTPS.to_owned()
+                } else {
+                    HBASE_REST_PORT_NAME_HTTP.to_owned()
+                }
+            }
+        }
+    }
+
+    pub fn ui_port(&self) -> u16 {
+        match self {
+            HbaseRole::Master => HBASE_MASTER_UI_PORT,
+            HbaseRole::RegionServer => HBASE_REGIONSERVER_UI_PORT,
+            HbaseRole::RestServer => HBASE_REST_UI_PORT,
+        }
+    }
+
+    /// Name of the port used by the Web UI, which depends on HTTPS usage
+    pub fn ui_port_name(has_https_enabled: bool) -> &'static str {
+        if has_https_enabled {
+            HBASE_UI_PORT_NAME_HTTPS
+        } else {
+            HBASE_UI_PORT_NAME_HTTP
+        }
+    }
+
+    pub fn metrics_port(&self) -> u16 {
+        match self {
+            HbaseRole::Master => HBASE_MASTER_METRICS_PORT,
+            HbaseRole::RegionServer => HBASE_REGIONSERVER_METRICS_PORT,
+            HbaseRole::RestServer => HBASE_REST_METRICS_PORT,
+        }
+    }
+
+    pub fn metrics_port_name() -> &'static str {
+        HBASE_METRICS_PORT_NAME
+    }
 }

 fn default_resources(role: &HbaseRole) -> ResourcesFragment<HbaseStorageConfig, NoRuntimeLimits> {
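
To make the refactored port helpers easier to follow outside the operator crate, here is a self-contained sketch. It is a simplification under stated assumptions, not the code from this commit. It passes a plain `https: bool` where the real `ports` and `data_port_name` take `&v1alpha1::HbaseCluster`. It hard-codes the `master` and `regionserver` port names that the real code derives via `self.to_string()`. The `metrics_port_entry` helper is hypothetical and does not exist in the diff.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HbaseRole {
    Master,
    RegionServer,
    RestServer,
}

impl HbaseRole {
    /// Main (data) port of the role.
    pub fn data_port(self) -> u16 {
        match self {
            HbaseRole::Master => 16000,
            HbaseRole::RegionServer => 16020,
            HbaseRole::RestServer => 8080,
        }
    }

    /// Only the REST server's data port name depends on HTTPS.
    pub fn data_port_name(self, https: bool) -> &'static str {
        match self {
            HbaseRole::Master => "master",
            HbaseRole::RegionServer => "regionserver",
            HbaseRole::RestServer if https => "rest-https",
            HbaseRole::RestServer => "rest-http",
        }
    }

    /// Web UI port; HBase keeps the same number for HTTP and HTTPS.
    pub fn ui_port(self) -> u16 {
        match self {
            HbaseRole::Master => 16010,
            HbaseRole::RegionServer => 16030,
            HbaseRole::RestServer => 8085,
        }
    }

    pub fn ui_port_name(https: bool) -> &'static str {
        if https { "ui-https" } else { "ui-http" }
    }

    /// Metrics are served on the UI port, so the numbers are identical.
    pub fn metrics_port(self) -> u16 {
        self.ui_port()
    }

    /// Port name and number pairs, in the same order as the diff's `ports`.
    pub fn ports(self, https: bool) -> Vec<(String, u16)> {
        vec![
            (self.data_port_name(https).to_string(), self.data_port()),
            (Self::ui_port_name(https).to_string(), self.ui_port()),
        ]
    }

    /// Hypothetical helper: the single port a `metrics` Service would expose.
    pub fn metrics_port_entry(self) -> (&'static str, u16) {
        ("metrics", self.metrics_port())
    }
}
```

The sketch shows why the `metrics` port name can be a fixed string: unlike `ui-http`/`ui-https`, it does not change with the TLS setting, which is the renaming the CHANGELOG entry flags as breaking.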
