-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TLS: only support TLS 1.2 / 1.3 #1687
Conversation
I believe we said that we'd stick purely with Go defaults. |
I think that only tls 1.3 might be too far but we should at least disable 1.0 and 1.1 as it won't achieve a good score at ssllabs and isn't supported by browsers anymore (or won't be soon). |
We can also argue that config.PreferServerCipherSuites = true should be set (but that is less critical) |
Well, we can leave this as it but I would not be surprise that it does not pass security audit. |
That's a matter for Go then, fine tuning TLS settings over time is not something you want to get into. |
Well I don't think so. If you look at other software that do support HTTPS they all support fine tuning ciphers and TLS version. Why would we be the only ones not to? |
One point about at least tls 1.2 is that probably lots of OS-provided clients (like curl in red hat) does not support tls 1.3 widely |
Because that way lies madness, there's already over 10 different configurables and that'll only continue to grow over time and in ways that you can't forsee config wise. This is what was discussed back when we first agreed to do tls, that we'll stick with Go defaults with no configurability. If you have a setup that requires a more niche setup, you can use a reverse proxy. |
I'd say we support 1.2 and 1.3. Re/ options for 1.2: Not a strong opinion. But I think it would be fine to just hardcode whatever is currently considered best practice. I'm not really concerned about the maintenance overhead. Yes, this stuff changes but so do kernel metric, procfs format and others. So I'll merge this if @SuperQ agrees. |
TLS 1.0 and 1.1 are deprecated by major vendors (e.g. browsers). Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Adapted, with commit text: TLS 1.0 and 1.1 are deprecated by major vendors (e.g. browsers). |
I agree with Brian, we don't want to go deep into the madness of all TLS options. We do want to go with Golang defaults to avoid too many flags and the support issues with users flipping all the flags because they can. But setting a couple of sane defaults is probably worthwhile. And I agree with @discordianfish, hardcoding these is fine, especially as a first iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Can you add a note in the README about the minimum TLS version? |
My question is how do we decide what "sane defaults" are? The plan up to this point was to delegate that decision entirely to Go, figuring out what those settings should be on an ongoing basis (and when going via a library which users then need to pull in) is likely to end up being non-trivial. |
Don't you agree that tls 1.0 should be removed and that it is not go responsibility? |
That's not really the point. The point is that if we put ourselves in a position where we make security decisions like this, then we need to keep on making them on a regular and consistent basis, ensure that such decisions get promptly released into ~16 binaries implemented in 2 languages plus deal with requests to change them from users - whether due to general concern by users or the particulars of their setup. Delegating to a 3rd party avoids this. That is something that we could do, but it is not what was agreed and discussed. Thus we'd need sort out the details of all that before going down this path as it's not merely a matter of one-off hardcoding some settings. |
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
@SuperQ updated readme |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We still need to resolve the question of how us using non-default settings would be managed, as this was not the path you proposed and thus that was discussed back at the dev summit. For example if a PR comes in to me in common changing the hardcoding, what should I do? |
We are only hardcoding TLS version here ; not ciphers etc. |
I don't really see that making a difference, as this is a question of an overall policy framework. E.g. why not disable 1.2? When would 1.2 be disabled? For what reasons would it be acceptable to enable 1.0/1.1? |
Qualys refers to this: https://blog.qualys.com/ssllabs/2018/11/19/grade-change-for-tls-1-0-and-tls-1-1-protocols which refers to this: http://www.rfc-editor.org/info/rfc7525 |
To be clear I'm looking for a policy, not the reasoning behind the changes in this particular PR. I'm aware of sslabs, that's not what I'm asking about. I worried more about why should we aim for this particular A rating, versus any other rating? That RFC is from 5 years ago, how do we determine what RFCs to follow and on what timeline? What browsers are we okay with breaking? Which aren't we okay with breaking? What about past releases of Prometheus? Why should we care only about browsers, and not programmatic use? For example this PR would break pushgateway support in our client libraries, as we officially support some older versions of Python and Java that only do up to TLS 1.0. There'll also be user code talking to the Prometheus and Alertmanager APIs. As you can see this is a quite nuanced issue, and I've yet to have answers to my above questions that'd help me as future maintainer of this code once it moves to common evaluate inbound PRs in the years to come. I'm not willing to address this on a PR-to-PR basis as users look to change the various hardcoded 10+ TLS server configurables (nor do I think it's sane to have an ever-shifting TLS configuration from a general project security stance standpoint), there needs to be a plausible coherent policy established in advance. I'd suggest starting a doc if you want to go down this path.
@discordianfish There is active disagreement on this issue, merging this is not kosher. I ask that you promptly correct this oversight. |
My position is that this still is part of the node-exporter. If we extract this package and you take over ownership, you can establish whatever process you want and change the TLS settings as you see fit. |
As far as I am concerned, disabling TLS1.0 and 1.1 is a no brainer. I have seen different customers just asking for everything tls1.1 and 1.0 go away; and I assume that those very same customers are the same who would like https in the prometheus server too, instead of only at the LB level. While that does not help with a general policy, I would agree to let that setting defaulting to TLS1.2 but enable more flexibility with the TLS configuration, enabling setting the min version, the ciphers, etc. So maybe we need to take that discussion on the developers mailing list to reach a new concensus about this? I am also glad to help you with maintaining the https package when we take it to common, and becoming one of the maintainers.
As this is disabled by default, people with old software can simply not enable TLS and use weaker reverse proxies as they do now. We do not "break" anyone with this.
I guess, yes, I can work on a document. At the same time, I do not want to hold the 1.0.0 release of the node_exporter for this. |
* The netdev collector CLI argument `--collector.netdev.ignored-devices` was renamed to `--collector.netdev.device-blacklist` in order to conform with the systemd collector. #1279 * The label named `state` on `node_systemd_service_restart_total` metrics was changed to `name` to better describe the metric. #1393 * Refactoring of the mdadm collector changes several metrics - `node_md_disks_active` is removed - `node_md_disks` now has a `state` label for "fail", "spare", "active" disks. - `node_md_is_active` is replaced by `node_md_state` with a state set of "active", "inactive", "recovering", "resync". * Additional label `mountaddr` added to NFS device metrics to distinguish mounts from the same URL, but different IP addresses. #1417 * Metrics node_cpu_scaling_frequency_min_hrts and node_cpu_scaling_frequency_max_hrts of the cpufreq collector were renamed to node_cpu_scaling_frequency_min_hertz and node_cpu_scaling_frequency_max_hertz. #1510 * Collectors that are enabled, but are unable to find data to collect, now return 0 for `node_scrape_collector_success`. * [CHANGE] Add `--collector.netdev.device-whitelist`. #1279 * [CHANGE] Ignore iso9600 filesystem on Linux #1355 * [CHANGE] Refactor mdadm collector #1403 * [CHANGE] Add `mountaddr` label to NFS metrics. #1417 * [CHANGE] Don't count empty collectors as success. #1613 * [FEATURE] New flag to disable default collectors #1276 * [FEATURE] Add experimental TLS support #1277, #1687, #1695 * [FEATURE] Add collector for Power Supply Class #1280 * [FEATURE] Add new schedstat collector #1389 * [FEATURE] Add FreeBSD zfs support #1394 * [FEATURE] Add uname support for Darwin and OpenBSD #1433 * [FEATURE] Add new metric node_cpu_info #1489 * [FEATURE] Add new thermal_zone collector #1425 * [FEATURE] Add new cooling_device metrics to thermal zone collector #1445 * [FEATURE] Add swap usage on darwin #1508 * [FEATURE] Add Btrfs collector #1512 * [FEATURE] Add RAPL collector #1523 * [FEATURE] Add new softnet collector #1576 * [FEATURE] Add new udp_queues collector #1503 * [FEATURE] Add basic authentication #1673 * [ENHANCEMENT] Log pid when there is a problem reading the process stats #1341 * [ENHANCEMENT] Collect InfiniBand port state and physical state #1357 * [ENHANCEMENT] Include additional XFS runtime statistics. #1423 * [ENHANCEMENT] Report non-fatal collection errors in the exporter metric. #1439 * [ENHANCEMENT] Expose IPVS firewall mark as a label #1455 * [ENHANCEMENT] Add check for systemd version before attempting to query certain metrics. #1413 * [ENHANCEMENT] Add a flag to adjust mount timeout #1486 * [ENHANCEMENT] Add new counters for flush requests in Linux 5.5 #1548 * [ENHANCEMENT] Add metrics and tests for UDP receive and send buffer errors #1534 * [ENHANCEMENT] The sockstat collector now exposes IPv6 statistics in addition to the existing IPv4 support. #1552 * [ENHANCEMENT] Add infiniband info metric #1563 * [ENHANCEMENT] Add unix socket support for supervisord collector #1592 * [ENHANCEMENT] Implement loadavg on all BSDs without cgo #1584 * [ENHANCEMENT] Add model_name and stepping to node_cpu_info metric #1617 * [ENHANCEMENT] Add `--collector.perf.cpus` to allow setting the CPU list for perf stats. #1561 * [ENHANCEMENT] Add metrics for IO errors and retires on Darwin. #1636 * [ENHANCEMENT] Add perf tracepoint collection flag #1664 * [ENHANCEMENT] ZFS: read contents of objset file #1632 * [ENHANCEMENT] Linux CPU: Cache CPU metrics to make them monotonically increasing #1711 * [BUGFIX] Read /proc/net files with a single read syscall #1380 * [BUGFIX] Renamed label `state` to `name` on `node_systemd_service_restart_total`. #1393 * [BUGFIX] Fix netdev nil reference on Darwin #1414 * [BUGFIX] Strip path.rootfs from mountpoint labels #1421 * [BUGFIX] Fix seconds reported by schedstat #1426 * [BUGFIX] Fix empty string in path.rootfs #1464 * [BUGFIX] Fix typo in cpufreq metric names #1510 * [BUGFIX] Read /proc/stat in one syscall #1538 * [BUGFIX] Fix OpenBSD cache memory information #1542 * [BUGFIX] Refactor textfile collector to avoid looping defer #1549 * [BUGFIX] Fix network speed math #1580 * [BUGFIX] collector/systemd: use regexp to extract systemd version #1647 * [BUGFIX] Fix initialization in perf collector when using multiple CPUs #1665 * [BUGFIX] Fix accidentally empty lines in meminfo_linux #1671 Signed-off-by: Ben Kochie <superq@gmail.com>
* The netdev collector CLI argument `--collector.netdev.ignored-devices` was renamed to `--collector.netdev.device-blacklist` in order to conform with the systemd collector. prometheus#1279 * The label named `state` on `node_systemd_service_restart_total` metrics was changed to `name` to better describe the metric. prometheus#1393 * Refactoring of the mdadm collector changes several metrics - `node_md_disks_active` is removed - `node_md_disks` now has a `state` label for "fail", "spare", "active" disks. - `node_md_is_active` is replaced by `node_md_state` with a state set of "active", "inactive", "recovering", "resync". * Additional label `mountaddr` added to NFS device metrics to distinguish mounts from the same URL, but different IP addresses. prometheus#1417 * Metrics node_cpu_scaling_frequency_min_hrts and node_cpu_scaling_frequency_max_hrts of the cpufreq collector were renamed to node_cpu_scaling_frequency_min_hertz and node_cpu_scaling_frequency_max_hertz. prometheus#1510 * Collectors that are enabled, but are unable to find data to collect, now return 0 for `node_scrape_collector_success`. * [CHANGE] Add `--collector.netdev.device-whitelist`. prometheus#1279 * [CHANGE] Ignore iso9600 filesystem on Linux prometheus#1355 * [CHANGE] Refactor mdadm collector prometheus#1403 * [CHANGE] Add `mountaddr` label to NFS metrics. prometheus#1417 * [CHANGE] Don't count empty collectors as success. prometheus#1613 * [FEATURE] New flag to disable default collectors prometheus#1276 * [FEATURE] Add experimental TLS support prometheus#1277, prometheus#1687, prometheus#1695 * [FEATURE] Add collector for Power Supply Class prometheus#1280 * [FEATURE] Add new schedstat collector prometheus#1389 * [FEATURE] Add FreeBSD zfs support prometheus#1394 * [FEATURE] Add uname support for Darwin and OpenBSD prometheus#1433 * [FEATURE] Add new metric node_cpu_info prometheus#1489 * [FEATURE] Add new thermal_zone collector prometheus#1425 * [FEATURE] Add new cooling_device metrics to thermal zone collector prometheus#1445 * [FEATURE] Add swap usage on darwin prometheus#1508 * [FEATURE] Add Btrfs collector prometheus#1512 * [FEATURE] Add RAPL collector prometheus#1523 * [FEATURE] Add new softnet collector prometheus#1576 * [FEATURE] Add new udp_queues collector prometheus#1503 * [FEATURE] Add basic authentication prometheus#1673 * [ENHANCEMENT] Log pid when there is a problem reading the process stats prometheus#1341 * [ENHANCEMENT] Collect InfiniBand port state and physical state prometheus#1357 * [ENHANCEMENT] Include additional XFS runtime statistics. prometheus#1423 * [ENHANCEMENT] Report non-fatal collection errors in the exporter metric. prometheus#1439 * [ENHANCEMENT] Expose IPVS firewall mark as a label prometheus#1455 * [ENHANCEMENT] Add check for systemd version before attempting to query certain metrics. prometheus#1413 * [ENHANCEMENT] Add a flag to adjust mount timeout prometheus#1486 * [ENHANCEMENT] Add new counters for flush requests in Linux 5.5 prometheus#1548 * [ENHANCEMENT] Add metrics and tests for UDP receive and send buffer errors prometheus#1534 * [ENHANCEMENT] The sockstat collector now exposes IPv6 statistics in addition to the existing IPv4 support. prometheus#1552 * [ENHANCEMENT] Add infiniband info metric prometheus#1563 * [ENHANCEMENT] Add unix socket support for supervisord collector prometheus#1592 * [ENHANCEMENT] Implement loadavg on all BSDs without cgo prometheus#1584 * [ENHANCEMENT] Add model_name and stepping to node_cpu_info metric prometheus#1617 * [ENHANCEMENT] Add `--collector.perf.cpus` to allow setting the CPU list for perf stats. prometheus#1561 * [ENHANCEMENT] Add metrics for IO errors and retires on Darwin. prometheus#1636 * [ENHANCEMENT] Add perf tracepoint collection flag prometheus#1664 * [ENHANCEMENT] ZFS: read contents of objset file prometheus#1632 * [ENHANCEMENT] Linux CPU: Cache CPU metrics to make them monotonically increasing prometheus#1711 * [BUGFIX] Read /proc/net files with a single read syscall prometheus#1380 * [BUGFIX] Renamed label `state` to `name` on `node_systemd_service_restart_total`. prometheus#1393 * [BUGFIX] Fix netdev nil reference on Darwin prometheus#1414 * [BUGFIX] Strip path.rootfs from mountpoint labels prometheus#1421 * [BUGFIX] Fix seconds reported by schedstat prometheus#1426 * [BUGFIX] Fix empty string in path.rootfs prometheus#1464 * [BUGFIX] Fix typo in cpufreq metric names prometheus#1510 * [BUGFIX] Read /proc/stat in one syscall prometheus#1538 * [BUGFIX] Fix OpenBSD cache memory information prometheus#1542 * [BUGFIX] Refactor textfile collector to avoid looping defer prometheus#1549 * [BUGFIX] Fix network speed math prometheus#1580 * [BUGFIX] collector/systemd: use regexp to extract systemd version prometheus#1647 * [BUGFIX] Fix initialization in perf collector when using multiple CPUs prometheus#1665 * [BUGFIX] Fix accidentally empty lines in meminfo_linux prometheus#1671 Signed-off-by: Ben Kochie <superq@gmail.com>
* The netdev collector CLI argument `--collector.netdev.ignored-devices` was renamed to `--collector.netdev.device-blacklist` in order to conform with the systemd collector. prometheus#1279 * The label named `state` on `node_systemd_service_restart_total` metrics was changed to `name` to better describe the metric. prometheus#1393 * Refactoring of the mdadm collector changes several metrics - `node_md_disks_active` is removed - `node_md_disks` now has a `state` label for "fail", "spare", "active" disks. - `node_md_is_active` is replaced by `node_md_state` with a state set of "active", "inactive", "recovering", "resync". * Additional label `mountaddr` added to NFS device metrics to distinguish mounts from the same URL, but different IP addresses. prometheus#1417 * Metrics node_cpu_scaling_frequency_min_hrts and node_cpu_scaling_frequency_max_hrts of the cpufreq collector were renamed to node_cpu_scaling_frequency_min_hertz and node_cpu_scaling_frequency_max_hertz. prometheus#1510 * Collectors that are enabled, but are unable to find data to collect, now return 0 for `node_scrape_collector_success`. * [CHANGE] Add `--collector.netdev.device-whitelist`. prometheus#1279 * [CHANGE] Ignore iso9600 filesystem on Linux prometheus#1355 * [CHANGE] Refactor mdadm collector prometheus#1403 * [CHANGE] Add `mountaddr` label to NFS metrics. prometheus#1417 * [CHANGE] Don't count empty collectors as success. prometheus#1613 * [FEATURE] New flag to disable default collectors prometheus#1276 * [FEATURE] Add experimental TLS support prometheus#1277, prometheus#1687, prometheus#1695 * [FEATURE] Add collector for Power Supply Class prometheus#1280 * [FEATURE] Add new schedstat collector prometheus#1389 * [FEATURE] Add FreeBSD zfs support prometheus#1394 * [FEATURE] Add uname support for Darwin and OpenBSD prometheus#1433 * [FEATURE] Add new metric node_cpu_info prometheus#1489 * [FEATURE] Add new thermal_zone collector prometheus#1425 * [FEATURE] Add new cooling_device metrics to thermal zone collector prometheus#1445 * [FEATURE] Add swap usage on darwin prometheus#1508 * [FEATURE] Add Btrfs collector prometheus#1512 * [FEATURE] Add RAPL collector prometheus#1523 * [FEATURE] Add new softnet collector prometheus#1576 * [FEATURE] Add new udp_queues collector prometheus#1503 * [FEATURE] Add basic authentication prometheus#1673 * [ENHANCEMENT] Log pid when there is a problem reading the process stats prometheus#1341 * [ENHANCEMENT] Collect InfiniBand port state and physical state prometheus#1357 * [ENHANCEMENT] Include additional XFS runtime statistics. prometheus#1423 * [ENHANCEMENT] Report non-fatal collection errors in the exporter metric. prometheus#1439 * [ENHANCEMENT] Expose IPVS firewall mark as a label prometheus#1455 * [ENHANCEMENT] Add check for systemd version before attempting to query certain metrics. prometheus#1413 * [ENHANCEMENT] Add a flag to adjust mount timeout prometheus#1486 * [ENHANCEMENT] Add new counters for flush requests in Linux 5.5 prometheus#1548 * [ENHANCEMENT] Add metrics and tests for UDP receive and send buffer errors prometheus#1534 * [ENHANCEMENT] The sockstat collector now exposes IPv6 statistics in addition to the existing IPv4 support. prometheus#1552 * [ENHANCEMENT] Add infiniband info metric prometheus#1563 * [ENHANCEMENT] Add unix socket support for supervisord collector prometheus#1592 * [ENHANCEMENT] Implement loadavg on all BSDs without cgo prometheus#1584 * [ENHANCEMENT] Add model_name and stepping to node_cpu_info metric prometheus#1617 * [ENHANCEMENT] Add `--collector.perf.cpus` to allow setting the CPU list for perf stats. prometheus#1561 * [ENHANCEMENT] Add metrics for IO errors and retires on Darwin. prometheus#1636 * [ENHANCEMENT] Add perf tracepoint collection flag prometheus#1664 * [ENHANCEMENT] ZFS: read contents of objset file prometheus#1632 * [ENHANCEMENT] Linux CPU: Cache CPU metrics to make them monotonically increasing prometheus#1711 * [BUGFIX] Read /proc/net files with a single read syscall prometheus#1380 * [BUGFIX] Renamed label `state` to `name` on `node_systemd_service_restart_total`. prometheus#1393 * [BUGFIX] Fix netdev nil reference on Darwin prometheus#1414 * [BUGFIX] Strip path.rootfs from mountpoint labels prometheus#1421 * [BUGFIX] Fix seconds reported by schedstat prometheus#1426 * [BUGFIX] Fix empty string in path.rootfs prometheus#1464 * [BUGFIX] Fix typo in cpufreq metric names prometheus#1510 * [BUGFIX] Read /proc/stat in one syscall prometheus#1538 * [BUGFIX] Fix OpenBSD cache memory information prometheus#1542 * [BUGFIX] Refactor textfile collector to avoid looping defer prometheus#1549 * [BUGFIX] Fix network speed math prometheus#1580 * [BUGFIX] collector/systemd: use regexp to extract systemd version prometheus#1647 * [BUGFIX] Fix initialization in perf collector when using multiple CPUs prometheus#1665 * [BUGFIX] Fix accidentally empty lines in meminfo_linux prometheus#1671 Signed-off-by: Ben Kochie <superq@gmail.com>
TLS 1.0 and 1.1 are considered unsecure, while TLS 1.2 would imply that
we add more configuration variables, e.g. to pick cipher suites. TLS1.3
means that cipher suites are not configurable anyway.
Last releases of Prometheus support TLS 1.3 (since Prometheus 2.13).
Signed-off-by: Julien Pivotto roidelapluie@inuits.eu