[Bug]: Web graphs are empty if client's system time is in advance compared to node time #1048

Closed
deribaucourt opened this issue Oct 22, 2024 · 5 comments
Labels
bug (Something isn't working), needs triage

Comments

@deribaucourt

Bug description

If the browser's system time is ahead of the node's time, the default time window shown in the graphs is computed from the client's time instead of the node's. As a result, all graphs appear "empty" because the node has no data for that time yet.

Confusingly, other graphical items, such as the dashboard, do present live data.

Expected behavior

The default timeframe should show valid data when opening the Netdata web UI, regardless of system time configuration. The default timeframe covers the last 15 minutes of data. The time reference should be computed relative to the node's time rather than the client's time so that the expected data is shown.

Screenshot from 2024-10-22 17-06-56

Steps to reproduce

  1. Open any netdata web GUI. Ex: https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview
  2. Advance system time on the web client by a few hours (date command, or system settings)
  3. All charts appear empty because the queried time range contains no data on the node
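
The effect of these steps can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the dashboard's actual code: the function names, the 3-hour offset, and the assumption that the node's newest sample is stamped with the node's own "now" are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=15)  # default timeframe described in this report


def default_window(now):
    """Return the (start, end) of the default chart window ending at `now`."""
    return now - WINDOW, now


def window_has_data(window, newest_sample):
    """True if the newest sample on the node is not older than the window start."""
    start, _end = window
    return newest_sample >= start


# Hypothetical clocks: the node is correct, the browser is 3 hours ahead.
node_now = datetime(2024, 10, 22, 15, 0, tzinfo=timezone.utc)
client_now = node_now + timedelta(hours=3)

# Assumption: the node's newest sample is stamped with the node's "now".
print(window_has_data(default_window(node_now), node_now))    # True
print(window_has_data(default_window(client_now), node_now))  # False
```

With the window anchored to the client's clock, the window starts at 17:45 UTC while the node's newest sample is from 15:00 UTC, so nothing falls inside it.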

Installation method

kickstart.sh

System info

Client: Ubuntu 24.04, but all systems should be affected
Node: All demo nodes and my custom instance with Netdata 1.47.1

Netdata build info

Netdata Version ____________________________________________ : v1.47.1

Additional info

No response

@deribaucourt deribaucourt added the bug and needs triage labels Oct 22, 2024
@ilyam8
Member

ilyam8 commented Oct 22, 2024

Hi, @deribaucourt.

"The time reference should be computed relative to the node's time rather than the client's time to present the expected data."

I think this request is "wontfix". cc @netdata/product @netdata/cloud-fe

@ilyam8 ilyam8 transferred this issue from netdata/netdata Oct 22, 2024
@ktsaou
Member

ktsaou commented Oct 22, 2024

Time internally in Netdata is always in UTC.

For each browser / user it is converted to their local time, so that each user sees times relative to their own timezone. Users can also change the timezone in which they view charts, from the datetime picker on the dashboard.

The same logic applies to nodes. Each node can be configured in any timezone and, because Netdata always uses UTC internally, all clocks around the world are effectively synchronized.

So the idea is simple. Configure the clocks of all nodes and clients, using any timezone you wish for each of them. Netdata will do the right job, as long as you actually set the clocks right.
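
A minimal Python sketch of this idea: one timestamp stored in UTC is rendered differently for each viewer, yet it still identifies the same instant. The two fixed offsets `viewer_a` and `viewer_b` and the `render` helper are hypothetical, chosen only for illustration; this is not Netdata's code.

```python
from datetime import datetime, timedelta, timezone

# One sample timestamp, stored as seconds since the Unix epoch (always UTC).
sample_ts = 1729609200  # 2024-10-22 15:00:00 UTC

# Hypothetical viewers, modelled as fixed UTC offsets for simplicity.
viewer_a = timezone(timedelta(hours=3))
viewer_b = timezone(timedelta(hours=-4))


def render(ts, tz):
    """Format an epoch timestamp as wall-clock time in the given timezone."""
    return datetime.fromtimestamp(ts, tz).strftime("%Y-%m-%d %H:%M")


print(render(sample_ts, timezone.utc))  # 2024-10-22 15:00
print(render(sample_ts, viewer_a))      # 2024-10-22 18:00
print(render(sample_ts, viewer_b))      # 2024-10-22 11:00
```

The timezone only changes how the instant is displayed. A machine whose clock is actually wrong produces a different epoch value, which no timezone setting can correct.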

If nodes are not configured properly (i.e. their clocks do not agree on the same UTC time), all kinds of strange things can happen, especially when streaming data between nodes (parents - children).

For clients (web browsers), a lot of things will also be wrong if the client's clock, expressed in UTC, is wrong. Not only will the dashboard be shifted, but alert transitions will also have the wrong timestamps.

This is a 'wontfix' for a good reason. Netdata dashboards are multi-node. In many cases choosing the "node's time" will be impossible, as there may be multiple nodes involved.

So the solution to this is simple. No matter what your local time is, configure the right timezone on both server nodes and client machines. And always use some form of clock sync (NTP) to make sure you have the right time.

@deribaucourt
Author

Thanks for your reply @ktsaou. During my investigation of the issue, I saw a code path in dygraph that could emit a warning in the JS console. I'll suggest adding at least a warning log to help users diagnose this kind of situation.
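
The kind of check behind such a warning could look roughly like the Python sketch below. It is language-agnostic logic rather than dashboard code: `clock_skew_warning`, the one-minute `TOLERANCE`, and the idea that the client can obtain the node's current time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

TOLERANCE = timedelta(minutes=1)  # assumed threshold, not a Netdata setting


def clock_skew_warning(client_now, node_now, tolerance=TOLERANCE):
    """Return a warning string if the two clocks disagree, else None."""
    skew = client_now - node_now
    if abs(skew) <= tolerance:
        return None
    direction = "ahead of" if skew > timedelta(0) else "behind"
    return (
        f"Browser clock is {abs(skew)} {direction} the node clock; "
        "charts for the default time window may appear empty."
    )


# Hypothetical values: the node's time as reported to the client.
node_now = datetime(2024, 10, 22, 15, 0, tzinfo=timezone.utc)
print(clock_skew_warning(node_now + timedelta(hours=3), node_now))
print(clock_skew_warning(node_now + timedelta(seconds=10), node_now))  # None
```

On a multi-node dashboard there is no single node clock, so a real check would need to decide which timestamp to compare against.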

I understand your explanation, but my nodes will run in enclosed environments without access to NTP, so their clocks will drift over time. I'll add troubleshooting instructions for this specific case.

@ktsaou
Member

ktsaou commented Oct 22, 2024

Keep in mind that you should sync the clocks on your servers among themselves. Each clock drifts in a different way, and nodes that each have a different UTC time will complicate things further.
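
To give a sense of scale, here is a small Python sketch of how two unsynchronized clocks diverge. The drift rates of +20 ppm and -15 ppm are assumed example values, not measurements from any real hardware.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(ppm, days):
    """Seconds gained (positive) or lost (negative) at a constant drift rate."""
    return ppm * days * SECONDS_PER_DAY / 1_000_000


# Hypothetical drift rates for two nodes that never sync their clocks.
node_a = drift_seconds(20, days=30)   # runs fast
node_b = drift_seconds(-15, days=30)  # runs slow

divergence = node_a - node_b
print(round(node_a, 2), round(node_b, 2), round(divergence, 2))
```

Under these assumptions the two nodes disagree by about 90.7 seconds after 30 days, even though neither is very far from the true time on its own.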

You can configure 3 of your nodes as NTP servers and all your nodes to sync from them. This way they will at least have the same drift.

If you are allowed to sync clocks from the internet on these NTP servers, or you can install hardware high precision clocks on them, there will be no drift either.
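
For context, this is roughly how an NTP client estimates how far its clock is from a server's, using four timestamps from one request/response exchange. The Python sketch below shows the standard offset and delay calculation; the timestamp values are made up for illustration.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t0: client clock when the request was sent
    t1: server clock when the request was received
    t2: server clock when the reply was sent
    t3: client clock when the reply was received
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Made-up exchange: the server clock is 3 s ahead, 10 ms network each way,
# and the server takes 10 ms to answer.
offset, delay = ntp_offset_and_delay(100.000, 103.010, 103.020, 100.030)
print(round(offset, 3), round(delay, 3))
```

The offset estimate is exact only when the network delay is the same in both directions, which is one reason nearby internal NTP servers tend to work well.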

@ktsaou
Member

ktsaou commented Oct 22, 2024

ChatGPT response on achieving zero drift in isolated environments:

In isolated environments where there's no internet access, achieving zero clock drift across all servers can be challenging but feasible with the right setup. Here are a few strategies you can consider:

  1. Use a GPS-Based NTP Server

GPS Time Servers: GPS satellites transmit highly accurate time signals. You can set up a GPS-based NTP server that connects directly to a GPS receiver. This server acts as a Stratum 1 NTP server, providing accurate time to your entire infrastructure.

Benefits: The time derived from GPS is precise and doesn’t drift, as it syncs directly with the atomic clocks in the satellites.

Considerations: Requires a GPS antenna with a clear line of sight to the sky, which can be a logistical challenge depending on the data center or server location.

  2. Atomic Clocks or Rubidium Clocks

Precision Time Protocol (PTP) with Atomic Clocks: Use atomic clocks (such as cesium or rubidium) as a time source for your NTP master servers. These devices can provide highly accurate time even when disconnected from the internet.

Benefits: Atomic clocks have extremely low drift rates. For example, cesium clocks might drift by a few nanoseconds over days, which is negligible for most use cases.

Considerations: This is a more expensive solution but is often used in environments where high precision is critical, such as telecom and financial services.

  3. Configure an Internal Time Distribution Network

High-Availability NTP Cluster: Set up three or more NTP servers in a highly available cluster. Each of these NTP servers should ideally be connected to a stable time source (like a GPS time server or atomic clock). They can be configured to cross-check and synchronize with each other.

Benefits: Having multiple NTP servers ensures redundancy and reduces the risk of a single point of failure.

Considerations: Without an external reference, the accuracy might still degrade slightly over long periods, but this setup will significantly minimize the drift.

  4. Deploy Precision Time Protocol (PTP) Over NTP

PTP (IEEE 1588v2): Unlike NTP, PTP can provide sub-microsecond synchronization between nodes. You can set up a PTP grandmaster clock (ideally connected to a GPS receiver) to synchronize time across your network.

Benefits: PTP is more accurate than NTP and can help achieve synchronization on the order of microseconds, which is suitable for high-precision applications.

Considerations: Requires network switches that support PTP hardware timestamping to achieve high accuracy.

  5. Periodic Manual Synchronization

Portable Time Reference: If no GPS or external time sources can be used, consider periodically bringing a server online that has been synchronized to an accurate source. This server can then be used to re-synchronize your isolated NTP masters.

Benefits: Provides occasional corrections to prevent significant drift.

Considerations: Inconvenient and not entirely drift-free, but it can help keep time reasonably accurate.

Recommended Approach:

For a fully isolated network, the best practice is to invest in a GPS-based NTP server or atomic clock to act as your Stratum 1 time source. This ensures your NTP masters have a highly accurate reference, minimizing drift across your infrastructure. For additional precision, consider implementing PTP alongside NTP if your applications demand it.


It seems that having multiple NTP servers in a cluster, without an external reliable source, will minimize drift too.

This should be relatively easy and cost efficient to achieve.

@ilyam8 ilyam8 closed this as not planned Oct 22, 2024