Occasionally I get several Grafana notifications ([OK] WARN NODE/VALIDATOR: The process just restarted) telling me my beacon node has restarted. Checking the logs, I found the errors below, whose timestamps correspond to the restart times. During the beacon node's downtime, the number of attestation and aggregation failures more than doubled:
attestation failures: 33 (8 hrs prior) to 87
aggregation failures: 3 (8 hrs prior) to 7
I would like to see a failover solution implemented for the validator, so that attestation and aggregation failures can be avoided when the validator depends on a single beacon node.
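The failover idea above could work roughly like this: the validator keeps a list of beacon node endpoints and, whenever the current one fails a health check, switches to the next healthy one. The sketch below is illustrative only, assuming hypothetical endpoint names and a caller-supplied health check; it is not Prysm's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// failover returns the first endpoint in the list that passes the
// supplied health check, so the validator can fall back to a backup
// beacon node when the primary is down.
func failover(endpoints []string, healthy func(string) bool) (string, error) {
	for _, ep := range endpoints {
		if healthy(ep) {
			return ep, nil
		}
	}
	return "", errors.New("no healthy beacon node available")
}

func main() {
	// Hypothetical endpoints; "localhost:4000" stands in for the
	// primary beacon node, "backup-node:4000" for a secondary.
	endpoints := []string{"localhost:4000", "backup-node:4000"}

	// Simulate the primary being down: only the backup reports healthy.
	healthy := func(ep string) bool { return ep == "backup-node:4000" }

	ep, err := failover(endpoints, healthy)
	fmt.Println(ep, err) // → backup-node:4000 <nil>
}
```

In a real implementation the health check would be an RPC to each beacon node, and the validator would re-dial its gRPC connection to the selected endpoint.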
Has this worked before in a previous version?
No, this issue happens frequently on previous versions as well. I will start collecting more error logs each time it auto-restarts.
🔬 Minimal Reproduction
No particular reproducible steps.
🔥 Error
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup caesium.tannerryan.ca: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup roughtime.chainpoint.org: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup roughtime.cloudflare.com: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup roughtime.sandbox.google.com: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup roughtime.int08h.com: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Could not get rough time result: lookup ticktock.mixmin.net: too many open files" prefix=roughtime
time="2020-09-17 18:28:34" level=error msg="Failed to calculate roughtime offset" error="no valid responses" prefix=roughtime
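The repeated "too many open files" failures suggest the beacon node process is hitting its file-descriptor limit, which would explain the crash/restart cycle. A minimal way to inspect the limits and raise them on Ubuntu is sketched below; the systemd unit name `prysm-beacon.service` is an assumption and should be adjusted to however the node is actually run.

```shell
# Show the soft and hard limits on open file descriptors for this shell.
ulimit -Sn
ulimit -Hn

# Count descriptors currently open by a process via /proc (the current
# shell's PID $$ is used here as a stand-in for the beacon node's PID).
ls /proc/$$/fd | wc -l

# If the beacon node runs under systemd, the limit can be raised with a
# drop-in override (unit name is an assumption; adjust to your setup):
#   sudo systemctl edit prysm-beacon.service
#     [Service]
#     LimitNOFILE=65535
#   sudo systemctl daemon-reload
#   sudo systemctl restart prysm-beacon.service
```

Comparing the live descriptor count of the beacon node process against its soft limit as the errors start would confirm whether this is the root cause.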
🌍 Your Environment
Operating System:
Ubuntu latest on Pi 4 8 GB
What version of Prysm are you running? (Which release)
alpha.25
Anything else relevant (validator index / public key)?
https://medalla.beaconcha.in/dashboard?validators=12425,12433,12437,12442,12446,12456,12457,12461,12465,12469,12473,12474,12477,12480,12487,12490,12493,12499,12504,12509,12511,12516,12521,12525,12527,12532,12542,12544,12552,12567,12568,12569,12574