[BUG] - Restarting node just got a LOT slower #5884
Comments
A similar issue is occurring on my server. OS: Ubuntu
Things are getting much worse. Restarting the node used to take 8.5 minutes on my ARM machine but it is now taking over 2hrs!!! (It takes over 1hr "Pushing ledger state" from 68% to 90% and then nearly another hour "Pushing ledger state" from 90% to 100%.) One of my AMD machines that used to take 2.5 minutes to restart just took 35 minutes!!! I am now quite fearful to restart my node running on a Contabo vm because this might take a couple of hours. Any ideas what is going on???
https://cardanoscan.io/address/addr1q9lywtk836axkaf985822lh033aj5l6nauau8kg7fhdkze9hmc43c96exwsre5xktq4td5h2mzfjmayhtuk44ryy4uas72wh4t and two other addresses push senseless transactions with 194 withdrawal validators each every minute or so causing all blocks to be almost full and all nodes to constantly have to evaluate these validators (that withdraw 0 ADA from unused stake addresses). EDIT: And it started almost exactly 24 hours ago with https://cardanoscan.io/transaction/ab02ef16b0d863bfe4bf9f488873c967adfd461eb774c27689ac011227ea4f9f, so that fits your observation 13 hours ago.
Hello, same thing on my side. I was not able to work out the reason but, even starting from a fresh snapshot, it took my node (producer & relayer) ~40 min to process the ledger state.
I've also discovered that our node is loading a very old ledger state from an old slot (~19h ago). I thought the node took a new snapshot every 72 minutes. Am I wrong? We are running our cardano instance with 4 CPUs and 32 GB of RAM with an NVMe disk of 250 GB. Version: 8.9.3
Hi, how can we easily check whether we can safely restart our nodes? @HeptaSean How did you find this transaction so quickly? Did you check every transaction from the last 24h?
I just restarted my slow ARM node and it started in 11 minutes which is not unusual and certainly a lot better than over 2hrs like it was taking yesterday. The best I normally see when restarting this ARM node is around 8.5 minutes.
Someone in the IOG Discord saw the huge transactions on eutxo.org and traced them to that address (or rather, those addresses), then started looking at them in more detail.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 120 days.
The issue was fixed by "hotfix" releases 8.9.4 and 8.12.2.
Summary
Nodes very slow to restart with significant delay during "Pushing ledger state" from 22% to 44%.
Steps to reproduce
systemctl restart cardano-node
Expected behavior
Restart on a low power ARM machine used to take 8.5 minutes, now takes 36 minutes.
Restart on a more powerful AMD machine used to take 2 minutes, now takes 6 minutes.
All the extra time occurs when "Pushing ledger state" from 22% to 45%.
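To put these numbers in perspective, the slowdown factor can be worked out directly from the restart times quoted above. This is only a minimal Python sketch: the function name `slowdown` is illustrative, and the inputs are the approximate timings from this report, not fresh measurements.

```python
def slowdown(before_min: float, after_min: float) -> float:
    """Return how many times longer a restart takes now than it used to."""
    if before_min <= 0:
        raise ValueError("baseline restart time must be positive")
    return after_min / before_min


# Approximate restart times in minutes (before, after), as reported above.
reported = {
    "ARM (low power)": (8.5, 36.0),
    "AMD": (2.0, 6.0),
}

for machine, (before, after) in reported.items():
    factor = slowdown(before, after)
    print(f"{machine}: {before} min -> {after} min, about {factor:.1f}x slower")
```

With these figures the restart is a bit over 4x slower on the ARM machine and 3x slower on the AMD machine.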
System info (please complete the following information):
OS Name: Debian
OS Version: 12 (Bookworm)
Node version (low power ARM machine)
Logs
Note that it took 7 minutes to get to "Pushing ledger state" progress 22%, which is typical for this machine from past experience. However, it then took over 25 minutes to progress "Pushing ledger state" from 22% to 44%. Normally this stage takes only around 1 to 1.5 minutes. It then took 1.5 minutes to push the ledger state from 45% to 67% and another 1.5 minutes from 67% to 90%, both of which are typical for this slow ARM machine.
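The stage breakdown described above can be tabulated to show where the time goes. This is a rough Python sketch using only the durations quoted in this description; "over 25 minutes" is entered as 25.0, so the results are lower bounds, and the stage from 90% to 100% is not included because no timing was given for it. The helper name `slowest_stage` is illustrative.

```python
# Stage durations in minutes for the ARM restart, taken from the description above.
# "over 25 minutes" is entered as 25.0, so the figures below are lower bounds.
stages = [
    ("start -> 22%", 7.0),
    ("22% -> 44%", 25.0),
    ("45% -> 67%", 1.5),
    ("67% -> 90%", 1.5),
]


def slowest_stage(stages):
    """Return (name, minutes, share_of_total) for the longest stage."""
    total = sum(minutes for _, minutes in stages)
    name, minutes = max(stages, key=lambda stage: stage[1])
    return name, minutes, minutes / total


name, minutes, share = slowest_stage(stages)
print(f"slowest stage: {name} ({minutes} min, {share:.0%} of the time accounted for)")

# Compare with the usual 1 to 1.5 minutes for that same stage.
usual_low, usual_high = 1.0, 1.5
print(f"roughly {minutes / usual_high:.0f}x to {minutes / usual_low:.0f}x slower than usual")
```

On these numbers the 22% to 44% stage alone accounts for about 71% of the time listed, which is consistent with the extra delay being concentrated in that one part of the ledger replay.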
Logs from more powerful AMD machine running cardano-node version 8.9.3:
The AMD machine is more powerful but it has similarly seen a significant increase in restart time with all the time increase occurring when pushing the ledger state from 22% to 45%.
From restart, this machine took 1.5 minutes to get to "Pushing ledger state" progress 22%, but progressing from 22% to 45% then took 4 minutes, while the progress from 45% to 67% and from 67% to 90% took only 18 seconds. In the past it only took around 15-20 seconds to progress the ledger state from 22% to 45% on this machine.
Something in the ledger is now making the nodes take a lot longer to push their ledger state. I also checked a relay running version 8.9.2 (similarly compiled with ghc 9.8.2) and it also now takes much longer to restart with all the extra time occurring when pushing the ledger state from 22% to 45%.
Additional context
Something has changed in the ledger which is causing the node software to do a lot more processing. Maybe there is some particular transaction causing a problem??? Maybe there is a bug???