voting: Poll tab locks up GUI and takes unacceptably long on slower machines #2618

Closed
jamescowens opened this issue Dec 22, 2022 · 4 comments · Fixed by #2619

@jamescowens
Member

jamescowens commented Dec 22, 2022

On slower machines, such as older ARM SBCs or PCs with regular HDDs instead of SSDs, the Poll tab performs poorly.

This was not as noticeable until now, when four polls are in progress at the same time, all with relatively high participation rates. With this many polls, the performance problems are frustrating some people with slower machines, and they show that while the voting system is of extremely high integrity, it will not scale well.

To tabulate the current results of a poll, the GUI must traverse the poll registry and call functions that resolve every UTXO in every vote in every poll. This is very I/O intensive, because @cyrossignol elected, for both memory conservation and simplicity, to leave these details out of memory. The details are kept in memory only temporarily while the poll results are being tabulated, and afterwards only the summary is maintained in memory.

This choice reduces the memory footprint and also simplifies situations where there is a chain reorg, which could roll back polls themselves or votes on a poll, because the entire structure is refreshed. As a result, however, it places a great strain on the computer's CPU and I/O.

There are a few ideas to deal with this issue:

  1. (Medium Effort) Redo the locking to remove the big lock taken on cs_main in VotingModel::buildPollTable. This is the main culprit of the GUI lockup, because once another lock is taken and then cs_main is taken again, the GUI will block until the recursive lock(s) are cleared.
  2. (Medium Effort) Change the GUI to simply list the polls first without the result details, which would be very fast, just like the listpolls RPC call, and then provide the ability to refresh the results on all polls or a particular poll (one at a time).
  3. (High Effort) Implement a Poll Result class that allows rollbacks to be compensated so that the whole thing does not have to be recomputed each time.
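
To make idea 3 a little more concrete, here is a minimal sketch of a tally that can be adjusted in both directions instead of being recomputed. Everything in it is hypothetical: `PollTallySketch`, `ApplyVote`, `RevertVote`, and the idea of a single integer weight per vote are assumptions for illustration, not existing code. A real version would also have to handle anything else that changes a vote's weight.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical running tally for one poll; not an existing Gridcoin class.
class PollTallySketch
{
public:
    // Add a vote's weight when the block containing it is connected.
    void ApplyVote(const std::string& choice, std::int64_t weight)
    {
        assert(weight >= 0);
        m_weight_by_choice[choice] += weight;
    }

    // Subtract the same weight when a reorg disconnects that block.
    void RevertVote(const std::string& choice, std::int64_t weight)
    {
        auto it = m_weight_by_choice.find(choice);
        if (it == m_weight_by_choice.end()) return; // nothing was applied for this choice
        it->second -= weight;
        if (it->second == 0) m_weight_by_choice.erase(it);
    }

    // Current weight for a choice, or 0 if no votes are recorded for it.
    std::int64_t WeightFor(const std::string& choice) const
    {
        auto it = m_weight_by_choice.find(choice);
        return it == m_weight_by_choice.end() ? 0 : it->second;
    }

private:
    std::map<std::string, std::int64_t> m_weight_by_choice;
};
```

With something like this, a rollback costs one `RevertVote` call per affected vote rather than a full pass over every poll.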

I have, in my optimize_poll_locking branch, instrumented the functions involved in the poll refresh with MilliTimer so that we can see where the time is being spent.
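
For anyone unfamiliar with this kind of instrumentation, the general pattern is a scoped timer. The sketch below is not the MilliTimer API; `ScopedTimerSketch`, `g_timings_sketch`, and `TabulateSketch` are hypothetical names built on `std::chrono` only to show the idea of recording how long a function body takes.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical sink for timings: label -> accumulated milliseconds.
std::map<std::string, double> g_timings_sketch;

// Records the time between construction and destruction under a label.
class ScopedTimerSketch
{
public:
    explicit ScopedTimerSketch(std::string label)
        : m_label(std::move(label)), m_start(std::chrono::steady_clock::now()) {}

    ~ScopedTimerSketch()
    {
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - m_start;
        g_timings_sketch[m_label] += elapsed.count();
    }

private:
    std::string m_label;
    std::chrono::steady_clock::time_point m_start;
};

// Hypothetical stand-in for one step of the poll refresh.
int TabulateSketch(int votes)
{
    assert(votes >= 0);
    ScopedTimerSketch timer("TabulateSketch"); // timing is stored when this scope ends
    int total = 0;
    for (int i = 0; i < votes; ++i) total += i;
    return total;
}
```

Placing one timer at the top of each function in the refresh path gives a per-function breakdown of where the time goes.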

Please see the two attached runs from opposite ends of the spectrum:

Note that four polls were active with ~961 votes total, which involves thousands of UTXOs to check.

  • A really fast workstation (my new one), which is an Intel i9 13900K with 128 GB ECC memory and 2 x 2 TB Samsung 980 Pro NVMe drives in RAID 1. It doesn't get much faster than this, and a complete refresh takes 3.938 seconds.
  • A relatively slow Odroid XU4 with the data directory mounted over NFS, where a complete refresh takes 357.365 seconds.

This is a factor of about 91x!!

refresh polls run on really fast computer (Intel i9 13900K).log
refresh polls run on slow computer (Odroid XU4).log

@jamescowens jamescowens self-assigned this Dec 22, 2022
@jamescowens jamescowens added this to the LaVerne milestone Dec 22, 2022
@jamescowens
Member Author

From Discord...

iFoggz (Paul Jensen) — Today at 5:35 PM
Is this because multiple polls are running at same time?

Jim Owens — Today at 5:51 PM
Yes... with a large number of votes on each one.
[5:54 PM]
The issue with the GUI lockup is the cs_main taken in buildPollTable before the main for loop that traverses the registry. I may break this up into two steps... in a block with cs_main, get a vector or similar to traverse from the m_registry.Polls().Where(flags), and then in a separate block, do the for loop against that temporary with the cs_main lock INSIDE the for loop. That is the simplest way to break up the long lock. It will be taken and released on EACH poll instead of the entire duration.
[5:55 PM]
I think I will have to implement some checking to ensure that the poll reference pointed to by the vector has not become invalid due to a rollback (reorg) in the core while the cs_main lock was released during the traversal.
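
A rough sketch of the two-step approach described above, under stated assumptions: `PollSketch`, `cs_main_sketch`, `registry_sketch`, `SnapshotPollIds`, and `BuildPollTableSketch` are hypothetical stand-ins, not the real registry or `VotingModel::buildPollTable`. To sidestep the invalid-reference problem, this version snapshots keys instead of references and looks each one up again under the lock.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are the real Gridcoin types.
struct PollSketch { std::string title; int votes; };

std::recursive_mutex cs_main_sketch;        // plays the role of cs_main
std::map<int, PollSketch> registry_sketch;  // plays the role of the poll registry

// Step 1: hold the lock only long enough to copy the keys to visit.
std::vector<int> SnapshotPollIds()
{
    std::lock_guard<std::recursive_mutex> lock(cs_main_sketch);
    std::vector<int> ids;
    ids.reserve(registry_sketch.size());
    for (const auto& entry : registry_sketch) ids.push_back(entry.first);
    assert(ids.size() == registry_sketch.size());
    return ids;
}

// Step 2: take and release the lock once per poll, re-checking that the
// poll still exists because the registry may have changed in between.
std::vector<std::string> BuildPollTableSketch(const std::vector<int>& ids)
{
    std::vector<std::string> rows;
    for (int id : ids) {
        std::lock_guard<std::recursive_mutex> lock(cs_main_sketch);
        auto it = registry_sketch.find(id);
        if (it == registry_sketch.end()) continue; // gone, e.g. after a reorg
        rows.push_back(it->second.title + ": " + std::to_string(it->second.votes));
    }
    return rows;
}
```

Between iterations the lock is free, so other threads waiting on it get a turn after each poll instead of after the whole table is built.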

@Gawiga

Gawiga commented Dec 24, 2022

I'm having the same issue. I don't believe it occurs only on slow machines. I have Gridcoin Wallet installed on my C:/ SSD, but my wallet is on D:/ on my HDD. My PC is a Ryzen 2600x with 16GB RAM.

Update: It freezes, but after 5~10 minutes the application opens the Polls.

@jamescowens
Member Author

Note I said... or PCs with regular HDDs instead of SSDs. Yours fits that definition since you put the wallet on an HDD.

@jamescowens
Member Author

@Gawiga relief is on the way... at least for the apparent lockup part.
