Remove or guards some unwraps in stratum #2453

ignopeverell · 2019-01-23T00:49:04Z

Attempts to fix #2421. Stratum will ignore several client requests until re-login but should not crash anymore.

rlinxy · 2019-01-23T07:29:10Z

I just compiled and tried this update. It seems many rigs got dropped; only a few were able to submit shares. The log repeats like below:
20190123 14:43:42.106 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 49.82.237.77:37716
20190123 14:43:42.108 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.108 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 495
20190123 14:43:42.110 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.118 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.119 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 492
20190123 14:43:42.119 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 110.86.104.123:17278
20190123 14:43:42.119 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 171.105.181.29:56180
20190123 14:43:42.120 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 202.114.49.71:56098
20190123 14:43:42.120 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 114.101.211.253:32965
20190123 14:43:42.121 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.121 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.122 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.122 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.124 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.124 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 493
20190123 14:43:42.124 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.124 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 494
20190123 14:43:42.132 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 112.3.242.62:11007
20190123 14:43:42.135 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 49.82.237.77:37719
20190123 14:43:42.135 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.137 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.137 DEBUG grin_servers::mining::stratumserver - (Server ID: 0) sending block 10674 with id 0 to single worker
20190123 14:43:42.137 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 498
20190123 14:43:42.141 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.141 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 499
20190123 14:43:42.148 WARN grin_servers::mining::stratumserver - (Server ID: 0) New connection: 183.184.114.85:55938
20190123 14:43:42.150 WARN grin_servers::mining::stratumserver - (Server ID: 0) Failed to parse JSONRpc: JSON error - []
20190123 14:43:42.150 WARN grin_servers::mining::stratumserver - (Server ID: 0) Dropping worker: 501

hashmap · 2019-01-23T12:42:04Z

servers/src/mining/stratumserver.rs

 							self.handle_login(request.params, &mut workers_l[num])
 						}
 						"submit" => {
+							if let None = worker_stats_id {


Wouldn't match (here and below) look cleaner?

@hashmap 👍 looks like it. I suspect cargo-clippy also agrees.
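For readers following the review, here is a minimal sketch of the match style being suggested. It is not the PR's code: `WorkerStats`, its fields, and `touch_worker` are hypothetical stand-ins, and only the `position` lookup mirrors the diff.

```rust
use std::time::SystemTime;

/// Hypothetical stand-in for one entry of the server's worker stats.
struct WorkerStats {
    id: String,
    last_seen: Option<SystemTime>,
}

/// Updates `last_seen` for the given worker.
/// Returns false when no stats entry exists, instead of panicking.
fn touch_worker(stats: &mut Vec<WorkerStats>, worker_id: &str) -> bool {
    // `position` yields Option<usize>; one match handles both outcomes,
    // so there is no unwrap and no separate `if let None` check.
    match stats.iter().position(|r| r.id == worker_id) {
        Some(idx) => {
            stats[idx].last_seen = Some(SystemTime::now());
            true
        }
        None => false,
    }
}
```

The caller can then decide what a `false` result means for the request at hand (skip it, reply with an error, or drop the worker).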

hashmap · 2019-01-23T13:40:31Z

servers/src/mining/stratumserver.rs

-						.position(|r| r.id == workers_l[num].id)
-						.unwrap();
-					stratum_stats.worker_stats[worker_stats_id].last_seen = SystemTime::now();
+						.position(|r| r.id == workers_l[num].id);


Should we just ignore such a worker? I'm trying to understand how we can get into this situation, and I don't see it so far.

The initial guess has been that some fairly uncommon miner (software, or class of user) manages to crash the stratum server silently. Truly crashing would allow pool operators to have the service auto-restart, whereas not crashing might hide the issue. So, maybe log this case loudly?
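One way to "log this case loudly" could look like the sketch below. It assumes a hypothetical `WorkerStats` type and helper names, and uses `eprintln!` as a stand-in for whatever error-level logging the server uses.

```rust
/// Hypothetical stand-in for one entry of the server's worker stats.
struct WorkerStats {
    id: String,
}

/// Finds the stats index for a worker, or returns a message describing
/// the inconsistency so the caller can report it.
fn stats_index(stats: &[WorkerStats], worker_id: &str) -> Result<usize, String> {
    stats
        .iter()
        .position(|r| r.id == worker_id)
        .ok_or_else(|| format!("worker {} has no stats entry; ignoring request", worker_id))
}

/// Returns true if the request may proceed, false if it was ignored.
fn guard_request(stats: &[WorkerStats], worker_id: &str) -> bool {
    match stats_index(stats, worker_id) {
        Ok(_idx) => true,
        Err(msg) => {
            // Stand-in for an error-level log call: the server keeps
            // running, but the anomaly is visible to the operator.
            eprintln!("ERROR stratum: {}", msg);
            false
        }
    }
}
```

This keeps the server alive while still leaving a trace that operators can search for when chasing the underlying cause.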

sesam · 2019-01-23T22:17:10Z

Related #2446 (awkward PR title, sorry) which also refactors a bit. I haven't yet compared the two PRs.

@bladedoyle and others who'd want to try this, do:
git fetch https://github.com/mimblewimble/grin pull/2453/head:stratum_panic_fix; git checkout stratum_panic_fix # after finishing testing, remember to git checkout master

sesam · 2019-01-23T23:18:51Z

UPDATE: having compared the two, this PR is cleaner, though it potentially excludes some clients from accessing the API endpoints keepalive and getjobtemplate. Whether that is good or bad is also what hashmap asked about.

The difference from #2446 is in where we ignore a worker (via continue) when worker_stats_id is not found among the stratum worker stats. When would that happen? Maybe if the stratum server has forgotten (intentionally or not) about one worker, or if the worker uses a bad ID.

(is the integer vs string discussion relevant here?)

rlinxy · 2019-01-24T04:35:12Z

@sesam
The stratum server ran smoothly for several hours after I switched to the code you updated. But hours later, I got another panic:

20190123 17:48:37.847 ERROR grin_util::logger -
thread 'stratum_server' panicked at 'called Option::unwrap() on a None value': src/libcore/option.rs:355
stack backtrace:

I lost the log, but I have a screenshot here:
https://i.niupic.com/images/2019/01/24/5L1h.png

So it was the 'clean_workers' function that caused this panic.

I tried modifying clean_workers the same way you did elsewhere:
let worker_stats_id = match stratum_stats
	.worker_stats
	.iter()
	.position(|r| r.id == workers_l[num].id)
{
	Some(id) => id,
	None => continue,
};

No more panics, but the number of TCP connections keeps growing. Apparently the 'dead' workers cannot be dropped, so the inactive connections remain open.
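A possible reading of this symptom is that `continue` skips the removal step along with the stats update, so a worker without a stats entry is never dropped. The sketch below illustrates a cleanup that separates the two steps. It is an assumption-laden illustration, not the actual clean_workers: the `Worker` and `WorkerStats` types and their fields are hypothetical.

```rust
/// Hypothetical worker; in a real server it would also own its socket,
/// so removing it from the list would close the connection.
struct Worker {
    id: String,
    error: bool,
}

/// Hypothetical stats entry.
struct WorkerStats {
    id: String,
    is_connected: bool,
}

/// Removes every errored worker and returns how many were dropped.
fn clean_workers(workers: &mut Vec<Worker>, stats: &mut Vec<WorkerStats>) -> usize {
    let before = workers.len();
    for w in workers.iter().filter(|w| w.error) {
        // Update stats only when an entry exists. A missing entry is
        // skipped here, but it does not stop the worker being removed.
        if let Some(s) = stats.iter_mut().find(|s| s.id == w.id) {
            s.is_connected = false;
        }
    }
    // Drop all errored workers, whether or not their stats were found.
    workers.retain(|w| !w.error);
    before - workers.len()
}
```

Using `retain` after the stats pass also avoids removing elements from the vector while indexing into it.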

rlinxy · 2019-01-24T04:37:30Z

Please copy the link into your browser; I found that just clicking it doesn't work on GitHub :(

ignopeverell · 2019-01-24T18:32:48Z

Going to close this. It was an attempt at a small improvement but looks like our stratum server needs much more than small fixes.

rlinxy · 2019-01-25T04:50:48Z

@ignopeverell I'm guessing it might be solved by combining @sesam's and @hashmap's updates, as I posted in #2457. I am still running the test, but everything is fine so far: it has been running without problems for more than 1 hour with 100+ rigs mining.

rlinxy · 2019-01-25T07:23:03Z

It has been running for 4 hours; the number of TCP connections grew from 300 to 1.5K and keeps increasing. So it seems the 'dead' TCP connections still cannot be closed properly.

ignopeverell added 2 commits January 23, 2019 00:46

Remove of guard some unwraps in stratum

b2069c0

rustfmt

7efd130

hashmap reviewed Jan 23, 2019

rlinxy mentioned this pull request Jan 24, 2019

Stratum server crashed on miner reconnect #2421

Closed

ignopeverell closed this Jan 24, 2019
