In PC1, all three producer threads were bound to core group 0 #5076
Comments
What hardware are you using?
PC1 worker: AMD EPYC 7H12 CPU; 2048 GiB memory
I have the same problem. It seems that the PC1 worker does not honor the CPUs specified with taskset.
I have the same problem.
EPYC 7272, 512 GiB RAM. I just started experimenting with multicore yesterday. I see a 20% drop in time, from a little over 5 hours to 4 hours flat, for up to two PC1 tasks on the same worker. If I add another worker and give it a PC1 task, the system slows down.
7059 PU:12 PU:13 lotus-worker //this is my add piece worker
So running two PC1 tasks on the first worker goes smoothly, but when a task is added to the second worker it slows down, as it then tries to use cores that are already in use (0-2) and that should not even be assigned to that PID.
lotus version
Daemon: 1.4.0+git.e9989d0e4+api1.0.0
Local: lotus version 1.4.0+git.e9989d0e4
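One way to check which cores each thread of a worker is actually allowed to run on is to read the Cpus_allowed_list field from /proc. Below is a minimal Python sketch, assuming a Linux host. The helper names parse_cpu_list and thread_affinities are made up for illustration and are not part of lotus.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def thread_affinities(pid):
    """Map each thread id of `pid` to its allowed CPU set (Linux only).

    Reads /proc/<pid>/task/<tid>/status, which the kernel exposes
    for every thread of a running process.
    """
    result = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        for line in (task / "status").read_text().splitlines():
            if line.startswith("Cpus_allowed_list:"):
                result[int(task.name)] = parse_cpu_list(line.split(":", 1)[1])
    return result
```

For example, thread_affinities(7059) would list the allowed cores for every thread of the worker with PID 7059 shown above, provided that process is still running. Threads whose allowed set falls outside the cores given to taskset would point to the binding problem described here.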
Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 24 hours.
This issue was closed because it is missing author input.
I have 3 PC1 workers on the same machine, with FIL_PROOFS_USE_MULTICORE_SDR=1 set. I start them with taskset -c 0,1,2,3 lotus-worker run, taskset -c 4,5,6,7 lotus-worker run, and taskset -c 8,9,10,11 lotus-worker run. I expect each worker to use its own cpuset (worker 1: 0,1,2,3; worker 2: 4,5,6,7; worker 3: 8,9,10,11) and to seal one layer in 20 minutes. Instead, the cores of all three workers are bound to core group 0 (cpuset 0,1,2,3), and a layer takes 40 minutes on average.
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase1
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::graph > using parent_cache[2048 / 1073741824]
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: opening /data/cpfs/PROOFS_PARENT/v28-sdr-parent-21981246c370f9d76c7a77ab273d94bde0ceb4e938292334960bce05585dc117.cache, verify enabled: false
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::proof > multi core replication
2020-11-29T23:28:10.477 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > create labels
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > Cores: 128, Shared Caches: 32, cores per cache (group_size): 4
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > checked out core group 0
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::create_label::multi > binding core in main thread
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > allowed cpuset: 0
2020-11-29T23:28:10.542 DEBUG storage_proofs_porep::stacked::vanilla::cores > binding to 0
2020-11-29T23:28:10.559 INFO storage_proofs_porep::stacked::vanilla::memory_handling > initializing cache
2020-11-29T23:28:57.189 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > Layer 1
2020-11-29T23:28:57.190 INFO storage_proofs_porep::stacked::vanilla::create_label::multi > Creating labels for layer 1
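The numbers in the log (128 cores, 32 shared caches, group size 4) and the three taskset ranges can be lined up with a small Python sketch. It assumes, purely for illustration, that core group i covers the contiguous core indices 4*i to 4*i+3. Real hardware topology may number cores differently, and this is not the logic used in storage_proofs_porep. The function names core_groups and usable_groups are hypothetical.

```python
def core_groups(num_cores, num_shared_caches):
    """Split core indices into equal groups, one per shared cache.

    ASSUMPTION: cores in a group have contiguous indices; real
    topologies may differ.
    """
    group_size = num_cores // num_shared_caches
    return [
        list(range(i * group_size, (i + 1) * group_size))
        for i in range(num_shared_caches)
    ]


def usable_groups(groups, allowed):
    """Return indices of groups that lie entirely inside `allowed`."""
    return [i for i, group in enumerate(groups) if set(group) <= allowed]


# Values taken from the log line: Cores: 128, Shared Caches: 32
groups = core_groups(128, 32)

# cpusets passed to taskset for the three workers in the report
worker_cpusets = {
    "worker1": {0, 1, 2, 3},
    "worker2": {4, 5, 6, 7},
    "worker3": {8, 9, 10, 11},
}

for name, cpus in worker_cpusets.items():
    print(name, usable_groups(groups, cpus))
# worker1 [0]
# worker2 [1]
# worker3 [2]
```

Under that assumption, the expected outcome is one distinct core group per worker (0, 1, and 2), whereas the report says all three workers check out core group 0.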