Top-level iteration 6/8 of new_problem.in demonstrates this. The `top` output below is an extreme case, but the imbalance is consistent enough that we should investigate. We need to profile the threads to understand how the work is being divided. Is the sort in the top-level thread creating a bottleneck? The truncated-dense method may be less sensitive to infrequent sorting. Is the OpenMP dynamic scheduler creating a bottleneck? Should we build a vector of the rows that have non-zero entries in the column currently being reduced, so the load is balanced better? It may be possible to build this vector while the preceding column is being reduced.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
137309 kent 20 0 4256260 3.1g 4408 R 99.9 5.0 28:33.65 albert
137314 kent 20 0 4256260 3.1g 4408 S 54.5 5.0 17:35.25 albert
137317 kent 20 0 4256260 3.1g 4408 S 27.3 5.0 17:30.03 albert
137322 kent 20 0 4256260 3.1g 4408 S 27.3 5.0 17:31.70 albert
137323 kent 20 0 4256260 3.1g 4408 S 27.3 5.0 17:35.79 albert
137310 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:35.87 albert
137311 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:33.10 albert
137312 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:32.59 albert
137313 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:32.85 albert
137315 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:33.32 albert
137316 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:33.05 albert
137318 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:35.82 albert
137319 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:33.11 albert
137320 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:34.18 albert
137321 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:32.93 albert
137324 kent 20 0 4256260 3.1g 4408 S 18.2 5.0 17:34.70 albert