optimize maintenance rebalance/re-replicate with direct asd-to-asd communication #716

domsj · 2017-05-10T13:46:36Z

Rebalance can be optimized by having the too full asd send the fragment data to the not-yet-full-enough asd directly.

Similarly for repair in case of a replication policy it should be possible to send the fragment data directly between the asds.
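To make the expected saving concrete, here is a minimal sketch that counts how many bytes cross the network when one fragment is moved. It assumes the current path relays the fragment through the maintenance process (read from the source asd, then write to the target asd); the function names and the 4 MiB fragment size are hypothetical and not taken from alba.

```python
# Hypothetical model, not alba code: network bytes needed to move one fragment.

def relayed_bytes(fragment_size: int) -> int:
    """Assumed current path: the maintenance process reads, then writes."""
    read_from_source = fragment_size   # source asd -> maintenance process
    write_to_target = fragment_size    # maintenance process -> target asd
    return read_from_source + write_to_target


def direct_bytes(fragment_size: int) -> int:
    """Proposed path: the source asd sends straight to the target asd."""
    return fragment_size


fragment = 4 * 1024 * 1024  # assumed fragment size, purely illustrative
print(relayed_bytes(fragment), direct_bytes(fragment))
```

Under that assumption the direct path halves the traffic per moved fragment and takes the maintenance host's network link out of the data path.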

dejonghb · 2017-05-11T13:37:42Z

Maintenance has quite some impact when it tries to rebalance an asymmetric backend (1 extra disk in 1 node in this setup, but extra empty disks/nodes would show similar behaviour).

Throughput from dd in a vm via edge:

maintenance off:

...
1494507344 5 16.3308 s 131 MB/s
1494507361 6 17.5935 s 122 MB/s
1494507380 7 18.4305 s 117 MB/s

maintenance on:

1494507400 8 27.6425 s 77.7 MB/s
1494507428 9 50.8574 s 42.2 MB/s
1494507480 10 32.9535 s 65.2 MB/s
1494507514 11 44.0469 s 48.8 MB/s

maintenance off:

1494507561 12 17.793 s 121 MB/s
1494507580 13 17.5795 s 122 MB/s
1494507599 14 18.3454 s 117 MB/s
1494507618 15 17.8269 s 120 MB/s
1494507637 16 19.3551 s 111 MB/s

maintenance on:

1494507657 17 49.1189 s 43.7 MB/s
1494507709 18 42.4096 s 50.6 MB/s
1494507753 19 41.4204 s 51.8 MB/s
1494507797 20 33.4441 s 64.2 MB/s
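To summarize the dd runs above, a small sketch that averages the reported MB/s figures for both states. The numbers are copied from the output above; the variable names are only illustrative.

```python
from statistics import mean

# dd throughput in MB/s, copied from the runs above
maintenance_off = [131, 122, 117, 121, 122, 117, 120, 111]
maintenance_on = [77.7, 42.2, 65.2, 48.8, 43.7, 50.6, 51.8, 64.2]

avg_off = mean(maintenance_off)
avg_on = mean(maintenance_on)
drop_pct = 100 * (1 - avg_on / avg_off)

print(f"off: {avg_off:.1f} MB/s, on: {avg_on:.1f} MB/s, drop: {drop_pct:.0f}%")
```

On these samples the average goes from about 120 MB/s to about 56 MB/s, so ingest loses slightly more than half of its throughput while maintenance is rebalancing.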

Network without maintenance:

       eth2       
 KB/s in  KB/s out
147599.3  54549.87
189334.6  54740.97
199151.6  54746.13
167426.1  16094.95
222470.0  39147.36
219685.5  54783.22
206448.5  54979.32
330253.6  55099.64
185136.3  37519.75

Network with maintenance:

       eth2       
 KB/s in  KB/s out
409364.7  505545.1
433639.6  513500.1
418442.1  570177.3
454566.2  498419.2
453942.3  513534.5
420513.6  435244.4
438833.4  466755.9
473411.2  526030.5
518031.7  369577.4
489400.4  430726.2
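The same kind of summary for the eth2 samples, as a rough sketch: it averages the in/out columns of both tables and prints the ratio. The values are copied from the tables above; the helper name `averages` is illustrative.

```python
from statistics import mean

# eth2 samples in KB/s as (in, out), copied from the two tables above
without_maintenance = [
    (147599.3, 54549.87), (189334.6, 54740.97), (199151.6, 54746.13),
    (167426.1, 16094.95), (222470.0, 39147.36), (219685.5, 54783.22),
    (206448.5, 54979.32), (330253.6, 55099.64), (185136.3, 37519.75),
]
with_maintenance = [
    (409364.7, 505545.1), (433639.6, 513500.1), (418442.1, 570177.3),
    (454566.2, 498419.2), (453942.3, 513534.5), (420513.6, 435244.4),
    (438833.4, 466755.9), (473411.2, 526030.5), (518031.7, 369577.4),
    (489400.4, 430726.2),
]


def averages(samples):
    """Return (mean in, mean out) for a list of (in, out) samples."""
    ins, outs = zip(*samples)
    return mean(ins), mean(outs)


in_off, out_off = averages(without_maintenance)
in_on, out_on = averages(with_maintenance)

print(f"in:  {in_off:.0f} -> {in_on:.0f} KB/s ({in_on / in_off:.1f}x)")
print(f"out: {out_off:.0f} -> {out_on:.0f} KB/s ({out_on / out_off:.1f}x)")
```

On these samples inbound traffic roughly doubles and outbound traffic rises roughly tenfold once maintenance is running.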

dejonghb · 2017-05-16T12:46:57Z

Maybe rebalancing should not be enabled by default, given how much network (and disk) bandwidth it takes away from ingest?

Is the time/work spent on moving old data around indeed worth the effort? This probably also depends on the use case: for a constant ingest things might be different than for a bursty one...

Maybe deciding when to move data around, and from where to where, also needs more insight (policies used / capacity planning / ...) than the maintenance process itself has?

PS: rebalancing can be turned off via

alba update-maintenance-config --disable-rebalance --config <abm-configurl>

wimpers · 2017-05-19T08:14:00Z

Isn't there a way to limit the impact of rebalancing (lowering its priority) so there still is some rebalancing going on?

dejonghb · 2017-05-30T14:56:44Z

toolslive · 2018-10-23T08:52:58Z

Waiting on QA effort.

wimpers added the type_enhancement label May 22, 2017

wimpers added this to the H milestone May 29, 2017

domsj self-assigned this Jun 2, 2017

domsj added the state_inprogress label Jun 2, 2017

wimpers modified the milestones: I, H Aug 29, 2017

wimpers modified the milestones: I, J Nov 28, 2017

wimpers modified the milestones: J, M Mar 6, 2018

wimpers modified the milestones: M, Roadmap Sep 14, 2018

JeffreyDevloo added state_verification and removed state_inprogress labels Oct 23, 2018
