
Additional config settings for the memory overhead factor #481

Open · wants to merge 1 commit into base: branch-2.2-kubernetes

Conversation

ash211 commented Sep 6, 2017

I'm seeing the default value of 0.10 fail for even reasonably-sized shuffle
jobs, so I expect this value to require some tuning to reliably succeed.

We copied this default value from YARN, but it appears that Kubernetes is
stricter about enforcing memory limits on containers than YARN has been: I have
two identically configured clusters of five AWS r3.4xlarge instances, one running
YARN and the other running Kubernetes, with identical driver/executor settings,
running identical jobs, and the YARN job succeeds whereas the k8s job fails due
to the pod exceeding its memory limit.

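For context, this is roughly how an overhead factor turns into a pod memory limit. A minimal Scala sketch, assuming the conventional Spark/YARN formula of heap + max(heap * factor, 384 MiB); the names here are illustrative, not this branch's actual code:

```scala
object MemoryOverheadSketch {
  // The 0.10 default discussed above, and the conventional Spark/YARN floor.
  val DefaultOverheadFactor = 0.10
  val MinOverheadMiB = 384L

  // Pod memory limit = requested heap + max(heap * factor, 384 MiB).
  def podMemoryLimitMiB(heapMiB: Long, factor: Double = DefaultOverheadFactor): Long =
    heapMiB + math.max((heapMiB * factor).toLong, MinOverheadMiB)

  def main(args: Array[String]): Unit = {
    // A 20 GiB executor heap gets only 2 GiB of headroom under the default,
    // so the pod is killed as soon as off-heap usage passes 2 GiB.
    println(podMemoryLimitMiB(20480)) // 22528 MiB
  }
}
```

Under YARN the same formula sets a limit that is enforced loosely, which is why identical settings can succeed there while the Kubernetes pod is OOM-killed.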
foxish (Member) commented Sep 7, 2017

To be consistent with YARN, maybe we should do memoryOverhead instead. The memory factor would make the computed value depend on another argument, driverMemory, which isn't ideal IMO.
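For comparison, the two shapes of configuration under discussion, sketched with SparkConf. The spark.kubernetes.* key is a hypothetical name, since a setting like it is what this PR would introduce; spark.yarn.driver.memoryOverhead is YARN mode's existing absolute knob in Spark 2.2:

```scala
import org.apache.spark.SparkConf

object OverheadConfigSketch {
  def main(args: Array[String]): Unit = {
    // Factor-based, as in this PR. The kubernetes property key below is
    // hypothetical: the exact name is what this change would decide.
    val factorStyle = new SparkConf()
      .set("spark.driver.memory", "8g")
      .set("spark.kubernetes.driver.memoryOverheadFactor", "0.25")

    // Absolute, YARN-style, as foxish suggests: a fixed number of MiB,
    // independent of the heap size.
    val absoluteStyle = new SparkConf()
      .set("spark.driver.memory", "8g")
      .set("spark.yarn.driver.memoryOverhead", "2048")
  }
}
```

The tradeoff foxish points at: a factor scales automatically with driverMemory but couples the two settings, while an absolute value is predictable but must be retuned whenever the heap size changes.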
