[SPARK-37618][CORE][Followup] Support cleaning up shuffle blocks from external shuffle service #36473
Tested locally, will wait for GA to also complete.
+CC @srowen, @Kimahriman
Looks good to me, thanks! Figured posix permissions would do something weird on someone's machine eventually
// Use jnr to get and override the current process umask.
// Expects the input mask to be an octal number
private def getAndSetUmask(posix: POSIX, mask: String): String = {
  val prev = posix.umask(BigInt(mask, 8).toInt)
Is there any existing utility for setting umask? I thought Hadoop APIs had this somewhere and that we use it. No big deal if not. But if we have other places we use umask, could be good to standardize
We don't currently get/set the process umask. I did see use of jna in hadoop, but that is shaded now.
Any library which allows us to set the process umask would do, actually; jnr simply seemed easier to use across platforms (this is for tests anyway).
The only place I know of is in the native code for the container executor
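For readers following the umask thread above, here is a minimal sketch of the octal arithmetic involved. It does not call jnr or change any process state; the object and method names are hypothetical and exist only for illustration. It shows how an octal mask string is parsed (the same `BigInt(mask, 8).toInt` conversion used in the snippet under review) and how a mask clears bits from a requested mode.

```scala
// Illustrative only: pure arithmetic, no native calls. All names are hypothetical.
object UmaskArithmetic {
  // Parse an octal mask string such as "0077" into its integer value.
  def parseOctal(mask: String): Int = BigInt(mask, 8).toInt

  // Format an integer mask as a zero-prefixed octal string.
  def formatOctal(mask: Int): String = "0" + Integer.toOctalString(mask)

  // Mode that results when something is created with `requested` bits under `umask`:
  // every bit set in the umask is cleared from the requested mode.
  def effectiveMode(requested: Int, umask: Int): Int = requested & ~umask

  // True when the group-write bit (octal 0020) is set in `mode`.
  def groupWritable(mode: Int): Boolean = (mode & parseOctal("0020")) != 0
}
```

Under a restrictive mask such as 0077 a directory requested with 0777 ends up without the group-write bit, while under 0002 it keeps it, which is the difference the test was sensitive to.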
+CC @MaxGekk
Any other comments @srowen ?
dongjoon-hyun
left a comment
@mridulm . Could you elaborate a little more about the condition where this happens in the PR description? It doesn't happen in Apache Spark GitHub Action and Apple Silicon Farm.
Sure, will do. Thx for the review @dongjoon-hyun, PTAL at the changed description.
My local machine seems to have gotten hosed, can you please merge it to master/branch-3.3 @dongjoon-hyun or @srowen ? Thx !
While testing this PR for merging, I made a PR to you. Could you review that, @mridulm ?
IIUC, that PR will minimize the diff for backporting.
Maybe dumb question... But could this have been used to make the actual logic a little simpler? If we could change the umask to be more permissive (allow group write) when creating blockmgr sub dirs, we wouldn't need to do quite as much permission changing after the fact. I didn't know there was a way to change it in Java land without forking a subprocess
@dongjoon-hyun I wanted to make sure that the process umask is restored in a try/finally.
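The try/finally restore mentioned above could be sketched roughly as follows. This is not the code from the PR: `UmaskScope`, `withUmask`, and the `getAndSet` parameter are hypothetical names, and `getAndSet` stands in for whatever native call installs a new mask and returns the previous one (in the PR that role is played by the jnr-backed helper).

```scala
// Illustrative only: a loan-pattern wrapper. All names are hypothetical.
object UmaskScope {
  // Applies `mask`, runs `body`, and restores the previous mask even if `body` throws.
  // `getAndSet` is assumed to install the given mask and return the one it replaced.
  def withUmask[T](getAndSet: String => String, mask: String)(body: => T): T = {
    val previous = getAndSet(mask)
    try body
    finally getAndSet(previous)
  }
}
```

Passing the setter in as a function keeps the sketch free of native dependencies, so the restore logic can be exercised with an in-memory stand-in.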
@Kimahriman That is a possibility, but I did not want to make functional changes as part of RC voting :)
Yeah wasn't suggesting doing it now! Hah just wanted to bring it up. I can try to play around with it at some point
Any other comments @srowen, @dongjoon-hyun ? Thx
dongjoon-hyun
left a comment
+1, LGTM again. No other comments from my side, @mridulm .
… external shuffle service

### What changes were proposed in this pull request?
Fix test failure in build.
Depending on the umask of the process running tests (which is typically inherited from the user's default umask), the group writable bit for the files/directories could be set or unset. The test was assuming that by default the umask will be restrictive (and so files/directories won't be group writable). Since this is not a valid assumption, we use jnr to change the umask of the process to be more restrictive - so that the test can validate the behavior change - and reset it back once the test is done.

### Why are the changes needed?
Fix test failure in build

### Does this PR introduce _any_ user-facing change?
No
Adds jnr as a test scoped dependency, which does not bring in any other new dependency (asm is already a dep in spark).

```
[INFO] +- com.github.jnr:jnr-posix:jar:3.0.9:test
[INFO] |  +- com.github.jnr:jnr-ffi:jar:2.0.1:test
[INFO] |  |  +- com.github.jnr:jffi:jar:1.2.7:test
[INFO] |  |  +- com.github.jnr:jffi:jar:native:1.2.7:test
[INFO] |  |  +- org.ow2.asm:asm:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-commons:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-analysis:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-tree:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-util:jar:5.0.3:test
[INFO] |  |  \- com.github.jnr:jnr-x86asm:jar:1.0.2:test
[INFO] |  \- com.github.jnr:jnr-constants:jar:0.8.6:test
```

### How was this patch tested?
Modification to existing test.
Tested on Linux, skips test when native posix env is not found.

Closes #36473 from mridulm/fix-SPARK-37618-test.

Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 3174071)
Signed-off-by: Sean Owen <srowen@gmail.com>
Merged to master/3.3
…g the shuffle service for released executors (#1041)

* [SPARK-37618][CORE] Remove shuffle blocks using the shuffle service for released executors

Add support for removing shuffle files on released executors via the external shuffle service. The shuffle service already supports removing shuffle service cached RDD blocks, so I reused this mechanism to remove shuffle blocks as well, so as not to require updating the shuffle service itself.

To support this change functioning in a secure Yarn environment, I updated permissions on some of the block manager folders and files. Specifically:
- Block manager sub directories have the group write posix permission added to them. This gives the shuffle service permission to delete files from within these folders.
- Shuffle files have the world readable posix permission added to them. This is because when the sub directories are marked group writable, they lose the setgid bit that gets set in a secure Yarn environment. Without this, the permissions on the files would be `rw-r-----`, and since the group running Yarn (and therefore the shuffle service) is no longer the group owner of the file, it does not have access to read the file. The sub directories still do not have world execute permissions, so there's no security issue opening up these files.

Both of these changes are done after creating a file so that umasks don't affect the resulting permissions.

External shuffle services are very useful for long running jobs and dynamic allocation. However, currently if an executor is removed (either through dynamic deallocation or through some error), the shuffle files created by that executor will live until the application finishes. This results in local disks slowly filling up over time, eventually causing problems for long running applications.

No.

New unit test. Not sure if there's a better way I could have tested for the files being deleted or any other tests I should add.

Closes #35085 from Kimahriman/shuffle-service-remove-shuffle-blocks.

Authored-by: Adam Binford <adamq43@gmail.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>

* [SPARK-37618][CORE][FOLLOWUP] Support cleaning up shuffle blocks from external shuffle service

Fix test failure in build.
Depending on the umask of the process running tests (which is typically inherited from the user's default umask), the group writable bit for the files/directories could be set or unset. The test was assuming that by default the umask will be restrictive (and so files/directories won't be group writable). Since this is not a valid assumption, we use jnr to change the umask of the process to be more restrictive - so that the test can validate the behavior change - and reset it back once the test is done.

Fix test failure in build

No
Adds jnr as a test scoped dependency, which does not bring in any other new dependency (asm is already a dep in spark).

```
[INFO] +- com.github.jnr:jnr-posix:jar:3.0.9:test
[INFO] |  +- com.github.jnr:jnr-ffi:jar:2.0.1:test
[INFO] |  |  +- com.github.jnr:jffi:jar:1.2.7:test
[INFO] |  |  +- com.github.jnr:jffi:jar:native:1.2.7:test
[INFO] |  |  +- org.ow2.asm:asm:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-commons:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-analysis:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-tree:jar:5.0.3:test
[INFO] |  |  +- org.ow2.asm:asm-util:jar:5.0.3:test
[INFO] |  |  \- com.github.jnr:jnr-x86asm:jar:1.0.2:test
[INFO] |  \- com.github.jnr:jnr-constants:jar:0.8.6:test
```

Modification to existing test.
Tested on Linux, skips test when native posix env is not found.

Closes #36473 from mridulm/fix-SPARK-37618-test.

Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>

* Fix ut failure

Co-authored-by: Adam Binford <adamq43@gmail.com>
Co-authored-by: Mridul Muralidharan <mridulatgmail.com>
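The original SPARK-37618 commit message above describes adding the group-write bit to block manager sub directories and the world-readable bit to shuffle files after creation, so that the umask cannot strip them. A minimal sketch of that idea using only the JDK's `java.nio.file` API might look like the following. It is not Spark's implementation; `PermissionSketch` and its methods are hypothetical, and it assumes a POSIX file system.

```scala
import java.nio.file.{Files, Path}
import java.nio.file.attribute.PosixFilePermission
import java.util.{HashSet => JHashSet}

// Illustrative only: hypothetical helpers, not Spark code. Assumes a POSIX file system.
object PermissionSketch {
  // Add one permission bit to an existing path, preserving the bits already set.
  // Changing permissions after creation is not filtered by the process umask.
  def addPermission(path: Path, perm: PosixFilePermission): Unit = {
    val perms = new JHashSet[PosixFilePermission](Files.getPosixFilePermissions(path))
    perms.add(perm)
    Files.setPosixFilePermissions(path, perms)
  }

  // Create a sub directory, then make it group writable regardless of the umask.
  def createGroupWritableDir(parent: Path, name: String): Path = {
    val dir = Files.createDirectory(parent.resolve(name))
    addPermission(dir, PosixFilePermission.GROUP_WRITE)
    dir
  }

  // Create a file, then make it world readable regardless of the umask.
  def createWorldReadableFile(dir: Path, name: String): Path = {
    val file = Files.createFile(dir.resolve(name))
    addPermission(file, PosixFilePermission.OTHERS_READ)
    file
  }
}
```

Setting the bits in a second step, rather than passing them at creation time, is what makes the result independent of the umask, since the umask only applies when the file or directory is created.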
What changes were proposed in this pull request?
Fix test failure in build.
Depending on the umask of the process running tests (which is typically inherited from the user's default umask), the group writable bit for the files/directories could be set or unset. The test was assuming that by default the umask will be restrictive (and so files/directories won't be group writable). Since this is not a valid assumption, we use jnr to change the umask of the process to be more restrictive - so that the test can validate the behavior change - and reset it back once the test is done.
Why are the changes needed?
Fix test failure in build
Does this PR introduce any user-facing change?
No
Adds jnr as a test scoped dependency, which does not bring in any other new dependency (asm is already a dep in spark).
How was this patch tested?
Modification to existing test.
Tested on Linux, skips test when native posix env is not found.
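To see why the test's original assumption was fragile, one can check whether a freshly created directory is group writable under the current umask. The sketch below does that with JDK calls only and returns nothing on platforms without POSIX permissions. Note the substitution: the PR detects the native POSIX environment through jnr, whereas this sketch uses the JDK's `supportedFileAttributeViews` check as a stand-in. All names are hypothetical.

```scala
import java.nio.file.{FileSystems, Files, Path}
import java.nio.file.attribute.PosixFilePermission

// Illustrative only: hypothetical names; the POSIX check is a JDK-based stand-in,
// not the jnr-based detection used by the PR.
object UmaskAssumptionCheck {
  // True when the default file system exposes POSIX permissions.
  def posixSupported: Boolean =
    FileSystems.getDefault.supportedFileAttributeViews().contains("posix")

  def isGroupWritable(path: Path): Boolean =
    Files.getPosixFilePermissions(path).contains(PosixFilePermission.GROUP_WRITE)

  // Some(result) on POSIX file systems, None where the check has to be skipped.
  // The result depends on the umask the process inherited.
  def freshDirIsGroupWritable(): Option[Boolean] =
    if (!posixSupported) None
    else {
      val parent = Files.createTempDirectory("umask-check")
      // createDirectory without explicit attributes uses default permissions
      // filtered by the process umask.
      val dir = Files.createDirectory(parent.resolve("sub"))
      try Some(isGroupWritable(dir))
      finally {
        Files.delete(dir)
        Files.delete(parent)
      }
    }
}
```

With a umask of 0002 the result would be `Some(true)`, with 0022 or 0077 it would be `Some(false)`, which is why a test that hard-codes either outcome can pass on one machine and fail on another.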