@brumi1024
Member

Reverts #6960

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 17m 20s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 44m 41s | | trunk passed |
| +1 💚 | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | compile | 1m 26s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | checkstyle | 0m 39s | | trunk passed |
| +1 💚 | mvnsite | 0m 46s | | trunk passed |
| +1 💚 | javadoc | 0m 49s | | trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 0m 41s | | trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 1m 33s | | trunk passed |
| +1 💚 | shadedclient | 35m 38s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 34s | | the patch passed |
| +1 💚 | compile | 1m 19s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javac | 1m 19s | | the patch passed |
| +1 💚 | compile | 1m 16s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | javac | 1m 16s | | the patch passed |
| +1 💚 | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 27s | | the patch passed |
| +1 💚 | mvnsite | 0m 36s | | the patch passed |
| +1 💚 | javadoc | 0m 35s | | the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 |
| +1 💚 | javadoc | 0m 33s | | the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| +1 💚 | spotbugs | 1m 28s | | the patch passed |
| +1 💚 | shadedclient | 34m 56s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 25m 7s | | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 💚 | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 173m 9s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7028/1/artifact/out/Dockerfile |
| GITHUB PR | #7028 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 90faf6d5a35c 5.15.0-117-generic #127-Ubuntu SMP Fri Jul 5 20:13:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5648501 |
| Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7028/1/testReport/ |
| Max. process+thread count | 552 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7028/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@slfan1989
Contributor

@brumi1024 Why do we need to revert this PR? Is it because we encountered some issues?

@brumi1024
Member Author

@slfan1989 there is an issue with the current implementation: we catch every PrivilegedOperationException - including the ones caused by a user-requested application kill - and then proceed to mark the NM unhealthy. This should not happen. A bit further down in this class there is a separate exit code handling method for the container launch, which throws a config-related exception in cases where the error is truly unrecoverable without admin input. I plan to reuse it here as well.

But I'll only have time to work on that next week. Until then I think this state is harmful: after a few application kills, most of the NMs will be marked unhealthy, requiring a restart.
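
To make the distinction described in the comment above concrete, here is a minimal, self-contained sketch of the idea: only exit codes classified as unrecoverable should lead to the node being marked unhealthy, while failures such as a user-requested kill are left alone. This is not the Hadoop code. Every class, method, and constant name below is a hypothetical stand-in, and the exit code values are placeholders chosen for illustration only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: all names and exit code values are hypothetical,
// not taken from the NodeManager sources.
final class ExitCodeSketch {

    /** Stand-in for a privileged-operation failure; only the exit code is modelled. */
    static class PrivilegedOpFailure extends Exception {
        private final int exitCode;

        PrivilegedOpFailure(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    /** Stand-in for a config-related exception that needs admin intervention. */
    static class UnrecoverableConfigFailure extends Exception {
        UnrecoverableConfigFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Placeholder exit codes assumed to mean "broken configuration".
    static final int HYPOTHETICAL_INVALID_CONFIG = 201;
    static final int HYPOTHETICAL_BAD_PERMISSIONS = 202;

    private static final Set<Integer> UNRECOVERABLE_EXIT_CODES =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
            HYPOTHETICAL_INVALID_CONFIG, HYPOTHETICAL_BAD_PERMISSIONS)));

    /** Escalates only the exit codes classified as unrecoverable; others pass through. */
    static void handleExitCode(PrivilegedOpFailure e) throws UnrecoverableConfigFailure {
        if (UNRECOVERABLE_EXIT_CODES.contains(e.getExitCode())) {
            throw new UnrecoverableConfigFailure(
                "Unrecoverable failure, exit code " + e.getExitCode(), e);
        }
    }

    /** Returns true only when the failure was escalated as unrecoverable. */
    static boolean shouldMarkUnhealthy(PrivilegedOpFailure e) {
        try {
            handleExitCode(e);
            return false; // e.g. a killed application: the node itself is fine
        } catch (UnrecoverableConfigFailure unrecoverable) {
            return true;
        }
    }

    public static void main(String[] args) {
        // 137 is used here only as an example of a "process was killed" exit code.
        PrivilegedOpFailure killed = new PrivilegedOpFailure("application killed", 137);
        PrivilegedOpFailure badConfig =
            new PrivilegedOpFailure("invalid configuration", HYPOTHETICAL_INVALID_CONFIG);

        System.out.println("killed -> unhealthy? " + shouldMarkUnhealthy(killed));
        System.out.println("badConfig -> unhealthy? " + shouldMarkUnhealthy(badConfig));
    }
}
```

Under these assumptions, the health decision depends on the classified exit code rather than on the mere presence of an exception, which is the behaviour the revert is meant to make room for.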

@brumi1024 merged commit 8c41fbc into trunk on Sep 7, 2024
KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024