Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #1449] Fix when KyuubiServer initialize fail but not exit #1468

Closed

Conversation

zhenjiaguo
Copy link
Contributor

Why are the changes needed?

Kyuubi server should exit when initialize fail.

Detail see #1449.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

@yaooqinn
Copy link
Member

can you check the jstack to see which non-daemon thread prevents termination when an exception occurs?

@zhenjiaguo
Copy link
Contributor Author

zhenjiaguo commented Nov 29, 2021

can you check the jstack to see which non-daemon thread prevents termination when an exception occurs?

I use Arthas to see, Non-daemon thread is:

"SessionTracker" Id=22 TIMED_WAITING on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at java.lang.Object.wait(Native Method)
    -  waiting on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at org.apache.zookeeper.server.SessionTrackerImpl.run(SessionTrackerImpl.java:147)

Thread states are TIMED_WAIT.

@yaooqinn
Copy link
Member

can you check the jstack to see which non-daemon thread prevents termination when an exception occurs?

I use Arthas to see, Non-daemon thread is:

"SessionTracker" Id=22 TIMED_WAITING on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at java.lang.Object.wait(Native Method)
    -  waiting on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at org.apache.zookeeper.server.SessionTrackerImpl.run(SessionTrackerImpl.java:147)

Thread states are TIMED_WAIT.

Does this issue only occur when using embedded Zookeeper?

@zhenjiaguo
Copy link
Contributor Author

can you check the jstack to see which non-daemon thread prevents termination when an exception occurs?

I use Arthas to see, Non-daemon thread is:

"SessionTracker" Id=22 TIMED_WAITING on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at java.lang.Object.wait(Native Method)
    -  waiting on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at org.apache.zookeeper.server.SessionTrackerImpl.run(SessionTrackerImpl.java:147)

Thread states are TIMED_WAIT.

Does this issue only occur when using embedded Zookeeper?

YES, It causes by zkServer.start(). It has a thread still running when KyuubiServer initialize fail.
To make sure of this, I configure external zookeeper to confirm. It can exit when KyuubiServer initialize fail.

What do you think set embedded Zookeeper run as daemon thread, or do you have a better way to resolve it?

@yaooqinn
Copy link
Member

yaooqinn commented Dec 1, 2021

can you check the jstack to see which non-daemon thread prevents termination when an exception occurs?

I use Arthas to see, Non-daemon thread is:

"SessionTracker" Id=22 TIMED_WAITING on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at java.lang.Object.wait(Native Method)
    -  waiting on org.apache.zookeeper.server.SessionTrackerImpl@5344d655
    at org.apache.zookeeper.server.SessionTrackerImpl.run(SessionTrackerImpl.java:147)

Thread states are TIMED_WAIT.

Does this issue only occur when using embedded Zookeeper?

YES, It causes by zkServer.start(). It has a thread still running when KyuubiServer initialize fail. To make sure of this, I configure external zookeeper to confirm. It can exit when KyuubiServer initialize fail.

What do you think set embedded Zookeeper run as daemon thread, or do you have a better way to resolve it?

maybe we can stop the zkServer here, if zkServer.getServiceState == started?

@zhenjiaguo zhenjiaguo force-pushed the fix-TBFS-init-fail-not-exit branch from eebef54 to 4208663 Compare December 1, 2021 06:13
@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2021

Codecov Report

Merging #1468 (4467a24) into master (cc40c87) will decrease coverage by 0.01%.
The diff coverage is 25.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1468      +/-   ##
============================================
- Coverage     59.13%   59.12%   -0.02%     
  Complexity      172      172              
============================================
  Files           235      235              
  Lines         12050    12053       +3     
  Branches       1478     1479       +1     
============================================
  Hits           7126     7126              
- Misses         4319     4322       +3     
  Partials        605      605              
Impacted Files Coverage Δ
.../scala/org/apache/kyuubi/server/KyuubiServer.scala 60.25% <25.00%> (-2.42%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc40c87...4467a24. Read the comment docs.

@yaooqinn yaooqinn added this to the v1.4.1 milestone Dec 1, 2021
@yaooqinn yaooqinn closed this in 5a4e370 Dec 1, 2021
yaooqinn pushed a commit that referenced this pull request Dec 1, 2021
<!--
Thanks for sending a pull request!

Here are some tips for you:
  1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html
  2. If the PR is related to an issue in https://github.com/apache/incubator-kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
  3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->

### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
  1. If you add a feature, you can talk about the use case of it.
  2. If you fix a bug, you can clarify why it is a bug.
-->

Kyuubi server should exit when initialize fail.

Detail see #1449.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #1468 from zhenjiaguo/fix-TBFS-init-fail-not-exit.

Closes #1449

4467a24 [zhenjiaguo] add status judgment before zkServer stop
4208663 [zhenjiaguo] stop zkServer when kyuubi server init fail
1ff3a82 [zhenjiaguo] try catch server initialize

Authored-by: zhenjiaguo <zhenjia_guo@163.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 5a4e370)
Signed-off-by: Kent Yao <yao@apache.org>
@yaooqinn
Copy link
Member

yaooqinn commented Dec 1, 2021

thanks, merged to master and v1.4.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants