fix proc status #425

Merged (1 commit) on Sep 17, 2020

@tongtie tongtie commented Aug 13, 2019

I got this error:

2019-08-13 14:33:47,828 tcollector[1896951] [line:1345] WARNING: Terminating collector hbase_master.py after 615 seconds of inactivity
2019-08-13 14:33:47,829 tcollector[1896951] [line:210] INFO: Waiting 5s for PID 2527527 (hbase_master.py) to exit...
2019-08-13 14:33:48,831 tcollector[1896951] [line:75] ERROR: hbase_master.py still has a process (pid=2527527) and is being reset, terminating

The log says that the process still exists, but it is actually gone.
To verify this, I added the following code:

def register_collector(collector):
        ...
        if col.proc is not None:
            # Debugging aid: signal 0 delivers nothing; it only checks
            # whether the PID can still be signalled.
            try:
                os.kill(col.proc.pid, 0)
                LOG.info('pid=%d is running' % col.proc.pid)
            except OSError as e:
                LOG.error('pid=%d not running. %s' % (col.proc.pid, e))
            LOG.error('%s still has a process (pid=%d) and is being reset,'
                      ' terminating', col.name, col.proc.pid)

Output:

2019-08-13 16:30:26,347 tcollector[2575745] [line:1136] INFO: Heartbeat (13 collectors running)
2019-08-13 16:30:26,350 tcollector[2575745] [line:1350] WARNING: Terminating collector hbase_master.py after 601 seconds of inactivity
2019-08-13 16:30:26,351 tcollector[2575745] [line:215] INFO: Waiting 5s for PID 2575753 (hbase_master.py) to exit...
2019-08-13 16:30:27,351 tcollector[2575745] [line:78] ERROR: pid=2575753 not running. [Errno 3] No such process
2019-08-13 16:30:27,352 tcollector[2575745] [line:80] ERROR: hbase_master.py still has a process (pid=2575753) and is being reset, terminating
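The check above relies on os.kill with signal 0, which performs only the existence and permission checks and sends nothing. As a minimal standalone sketch of that probe (pid_is_running is a hypothetical helper name, not part of tcollector), it could look like this:

```python
import errno
import os
import subprocess
import sys


def pid_is_running(pid):
    """Return True if a process with this PID exists (hypothetical helper)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except OSError as e:
        if e.errno == errno.ESRCH:
            return False  # "No such process", as in the log above
        if e.errno == errno.EPERM:
            return True   # the process exists but belongs to another user
        raise
    return True


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()  # reap the child; an unreaped zombie still counts as existing
    print(pid_is_running(child.pid))
    print(pid_is_running(os.getpid()))
```

Note that a child that has exited but has not been waited on is still visible to this probe, so an ESRCH result like the one in the log means the process has already been reaped.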

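So I set self.proc = None in col.shutdown() to solve this problem. Once the process has been shut down, the collector no longer holds a stale process reference, and the "col.proc is not None" check in register_collector no longer reports a process that is already gone.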
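To illustrate the idea, here is a rough sketch using a heavily simplified stand-in for the collector class. The class body, the start() method and the terminate-then-wait logic are assumptions made for this example; the real shutdown() in tcollector does more work. Only the final assignment corresponds to the change in this PR.

```python
import subprocess
import sys


class Collector(object):
    """Simplified stand-in for tcollector's Collector (hypothetical)."""

    def __init__(self, name, argv):
        self.name = name
        self.argv = argv
        self.proc = None  # set while a child process is attached

    def start(self):
        self.proc = subprocess.Popen(self.argv)

    def shutdown(self):
        if self.proc is None:
            return
        if self.proc.poll() is None:
            self.proc.terminate()  # ask the child to exit if still running
        self.proc.wait()           # reap it so no zombie is left behind
        self.proc = None           # the fix: drop the stale reference


col = Collector("sleeper", [sys.executable, "-c", "import time; time.sleep(60)"])
col.start()
col.shutdown()
# col.proc is now None, so a check like "if col.proc is not None" is skipped.
```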

@johann8384 johann8384 merged commit a22817d into OpenTSDB:master Sep 17, 2020
@johann8384 johann8384 added this to the 1.3.3 milestone Sep 17, 2020