Commit 8740b99

tb_plugin: fix startup hang when no profile data
When TensorBoard starts with the plugin in a directory that contains no profile data, `TorchProfilerPlugin.is_active()` blocks forever. This hangs the TensorBoard server and leaves the client-side web page blank. This commit fixes the issue by ensuring that the thread monitoring the run directories sets the `_is_active_initialized_event` event not only when data is found, but also when the scan completes without finding any data.
1 parent 683604e commit 8740b99
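
The signalling pattern described in the commit message can be sketched in isolation. This is a minimal illustration, not the plugin's code: `PluginSketch`, its `run_dirs` argument, and the synchronous `_load_run` are hypothetical stand-ins, and it assumes that loading a run is what marks the plugin active. Only the names `_is_active`, `_is_active_initialized_event`, `_monitor_runs`, `_load_run`, and `is_active` come from the diff.

```python
import threading


class PluginSketch:
    """Hypothetical stand-in for the plugin; not the real TorchProfilerPlugin."""

    def __init__(self, run_dirs):
        self._run_dirs = list(run_dirs)  # hypothetical: run directories listed up front
        self._is_active = None           # unknown until the first scan finishes
        self._is_active_initialized_event = threading.Event()
        monitor = threading.Thread(target=self._monitor_runs, daemon=True)
        monitor.start()

    def _monitor_runs(self):
        # A single scan pass, for illustration.
        for run_dir in self._run_dirs:
            self._load_run(run_dir)
        # Signal completion even when no run directory was found.
        if not self._is_active:
            self._is_active = False
            self._is_active_initialized_event.set()

    def _load_run(self, run_dir):
        # Assumption: loading a run marks the plugin active and sets the event.
        self._is_active = True
        self._is_active_initialized_event.set()

    def is_active(self):
        # Blocks only until the first scan result is known.
        self._is_active_initialized_event.wait()
        return self._is_active
```

With this shape, `PluginSketch([]).is_active()` returns `False` rather than blocking, because the event is set at the end of the scan regardless of what was found.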

1 file changed: +12 -0 lines changed

tb_plugin/torch_tb_profiler/plugin.py

Lines changed: 12 additions & 0 deletions
@@ -69,7 +69,12 @@ def clean():
     def is_active(self):
         """Returns whether there is relevant data for the plugin to process.
         """
+        # On startup, this will block until the _monitor_runs thread has completed its scan and we
+        # know if there's any profiler data.
+        # On subsequent calls (eg when the TB UI refreshes), the latest scan result is returned
+        # immediately with no blocking.
         self._is_active_initialized_event.wait()
+        assert self._is_active is not None, "BUG: _is_active was not properly initialized!"
         return self._is_active
 
     def get_plugin_apps(self):
@@ -311,6 +316,13 @@ def _monitor_runs(self):
                             # Use threading to avoid UI stall and reduce data parsing time
                             t = threading.Thread(target=self._load_run, args=(run_dir,))
                             t.start()
+
+                    # Notify the other threads that we're done looking for run dirs even when none
+                    # were found.
+                    if not self._is_active:
+                        self._is_active = False
+                        self._is_active_initialized_event.set()
+
                 except Exception as ex:
                     logger.warning("Failed to scan runs. Exception=%s", ex, exc_info=True)
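
The failure mode this change removes can be reproduced without TensorBoard. The sketch below is an illustration under assumptions: `scan_and_signal` and `waits_successfully` are hypothetical helpers, and a bounded `Event.wait(timeout)` stands in for the unbounded `wait()` in `is_active()` so that the example terminates instead of hanging.

```python
import threading


def scan_and_signal(run_dirs, event, signal_when_empty):
    """Hypothetical scan: set `event` when data is found, optionally when none is."""
    found = len(run_dirs) > 0
    if found or signal_when_empty:
        event.set()


def waits_successfully(signal_when_empty, timeout=0.2):
    """Scan an empty directory list, then report whether a waiter would be released."""
    event = threading.Event()
    scanner = threading.Thread(
        target=scan_and_signal, args=([], event, signal_when_empty)
    )
    scanner.start()
    scanner.join()
    # Event.wait(timeout) returns True if the flag is set, False if the timeout expires.
    return event.wait(timeout)
```

Calling `waits_successfully(False)` models the old behaviour: the event is never set, so the wait times out and returns `False`, where an unbounded `wait()` would block forever. Calling `waits_successfully(True)` models the patched behaviour and returns `True`.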
