-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
live integration: get rid of dvc from live #5466
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
9fa1c1f
live integration: get rid of dvc from live
pared f31abf8
live: remove from api, add test for html generation during run
pared cd7e2de
stage: run: convert monitors to class context managers
pared a53fffe
run: make monitors run in single thread
pared 5d7a401
run: move monitor logic out of run
pared File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
import functools | ||
import logging | ||
import os | ||
import subprocess | ||
import threading | ||
from dataclasses import dataclass | ||
from typing import TYPE_CHECKING, Callable, List | ||
|
||
from dvc.repo.live import create_summary | ||
from dvc.stage.decorators import relock_repo | ||
from dvc.stage.exceptions import StageCmdFailedError | ||
|
||
if TYPE_CHECKING: | ||
from dvc.output import BaseOutput | ||
from dvc.stage import Stage | ||
|
||
|
||
logger = logging.getLogger(__name__) | ||
|
||
|
||
class CheckpointKilledError(StageCmdFailedError): | ||
pass | ||
|
||
|
||
class LiveKilledError(StageCmdFailedError): | ||
pass | ||
|
||
|
||
@dataclass | ||
class MonitorTask: | ||
stage: "Stage" | ||
execute: Callable | ||
proc: subprocess.Popen | ||
done: threading.Event = threading.Event() | ||
killed: threading.Event = threading.Event() | ||
|
||
@property | ||
def name(self) -> str: | ||
raise NotImplementedError | ||
|
||
@property | ||
def SIGNAL_FILE(self) -> str: | ||
raise NotImplementedError | ||
|
||
@property | ||
def error_cls(self) -> type: | ||
raise NotImplementedError | ||
|
||
@property | ||
def signal_path(self) -> str: | ||
return os.path.join(self.stage.repo.tmp_dir, self.SIGNAL_FILE) | ||
|
||
def after_run(self): | ||
pass | ||
|
||
|
||
class CheckpointTask(MonitorTask): | ||
name = "checkpoint" | ||
SIGNAL_FILE = "DVC_CHECKPOINT" | ||
error_cls = CheckpointKilledError | ||
|
||
def __init__( | ||
self, stage: "Stage", callback_func: Callable, proc: subprocess.Popen | ||
): | ||
super().__init__( | ||
stage, | ||
functools.partial( | ||
CheckpointTask._run_callback, stage, callback_func | ||
), | ||
proc, | ||
) | ||
|
||
@staticmethod | ||
@relock_repo | ||
def _run_callback(stage, callback_func): | ||
stage.save(allow_missing=True) | ||
stage.commit(allow_missing=True) | ||
logger.debug("Running checkpoint callback for stage '%s'", stage) | ||
callback_func() | ||
|
||
|
||
class LiveTask(MonitorTask): | ||
name = "live" | ||
SIGNAL_FILE = "DVC_LIVE" | ||
error_cls = LiveKilledError | ||
|
||
def __init__( | ||
self, stage: "Stage", out: "BaseOutput", proc: subprocess.Popen | ||
): | ||
super().__init__(stage, functools.partial(create_summary, out), proc) | ||
|
||
def after_run(self): | ||
# make sure summary is prepared for all the data | ||
self.execute() | ||
|
||
|
||
class Monitor: | ||
AWAIT: float = 1.0 | ||
|
||
def __init__(self, tasks: List[MonitorTask]): | ||
self.done = threading.Event() | ||
self.tasks = tasks | ||
self.monitor_thread = threading.Thread( | ||
target=Monitor._loop, args=(self.tasks, self.done,), | ||
) | ||
|
||
def __enter__(self): | ||
self.monitor_thread.start() | ||
|
||
def __exit__(self, exc_type, exc_val, exc_tb): | ||
self.done.set() | ||
self.monitor_thread.join() | ||
for t in self.tasks: | ||
t.after_run() | ||
|
||
@staticmethod | ||
def kill(proc): | ||
if os.name == "nt": | ||
return Monitor._kill_nt(proc) | ||
proc.terminate() | ||
proc.wait() | ||
|
||
@staticmethod | ||
def _kill_nt(proc): | ||
# windows stages are spawned with shell=True, proc is the shell process | ||
# and not the actual stage process - we have to kill the entire tree | ||
subprocess.call(["taskkill", "/F", "/T", "/PID", str(proc.pid)]) | ||
|
||
@staticmethod | ||
def _loop(tasks: List[MonitorTask], done: threading.Event): | ||
while True: | ||
for task in tasks: | ||
if os.path.exists(task.signal_path): | ||
try: | ||
task.execute() | ||
except Exception: # pylint: disable=broad-except | ||
logger.exception( | ||
"Error running '%s' task, '%s' will be aborted", | ||
task.name, | ||
task.stage, | ||
) | ||
Monitor.kill(task.proc) | ||
task.killed.set() | ||
finally: | ||
logger.debug( | ||
"Removing signal file for '%s' task", task.name | ||
) | ||
os.remove(task.signal_path) | ||
if done.wait(Monitor.AWAIT): | ||
return |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like there's much to this HTML summary (either here or in https://github.com/iterative/dvc/blob/master/dvc/utils/html.py). Would it make sense to build out summary capability in dvclive? That would allow for complete dvclive functionality without dvc.
Also, could the summary output format be abstracted in dvclive so that non-HTML outputs could be built? For example, a matplotlib output that auto-updates, or even a very basic cli output. Different output types could enable users to see realtime updates for model performance (similar to a progress bar) without opening a separate page that needs to be manually refreshed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dberenbaum
This question has few levels.
I guess it would make sense, but it would mean copying not only html module and this particular function, but quite a lot of code from
metrics
andplots
(live.show
is actually calling them) - so all the functionality related to vega-js would probably need to be copied too, the templates for example. Also having it in dvc allows us to use a commanddvc live
to visualize thelive
outputs. Its not too flashy as of today, but AFAIK, it is supposed to be iterated upon in the future.Probably, but that leaves us with a lot of questions related to plots itself - how would it work on different os-es? What about some pipeline running in a server where we connect via SSH? Also, what to do with
plots
command? Do we want to keep the HTML functionality? The plots were made in certain way in order to not have to support our own plotting framework, and use something that probably all of the users share - having browser. I think visualizinglive
using different means will be hard both to maintain and explain to users.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including these features in dvclive doesn't need to impact dvc at all. No need to block this PR or the release.
For users to be able to get value out of dvclive without dvc, could we build a basic summary capability in dvclive? I was wondering whether we could adapt what's currently in https://github.com/iterative/dvc/blob/master/dvc/utils/html.py as a starting point, but it doesn't need to be if there's a simpler way. Looking at that file, it looked like the data structures to feed to
HTML.write()
are reasonably simple, and the data is already being collected by dvclive.Thoughts? I can open a new issue to discuss if it's worthwhile.