-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduce index #6300
introduce index #6300
Conversation
Related #6039 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Let's clean it up, review and merge it π
I just noted that I didn't wrote: LGTM so far. |
42ae5ac
to
cb5b342
Compare
cb5b342
to
f01a576
Compare
|
||
|
||
if __name__ == "__main__": | ||
from funcy import log_durations | ||
|
||
from dvc.repo import Repo | ||
|
||
repo = Repo() | ||
index = Index(repo, repo.fs) | ||
print(index) | ||
with log_durations(print, "collecting stages"): | ||
# pylint: disable=pointless-statement | ||
print("no of stages", len(index.stages)) | ||
with log_durations(print, "building graph"): | ||
index.build_graph() | ||
with log_durations(print, "calculating hash"): | ||
print(index.identifier) | ||
with log_durations(print, "updating"): | ||
index2 = index.update(index.stages) | ||
with log_durations(print, "calculating hash"): | ||
print(index2.identifier) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if __name__ == "__main__": | |
from funcy import log_durations | |
from dvc.repo import Repo | |
repo = Repo() | |
index = Index(repo, repo.fs) | |
print(index) | |
with log_durations(print, "collecting stages"): | |
# pylint: disable=pointless-statement | |
print("no of stages", len(index.stages)) | |
with log_durations(print, "building graph"): | |
index.build_graph() | |
with log_durations(print, "calculating hash"): | |
print(index.identifier) | |
with log_durations(print, "updating"): | |
index2 = index.update(index.stages) | |
with log_durations(print, "calculating hash"): | |
print(index2.identifier) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll keep these as it is quite helpful in checking bugs/performance issues quickly, unless there is some strong reason against this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skshetry This is not suitable for the core codebase, just create a test instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skshetry Looks like we also still have these in parsing and in ui. They really should not be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I cannot run test on a real project, with this, I can do that by just python dvc/repo/index.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, because it's not merged yet. Like I said above, it acts as a quick litmus test, especially for UI where I can quickly visually inspect if there's any issues.
The functionality makes sense to me - to quickly check if our subsystems work correctly. I only have a problem with code snippets scattered around the codebase, that no one even knows exist and could be ran. How about we create a command (dvc test
? dvc doctor
? dvc debug
?), where we will collect all such self-checks, so that there is one place to test if dvc subsystems are operating correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For having a central way of running all these, it requires certain pre-conditions/post-conditions (eg: this needs to always be run inside a DVC repo). And, having a central way is not the point of this, tests: end-to-end or unit is there for that.
but it still looks to me as a code snippet that someone forgot to remove.
It is very much intentional. They use it for showing example usages and demonstration, not to add quick visual inspection.
no one even knows exist and could be ran
It's okay to not know they exist, as they are very local to the subsystem. Whenever someone happens to work in this module, they'll take a note of if __name__ == "main"
. When refactoring, they'll be able to grep it or find usages of it to fix. And, as it's not critical to us, it's okay if it breaks occasionally. Though the parsing/UI has never been broken, they still work.
one place to test
The parsing one is a test and uses doctest, so we could run it together with the tests. But there is not much benefit on running others, as they serve different purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skshetry Are there any other examples besides rich? I feel like it is a very specific library, I would love to see some other examples in serious projects (cli tools would be great). Just trying to understand when and how this is used.
CC @iterative/dvc , guys, wdyt about this practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd completely agree with moving this to tests/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so looks like this was indeed a practice that wasn't aware of. Thanks to @isidentical and @skshetry for pointing out that this is used in things like cpython, doctests, rich. However, I wasn't able to find a single case in any of the projects that I've looked through (e.g. numpy, pyinstaller, duliwch, boto, moto, scipy, spacy, etc), so this still feels like a pretty cryptic practice to me, so I'm not personally 100% sold on it, it feels like these snippets belong somewhere else. But there doesn't seem to be much opposition to it and we already have a few of these in the codebase, so let's keep it for now, and get back to discussing this practice in general on retro or elsewhere.
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π