init: rename/alias --no-scm and --subdir? #2901

Closed
dashohoxha opened this issue Dec 5, 2019 · 17 comments
Labels
discussion requires active participation to reach a conclusion enhancement Enhances DVC ui user interface / interaction

Comments

@dashohoxha
Contributor

dashohoxha commented Dec 5, 2019

UPDATE: Skip to #2901 (comment)

Rename --no-scm to --no-git, since Git is the only SCM supported and there are no plans (yet) to support any other SCM.

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Dec 5, 2019
@shcheklein
Member

We can make an alias, then deprecate and hide --no-scm.
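
A minimal sketch of what "alias, then deprecate and hide" could look like with Python's argparse. This is not DVC's actual parser: the --no-git spelling, the DeprecatedFlag class, and the init-sketch program name are hypothetical and only there for illustration.

```python
import argparse
import warnings


class DeprecatedFlag(argparse.Action):
    """Store True, but warn that the old spelling is deprecated."""

    def __init__(self, option_strings, dest, **kwargs):
        # nargs=0 makes this behave like a boolean switch.
        super().__init__(option_strings, dest, nargs=0, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        warnings.warn(
            f"{option_string} is deprecated, use --no-git instead",
            DeprecationWarning,
        )
        setattr(namespace, self.dest, True)


def build_parser():
    parser = argparse.ArgumentParser(prog="init-sketch")
    # Hypothetical new spelling, listed in --help.
    parser.add_argument(
        "--no-git", dest="no_scm", action="store_true",
        help="initialize without Git",
    )
    # Old spelling still works, but is hidden from --help and warns.
    parser.add_argument(
        "--no-scm", dest="no_scm", action=DeprecatedFlag,
        default=False, help=argparse.SUPPRESS,
    )
    return parser
```

Both spellings write to the same no_scm destination, so the rest of the command would not need to know which one was used.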

@efiop efiop added the enhancement Enhances DVC label Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 6, 2019
@efiop efiop added triage Needs to be triaged ui user interface / interaction labels Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 6, 2019
@efiop efiop added p2-medium Medium priority, should be done, but less important triage Needs to be triaged labels Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 6, 2019
@jorgeorpinel jorgeorpinel changed the title Rename --no-scm to --no-git init: rename/alias --no-scm to --no-git Dec 12, 2019
@jorgeorpinel
Contributor

Context: iterative/dvc.org#540 (comment)

@jorgeorpinel
Contributor

We could even automate detecting whether there's a Git repo underneath and not offer this option, merging this issue with #2472.

@efiop efiop added p3-nice-to-have It should be done this or next sprint and removed p2-medium Medium priority, should be done, but less important labels Feb 26, 2020
@jorgeorpinel
Contributor

Also, maybe --subdir should be --sub-git? (Context: iterative/dvc.org#1022 (comment))

@shcheklein
Member

I really don't like the idea of --sub-git - super confusing. I understood the motivation for the --subdir (to make the intention extra clear and explicit), but I still prefer it to be DVC-centric, going from DVC use cases when we describe it. Meaning that Git should be part of the technical explanation, but not an initial user-facing interface.

The use case for me - is being able to create isolated DVC sub-projects. Mostly due to monorepos. So starting the explanation from Git, or naming the option after it, leads in the wrong direction.

@jorgeorpinel
Contributor

Good points, I agree with a lot of that now that I think about it.

The use case for me - is being able to create isolated DVC sub-projects

Except it kind of is about Git originally, here's the original issue: Support for multiple .dvc roots in a single git repo /issues/2349 . You can have a parent DVC project started with --subdir. "subdir" refers to a subdirectory of a Git repo, which can also be a DVC repo but that's not the key aspect.

Another interface I just thought of would be to kind of join these options into a single --git (or --scm if we want to avoid the word Git in our UI) that accepts 3 values: root (default), subdir, and no. To me right now that would make the most sense semantically.

$ mkdir repo && cd repo
$ git init
$ dvc init --scm=root  # same as plain  dvc init

$ mkdir repo && cd repo
$ git init
$ mkdir dir && cd dir
$ dvc init --scm=subdir

$ mkdir project && cd project
$ dvc init --scm=no
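
For illustration only, the single three-valued option proposed above could be declared like this with argparse. The --scm flag, its values, and the init-sketch program name are the proposal from this comment plus hypothetical names, not an existing DVC interface.

```python
import argparse

# The three values from the proposal, with a short description of each.
SCM_MODES = {
    "root": "Git repo root is the current directory (same as plain init)",
    "subdir": "Git repo root is somewhere above the current directory",
    "no": "no Git repo at all",
}


def build_parser():
    parser = argparse.ArgumentParser(prog="init-sketch")
    parser.add_argument(
        "--scm",
        choices=sorted(SCM_MODES),
        default="root",  # so the flag can be omitted in the common case
        help="how the new project relates to Git",
    )
    return parser
```

With `choices`, argparse itself rejects any value other than the three listed, and omitting the flag falls back to root.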

Cc @efiop @pared thoughts?

@shcheklein
Member

@jorgeorpinel

looks like you don't need one of those (at least --scm=root)?

and --scm again makes it about some technical detail, more than about the use case - just initializing an isolated sub-project.

also, let's imagine we forget about no-scm, let's assume for now that we always deal with Git (this is 99% of the cases) - would you name the first two options differently?

@jorgeorpinel
Contributor

jorgeorpinel commented Mar 10, 2020

Yes, the root value would be the default so no need to use it explicitly. It was just an example to illustrate the 3 effective options.

--scm again, makes it about some technical detail

Same as --no-scm. Actual use case: no data versioning. Should it be init --no-ver?

If init --no-scm didn't exist, my suggestion wouldn't be worth the change indeed. But it does exist and arguably the subdir case is similarly rare. And since they're related, merging them could make sense.

use case - just initializing an isolated sub-project

That's one of the use cases: a DVC subproject inside a parent DVC repository (tracked by Git) – not a subproject inside a non-git parent project though.

But there's also the use case of a root DVC project inside a plain Git repo a.k.a. monorepo (I think that's the main one).

So what about init --monorepo? 😋

@shcheklein
Member

arguably the subdir case is similarly rare

It might be rare, but those are big companies with large projects, so they are important because of this :) no-scm is rare, and it's not yet clear whether anybody is using it at all, except us for tests and debugging.

That's one of the use cases: a DVC subproject inside a parent DVC repository (tracked by Git) – not a subproject inside a non-git parent project though.

sounds the same to me - isolated sub-project. What is the difference?

But there's also the use case of a root DVC project inside a plain Git repo a.k.a. monorepo (I think that's the main one). So what about init --monorepo? 😋

Again, not sure I'm following. Could you elaborate?

@jorgeorpinel jorgeorpinel changed the title init: rename/alias --no-scm to --no-git init: rename/alias --no-scm and --subdir? Mar 10, 2020
@jorgeorpinel jorgeorpinel added discussion requires active participation to reach a conclusion and removed help wanted p3-nice-to-have It should be done this or next sprint labels Mar 10, 2020
@jorgeorpinel
Contributor

jorgeorpinel commented Mar 10, 2020

To summarize, both options are related to Git so (1) one idea is to combine them into one, but in any case we would need to decide the final naming, either for the option itself, or for the accepted values.

(2) if we want to name them based on the use case and avoid technical/implementation jargon, I would say --no-scm should be called -n | --no-versioning.

(3) --subdir is also not a great name IMO because the main use case is mostly about monorepos (DVC project inside Git repo subdir). [To answer your question @shcheklein:] I wouldn't call this a sub-project either because that implies the parent is also a project and that's not always the case, probably not even the most common case.
I'm suggesting -m | --monorepo, keeping in mind that this option internally tells DVC to look up the working tree for a .git/ dir.
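
A rough sketch of what "look up the working tree for a .git/ dir" means in practice. This is not DVC's implementation; find_git_root is a hypothetical helper, and it only checks for a .git directory (it ignores cases where .git is a file, such as worktrees or submodules).

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the closest directory at or above `start` containing .git/."""
    current = start.resolve()
    # Check the starting directory first, then each parent up to /.
    for candidate in (current, *current.parents):
        if (candidate / ".git").is_dir():
            return candidate
    return None  # reached the filesystem root without finding a repo
```

In a monorepo, calling this from a project subdirectory would return the repository root several levels up.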

@shcheklein
Member

To summarize, both options are related to Git

So, that's probably the key, where I do see things differently (and maybe I'm wrong, of course). They are related in how they use Git, but for --subdir I don't agree that it's the primary factor. For example it changes the way DVC commands work - is it also related to Git? The primary thing for the dvc subdir to my mind is creating isolated DVC projects - nested or parallel. How are they connected to Git? It's just a matter of specification, the same as how isolation specifically works.

I wouldn't call this a sub-project either because that implies the parent is also a project

I think it's fine, in the case of a monorepo, to call each separate project a sub-project - at least it's clear to me what it means.

@shcheklein
Member

Some additional thoughts.

dvc init and dvc init --subdir use Git in the same way (at least it's easy to generalize and easier to understand, rather than trying to explain this option from some Git-related behavior changes) - they look for root up to /. In this sense, if we forget about --no-scm, there will be no difference at all. The only reason --subdir was introduced is to make it extra explicit (to avoid someone by chance initializing DVC in the wrong dir) - still not 100% convinced that was needed tbh, but I committed to this design after Ruslan explained the reason to me.

@jorgeorpinel
Contributor

jorgeorpinel commented Mar 18, 2020

it changes the way DVC commands work - is it also related to Git?

I think so, yes. It affects them in that they look for .git/ above the .dvc/ root.

The primary thing for the dvc subdir to my mind is creating isolated DVC projects - nested or parallel. How are they connected to Git?

I don't think this is what --subdir ended up doing. And I know the docs say this but it may be incorrect. I tested nesting dvc init --no-scm projects and it's perfectly possible without --subdir, and each isolated project ignores the other ones, whether nested, siblings, or parents.

I think --subdir is actually just a required flag in order to init a regular project in a subdirectory of a Git repo. Cc @pared right? Feel free to read above and chime in here!

dvc init and dvc init --subdir use Git in the same way ... they look for root up to /

Maybe I didn't understand you, but only --subdir looks for .git root above cwd.

The only reason --subdir was introduced is to make it extra explicit

I think it was introduced to allow multiple DVC projects inside a single Git repo (monorepo use case, see #2349). It's not possible to do this accidentally (with plain dvc init).
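
The behavior as described in this thread can be summarized as a small decision function. This is a sketch of the discussion, not DVC's code: check_init and its messages are hypothetical, git_root is assumed to be either cwd or one of its ancestors, and the "no Git repo found" branch is an assumption about what happens without --no-scm.

```python
from pathlib import Path
from typing import Optional


def check_init(
    cwd: Path,
    git_root: Optional[Path],
    subdir: bool = False,
    no_scm: bool = False,
) -> str:
    """Describe whether init would be allowed, per the thread's description."""
    if no_scm:
        # Git is not consulted at all, so nesting needs no extra flag.
        return "ok: initialize without Git"
    if git_root is None:
        # Assumption: without a Git repo, init needs --no-scm.
        return "error: no Git repo found, use --no-scm"
    if git_root == cwd:
        return "ok: initialize at the Git repo root"
    if subdir:
        # Git repo root is above cwd and the user asked for this explicitly.
        return "ok: initialize in a subdirectory of the Git repo"
    return "error: not at the Git repo root, use --subdir"
```

The last branch is the "not possible to do this accidentally" case: in a subdirectory of a Git repo, plain init is refused until --subdir is passed.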


Anyway, this is a huge discussion about options names. Probably there's no huge impact in renaming them, but I think this shows there's still some confusion as to what these options achieve. Maybe we should transfer this issue to dvc.org and focus on improving the init cmd ref?

@jorgeorpinel
Contributor

Actually there's already an issue related to this in the docs repo. I added a comment there: iterative/dvc.org#1039 (comment). So it's up to you guys whether to continue this discussion or not. Just curious to see what Pawel and maybe @efiop think, if they have some extra time to read above.

@shcheklein
Member

I don't think this is what --subdir ended up doing. And I know the docs say this but it may be incorrect. I tested nesting dvc init --no-scm projects and it's perfectly possible without --subdir, and each isolated project ignores the other ones, whether nested, siblings, or parents.

Let's not mix things together. Let's forget about --no-scm when we talk about --subdir. The fact that in the --no-scm mode you don't have to provide any explicit flags to make nested/isolated repos does not change the purpose of --subdir - to create isolated projects in Git-tracked repositories.

I think --subdir is actually just a required flag in order to init a regular project in a subdirectory of a Git repo.

Again, it's a technical implementation detail; it does not answer the question of why we need this.

but I think this shows there's still some confusion as to what these options achieve.

That's important. If we forget about no-scm and implementation details and read the current doc, is something not clear enough? How can we improve that?

@pared
Contributor

pared commented Mar 23, 2020

Sorry for chiming in so late.

I think --subdir is actually just a required flag in order to init a regular project in a subdirectory of a Git repo.

@jorgeorpinel you are right here.

As to renaming the option, I have no strong opinion here. I understand that the current naming might not perfectly describe what those options do. But this seems to have been the problem with the initialization issue all along: we had to have quite a few discussions just to agree on the behavior in different situations.

If anything, I would refrain from using git in naming those options. The fact that as of today scm == git in our internal glossary does not mean it will always be the case.

@jorgeorpinel
Contributor

If we forget about no-scm and implementation details and read the current doc, is something not clear enough? How can we improve that?

Yes @shcheklein, migrated this part of the discussion to iterative/dvc.org/issues/1039.

And thanks for the input, Pawel.

So I guess since there are no strong opinions on the option names and no users have reported much confusion around this (that I know of), we can close this!

5 participants