✨ restructure deps command with package-lock.yml #6735
also leave dbt deps as just a new click group
change deps command to deps install
@@ -289,7 +289,13 @@ def debug(ctx, **kwargs):
# dbt deps
@cli.command("deps")
@cli.group()
I figured out why `invoke_without_command=True` doesn't work for this case. Everything in click works, but in Flags we did a less-than-ideal lookup that assumes a function name always matches the last part of the input command. I will create a follow-up ticket for that, but if you change the function names `deps_install` and `deps_lock` below to just `install` and `lock`, you should be able to run without any issue. One thing I didn't check is whether the params at the group level are correctly represented in Flags.
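A minimal sketch of the click pattern discussed here, assuming a standalone `deps` group rather than dbt's actual `cli` object (the function bodies and echoed messages are placeholders, not dbt code). With `invoke_without_command=True`, the group callback runs even when no subcommand is given, so a bare `deps` can fall back to `install`:

```python
import click
from click.testing import CliRunner


@click.group(invoke_without_command=True)
@click.pass_context
def deps(ctx):
    """Hypothetical stand-in for the `dbt deps` group."""
    # click runs this callback even without a subcommand;
    # ctx.invoked_subcommand is None in that case.
    if ctx.invoked_subcommand is None:
        ctx.invoke(install)


# Function names match the last part of the command, as the review suggests.
@deps.command("install")
def install():
    click.echo("installing packages")


@deps.command("lock")
def lock():
    click.echo("writing package-lock.yml")


# Exercise the group in-process instead of from a shell.
runner = CliRunner()
bare_output = runner.invoke(deps, []).output
lock_output = runner.invoke(deps, ["lock"]).output
```

Here a bare `deps` behaves like `deps install`, while `deps lock` still dispatches to its own subcommand.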
Thanks @justbldwn for this amazing PR! And sorry for getting back to you late! This makes a lot of sense to me. I left some comments and questions to see what you think of them!
@@ -94,13 +94,14 @@ def _load_yaml(path):
    return load_yaml_text(contents)

def package_data_from_root(project_root):
    package_filepath = resolve_path_from_base("packages.yml", project_root)
def package_data_from_root(project_root, package_file_name="packages.yml"):
Looking at the update and how this is being used, it seems more like a function to load a YAML file than anything related to package data. Maybe consider dropping the default value for `package_file_name` and renaming the function completely?
class DepsTask(BaseTask):
    def __init__(self, args: Any, project: Project):
        move_to_nearest_project_dir(project.project_root)
        super().__init__(args=args, config=None, project=project)
        self.cli_vars = args.vars

        if not system.path_exists(f"{self.project.project_root}/package-lock.yml"):
If a lock file already exists, and I update my `packages.yml` and run `dbt deps`, do we update the lock YAML somehow?
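To make the question concrete, here is a rough sketch of the existence check as the PR description states it: the lock file is created only when it is missing. `ensure_lock_file` and `copy_as_lock` are hypothetical names for illustration, not dbt functions, and the copy stands in for real package resolution:

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_lock_file(project_root: Path, create_lock: Callable[[Path, Path], None]) -> Path:
    """Create package-lock.yml only if it is missing; return its path."""
    packages_path = project_root / "packages.yml"
    lock_path = project_root / "package-lock.yml"
    if not lock_path.exists():
        create_lock(packages_path, lock_path)
    return lock_path


def copy_as_lock(packages_path: Path, lock_path: Path) -> None:
    # Placeholder for real dependency resolution (assumption for this sketch).
    lock_path.write_text(packages_path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "packages.yml").write_text("packages: []\n")
    lock_path = ensure_lock_file(root, copy_as_lock)

    # Editing packages.yml afterwards does not touch the existing lock file.
    (root / "packages.yml").write_text("packages:\n  - local: ../other\n")
    ensure_lock_file(root, copy_as_lock)
    lock_is_stale = lock_path.read_text() == "packages: []\n"
    source_name = lock_path.name
```

Under this existence-only check, the second call leaves the lock file unchanged, which is the situation the question above is asking about.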
@@ -55,39 +96,149 @@ def track_package_install(
        )

    def run(self) -> None:
        if system.path_exists(self.project.packages_install_path):
Do you think this could cause any surprise behavior that's different from how deps works today?
            return

        with downloads_directory():
            resolved_deps = resolve_packages(packages, self.project, self.cli_vars)
Depending on where those packages live, this might actually pull down a git repo, right? Just to understand what's happening: we pull it down, remove it in the function above that has a comment on it, and then pull it down again on install? Did we always pull down git repos twice before?
fire_event(
    DepsLockUpdating(lock_filepath=f"{self.project.project_root}/package-lock.yml")
)
LockTask(self.args, self.project).run()
To make sure I understand correctly: both `deps install` (when no lock file has been added before) and `deps add` will run `LockTask` and update the lock YAML. Have you considered having `deps add` also install the package when it is not a dry run? I am wondering whether it would be more natural for `deps add` to just install the newly added package. CC @dbeatty10
In terms of the yaml write I found this answer works.
@justbldwn are you still interested in working on this feature? Otherwise we can help bring it across the finish line.
@ChenyuLInx so sorry for the delay in responding! i may be able to help in the coming weeks. if you have someone who takes over and you need my assistance, too, i'm happy to help where possible.
@justbldwn no worries at all! Thank you so much for the contribution! Deps is something that hasn't gotten much improvement over the years, and we are excited to see changes to it. I am just back to pick this up and am working on getting it into dbt-core. I just merged the main branch into your branch and pushed it to branch
This PR was continued in #8408 and ultimately merged.
resolves #6643
Description
this PR restructures the `dbt deps` command, aiming to add a few new key features:
- a `package-lock.yml` file that will keep record of all packages installed during the last `dbt deps` run
- `package-lock.yml` as the default file to install packages from, which helps reduce install time
- new `dbt deps` subcommands: `dbt deps lock` and `dbt deps add`
- a clean `dbt_packages/` directory each time `dbt deps` is run (no more persisting old packages left in the directory)

the below will provide an overview of the changes, what they accomplish, and ~~some~~ a lot of detail on implementation and examples. sorry in advance for the length! 😅

dbt deps lock
- will resolve the packages in `packages.yml` and list them and their transitive dependencies in a new `package-lock.yml` file.
- `package-lock.yml` will get written to the project_root alongside `packages.yml`.

example command:
dbt deps add
- adds a package to the `packages.yml` file with configurable version and location of a given package.
- adds a `--dry-run` flag to the `dbt deps add` sub-command that, when `False`, will re-create the lock file after a new package is added to the `packages.yml` file. (this is set up as `False` by default in `argparse`; a user can use the flag and set it to `True` to not persist changes down to the `package-lock.yml` file.)
- `packages.yml` if a package name already exists (either explicit or fuzzy name matching)
- `source != "local"`

example command:
dbt deps
🚩 (this is now changed as of 2/10/2023, changes noted in strikethrough) 🚩
1. ~~largely remains unchanged in order to stay backward compatible~~ changed the `dbt deps` command. when working with click, i couldn't get `dbt deps` to be a `@cli.group()` while also allowing it to run as an isolated command. i tried using `invoke_without_command=True` in the `@cli.group()` decorator, but still didn't get it to work. so for now, i changed `dbt deps` to be an isolated `@cli.group()` and added a new `@deps.command("install")` to handle the old `dbt deps` functionality (this mimics how `dbt docs` and `dbt source` are set up).
   key takeaway: `dbt deps` has been changed to `dbt deps install` and should be treated as a breaking change. happy to discuss this further!
2. when running ~~`dbt deps`~~ `dbt deps install`, it will check immediately to see if a `package-lock.yml` file exists in the project root directory. if not, it will create one based on the `packages.yml` in the project root directory.
3. `package-lock.yml` now becomes the default for installing packages.

example command:
Examples:
example `packages.yml`:
example `package-lock.yml`:
first, i believe the breaking change from `dbt deps` to `dbt deps install` should warrant discussion on this PR. notes are in the above section titled `dbt deps`.
additionally, there were 2 notable strange yaml formatting issues i ran into with this when outputting data to yaml files:
number 1:
using `yaml.dump()` or `yaml.safe_dump()` when writing package/lock results formats the output slightly differently than what's in a `packages.yml` and advertised on the dbt website. i tried indent formatting, but nothing i tried seemed to be able to replicate what's above in the `packages.yml` vs. `package-lock.yml` examples.
number 2:
another strange one that i couldn't figure out with yaml formatting was how to get the semantic versioning to display like it's advertised on the dbt website. for example:
online:
`dbt deps add`:
if i use the `default_flow_style=None` argument in `yaml.safe_dump()`, we can fix how lists are displayed in the yaml file, but it then messes up how dicts are displayed.

functionally, neither of these should impact programmatic reading of the yaml data, but it does make things a touch different than what's in the examples online/the expectations you hope for with file formatting.
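for the list indentation issue described under number 1, one commonly used PyYAML workaround (not necessarily what this PR ended up using) is a dumper subclass that overrides `increase_indent` so that sequences nested in mappings are indented. the sketch below assumes PyYAML 5.1 or newer for `sort_keys`; the class name and the package/version values are made up for illustration:

```python
import yaml


class IndentedListDumper(yaml.SafeDumper):
    """Hypothetical dumper that indents list items under their parent key."""

    def increase_indent(self, flow=False, indentless=False):
        # Force indentless=False so "- item" lines are indented.
        return super().increase_indent(flow, False)


# Illustrative data only; not taken from the PR's examples.
lock_data = {"packages": [{"package": "dbt-labs/dbt_utils", "version": "1.0.0"}]}

# Default block style: list items start at the parent key's column.
default_text = yaml.safe_dump(lock_data, default_flow_style=False, sort_keys=False)

# Custom dumper: list items are indented under "packages:".
indented_text = yaml.dump(
    lock_data,
    Dumper=IndentedListDumper,
    default_flow_style=False,
    sort_keys=False,
)
```

both strings load back to the same data with `yaml.safe_load`, so the difference is purely cosmetic, matching the point above that these formatting issues should not affect programmatic reads.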
Conclusion:
I think that covers nearly everything! please don't hesitate to ask me any questions on this. i'm really looking forward to your review and working on any changes that may be needed. thank you!
Checklist
- `changie new` to create a changelog entry