Cache package-lock.yml file #1086

Merged
merged 9 commits into from
Aug 8, 2024

Conversation

pankajastro
Contributor

@pankajastro pankajastro commented Jul 10, 2024

Description

This PR aims to cache the package-lock.yml in cache_dir/dbt_project

Since dbt version 1.7.0, executing the dbt deps command results in the generation of a package-lock.yml file. This file pins the dependencies and their versions for the dbt project. dbt uses this file to install packages, ensuring predictable and consistent package installations across environments.

  • This feature is enabled only if the user checks in package-lock.yml in their dbt project. Also, I'm assuming that if package-lock.yml is present, their dbt-core version is >= 1.7.0, since this feature is only available for dbt >= 1.7.0
  • package-lock.yml also contains the sha1_hash of the packages. In this PR, that hash is used to check whether the cached package-lock.yml is outdated
  • The cached package-lock.yml is finally copied from the cached path to the tmp project and used
  • To update dependencies or versions, it is expected that the user will manually update their package-lock.yml in the dbt project using the dbt deps command.
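
The flow described in the bullets above could be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the function names (read_sha1_hash, is_cache_outdated, sync_lock_file) and the directory arguments are hypothetical, and the sketch assumes that sha1_hash appears as a top-level key on its own line in package-lock.yml.

```python
import re
import shutil
from pathlib import Path
from typing import Optional

LOCK_FILE_NAME = "package-lock.yml"


def read_sha1_hash(lock_file: Path) -> Optional[str]:
    """Return the sha1_hash recorded in a lock file, or None if unavailable.

    Assumption: the hash is stored as a top-level `sha1_hash: <value>` line.
    """
    if not lock_file.exists():
        return None
    match = re.search(r"^sha1_hash:\s*(\S+)\s*$", lock_file.read_text(), flags=re.MULTILINE)
    return match.group(1) if match else None


def is_cache_outdated(project_dir: Path, cache_dir: Path) -> bool:
    """The cache is stale when it has no hash or its hash differs from the project's."""
    project_hash = read_sha1_hash(project_dir / LOCK_FILE_NAME)
    cached_hash = read_sha1_hash(cache_dir / LOCK_FILE_NAME)
    return cached_hash is None or cached_hash != project_hash


def sync_lock_file(project_dir: Path, cache_dir: Path, tmp_project_dir: Path) -> bool:
    """Refresh the cached lock file if needed, then copy it into the tmp project.

    Returns False when the project has no checked-in lock file, in which
    case nothing is cached or copied.
    """
    project_lock = project_dir / LOCK_FILE_NAME
    if not project_lock.exists():
        return False

    cached_lock = cache_dir / LOCK_FILE_NAME
    if is_cache_outdated(project_dir, cache_dir):
        # Hypothetical refresh step: overwrite the cached copy with the project's file.
        cache_dir.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(project_lock, cached_lock)

    # The cached copy, not the project copy, is what ends up in the tmp project.
    tmp_project_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(cached_lock, tmp_project_dir / LOCK_FILE_NAME)
    return True
```

In this sketch, a project without a checked-in package-lock.yml is left untouched, which mirrors the opt-in behaviour described above; comparing only the sha1_hash value avoids parsing the rest of the file.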

Related Issue(s)

closes: #930

Breaking Change?

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

netlify bot commented Jul 10, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit 4559a42
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/668e0c141d9e710008156b88

netlify bot commented Jul 10, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit af2a4ea
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/66b5296ae850850008045096

cosmos/cache.py (outdated, resolved)

codecov bot commented Jul 11, 2024

Codecov Report

Attention: Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.53%. Comparing base (711bb7c) to head (af2a4ea).

Files Patch % Lines
cosmos/cache.py 97.22% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1086      +/-   ##
==========================================
+ Coverage   96.51%   96.53%   +0.02%     
==========================================
  Files          64       64              
  Lines        3325     3374      +49     
==========================================
+ Hits         3209     3257      +48     
- Misses        116      117       +1     

@pankajastro pankajastro changed the title from "WIP: Cache package-lock.yml file" to "Cache package-lock.yml file" Jul 14, 2024
@pankajastro pankajastro marked this pull request as ready for review July 15, 2024 06:37
@dosubot dosubot bot added the size:L, area:dependencies, and dbt:deps labels Jul 15, 2024
@tatiana tatiana added this to the Cosmos 1.6.0 milestone Jul 18, 2024
Contributor

@pankajkoti pankajkoti left a comment

LGTM mostly, have a couple of questions inline

cosmos/cache.py (resolved)
cosmos/cache.py (outdated, resolved)
cosmos/cache.py (outdated, resolved)
@dosubot dosubot bot added the lgtm label Aug 5, 2024
cosmos/cache.py (outdated, resolved)
Contributor

@pankajkoti pankajkoti left a comment

LGTM, one minor suggestion inline

cosmos/dbt/graph.py (outdated, resolved)
@pankajastro pankajastro merged commit e847f19 into main Aug 8, 2024
62 checks passed
@pankajastro pankajastro deleted the cache-lockfile branch August 8, 2024 22:53
tatiana pushed a commit that referenced this pull request Aug 14, 2024
@pankajkoti pankajkoti mentioned this pull request Aug 16, 2024
pankajkoti added a commit that referenced this pull request Aug 20, 2024
New Features

* Add support for loading manifest from cloud stores using Airflow
Object Storage by @pankajkoti in #1109
* Cache ``package-lock.yml`` file by @pankajastro in #1086
* Support persisting the ``LoadMode.VIRTUALENV`` directory by @tatiana
in #1079
* Add support to store and fetch ``dbt ls`` cache in remote stores by
@pankajkoti in #1147
* Add default source nodes rendering by @arojasb3 in #1107
* Add Teradata ``ProfileMapping`` by @sc250072 in #1077

Enhancements

* Add ``DatabricksOauthProfileMapping`` profile by @CorsettiS in #1091
* Use ``dbt ls`` as the default parser when ``profile_config`` is
provided by @pankajastro in #1101
* Add task owner to dbt operators by @wornjs in #1082
* Extend Cosmos custom selector to support + when using paths and tags
by @mvictoria in #1150
* Simplify logging by @dwreeves in #1108

Bug fixes

* Fix Teradata ``ProfileMapping`` target invalid issue by @sc250072 in
#1088
* Fix empty tag in case of custom parser by @pankajastro in #1100
* Fix ``dbt deps`` of ``LoadMode.DBT_LS`` should use
``ProjectConfig.dbt_vars`` by @tatiana in #1114
* Fix import handling by lazy loading hooks introduced in PR #1109 by
@dwreeves in #1132
* Fix Airflow 2.10 regression and add Airflow 2.10 in test matrix by
@pankajastro in #1162

Docs

* Fix typo in azure-container-instance docs by @pankajastro in #1106
* Use Airflow trademark as it has been registered by @pankajastro in
#1105

Others

* Run some example DAGs in Kubernetes execution mode in CI by
@pankajastro in #1127
* Install requirements.txt by default during dev env spin up by
@CorsettiS in #1099
* Remove ``DbtGraph.current_version`` dead code by @tatiana in #1111
* Disable test for Airflow-2.5 and Python-3.11 combination in CI by
@pankajastro in #1124
* Pre-commit hook updates in #1074, #1113, #1125, #1144, #1154, #1167

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
Labels
area:dependencies Related to dependencies, like Python packages, library versions, etc dbt:deps Primarily related to dbt deps command or functionality lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Cache dbt deps lock file
3 participants