Create a databricks-iris starter that enables packaged deployment on Databricks #129
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Left some comments, I haven't done any manual testing.
...ks-iris/{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/databricks_run.py
databricks-iris/{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/hooks.py
databricks-iris/{{ cookiecutter.repo_name }}/conf/base/logging.yml
databricks-iris/{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/__main__.py
….yml Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
…' of github.com:kedro-org/kedro-starters into feat/modify-pyspark-iris-databricks-packaged-deployment Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
To test this:
Left a few comments, see more at kedro-org/kedro#2595
databricks-iris/{{ cookiecutter.repo_name }}/conf/base/spark.yml
Thanks for figuring this out @astrojuanlu!
LGTM!
Thank you! LGTM
Motivation and Context
The guide on deploying packaged projects to Databricks proposed in kedro-org/kedro#2595 uses the `databricks-iris` starter. This PR adds that starter. The `databricks-iris` starter is a duplicate of the `pyspark-iris` starter with a few changes:
- `databricks_run.py`: a module for running the project on Databricks, needed because Click prevents us from running projects with the default entry point on Databricks.
- Logs are written to DBFS (`conf/base/logging.yml`).
- Datasets in `conf/base/catalog.yml` are saved in `/dbfs/FileStore`.
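
For readers who have not opened the diff, here is a rough sketch of what a Click-free run module along the lines of `databricks_run.py` might look like. It is not the code in this PR: the option names (`--env`, `--conf-source`, `--package-name`) and the split into `parse_args` and `main` are assumptions made for illustration. The only Kedro calls used are `configure_project` and `KedroSession.create`, and they are imported lazily so that the argument handling can be tried without Kedro installed.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse run options with argparse rather than Click.

    The option names are hypothetical; they are not taken from the PR.
    """
    parser = argparse.ArgumentParser(description="Run a packaged Kedro project.")
    parser.add_argument("--env", default=None, help="Kedro configuration environment")
    parser.add_argument(
        "--conf-source", dest="conf_source", default=None,
        help="Path to the configuration directory, e.g. a folder on DBFS",
    )
    parser.add_argument(
        "--package-name", dest="package_name", required=True,
        help="Name of the packaged Kedro project to run",
    )
    # With argv=None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> None:
    """Configure the packaged project and run its default pipeline."""
    args = parse_args(argv)

    # Imported here so that parse_args() works even where Kedro is absent.
    from kedro.framework.project import configure_project
    from kedro.framework.session import KedroSession

    configure_project(args.package_name)
    with KedroSession.create(env=args.env, conf_source=args.conf_source) as session:
        session.run()
```

An entry point named `databricks_run` in `setup.py` could then point at a `main` function like this one, so that a Databricks job calls it with the options as parameters. The exact wiring is in the diff itself.
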
This PR has a large diff because it is a brand new starter. Only the following files have been changed from `pyspark-iris`:
- `{{ cookiecutter.repo_name }}/src/setup.py`: contains an entry point definition, `databricks_run`.
- `{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/databricks_run.py`: contains a script needed to run a packaged Kedro project on Databricks.
- `{{ cookiecutter.repo_name }}/conf/base/logging.yml`: config for writing logs to DBFS.
- `{{ cookiecutter.repo_name }}/conf/base/catalog.yml`: points to datasets on DBFS.
How has this been tested?
Manually on Databricks in conjunction with the new guide.
Checklist