Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add databricks asset bundles docs #4265

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

noklam
Copy link
Contributor

@noklam noklam commented Oct 28, 2024

Description

Partially #3360

Development notes

  • Move the old dbx docs to a separate page
  • take over the go-to ide integration page with Databricks Asset bundle using kedro-databricks

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@@ -0,0 +1,272 @@
# Use an IDE, dbx and Databricks Repos to develop a Kedro project
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not new docs so no need to review, I move the original databricks_ide_developmenet.md page to a new file and use that file for DAB instead.

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
@noklam noklam force-pushed the noklam/databricks-asset-bundles-docs branch from 8eec4ba to 2733196 Compare October 29, 2024 15:07
@noklam noklam marked this pull request as ready for review October 31, 2024 01:02
@noklam noklam requested review from DimedS and removed request for yetudada October 31, 2024 01:03
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also update the index page https://docs.kedro.org/en/stable/deployment/databricks/index.html and mention asset bundles?

The `dbx` package is deprecated by Databricks, and dbx workflow documentation is moved to a [new page](./databricks_dbx_workflow.md).
```

This guide demonstrates a workflow for developing a Kedro Project on Databricks using Databricks Asset Bundles. You will learn how to develop your project using a local environment, then use `kedro-databricks` and Databricks Asset Bundle to package your code for running pipelines on Databricks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


The next step is to use dbx to sync your project to your Repo.
`kedro-databricks` is a wrapper around the `databricks` CLI. It's the simplest way to get started without getting stuck with configuration. To learn more about Databricks Asset Bundles and customization, read [What are Databricks Asset Bundles](https://docs.databricks.com/en/dev-tools/bundles/index.html).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: UK spelling

Suggested change
`kedro-databricks` is a wrapper around the `databricks` CLI. It's the simplest way to get started without getting stuck with configuration. To learn more about Databricks Asset Bundles and customization, read [What are Databricks Asset Bundles](https://docs.databricks.com/en/dev-tools/bundles/index.html).
`kedro-databricks` is a wrapper around the `databricks` CLI. It's the simplest way to get started without getting stuck with configuration. To learn more about Databricks Asset Bundles and customisation, read [What are Databricks Asset Bundles](https://docs.databricks.com/en/dev-tools/bundles/index.html).

```

This command will sync your local directory (`--source .`) with your Repo (`--dest-repo iris-databricks`) on Databricks. When started for the first time, `dbx sync` will write output similar to the following to your terminal:
2. Initialize the Databricks configuration:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
2. Initialize the Databricks configuration:
2. Initialise the Databricks configuration:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants