feat: Add ManagedTableDataset for managed Delta Lake tables in Databricks #127

Closed
wants to merge 78 commits into from
Changes from 41 commits
Commits
78 commits
27ec9a3
committing first version of UnityTableCatalog with unit tests. This d…
dannyrfar Feb 10, 2023
fa14ea0
Replace kedro.pipeline with modular_pipeline.pipeline factory (#99)
adamfrly Feb 1, 2023
c0724c6
Fix outdated links in Kedro Datasets (#111)
SajidAlamQB Feb 1, 2023
227a5df
Fix docs formatting and phrasing for some datasets (#107)
deepyaman Feb 1, 2023
5f01c67
Release `kedro-datasets` `version 1.0.2` (#112)
SajidAlamQB Feb 2, 2023
6734a7e
Bump pytest to 7.2 (#113)
merelcht Feb 7, 2023
86aa3a7
Prefix Docker plugin name with "Kedro-" in usage message (#57)
deepyaman Feb 7, 2023
5115607
Keep Kedro-Docker plugin docstring from appearing in `kedro -h` (#56)
deepyaman Feb 7, 2023
4b5da98
[kedro-datasets ] Add `Polars.CSVDataSet` (#95)
wmoreiraa Feb 9, 2023
deb3cce
Remove deprecated `test_requires` from `setup.py` in Kedro-Docker (#54)
deepyaman Feb 9, 2023
76e477a
renaming dataset
dannyrfar Feb 14, 2023
d0542fc
adding mlflow connectors
dannyrfar Feb 23, 2023
b2a03b6
fixing mlflow imports
dannyrfar Feb 23, 2023
201fc8a
cleaned up mlflow for initial release
dannyrfar Mar 8, 2023
5de5fd9
Pass the `kedro_init_version` to `ProjectMetadata` (#119)
deepyaman Feb 20, 2023
cd3b6f3
Keep Kedro-Airflow plugin docstring from appearing in `kedro -h` (#118)
deepyaman Feb 21, 2023
a56257c
Make the SQLQueryDataSet compatible with mssql. (#101)
yassineAlouini Feb 27, 2023
e14f5b4
Add warning when `SparkDataSet` is used on Databricks without a valid…
jmholzer Mar 6, 2023
524535a
cleaned up mlflow references from setup.py for initial release
dannyrfar Mar 8, 2023
9389aa4
fixed deps in setup.py
dannyrfar Mar 8, 2023
e6157a5
adding comments before intiial PR
dannyrfar Mar 13, 2023
a314685
Snowpark (Snowflake) dataset for kedro (#104)
Vladimir-Filimonov Mar 9, 2023
cb73804
moved validation to dataclass
dannyrfar Mar 14, 2023
19da1c0
Release `kedro-datasets` `version 1.0.2` (#112)
SajidAlamQB Feb 2, 2023
cdb563f
Fix bandit check by adding timeout to requests.post calls (#133)
merelcht Mar 20, 2023
a227c36
Bump version (#132)
merelcht Mar 20, 2023
6d41956
bug fix in type of partition column and cleanup
dannyrfar Mar 21, 2023
ee5f424
Fix malformed doc strings causing RTD builds to fail on Kedro (#136)
jmholzer Mar 21, 2023
795099a
updated docstring for ManagedTableDataSet
dannyrfar Mar 21, 2023
1319336
Fix docs formatting and phrasing for some datasets (#107)
deepyaman Feb 1, 2023
b80525f
Release `kedro-datasets` `version 1.0.2` (#112)
SajidAlamQB Feb 2, 2023
f516cc2
[kedro-datasets ] Add `Polars.CSVDataSet` (#95)
wmoreiraa Feb 9, 2023
5148772
Make the SQLQueryDataSet compatible with mssql. (#101)
yassineAlouini Feb 27, 2023
0f6da60
Add warning when `SparkDataSet` is used on Databricks without a valid…
jmholzer Mar 6, 2023
4c07c9b
Snowpark (Snowflake) dataset for kedro (#104)
Vladimir-Filimonov Mar 9, 2023
83e8388
Bump version (#132)
merelcht Mar 20, 2023
4f7bac1
Fix malformed doc strings causing RTD builds to fail on Kedro (#136)
jmholzer Mar 21, 2023
c5500f3
Merge branch 'main' into main
dannyrfar Mar 21, 2023
a6454a8
added backticks to catalog
dannyrfar Apr 5, 2023
9cf8e6e
Merge branch 'main' into main
dannyrfar Apr 5, 2023
9873d18
fixing regex to allow hyphens
dannyrfar Apr 11, 2023
f0c9e2e
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
a8bd47d
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
e9adb76
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
3f85f73
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
00b4eaf
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
fe5440e
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
1621578
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
787ed0d
Update kedro-datasets/test_requirements.txt
dannyrfar May 3, 2023
b1c6832
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
085dea9
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
2c2e960
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
267c9ef
Sync delta-spark requirements (#160)
noklam Apr 6, 2023
0febe06
Fix links on GitHub issue templates (#150)
astrojuanlu Apr 11, 2023
c190e31
Migrate most of `kedro-datasets` metadata to `pyproject.toml` (#161)
astrojuanlu Apr 12, 2023
7d648e6
Upgrade Polars (#171)
astrojuanlu Apr 17, 2023
18d9350
if release is failed, it return exit code and fail the CI (#158)
noklam Apr 17, 2023
2eb53ac
Migrate `kedro-airflow` to static metadata (#172)
astrojuanlu Apr 18, 2023
494fa5f
Migrate `kedro-telemetry` to static metadata (#174)
astrojuanlu Apr 18, 2023
45151ec
ci: port lint, unit test, and e2e tests to Actions (#155)
ankatiyar Apr 19, 2023
942fd01
Migrate `kedro-docker` to static metadata (#173)
astrojuanlu Apr 19, 2023
2d8eb28
Introdcuing .gitpod.yml to kedro-plugins (#185)
noklam Apr 21, 2023
ce1138e
sync APIDataSet from kedro's `develop` (#184)
noklam Apr 24, 2023
f3e361a
[kedro-datasets] Bump version of `tables` in `test_requirements.txt` …
ankatiyar Apr 25, 2023
99e3a41
ci: ensure title matches Conventional Commits spec (#187)
deepyaman Apr 26, 2023
fdd205c
Use PEP 526 syntax for variable type annotations (#190)
merelcht Apr 26, 2023
c0dd796
fix(datasets): Refactor TensorFlowModelDataset to DataSet (#186)
BrianCechmanek Apr 28, 2023
074429c
Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py
dannyrfar May 3, 2023
c0bb229
adding backticks to catalog
dannyrfar May 3, 2023
cca7559
Merge branch 'main' into main
dannyrfar May 3, 2023
d911eb6
Merge branch 'kedro-org:main' into main
jmholzer May 4, 2023
fa52c47
Require pandas < 2.0 for compatibility with spark < 3.4
jmholzer May 4, 2023
5b0b84b
Replace use of walrus operator
jmholzer May 4, 2023
8d0c00d
Add test coverage for validation methods
jmholzer May 4, 2023
adaf2f6
Remove unused versioning functions
jmholzer May 4, 2023
ae5235f
Fix exception catching for invalid schema, add test for invalid schema
jmholzer May 5, 2023
76b593c
Add pylint ignore
jmholzer May 5, 2023
2ced5a9
Add tests/databricks to ignore for no-spark tests
jmholzer May 12, 2023
3 changes: 3 additions & 0 deletions kedro-datasets/.gitignore
@@ -145,3 +145,6 @@ kedro.db
kedro/html
docs/tmp-build-artifacts
docs/build
spark-warehouse
metastore_db/
derby.log
8 changes: 8 additions & 0 deletions kedro-datasets/kedro_datasets/databricks/__init__.py
@@ -0,0 +1,8 @@
"""Provides interface to Unity Catalog Tables."""

__all__ = ["ManagedTableDataSet"]

from contextlib import suppress

with suppress(ImportError):
    from .managed_table_dataset import ManagedTableDataSet
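
For reviewers unfamiliar with the guard used in this `__init__.py`, here is a minimal sketch of how `contextlib.suppress(ImportError)` behaves when an optional dependency is missing. The module name `hypothetical_missing_backend` is an assumption made up for illustration and is not part of this PR; the `imported` flag exists only to make the behaviour observable.

```python
from contextlib import suppress

# Flag used only in this sketch to show whether the guarded block finished.
imported = False

with suppress(ImportError):
    # Hypothetical module name, assumed NOT to be installed.
    import hypothetical_missing_backend  # noqa: F401

    # Skipped when the import above raises ImportError
    # (ModuleNotFoundError is a subclass of ImportError).
    imported = True

print("optional backend imported:", imported)
```

Under the same pattern, the package itself would still import when the Databricks dependencies are absent, but the name `ManagedTableDataSet` would not be bound in that case even though it is listed in `__all__`.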