feat(datasets): Add GoogleSheetsDataset #810

Open: wants to merge 4 commits into base: main
Changes from 3 commits

2 changes: 2 additions & 0 deletions kedro-datasets/RELEASE.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
# Upcoming Release
Author:

P.S. What's wrong with the pre-commit hooks in this repo? They seem to change something in every file.

Contributor @DimedS, Aug 21, 2024:

Let's fix it in a separate PR; I will do it.

Contributor:

I wasn't able to replicate the issue with pre-commit. Could you let me know how you're running pre-commit? For me it didn't generate so many changes in different files when I tried to commit this dataset. You can also remove that commit and continue without it: 06a5439

## Major features and improvements

* Added experimental `GoogleSheetsDataset` to read/write data to Google Sheets

## Bug fixes and other changes
## Breaking Changes
## Community contributions
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/api/api_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``APIDataset`` loads the data from HTTP(S) APIs.
It uses the python requests library: https://requests.readthedocs.io/en/latest/
"""

from __future__ import annotations

import json as json_ # make pylint happy
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""BioSequenceDataset loads and saves data to/from bio-sequence objects to
file.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/dask/csv_dataset.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""``CSVDataset`` is a data set used to load and save data to CSV files using Dask
dataframe"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/dask/parquet_dataset.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""``ParquetDataset`` is a data set used to load and save data to parquet files using Dask
dataframe"""

from __future__ import annotations

from copy import deepcopy
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``ManagedTableDataset`` implementation to access managed delta tables
in Databricks.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/email/message_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
using an underlying filesystem (e.g.: local, S3, GCS). It uses the
``email`` package in the standard library to manage email messages.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/geopandas/geojson_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
underlying functionality is supported by geopandas, so it supports all
allowed geopandas (pandas) options for loading and saving geojson files.
"""

from __future__ import annotations

import copy
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""``HoloviewsWriter`` saves Holoviews objects as image file(s) to an underlying
filesystem (e.g. local, S3, GCS)."""

from __future__ import annotations

import io
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/ibis/__init__.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
"""Provide data loading and saving functionality for Ibis's backends."""

from typing import Any

import lazy_loader as lazy
9 changes: 6 additions & 3 deletions kedro-datasets/kedro_datasets/ibis/table_dataset.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
"""Provide data loading and saving functionality for Ibis's backends."""

from __future__ import annotations

from copy import deepcopy
@@ -185,9 +186,11 @@ def _describe(self) -> dict[str, Any]:
  "filepath": self._filepath,
  "file_format": self._file_format,
  "table_name": self._table_name,
- "backend": self._connection_config.get("backend")
- if self._connection_config
- else None,
+ "backend": (
+ self._connection_config.get("backend")
+ if self._connection_config
+ else None
+ ),
  "load_args": self._load_args,
  "save_args": self._save_args,
  "materialized": self._materialized,
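Note on the `_describe` hunk above: the change looks like formatting only, since wrapping a conditional expression in parentheses does not alter its value. A minimal sketch of that equivalence follows. The two helper functions are hypothetical stand-ins for illustration, not code from `TableDataset`, and `"duckdb"` is just an example value.

```python
from __future__ import annotations

from typing import Any


def describe_unwrapped(connection_config: dict[str, Any] | None) -> dict[str, Any]:
    # Old layout: the conditional expression spans three lines, no parentheses.
    return {
        "backend": connection_config.get("backend")
        if connection_config
        else None,
    }


def describe_wrapped(connection_config: dict[str, Any] | None) -> dict[str, Any]:
    # New layout: the same expression, wrapped in parentheses.
    return {
        "backend": (
            connection_config.get("backend")
            if connection_config
            else None
        ),
    }


# Both layouts agree for a missing config, an empty config,
# a config with a backend, and a config without one.
for config in (None, {}, {"backend": "duckdb"}, {"database": "x.db"}):
    assert describe_unwrapped(config) == describe_wrapped(config)
```

If this holds for the real method too, the hunk can be reviewed as a pure style change.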
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/json/json_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``JSONDataset`` loads/saves data from/to a JSON file using an underlying
filesystem (e.g.: local, S3, GCS). It uses native json to handle the JSON file.
"""

from __future__ import annotations

import json
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/matlab/matlab_dataset.py
Original file line number Diff line number Diff line change
@@ -3,6 +3,7 @@
the specified backend library passed in (defaults to the ``matlab`` library), so it
supports all allowed options for loading and saving matlab files.
"""

from __future__ import annotations

from copy import deepcopy
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""``MatplotlibWriter`` saves one or more Matplotlib objects as image
files to an underlying filesystem (e.g. local, S3, GCS)."""

from __future__ import annotations

import base64
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/networkx/gml_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
file using an underlying filesystem (e.g.: local, S3, GCS). NetworkX is used to
create GML data.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/networkx/graphml_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""NetworkX ``GraphMLDataset`` loads and saves graphs to a GraphML file using an underlying
filesystem (e.g.: local, S3, GCS). NetworkX is used to create GraphML data.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/networkx/json_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``JSONDataset`` loads and saves graphs to a JSON file using an underlying
filesystem (e.g.: local, S3, GCS). NetworkX is used to create JSON data.
"""

from __future__ import annotations

import json
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/csv_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``CSVDataset`` loads/saves data from/to a CSV file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the CSV file.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/deltatable_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
S3, GCS), Databricks unity catalog and AWS Glue catalog respectively. It handles
load and save using a pandas dataframe.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/excel_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``ExcelDataset`` loads/saves data from/to a Excel file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the Excel file.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/feather_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
using an underlying filesystem (e.g.: local, S3, GCS). The underlying functionality
is supported by pandas, so it supports all operations the pandas supports.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/generic_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the
type of read/write target.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/hdf_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``HDFDataset`` loads/saves data from/to a hdf file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas.HDFStore to handle the hdf file.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/json_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``JSONDataset`` loads/saves data from/to a JSON file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the JSON file.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/parquet_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``ParquetDataset`` loads/saves data from/to a Parquet file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the Parquet file.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pandas/xml_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``XMLDataset`` loads/saves data from/to a XML file using an underlying
filesystem (e.g.: local, S3, GCS). It uses pandas to handle the XML file.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pickle/pickle_dataset.py
Original file line number Diff line number Diff line change
@@ -3,6 +3,7 @@
the specified backend library passed in (defaults to the ``pickle`` library), so it
supports all allowed options for loading and saving pickle files.
"""

from __future__ import annotations

import importlib
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/pillow/image_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``ImageDataset`` loads/saves image data as `numpy` from an underlying
filesystem (e.g.: local, S3, GCS). It uses Pillow to handle image file.
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/plotly/json_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``JSONDataset`` loads/saves a plotly figure from/to a JSON file using an underlying
filesystem (e.g.: local, S3, GCS).
"""

from __future__ import annotations

import json
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/plotly/plotly_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
file using an underlying filesystem (e.g.: local, S3, GCS). It loads the JSON into a
plotly figure.
"""

from __future__ import annotations

import json
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/polars/csv_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``CSVDataset`` loads/saves data from/to a CSV file using an underlying
filesystem (e.g.: local, S3, GCS). It uses polars to handle the CSV file.
"""

from __future__ import annotations

import logging
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
filesystem (e.g.: local, S3, GCS). It uses polars to handle the
type of read/write target.
"""

from __future__ import annotations

from copy import deepcopy
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
filesystem (e.g.: local, S3, GCS). It uses polars to handle the
type of read/write target.
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/redis/redis_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``PickleDataset`` loads/saves data from/to a Redis database. The underlying
functionality is supported by the redis library, so it supports all allowed
options for instantiating the redis app ``from_url`` and setting a value."""

from __future__ import annotations

import importlib
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""``AbstractDataset`` implementation to access Snowflake using Snowpark dataframes
"""

from __future__ import annotations

import logging
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/spark/deltatable_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``AbstractDataset`` implementation to access DeltaTables using
``delta-spark``.
"""

from __future__ import annotations

from pathlib import PurePosixPath
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/spark/spark_hive_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``AbstractDataset`` implementation to access Spark dataframes using
``pyspark`` on Apache Hive.
"""

from __future__ import annotations

import pickle
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/spark/spark_jdbc_dataset.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
"""SparkJDBCDataset to load and save a PySpark DataFrame via JDBC."""

from __future__ import annotations

from copy import deepcopy
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
"""SparkStreamingDataset to load and save a PySpark Streaming DataFrame."""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/svmlight/svmlight_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
underlying filesystem (e.g.: local, S3, GCS). It uses sklearn functions
``dump_svmlight_file`` to save and ``load_svmlight_file`` to load a file.
"""

from __future__ import annotations

from copy import deepcopy
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``TensorFlowModelDataset`` is a dataset implementation which can save and load
TensorFlow models.
"""

from __future__ import annotations

import copy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/text/text_dataset.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""``TextDataset`` loads/saves data from/to a text file using an underlying
filesystem (e.g.: local, S3, GCS).
"""

from __future__ import annotations

from copy import deepcopy
1 change: 1 addition & 0 deletions kedro-datasets/kedro_datasets/video/video_dataset.py
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
filesystem (e.g.: local, S3, GCS). It uses OpenCV VideoCapture to read
and decode videos and OpenCV VideoWriter to encode and write video.
"""

from __future__ import annotations

import itertools