PySQL Connector split into connector and sqlalchemy #444

jprakash-db · 2024-09-24T05:04:17Z

Major Change - v4.x.x

Description

The databricks-sql-python library is being split into two packages to meet the following needs:

PyArrow should be optional, both for users who do not intend to handle large volumes of data and for users who want a smaller package.
The SQLAlchemy code is moved to a separate library, databricks-sqlalchemy, so that users can use either SQLAlchemy v1 or SQLAlchemy v2 with the latest version of the connector.

The Split

The two packages after the split are:

databricks-sql-python

It will be the core part of the library and will remain in this GitHub repo.
PyArrow becomes an optional dependency of the connector and will not be installed by default.
pip install databricks-sql-connector installs the lean connector, and pip install databricks-sql-connector[pyarrow] installs the complete connector.

! Not installing PyArrow disables features such as Cloud Fetch and other functions that require Arrow. Without PyArrow, only inline results are supported.
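
The optional-dependency behaviour described here can be sketched as a small availability check. This is only an illustration and not the connector's actual code: the function names pyarrow_available and check_arrow_support are hypothetical, and the warning text is an assumption based on the "Print warning message if pyarrow is not installed" commit in this PR.

```python
import importlib.util
import warnings


def pyarrow_available() -> bool:
    """Return True if the optional pyarrow package can be found.

    find_spec only locates the package; it does not import it.
    """
    return importlib.util.find_spec("pyarrow") is not None


def check_arrow_support() -> bool:
    """Hypothetical helper: warn once if Arrow-based features are unavailable."""
    if pyarrow_available():
        return True
    # Assumed wording; the real connector's message may differ.
    warnings.warn(
        "pyarrow is not installed; only inline results will be supported. "
        "Install it with: pip install databricks-sql-connector[pyarrow]",
        stacklevel=2,
    )
    return False
```

A caller could branch on check_arrow_support() before requesting Arrow-only features, falling back to inline results otherwise.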

databricks-sqlalchemy

The SQLAlchemy code is moved to a separate repository to control its release flow.
The databricks-sqlalchemy library has a core dependency on the connector with PyArrow, so both databricks-sql-python and PyArrow are installed when databricks-sqlalchemy is installed.
You can install the latest SQLAlchemy v1 based library using pip install databricks-sqlalchemy~=1.0, or the SQLAlchemy v2 based library using pip install databricks-sqlalchemy.
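
As a rough sketch of the version mapping above, the helper below picks the pip requirement that matches a given SQLAlchemy version. The function name is hypothetical and is not part of either package; the two requirement strings are taken from this description.

```python
def databricks_sqlalchemy_requirement(sqlalchemy_version: str) -> str:
    """Map a SQLAlchemy version string to the matching pip requirement.

    Hypothetical helper for illustration only.
    """
    major = int(sqlalchemy_version.split(".")[0])
    if major == 1:
        # SQLAlchemy v1 based releases of databricks-sqlalchemy
        return "databricks-sqlalchemy~=1.0"
    if major == 2:
        # The latest release is SQLAlchemy v2 based
        return "databricks-sqlalchemy"
    raise ValueError(f"Unsupported SQLAlchemy major version: {major}")
```

For example, databricks_sqlalchemy_requirement("1.4.52") yields the ~=1.0 pin, while a 2.x version yields the unpinned package name.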

Published Library on PyPi

Development Details

Going forward, all PRs related to databricks-sql-python will be raised on this repo.
The SQLAlchemy v1 based library is not under active development and has therefore been moved to the v1/main branch in the databricks-sqlalchemy repo. All future PRs for it must be raised against this branch.
The SQLAlchemy v2 based library is under active development and will live on the default main branch in the databricks-sqlalchemy repo.

PR Details

Tasks Completed

Refactored the code into its respective folders based on the proposed design doc.
The pyproject.toml file has been changed to reflect the proper dependencies for the split.
Made sure that all existing e2e and unit tests pass before and after the split, ensuring parity.
Added benchmarking queries to compare performance before and after the split; a dashboard has been created for visualization.
Added dependency tests to check how the library behaves when certain libraries are not available and the user requests functions that need them.
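
A dependency test of this kind might look like the sketch below. It uses the standard library's unittest skip decorator rather than the pytest marks mentioned in the commit log, and the test class and test names are hypothetical, not taken from the repository.

```python
import importlib.util
import unittest

# True only when the optional pyarrow package can be located.
HAS_PYARROW = importlib.util.find_spec("pyarrow") is not None


class ArrowDependentTests(unittest.TestCase):
    @unittest.skipUnless(HAS_PYARROW, "pyarrow is not installed")
    def test_arrow_table_row_count(self):
        # Runs only in environments that have pyarrow available.
        import pyarrow as pa

        table = pa.table({"n": [1, 2, 3]})
        self.assertEqual(table.num_rows, 3)

    def test_inline_rows_without_arrow(self):
        # Runs everywhere: plain Python rows need no Arrow support.
        rows = [(1, "a"), (2, "b")]
        self.assertEqual([row[0] for row in rows], [1, 2])
```

Saved to a file, this can be run with python -m unittest; in a lean install the Arrow test is reported as skipped instead of failing.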

How to Test

The testing pipeline remains the same as before the split.
pytest can be used to run both the integration and unit tests directly: pytest [directory_name or file_name]

Performance Comparison - Benchmarking

The pre-split and post-split performance has been compared using large and small queries to make sure there is no performance regression.
A dashboard has been created so that every time the benchmarking is run, the results are stored in the benchfood and comparisons can be made easily.
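
The PR does not show how the benchmark is implemented, so the following is only a minimal sketch of a pre/post comparison. The function names, the use of a median over repeated runs, and the 10% tolerance are all assumptions made for illustration.

```python
import statistics
import time
from typing import Callable


def benchmark(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def no_regression(pre_seconds: float, post_seconds: float,
                  tolerance: float = 0.10) -> bool:
    """True if the post-split time is at most `tolerance` slower than pre-split.

    The 10% default is an arbitrary illustrative threshold.
    """
    return post_seconds <= pre_seconds * (1 + tolerance)
```

In practice run_query would wrap a small or large query executed through the pre-split or post-split connector, and the two medians would be passed to no_regression.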

…ore part (#417) * Implemented ColumnQueue to test the fetchall without pyarrow Removed token removed token * order of fields in row corrected * Changed the folder structure and tested the basic setup to work * Refractored the code to make connector to work * Basic Setup of connector, core and sqlalchemy is working * Basic integration of core, connect and sqlalchemy is working * Setup working dynamic change from ColumnQueue to ArrowQueue * Refractored the test code and moved to respective folders * Added the unit test for column_queue Fixed __version__ Fix * venv_main added to git ignore * Added code for merging columnar table * Merging code for columnar * Fixed the retry_close sesssion test issue with logging * Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing * Added pyarrow_test mark on pytest * Fixed databricks.sqlalchemy to databricks_sqlalchemy imports * Added poetry.lock * Added dist folder * Changed the pyproject.toml * Minor Fix * Added the pyarrow skip tag on unit tests and tested their working * Fixed the Decimal and timestamp conversion issue in non arrow pipeline * Removed not required files and reformatted * Fixed test_retry error * Changed the folder structure to src / databricks * Removed the columnar non arrow flow to another PR * Moved the README to the root * removed columnQueue instance * Revmoved databricks_sqlalchemy dependency in core * Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml * Ran the black formatter with the original version * Extra .py removed from all the __init__.py files names * Undo formatting check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * BIG UPDATE * Refeactor code * Refractor * Fixed versioning * Minor refractoring * Minor refractoring

…ave pyarrow as optional

Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <jacky.hu@databricks.com>

Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <jacky.hu@databricks.com>

github-actions · 2024-12-11T06:37:29Z

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

.github/workflows/code-quality-checks.yml

pyproject.toml

* Modified the gitignore file to not have .idea file * [PECO-1803] Splitting the PySql connector into the core and the non core part (#417) * Implemented ColumnQueue to test the fetchall without pyarrow Removed token removed token * order of fields in row corrected * Changed the folder structure and tested the basic setup to work * Refractored the code to make connector to work * Basic Setup of connector, core and sqlalchemy is working * Basic integration of core, connect and sqlalchemy is working * Setup working dynamic change from ColumnQueue to ArrowQueue * Refractored the test code and moved to respective folders * Added the unit test for column_queue Fixed __version__ Fix * venv_main added to git ignore * Added code for merging columnar table * Merging code for columnar * Fixed the retry_close sesssion test issue with logging * Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing * Added pyarrow_test mark on pytest * Fixed databricks.sqlalchemy to databricks_sqlalchemy imports * Added poetry.lock * Added dist folder * Changed the pyproject.toml * Minor Fix * Added the pyarrow skip tag on unit tests and tested their working * Fixed the Decimal and timestamp conversion issue in non arrow pipeline * Removed not required files and reformatted * Fixed test_retry error * Changed the folder structure to src / databricks * Removed the columnar non arrow flow to another PR * Moved the README to the root * removed columnQueue instance * Revmoved databricks_sqlalchemy dependency in core * Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml * Ran the black formatter with the original version * Extra .py removed from all the __init__.py files names * Undo formatting check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * BIG UPDATE * Refeactor code * Refractor * Fixed versioning * Minor refractoring * Minor refractoring * Changed the folder 
structure such that sqlalchemy has not reference here * Fixed README.md and CONTRIBUTING.md * Added manual publish * On push trigger added * Manually setting the publish step * Changed versioning in pyproject.toml * Bumped up the version to 4.0.0.b3 and also changed the structure to have pyarrow as optional * Removed the sqlalchemy tests from integration.yml file * [PECO-1803] Print warning message if pyarrow is not installed (#468) Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1803] Remove sqlalchemy and update README.md (#469) Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Removed all sqlalchemy related stuff * generated the lock file * Fixed failing tests * removed poetry.lock * Updated the lock file * Fixed poetry numpy 2.2.2 issue * Workflow fixes --------- Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Co-authored-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>

* Modified the gitignore file to not have .idea file * [PECO-1803] Splitting the PySql connector into the core and the non core part (#417) * Implemented ColumnQueue to test the fetchall without pyarrow Removed token removed token * order of fields in row corrected * Changed the folder structure and tested the basic setup to work * Refractored the code to make connector to work * Basic Setup of connector, core and sqlalchemy is working * Basic integration of core, connect and sqlalchemy is working * Setup working dynamic change from ColumnQueue to ArrowQueue * Refractored the test code and moved to respective folders * Added the unit test for column_queue Fixed __version__ Fix * venv_main added to git ignore * Added code for merging columnar table * Merging code for columnar * Fixed the retry_close sesssion test issue with logging * Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing * Added pyarrow_test mark on pytest * Fixed databricks.sqlalchemy to databricks_sqlalchemy imports * Added poetry.lock * Added dist folder * Changed the pyproject.toml * Minor Fix * Added the pyarrow skip tag on unit tests and tested their working * Fixed the Decimal and timestamp conversion issue in non arrow pipeline * Removed not required files and reformatted * Fixed test_retry error * Changed the folder structure to src / databricks * Removed the columnar non arrow flow to another PR * Moved the README to the root * removed columnQueue instance * Revmoved databricks_sqlalchemy dependency in core * Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml * Ran the black formatter with the original version * Extra .py removed from all the __init__.py files names * Undo formatting check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * BIG UPDATE * Refeactor code * Refractor * Fixed versioning * Minor refractoring * Minor refractoring * Changed the folder 
structure such that sqlalchemy has not reference here * Fixed README.md and CONTRIBUTING.md * Added manual publish * On push trigger added * Manually setting the publish step * Changed versioning in pyproject.toml * Bumped up the version to 4.0.0.b3 and also changed the structure to have pyarrow as optional * Removed the sqlalchemy tests from integration.yml file * [PECO-1803] Print warning message if pyarrow is not installed (#468) Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1803] Remove sqlalchemy and update README.md (#469) Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Removed all sqlalchemy related stuff * generated the lock file * Fixed failing tests * removed poetry.lock * Updated the lock file * Fixed poetry numpy 2.2.2 issue * Workflow fixes --------- Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Co-authored-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com>

* [ES-402013] Close cursors before closing connection (#38) * Add test: cursors are closed when connection closes Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.0.5 and improve CHANGELOG (#40) Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fix dco issue Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fix dco issue Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * dco tunning Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * dco tunning Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Github workflows: run checks on pull requests from forks (#47) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * OAuth implementation (#15) This PR: * Adds the foundation for OAuth against Databricks account on AWS with BYOIDP. * It copies one internal module that Steve Weis @sweisdb wrote for Databricks CLI (oauth.py). Once ecosystem-dev team (Serge, Pieter) build a python sdk core we will move this code to their repo as a dependency. 
* the PR provides authenticators with visitor pattern format for stamping auth-token which later is intended to be moved to the repo owned by Serge @nfx and and Pieter @pietern Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Automate deploys to Pypi (#48) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-205] Add functional examples (#52) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.1.0 (#54) Bump to v2.1.0 and update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [SC-110400] Enabling compression in Python SQL Connector (#49) Signed-off-by: Mohit Singla <mohit.singla@databricks.com> Co-authored-by: Moe Derakhshani <moe.derakhshani@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add tests for parameter sanitisation / escaping (#46) * Refactor so we can unit test `inject_parameters` * Add unit tests for inject_parameters * Remove inaccurate comment. Per #51, spark sql does not support escaping a single quote with a second single quote. 
* Closes #51 and adds unit tests plus the integration test provided in #56 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Courtney Holcomb (@courtneyholcomb) Co-authored-by: @mcannamela Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump thrift dependency to 0.16.0 (#65) Addresses https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.2.0 (#66) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Support Python 3.11 (#60) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.2.1 (#70) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add none check on _oauth_persistence in DatabricksOAuthProvider (#71) Add none check on _oauth_persistence in DatabricksOAuthProvider to avoid app crash when _oauth_persistence is None. Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Support custom oauth client id and redirect port (#75) * Support custom oauth client id and rediret port range PySQL is used by other tools/CLIs which have own oauth client id, we need to expose oauth_client_id and oauth_redirect_port_range as the connection parameters to support this customization. 
Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Change oauth redirect port range to port Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Fix type check issue Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.2.2 (#76) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Merge staging ingestion into main (#78) Follow up to #67 and #64 * Regenerate TCLIService using latest TCLIService.thrift from DBR (#64) * SI: Implement GET, PUT, and REMOVE (#67) * Re-lock dependencies after merging `main` Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.3.0 and update changelog (#80) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add pkgutil-style for the package (#84) Since the package is under databricks namespace. pip install this package will cause issue importing other packages under the same namespace like automl and feature store. Adding pkgutil style to resolve the issue. Signed-off-by: lu-wang-dl <lu.wang@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add SQLAlchemy Dialect (#57) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 2.4.0(#89) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix syntax in examples in root readme. (#92) Do this because the environment variable pulls did not have closing quotes on their string literals. 
Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Less strict numpy and pyarrow dependencies (#90) Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Thomas Newton <thomas.w.newton@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update example in docstring so query output is valid Spark SQL (#95) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.4.1 (#96) Per the sermver.org spec, updating the projects dependencies is considered a compatible change. https: //semver.org/#what-should-i-do-if-i-update-my-own-dependencies-without-changing-the-public-api Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update CODEOWNERS (#97) Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add Andre to CODEOWNERS (#98) * Add Andre. Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Revert the change temporarily so I can sign off. Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Add Andre and sign off. 
Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Remove redundant line Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> --------- Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add external auth provider + example (#101) Signed-off-by: Andre Furlan <andre.furlan@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Retry on connection timeout (#103) A lot of the time we see the error `[Errno 110] Connection timed out`. This happens a lot in Azure, particularly. In this PR I make it a retryable error as it is safe Signed-off-by: Andre Furlan <andre.furlan@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-244] Make http proxies work (#81) Override thrift's proxy header encoding function. 
Uses the fix identified in https://github.com/apache/thrift/pull/2565 H/T @pspeter Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 2.5.0 (#104) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix changelog release date for version 2.5.0 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Relax sqlalchemy requirement (#113) * Plus update docs about how to change dependency spec Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update to version 2.5.1 (#114) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix SQLAlchemy timestamp converter + docs (#117) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Relax pandas and alembic requirements (#119) Update dependencies for alembic and pandas per customer request Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 2.5.2 (#118) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Use urllib3 for thrift transport + reuse http connections (#131) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Default socket timeout to 15 min (#137) Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.6.0 (#139) Signed-off-by: Jesse Whitehouse 
<jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix: some thrift RPCs failed with BadStatusLine (#141) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.6.1 (#142) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [ES-706907] Retry GetOperationStatus for http errors (#145) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.6.2 (#147) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-626] Support OAuth flow for Databricks Azure (#86) ## Summary Support OAuth flow for Databricks Azure ## Background Some OAuth endpoints (e.g. Open ID Configuration) and scopes are different between Databricks Azure and AWS. Current code only supports OAuth flow on Databricks in AWS ## What changes are proposed in this pull request? 
- Change `OAuthManager` to decouple Databricks AWS specific configuration from OAuth flow - Add `sql/auth/endpoint.py` that implements cloud specific OAuth endpoint configuration - Change `DatabricksOAuthProvider` to work with the OAuth configurations in different Databricks cloud (AWS, Azure) - Add the corresponding unit tests Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Use a separate logger for unsafe thrift responses (#153) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Improve e2e test development ergonomics (#155) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Don't raise exception when closing a stale Thrift session (#159) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 2.7.0 (#161) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Cloud Fetch download handler (#127) * Cloud Fetch download handler Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Issue fix: final result link compressed data has multiple LZ4 end-of-frame markers Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Addressing PR comments - Linting - Type annotations - Use response.ok - Log exception - Remove semaphore and only use threading.event - reset() flags method - Fix tests after removing semaphore - Link expiry logic should be in secs - Decompress data static function - link_expiry_buffer and static public methods - Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Changing logger.debug to remove url Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * _reset() 
comment to docstring Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * link_expiry_buffer -> link_expiry_buffer_secs Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Cloud Fetch download manager (#146) * Cloud Fetch download manager Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Bug fix: submit handler.run Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Type annotations Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Namedtuple -> dataclass Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Shutdown thread pool and clear handlers Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * handler.run is the correct call Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Link expiry buffer in secs Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Adding type annotations for download_handlers and downloadable_result_settings Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Move DownloadableResultSettings to downloader.py to avoid circular import Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Black linting Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Timeout is never None Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Cloud fetch queue and integration (#151) * Cloud fetch queue and integration Signed-off-by: Matthew Kim 
<11141331+mattdeekay@users.noreply.github.com> * Enable cloudfetch with direct results Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Typing and style changes Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Client-settable max_download_threads Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Increase default buffer size bytes to 104857600 Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Move max_download_threads to kwargs of ThriftBackend, fix unit tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Fix tests: staticmethod make_arrow_table mock not callable Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * cancel_futures in shutdown() only available in python >=3.9.0 Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Black linting Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Fix typing errors Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Cloud Fetch e2e tests (#154) * Cloud Fetch e2e tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Test case works for e2-dogfood shared unity catalog Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Moving test to LargeQueriesSuite and setting catalog to hive_metastore Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Align default value of buffer_size_bytes in driver tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Adding comment to specify what's needed to run successfully Signed-off-by: Matthew Kim 
<11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update changelog for cloudfetch (#172) Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Improve sqlalchemy backward compatibility with 1.3.24 (#173) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * OAuth: don't override auth headers with contents of .netrc file (#122) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix proxy connection pool creation (#158) Signed-off-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Relax pandas dependency constraint to allow ^2.0.0 (#164) Signed-off-by: Daniel Segesdi <daniel.segesdi@turbine.ai> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Use hex string version of operation ID instead of bytes (#170) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy: fix has_table so it honours schema= argument (#174) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix socket timeout test (#144) Signed-off-by: Jesse Whitehouse 
<jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Disable non_native_boolean_check_constraint (#120) --------- Signed-off-by: Bogdan Kyryliuk <b.kyryliuk@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Remove unused import for SQLAlchemy 2 compatibility (#128) Signed-off-by: William Gentry <william.barr.gentry@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.8.0 (#178) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix typo in python README quick start example (#186) --------- Co-authored-by: Jesse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Configure autospec for mocked Client objects (#188) Resolves #187 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Use urllib3 for retries (#182) Behaviour is gated behind `enable_v3_retries` config. This will be removed and become the default behaviour in a subsequent release. 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.9.0 (#189) * Add note to changelog about using cloud_fetch Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Explicitly add urllib3 dependency (#191) Signed-off-by: Jacobus Herman <jacobus.herman@otrium.com> Co-authored-by: Jesse <jesse.whitehouse@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to 2.9.1 (#195) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Make backwards compatible with urllib3~=1.0 (#197) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Convenience improvements to v3 retry logic (#199) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.9.2 (#201) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Github Actions Fix: poetry install fails for python 3.7 tests (#208) snok/install-poetry@v1 installs the latest version of Poetry. The latest version of Poetry, released on 20 August 2023 (four days ago as of this commit), drops support for Python 3.7, causing our GitHub Action to fail. 
Until we complete #207, we need to conditionally install the last version of Poetry that supports Python 3.7 (poetry==1.5.1). Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Make backwards compatible with urllib3~=1.0 [Follow up #197] (#206) * Make retry policy backwards compatible with urllib3~=1.0.0 We already implement the equivalent of backoff_max, so the behaviour will be the same for urllib3==1.x and urllib3==2.x. We do not implement backoff jitter, so the behaviour for urllib3==1.x will NOT include backoff jitter whereas urllib3==2.x WILL include jitter. --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump version to 2.9.3 (#209) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add note to sqlalchemy example: IDENTITY isn't supported yet (#212) ES-842237 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1029] Updated thrift compiler version (#216) * Updated thrift definitions Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Tried with a different thrift installation Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Reverted TCLI to previous Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Reverted to older thrift Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Updated version again Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Upgraded thrift Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Final commit Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> 
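The note in #206 above distinguishes a capped backoff (the `backoff_max` equivalent) from backoff jitter. As a rough illustration of what that difference means in practice, here is a minimal sketch of capped exponential backoff with optional jitter. The function name `backoff_seconds`, its parameters, and the doubling formula are assumptions made for this example; they are not taken from the connector's retry policy or from urllib3's source.

```python
import random
from typing import Optional


def backoff_seconds(
    attempt: int,
    backoff_factor: float = 1.0,
    backoff_max: float = 60.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return the delay before retry number `attempt` (1-based).

    Hypothetical helper for illustration only; not part of the
    connector or of urllib3.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Exponential growth: factor, 2*factor, 4*factor, ...
    delay = backoff_factor * (2 ** (attempt - 1))
    # Optional jitter adds a random amount in [0, jitter) to spread retries out.
    if jitter:
        delay += (rng or random).random() * jitter
    # The cap applies after jitter, so the delay never exceeds backoff_max.
    return min(backoff_max, delay)


if __name__ == "__main__":
    # Without jitter the schedule is deterministic and capped at backoff_max.
    print([backoff_seconds(n, backoff_factor=1.0, backoff_max=10.0) for n in range(1, 7)])
    # With jitter each delay gains a random component before the cap is applied.
    print([backoff_seconds(n, backoff_max=10.0, jitter=0.5) for n in range(1, 7)])
```

With `jitter=0.0` every client that fails at the same moment retries on the same schedule; a non-zero jitter staggers those retries, which is the behavioural difference the commit message calls out between the two urllib3 major versions.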
* [PECO-1055] Updated thrift defs to allow Tsparkparameters (#220) Updated thrift defs to most recent versions Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update changelog to indicate that 2.9.1 and 2.9.2 have been yanked. (#222) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix changelog typo: _enable_v3_retries (#225) Closes #219 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Introduce SQLAlchemy reusable dialect tests (#125) Signed-off-by: Jim Fulton <jim.fulton@unsupervised.com> Co-Authored-By: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1026] Add Parameterized Query support to Python (#217) * Initial commit Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added tsparkparam handling Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added basic test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Addressed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Addressed missed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Resolved comments --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Parameterized queries: Add e2e tests for inference (#227) Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1109] Parameterized Query: add support for inferring decimal types (#228) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: reorganise dialect files into a single directory (#231) Signed-off-by: Jesse Whitehouse 
<jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1083] Updated thrift files and added check for protocol version (#229) * Updated thrift files and added check for protocol version Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Made error message more clear Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Changed name of fn Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Ran linter Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Update src/databricks/sql/client.py Co-authored-by: Jesse <jwhitehouse@airpost.net> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> Co-authored-by: Jesse <jwhitehouse@airpost.net> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-840] Port staging ingestion behaviour to new UC Volumes (#235) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Query parameters: implement support for binding NoneType parameters (#233) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Bump dependency version and update e2e tests for existing behaviour (#236) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Revert "[PECO-1083] Updated thrift files and added check for protocol version" (#237) Reverts #229 as it causes all of our e2e tests to fail on some versions of DBR. We'll reimplement the protocol version check in a follow-up. This reverts commit 241e934a96737d506c2a1f77c7012e1ab8de967b. 
Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: add type compilation for all CamelCase types (#238) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: add type compilation for uppercase types (#240) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Stop skipping all type tests (#242) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1134] v3 Retries: allow users to bound the number of redirects to follow (#244) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Parameters: Add type inference for BIGINT and TINYINT types (#246) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Stop skipping some non-type tests (#247) * Stop skipping TableDDLTest and permanent skip HasIndexTest We're now in the territory of features that aren't required for sqla2 compat as of pysql==3.0.0 but we may consider adding this in the future. In this case, table comment reflection needs to be manually implemented. Index reflection would require hooking into the compiler to reflect the partition strategy. test_suite.py::HasIndexTest_databricks+databricks::test_has_index[dialect] SKIPPED (Databricks does not support indexes.) test_suite.py::HasIndexTest_databricks+databricks::test_has_index[inspector] SKIPPED (Databricks does not support indexes.) test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[dialect] SKIPPED (Databricks does not support indexes.) 
test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[inspector] SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_add_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.) test_suite.py::TableDDLTest_databricks+databricks::test_create_index_if_not_exists SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_create_table PASSED test_suite.py::TableDDLTest_databricks+databricks::test_create_table_if_not_exists PASSED test_suite.py::TableDDLTest_databricks+databricks::test_create_table_schema PASSED test_suite.py::TableDDLTest_databricks+databricks::test_drop_index_if_exists SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_drop_table PASSED test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.) test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_if_exists PASSED test_suite.py::TableDDLTest_databricks+databricks::test_underscore_names PASSED Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip QuotedNameArgumentTest with comments The fixes to DESCRIBE TABLE and visit_xxx were necessary to get to the point where I could even determine that these tests wouldn't pass. But those changes are not currently tested in the dialect. If, in the course of reviewing the remaining tests in the compliance suite, I find that these visit_xxxx methods are not tested anywhere else then we should extend test_suite.py with our own tests to confirm the behaviour for ourselves. 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Move files from base.py to _ddl.py The presence of this pytest.ini file is _required_ to establish pytest's root_path https://docs.pytest.org/en/7.1.x/reference/customize.html#finding-the-rootdir Without it, the custom pytest plugin from SQLAlchemy can't read the contents of setup.cfg which makes none of the tests runnable. Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Emit a warning for certain constructs Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping RowFetchTest Date type work fixed this test failure Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Revise infer_types logic to never infer a TINYINT This allows these SQLAlchemy tests to pass: test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_expr_limit_simple_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_expr_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases0] PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases1] PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases2] PASSED This partially reverts the change introduced in #246 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping FetchLimitOffsetTest I implemented our custom DatabricksStatementCompiler so we can override the default rendering of unbounded LIMIT clauses from `LIMIT -1` to `LIMIT ALL` We also explicitly skip the FETCH clause tests since Databricks doesn't support this syntax. Blacked all source code here too. 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping FutureTableDDLTest Add meaningful skip markers for table comment reflection and indexes Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping Identity column tests This closes https://github.com/databricks/databricks-sql-python/issues/175 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping HasTableTest Adding the @reflection.cache decorator to has_table is necessary to pass test_has_table_cache Caching calls to has_table improves the efficiency of the connector Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip LongNameBlowoutTest Databricks constraint names are limited to 255 characters Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping ExceptionTest Black test_suite.py Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip LastrowidTest Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Implement PRIMARY KEY and FOREIGN KEY reflection and enable tests Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Skip all IdentityColumnTest tests Turns out that none of these can pass for the same reason that the first two seemed un-runnable in db6f52bb329f3f43a9215b5cd46b03c3459a302a Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: implement and refactor schema reflection methods (#249) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add GovCloud domain into AWS domains (#252) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Refactor __init__.py into base.py (#250) 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Finish implementing all of ComponentReflectionTest (#251) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Finish marking all tests in the suite (#253) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Finish organising compliance test suite (#256) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Fix failing mypy checks from development (#257) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Enable cloud fetch by default (#258) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1137] Reintroduce protocol checking to Python test fw (#248) * Put in some unit tests, will add e2e Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added e2e test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Linted Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * re-bumped thrift files Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Changed structure to store protocol version as feature of connection Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed parameters test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Update src/databricks/sql/client.py Co-authored-by: Jesse <jwhitehouse@airpost.net> Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed comments Signed-off-by: 
nithinkdb <nithin.krishnamurthi@databricks.com> * Removed extra indent Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> Co-authored-by: Jesse <jwhitehouse@airpost.net> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * sqla2 clean-up: make sqlalchemy optional and don't mangle the user-agent (#264) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Add support for TINYINT (#265) Closes #123 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add OAuth M2M example (#266) * Add OAuth M2M example Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Native Parameters: reintroduce INLINE approach with tests (#267) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Document behaviour of executemany (#213) Signed-off-by: Martin Rueckl <enigma@nbubu.de> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy 2: Expose TIMESTAMP and TIMESTAMP_NTZ types to users (#268) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Drop Python 3.7 as a supported version (#270) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> (cherry picked from commit 8d85fa8b33a70331141c0c6556196f641d1b8ed5) Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * GH Workflows: remove Python 3.7 from the matrix for _all_ workflows (#274) Remove Python 3.7 from the matrix for _all_ workflows This was missed in #270 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan 
<saishree.pradhan@databricks.com> * Add README and updated example for SQLAlchemy usage (#273) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Rewrite native parameter implementation with docs and tests (#281) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Enable v3 retries by default (#282) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * security: bump pyarrow dependency to 14.0.1 (#284) pyarrow is currently compatible with Python 3.8 → Python 3.11 I also removed specifiers for when Python is 3.7 since this no longer applies. Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump package version to 3.0.0 (#285) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix docstring about default parameter approach (#287) Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1286] Add tests for complex types in query results (#293) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * sqlalchemy: fix deprecation warning for dbapi classmethod (#294) Rename `dbapi` classmethod to `import_dbapi` as required by SQLAlchemy 2 Closes #289 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1297] sqlalchemy: fix: can't read columns for tables containing a TIMESTAMP_NTZ column (#296) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepared 3.0.1 release (#297) Signed-off-by: Jesse 
Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Make contents of `__init__.py` equal across projects (#304) --------- Signed-off-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix URI construction in ThriftBackend (#303) Signed-off-by: Jessica <12jessicasmith34@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [sqlalchemy] Add table and column comment support (#329) Signed-off-by: Christophe Bornet <cbornet@hotmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Pin pandas and urllib3 versions to fix runtime issues in dbt-databricks (#330) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * SQLAlchemy: TINYINT types didn't reflect properly (#315) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1435] Restore `tests.py` to the test suite (#331) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 3.0.2 (#335) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update some outdated OAuth comments 
(#339) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Redact the URL query parameters from the urllib3.connectionpool logs (#341) * Redact the URL query parameters from the urllib3.connectionpool logs Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> * Fix code formatting Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> * Add str check for the log record message arg dict values Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> --------- Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 3.0.3 (#344) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1411] Support Databricks OAuth on GCP (#338) * [PECO-1411] Support OAuth InHouse on GCP Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> --------- Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1414] Support Databricks native OAuth in Azure (#351) * [PECO-1414] Support Databricks InHouse OAuth in Azure Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prep for Test Automation (#352) Getting ready for test automation Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update code owners (#345) * update owners Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> 
* update owners Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * update owners Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> --------- Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Reverting retry behavior on 429s/503s to how it worked in 2.9.3 (#349) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to version 3.1.0 (#358) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1440] Expose current query id on cursor object (#364) * [PECO-1440] Expose current query id on cursor object Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Clear `active_op_handle` when closing the cursor Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add a default for retry after (#371) * Add a default for retry after Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Applied black formatter Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix boolean literals (#357) Set supports_native_boolean to True Signed-off-by: Alex Holyoke <alexander.holyoke@growthloop.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Don't retry network requests that fail with code 403 (#373) * Don't retry requests that fail with 403 Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * Fix lint error Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> --------- Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bump to 3.1.1 (#374) * bump to 3.1.1 Signed-off-by: Ben Cassell 
<ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix cookie setting (#379) * fix cookie setting Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Removing cookie code Signed-off-by: Ben Cassell <ben.cassell@databricks.com> --------- Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fixing a couple type problems: how I would address most of #381 (#382) * Create py.typed Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * add -> Connection annotation Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * massage the code to appease the particular version of the project's mypy deps Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * fix circular import problem Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fix the return types of the classes' __enter__ functions (#384) fix the return types of the classes' __enter__ functions so that the type information is preserved in context managers eg with-as blocks Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add Kravets Levko to codeowners (#386) Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare for 3.1.2 (#387) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update the proxy authentication (#354) changed authentication for proxy Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix failing tests (#392) Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Relax `pyarrow` pin (#389) * Relax `pyarrow` pin 
Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Allow `pyarrow` 16 Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Update `poetry.lock` Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> --------- Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix log error in oauth.py (#269) * Duplicate of applicable change from #93 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix after merge Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Enable `delta.feature.allowColumnDefaults` for all tables (#343) * Enable `delta.feature.allowColumnDefaults` for all tables * Code style Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix SQLAlchemy tests (#393) Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Add more debug logging for CloudFetch (#395) Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update Thrift package (#397) Signed-off-by: Milan Lukac <milan@lukac.online> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare release 3.2.0 (#396) * Prepare release 3.2.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update changelog Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan 
<saishree.pradhan@databricks.com> * move py.typed to correct places (#403) * move py.typed to correct places https://peps.python.org/pep-0561/ says 'For namespace packages (see PEP 420), the py.typed file should be in the submodules of the namespace, to avoid conflicts and for clarity.'. Previously, when I added the py.typed file to this project, https://github.com/databricks/databricks-sql-python/pull/382 , I was unaware this was a namespace package (although, curiously, it seems I had done it right initially and then changed to the wrong way). As PEP 561 warns us, this does create conflicts; other libraries in the databricks namespace package (such as, in my case, databricks-vectorsearch) are then treated as though they are typed, which they are not. This commit moves the py.typed file to the correct places, the submodule folders, fixing that problem. Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * change target of mypy to src/databricks instead of src. I think this might fix the CI code-quality checks failure, but unfortunately I can't replicate that failure locally and the error message is unhelpful Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Possible workaround for bad error message 'error: --install-types failed (no mypy cache directory)'; see https://github.com/python/mypy/issues/10768#issuecomment-2178450153 Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * fix invalid yaml syntax Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Best fix (#3) Fixes the problem by cding and supplying a flag to mypy (that mypy needs this flag is seemingly fixed/changed in later versions of mypy; but that's another pr altogether...). Also fixes a type error that was somehow in the arguments of the program (?!) 
(I guess this is because you guys are still using implicit optional) --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * return the old result_links default (#5) Return the old result_links default, make the type optional, & I'm pretty sure the original problem is that add_file_links can't take a None, so these statements should be in the body of the if-statement that ensures it is not None Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Update src/databricks/sql/utils.py "self.download_manager is unconditionally used later, so must be created. Looks this part of code is totally not covered with tests 🤔" Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Upgrade mypy (#406) * Upgrade mypy This commit removes the flag (and cd step) from https://github.com/databricks/databricks-sql-python/commit/f53aa37a34dc37026d430e71b5e0d1b871bc5ac1 which we added to get mypy to treat namespaces correctly. This was apparently a bug in mypy, or behavior they decided to change. To get the new behavior, we must upgrade mypy. (This also allows us to remove a couple `# type: ignore` comment that are no longer needed.) This commit runs changes the version of mypy and runs `poetry lock`. It also conforms the whitespace of files in this project to the expectations of various tools and standard (namely: removing trailing whitespace as expected by git and enforcing the existence of one and only one newline at the end of a file as expected by unix and github.) It also uses https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade codebase due to a change in mypy behavior. 
For a similar reason, it also fixes a new type (or otherwise) errors: * "Return type 'Retry' of 'new' incompatible with return type 'DatabricksRetryPolicy' in supertype 'Retry'" * databricks/sql/auth/retry.py:225: error: object has no attribute update [attr-defined] * /test_param_escaper.py:31: DeprecationWarning: invalid escape sequence \) [as it happens, I think it was also wrong for the string not to be raw, because I'm pretty sure it wants all of its backslashed single-quotes to appear literally with the backslashes, which wasn't happening until now] * ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject [this is like a numpy version thing, which I fixed by being stricter about numpy version] --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Incorporate suggestion. I decided the most expedient way of dealing with this type error was just adding the type ignore comment back in, but with a `[attr-defined]` specifier this time. I mean, otherwise I would have to restructure the code or figure out the proper types for a TypedDict for the dict and I don't think that's worth it at the moment. 
Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Do not retry failing requests with status code 401 (#408) - Raises NonRecoverableNetworkError when request results in 401 status code Signed-off-by: Tor Hødnebø <thodnebo@gmail.com> Signed-off-by: Tor Hødnebø <tor.hodnebo@gjensidige.no> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1715] Remove username/password (BasicAuth) auth option (#409) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1751] Refactor CloudFetch downloader: handle files sequentially (#405) * [PECO-1751] Refactor CloudFetch downloader: handle files sequentially; utilize Futures Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Retry failed CloudFetch downloads Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix CloudFetch retry policy to be compatible with all `urllib3` versions we support (#412) Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Disable SSL verification for CloudFetch links (#414) * Disable SSL verification for CloudFetch links Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Use existing `_tls_no_verify` option in CloudFetch downloader Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare relese 3.3.0 (#415) * Prepare relese 3.3.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Remove @arikfr from CODEOWNERS 
Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix pandas 2.2.2 support (#416) * Support pandas 2.2.2 See release note numpy 2.2.2: https://pandas.pydata.org/docs/dev/whatsnew/v2.2.0.html#to-numpy-for-numpy-nullable-and-arrow-types-converts-to-suitable-numpy-dtype * Allow pandas 2.2.2 in pyproject.toml * Update poetry.lock, poetry lock --no-update * Code style Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1801] Make OAuth as the default authenticator if no authentication setting is provided (#419) * [PECO-1801] Make OAuth as the default authenticator if no authentication setting is provided Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1857] Use SSL options with HTTPS connection pool (#425) * [PECO-1857] Use SSL options with HTTPS connection pool Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Some cleanup Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Resolve circular dependencies Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update existing tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix MyPy issues Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix `_tls_no_verify` handling Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Add tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare release v3.4.0 (#430) Prepare release 3.4.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1926] Create a non pyarrow flow to handle small 
results for the column set (#440) * Implemented the columnar flow for non arrow users * Minor fixes * Introduced the Column Table structure * Added test for the new column table * Minor fix * Removed unnecessory fikes Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-1961] On non-retryable error, ensure PySQL includes useful information in error (#447) * added error info on non-retryable error Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Reformatted all the files using black (#448) Reformatted the files using black Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare release v3.5.0 (#457) Prepare release 3.5.0 Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECO-2051] Add custom auth headers into cloud fetch request (#460) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Prepare release 3.6.0 (#461) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [ PECO - 1768 ] PySQL: adjust HTTP retry logic to align with Go and Nodejs drivers (#467) * Added the exponential backoff code * Added the exponential backoff algorithm and refractored the code * Added jitter and added unit tests * Reformatted * Fixed the test_retry_exponential_backoff integration test Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [ PECO-2065 ] Create the async execution flow for the PySQL Connector (#463) * Built the basic flow for the async pipeline - testing is remaining * Implemented the flow for the get_execution_result, but the problem of invalid operation handle still persists * Missed adding some files in previous commit * Working prototype of execute_async, get_query_state and get_execution_result * Added integration tests for execute_async * add docs for functions * Refractored the async code 
* Fixed java doc * Reformatted Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Fix for check_types github action failing (#472) Fixed the chekc_types failing Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Remove upper caps on dependencies (#452) * Remove upper caps on numpy and pyarrow versions Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Updated the doc to specify native parameters in PUT operation is not supported from >=3.x connector (#477) Added doc update Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Incorrect rows in inline fetch result (#479) * Raised error when incorrect Row offset it returned * Changed error type * grammar fix * Added unit tests and modified the code * Updated error message * Updated the non retying to only inline case * Updated fix * Changed the flow * Minor update * Updated the retryable condition * Minor test fix * Added extra space Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Bumped up to version 3.7.0 (#482) * bumped up version * Updated to version 3.7.0 * Grammar fix * Minor fix Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * PySQL Connector split into connector and sqlalchemy (#444) * Modified the gitignore file to not have .idea file * [PECO-1803] Splitting the PySql connector into the core and the non core part (#417) …

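The split in #444 makes PyArrow an optional dependency: per the PR description, `pip install databricks-sql-connector` installs the lean connector and `pip install databricks-sql-connector[pyarrow]` installs the complete one, with CloudFetch and other Arrow-based functions unavailable when PyArrow is absent. A minimal sketch of how calling code might check which path is available, assuming only the standard library; `pyarrow_available` and `supported_result_modes` are hypothetical helper names for illustration, not part of the connector's API:

```python
import importlib.util


def pyarrow_available() -> bool:
    """Return True if the optional pyarrow package is installed."""
    # find_spec returns None when the top-level module cannot be located.
    return importlib.util.find_spec("pyarrow") is not None


def supported_result_modes() -> list:
    """Hypothetical helper: name the result paths usable in this environment."""
    # Inline results are described as always supported by the lean install.
    modes = ["inline"]
    if pyarrow_available():
        # CloudFetch is described as requiring PyArrow.
        modes.append("cloudfetch")
    return modes


if __name__ == "__main__":
    print("pyarrow installed:", pyarrow_available())
    print("result modes:", supported_result_modes())
```

Checking with `find_spec` avoids importing PyArrow just to learn whether it is present, which keeps startup cheap for the lean install.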
* [PECO-197] Support Python 3.10 (#31) * Test with multiple python versions. * Update pyarrow to version 9.0.0 to address issue in relation to python 3.10 & a specific version of numpy being pulled in by pyarrow. Closes #26 Signed-off-by: David Black <dblack@atlassian.com> * Update changelog and bump to v2.0.4 (#34) * Update changelog and bump to v2.0.4 * Specifically thank @dbaxa for this change. Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * Bump to 2.0.5-dev on main (#35) Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * On Pypi, display the "Project Links" sidebar. (#36) Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * [ES-402013] Close cursors before closing connection (#38) * Add test: cursors are closed when connection closes Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * Bump version to 2.0.5 and improve CHANGELOG (#40) Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * fix dco issue Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> * fix dco issue Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> * dco tunning Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> * dco tunning Signed-off-by: Moe Derakhshani <moe.derakhshani@databricks.com> * Github workflows: run checks on pull requests from forks (#47) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * OAuth implementation (#15) This PR: * Adds the foundation for OAuth against Databricks account on AWS with BYOIDP. * It copies one internal module that Steve Weis @sweisdb wrote for Databricks CLI (oauth.py). Once ecosystem-dev team (Serge, Pieter) build a python sdk core we will move this code to their repo as a dependency. 
* the PR provides authenticators with visitor pattern format for stamping auth-token which later is intended to be moved to the repo owned by Serge @nfx and and Pieter @pietern * Automate deploys to Pypi (#48) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-205] Add functional examples (#52) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.1.0 (#54) Bump to v2.1.0 and update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [SC-110400] Enabling compression in Python SQL Connector (#49) Signed-off-by: Mohit Singla <mohit.singla@databricks.com> Co-authored-by: Moe Derakhshani <moe.derakhshani@databricks.com> * Add tests for parameter sanitisation / escaping (#46) * Refactor so we can unit test `inject_parameters` * Add unit tests for inject_parameters * Remove inaccurate comment. Per #51, spark sql does not support escaping a single quote with a second single quote. * Closes #51 and adds unit tests plus the integration test provided in #56 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Courtney Holcomb (@courtneyholcomb) Co-authored-by: @mcannamela * Bump thrift dependency to 0.16.0 (#65) Addresses https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.2.0 (#66) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Support Python 3.11 (#60) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.2.1 (#70) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add none check on _oauth_persistence in DatabricksOAuthProvider (#71) Add none check on _oauth_persistence in DatabricksOAuthProvider to avoid app crash when _oauth_persistence is None. 
Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Support custom oauth client id and redirect port (#75) * Support custom oauth client id and rediret port range PySQL is used by other tools/CLIs which have own oauth client id, we need to expose oauth_client_id and oauth_redirect_port_range as the connection parameters to support this customization. Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Change oauth redirect port range to port Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Fix type check issue Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Bump version to 2.2.2 (#76) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse <jesse.whitehouse@databricks.com> * Merge staging ingestion into main (#78) Follow up to #67 and #64 * Regenerate TCLIService using latest TCLIService.thrift from DBR (#64) * SI: Implement GET, PUT, and REMOVE (#67) * Re-lock dependencies after merging `main` Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.3.0 and update changelog (#80) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add pkgutil-style for the package (#84) Since the package is under databricks namespace. pip install this package will cause issue importing other packages under the same namespace like automl and feature store. Adding pkgutil style to resolve the issue. Signed-off-by: lu-wang-dl <lu.wang@databricks.com> * Add SQLAlchemy Dialect (#57) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to version 2.4.0(#89) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix syntax in examples in root readme. (#92) Do this because the environment variable pulls did not have closing quotes on their string literals. 
* Less strict numpy and pyarrow dependencies (#90) Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Thomas Newton <thomas.w.newton@gmail.com> * Update example in docstring so query output is valid Spark SQL (#95) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.4.1 (#96) Per the semver.org spec, updating the project's dependencies is considered a compatible change. https://semver.org/#what-should-i-do-if-i-update-my-own-dependencies-without-changing-the-public-api Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Update CODEOWNERS (#97) * Add Andre to CODEOWNERS (#98) * Add Andre. Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Revert the change temporarily so I can sign off. Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Add Andre and sign off. Signed-off-by: Yunbo Deng <yunbo.deng@databricks.com> Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Remove redundant line Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> --------- Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Add external auth provider + example (#101) Signed-off-by: Andre Furlan <andre.furlan@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Retry on connection timeout (#103) A lot of the time we see the error `[Errno 110] Connection timed out`. This happens a lot in Azure, particularly. In this PR I make it a retryable error as it is safe Signed-off-by: Andre Furlan <andre.furlan@databricks.com> * [PECO-244] Make http proxies work (#81) Override thrift's proxy header encoding function. 
Uses the fix identified in https://github.com/apache/thrift/pull/2565 H/T @pspeter Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to version 2.5.0 (#104) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix changelog release date for version 2.5.0 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Relax sqlalchemy requirement (#113) * Plus update docs about how to change dependency spec Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Update to version 2.5.1 (#114) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix SQLAlchemy timestamp converter + docs (#117) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Relax pandas and alembic requirements (#119) Update dependencies for alembic and pandas per customer request Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to version 2.5.2 (#118) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Use urllib3 for thrift transport + reuse http connections (#131) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Default socket timeout to 15 min (#137) Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Bump version to 2.6.0 (#139) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix: some thrift RPCs failed with BadStatusLine (#141) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.6.1 (#142) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [ES-706907] Retry GetOperationStatus for http errors (#145) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.6.2 (#147) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-626] Support OAuth flow for Databricks Azure (#86) ## Summary Support OAuth flow for Databricks Azure ## Background Some OAuth endpoints (e.g. 
Open ID Configuration) and scopes are different between Databricks Azure and AWS. Current code only supports OAuth flow on Databricks in AWS ## What changes are proposed in this pull request? - Change `OAuthManager` to decouple Databricks AWS specific configuration from OAuth flow - Add `sql/auth/endpoint.py` that implements cloud specific OAuth endpoint configuration - Change `DatabricksOAuthProvider` to work with the OAuth configurations in different Databricks cloud (AWS, Azure) - Add the corresponding unit tests * Use a separate logger for unsafe thrift responses (#153) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Improve e2e test development ergonomics (#155) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Don't raise exception when closing a stale Thrift session (#159) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to version 2.7.0 (#161) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Cloud Fetch download handler (#127) * Cloud Fetch download handler Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Issue fix: final result link compressed data has multiple LZ4 end-of-frame markers Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Addressing PR comments - Linting - Type annotations - Use response.ok - Log exception - Remove semaphore and only use threading.event - reset() flags method - Fix tests after removing semaphore - Link expiry logic should be in secs - Decompress data static function - link_expiry_buffer and static public methods - Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Changing logger.debug to remove url Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * _reset() comment to docstring Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * link_expiry_buffer -> link_expiry_buffer_secs 
Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Cloud Fetch download manager (#146) * Cloud Fetch download manager Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Bug fix: submit handler.run Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Type annotations Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Namedtuple -> dataclass Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Shutdown thread pool and clear handlers Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * handler.run is the correct call Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Link expiry buffer in secs Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Adding type annotations for download_handlers and downloadable_result_settings Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Move DownloadableResultSettings to downloader.py to avoid circular import Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Black linting Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Timeout is never None Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Cloud fetch queue and integration (#151) * Cloud fetch queue and integration Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Enable cloudfetch with direct results Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Typing and style changes Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Client-settable 
max_download_threads Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Docstrings and comments Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Increase default buffer size bytes to 104857600 Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Move max_download_threads to kwargs of ThriftBackend, fix unit tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Fix tests: staticmethod make_arrow_table mock not callable Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * cancel_futures in shutdown() only available in python >=3.9.0 Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Black linting Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Fix typing errors Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Cloud Fetch e2e tests (#154) * Cloud Fetch e2e tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Test case works for e2-dogfood shared unity catalog Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Moving test to LargeQueriesSuite and setting catalog to hive_metastore Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Align default value of buffer_size_bytes in driver tests Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Adding comment to specify what's needed to run successfully Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> --------- Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Update changelog for cloudfetch (#172) Signed-off-by: Matthew Kim <11141331+mattdeekay@users.noreply.github.com> * Improve sqlalchemy backward compatibility with 1.3.24 (#173) Signed-off-by: Jesse Whitehouse 
<jesse.whitehouse@databricks.com> * OAuth: don't override auth headers with contents of .netrc file (#122) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix proxy connection pool creation (#158) Signed-off-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Sebastian Eckweiler <sebastian.eckweiler@mercedes-benz.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Relax pandas dependency constraint to allow ^2.0.0 (#164) Signed-off-by: Daniel Segesdi <daniel.segesdi@turbine.ai> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Use hex string version of operation ID instead of bytes (#170) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy: fix has_table so it honours schema= argument (#174) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix socket timeout test (#144) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Disable non_native_boolean_check_constraint (#120) --------- Signed-off-by: Bogdan Kyryliuk <b.kyryliuk@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Remove unused import for SQLAlchemy 2 compatibility (#128) Signed-off-by: William Gentry <william.barr.gentry@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.8.0 (#178) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix typo in python README quick start example (#186) --------- Co-authored-by: Jesse <jesse.whitehouse@databricks.com> * Configure autospec for mocked Client objects (#188) Resolves #187 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Use urllib3 for retries (#182) Behaviour is gated behind `enable_v3_retries` config. This will be removed and become the default behaviour in a subsequent release. Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.9.0 (#189) * Add note to changelog about using cloud_fetch Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Explicitly add urllib3 dependency (#191) Signed-off-by: Jacobus Herman <jacobus.herman@otrium.com> Co-authored-by: Jesse <jesse.whitehouse@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to 2.9.1 (#195) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Make backwards compatible with urllib3~=1.0 (#197) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Convenience improvements to v3 retry logic (#199) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.9.2 (#201) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Github Actions Fix: poetry install fails for python 3.7 tests (#208) snok/install-poetry@v1 installs the latest version of Poetry The latest version of poetry released on 20 August 2023 (four days ago as of this commit) which drops support for Python 3.7, causing our github action to fail. Until we complete #207 we need to conditionally install the last version of poetry that supports Python 3.7 (poetry==1.5.1) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Make backwards compatible with urllib3~=1.0 [Follow up #197] (#206) * Make retry policy backwards compatible with urllib3~=1.0.0 We already implement the equivalent of backoff_max so the behaviour will be the same for urllib3==1.x and urllib3==2.x We do not implement backoff jitter so the behaviour for urllib3==1.x will NOT include backoff jitter whereas urllib3==2.x WILL include jitter. 
--------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump version to 2.9.3 (#209) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add note to sqlalchemy example: IDENTITY isn't supported yet (#212) ES-842237 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1029] Updated thrift compiler version (#216) * Updated thrift definitions Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Tried with a different thrift installation Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Reverted TCLI to previous Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Reverted to older thrift Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Updated version again Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Upgraded thrift Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Final commit Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * [PECO-1055] Updated thrift defs to allow Tsparkparameters (#220) Updated thrift defs to most recent versions * Update changelog to indicate that 2.9.1 and 2.9.2 have been yanked. 
(#222) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix changelog typo: _enable_v3_retries (#225) Closes #219 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Introduce SQLAlchemy reusable dialog tests (#125) Signed-off-by: Jim Fulton <jim.fulton@unsupervised.com> Co-Authored-By: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1026] Add Parameterized Query support to Python (#217) * Initial commit Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added tsparkparam handling Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added basic test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Addressed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Addressed missed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Resolved comments --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Parameterized queries: Add e2e tests for inference (#227) * [PECO-1109] Parameterized Query: add support for inferring decimal types (#228) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: reorganise dialect files into a single directory (#231) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1083] Updated thrift files and added check for protocol version (#229) * Updated thrift files and added check for protocol version Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Made error message more clear Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Changed name of fn Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Ran linter Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Update src/databricks/sql/client.py Co-authored-by: Jesse <jwhitehouse@airpost.net> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> 
Co-authored-by: Jesse <jwhitehouse@airpost.net> * [PECO-840] Port staging ingestion behaviour to new UC Volumes (#235) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Query parameters: implement support for binding NoneType parameters (#233) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Bump dependency version and update e2e tests for existing behaviour (#236) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Revert "[PECO-1083] Updated thrift files and added check for protocol version" (#237) Reverts #229 as it causes all of our e2e tests to fail on some versions of DBR. We'll reimplement the protocol version check in a follow-up. This reverts commit 241e934a96737d506c2a1f77c7012e1ab8de967b. * SQLAlchemy 2: add type compilation for all CamelCase types (#238) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: add type compilation for uppercase types (#240) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Stop skipping all type tests (#242) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1134] v3 Retries: allow users to bound the number of redirects to follow (#244) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Parameters: Add type inference for BIGINT and TINYINT types (#246) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Stop skipping some non-type tests (#247) * Stop skipping TableDDLTest and permanent skip HasIndexTest We're now in the territory of features that aren't required for sqla2 compat as of pysql==3.0.0 but we may consider adding this in the future. In this case, table comment reflection needs to be manually implemented. Index reflection would require hooking into the compiler to reflect the partition strategy. test_suite.py::HasIndexTest_databricks+databricks::test_has_index[dialect] SKIPPED (Databricks does not support indexes.) 
test_suite.py::HasIndexTest_databricks+databricks::test_has_index[inspector] SKIPPED (Databricks does not support indexes.) test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[dialect] SKIPPED (Databricks does not support indexes.) test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[inspector] SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_add_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.) test_suite.py::TableDDLTest_databricks+databricks::test_create_index_if_not_exists SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_create_table PASSED test_suite.py::TableDDLTest_databricks+databricks::test_create_table_if_not_exists PASSED test_suite.py::TableDDLTest_databricks+databricks::test_create_table_schema PASSED test_suite.py::TableDDLTest_databricks+databricks::test_drop_index_if_exists SKIPPED (Databricks does not support indexes.) test_suite.py::TableDDLTest_databricks+databricks::test_drop_table PASSED test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.) test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_if_exists PASSED test_suite.py::TableDDLTest_databricks+databricks::test_underscore_names PASSED Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip QuotedNameArgumentTest with comments The fixes to DESCRIBE TABLE and visit_xxx were necessary to get to the point where I could even determine that these tests wouldn't pass. But those changes are not currently tested in the dialect. If, in the course of reviewing the remaining tests in the compliance suite, I find that these visit_xxxx methods are not tested anywhere else then we should extend test_suite.py with our own tests to confirm the behaviour for ourselves. 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Move files from base.py to _ddl.py The presence of this pytest.ini file is _required_ to establish pytest's root_path https://docs.pytest.org/en/7.1.x/reference/customize.html#finding-the-rootdir Without it, the custom pytest plugin from SQLAlchemy can't read the contents of setup.cfg which makes none of the tests runnable. Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Emit a warning for certain constructs Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping RowFetchTest Date type work fixed this test failure Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Revise infer_types logic to never infer a TINYINT This allows these SQLAlchemy tests to pass: test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_expr_limit_simple_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_expr_offset PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases0] PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases1] PASSED test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases2] PASSED This partially reverts the change introduced in #246 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping FetchLimitOffsetTest I implemented our custom DatabricksStatementCompiler so we can override the default rendering of unbounded LIMIT clauses from `LIMIT -1` to `LIMIT ALL` We also explicitly skip the FETCH clause tests since Databricks doesn't support this syntax. Blacked all source code here too. 
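
The `LIMIT -1` to `LIMIT ALL` change described above can be sketched with plain strings. This is not the dialect's `DatabricksStatementCompiler`, which works on SQLAlchemy expression objects; `render_limit_clause` is a hypothetical helper, and it assumes the unbounded case arises when an offset is given without a limit.

```python
def render_limit_clause(limit=None, offset=None):
    """Render the trailing LIMIT/OFFSET fragment of a SELECT statement.

    Hypothetical string-based sketch, for illustration only.
    """
    parts = []
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")
    elif offset is not None:
        # An OFFSET with no upper bound: emit LIMIT ALL rather than LIMIT -1.
        parts.append("LIMIT ALL")
    if offset is not None:
        parts.append(f"OFFSET {int(offset)}")
    return " ".join(parts)
```

For example, `render_limit_clause(offset=5)` yields `LIMIT ALL OFFSET 5`, while a statement with neither a limit nor an offset gets no clause at all.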
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping FutureTableDDLTest Add meaningful skip markers for table comment reflection and indexes Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping Identity column tests This closes https://github.com/databricks/databricks-sql-python/issues/175 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping HasTableTest Adding the @reflection.cache decorator to has_table is necessary to pass test_has_table_cache Caching calls to has_table improves the efficiency of the connector Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip LongNameBlowoutTest Databricks constraint names are limited to 255 characters Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Stop skipping ExceptionTest Black test_suite.py Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Permanently skip LastrowidTest Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Implement PRIMARY KEY and FOREIGN KEY reflection and enable tests Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Skip all IdentityColumnTest tests Turns out that none of these can pass for the same reason that the first two seemed un-runnable in db6f52bb329f3f43a9215b5cd46b03c3459a302a Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: implement and refactor schema reflection methods (#249) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add GovCloud domain into AWS domains (#252) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * SQLAlchemy 2: Refactor __init__.py into base.py (#250) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Finish implementing all of ComponentReflectionTest (#251) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> 
* SQLAlchemy 2: Finish marking all tests in the suite (#253) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Finish organising compliance test suite (#256) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Fix failing mypy checks from development (#257) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Enable cloud fetch by default (#258) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1137] Reintroduce protocol checking to Python test fw (#248) * Put in some unit tests, will add e2e Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Added e2e test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Linted Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * re-bumped thrift files Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Changed structure to store protocol version as feature of connection Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed parameters test Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Update src/databricks/sql/client.py Co-authored-by: Jesse <jwhitehouse@airpost.net> Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Fixed comments Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> * Removed extra indent Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> --------- Signed-off-by: nithinkdb <nithin.krishnamurthi@databricks.com> Co-authored-by: Jesse <jwhitehouse@airpost.net> * sqla2 clean-up: make sqlalchemy optional and don't mangle the user-agent (#264) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy 2: Add support for TINYINT (#265) Closes #123 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add OAuth M2M example (#266) * Add OAuth M2M example Signed-off-by: Jacky Hu 
<jacky.hu@databricks.com> * Native Parameters: reintroduce INLINE approach with tests (#267) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Document behaviour of executemany (#213) Signed-off-by: Martin Rueckl <enigma@nbubu.de> * SQLAlchemy 2: Expose TIMESTAMP and TIMESTAMP_NTZ types to users (#268) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Drop Python 3.7 as a supported version (#270) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> (cherry picked from commit 8d85fa8b33a70331141c0c6556196f641d1b8ed5) * GH Workflows: remove Python 3.7 from the matrix for _all_ workflows (#274) Remove Python 3.7 from the matrix for _all_ workflows This was missed in #270 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Add README and updated example for SQLAlchemy usage (#273) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Rewrite native parameter implementation with docs and tests (#281) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Enable v3 retries by default (#282) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * security: bump pyarrow dependency to 14.0.1 (#284) pyarrow is currently compatible with Python 3.8 → Python 3.11 I also removed specifiers for when Python is 3.7 since this no longer applies. 
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump package version to 3.0.0 (#285) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix docstring about default parameter approach (#287) * [PECO-1286] Add tests for complex types in query results (#293) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * sqlalchemy: fix deprecation warning for dbapi classmethod (#294) Rename `dbapi` classmethod to `import_dbapi` as required by SQLAlchemy 2 Closes #289 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1297] sqlalchemy: fix: can't read columns for tables containing a TIMESTAMP_NTZ column (#296) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Prepared 3.0.1 release (#297) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Make contents of `__init__.py` equal across projects (#304) --------- Signed-off-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix URI construction in ThriftBackend (#303) Signed-off-by: Jessica <12jessicasmith34@gmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [sqlalchemy] Add table and column comment support (#329) Signed-off-by: Christophe Bornet <cbornet@hotmail.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Pin pandas and urllib3 versions to fix runtime issues in dbt-databricks (#330) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * SQLAlchemy: TINYINT types didn't reflect properly (#315) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * 
[PECO-1435] Restore `tests.py` to the test suite (#331) --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Bump to version 3.0.2 (#335) Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Update some outdated OAuth comments (#339) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Redact the URL query parameters from the urllib3.connectionpool logs (#341) * Redact the URL query parameters from the urllib3.connectionpool logs Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> * Fix code formatting Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> * Add str check for the log record message arg dict values Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> --------- Signed-off-by: Mubashir Kazia <mubashir.kazia@databricks.com> * Bump to version 3.0.3 (#344) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1411] Support Databricks OAuth on GCP (#338) * [PECO-1411] Support OAuth InHouse on GCP Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> --------- Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Co-authored-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * [PECO-1414] Support Databricks native OAuth in Azure (#351) * [PECO-1414] Support Databricks InHouse OAuth in Azure Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Prep for Test Automation (#352) Getting ready for test automation Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Update code owners (#345) * update owners Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * update owners Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * update owners Signed-off-by: 
yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> --------- Signed-off-by: yunbodeng-db <104732431+yunbodeng-db@users.noreply.github.com> * Reverting retry behavior on 429s/503s to how it worked in 2.9.3 (#349) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Bump to version 3.1.0 (#358) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1440] Expose current query id on cursor object (#364) * [PECO-1440] Expose current query id on cursor object Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Clear `active_op_handle` when closing the cursor Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Add a default for retry after (#371) * Add a default for retry after Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Applied black formatter Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Fix boolean literals (#357) Set supports_native_boolean to True Signed-off-by: Alex Holyoke <alexander.holyoke@growthloop.com> * Don't retry network requests that fail with code 403 (#373) * Don't retry requests that fail with 404 Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * Fix lint error Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> --------- Signed-off-by: Jesse Whitehouse <jesse@whitehouse.dev> * Bump to 3.1.1 (#374) * bump to 3.1.1 Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Fix cookie setting (#379) * fix cookie setting Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Removing cookie code Signed-off-by: Ben Cassell <ben.cassell@databricks.com> --------- Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Fixing a couple type problems: how I would address most of #381 (#382) * Create py.typed Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * add -> Connection annotation Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * massage the code to appease the particular version of the project's mypy deps 
Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * fix circular import problem Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * fix the return types of the classes' __enter__ functions (#384) fix the return types of the classes' __enter__ functions so that the type information is preserved in context managers eg with-as blocks Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Add Kravets Levko to codeowners (#386) Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Prepare for 3.1.2 (#387) Signed-off-by: Ben Cassell <ben.cassell@databricks.com> * Update the proxy authentication (#354) changed authentication for proxy * Fix failing tests (#392) Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Relax `pyarrow` pin (#389) * Relax `pyarrow` pin Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Allow `pyarrow` 16 Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Update `poetry.lock` Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> --------- Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Fix log error in oauth.py (#269) * Duplicate of applicable change from #93 Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Update changelog Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> * Fix after merge Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com> Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> * Enable `delta.feature.allowColumnDefaults` for all tables (#343) * Enable `delta.feature.allowColumnDefaults` for all tables * Code style Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> * Fix SQLAlchemy tests (#393) Signed-off-by: Levko Kravets <levko.ne@gmail.com> 
* Add more debug logging for CloudFetch (#395) Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update Thrift package (#397) Signed-off-by: Milan Lukac <milan@lukac.online> * Prepare release 3.2.0 (#396) * Prepare release 3.2.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update changelog Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * move py.typed to correct places (#403) * move py.typed to correct places https://peps.python.org/pep-0561/ says 'For namespace packages (see PEP 420), the py.typed file should be in the submodules of the namespace, to avoid conflicts and for clarity.'. Previously, when I added the py.typed file to this project, https://github.com/databricks/databricks-sql-python/pull/382 , I was unaware this was a namespace package (although, curiously, it seems I had done it right initially and then changed to the wrong way). As PEP 561 warns us, this does create conflicts; other libraries in the databricks namespace package (such as, in my case, databricks-vectorsearch) are then treated as though they are typed, which they are not. This commit moves the py.typed file to the correct places, the submodule folders, fixing that problem. Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * change target of mypy to src/databricks instead of src. 
I think this might fix the CI code-quality checks failure, but unfortunately I can't replicate that failure locally and the error message is unhelpful Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Possible workaround for bad error message 'error: --install-types failed (no mypy cache directory)'; see https://github.com/python/mypy/issues/10768#issuecomment-2178450153 Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * fix invalid yaml syntax Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Best fix (#3) Fixes the problem by cding and supplying a flag to mypy (that mypy needs this flag is seemingly fixed/changed in later versions of mypy; but that's another pr altogether...). Also fixes a type error that was somehow in the arguments of the program (?!) (I guess this is because you guys are still using implicit optional) --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * return the old result_links default (#5) Return the old result_links default, make the type optional, & I'm pretty sure the original problem is that add_file_links can't take a None, so these statements should be in the body of the if-statement that ensures it is not None Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Update src/databricks/sql/utils.py "self.download_manager is unconditionally used later, so must be created. Looks this part of code is totally not covered with tests 🤔" Co-authored-by: Levko Kravets <levko.ne@gmail.com> Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> * Upgrade mypy (#406) * Upgrade mypy This commit removes the flag (and cd step) from https://github.com/databricks/databricks-sql-python/commit/f53aa37a34dc37026d430e71b5e0d1b871bc5ac1 which we added to get mypy to treat namespaces correctly. This was apparently a bug in mypy, or behavior they decided to change. 
To get the new behavior, we must upgrade mypy. (This also allows us to remove a couple of `# type: ignore` comments that are no longer needed.) This commit changes the version of mypy and runs `poetry lock`. It also conforms the whitespace of files in this project to the expectations of various tools and standards (namely: removing trailing whitespace as expected by git and enforcing the existence of one and only one newline at the end of a file as expected by unix and github.) It also uses https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade the codebase due to a change in mypy behavior. For a similar reason, it also fixes a few new type (or otherwise) errors: * "Return type 'Retry' of 'new' incompatible with return type 'DatabricksRetryPolicy' in supertype 'Retry'" * databricks/sql/auth/retry.py:225: error: object has no attribute update [attr-defined] * /test_param_escaper.py:31: DeprecationWarning: invalid escape sequence \) [as it happens, I think it was also wrong for the string not to be raw, because I'm pretty sure it wants all of its backslashed single-quotes to appear literally with the backslashes, which wasn't happening until now] * ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject [this is like a numpy version thing, which I fixed by being stricter about numpy version] --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Incorporate suggestion. I decided the most expedient way of dealing with this type error was just adding the type ignore comment back in, but with a `[attr-defined]` specifier this time. I mean, otherwise I would have to restructure the code or figure out the proper types for a TypedDict for the dict and I don't think that's worth it at the moment. 
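
Two of the fixes listed above, the explicit `Optional` annotation and the raw-string escape, can be shown in a short sketch. The names `describe` and `ESCAPED_PAREN` are hypothetical and do not come from the connector's source.

```python
import re
from typing import Optional


# Implicit Optional, which recent mypy defaults reject:
#     def describe(host: str, port: int = None) -> str: ...
# Explicit form, as produced by tools such as no_implicit_optional:
def describe(host: str, port: Optional[int] = None) -> str:
    """Return "host" or "host:port" depending on whether a port is given."""
    return host if port is None else f"{host}:{port}"


# A raw string keeps the backslash literally and avoids the
# "invalid escape sequence \)" warning that a plain "\)" literal triggers.
ESCAPED_PAREN = re.compile(r"\)")
```

The raw-string form matters beyond silencing the warning: `r"\)"` is two characters long, so the backslash actually reaches the regular expression engine.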
Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> --------- Signed-off-by: wyattscarpenter <wyattscarpenter@gmail.com> * Do not retry failing requests with status code 401 (#408) - Raises NonRecoverableNetworkError when request results in 401 status code Signed-off-by: Tor Hødnebø <thodnebo@gmail.com> Signed-off-by: Tor Hødnebø <tor.hodnebo@gjensidige.no> * [PECO-1715] Remove username/password (BasicAuth) auth option (#409) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1751] Refactor CloudFetch downloader: handle files sequentially (#405) * [PECO-1751] Refactor CloudFetch downloader: handle files sequentially; utilize Futures Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Retry failed CloudFetch downloads Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix CloudFetch retry policy to be compatible with all `urllib3` versions we support (#412) Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Disable SSL verification for CloudFetch links (#414) * Disable SSL verification for CloudFetch links Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Use existing `_tls_no_verify` option in CloudFetch downloader Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Prepare release 3.3.0 (#415) * Prepare release 3.3.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Remove @arikfr from CODEOWNERS Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix pandas 2.2.2 support (#416) * Support pandas 2.2.2 See release note numpy 2.2.2: https://pandas.pydata.org/docs/dev/whatsnew/v2.2.0.html#to-numpy-for-numpy-nullable-and-arrow-types-converts-to-suitable-numpy-dtype * Allow pandas 2.2.2 in pyproject.toml * Update poetry.lock, poetry lock 
--no-update * Code style Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> Co-authored-by: Levko Kravets <levko.ne@gmail.com> * [PECO-1801] Make OAuth as the default authenticator if no authentication setting is provided (#419) * [PECO-1801] Make OAuth as the default authenticator if no authentication setting is provided Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1857] Use SSL options with HTTPS connection pool (#425) * [PECO-1857] Use SSL options with HTTPS connection pool Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Some cleanup Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Resolve circular dependencies Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Update existing tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix MyPy issues Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Fix `_tls_no_verify` handling Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Add tests Signed-off-by: Levko Kravets <levko.ne@gmail.com> --------- Signed-off-by: Levko Kravets <levko.ne@gmail.com> * Prepare release v3.4.0 (#430) Prepare release 3.4.0 Signed-off-by: Levko Kravets <levko.ne@gmail.com> * [PECO-1926] Create a non pyarrow flow to handle small results for the column set (#440) * Implemented the columnar flow for non arrow users * Minor fixes * Introduced the Column Table structure * Added test for the new column table * Minor fix * Removed unnecessary files * [PECO-1961] On non-retryable error, ensure PySQL includes useful information in error (#447) * added error info on non-retryable error * Reformatted all the files using black (#448) Reformatted the files using black * Prepare release v3.5.0 (#457) Prepare release 3.5.0 Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-2051] Add custom auth headers into cloud fetch request (#460) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Prepare release 3.6.0 (#461) Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * 
[PECO-1768] PySQL: adjust HTTP retry logic to align with Go and Nodejs drivers (#467) * Added the exponential backoff code * Added the exponential backoff algorithm and refactored the code * Added jitter and added unit tests * Reformatted * Fixed the test_retry_exponential_backoff integration test * [PECO-2065] Create the async execution flow for the PySQL Connector (#463) * Built the basic flow for the async pipeline - testing is remaining * Implemented the flow for the get_execution_result, but the problem of invalid operation handle still persists * Missed adding some files in previous commit * Working prototype of execute_async, get_query_state and get_execution_result * Added integration tests for execute_async * add docs for functions * Refactored the async code * Fixed java doc * Reformatted * Fix for check_types github action failing (#472) Fixed the check_types failing * Remove upper caps on dependencies (#452) * Remove upper caps on numpy and pyarrow versions * Updated the doc to specify native parameters in PUT operation is not supported from >=3.x connector (#477) Added doc update * Incorrect rows in inline fetch result (#479) * Raised error when incorrect Row offset is returned * Changed error type * grammar fix * Added unit tests and modified the code * Updated error message * Updated the non-retrying to only inline case * Updated fix * Changed the flow * Minor update * Updated the retryable condition * Minor test fix * Added extra space * Bumped up to version 3.7.0 (#482) * bumped up version * Updated to version 3.7.0 * Grammar fix * Minor fix * PySQL Connector split into connector and sqlalchemy (#444) * Modified the gitignore file to not have .idea file * [PECO-1803] Splitting the PySql connector into the core and the non core part (#417) * Implemented ColumnQueue to test the fetchall without pyarrow Removed token removed token * order of fields in row corrected * Changed the folder structure and tested the basic setup to work * Refactored 
the code to make connector to work * Basic Setup of connector, core and sqlalchemy is working * Basic integration of core, connect and sqlalchemy is working * Setup working dynamic change from ColumnQueue to ArrowQueue * Refractored the test code and moved to respective folders * Added the unit test for column_queue Fixed __version__ Fix * venv_main added to git ignore * Added code for merging columnar table * Merging code for columnar * Fixed the retry_close sesssion test issue with logging * Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing * Added pyarrow_test mark on pytest * Fixed databricks.sqlalchemy to databricks_sqlalchemy imports * Added poetry.lock * Added dist folder * Changed the pyproject.toml * Minor Fix * Added the pyarrow skip tag on unit tests and tested their working * Fixed the Decimal and timestamp conversion issue in non arrow pipeline * Removed not required files and reformatted * Fixed test_retry error * Changed the folder structure to src / databricks * Removed the columnar non arrow flow to another PR * Moved the README to the root * removed columnQueue instance * Revmoved databricks_sqlalchemy dependency in core * Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml * Ran the black formatter with the original version * Extra .py removed from all the __init__.py files names * Undo formatting check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * BIG UPDATE * Refeactor code * Refractor * Fixed versioning * Minor refractoring * Minor refractoring * Changed the folder structure such that sqlalchemy has not reference here * Fixed README.md and CONTRIBUTING.md * Added manual publish * On push trigger added * Manually setting the publish step * Changed versioning in pyproject.toml * Bumped up the version to 4.0.0.b3 and also changed the structure to have pyarrow as optional * Removed the sqlalchemy tests from 
integration.yml file * [PECO-1803] Print warning message if pyarrow is not installed (#468) Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * [PECO-1803] Remove sqlalchemy and update README.md (#469) Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <jacky.hu@databricks.com> * Removed all sqlalchemy related stuff * generated the lock file * Fixed failing tests * removed poetry.lock * Updated the lock file * Fixed poetry numpy 2.2.2 issue * Workflow fixes --------- Signed-off-by: Jacky Hu <jacky.hu@databricks.com> Co-authored-by: Jacky Hu <jacky.hu@databricks.com> * Removed CI CD for python3.8 (#490) * Removed python3.8 support * Minor fix * Added CI CD upto python 3.12 (#491) Support for Py till 3.12 * Merging changes from v3.7.1 release (#488) * Increased the number of retry attempts allowed (#486) Updated the number of attempts allowed * bump version to 3.7.1 (#487) bumped up version * Refractore * Minor change * Bumped up to version 4.0.0 (#493) bumped up the version * Updated action's version (#455) Updated actions version. 
Signed-off-by: Arata Hatori <newwingbird@gmail.com> * Support Python 3.13 and update deps (#510) * Remove upper caps on dependencies (#452) * Remove upper caps on numpy and pyarrow versions Signed-off-by: David Black <dblack@atlassian.com> * Added CI CD upto python 3.13 Signed-off-by: David Black <dblack@atlassian.com> * Specify pandas 2.2.3 as the lower bound for python 3.13 Signed-off-by: David Black <dblack@atlassian.com> * Specify pyarrow 18.0.0 as the lower bound for python 3.13 Signed-off-by: David Black <dblack@atlassian.com> * Move `numpy` to dev dependencies Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> * Updated lockfile Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> --------- Signed-off-by: David Black <dblack@atlassian.com> Signed-off-by: Dave Hirschfeld <dave.hirschfeld@gmail.com> Co-authored-by: David Black <dblack@atlassian.com> * Improve debugging + fix PR review template (#514) * Improve debugging + add PR review template * case sensitivity of PR template * Forward porting all changes into 4.x.x. 
uptil v3.7.3 (#529) * Base changes * Black formatter * Cache version fix * Added the changed test_retry.py file * retry_test_mixins changes * Updated the CODEOWNERS (#531) Updated the codeowners * Add version check for urllib3 in backoff calculation (#526) Signed-off-by: Shivam Raj <shivam.raj@databricks.com> * [ES-1372353] make user_agent_header part of public API (#530) * make user_agent_header part of public API * removed user_agent_entry from list of internal params * add backward compatibility * Updates runner used to run DCO check to use databricks-protected-runner (#521) * commit 1 Signed-off-by: Madhav Sainanee <madhav.sainanee@databricks.com> * commit 1 Signed-off-by: Madhav Sainanee <madhav.sainanee@databricks.com> * updates runner for dco check Signed-off-by: Madhav Sainanee <madhav.sainanee@databricks.com> * removes contributing file changes Signed-off-by: Madhav Sainanee <madhav.sainanee@databricks.com> --------- Signed-off-by: Madhav Sainanee <madhav.sainanee@databricks.com> * Support multiple timestamp formats in non arrow flow (#533) * Added check for 2 formats * Wrote unit tests * Added more supporting formats * Added the T format datetime * Added more timestamp formats * Added python-dateutil library * prepare release for v4.0.1 (#534) Signed-off-by: Shivam Raj <shivam.raj@databricks.com> * Relaxed bound for python-dateutil (#538) Changed bound for python-datetutil * Bumped up the version for 4.0.2 (#539) * Added example for async execute query (#537) Added examples and fixed the async execute not working without pyarrow * Added urllib3 version check (#547) * Added version check * Removed packaging * Bump version to 4.0.3 (#549) Updated the version to 4.0.3 * Cleanup fields as they might be deprecated/removed/change in the future (#553) * Clean thrift files Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com> * Refactor decimal conversion in PyArrow tables to use direct casting (#544) This PR replaces the previous implementation of 
convert_decimals_in_arrow_table() with a more efficient approach that uses PyArrow's native casting operation instead of going through pandas conversion and array creation. - Remove conversion to pandas DataFrame via to_pandas() and apply() methods - Remove intermediate steps of creating array from decimal column and setting it back - Replace with direct type casting using PyArrow's cast() method - Build a new table with transformed columns rather than modifying the original table - Create a new schema based on the modified fields The new approach is more performant by avoiding pandas conversion overhead. The table below highlights substantial performance improvements when retrieving all rows from a table containing decimal columns, particularly when compression is disabled. Even greater gains were observed with compression enabled—showing approximately an 84% improvement (6 seconds compared to 39 seconds). Benchmarking was performed against e2-dogfood, with the client located in the us-west-2 region. ![image](https://github.com/user-attachments/assets/5407b651-8ab6-4c13-b525-cf912f503ba0) Signed-off-by: Jayant Singh <jayant.singh@databricks.com> * [PECOBLR-361] convert column table to arrow if arrow present (#551) * Update CODEOWNERS (#562) new codeowners * Enhance Cursor close handling and context manager exception management to prevent server side resource leaks (#554) * Enhance Cursor close handling and context manager exception management * tests * fmt * Fix Cursor.close() to properly handle CursorAlreadyClosedError * Remove specific test message from Cursor.close() error handling * Improve error handling in connection and cursor context managers to ensure proper closure during exceptions, including KeyboardInterrupt. Add tests for nested cursor management and verify operation closure on server-side errors. 
* add * add * PECOBLR-86 improve logging on python driver (#556) * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fixed format Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * used lazy logging Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * changed debug to error logs Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * used lazy logging Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> --------- Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update github actions run conditions (#569) More conditions to run github actions * Added classes required for telemetry (#572) * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fixed format Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * used lazy logging Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * changed debug to error logs Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * added classes required for telemetry Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * removed TelemetryHelper Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * [PECOBLR-361] convert column table to arrow if arrow present (#551) Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update CODEOWNERS (#562) new codeowners Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Enhance Cursor close handling and context manager exception management to prevent server side resource leaks (#554) * Enhance Cursor close handling and context manager exception management * tests * fmt * Fix 
Cursor.close() to properly handle CursorAlreadyClosedError * Remove specific test message from Cursor.close() error handling * Improve error handling in connection and cursor context managers to ensure proper closure during exceptions, including KeyboardInterrupt. Add tests for nested cursor management and verify operation closure on server-side errors. * add * add Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * PECOBLR-86 improve logging on python driver (#556) * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * PECOBLR-86 Improve logging for debug level Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fixed format Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * used lazy logging Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * changed debug to error logs Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * used lazy logging Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> --------- Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Update github actions run conditions (#569) More conditions to run github actions Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * Added classes required for telemetry Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * fixed example Signed-off-by: Sai Shree Pradhan <saishree.pradhan@databricks.com> * chan…
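The commit log above mentions the retry rework in #467 (exponential backoff, then jitter). As a rough illustration of that general technique only, here is a minimal sketch. The function name `backoff_delay`, its parameters, and the default values are all hypothetical and are not taken from the connector's retry policy.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=0.25, rng=random.random):
    """Return the sleep time in seconds before retry number `attempt` (1-based).

    Hypothetical helper for illustration; not the connector's actual API.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Exponential growth: base, 2*base, 4*base, ... limited to `cap`.
    delay = min(cap, base * 2 ** (attempt - 1))
    # Proportional jitter: add up to `jitter` * delay, then re-apply the cap
    # so the jittered value never exceeds the configured maximum.
    return min(cap, delay * (1.0 + jitter * rng()))


if __name__ == "__main__":
    for attempt in range(1, 6):
        print(attempt, round(backoff_delay(attempt), 2))
```

Passing `rng` as a parameter keeps the random part replaceable, so the delay schedule can be checked deterministically in unit tests.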

jprakash-db added 2 commits August 14, 2024 14:56

Modified the gitignore file to not have .idea file

9cb1ea3

jprakash-db requested a review from gopalldb September 24, 2024 05:04

jprakash-db self-assigned this Sep 24, 2024

jprakash-db requested review from rcypher-databricks, yunbodeng-db, andrefurlan-db, jackyhu-db, benc-db and kravets-levko as code owners September 24, 2024 05:04

Changed the folder structure such that sqlalchemy has not reference here

a022590

jprakash-db had a problem deploying to azure-prod September 25, 2024 17:11 — with GitHub Actions Failure

jprakash-db added 2 commits October 8, 2024 12:15

Fixed README.md and CONTRIBUTING.md

af47301

Added manual publish

64b2818

jprakash-db had a problem deploying to azure-prod October 8, 2024 19:02 — with GitHub Actions Failure

On push trigger added

44b52ac

jprakash-db had a problem deploying to azure-prod October 8, 2024 19:28 — with GitHub Actions Failure

Manually setting the publish step

8db3fd0

jprakash-db had a problem deploying to azure-prod October 8, 2024 19:34 — with GitHub Actions Failure

Changed versioning in pyproject.toml

3d1ef79

jprakash-db had a problem deploying to azure-prod October 17, 2024 05:22 — with GitHub Actions Failure

Bumped up the version to 4.0.0.b3 and also changed the structure to have pyarrow as optional

ee7f1e3

jprakash-db had a problem deploying to azure-prod November 6, 2024 08:04 — with GitHub Actions Failure

Removed the sqlalchemy tests from integration.yml file

608d237

jprakash-db temporarily deployed to azure-prod November 11, 2024 17:07 — with GitHub Actions Inactive

[PECO-1803] Print warning message if pyarrow is not installed (#468)

85af9c0

Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <jacky.hu@databricks.com>
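For readers unfamiliar with the pattern, a warning for a missing optional dependency can be sketched as below. This is an assumption-laden illustration, not the code from #468: the helper names `optional_module_available` and `warn_if_missing` and the message text are hypothetical.

```python
import importlib.util
import warnings


def optional_module_available(name):
    """Return True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def warn_if_missing(name, feature):
    """Warn once per call site when an optional module is absent (hypothetical helper)."""
    if optional_module_available(name):
        return True
    warnings.warn(
        f"{name} is not installed; {feature} will be unavailable",
        stacklevel=2,
    )
    return False


if __name__ == "__main__":
    # With the lean install, PyArrow may be absent and only inline results are supported.
    warn_if_missing("pyarrow", "Cloudfetch and other Arrow-based features")
```

Using `importlib.util.find_spec` avoids paying the import cost of a large package just to learn whether it is present.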

jprakash-db temporarily deployed to azure-prod November 13, 2024 04:48 — with GitHub Actions Inactive

[PECO-1803] Remove sqlalchemy and update README.md (#469)

38ffa95

Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <jacky.hu@databricks.com>

jprakash-db temporarily deployed to azure-prod November 13, 2024 05:12 — with GitHub Actions Inactive

Removed all sqlalchemy related stuff

6ce555a

jprakash-db had a problem deploying to azure-prod December 10, 2024 09:02 — with GitHub Actions Failure

jprakash-db changed the title ~~PySQL Connector split into core and non core part~~ PySQL Connector split into connector and sqlalchemy Dec 11, 2024

jprakash-db had a problem deploying to azure-prod December 11, 2024 06:32 — with GitHub Actions Failure

removed poetry.lock

e4205cc

jprakash-db had a problem deploying to azure-prod December 11, 2024 06:37 — with GitHub Actions Failure

Updated the lock file

3853b76

jprakash-db had a problem deploying to azure-prod December 11, 2024 06:39 — with GitHub Actions Failure

Fixed poetry numpy 2.2.2 issue

8f70b5b

jprakash-db temporarily deployed to azure-prod December 11, 2024 06:55 — with GitHub Actions Inactive

jackyhu-db approved these changes Dec 11, 2024

jackyhu-db reviewed Dec 11, 2024

.github/workflows/code-quality-checks.yml (outdated, resolved)

Workflow fixes

3fc4e01

jprakash-db temporarily deployed to azure-prod December 26, 2024 07:08 — with GitHub Actions Inactive

Fixed merge conflicts

a63ece8

jprakash-db temporarily deployed to azure-prod December 26, 2024 07:13 — with GitHub Actions Inactive

jackyhu-db reviewed Dec 26, 2024

pyproject.toml (resolved)

jprakash-db merged commit 01e998c into main Dec 27, 2024
16 of 17 checks passed

dhirschfeld mentioned this pull request Feb 19, 2025

Allow installing with numpy>=2 #509

Closed

Conversation

jprakash-db commented Sep 24, 2024 (edited)

Major Change - v4.x.x

Related Links

Description

The Split

databricks-sql-python

databricks-sqlalchemy

Published Library on PyPi

Development Details

PR Details

Tasks Completed

How to Test

Performance Comparison - Benchmarking

github-actions bot commented Dec 11, 2024

github-actions bot commented Dec 11, 2024

github-actions bot commented Dec 11, 2024

github-actions bot commented Dec 26, 2024