Support dev and staging workspaces #514
```diff
     super().__init__()
     self._content = iter(content)

-    def iter_content(self, chunk_size: int = 1) -> Iterator[bytes]:
+    def iter_content(self, chunk_size: int = 1, decode_unicode=False) -> Iterator[bytes]:
```
This fixes some yellow squigglies in IntelliJ.
```diff
 from .retries import retried
 from .version import __version__

 __all__ = ['Config', 'DatabricksError']
```
I'm relying on `credentials_provider` and a couple of other classes from `core`; please don't break them. Please verify that https://github.com/databrickslabs/ucx still works before this PR is merged. If you make a PR to accommodate these changes, that's also fine.
Just added imports in `core.py` to make sure all `credentials_provider.py` and `config.py` classes are re-exported. Manually tested: I was able to import those definitions after installing.
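The re-export pattern can be simulated in isolation. This sketch uses throwaway module objects rather than the SDK's real files, but shows why old `from databricks.sdk.core import Config`-style imports keep resolving after the refactor:

```python
import sys
import types

# Simulate the refactor: 'config' is the new module that now owns the class...
config = types.ModuleType("config")


class Config:
    """Stand-in for the class that moved out of core.py."""


config.Config = Config
sys.modules["config"] = config

# ...and 'core' re-exports the same object so the old import path survives.
core = types.ModuleType("core")
core.Config = sys.modules["config"].Config
core.__all__ = ["Config"]
sys.modules["core"] = core

from core import Config as LegacyConfig  # old import path still works

assert LegacyConfig is Config  # same object, no copy
```

In the real SDK this is just a handful of `from .config import ...` and `from .credentials_provider import ...` lines at the top of `core.py`.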
```diff
 ALL_ENVS = [
     DatabricksEnvironment(Cloud.AWS, ".dev.databricks.com"),
     DatabricksEnvironment(Cloud.AWS, ".staging.cloud.databricks.com"),
     DatabricksEnvironment(Cloud.AWS, ".cloud.databricks.us"), DEFAULT_ENVIRONMENT,
```
Nit: put `DEFAULT_ENVIRONMENT` on its own line (maybe as the first entry).
This was done by our formatter, so I think I will leave this as is.
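The shape of `ALL_ENVS` suggests an ordered longest-match lookup: environments with more specific DNS zones are listed before the default so that a staging hostname is not swallowed by the generic `.cloud.databricks.com` zone. A self-contained sketch of that idea (the `guess_environment` helper and `Cloud` enum here are illustrative, not the SDK's exact API):

```python
from dataclasses import dataclass
from enum import Enum


class Cloud(Enum):
    AWS = "AWS"
    AZURE = "AZURE"
    GCP = "GCP"


@dataclass(frozen=True)
class DatabricksEnvironment:
    cloud: Cloud
    dns_zone: str


DEFAULT_ENVIRONMENT = DatabricksEnvironment(Cloud.AWS, ".cloud.databricks.com")

# Ordering matters: specific zones must precede the default, otherwise
# ".cloud.databricks.com" would also match staging hosts.
ALL_ENVS = [
    DatabricksEnvironment(Cloud.AWS, ".dev.databricks.com"),
    DatabricksEnvironment(Cloud.AWS, ".staging.cloud.databricks.com"),
    DatabricksEnvironment(Cloud.AWS, ".cloud.databricks.us"),
    DEFAULT_ENVIRONMENT,
]


def guess_environment(host: str) -> DatabricksEnvironment:
    """Pick the first environment whose DNS zone suffix matches the host."""
    for env in ALL_ENVS:
        if host.endswith(env.dns_zone):
            return env
    return DEFAULT_ENVIRONMENT


env = guess_environment("adb-123.staging.cloud.databricks.com")
# env.dns_zone == ".staging.cloud.databricks.com"
```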
```diff
@@ -15,19 +15,15 @@ class AzureEnvironment:
 ARM_DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

 ENVIRONMENTS = dict(
-    PUBLIC=AzureEnvironment(name="AzurePublicCloud",
+    PUBLIC=AzureEnvironment(name="PUBLIC",
```
From what I see, the name is not really used. Correct?
In general, users don't need to specify this if they specify the host name, since we can derive the Azure environment from the hostname. However, if users only specify the workspace resource ID and are deploying in either China or Govcloud, they need to specify either the `ARM_ENVIRONMENT` environment variable or the `azure_environment` parameter in WorkspaceClient/AccountClient/Config.

This PR does change this (before you would specify `AzurePublicCloud`, and now you would specify `PUBLIC` or `public`), aligning with the Azure TF provider and with the Go SDK.
I added a test to cover this case.
Follow-up: I added back compatibility for the old style of environment names just in case, so customers specifying these parameters now will not need to change.
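That back-compat layer amounts to a normalization step before the environment lookup. A minimal sketch, assuming a simple legacy-name mapping (the helper name and exact mapping are illustrative, not the SDK's actual code):

```python
# Map legacy Azure environment names onto the new short names, so both
# "AzurePublicCloud" (old style) and "public"/"PUBLIC" (new style) resolve
# to the same key in ENVIRONMENTS.
_LEGACY_AZURE_NAMES = {
    "AzurePublicCloud": "PUBLIC",
    "AzureChinaCloud": "CHINA",
    "AzureUSGovernmentCloud": "USGOVERNMENT",
}


def normalize_azure_environment(name: str) -> str:
    """Translate a legacy name if given one, then canonicalize the case."""
    name = _LEGACY_AZURE_NAMES.get(name, name)
    return name.upper()


assert normalize_azure_environment("AzurePublicCloud") == "PUBLIC"
assert normalize_azure_environment("public") == "PUBLIC"
assert normalize_azure_environment("USGOVERNMENT") == "USGOVERNMENT"
```

Customers already passing the long-form names keep working without any config change.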
```diff
     resource_manager_endpoint="https://management.microsoftazure.de/",
     active_directory_endpoint="https://login.microsoftonline.de/"),
-    USGOVERNMENT=AzureEnvironment(name="AzureUSGovernmentCloud",
+    USGOVERNMENT=AzureEnvironment(name="USGOVERNMENT",
```
Why do we remove the German cloud?
This was added unnecessarily. Databricks is not run in the German cloud, and we don't include it in the Go SDK either. It is purely in maintenance mode since 2018: https://learn.microsoft.com/en-us/azure/germany/.
Codecov Report

```
@@            Coverage Diff             @@
##             main     #514      +/-   ##
==========================================
+ Coverage   57.69%   57.77%   +0.07%
==========================================
  Files          40       43       +3
  Lines       27148    27222      +74
==========================================
+ Hits        15664    15728      +64
- Misses      11484    11494      +10
```
Cloned `ucx`, ran `make test`, installed this version of the SDK in the test virtualenv, and reran unit tests, which passed.
Bugfixes:

* Fix Databricks OAuth M2M on Azure ([#513](#513)).

Other noteworthy changes:

* Use `[]` instead of `None` as default list value for deserialising responses ([#361](#361)).
* Support dev and staging workspaces ([#514](#514)).

API Changes:

* Added `exists()` method for [w.tables](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/tables.html) workspace-level service.
* Added [w.lakehouse_monitors](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/lakehouse_monitors.html) workspace-level service.
* Added `databricks.sdk.service.catalog.CreateMonitor` dataclass.
* Added `databricks.sdk.service.catalog.DeleteLakehouseMonitorRequest` dataclass.
* Added `databricks.sdk.service.catalog.ExistsRequest` dataclass.
* Added `databricks.sdk.service.catalog.GetLakehouseMonitorRequest` dataclass.
* Added `databricks.sdk.service.catalog.MonitorCronSchedule` dataclass.
* Added `databricks.sdk.service.catalog.MonitorCronSchedulePauseStatus` dataclass.
* Added `databricks.sdk.service.catalog.MonitorCustomMetric` dataclass.
* Added `databricks.sdk.service.catalog.MonitorCustomMetricType` dataclass.
* Added `databricks.sdk.service.catalog.MonitorDataClassificationConfig` dataclass.
* Added `databricks.sdk.service.catalog.MonitorDestinations` dataclass.
* Added `databricks.sdk.service.catalog.MonitorInferenceLogProfileType` dataclass.
* Added `databricks.sdk.service.catalog.MonitorInferenceLogProfileTypeProblemType` dataclass.
* Added `databricks.sdk.service.catalog.MonitorInfo` dataclass.
* Added `databricks.sdk.service.catalog.MonitorInfoStatus` dataclass.
* Added `databricks.sdk.service.catalog.MonitorNotificationsConfig` dataclass.
* Added `databricks.sdk.service.catalog.MonitorTimeSeriesProfileType` dataclass.
* Added `databricks.sdk.service.catalog.TableExistsResponse` dataclass.
* Added `databricks.sdk.service.catalog.UpdateMonitor` dataclass.
* Changed `create_obo_token()` method for [w.token_management](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/token_management.html) workspace-level service with new required argument order.
* Changed `get()` method for [w.token_management](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/token_management.html) workspace-level service to return `databricks.sdk.service.settings.GetTokenResponse` dataclass.
* Changed `lifetime_seconds` field for `databricks.sdk.service.settings.CreateOboTokenRequest` to no longer be required.
* Added `databricks.sdk.service.settings.GetTokenResponse` dataclass.

OpenAPI SHA: e05401ed5dd4974c5333d737ec308a7d451f749f, Date: 2024-01-23
Changes
Copies the DatabricksEnvironment abstraction from the Go SDK to the Python SDK. This enables using the SDK for Azure development and staging workspaces, which use an alternate Azure login application ID.
Additionally, this PR changes `azure_environment` to accept the same values as the Go SDK and Terraform providers for both Databricks and Azure, i.e. `PUBLIC`, `USGOVERNMENT`, and `CHINA`.

The first commit of this PR is a refactor moving config & credentials provider code to separate files, as core.py is getting quite big and difficult to maintain. The second one is much smaller and has just the net new bits.
Tests