DAB pipeline for LakeFlow Connect - unknown field types - dbr_version not updating for custom image #2272

Open
sysopmatt opened this issue Jan 31, 2025 · 2 comments
Labels
DABs DABs related issues Response Requested

Comments

@sysopmatt

Describe the issue

We are implementing LakeFlow Connect SQL Replication. When deploying with DABs, two fields in our configuration produce "unknown field" warnings. One (source_type) does not appear to cause a functional problem, but the other (dbr_version) does, because its value is not applied when the pipeline is updated.

Warning: unknown field: source_type
...
Warning: unknown field: dbr_version

The source_type field seems to work on initial deployment: despite the warning, the gateway pipeline and the ingestion pipeline both deploy correctly. I have never tried changing it (it wouldn't make sense to do so), so I'm unsure whether updates are handled.

The dbr_version field does not work when updating a pipeline (I did not specify it on the initial deployment). We need to update our pipeline to use a custom image because of an error it encountered on the default image (engineering provided us with the custom image). Ideally we should not need to destroy the pipeline and re-replicate the entire dataset just to update the image. Using the Python SDK, I am able to do exactly this and update the gateway pipeline's image.

Configuration

resources:
  pipelines:
    pipeline_replication_gateway:
      name: replication_gateway
      channel: PREVIEW
      clusters:
        - label: updates
          spark_conf:
            gateway.logging.level: INFO
          azure_attributes:
            first_on_demand: 1
            availability: SPOT_WITH_FALLBACK_AZURE
            spot_bid_max_price: -1
      gateway_definition:
        connection_name: sql_server
        gateway_storage_catalog: replicated
        gateway_storage_schema: lf_replication
        gateway_storage_name: lf_replication
        source_type: SQLSERVER
      target: lf_replication
      continuous: true
      catalog: replicated
      dbr_version: custom:redacted-version-name.lz4

Steps to reproduce the behavior

  1. Create gateway pipeline with default image
  2. Run databricks bundle deploy ...
  3. Update config to add dbr_version for custom image
  4. Run databricks bundle deploy ... to update pipeline
  5. See warning
  6. Check pipeline and notice the custom dbr_version is not set

Expected Behavior

The gateway pipeline should restart with the custom image specified in dbr_version.

Actual Behavior

The gateway pipeline did not update its image.

OS and CLI version

Deploying with a VM Scale Set via Azure DevOps build pipeline.

Starting: Deploy bundle
==============================================================================
Task         : Command line
Description  : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
Version      : 2.246.1
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
==============================================================================
Generating script.
Script contents:
databricks bundle deploy -t prod --auto-approve

Is this a regression?

Did not test other versions

@sysopmatt sysopmatt added the DABs DABs related issues label Jan 31, 2025
@pietern
Contributor

pietern commented Feb 3, 2025

Thanks for reporting the issue.

Which version of the CLI are you using?

The "unknown field" warning means that the CLI doesn't know about the property. One possibility is that the CLI version you're using was released prior to the field being added. Another possibility is that the product hasn't externalized this property yet.
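
To illustrate the first possibility, here is a minimal Python sketch of how a schema-driven check could produce such a warning. It is not the CLI's actual implementation: the `KNOWN_GATEWAY_FIELDS` set and the `unknown_field_warnings` helper are hypothetical and only stand in for whatever schema a given CLI release ships with.

```python
# Hypothetical stand-in for the fields a given CLI release knows about.
KNOWN_GATEWAY_FIELDS = {
    "connection_name",
    "gateway_storage_catalog",
    "gateway_storage_schema",
    "gateway_storage_name",
}


def unknown_field_warnings(config: dict, known_fields: set) -> list:
    """Return one warning string per key that is not in known_fields."""
    return [
        f"Warning: unknown field: {key}"
        for key in config
        if key not in known_fields
    ]


# The gateway_definition block from the configuration in this issue.
gateway_definition = {
    "connection_name": "sql_server",
    "gateway_storage_catalog": "replicated",
    "gateway_storage_schema": "lf_replication",
    "gateway_storage_name": "lf_replication",
    "source_type": "SQLSERVER",
}

for warning in unknown_field_warnings(gateway_definition, KNOWN_GATEWAY_FIELDS):
    print(warning)  # prints: Warning: unknown field: source_type
```

A check like this only reports the key. Whether the unrecognized value is still forwarded to the backend is a separate question, which may be why one warned field can work while another is ignored.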

@sysopmatt
Author

sysopmatt commented Feb 6, 2025

My mistake. I pasted the wrong section for the version.

Databricks CLI v0.239.0

I have confirmed as recently as yesterday that I'm able to use the Python SDK to update these pipelines to use the custom image. However, whenever a DAB redeploy happens, the pipeline reverts to the old image and ignores the dbr_version that I have in the asset bundle configuration file.
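
For readers who want to reproduce the workaround, a rough sketch of the payload-building step follows. It does not call any Databricks API. It assumes that a pipeline edit replaces the whole spec, so the current spec is copied and `dbr_version` is added on top. The helper name `with_dbr_version`, the placeholder spec, and the image name `custom:example-image` are all hypothetical, and treating `dbr_version` as a top-level pipeline setting is an assumption taken from the bundle configuration in this issue.

```python
import copy


def with_dbr_version(current_spec: dict, dbr_version: str) -> dict:
    """Return a copy of current_spec with dbr_version set; the input is not modified."""
    updated = copy.deepcopy(current_spec)
    updated["dbr_version"] = dbr_version
    return updated


# Placeholder spec; in practice this would be read from the deployed pipeline.
current_spec = {
    "name": "replication_gateway",
    "channel": "PREVIEW",
    "continuous": True,
    "catalog": "replicated",
    "target": "lf_replication",
}

# Hypothetical image name, used only for illustration.
payload = with_dbr_version(current_spec, "custom:example-image")
print(payload["dbr_version"])
```

Copying the spec instead of mutating it keeps the original available for comparison after a later bundle deploy, which helps confirm whether the deploy dropped the field.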

I have also tried deleting the pipeline entirely and running the initial deployment with the custom image defined in dbr_version. The image is not set in that case either.
