
Spark version failing with Value Error in GCP #378

Closed
dipankarkush-db opened this issue Oct 4, 2023 · 1 comment
Description
The call `self._job_clusters({t.job_cluster for t in tasks})` in `_job_settings` fails while trying to select a specific Spark version.

Error message: `ValueError: Not a valid SemVer: v8.x-snapshot-scala2.12`
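The failure is easy to see from the version string itself: `v8.x-snapshot-scala2.12` has no numeric minor or patch component, so a strict SemVer parser rejects it. A minimal illustration (the regex below is an assumption for demonstration, not the SDK's actual pattern):

```python
import re

# Illustrative strict SemVer pattern (an assumption, not the SDK's exact regex):
# it requires a numeric major.minor.patch, so the staging image name
# "v8.x-snapshot-scala2.12" cannot match and parsing raises ValueError.
STRICT_SEMVER = re.compile(
    r"^v?(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<pre>[0-9A-Za-z.\-]+))?(?:\+(?P<build>[0-9A-Za-z.\-]+))?$"
)

def parse(v: str) -> dict:
    m = STRICT_SEMVER.match(v)
    if m is None:
        raise ValueError(f"Not a valid SemVer: {v}")
    return m.groupdict()

parse("13.3.0")                    # a normal runtime version parses fine
# parse("v8.x-snapshot-scala2.12") would raise ValueError, as in the traceback
```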

Reproduction
The error occurs while installing https://github.com/databrickslabs/ucx using `./install.sh` against a GCP workspace.

Expected behavior
It should install the ucx package components in the GCP workspace.

Debug Logs
The SDK logs helpful debugging information when debug logging is enabled. Set the log level to debug by adding logging.basicConfig(level=logging.DEBUG) to your program, and include the logs here.

Select PRO or SERVERLESS SQL warehouse to run assessment dashboards on
[0] Lineage (50b5cfe92de3c79e, PRO, STOPPED)
[1] Private Preview: Query Federation Pro Warehouse (51e8635b408aff71, PRO, STOPPED)
[2] [Create new PRO SQL warehouse]
[3] [Cypress:create_endpoint_spec] Lakehouse 2iPCWLtCsvMUTHQ5x5xzFj (d44854f0689f6e25, PRO, STOPPED)
[4] dkushari-sql-wh (630f38063e0a6830, PRO, STOPPED)
[5] james-test-warehouse (e6e9543284b5e084, PRO, STOPPED)
[6] lincoln-test (5ff48f9c6a2756f8, PRO, STOPPED)
[7] sql-endpoint-dust-835755732 (076260cf8113b75b, PRO, STOPPED)
Enter a number between 0 and 7: 4
Comma-separated list of workspace group names to migrate. If not specified, we'll use all account-level groups with matching names to workspace-level groups. (default: <ALL>):
Backup prefix (default: db-temp-):
Log level (default: INFO): DEBUG
Number of threads (default: 8):
19:57  INFO [_] Creating configuration file: /Users/dipankar.kushari@databricks.com/.ucx/config.yml
Open config file in the browser and continue installing? (default: yes):
19:58  INFO [_] Uploading wheel to dbfs:/Users/dipankar.kushari@databricks.com/.ucx/wheels/databricks_labs_ucx-0.1.1-py3-none-any.whl
19:58  INFO [_] Uploading wheel to /Workspace/Users/dipankar.kushari@databricks.com/.ucx/wheels/databricks_labs_ucx-0.1.1-py3-none-any.whl
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 547, in <module>
    installer.run()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 77, in run
    self._run_configured()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 81, in _run_configured
    self._create_jobs()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 249, in _create_jobs
    settings = self._job_settings(step_name, remote_wheel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 397, in _job_settings
    "job_clusters": self._job_clusters({t.job_cluster for t in tasks}),
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 453, in _job_clusters
    spark_version=self._ws.clusters.select_spark_version(latest=True),
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/private/var/folders/76/zv4jmwyd2djf6c35f2nly9br0000gp/T/tmp.4DZ2CvPM/lib/python3.11/site-packages/databricks/sdk/mixins/compute.py", line 113, in select_spark_version
    versions = sorted(versions, key=SemVer.parse, reverse=True)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/private/var/folders/76/zv4jmwyd2djf6c35f2nly9br0000gp/T/tmp.4DZ2CvPM/lib/python3.11/site-packages/databricks/sdk/mixins/compute.py", line 39, in parse
    raise ValueError(f'Not a valid SemVer: {v}')
ValueError: Not a valid SemVer: v8.x-snapshot-scala2.12


Other Information

  • macOS
  • Ventura 13.5.2



mgyucht commented Oct 4, 2023

Thanks for reporting this. This duplicates #352; we'll follow up on that issue instead.

@mgyucht closed this as not planned (duplicate) Oct 4, 2023
github-merge-queue bot pushed a commit that referenced this issue Oct 6, 2023
## Changes
In staging, some Spark versions do not have a patch version. For these, we fall back to a different pattern with no patch component. Additionally, we check whether the builds are equal to one another, which lets us avoid comparing one < the other when both are None.

Closes #352 and #378.
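The fallback described in the changes above can be sketched as follows. This is a hypothetical illustration of the approach, not the SDK's actual implementation: both regexes, the `parse_lenient` helper, and the tuple sort key are assumptions.

```python
import re

# Try the full major.minor.patch pattern first, then fall back to a pattern
# with no patch component (hypothetical patterns for illustration only).
FULL = re.compile(r"^v?(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")
NO_PATCH = re.compile(r"^v?(?P<major>\d+)\.(?P<minor>[0-9A-Za-z]+)")

def parse_lenient(v: str) -> tuple:
    m = FULL.match(v) or NO_PATCH.match(v)
    if m is None:
        raise ValueError(f"Not a valid SemVer: {v}")
    d = m.groupdict()
    # A non-numeric minor ("x") and a missing patch both map to 0, so two
    # such snapshot builds compare equal instead of raising during sorting.
    major = int(d["major"])
    minor = int(d["minor"]) if d["minor"].isdigit() else 0
    patch = int(d.get("patch") or 0)
    return (major, minor, patch)

versions = ["13.3.0", "v8.x-snapshot-scala2.12", "12.2.1"]
latest = sorted(versions, key=parse_lenient, reverse=True)[0]
# "13.3.0" sorts highest; the snapshot string no longer crashes the sort
```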

## Tests

- [x] Added a unit test with a format that was previously not supported.
- [x] Hand-wrote an example using select_spark_versions(). This failed
on `main` but passed on my branch when targeting a staging workspace.