
Spark version failing with Value Error in GCP #378

Closed
dipankarkush-db opened this issue Oct 4, 2023 · 1 comment
Description
The call `self._job_clusters({t.job_cluster for t in tasks})` in `_job_settings` fails while trying to select a specific Spark version.

Error message: `ValueError: Not a valid SemVer: v8.x-snapshot-scala2.12`
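The failure is easy to see from the version string itself: `v8.x-snapshot-scala2.12` has no numeric minor or patch component, so a strict SemVer parser rejects it. A minimal illustration (the regex below is an assumption for demonstration, not the SDK's actual pattern):

```python
import re

# Illustrative strict SemVer pattern (an assumption, not the SDK's exact regex):
# it requires a numeric major.minor.patch, so the staging image name
# "v8.x-snapshot-scala2.12" cannot match and parsing raises ValueError.
STRICT_SEMVER = re.compile(
    r"^v?(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<pre>[0-9A-Za-z.\-]+))?(?:\+(?P<build>[0-9A-Za-z.\-]+))?$"
)

def parse(v: str) -> dict:
    m = STRICT_SEMVER.match(v)
    if m is None:
        raise ValueError(f"Not a valid SemVer: {v}")
    return m.groupdict()

parse("13.3.0")                    # a normal runtime version parses fine
# parse("v8.x-snapshot-scala2.12") would raise ValueError, as in the traceback
```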

Reproduction
The error occurs while installing https://github.com/databrickslabs/ucx using `./install.sh` against a GCP workspace.

Expected behavior
It should install the ucx package components in the GCP workspace.

Debug Logs
The SDK logs helpful debugging information when debug logging is enabled. Set the log level to debug by adding logging.basicConfig(level=logging.DEBUG) to your program, and include the logs here.

Select PRO or SERVERLESS SQL warehouse to run assessment dashboards on
[0] Lineage (50b5cfe92de3c79e, PRO, STOPPED)
[1] Private Preview: Query Federation Pro Warehouse (51e8635b408aff71, PRO, STOPPED)
[2] [Create new PRO SQL warehouse]
[3] [Cypress:create_endpoint_spec] Lakehouse 2iPCWLtCsvMUTHQ5x5xzFj (d44854f0689f6e25, PRO, STOPPED)
[4] dkushari-sql-wh (630f38063e0a6830, PRO, STOPPED)
[5] james-test-warehouse (e6e9543284b5e084, PRO, STOPPED)
[6] lincoln-test (5ff48f9c6a2756f8, PRO, STOPPED)
[7] sql-endpoint-dust-835755732 (076260cf8113b75b, PRO, STOPPED)
Enter a number between 0 and 7: 4
Comma-separated list of workspace group names to migrate. If not specified, we'll use all account-level groups with matching names to workspace-level groups. (default: <ALL>):
Backup prefix (default: db-temp-):
Log level (default: INFO): DEBUG
Number of threads (default: 8):
19:57  INFO [_] Creating configuration file: /Users/dipankar.kushari@databricks.com/.ucx/config.yml
Open config file in the browser and continue installing? (default: yes):
19:58  INFO [_] Uploading wheel to dbfs:/Users/dipankar.kushari@databricks.com/.ucx/wheels/databricks_labs_ucx-0.1.1-py3-none-any.whl
19:58  INFO [_] Uploading wheel to /Workspace/Users/dipankar.kushari@databricks.com/.ucx/wheels/databricks_labs_ucx-0.1.1-py3-none-any.whl
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 547, in <module>
    installer.run()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 77, in run
    self._run_configured()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 81, in _run_configured
    self._create_jobs()
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 249, in _create_jobs
    settings = self._job_settings(step_name, remote_wheel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 397, in _job_settings
    "job_clusters": self._job_clusters({t.job_cluster for t in tasks}),
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dipankar.kushari/Library/CloudStorage/GoogleDrive-dipankar.kushari@databricks.com/My Drive/10x/UC/UCX/ucx/src/databricks/labs/ucx/install.py", line 453, in _job_clusters
    spark_version=self._ws.clusters.select_spark_version(latest=True),
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/private/var/folders/76/zv4jmwyd2djf6c35f2nly9br0000gp/T/tmp.4DZ2CvPM/lib/python3.11/site-packages/databricks/sdk/mixins/compute.py", line 113, in select_spark_version
    versions = sorted(versions, key=SemVer.parse, reverse=True)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/private/var/folders/76/zv4jmwyd2djf6c35f2nly9br0000gp/T/tmp.4DZ2CvPM/lib/python3.11/site-packages/databricks/sdk/mixins/compute.py", line 39, in parse
    raise ValueError(f'Not a valid SemVer: {v}')
ValueError: Not a valid SemVer: v8.x-snapshot-scala2.12


Other Information

  • macOS
  • Ventura 13.5.2



mgyucht commented Oct 4, 2023

Thanks for reporting this. This duplicates #352; we'll follow up on that issue instead.

@mgyucht closed this as not planned (duplicate) Oct 4, 2023
github-merge-queue bot pushed a commit that referenced this issue Oct 6, 2023
## Changes
In staging, some Spark versions do not have a patch version. For these, we fall back to a different pattern with no patch component. Additionally, we check whether the builds are equal to one another, which lets us avoid comparing one < the other when both are None.

Closes #352 and #378.
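The fallback described in the changes above can be sketched as follows. This is a hypothetical illustration of the approach, not the SDK's actual implementation: both regexes, the `parse_lenient` helper, and the tuple sort key are assumptions.

```python
import re

# Try the full major.minor.patch pattern first, then fall back to a pattern
# with no patch component (hypothetical patterns for illustration only).
FULL = re.compile(r"^v?(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")
NO_PATCH = re.compile(r"^v?(?P<major>\d+)\.(?P<minor>[0-9A-Za-z]+)")

def parse_lenient(v: str) -> tuple:
    m = FULL.match(v) or NO_PATCH.match(v)
    if m is None:
        raise ValueError(f"Not a valid SemVer: {v}")
    d = m.groupdict()
    # A non-numeric minor ("x") and a missing patch both map to 0, so two
    # such snapshot builds compare equal instead of raising during sorting.
    major = int(d["major"])
    minor = int(d["minor"]) if d["minor"].isdigit() else 0
    patch = int(d.get("patch") or 0)
    return (major, minor, patch)

versions = ["13.3.0", "v8.x-snapshot-scala2.12", "12.2.1"]
latest = sorted(versions, key=parse_lenient, reverse=True)[0]
# "13.3.0" sorts highest; the snapshot string no longer crashes the sort
```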

## Tests

- [x] Added a unit test with a format that was previously not supported.
- [x] Hand-wrote an example using select_spark_versions(). This failed
on `main` but passed on my branch when targeting a staging workspace.