Conversation

@khluu
Contributor

@khluu khluu commented Oct 20, 2025

  • Add Azure VM launcher release test
  • Change the Azure cluster region to centralus, since westus2 has availability problems.
  • Add a helper function to the launch cluster script that authenticates with Azure using a service principal

khluu added 21 commits October 20, 2025 19:17
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
@khluu khluu changed the base branch from master to khluu/azure_cli_baseextra October 24, 2025 10:01
@khluu khluu changed the base branch from khluu/azure_cli_baseextra to master October 24, 2025 10:02
@khluu khluu requested a review from aslonnie October 24, 2025 10:03
@khluu khluu marked this pull request as ready for review October 24, 2025 10:03
@khluu khluu requested review from a team as code owners October 24, 2025 10:03
@ray-gardener ray-gardener bot added core Issues that should be addressed in Ray Core release-test release test labels Oct 24, 2025
@jjyao
Collaborator

jjyao commented Oct 24, 2025

Could you paste a link to a successful run?

Collaborator

@aslonnie aslonnie left a comment

Let me know when this is ready for review.

khluu added 3 commits October 24, 2025 20:31
Signed-off-by: kevin <kevin@anyscale.com>
p
Signed-off-by: kevin <kevin@anyscale.com>
@khluu khluu requested a review from aslonnie October 24, 2025 20:32

set -exo pipefail

pip3 install azure-cli-core==2.21.0 azure-core azure-identity azure-mgmt-compute azure-mgmt-network azure-mgmt-resource azure-common msrest msrestazure
Collaborator

can we use depset for this?

I thought many of the deps are already installed in the image?
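
One way to check that before pinning anything would be to ask the image which of these distributions it already has. This is a minimal sketch using only the standard library; `missing_distributions` and its `lookup` parameter are hypothetical names and are not part of this PR.

```python
from importlib import metadata
from typing import Callable, Iterable, List

# Distribution names copied from the pip3 install line under review.
AZURE_DEPS = [
    "azure-cli-core",
    "azure-core",
    "azure-identity",
    "azure-mgmt-compute",
    "azure-mgmt-network",
    "azure-mgmt-resource",
    "azure-common",
    "msrest",
    "msrestazure",
]


def missing_distributions(
    names: Iterable[str],
    lookup: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            lookup(name)  # raises PackageNotFoundError when absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run inside the release image to see what still needs installing.
    print(missing_distributions(AZURE_DEPS))
```

An empty list would suggest the explicit `pip3 install` is redundant; anything left over is a candidate for the depset.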

p
Signed-off-by: kevin <kevin@anyscale.com>
@aslonnie aslonnie added the go add ONLY when ready to merge, run all tests label Oct 27, 2025
@aslonnie aslonnie self-requested a review October 27, 2025 20:17
"--tenant",
tenant_id,
]
)

Bug: Azure Auth Fails Without AWS Credentials

The azure_authenticate() function is called unconditionally for all Azure provider types, but it requires AWS credentials to access AWS Secrets Manager. This will fail when users try to run the script locally without AWS credentials, making the script difficult to use outside of CI. According to the PR discussion, this makes it "difficult to run the script from as someone who doesn't have access to the AWS bucket with the secrets in there." The authentication logic should be moved to a separate script or made optional to allow local development and testing.
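
A minimal sketch of the "made optional" variant follows. It is not the PR's implementation: it swaps the AWS Secrets Manager lookup for environment variables, and the names `maybe_azure_authenticate`, `build_az_login_cmd`, and the `AZURE_*` variables are assumptions chosen for illustration. The only external call is `az login --service-principal`, which the `--tenant` fragment above appears to belong to.

```python
import os
import subprocess
from typing import Callable, List, Mapping, Optional

# Assumed variable names; the PR reads these values from AWS Secrets Manager.
REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def build_az_login_cmd(client_id: str, client_secret: str, tenant_id: str) -> List[str]:
    """Build the argv for a service principal login with the Azure CLI."""
    return [
        "az", "login", "--service-principal",
        "--username", client_id,
        "--password", client_secret,
        "--tenant", tenant_id,
    ]


def maybe_azure_authenticate(
    env: Optional[Mapping[str, str]] = None,
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> bool:
    """Log in only when all credentials are present; otherwise skip.

    Returns True if a login was attempted, False if it was skipped so that
    a developer's existing `az login` session can be used instead.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        print(f"Skipping az login; unset: {', '.join(missing)}")
        return False
    run(build_az_login_cmd(*(env[name] for name in REQUIRED_VARS)))
    return True
```

With this shape, CI would export the three variables after fetching the secret, while a local run without them falls through to whatever Azure session the user already has.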

cluster_compute: azure/tests/azure_compute.yaml
run:
timeout: 2400
script: bash release/azure_docker_login.sh && python -I launch_and_verify_cluster.py azure/tests/azure-cluster.yaml --num-expected-nodes 3 --retries 10

Bug: Incorrect Script Path in Azure Cluster Test

The azure_cluster_launcher test's script path release/azure_docker_login.sh is relative to the working_dir (../python/ray/autoscaler/). This path is incorrect, and the script won't be found.
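
The path arithmetic behind this claim can be sketched as follows. It assumes the test's `working_dir` is resolved relative to a `release/` directory at the repository root, which the comment implies but this page does not confirm; `resolve_from_working_dir` is a hypothetical helper, not part of the release tooling.

```python
import posixpath


def resolve_from_working_dir(config_dir: str, working_dir: str, script_path: str) -> str:
    """Resolve a script path the way a shell started in working_dir would."""
    return posixpath.normpath(posixpath.join(config_dir, working_dir, script_path))


# Where `bash release/azure_docker_login.sh` would look under that assumption.
resolved = resolve_from_working_dir(
    "release", "../python/ray/autoscaler/", "release/azure_docker_login.sh"
)
print(resolved)
```

Under that assumption the command looks inside `python/ray/autoscaler/release/` rather than the top-level `release/` directory. The PR was merged with its checks passing, so the harness may resolve paths differently from this sketch.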

@aslonnie aslonnie merged commit 3b84611 into master Oct 27, 2025
6 checks passed
@aslonnie aslonnie deleted the khluu/azure_release_vm branch October 27, 2025 23:03
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
- Add Azure VM launcher release test
- Change region for the Azure cluster to be in `centralus` since
`westus2` has trouble with availability.
- Add helper function to authenticate with Azure using service principal
in launch cluster script

---------

Signed-off-by: kevin <kevin@anyscale.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
- Add Azure VM launcher release test
- Change region for the Azure cluster to be in `centralus` since
`westus2` has trouble with availability.
- Add helper function to authenticate with Azure using service principal
in launch cluster script

---------

Signed-off-by: kevin <kevin@anyscale.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>