[Provisioner] Open ports on Azure occasionally failed #2627

Closed
cblmemo opened this issue Sep 29, 2023 · 1 comment

@cblmemo
Collaborator

cblmemo commented Sep 29, 2023

As mentioned in #2597, opening ports on Azure occasionally fails. To reproduce, enable the australiaeast region in Azure and run sky launch with the following YAML:

resources:
  # you could change this section to any resources you want, like a CPU VM
  # accelerators: V100:1
  ports: 8080
  cloud: azure
  region: australiaeast

# tabby base dir
workdir: .

setup: |
  # On some cloud providers, docker-compose is not installed by default.
  sudo curl -sS -L https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
  sudo chmod a+x /usr/local/bin/docker-compose

  # On certain cloud providers (e.g., Lambda Cloud), the default user is not added to the docker group, so we need sudo here.
  sudo docker-compose pull > /dev/null 2>&1

  # Add the current user to the docker group. This won't take effect immediately, because the skypilot job is started by a long-running daemon.
  sudo usermod -aG docker $USER


run: |
  docker-compose down

  if nvidia-smi; then
    docker-compose -f docker-compose.yaml -f docker-compose.cuda.yaml up -d
  else
    docker-compose up -d
  fi

  while ! curl -s -X POST http://localhost:8080/v1/health; do
    echo "server not ready, waiting..."
    sleep 5
  done

  echo "tabby server is ready, enjoy!"

Note that the failure is intermittent: I tried 3 times and 1 of them failed.
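Because the failure is intermittent, it can help to probe the port from outside the VM after each launch. The following is a minimal sketch, assuming the failure shows up as port 8080 being unreachable from another machine (the issue does not say how it manifests). The helper names `is_port_open` and `wait_for_port` are hypothetical and not part of SkyPilot.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 5, delay: float = 5.0) -> bool:
    """Probe host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if is_port_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Run from a machine outside the cluster, a call such as `wait_for_port(vm_public_ip, 8080)` (with `vm_public_ip` being the VM's public address) returning False after the server reports ready would suggest the port was not opened for that launch.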

So far, I cannot reproduce this using sky launch --cloud azure --ports 8080.

TODO: Make a minimal YAML that reproduces this bug.

@Michaelvll
Collaborator

This should be fixed by #2649. Closing this issue now.
