Commit

Merge branch 'skypilot-org:master' into master
asaiacai authored Oct 7, 2024
2 parents e9f3112 + d5b6d89 commit 4294493
Showing 6 changed files with 75 additions and 18 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -38,7 +38,7 @@ This repository is a fork of the [original Skypilot](https://github.com/skypilot

----
:fire: *News* :fire:
-- [Sep, 2024] Point, Launch and Serve **Llama 3.2** on on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
+- [Sep, 2024] Point, Launch and Serve **Llama 3.2** on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
- [Sep, 2024] Run and deploy [**Pixtral**](./llm/pixtral), the first open-source multimodal model from Mistral AI.
- [Jul, 2024] [**Finetune**](./llm/llama-3_1-finetuning/) and [**serve**](./llm/llama-3_1/) **Llama 3.1** on your infra
- [Jun, 2024] Reproduce **GPT** with [llm.c](https://github.com/karpathy/llm.c/discussions/481) on any cloud: [**guide**](./llm/gpt-2/)
1 change: 1 addition & 0 deletions docs/source/getting-started/installation.rst
@@ -302,6 +302,7 @@ Fluidstack
~~~~~~~~~~~~~~~~~~

`Fluidstack <https://fluidstack.io/>`__ is a cloud provider offering low-cost GPUs. To configure Fluidstack access, go to the `Home <https://dashboard.fluidstack.io/>`__ page on your Fluidstack console to generate an API key and then add the :code:`API key` to :code:`~/.fluidstack/api_key` :

.. code-block:: shell
mkdir -p ~/.fluidstack
48 changes: 45 additions & 3 deletions docs/source/reference/kubernetes/kubernetes-ports.rst
@@ -1,7 +1,7 @@
.. _kubernetes-ports:

Exposing Services on Kubernetes
--------------------------------
+===============================

.. note::
This is a guide on how to configure an existing Kubernetes cluster (along with the caveats involved) to successfully expose ports and services externally through SkyPilot.
@@ -23,7 +23,7 @@ If your cluster does not support LoadBalancer services, SkyPilot can also use `a
.. _kubernetes-loadbalancer:

LoadBalancer Service
-^^^^^^^^^^^^^^^^^^^^
+--------------------

This mode exposes ports through a Kubernetes `LoadBalancer Service <https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer>`__. This is the default mode used by SkyPilot.

@@ -52,11 +52,53 @@ These load balancers will be automatically terminated when the cluster is delete

To work around this issue, make sure all your ports have services running behind them.

+Internal Load Balancers
+^^^^^^^^^^^^^^^^^^^^^^^
+
+To restrict your services to be accessible only within the cluster, you can set all SkyPilot services to use `internal load balancers <https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer>`_.
+
+Depending on your cloud, set the appropriate annotation in the SkyPilot config file (``~/.sky/config.yaml``):
+
+.. tab-set::
+
+    .. tab-item:: GCP
+        :sync: internal-lb-gke
+
+        .. code-block:: yaml
+
+            # ~/.sky/config.yaml
+            kubernetes:
+              custom_metadata:
+                annotations:
+                  networking.gke.io/load-balancer-type: "Internal"
+
+    .. tab-item:: AWS
+        :sync: internal-lb-aws
+
+        .. code-block:: yaml
+
+            # ~/.sky/config.yaml
+            kubernetes:
+              custom_metadata:
+                annotations:
+                  service.beta.kubernetes.io/aws-load-balancer-internal: "true"
+
+    .. tab-item:: Azure
+        :sync: internal-lb-azure
+
+        .. code-block:: yaml
+
+            # ~/.sky/config.yaml
+            kubernetes:
+              custom_metadata:
+                annotations:
+                  service.beta.kubernetes.io/azure-load-balancer-internal: "true"
+
.. _kubernetes-ingress:

Nginx Ingress
-^^^^^^^^^^^^^
+-------------

This mode exposes ports by creating a Kubernetes `Ingress <https://kubernetes.io/docs/concepts/services-networking/ingress/>`_ backed by an existing `Nginx Ingress Controller <https://kubernetes.github.io/ingress-nginx/>`_.

37 changes: 23 additions & 14 deletions sky/serve/controller.py
@@ -2,13 +2,15 @@
Responsible for autoscaling and replica management.
"""
+import contextlib
import logging
import threading
import time
import traceback
from typing import Any, Dict, List

import fastapi
+from fastapi import responses
import uvicorn

from sky import serve
@@ -49,7 +51,14 @@ def __init__(self, service_name: str, service_spec: serve.SkyServiceSpec,
             autoscalers.Autoscaler.from_spec(service_name, service_spec))
         self._host = host
         self._port = port
-        self._app = fastapi.FastAPI()
+        self._app = fastapi.FastAPI(lifespan=self.lifespan)
+
+    @contextlib.asynccontextmanager
+    async def lifespan(self, _: fastapi.FastAPI):
+        uvicorn_access_logger = logging.getLogger('uvicorn.access')
+        for handler in uvicorn_access_logger.handlers:
+            handler.setFormatter(sky_logging.FORMATTER)
+        yield

     def _run_autoscaler(self):
         logger.info('Starting autoscaler.')
@@ -88,26 +97,30 @@ def _run_autoscaler(self):
     def run(self) -> None:

         @self._app.post('/controller/load_balancer_sync')
-        async def load_balancer_sync(request: fastapi.Request):
+        async def load_balancer_sync(
+                request: fastapi.Request) -> fastapi.Response:
             request_data = await request.json()
             # TODO(MaoZiming): Check aggregator type.
             request_aggregator: Dict[str, Any] = request_data.get(
                 'request_aggregator', {})
             timestamps: List[int] = request_aggregator.get('timestamps', [])
             logger.info(f'Received {len(timestamps)} inflight requests.')
             self._autoscaler.collect_request_information(request_aggregator)
-            return {
+            return responses.JSONResponse(content={
                 'ready_replica_urls':
                     self._replica_manager.get_active_replica_urls()
-            }
+            },
+                                          status_code=200)

         @self._app.post('/controller/update_service')
-        async def update_service(request: fastapi.Request):
+        async def update_service(request: fastapi.Request) -> fastapi.Response:
             request_data = await request.json()
             try:
                 version = request_data.get('version', None)
                 if version is None:
-                    return {'message': 'Error: version is not specified.'}
+                    return responses.JSONResponse(
+                        content={'message': 'Error: version is not specified.'},
+                        status_code=400)
                 update_mode_str = request_data.get(
                     'mode', serve_utils.DEFAULT_UPDATE_MODE.value)
                 update_mode = serve_utils.UpdateMode(update_mode_str)
@@ -136,17 +149,13 @@ async def update_service(request: fastapi.Request):
                 self._autoscaler.update_version(version,
                                                 service,
                                                 update_mode=update_mode)
-                return {'message': 'Success'}
+                return responses.JSONResponse(content={'message': 'Success'},
+                                              status_code=200)
             except Exception as e:  # pylint: disable=broad-except
                 logger.error(f'Error in update_service: '
                              f'{common_utils.format_exception(e)}')
-                return {'message': 'Error'}
-
-        @self._app.on_event('startup')
-        def configure_logger():
-            uvicorn_access_logger = logging.getLogger('uvicorn.access')
-            for handler in uvicorn_access_logger.handlers:
-                handler.setFormatter(sky_logging.FORMATTER)
+                return responses.JSONResponse(content={'message': 'Error'},
+                                              status_code=500)

         threading.Thread(target=self._run_autoscaler).start()
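
The `controller.py` change moves the uvicorn access-log setup from an `on_event('startup')` hook into a lifespan context manager. A minimal, standard-library-only sketch of that pattern is shown here. It rests on assumptions: `FORMATTER` is a stand-in for `sky_logging.FORMATTER` (its real format string is not shown in this commit), `demo` is a hypothetical driver, and the handler is attached by hand to simulate one that uvicorn would normally have configured.

```python
import asyncio
import contextlib
import logging

# Stand-in for sky_logging.FORMATTER; the real format string is an assumption.
FORMATTER = logging.Formatter('%(levelname)s %(name)s: %(message)s')


@contextlib.asynccontextmanager
async def lifespan(_app):
    """Startup step: apply one formatter to every uvicorn access-log handler."""
    access_logger = logging.getLogger('uvicorn.access')
    for handler in access_logger.handlers:
        handler.setFormatter(FORMATTER)
    yield  # The application would serve requests while suspended here.


async def demo() -> logging.Handler:
    # Simulate a handler that uvicorn would have attached before startup.
    handler = logging.StreamHandler()
    logging.getLogger('uvicorn.access').addHandler(handler)
    async with lifespan(None):
        pass
    return handler


handler = asyncio.run(demo())
```

With FastAPI, a context manager of this shape is passed as the `lifespan` argument of `fastapi.FastAPI`, which is what the diff does with `self.lifespan`. Code before `yield` runs at startup; anything placed after it would run at shutdown.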

4 changes: 4 additions & 0 deletions sky/serve/serve_utils.py
@@ -302,6 +302,10 @@ def update_service_encoded(service_name: str, version: int, mode: str) -> str:
         raise ValueError('The service is up-ed in an old version and does not '
                          'support update. Please `sky serve down` '
                          'it first and relaunch the service. ')
+    elif resp.status_code == 400:
+        raise ValueError(f'Client error during service update: {resp.text}')
+    elif resp.status_code == 500:
+        raise RuntimeError(f'Server error during service update: {resp.text}')
     elif resp.status_code != 200:
         raise ValueError(f'Failed to update service: {resp.text}')
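
The branches added to `serve_utils.py` pair with the status codes the controller now returns (400 for a missing version, 500 for an unexpected failure). The mapping can be sketched in isolation as follows; `check_update_response` is a hypothetical helper written for illustration, not a function in SkyPilot, and it omits the earlier branch whose condition is not visible in this hunk.

```python
def check_update_response(status_code: int, text: str) -> None:
    """Hypothetical helper mirroring the branch order added in this hunk."""
    if status_code == 400:
        # The controller rejected the request, e.g. no version was given.
        raise ValueError(f'Client error during service update: {text}')
    elif status_code == 500:
        # The controller hit an exception while applying the update.
        raise RuntimeError(f'Server error during service update: {text}')
    elif status_code != 200:
        # Any other non-success code falls through to the generic error.
        raise ValueError(f'Failed to update service: {text}')
```

Because the specific checks come before the generic `!= 200` branch, a 500 surfaces as `RuntimeError` while every other failure stays a `ValueError`, so callers can tell a server fault from a bad request.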

1 change: 1 addition & 0 deletions sky/utils/subprocess_utils.py
@@ -77,6 +77,7 @@ def handle_returncode(returncode: int,
         command: The command that was run.
         error_msg: The error message to print.
         stderr: The stderr of the command.
+        stream_logs: Whether to stream logs.
     """
     echo = logger.error if stream_logs else logger.debug
     if returncode != 0: