[Core] Admin policy enforcement plugin (#3966)
* support policy hook
* test task labels
* Add test for policy that sets labels
* Fix comment
* format
* use -e to make test related files visible
* Add config.rst
* Fix test
* fix config rst
* Apply policy to service
* add policy for serving
* Add docs
* fix
* format
* Update interface
* fix
* Fix
* fix
* Fix test config
* Fix mutated config
* fix
* Add policy doc
* rename
* minor
* Add additional arguments for autostop
* fix mypy
* format
* rejected message
* format
* Update sky/utils/policy_utils.py
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update sky/utils/policy_utils.py
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Fix
* Update examples/admin_policy/example_policy/example_policy/__init__.py
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update docs/source/reference/config.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Address comments
* format
* changes in examples
* Fix enforce autostop
* Fix autostop enforcement
* fix test
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update sky/admin_policy.py
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update sky/admin_policy.py
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* wip
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* fix
* fix
* fix
* Use sky.status for autostop
* update policy
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* fix policy.rst
* Add comment
* Fix logging
* fix CI
* Update docs/source/cloud-setup/policy.rst
  Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
* Use sphnix inline code
* Add comment
* fix skypilot config file mounts for jobs and serve

---------
Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
1 parent 31c0a5c, commit 800f7d6
Showing 34 changed files with 1,024 additions and 139 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,195 @@ | ||
.. _advanced-policy-config: | ||
|
||
Admin Policy Enforcement | ||
======================== | ||
|
||
|
||
SkyPilot provides an **admin policy** mechanism that admins can use to enforce certain policies on users' SkyPilot usage. An admin policy applies | ||
custom validation and mutation logic to a user's tasks and SkyPilot config. | ||
|
||
Example usage: | ||
|
||
- :ref:`kubernetes-labels-policy` | ||
- :ref:`disable-public-ip-policy` | ||
- :ref:`use-spot-for-gpu-policy` | ||
- :ref:`enforce-autostop-policy` | ||
|
||
|
||
To implement and use an admin policy: | ||
|
||
- Admins write a simple Python package with a policy class that implements SkyPilot's ``sky.AdminPolicy`` interface; | ||
- Admins distribute this package to users; | ||
- Users set the ``admin_policy`` field in the SkyPilot config file ``~/.sky/config.yaml`` for the policy to go into effect. | ||
|
||
|
||
Overview | ||
-------- | ||
|
||
|
||
|
||
User-Side | ||
~~~~~~~~~~ | ||
|
||
To apply the policy, a user sets the ``admin_policy`` field in the SkyPilot config | ||
``~/.sky/config.yaml`` to the import path of the policy class within the Python package. | ||
For example: | ||
|
||
.. code-block:: yaml | ||
|
||
    admin_policy: mypackage.subpackage.MyPolicy | ||
|
||
.. hint:: | ||
|
||
    SkyPilot loads the policy from the given package in the same Python environment. | ||
    You can test the existence of the policy by running: | ||
|
||
    .. code-block:: bash | ||
|
||
        python -c "from mypackage.subpackage import MyPolicy" | ||
|
||
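As a rough illustration of what the dotted path in ``admin_policy`` means, the sketch below resolves a string such as ``mypackage.subpackage.MyPolicy`` into a class object with the standard ``importlib`` module. The helper name ``load_policy_class`` is hypothetical, and this is not necessarily how SkyPilot loads policies internally; it only mirrors the ``from ... import ...`` check in the hint above.

```python
import importlib


def load_policy_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to a class (hypothetical helper)."""
    module_path, _, class_name = dotted_path.rpartition('.')
    if not module_path:
        raise ValueError(f'Expected "module.ClassName", got {dotted_path!r}')
    # Raises ImportError if the package is not installed in this environment.
    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as e:
        raise ImportError(
            f'{class_name!r} not found in module {module_path!r}') from e


# Demonstrated with a standard-library class so the snippet runs anywhere.
ordered_dict_cls = load_policy_class('collections.OrderedDict')
print(ordered_dict_cls.__name__)
```

If the import fails here, the policy package is most likely not installed in the Python environment that runs SkyPilot.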
Admin-Side | ||
~~~~~~~~~~ | ||
|
||
An admin can distribute a Python package containing a pre-defined policy to users. The | ||
policy should implement the ``sky.AdminPolicy`` `interface <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_: | ||
|
||
|
||
.. literalinclude:: ../../../sky/admin_policy.py | ||
    :language: python | ||
    :pyobject: AdminPolicy | ||
    :caption: `AdminPolicy Interface <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_ | ||
|
||
|
||
Your custom admin policy should look like this: | ||
|
||
.. code-block:: python | ||
|
||
    import sky | ||
|
||
    class MyPolicy(sky.AdminPolicy): | ||
        @classmethod | ||
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
            # Logic for validating and mutating user requests. | ||
            ... | ||
            return sky.MutatedUserRequest(user_request.task, | ||
                                          user_request.skypilot_config) | ||
|
||
``UserRequest`` and ``MutatedUserRequest`` are defined as follows (see `source code <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_ for more details): | ||
|
||
|
||
.. literalinclude:: ../../../sky/admin_policy.py | ||
    :language: python | ||
    :pyobject: UserRequest | ||
    :caption: `UserRequest Class <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../sky/admin_policy.py | ||
    :language: python | ||
    :pyobject: MutatedUserRequest | ||
    :caption: `MutatedUserRequest Class <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_ | ||
|
||
|
||
In other words, an ``AdminPolicy`` can mutate any field of a user request, including | ||
the :ref:`task <yaml-spec>` and the :ref:`global skypilot config <config-yaml>`, | ||
giving admins considerable flexibility to control users' SkyPilot usage. | ||
|
||
An ``AdminPolicy`` can be used to both validate and mutate user requests. If | ||
a request should be rejected, the policy should raise an exception. | ||
|
||
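The validate-then-mutate flow can be sketched without SkyPilot installed. In the example below, ``StandInRequest`` is a hypothetical stand-in for ``sky.UserRequest`` (it is not the real class), and the policy name and the ``managed_by`` key are invented for illustration. It shows the two outcomes described above: raising an exception to reject a request, or returning a mutated request.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class StandInRequest:
    """Hypothetical stand-in for sky.UserRequest (not the real class)."""
    task: Dict[str, Any]
    skypilot_config: Dict[str, Any] = field(default_factory=dict)


class RequireNameAndTagPolicy:
    """Toy policy: rejects unnamed tasks, then tags the config."""

    @classmethod
    def validate_and_mutate(cls, request: StandInRequest) -> StandInRequest:
        # Validation: reject the request by raising an exception.
        if not request.task.get('name'):
            raise RuntimeError('Task must have a name.')
        # Mutation: build a modified copy of the config and return it.
        new_config = {**request.skypilot_config, 'managed_by': 'admin-policy'}
        return StandInRequest(task=request.task, skypilot_config=new_config)


ok = RequireNameAndTagPolicy.validate_and_mutate(
    StandInRequest(task={'name': 'train'}))
print(ok.skypilot_config)  # {'managed_by': 'admin-policy'}

try:
    RequireNameAndTagPolicy.validate_and_mutate(StandInRequest(task={}))
except RuntimeError as e:
    print(f'Rejected: {e}')  # Rejected: Task must have a name.
```

A real policy would subclass ``sky.AdminPolicy`` and return ``sky.MutatedUserRequest``, as in the template shown earlier.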
|
||
The ``sky.Config`` and ``sky.RequestOptions`` classes are defined as follows: | ||
|
||
.. literalinclude:: ../../../sky/skypilot_config.py | ||
    :language: python | ||
    :pyobject: Config | ||
    :caption: `Config Class <https://github.com/skypilot-org/skypilot/blob/master/sky/skypilot_config.py>`_ | ||
|
||
|
||
.. literalinclude:: ../../../sky/admin_policy.py | ||
    :language: python | ||
    :pyobject: RequestOptions | ||
    :caption: `RequestOptions Class <https://github.com/skypilot-org/skypilot/blob/master/sky/admin_policy.py>`_ | ||
|
||
|
||
Example Policies | ||
---------------- | ||
|
||
We have provided a few example policies in `examples/admin_policy/example_policy <https://github.com/skypilot-org/skypilot/tree/master/examples/admin_policy/example_policy>`_. You can test these policies by installing the example policy package in your Python environment. | ||
|
||
.. code-block:: bash | ||
|
||
    git clone https://github.com/skypilot-org/skypilot.git | ||
    cd skypilot | ||
    pip install examples/admin_policy/example_policy | ||
|
||
Reject All | ||
~~~~~~~~~~ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/example_policy/example_policy/skypilot_policy.py | ||
    :language: python | ||
    :pyobject: RejectAllPolicy | ||
    :caption: `RejectAllPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/example_policy/example_policy/skypilot_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/reject_all.yaml | ||
    :language: yaml | ||
    :caption: `Config YAML for using RejectAllPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/reject_all.yaml>`_ | ||
|
||
.. _kubernetes-labels-policy: | ||
|
||
Add Labels for all Tasks on Kubernetes | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/example_policy/example_policy/skypilot_policy.py | ||
    :language: python | ||
    :pyobject: AddLabelsPolicy | ||
    :caption: `AddLabelsPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/example_policy/example_policy/skypilot_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/add_labels.yaml | ||
    :language: yaml | ||
    :caption: `Config YAML for using AddLabelsPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/add_labels.yaml>`_ | ||
|
||
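``AddLabelsPolicy`` reads a nested config value with a default, updates it, and writes it back. The sketch below illustrates that read-modify-write pattern on a plain dictionary. The ``get_nested`` and ``set_nested`` functions here are simplified, hypothetical re-implementations written for this example, not SkyPilot's ``Config`` methods, and the ``existing_key`` entry is invented.

```python
from typing import Any, Dict, Tuple


def get_nested(config: Dict[str, Any], keys: Tuple[str, ...],
               default: Any) -> Any:
    """Return config[k1][k2]... or `default` if any key is missing."""
    current: Any = config
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def set_nested(config: Dict[str, Any], keys: Tuple[str, ...],
               value: Any) -> None:
    """Set config[k1][k2]... = value, creating intermediate dicts."""
    current = config
    for key in keys[:-1]:
        current = current.setdefault(key, {})
    current[keys[-1]] = value


LABELS_KEY = ('kubernetes', 'custom_metadata', 'labels')

# 'existing_key' is a made-up entry to show that siblings are preserved.
config: Dict[str, Any] = {'kubernetes': {'existing_key': 'kept'}}
labels = get_nested(config, LABELS_KEY, {})
labels['app'] = 'skypilot'
set_nested(config, LABELS_KEY, labels)
print(config)
```

Because the default ``{}`` is returned when no labels exist yet, the policy adds its label without discarding labels the user already configured.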
|
||
.. _disable-public-ip-policy: | ||
|
||
Always Disable Public IP for AWS Tasks | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/example_policy/example_policy/skypilot_policy.py | ||
    :language: python | ||
    :pyobject: DisablePublicIpPolicy | ||
    :caption: `DisablePublicIpPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/example_policy/example_policy/skypilot_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/disable_public_ip.yaml | ||
    :language: yaml | ||
    :caption: `Config YAML for using DisablePublicIpPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/disable_public_ip.yaml>`_ | ||
|
||
.. _use-spot-for-gpu-policy: | ||
|
||
Use Spot for all GPU Tasks | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. | ||
.. literalinclude:: ../../../examples/admin_policy/example_policy/example_policy/skypilot_policy.py | ||
    :language: python | ||
    :pyobject: UseSpotForGpuPolicy | ||
    :caption: `UseSpotForGpuPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/example_policy/example_policy/skypilot_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/use_spot_for_gpu.yaml | ||
    :language: yaml | ||
    :caption: `Config YAML for using UseSpotForGpuPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/use_spot_for_gpu.yaml>`_ | ||
|
||
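The core of ``UseSpotForGpuPolicy`` is a copy-with-override over each resource that requests accelerators. The sketch below shows the same idea with a hypothetical ``StandInResources`` dataclass (not ``sky.Resources``) and ``dataclasses.replace`` in place of the ``copy(use_spot=True)`` call used in the example policy.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StandInResources:
    """Hypothetical stand-in for a task's resources (not sky.Resources)."""
    accelerators: Optional[Dict[str, int]] = None
    use_spot: bool = False


def force_spot_for_gpu(
        resources: List[StandInResources]) -> List[StandInResources]:
    """Return new resources with use_spot=True wherever a GPU is requested."""
    return [
        replace(r, use_spot=True) if r.accelerators else r for r in resources
    ]


candidates = [
    StandInResources(accelerators={'A100': 1}),  # GPU task: switched to spot
    StandInResources(),  # CPU-only task: left unchanged
]
for r in force_spot_for_gpu(candidates):
    print(r)
```

Copying rather than editing in place leaves the user's original resource objects untouched, so CPU-only requests pass through exactly as submitted.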
.. _enforce-autostop-policy: | ||
|
||
Enforce Autostop for all Tasks | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/example_policy/example_policy/skypilot_policy.py | ||
    :language: python | ||
    :pyobject: EnforceAutostopPolicy | ||
    :caption: `EnforceAutostopPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/example_policy/example_policy/skypilot_policy.py>`_ | ||
|
||
.. literalinclude:: ../../../examples/admin_policy/enforce_autostop.yaml | ||
    :language: yaml | ||
    :caption: `Config YAML for using EnforceAutostopPolicy <https://github.com/skypilot-org/skypilot/blob/master/examples/admin_policy/enforce_autostop.yaml>`_ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
admin_policy: example_policy.AddLabelsPolicy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
admin_policy: example_policy.DisablePublicIpPolicy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
admin_policy: example_policy.EnforceAutostopPolicy |
6 changes: 6 additions & 0 deletions
examples/admin_policy/example_policy/example_policy/__init__.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"""Example admin policy module and prebuilt policies.""" | ||
from example_policy.skypilot_policy import AddLabelsPolicy | ||
from example_policy.skypilot_policy import DisablePublicIpPolicy | ||
from example_policy.skypilot_policy import EnforceAutostopPolicy | ||
from example_policy.skypilot_policy import RejectAllPolicy | ||
from example_policy.skypilot_policy import UseSpotForGpuPolicy |
121 changes: 121 additions & 0 deletions
examples/admin_policy/example_policy/example_policy/skypilot_policy.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,121 @@ | ||
"""Example prebuilt admin policies.""" | ||
import sky | ||
|
||
|
||
class RejectAllPolicy(sky.AdminPolicy): | ||
    """Example policy: rejects all user requests.""" | ||
|
||
    @classmethod | ||
    def validate_and_mutate( | ||
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
        """Rejects all user requests.""" | ||
        raise RuntimeError('Reject all policy') | ||
|
||
|
||
class AddLabelsPolicy(sky.AdminPolicy): | ||
    """Example policy: adds a kubernetes label for skypilot_config.""" | ||
|
||
    @classmethod | ||
    def validate_and_mutate( | ||
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
        config = user_request.skypilot_config | ||
        labels = config.get_nested(('kubernetes', 'custom_metadata', 'labels'), | ||
                                   {}) | ||
        labels['app'] = 'skypilot' | ||
        config.set_nested(('kubernetes', 'custom_metadata', 'labels'), labels) | ||
        return sky.MutatedUserRequest(user_request.task, config) | ||
|
||
|
||
class DisablePublicIpPolicy(sky.AdminPolicy): | ||
    """Example policy: disables public IP for all AWS tasks.""" | ||
|
||
    @classmethod | ||
    def validate_and_mutate( | ||
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
        config = user_request.skypilot_config | ||
        config.set_nested(('aws', 'use_internal_ip'), True) | ||
        if config.get_nested(('aws', 'vpc_name'), None) is None: | ||
            # If no VPC name is specified, it is likely a mistake. We should | ||
            # reject the request. | ||
            raise RuntimeError('VPC name should be set. Check organization ' | ||
                               'wiki for more information.') | ||
        return sky.MutatedUserRequest(user_request.task, config) | ||
|
||
|
||
class UseSpotForGpuPolicy(sky.AdminPolicy): | ||
    """Example policy: use spot instances for all GPU tasks.""" | ||
|
||
    @classmethod | ||
    def validate_and_mutate( | ||
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
        """Sets use_spot to True for all GPU tasks.""" | ||
        task = user_request.task | ||
        new_resources = [] | ||
        for r in task.resources: | ||
            if r.accelerators: | ||
                new_resources.append(r.copy(use_spot=True)) | ||
            else: | ||
                new_resources.append(r) | ||
|
||
        task.set_resources(type(task.resources)(new_resources)) | ||
|
||
        return sky.MutatedUserRequest( | ||
            task=task, skypilot_config=user_request.skypilot_config) | ||
|
||
|
||
class EnforceAutostopPolicy(sky.AdminPolicy): | ||
    """Example policy: enforce autostop for all tasks.""" | ||
|
||
    @classmethod | ||
    def validate_and_mutate( | ||
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest: | ||
        """Enforces autostop for all tasks. | ||
|
||
        Note that with this policy enforced, users can still change the autostop | ||
        setting for an existing cluster by using `sky autostop`. | ||
|
||
        Since we refresh the cluster status with `sky.status` whenever this | ||
        policy is applied, we should expect a few seconds of latency when a | ||
        user runs a request. | ||
        """ | ||
        request_options = user_request.request_options | ||
|
||
        # request_options is None when a task is executed with `jobs launch` or | ||
        # `sky serve up`. | ||
        if request_options is None: | ||
            return sky.MutatedUserRequest( | ||
                task=user_request.task, | ||
                skypilot_config=user_request.skypilot_config) | ||
|
||
        # Get the cluster record to operate on. | ||
        cluster_name = request_options.cluster_name | ||
        cluster_records = [] | ||
        if cluster_name is not None: | ||
            cluster_records = sky.status(cluster_name, refresh=True) | ||
|
||
        # Check if the user request should specify autostop settings. | ||
        need_autostop = False | ||
        if not cluster_records: | ||
            # Cluster does not exist | ||
            need_autostop = True | ||
        elif cluster_records[0]['status'] == sky.ClusterStatus.STOPPED: | ||
            # Cluster is stopped | ||
            need_autostop = True | ||
        elif cluster_records[0]['autostop'] < 0: | ||
            # Cluster is running but autostop is not set | ||
            need_autostop = True | ||
|
||
        # Check if the user request is setting autostop settings. | ||
        idle_minutes_to_autostop = request_options.idle_minutes_to_autostop | ||
        is_setting_autostop = (idle_minutes_to_autostop is not None and | ||
                               idle_minutes_to_autostop >= 0) | ||
|
||
        # If the cluster requires autostop but the user request is not setting | ||
        # autostop settings, raise an error. | ||
        if need_autostop and not is_setting_autostop: | ||
            raise RuntimeError('Autostop/down must be set for all clusters.') | ||
|
||
        return sky.MutatedUserRequest( | ||
            task=user_request.task, | ||
            skypilot_config=user_request.skypilot_config) |
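The branching in EnforceAutostopPolicy can be read as two small pure functions, which makes the decision table easy to check without a live cluster. The sketch below is an illustration only: the function names and the flattened boolean/integer parameters are hypothetical and replace the cluster records returned by `sky.status` in the policy above.

```python
from typing import Optional


def autostop_required(cluster_exists: bool, is_stopped: bool,
                      current_autostop: int) -> bool:
    """Mirror of the need_autostop branches, on plain values."""
    if not cluster_exists:
        return True  # New cluster: the request must set autostop.
    if is_stopped:
        return True  # Restarting a stopped cluster: must set autostop.
    # Running cluster: required only if autostop is currently unset (< 0).
    return current_autostop < 0


def should_reject(need_autostop: bool,
                  idle_minutes_to_autostop: Optional[int]) -> bool:
    """Reject when autostop is required but the request does not set it."""
    is_setting_autostop = (idle_minutes_to_autostop is not None and
                           idle_minutes_to_autostop >= 0)
    return need_autostop and not is_setting_autostop


# New cluster launched without an autostop setting: rejected.
print(should_reject(autostop_required(False, False, -1), None))  # True
# Running cluster that already has autostop of 10 minutes: accepted.
print(should_reject(autostop_required(True, False, 10), None))  # False
```

Separating the decision from the `sky.status` lookup also makes it clear that the few seconds of latency mentioned in the docstring come from the status refresh, not from the check itself.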
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[build-system] | ||
requires = ["setuptools>=61.0", "wheel"] | ||
build-backend = "setuptools.build_meta" | ||
|
||
[project] | ||
name = "example_policy" | ||
version = "0.0.1" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
admin_policy: example_policy.RejectAllPolicy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
resources: | ||
  cloud: aws | ||
  cpus: 2 | ||
  labels: | ||
    other_labels: test | ||
|
||
|
||
setup: | | ||
  echo "setup" | ||
run: | | ||
  echo "run" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
admin_policy: example_policy.UseSpotForGpuPolicy |