[Core] Admin policy enforcement plugin #3966

Merged
merged 67 commits into from
Sep 24, 2024

Conversation

@Michaelvll (Collaborator) commented Sep 20, 2024

This PR lets an admin supply a custom policy enforcement function written in Python. The function can apply arbitrary mutations to every user task, according to the admin's policy requirements.

Usage:

User-side

In ~/.sky/config.yaml:

policy: my_package.skypilot_policy_fn
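
The config value above is a dotted Python path. A minimal sketch of how such a path could be resolved into a callable, using only the standard library's `importlib`, is shown below. The helper name `load_policy_fn` is hypothetical and this is not claimed to be how SkyPilot loads the policy internally.

```python
import importlib
from typing import Any, Callable


def load_policy_fn(dotted_path: str) -> Callable[..., Any]:
    """Resolve 'package.module.attr' into the callable it names.

    Hypothetical helper for illustration only.
    """
    # Split 'my_package.skypilot_policy_fn' into module path and attribute.
    module_path, _, attr_name = dotted_path.rpartition('.')
    if not module_path:
        raise ValueError(f'Expected "module.attr", got {dotted_path!r}')
    module = importlib.import_module(module_path)
    fn = getattr(module, attr_name)
    if not callable(fn):
        raise TypeError(f'{dotted_path!r} is not callable')
    return fn
```

For the path to resolve, the package that contains the policy function has to be importable in the environment where the user runs SkyPilot.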

Admin-side

Create the skypilot_policy_fn function with the following signature:

def skypilot_policy_fn(user_task: sky.UserTask) -> sky.MutatedUserTask:
    pass

See policy.py for the definition of these two types.
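
As an illustration of the signature above, here is a minimal sketch of a policy that adds a label to every task. The two dataclasses are hypothetical stand-ins for `sky.UserTask` and `sky.MutatedUserTask`; their fields (`task`, `skypilot_config`) and the label key and value are assumptions, not the definitions in policy.py.

```python
import copy
import dataclasses
from typing import Any, Dict


@dataclasses.dataclass
class UserTask:
    """Hypothetical stand-in for sky.UserTask."""
    task: Dict[str, Any]
    skypilot_config: Dict[str, Any]


@dataclasses.dataclass
class MutatedUserTask:
    """Hypothetical stand-in for sky.MutatedUserTask."""
    task: Dict[str, Any]
    skypilot_config: Dict[str, Any]


def skypilot_policy_fn(user_task: UserTask) -> MutatedUserTask:
    # Work on copies so the caller's objects are left untouched.
    task = copy.deepcopy(user_task.task)
    config = copy.deepcopy(user_task.skypilot_config)
    # Attach an admin-chosen label (assumed key and value) to the task.
    labels = task.setdefault('resources', {}).setdefault('labels', {})
    labels['team'] = 'ml-infra'
    return MutatedUserTask(task=task, skypilot_config=config)
```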

TODO:

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
    • normal launch with customized labels
    • normal launch/exec with autostop enforcement
    • jobs launch with customized labels
    • jobs launch with autostop enforcement
    • service launch with customized labels
      export SKYPILOT_CONFIG='/home/gcpuser/skypilot-dev/examples/admin_policy/config_label_config.yaml'
      sky serve up -n test-global-label examples/serve/http_server/task.yaml --cloud gcp
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh
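
The autostop enforcement cases in the list above could be backed by logic along these lines. This is only a sketch: the limit `MAX_IDLE_MINUTES`, the helper name, and the convention that `None` or a negative value means "autostop disabled" are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical admin-chosen upper bound on idle minutes before autostop.
MAX_IDLE_MINUTES = 10


def enforce_autostop(idle_minutes_to_autostop: Optional[int]) -> int:
    """Return an autostop value that satisfies the assumed admin limit."""
    # Assumption: None or a negative value means autostop is disabled.
    if idle_minutes_to_autostop is None or idle_minutes_to_autostop < 0:
        return MAX_IDLE_MINUTES
    # Clamp overly long idle windows to the limit; keep shorter ones.
    return min(idle_minutes_to_autostop, MAX_IDLE_MINUTES)
```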

@Michaelvll Michaelvll changed the title [Core] Customized policy enforcement Plugin [Core] Customized policy enforcement plugin Sep 20, 2024
@@ -124,6 +125,8 @@ def up(

_validate_service_task(task)

task = policy.Policy().apply_to_task(task)
Member

A bit unclear if it's in-place since it also returns the task. How about "apply_in_place"?

Collaborator Author

It is not in-place. Either way is fine, but I find non-in-place operations to be the more common pattern.
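
To make the distinction concrete, here is a minimal sketch of the non-in-place pattern: the function returns a new object and the caller rebinds the name, as in `task = policy.Policy().apply_to_task(task)`. The dict-based task and the label are hypothetical; this is not the actual `apply_to_task` implementation.

```python
import copy
from typing import Any, Dict


def apply_to_task(task: Dict[str, Any]) -> Dict[str, Any]:
    """Return a mutated copy; the input task is left unchanged."""
    mutated = copy.deepcopy(task)
    # Hypothetical mutation applied by the policy.
    mutated.setdefault('labels', {})['managed-by'] = 'admin-policy'
    return mutated


task = {'name': 'train'}
# The caller must use the return value; the original dict is not modified.
task = apply_to_task(task)
```

With an in-place variant, the return value would be redundant, which is the ambiguity the review comment points at.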

@Michaelvll Michaelvll changed the title [Core] Customized policy enforcement plugin [Core] Admin policy enforcement plugin Sep 23, 2024
Member

@concretevitamin concretevitamin left a comment

LGTM @Michaelvll. I think two TODOs before merging:

(1) Smoke tests, since this touches config / backend
(2) Update PR description (policy)

sky/execution.py Outdated
cluster_exists = existing_handle is not None
cluster_record = global_user_state.get_cluster_from_name(cluster_name)
cluster_exists = cluster_record is not None
cluster_running = (cluster_record is not None and
Member

Q: what's the latency impact of this policy on sky launch/exec? Is it small? Do we expect users to use this example policy and be ok with the latency?
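
One rough way to answer the latency question is to time the policy callable directly. The sketch below uses only `time.perf_counter`; the helper name `time_policy` and the idea of calling the policy with a prepared request object are assumptions, and it measures the policy function alone, not the end-to-end `sky launch`/`exec` path.

```python
import time
from typing import Any, Callable


def time_policy(policy_fn: Callable[[Any], Any],
                request: Any,
                repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        policy_fn(request)
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the minimum over several runs reduces noise from one-off effects such as import or cache warm-up.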

@Michaelvll Michaelvll added this pull request to the merge queue Sep 24, 2024
Merged via the queue into master with commit 800f7d6 Sep 24, 2024
20 checks passed
@Michaelvll Michaelvll deleted the policy-hook branch September 24, 2024 04:44