Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve] Support headers in Readiness Probe #3552

Merged
merged 4 commits into from
May 21, 2024
Merged

Conversation

cblmemo
Copy link
Collaborator

@cblmemo cblmemo commented May 15, 2024

Closes #3549 by adding headers configuration in readiness probe.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
    • llm/vllm/service-with-auth.yaml
$ sky serve up llm/vllm/service-with-auth.yaml --gpus L4
Service from YAML spec: llm/vllm/service-with-auth.yaml
Service Spec:
Readiness probe method:           GET /v1/models with headers {"Authorization": "Bearer sky-authkey-3d2105b9-a9ba-4f13"}
Readiness initial delay seconds:  1200
Replica autoscaling policy:       Fixed 1 replica
Spot Policy:                      No spot policy
$ curl $(sky serve status --endpoint sky-service-4233)/v1/models -H 'Authorization: Bearer sky-authkey-3d2105b9-a9ba-4f13' 
{"object":"list","data":[{"id":"meta-llama/Llama-2-7b-chat-hf","object":"model","created":1715746959,"owned_by":"vllm","root":"meta-llama/Llama-2-7b-chat-hf","parent":null,"permission":[{"id":"modelperm-5cf910e014484663a648f66e38b08174","object":"model_permission","created":1715746959,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":false,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

@cblmemo cblmemo changed the title [Serve] [Serve] Support headers in Readiness Probe May 15, 2024
Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the support for the authorization token in the readiness probe @cblmemo! It looks mostly good to me!

llm/vllm/service-with-auth.yaml Outdated Show resolved Hide resolved
llm/vllm/service-with-auth.yaml Outdated Show resolved Hide resolved
if self.readiness_headers is None:
headers = ''
else:
headers = f' with headers {json.dumps(self.readiness_headers)}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Showing the authorization token directly in the log may not be very safe. We may want to redact the value of the headers?

Copy link
Collaborator Author

@cblmemo cblmemo May 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Changed to with custom headers now.

@cblmemo cblmemo requested a review from Michaelvll May 20, 2024 09:05
Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this @cblmemo! LGTM with the comment fixed.

@@ -567,7 +570,8 @@ def __init__(self, service_name: str,
self._update_mode = serve_utils.DEFAULT_UPDATE_MODE
logger.info(f'Readiness probe path: {spec.readiness_path}\n'
f'Initial delay seconds: {spec.initial_delay_seconds}\n'
f'Post data: {spec.post_data}')
f'Post data: {spec.post_data}\n'
f'Readiness headers: {spec.readiness_headers}')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we are still printing out the headers? Should we use list(spec.readiness_headers.keys()) instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed. Thanks!

@cblmemo cblmemo merged commit cf840dc into master May 21, 2024
20 checks passed
@cblmemo cblmemo deleted the serve-probe-headers branch May 21, 2024 04:31
@cblmemo cblmemo mentioned this pull request May 23, 2024
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: vLLM API-Key support for Sky Serve
2 participants