Remove call to Cloud Run API and set CPU & concurrency in GCF API instead #5605
Conversation
Codecov Report — patch coverage:

```
@@            Coverage Diff             @@
##           master    #5605      +/-   ##
==========================================
- Coverage   56.04%   55.97%   -0.07%
==========================================
  Files         317      317
  Lines       21496    21483      -13
  Branches     4393     4385       -8
==========================================
- Hits        12048    12026      -22
- Misses       8381     8394      +13
+ Partials     1067     1063       -4
```

... and 1 file with indirect coverage changes. View full report in Codecov by Sentry.
"maxInstanceRequestConcurrency", | ||
"concurrency" | ||
); | ||
proto.convertIfPresent(gcfFunction.serviceConfig, endpoint, "availableCpu", "cpu", (cpu) => { |
Let's leave a note or guard to only do this when this is a 2nd gen function (we'll probably eventually use the v2 API to configure v1 and v2 functions)
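A minimal sketch of the guard suggested above. The type and helper names here are hypothetical, not the actual firebase-tools implementation; the idea is simply to skip the CPU mapping unless the endpoint is a 2nd gen function, since the v2 API may eventually describe 1st gen functions as well.

```typescript
// Hypothetical types standing in for the real endpoint/serviceConfig shapes.
type Endpoint = { platform: "gcfv1" | "gcfv2"; cpu?: number };
type ServiceConfig = { availableCpu?: string };

// Only map CPU onto serviceConfig for 2nd gen functions; leave 1st gen untouched.
function setAvailableCpu(serviceConfig: ServiceConfig, endpoint: Endpoint): void {
  if (endpoint.platform !== "gcfv2") {
    return; // guard: v1 functions have no availableCpu field to set
  }
  if (endpoint.cpu !== undefined) {
    serviceConfig.availableCpu = String(endpoint.cpu);
  }
}
```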
```ts
gcfFunction.serviceConfig.availableMemory = mem > 1024 ? `${mem / 1024}Gi` : `${mem}Mi`;
proto.renameIfPresent(gcfFunction.serviceConfig, endpoint, "minInstanceCount", "minInstances");
proto.renameIfPresent(gcfFunction.serviceConfig, endpoint, "maxInstanceCount", "maxInstances");
proto.renameIfPresent(
```
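The memory line in the diff above encodes a small formatting rule: the v2 API takes memory as a unit-suffixed string, so values over 1024 MB are expressed in Gi and smaller values in Mi. A standalone sketch of just that rule (the function name is illustrative):

```typescript
// Convert a memory size in MB into the Gi/Mi string form the GCF v2 API expects.
function formatMemory(memMb: number): string {
  return memMb > 1024 ? `${memMb / 1024}Gi` : `${memMb}Mi`;
}
```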
The old code defaulted to 80 concurrent requests if concurrency was not set and CPU was >= 1. We should preserve that behavior here.
The `resolveCpuAndConcurrency` function in the prepare phase already provides the default. Are there any edge cases in which the prepare phase does not happen but the release phase does? Or should we move this logic into the release phase?
It's fine. I may have just been programming more defensively than necessary in previous versions. (Or, more likely, it became clear that `resolveCpuAndConcurrency` had to be written to support inference on update after I had already written the fabricator code.)
The GCF API was recently updated to expose the concurrency and CPU fields for the Cloud Run service underlying 2nd gen Cloud Functions. Setting concurrency and CPU used to require a separate request to the Cloud Run API, which meant an extra API call per 2nd gen function deploy (for functions with non-default concurrency and CPU values). This PR eliminates the extra call to the Cloud Run API and instead sets the concurrency and CPU values through the GCF API, in the same request that creates or updates the function.
Deploying functions via the GCF API consumes the user's project quota for the Cloud Run API; reducing the number of Cloud Run API calls should lead to fewer quota-related failures on large function deployments.
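The shape of the change can be sketched as follows. The field names `availableMemory`, `availableCpu`, and `maxInstanceRequestConcurrency` come from the GCF v2 API surface discussed in this PR; the builder function itself is hypothetical. The point is that CPU and concurrency now ride along in the single create/update payload rather than requiring a follow-up Cloud Run request.

```typescript
// Hypothetical subset of the GCF v2 ServiceConfig message.
interface ServiceConfig {
  availableMemory?: string;
  availableCpu?: string;
  maxInstanceRequestConcurrency?: number;
}

// Build the serviceConfig portion of a single create/update request,
// including the CPU and concurrency fields that previously needed a
// separate Cloud Run API call.
function buildServiceConfig(opts: {
  memMb: number;
  cpu: number;
  concurrency: number;
}): ServiceConfig {
  return {
    availableMemory: opts.memMb > 1024 ? `${opts.memMb / 1024}Gi` : `${opts.memMb}Mi`,
    availableCpu: String(opts.cpu),
    maxInstanceRequestConcurrency: opts.concurrency,
  };
}
```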