
Collection of best practices, reference architectures, examples, and utilities to deploy Foundation Models with KServe on AWS.


aws-samples/awsome-kserve-inference

Foundation Model Inference Architectures with KServe on EKS

This repository contains a reference architecture and test cases for Foundation Model inference with KServe on Amazon EKS, integrating Karpenter as the cluster autoscaler.

KServe provides a standard, Kubernetes-native model inference platform for scalable use cases. Complementing it, Karpenter delivers fast, simplified compute provisioning that makes optimal use of cloud resources. Together, they make it practical to exploit Spot Instances for better cost efficiency. This reference architecture illustrates how these technologies work and demonstrates their combined power in enabling efficient serverless ML deployments.
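As an illustration of the Karpenter side of this combination, the sketch below shows a NodePool that prefers Spot capacity for GPU workloads and falls back to On-Demand. The pool name, the referenced `EC2NodeClass`, and the instance families are assumptions for illustration, not values taken from this repository:

```yaml
# Hypothetical Karpenter NodePool: Spot-first GPU capacity with On-Demand fallback.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-spot            # assumed name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default       # assumes an EC2NodeClass named "default" exists
      requirements:
        # Karpenter prioritizes Spot when both capacity types are allowed.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5", "g6"]   # example GPU instance families
      taints:
        # Keep non-GPU pods off these (more expensive) nodes.
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
  limits:
    nvidia.com/gpu: 8       # cap total GPUs this pool may provision
```

Restricting the pool with a GPU taint and a resource limit is a common pattern to keep accelerated capacity reserved for inference pods while bounding cost.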

Deployment

This section guides you through deploying an EKS cluster and the required Kubernetes custom resources. This repository is built on top of Karpenter Blueprints; please refer to that repository for the infrastructure setup, or run `make` to

Infrastructure validation

Test cases

Once you have deployed the infrastructure, you can run the test cases.
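A typical test case deploys a KServe `InferenceService` and verifies that Karpenter provisions a node for it. The manifest below is a minimal sketch assuming the KServe Hugging Face serving runtime; the service name, model ID, and tolerations are illustrative, not taken from this repository's test cases:

```yaml
# Hypothetical InferenceService using KServe's Hugging Face runtime.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-llm              # assumed name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=sample-llm
        - --model_id=bert-base-uncased   # example model for illustration
      resources:
        limits:
          nvidia.com/gpu: "1"   # triggers Karpenter to provision a GPU node
    tolerations:
      # Matches the taint a GPU-dedicated NodePool would apply.
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
```

Applying this with `kubectl apply -f` should cause Karpenter to launch a GPU node if none with spare capacity exists, after which KServe exposes the model behind a standard inference endpoint.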

License

This library is licensed under the MIT-0 License. See the LICENSE file.
