Releases · kserve/kserve

21 May 09:58

yuzisun

v0.13.0-rc1

6c37dce

v0.13.0-rc1 Pre-release

What's Changed

upgrade vllm/transformers version by @johnugeorge in #3671
Add openai models endpoint by @cmaddalozzo in #3666
feat: Support customizable deployment strategy for RawDeployment mode. Fixes #3452 by @terrytangyuan in #3603
Enable dtype support for huggingface server by @Datta0 in #3613
Add method for checking model health/readiness by @cmaddalozzo in #3673
fix for extract zip from gcs by @andyi2it in #3510
Update Dockerfile and Readme by @gavrishp in #3676
Update huggingface readme by @alexagriffith in #3678
fix: HPA equality check should include annotations by @terrytangyuan in #3650
Fix: huggingface runtime in helm chart by @yuzisun in #3679
Fix: model id and model dir check order by @yuzisun in #3680
Fix:vLLM Model Supported check throwing circular dependency by @gavrishp in #3688
Fix: Allow null in Finish reason streaming response in vLLM by @gavrishp in #3684
Unify the log configuration using kserve logger by @sivanantha321 in #3577
Remove conversion webhook from kubeflow manifest patch by @sivanantha321 in #3700
Add the field ResponseStartTimeoutSeconds to create ksvc by @houshengbo in #3705

New Contributors

@Datta0 made their first contribution in #3613

Full Changelog: v0.13.0-rc0...v0.13.0-rc1

Contributors

cmaddalozzo, houshengbo, and 8 other contributors

07 May 10:11

yuzisun

v0.13.0-rc0

bfc2e21

v0.13.0-rc0 Pre-release

🌈 What's New?

add support for async streaming in predict by @alexagriffith in #3475
Fix: Support model parallelism in HF transformer by @gavrishp in #3459
Support model revision and tokenizer revision in huggingface server by @lizzzcai in #3558
OpenAI schema by @tessapham in #3477
Support OpenAIModel in ModelRepository by @grandbora in #3590
updated xgboost to support json and ubj models by @andyi2it in #3551
Add OpenAI API support to Huggingfaceserver by @cmaddalozzo in #3582
VLLM support for OpenAI Completions in HF server by @gavrishp in #3589
Add a user friendly error message for http exceptions by @grandbora in #3581
feat: Provide minimal distribution of CRDs by @terrytangyuan in #3492
set default SAFETENSORS_FAST_GPU and HF_HUB_DISABLE_TELEMETRY in HF Server by @lizzzcai in #3594
Enabled the multiple domains support on an inference service by @houshengbo in #3615
Add base model for proxying request to an OpenAI API enabled model server by @cmaddalozzo in #3621
Add headers to predictor exception logging by @grandbora in #3658
Enhance controller setup based on available CRDs by @israel-hdez in #3472

⚠️ What's Changed

Remove conversion webhook from manifests by @Jooho in #3476
Remove cluster level list/watch for configmaps, serviceaccounts, secrets by @sivanantha321 in #3469
chore: Remove Seldon Alibi dependencies. Fixes #3380 by @terrytangyuan in #3443
docs: Move Alibi explainer to docs by @terrytangyuan in #3579
Remove generate endpoints by @cmaddalozzo in #3654

🐛 What's Fixed

Fix:Support Parallelism in vllm runtime by @gavrishp in #3464
fix: Instantiate HuggingfaceModelRepository only when model cannot be loaded. Fixes #3423 by @terrytangyuan in #3424
Fix isADirectoryError in Azure blob download by @tjandy98 in #3502
Fix bug: Remove redundant helm chart affinity on predictor CRD by @trojaond in #3481
Make the modelcar injection idempotent by @rhuss in #3517
Only pad left for decode-only architecture models. by @sivanantha321 in #3534
fix lint typo on Makefile by @spolti in #3569
fix: Set writable cache folder to avoid permission issue. Fixes #3562 by @terrytangyuan in #3576
Fix model unload in server stop method by @sivanantha321 in #3587
Fix golint errors by @andyi2it in #3552
Fix make deploy-dev-storage-initializer not working by @sivanantha321 in #3617
Fix Pydantic 2 warnings by @cmaddalozzo in #3622
build: Fix CRD copying in generate-install.sh by @terrytangyuan in #3620
Only load from model repository if model binary is not found under model_dir by @sivanantha321 in #3559
build: Remove misleading logs from minimal-crdgen.sh by @terrytangyuan in #3641
Assign device to input tensors in huggingface server with huggingface backend by @saileshd1402 in #3657
Fix Huggingface server stopping criteria by @cmaddalozzo in #3659
Explicitly specify pad token id when generating tokens by @sivanantha321 in #3565
Fix quick install does not clean up Istio installer by @sivanantha321 in #3660

⬆️ Version Upgrade

Upgrade orjson to version 3.9.15 by @spolti in #3488
feat: upgrade to new fastapi, update models to handle both pydantic v… by @timothyjlaurent in #3374
Update cert manager version in quick install script by @shauryagoel in #3496
ci: Bump minikube version to work with newer K8s version by @terrytangyuan in #3498
upgrade knative to 1.13 by @andyi2it in #3457
Upgrade istio to 1.20 works for the Github Actions by @houshengbo in #3529
chore: Bump ModelMesh version to v0.12.0-rc0 in Helm chart by @terrytangyuan in #3642

🔨 Project SDLC

Enhance CI environment by @sivanantha321 in #3440
Fixed go lint error using golangci-lint tool. by @andyi2it in #3378
chore: Update list of reviewers by @ckadner in #3484
build: Add helm docs update to make generate command by @terrytangyuan in #3437
Added v2 infer test for supported model frameworks. by @andyi2it in #3349
fix the quote format same with others and docstrings by @leyao-daily in #3490
remove unnecessary Istio settings from quick_install.sh by @peterj in #3493
Remove GOARCH by @mkumatag in #3523
GH Alert: Potential file inclusion via variable by @spolti in #3520
Update codeQL to v3 by @spolti in #3548
switch e2e test inference graph to raw mode by @andyi2it in #3511
Black lint by @cmaddalozzo in #3568
Fix python linter by @sivanantha321 in #3571
build: Add flake8 and black to pre-commit hooks by @terrytangyuan in #3578
build: Allow pre-commit to keep changes in reformatted code by @terrytangyuan in #3604
Allow rerunning failed workflows by comment by @andyi2it in #3550
add re-run info in the PR templates by @spolti in #3633
Add e2e tests for huggingface by @sivanantha321 in #3600
Test image builds for ARM64 arch in CI by @sivanantha321 in #3629
workflow file for cherry-pick on comment by @andyi2it in #3653

CVE patches

CVE-2024-24762 - update fastapi to 0.109.1 by @spolti in #3556
golang.org/x/net Allocation of Resources Without Limits or Throttling by @spolti in #3596
Fix CVE-2023-45288 for qpext by @sivanantha321 in #3618
Security fix - CVE 2024 24786 by @andyi2it in #3585

📝 Documentation Update

qpext: fix a typo in qpext doc by @daixiang0 in #3491
Update KServe project description by @yuzisun in #3524
Update kserve cake diagram by @yuzisun in #3530
Remove white background for the kserve diagram by @yuzisun in #3531
fix a typo in OPENSHIFT_GUIDE.md by @marek-veber in #3544
Fix typo in README.md by @terrytangyuan in #3575

New Contributors

@leyao-daily made their first contribution in #3490
@peterj made their first contribution in #3493
@timothyjlaurent made their first contribution in #3374
@shauryagoel made their first contribution in #3496
@mkumatag made their first contribution in #3523
@marek-veber made their first contribution in #3544
@trojaond made their first contribution in #3481
@grandbora made their first contribution in #3590
@saileshd1402 made their first contribution in #3657

Full Changelog: v0.12.1...v0.13.0-rc0

Contributors

rhuss, cmaddalozzo, and 24 other contributors

23 Apr 12:20

yuzisun

v0.12.1

d94ca25

v0.12.1 Latest

What's Changed

[release-0.12] Update fastapi to 0.109.1 and Support ray 2.10 by @sivanantha321 in #3609
[release-0.12] Pydantic 2 support by @cmaddalozzo in #3614
[release-0.12] Make the modelcar injection idempotent by @sivanantha321 in #3612
Prepare for release 0.12.1 by @sivanantha321 in #3610
release-0.12 pin back ray to 2.10 by @yuzisun in #3616
[release-0.12] Fix docker build failure for ARM64 by @sivanantha321 in #3627

Full Changelog: v0.12.0...v0.12.1

Contributors

cmaddalozzo, yuzisun, and sivanantha321

25 Feb 17:17

yuzisun

v0.12.0

c9570d6

v0.12.0

🌈 What's New?

Core Inference & Serving Runtimes

Implement HuggingFace model server by @yuzisun in #3334
feat: Add HuggingFace runtime out-of-the-box support by @terrytangyuan in #3395
Implement support for vllm as alternative backend by @gavrishp in #3415
Torchserve grpc v2 by @andyi2it in #3247
feat: CA bundle mount options for storage initializer by @Jooho in #3250
Add support for modelcars by @rhuss in #3110
Add compatibility for Istio CNI plugin by @israel-hdez in #3316
feat: Allow to disable ingress creation for raw deployment mode by @terrytangyuan in #3436

Advanced Inference

RawDeployment support for Inference Graph by @bmopuri in #3199, @bmopuri in #3194
Added custom request timeout for inferencegraph. by @andyi2it in #3173
Add regex support for propagating IG headers by @sivanantha321 in #3178

KServe Python SDK, Storage

Unpack archive files for hdfs by @sivanantha321 in #3093
feat: Support S3 transfer acceleration by @terrytangyuan in #3305

⚠️ What's Changed

Change the default value for enableDirectPvcVolumeMount to true by @Jooho in #3371
Add model arguments to API and update BERT inference example by @yuzisun in #3332

--model_name, --predictor_host, --predictor_use_ssl, and --predictor_request_timeout_seconds have been added to the kserve model server and no longer need to be defined in a custom predictor or transformer. --protocol is deprecated and superseded by --predictor_protocol. More details can be found in the API reference doc.
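As a rough illustration of the flag change described above, the following standard-library sketch parses the four named flags and maps the deprecated --protocol onto --predictor_protocol. It is not KServe's actual parser: the function names, the default values, and the fallback to "v1" are assumptions made for the example.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the model server's built-in parser."""
    parser = argparse.ArgumentParser(description="illustrative model server flags")
    # Flag names come from the release note; defaults are assumed.
    parser.add_argument("--model_name", default="model")
    parser.add_argument("--predictor_host", default=None)
    parser.add_argument("--predictor_use_ssl", action="store_true")
    parser.add_argument("--predictor_request_timeout_seconds", type=int, default=600)
    parser.add_argument("--predictor_protocol", default=None)
    # Deprecated flag, kept so that older command lines still parse.
    parser.add_argument("--protocol", default=None, help="deprecated")
    return parser


def parse_args(argv):
    """Parse argv and fold the deprecated --protocol into --predictor_protocol."""
    args = build_parser().parse_args(argv)
    if args.protocol is not None:
        warnings.warn(
            "--protocol is deprecated; use --predictor_protocol",
            DeprecationWarning,
            stacklevel=2,
        )
        # The new flag wins if both are given.
        if args.predictor_protocol is None:
            args.predictor_protocol = args.protocol
    if args.predictor_protocol is None:
        args.predictor_protocol = "v1"  # assumed default for this sketch
    return args


args = parse_args(["--model_name", "bert", "--predictor_host", "bert-predictor.default"])
print(args.model_name, args.predictor_host, args.predictor_protocol)
```

With flags handled by the server itself, a custom predictor or transformer only needs to declare the arguments that are specific to it.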

🐛 What's Fixed

Removing update op from pod-mutator webhook by @rachitchauhan43 in #3163
Fix quick install script by @dtrifiro in #3164
Fix self-signed-ca installation by @sivanantha321 in #3165
Add S3_VERIFY_SSL to storage.py for S3 by @Jooho in #3172
Fix runtime not found for triton due to wrong default protocolVersion by @sivanantha321 in #3177
Make ModelServer to stop correctly when using more than 1 worker by @andyi2it in #3174
Fix serving runtime webhook cert namespace for kubeflow installation by @sivanantha321 in #3188
Fix knative config-defaults values overridden by kserve by @sivanantha321 in #3130
Fix qpext metrics port by @yuzisun in #3209
Added async with postprocess method. by @andyi2it in #3204
Fix lightgbm model input conversion when input is list of lists by @sivanantha321 in #3226
Validation added for ensuring same model format has same priority for runtime by @andyi2it in #3181
Fix: Unexpected Panic in Inference graph when it fails to create http request by @HAO2167 in #3079
Support verify variable with storage-config json style (fix-3263) by @Jooho in #3267
s3 storage initializer: only set environment variables if variables are set in storage secret json by @dtrifiro in #3259
Fix tensorflow e2e test fails due to OOM error by @sivanantha321 in #3293
fix: Properly handle the creation and closure of success file in DownloadModel() by @terrytangyuan in #3295
fix: Surface errors when writing graphHandler response by @terrytangyuan in #3308
Fix qpext hangs during shutdown by @sivanantha321 in #3268
fix: Check if HPA has the same scaleTargetRef and behavior by @terrytangyuan in #3294
Updated quick_install script to temporarily fix 0.11.2 release install by @andyi2it in #3311
image_patch_dev.sh: set pipefail by @dtrifiro in #3274
Move pmml worker validation to runtime by @sivanantha321 in #3182
Introduce retry on resource conflict by @sivanantha321 in #3240
Fix inference request fails when sending with less number of features than the total model features on lightgbm by @sivanantha321 in #3313
Fix raw deployment service points to predictor container port instead of transformer container port in transformer collocation by @sivanantha321 in #3318
Restrict storage uri to predictor only in collocation of transformer and predictor by @sivanantha321 in #3280
feat: Expose defaults for several batcher handler parameters by @terrytangyuan in #3301
fix: Properly close resources and handle errors in agent and storage. Fixes #3323 by @terrytangyuan in #3321
Handles s3 download for object name starts with folder name. by @andyi2it in #3205
chore: Remove unused timeout annotation and flag in batcher by @terrytangyuan in #3341
Pass missing infer parameters during conversion by @sivanantha321 in #3368
Add exception handler for model server and Add ability to specify custom handler by @sivanantha321 in #3405
fix: Add missing volume mount to transformer container when using modelcars by @rhuss in #3384
fix: Add 'model_version' to InferResponse in python library by @ajstewart in #3466
Fix v2 model ready url in kserve client by @sivanantha321 in #3403
Fix parameters value type conversion by pydantic by @sivanantha321 in #3430
Fix Raw Logger E2E by @israel-hdez in #3434
Expose qpext aggregate metrics port on container by @sivanantha321 in #3291
Fix dup metrics aggr port by @yuzisun in #3447
fix: HuggingFace predictor should not be recognized as multi-model server by @terrytangyuan in #3449
Fix: bugs for huggingface runtime template by @yuzisun in #3448
Fix: Add padding and truncation in huggingface tokenizer by @kevinmingtarja in #3450
Fix: vllm backend does not work with model_dir for huggingface runtime by @yuzisun in #3456
Fix azure workload identity federation by excluding azure client secret by @robbertvdg in #3390
Change certificate to ca_bundle in json style of s3 storageSecret by @Jooho in #3463

⬆️ Version Upgrade

Upgrade istio Api and migrate to v1beta1 Api version by @sivanantha321 in #3150
Bump torchserve version to 0.9.0 by @gavrishp in #3217
Allow ray >=2.7,<3 by @ddelange in #3075
Bump istio version to 1.19.4 by @sivanantha321 in #3258
Updated ray to 2.8.0 and removed detached flag to avoid deprecation error in future by @andyi2it in #3272
chore: Upgrade to XGBoost v2.0.2. Fixes #3310 by @terrytangyuan in #3309
chore: Upgrade Go to v1.21 by @terrytangyuan in #3296
Added 3.11 support for paddle in workflow. by @andyi2it in #3246
Upgraded poetry version to 1.7.1 by @andyi2it in #3271
Upgrade cloudevent to v2 by @homily707 in #3255
Update knative-serving by @spolti in #3362
Update google-cloud-storage dependency to >=2.3.0,<3.0.0 and ray dependency to >=2.8.1, <3.0.0 by @sivanantha321 in #3389

🔨 Project SDLC

chore: Add design doc template links to feature request template by @ckadner in #3155
Make storage initializer image configurable by @yuzisun in #3145
Increase pytest workers for kourier e2e test by @sivanantha321 in #3151
Restrict workflow concurrency by @vignesh-murugani2i in #3167
Generate client-go for StorageContainer CR by @sivanantha321 in #3152
Refactor v1 vs. v2 endpoint unit tests in kserve/test/test_server.py… by @guohaoyu110 in #3158
Verify codegen in CI by @sivanantha321 in ...

Contributors

rhuss, spolti, and 23 other contributors

27 Jan 14:10

yuzisun

v0.12.0-rc1

6fee880

v0.12.0-rc1 Pre-release

What's Changed

docs: Corrections and edits on release process document by @terrytangyuan in #3326
build: Switch to use kustomize in kubectl to simplify build process. Fixes #3314 by @terrytangyuan in #3315
feat: Expose defaults for several batcher handler parameters by @terrytangyuan in #3301
fix: Properly close resources and handle errors in agent and storage. Fixes #3323 by @terrytangyuan in #3321
Add model arguments to API and update BERT inference example by @yuzisun in #3332
chore: Update generated APIs and check generated manifests by @terrytangyuan in #3335
Update python model serving runtime API docstring by @yuzisun in #3338
Handles s3 download for object name starts with folder name. by @andyi2it in #3205
chore: Remove unused timeout annotation and flag in batcher by @terrytangyuan in #3341
ci: Automate release process by @terrytangyuan in #3345
fixes critical vulnerabilities on ray by @spolti in #3285
chore: Bump versions to prepare v0.12.0-rc1 release by @terrytangyuan in #3352
Change version for helm charts in README by @gawsoftpl in #3353
Fixes CVE-2023-48795 by @spolti in #3354
Fix Stack-based Buffer Overflow on protobuf by @spolti in #3358
Update knative-serving by @spolti in #3362
Fixes vulnerabilities on the otelhttp dependency by @spolti in #3361
Change the default value for enableDirectPvcVolumeMount to true by @Jooho in #3371
feat: Automatically generate Helm Chart docs. Fixes #3356 by @terrytangyuan in #3363
Modified script for include all kserve poetry projects. by @andyi2it in #3350
RawDeployment support for Inference Graph by @bmopuri in #3199
Add compatibility for Istio CNI plugin by @israel-hdez in #3316
Pass missing infer parameters during conversion by @sivanantha321 in #3368
feat: Support S3 transfer acceleration by @terrytangyuan in #3305
Implement HuggingFace model server by @yuzisun in #3334
fix: Add missing volume mount to transformer container when using modelcars by @rhuss in #3384
align cloudevents/sdk-go dependency by @spolti in #3387

New Contributors

@gawsoftpl made their first contribution in #3353

Full Changelog: v0.12.0-rc0...v0.12.0-rc1

Contributors

rhuss, spolti, and 8 other contributors

24 Dec 19:14

yuzisun

v0.12.0-rc0

85eca89

v0.12.0-rc0 Pre-release

What's Changed

Make storage initializer image configurable by @yuzisun in #3145
chore: Add design doc template links to feature request template by @ckadner in #3155
Increase pytest workers for kourier e2e test by @sivanantha321 in #3151
Upgrade istio Api and migrate to v1beta1 Api version by @sivanantha321 in #3150
Unpack archive files for hdfs by @sivanantha321 in #3093
Removing update op from pod-mutator webhook by @rachitchauhan43 in #3163
Fix quick install script by @dtrifiro in #3164
Fix self-signed-ca installation by @sivanantha321 in #3165
Generate client-go for StorageContainer CR by @sivanantha321 in #3152
Add S3_VERIFY_SSL to storage.py for S3 by @Jooho in #3172
Allow disabling creation of the HPA in raw deployment mode by @andyi2it in #3086
Restrict workflow concurrency by @vignesh-murugani2i in #3167
Refactor v1 vs. v2 endpoint unit tests in kserve/test/test_server.py… by @guohaoyu110 in #3158
Fix runtime not found for triton due to wrong default protocolVersion by @sivanantha321 in #3177
Make ModelServer to stop correctly when using more than 1 worker by @andyi2it in #3174
Added custom request timeout for inferencegraph. by @andyi2it in #3173
Fix serving runtime webhook cert namespace for kubeflow installation by @sivanantha321 in #3188
Add go security scan for PRs and set it up to run on a regular schedule by @sivanantha321 in #3170
Verify codegen in CI by @sivanantha321 in #3189
Fix knative config-defaults values overridden by kserve by @sivanantha321 in #3130
Fix qpext metrics port by @yuzisun in #3209
docs: fix some typos by @daixiang0 in #3214
chore: Add new PR reviewers and approvers by @ckadner in #3213
Added async with postprocess method. by @andyi2it in #3204
Remove the redundant python lint check in CI environment by @nilakshi104 in #3184
Move pmml worker validation to runtime by @sivanantha321 in #3182
Bump torchserve version to 0.9.0 by @gavrishp in #3217
CVE-2023-44487 - qpext by @spolti in #3203
Allow ray >=2.7,<3 by @ddelange in #3075
Fix lightgbm model input conversion when input is list of lists by @sivanantha321 in #3226
CVE-2023-44487 by @spolti in #3202
Sanitize a command line argument in agent by @israel-hdez in #3245
Validation added for ensuring same model format has same priority for runtime by @andyi2it in #3181
Fix: Unexpected Panic in Inference graph when it fails to create http request by @HAO2167 in #3079
Add default clusterstoragecontainer cr into resources by @homily707 in #3219
Support verify variable with storage-config json style (fix-3263) by @Jooho in #3267
Update qpext docs on image patch by @sivanantha321 in #3266
Added 3.11 support for paddle in workflow. by @andyi2it in #3246
Torchserve grpc v2 by @andyi2it in #3247
Bump istio version to 1.19.4 by @sivanantha321 in #3258
image_patch_dev.sh: set pipefail by @dtrifiro in #3274
s3 storage initializer: only set environment variables if variables are set in storage secret json by @dtrifiro in #3259
feat: CA bundle mount options for storage initializer by @Jooho in #3250
Fix tensorflow e2e test fails due to OOM error by @sivanantha321 in #3293
Update Istio-Dex docs by @sivanantha321 in #3260
chore: Upgrade Go to v1.21 by @terrytangyuan in #3296
fix: Properly handle the creation and closure of success file in DownloadModel() by @terrytangyuan in #3295
Updated ray to 2.8.0 and removed detached flag to avoid deprecation error in future by @andyi2it in #3272
fix: Surface errors when writing graphHandler response by @terrytangyuan in #3308
Fix qpext hangs during shutdown by @sivanantha321 in #3268
chore: Upgrade to XGBoost v2.0.2. Fixes #3310 by @terrytangyuan in #3309
fix: Check if HPA has the same scaleTargetRef and behavior by @terrytangyuan in #3294
Updated quick_install script to temporarily fix 0.11.2 release install by @andyi2it in #3311
Remove deprecated protobuf packages by @sivanantha321 in #3328
Add health check for controller manager by @sivanantha321 in #3289
Introduce retry on resource conflict by @sivanantha321 in #3240
Updated Kserve version file path in pyproject.toml. by @andyi2it in #3225
docs: Add link to OpenShift Container Platform instructions by @terrytangyuan in #3322
Fix inference request fails when sending with less number of features than the total model features on lightgbm by @sivanantha321 in #3313
Add a CI_USE_ISVC_HOST for testing with the ISVC hostname by @israel-hdez in #3324
Upgraded poetry version to 1.7.1 by @andyi2it in #3271
ci: publish helm chart to ghcr by @davidspek in #3319
Fix raw deployment service points to predictor container port instead of transformer container port in transformer collocation by @sivanantha321 in #3318
Upgrade cloudevent to v2 by @homily707 in #3255
Restrict storage uri to predictor only in collocation of transformer and predictor by @sivanantha321 in #3280
Add support for modelcars by @rhuss in #3110
Add regex support for propagating IG headers by @sivanantha321 in #3178
chore: Prepare v0.12.0-rc0 release by @terrytangyuan in #3325

New Contributors

@dtrifiro made their first contribution in #3164
@Jooho made their first contribution in #3172
@vignesh-murugani2i made their first contribution in #3167
@guohaoyu110 made their first contribution in #3158
@bmopuri made their first contribution in #3194
@daixiang0 made their first contribution in #3214
@nilakshi104 made their first contribution in #3184
@gavrishp made their first contribution in #3217
@spolti made their first contribution in #3203
@HAO2167 made their first contribution in #3079
@homily707 made their first contribution in #3219
@rhuss made their first contribution in #3110

Full Changelog: v0.11.1...v0.12.0-rc0

Contributors

rhuss, spolti, and 19 other contributors

15 Nov 14:13

yuzisun

v0.11.2

f7db2a3

v0.11.2

What's Changed

Fix serving runtime webhook cert namespace for kubeflow installation by @sivanantha321 in #3190
[release-0.11] Fix qpext metrics port (#3209) by @houshengbo in #3210
[release-0.11] Fix lightgbm model input conversion when input is list of lists by @sivanantha321 in #3229
[release-0.11]Fix runtime not found for triton due to wrong default protocolVersion by @sivanantha321 in #3232
[release-0.11]Fix mlserver runtime priority for sklearn by @sivanantha321 in #3233
[release-0.11] Cherry-picks related to CVE-2023-44487 by @israel-hdez in #3242
Version bump to 0.11.2 by @israel-hdez in #3244

New Contributors

@houshengbo made their first contribution in #3210

Full Changelog: v0.11.1...v0.11.2

Contributors

houshengbo, israel-hdez, and sivanantha321

22 Sep 22:53

yuzisun

v0.11.1

52b8804

v0.11.1

What's Changed

[docker] reduction in the number of layers for controller image by @alekseyolg in #3070
document status conditions for RoutesReady and LatestDeploymentReady by @tessapham in #3069
Update indirect dependency golang.org/x/net/html by @israel-hdez in #3072
Fix kubeflow overlay kustomization by @sivanantha321 in #3083
Check sys platform before using SIGQUIT - fix windows development by @andyi2it in #3089
Use knative operator for installing knative in e2e tests by @sivanantha321 in #2984
Upgrade k8s to 1.27, istio 1.8 in test environment by @sivanantha321 in #3077
Introduce Storage container CRD by @greenmoon55 in #3060
List Models v2 REST API by @jvujjini in #2963
Introduce Priority field in ServingRuntime by @sivanantha321 in #3031
Storage initializer fix so that it downloads only specific file when provided uri is not a folder by @andyi2it in #3088
Inference Graph error response handling by @rachitchauhan43 in #3039
Add doc.go for v1alpha1 API version by @sivanantha321 in #3118
Fixed torchserve e2e test. by @andyi2it in #3106
Fix: error response handling for splitter and switch nodes by @rachitchauhan43 in #3116
Fix validation for custom storageUri by @greenmoon55 in #3134
Bumping version for 0.11.1 by @rachitchauhan43 in #3141

New Contributors

@alekseyolg made their first contribution in #3070
@israel-hdez made their first contribution in #3072
@jvujjini made their first contribution in #2963

Full Changelog: v0.11.0...v0.11.1

Contributors

greenmoon55, jvujjini, and 6 other contributors

07 Aug 01:19

yuzisun

v0.11.0

1529b71

v0.11.0

🌈 What's New?

Core Inference & Serving Runtimes

Feature enable ingress for path based routing by @kandrio in #2357
Allow multiple containers in ServingRuntime by @markwinter in #2321
Add disable ingress configuration for raw deployment by @andyi2it in #2773
Add support for collocation of transformer and predictor by @sivanantha321 in #2873
Support setting labels and annotations on the component level by @lizzzcai in #2925
Add RoutesReady and LatestDeploymentReady status conditions by @tessapham in #3008
Triton FasterTransformer LLM by @cmaddalozzo in #2836
Implement v2/open inference endpoints for kserve python runtimes by @Suresh-Nakkeran in #2655
Support mixed input type for kserve python runtimes by @Suresh-Nakkeran in #2789
Upgrade mlserver version to 1.3.2 by @sivanantha321 in #2910
TorchServe 0.8.0 for LLM support by @sivanantha321 in #2980
Bump triton server version to 23.05-py3 by @sivanantha321 in #2992

Advanced Inference

Allowing setting minReplicas for Inference Graph router. by @rachitchauhan43 in #2679
Adding pod affinity and resource requirements to IG Spec by @rachitchauhan43 in #2711
add json mimetype header for IG response by @krazik-intuit in #2877

Storage Provider

Storage Initializer: Support virtual path style in S3 by @lizzzcai in #2887
Support Direct VolumeMount for PVC by @lizzzcai in #2738

KServe Python SDK

kserve 0.11.0 now uses Poetry for dependency management. Cloud storage dependencies are now optional; run pip install kserve[storage] to install them.
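Because the storage dependencies are now optional, code that downloads models may want to check for them before use. The sketch below is one way to do that with the standard library only. The module names in ASSUMED_STORAGE_MODULES are illustrative assumptions, not the authoritative contents of the storage extra, and missing_modules is a hypothetical helper.

```python
import importlib.util

# Assumption: example module names only. The real contents of the
# "storage" extra are defined in kserve's own packaging metadata.
ASSUMED_STORAGE_MODULES = ("boto3", "google.cloud.storage", "azure.storage.blob")


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example "google") is absent
            # or when a module has no usable import spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_STORAGE_MODULES)
    if missing:
        print("Missing optional modules:", ", ".join(missing))
        print("Try: pip install 'kserve[storage]'")
    else:
        print("All assumed storage modules are importable.")
```

The check only locates modules without importing the leaf module itself, so it stays cheap even when the cloud SDKs are installed.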

Make storage dependency as an optional dependency by @andyi2it in #2700
Support parameters in InferInput and InferOutput by @Suresh-Nakkeran in #2699
Dependency resolver using poetry by @andyi2it in #2602
Allow to override the UvicornServer's default log config by @elukey in #2782
Move cloud event decode logic from preprocess to decode method by @sivanantha321 in #2881
Allow using SSL between transformer and predictor/explainer by @greenmoon55 in #2911
Make postprocess interface consistent with V2 protocol by @cmaddalozzo in #2876
Add a KServe module level logger by @cmaddalozzo in #2884
Log exception stack for python runtimes by @yuzisun in #2939
Checked model readiness before attempting inference. by @andyi2it in #2917
Load Kubeconfig from python dict by @ShreehariVaasishta in #2924
Added response id based on request id. by @andyi2it in #3020

⚠️ What's Changed

Remove "default" suffix from generated component name - updated by @Suresh-Nakkeran in #2508
Remove(AIX explainer): Remove AIX explainer API and SDK by @Tomcli in #2826
Fix raw deployment status.address.url displays wrong url by @sivanantha321 in #2830
Make status.address.url consistent across installations by @sivanantha321 in #2875
Added support to accept the request body/payload in any format (not just json) by @SatishBethi in #2524
⚠️ You must now set the Content-Type header to application/json for the server to recognize and decode a JSON payload.
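To make the header requirement concrete, here is a minimal client-side sketch that builds a V1 predict request with an explicit JSON content type. The base URL, the model name, and the build_predict_request helper are placeholders for illustration; the request is constructed but not sent.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances):
    """Build (but do not send) a V1 predict request with a JSON body."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/models/{model_name}:predict",
        data=body,
        # Without this header the payload is no longer assumed to be JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and model name, used only for this example.
req = build_predict_request("http://localhost:8080", "sklearn-iris", [[6.8, 2.8, 4.8, 1.4]])
print(req.get_method(), req.full_url)
# urllib stores header names capitalized as "Content-type".
print(req.get_header("Content-type"))
```

Passing the request to urllib.request.urlopen would send it; other HTTP clients need the same header set explicitly unless they add it automatically for JSON bodies.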

🐛 What's Fixed

Loosen protobuf and numpy dependency by @andyi2it in #2673
Fix dependency issue and remove pinned pip version in xgbserver by @dependabot in #2689
Fixing the dockerfile to run as non-root user by @chirag-orbittec in #2687
Add check for adding gpu tag suffix when image field is specified by @yuzisun in #2709
Use fork for multiprocessing mode by @yuzisun in #2718
Adjust order of types to default to float,int by @pascalwhoop in #2754
Fix trained model ready status by @andyi2it in #2774
Fix missing ingress config options in helm chart by @andyi2it in #2772
model_server.py: fix documentation for enable_latency_logging by @elukey in #2777
Handle file exist scenario for local storage by @sivanantha321 in #2794
Loosen tritonclient and azure storage blob dependencies by @sivanantha321 in #2815
Add RBAC for inferencegraphs/finalizers by @ReToCode in #2839
Do not create GRPCServer when grpc is not enabled by @greenmoon55 in #2878
Make Pod mutator idempotent to support fluid by @hclchimumu in #2896
Fix incorrect log message in transformer reconciliation by @leecs0503 in #2968
Reconcile when rollout duration is changed by @henrysecond1 in #2916
Fix status.address.url for networking layer other than istio by @sivanantha321 in #2908
Fix IngressReady condition by @tessapham in #2977
Use standard socket for single process by @yuzisun in #3000
Remove distutils from Python SDK by @xfu83 in #3010
Fix raw deployment service port by @sivanantha321 in #2967
Allow passthrough of InferRequest id by @markgeejw in #2945
Fix: if there is a symbol '?' in the path, force url. query as the mi… by @Wercurial in #3014

⬆️ Version Upgrade

Bump torch from 1.13.0 to 1.13.1 in /python/aixexplainer by @dependabot in #2802
Add support for k8s 1.26 by @sivanantha321 in #2835
Go 1.20 upgrade by @sivanantha321 in #2914
Bump the Go version to 1.20 for the builder image by @skonto in #2860
Update the OpenShift guide to version 4.12 and use OpenShift Serverless by @ReToCode in #2855
Added support for python 3.10 and removed python 3.7 references by @andyi2it in #2832
Poetry plugin should update pyproject.toml by @andyi2it in #2899
Python 3.11 support by @andyi2it in #2933
Add python 3.11 support for alibi explainer by @sivanantha321 in #3006

🔨 Project SDLC

Fix running out of disk space in e2e by @andyi2it in #2765
Updating Knative Serving and Istio to their latest version by @matzew in #2697
Fix formatting and controller tests by @yuzisun in #2783
Parametrized docker builds by @peterableda in #2666
Fix minimum k8s version in quick install by @sivanantha321 in #2791
Make deployment scheduling behavior configurable by @ddelange in #2627
Upgrade kustomization.yaml to support kustomize 5.0 by @sivanantha321 in #2841
Extract modelmesh part in helm chart by @hhk7734 in #2704
chown python virtual env path to runtime user by @ReToCode in #2845
Use buildkit for building docker images by @sivanantha321 in #2848
Implement dynamic versioning using poetry plugin by @andyi2it in #2869
Break down the Serving and net-istio artifact downloads into their own release trains by @matzew in #2890
Add poetry lockfile consistency check to CI environment by @sivanantha321 in #2905
Upgrade controller gen version to 0.12.0 by @sivanantha321 in #2912
Fix kourier e2e test by @sivanantha321 in #2991
chore: Free up disk space on E2E test GHA runner node by @ckadner in #2972
Fix e2e workflow syntax typo that broke e2e tests by @andyi2it in #2998
Enable Triton e2e test by @sivanantha321 in #3004
Added manual workflow trigger for release branch. by @andyi2it in https://gith...

Contributors

matzew, aliok, and 38 other contributors

10 Jul 17:01

yuzisun

v0.11.0-rc1

165b475

v0.11.0-rc1 Pre-release

What's Changed

Replace unmaintained satori/go.uuid package by @sivanantha321 in #2932
Log exception stack for python runtimes by @yuzisun in #2939
Upgrade controller gen version to 0.12.0 by @sivanantha321 in #2912
Checked model readiness before attempting inference. by @andyi2it in #2917
moviesentiment storageUri updated. by @andyi2it in #2952
Python 3.11 support by @andyi2it in #2933
Update alibi explainer storage uri by @sivanantha321 in #2966
Updated storageUri for paddle example and e2e test. by @andyi2it in #2956
Decode avro cloud event by @sivanantha321 in #2929
Fix incorrect log message in transformer reconciliation by @leecs0503 in #2968
Reconcile when rollout duration is changed by @henrysecond1 in #2916
Support setting labels and annotations on the component level by @lizzzcai in #2925
Load Kubeconfig from python dict by @ShreehariVaasishta in #2924
Fix status.address.url for networking layer other than istio by @sivanantha321 in #2908
Fix kourier e2e test by @sivanantha321 in #2991
fix IngressReady condition by @tessapham in #2977
chore: Free up disk space on E2E test GHA runner node by @ckadner in #2972
Fix e2e workflow syntax typo that broke e2e tests by @andyi2it in #2998
Use standard socket for single process by @yuzisun in #3000
Bump torchserve version to 0.8.0 by @sivanantha321 in #2980
Fix params while loading kube config from dict by @ShreehariVaasishta in #3005
Add python 3.11 support for alibi explainer by @sivanantha321 in #3006
Enable Triton e2e test by @sivanantha321 in #3004
Bump triton server version to 23.05-py3 by @sivanantha321 in #2992
Update ModelMesh version to v0.11.0-rc0 by @rafvasq in #2969
Fix a typo docs/samples/metrics-and-monitoring/README.md by @fj-ochiai in #3017
Remove distutils from Python SDK by @xfu83 in #3010
Fix raw deployment service port by @sivanantha321 in #2967
Allow passthrough of InferRequest id by @markgeejw in #2945
Add configmap docs by @sivanantha321 in #3016
add ServiceReady status conditions by @tessapham in #3008
Updated onnx-example. by @andyi2it in #3001
Fixes for Integrating KServe with Openshift by @skonto in #2853
Added manual workflow trigger for release branch. by @andyi2it in #3002
fix: if there is a symbol '?' in the path, force url. query as the mi… by @Wercurial in #3014
Added response id based on request id. by @andyi2it in #3020
put back PredictorReady in living condition set by @tessapham in #3026
add documentation about storage-initializer userid for istio-cni by @ReToCode in #3028
publish RC1 version for v0.11.0 by @tessapham in #3027

New Contributors

@leecs0503 made their first contribution in #2968
@henrysecond1 made their first contribution in #2916
@ShreehariVaasishta made their first contribution in #2924
@tessapham made their first contribution in #2977
@fj-ochiai made their first contribution in #3017
@xfu83 made their first contribution in #3010
@markgeejw made their first contribution in #2945
@Wercurial made their first contribution in #3014

Full Changelog: v0.11.0-rc0...v0.11.0-rc1

Contributors

yuzisun, skonto, and 14 other contributors
