Support overriding model mount path in model server container #3606

Open

cmaddalozzo opened this issue Apr 15, 2024 · 3 comments

@cmaddalozzo (Contributor)
/kind feature

Describe the solution you'd like
Currently it is not possible to specify the path at which the downloaded model is made available inside the model server container; the downloaded model is always mounted at /mnt/models.

We should allow the user to specify the path inside the model server container at which the volume should be mounted, e.g.:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
    ds.bloomberg.com/gpuSubtype: bloomberg.com/gpu-l4
    identities.bloomberg.com/bcs: platform-bcosv2
  name: huggingface-t5-hfs
  namespace: s-dsplatform
spec:
  predictor:
    containers:
    - args:
      - --model
      - llama-2-70b
      - --num_gpus=4
      command:
      - nemollm_inference_ms
      env:
      - name: STORAGE_URI
        value: s3://models/llama-7b
      - name: STORAGE_MOUNT_PATH
        value: /model-store
      image: nvcr.io/ohlfw0olaadg/ea-participants/nim_llm:24.02

Since serving runtimes may require models to be mounted at a specific location, we should also make this configurable at the runtime level, e.g.:

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nim
spec:
  containers:
  - command:
    - nemollm_inference_ms
    image: nvcr.io/ohlfw0olaadg/ea-participants/nim_llm:24.02
    name: kserve-container
    ports:
    - containerPort: 9999
      protocol: TCP
  modelMountPath: /model-store
  supportedModelFormats:
  - autoSelect: true
    name: nim
@spolti (Contributor) commented Apr 18, 2024

@terrytangyuan didn't you change something related to it recently?

@terrytangyuan (Member)

My change was specific to HF: #3576

@DanielTemesgen

This would be a good feature to have. As a workaround, you can set up a symlink to /mnt/models for containers that only read artefacts from one location, but it's not ideal.

For example, the bitnami tensorflow serving image only reads from /bitnami/model-data.
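
A minimal sketch of that workaround, assuming a custom predictor container whose entrypoint can be overridden; the image tag, model name, bucket, and serving flags shown here are illustrative, not taken from this issue:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tf-serving-symlink-workaround
spec:
  predictor:
    containers:
    - name: kserve-container
      image: bitnami/tensorflow-serving:latest
      # Replace the image's fixed read path with a symlink to KServe's
      # /mnt/models mount before starting the server. Removing the existing
      # directory first assumes nothing else in the image depends on it.
      command: ["/bin/sh", "-c"]
      args:
      - >-
        rm -rf /bitnami/model-data &&
        ln -s /mnt/models /bitnami/model-data &&
        exec tensorflow_model_server
        --rest_api_port=8080
        --model_name=my-model
        --model_base_path=/bitnami/model-data/my-model
      env:
      # STORAGE_URI triggers KServe's storage initializer, which still
      # downloads the model to the hard-coded /mnt/models path.
      - name: STORAGE_URI
        value: s3://example-bucket/my-model

The symlink is recreated on every container start, so the server keeps reading from its fixed path while the storage initializer continues to populate /mnt/models.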
