Support overriding model mount path in model server container #3606

Open

cmaddalozzo opened this issue Apr 15, 2024 · 3 comments

@cmaddalozzo (Contributor)
/kind feature

Describe the solution you'd like
Currently it is not possible to specify the path at which the downloaded model is made available inside the model server container; the downloaded model is always mounted at /mnt/models.

We should allow the user to specify the path inside the model server container at which the volume should be mounted, e.g.:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
    ds.bloomberg.com/gpuSubtype: bloomberg.com/gpu-l4
    identities.bloomberg.com/bcs: platform-bcosv2
  name: huggingface-t5-hfs
  namespace: s-dsplatform
spec:
  predictor:
    containers:
    - args:
      - --model
      - llama-2-70b
      - --num_gpus=4
      command:
      - nemollm_inference_ms
      env:
      - name: STORAGE_URI
        value: s3://models/llama-7b
      - name: STORAGE_MOUNT_PATH
        value: /model-store
      image: nvcr.io/ohlfw0olaadg/ea-participants/nim_llm:24.02

Since serving runtimes may require models to be mounted at a specific location, we should also make this configurable at the runtime level, e.g.:

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nim
spec:
  containers:
  - command:
    - nemollm_inference_ms
    image: nvcr.io/ohlfw0olaadg/ea-participants/nim_llm:24.02
    name: kserve-container
    ports:
    - containerPort: 9999
      protocol: TCP
  modelMountPath: /model-store
  supportedModelFormats:
  - autoSelect: true
    name: nim
@spolti (Contributor) commented Apr 18, 2024

@terrytangyuan didn't you change something related to it recently?

@terrytangyuan (Member)

My change was specific to HF: #3576

@DanielTemesgen

This would be a good feature to have. As a workaround, you can set up a symlink to /mnt/models for containers that only read artefacts from one location, but it's not ideal.

For example, the bitnami tensorflow serving image only reads from /bitnami/model-data.
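
A minimal sketch of that workaround, assuming a custom predictor container whose entrypoint can be overridden; the image tag, model name, bucket, and serving flags shown here are illustrative, not taken from this issue:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tf-serving-symlink-workaround
spec:
  predictor:
    containers:
    - name: kserve-container
      image: bitnami/tensorflow-serving:latest
      # Replace the image's fixed read path with a symlink to KServe's
      # /mnt/models mount before starting the server. Removing the existing
      # directory first assumes nothing else in the image depends on it.
      command: ["/bin/sh", "-c"]
      args:
      - >-
        rm -rf /bitnami/model-data &&
        ln -s /mnt/models /bitnami/model-data &&
        exec tensorflow_model_server
        --rest_api_port=8080
        --model_name=my-model
        --model_base_path=/bitnami/model-data/my-model
      env:
      # STORAGE_URI triggers KServe's storage initializer, which still
      # downloads the model to the hard-coded /mnt/models path.
      - name: STORAGE_URI
        value: s3://example-bucket/my-model

The symlink is recreated on every container start, so the server keeps reading from its fixed path while the storage initializer continues to populate /mnt/models.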
