@NaomiEisen
Contributor

Current problem:

The standup.sh and teardown.sh scripts cannot be executed successfully by users who do not have cluster-level privileges, even if they have admin permissions in the namespace where they are trying to install the llm-d stack.


Changes made:

Added a --non-admin run option to the standup.sh and teardown.sh scripts.
This option lets users without cluster-level admin privileges deploy the llm-d stack into a specific, pre-created namespace.

Detailed explanation of the code changes:

  • setup/standup.sh: Added the -i|--non-admin flag. When the flag is set, several default variables are overridden so that the script does not modify cluster-level resources (the blocking commands).
  • setup/env.sh: Added a variable for the non-admin flag. Changed the default model to the llm-d well-lit-paths default model (I believe the previous default model was a bug; it didn't run successfully).
  • setup/steps/07_deploy_setup.py: When the --non-admin flag is set to true, the helmfile yaml now sets the createNamespace variable to false (as non-cluster-admin users are unable to create namespaces).
  • setup/teardown.sh: Added the -i|--non-admin flag. When this flag is set, the delete ClusterRole step is skipped.
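
To illustrate the behaviour described in the list above, here is a minimal sketch of how such a flag could gate cluster-scoped work. It is not the code from this PR: `NON_ADMIN`, `CONFIG_FILE`, the function names, and the step names are hypothetical and chosen only for illustration. Only the `-c` and `-i|--non-admin` options and the `createNamespace` setting come from the description.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: NON_ADMIN, CONFIG_FILE and the step names are hypothetical.

# Parse the options mentioned in the PR description: -c <file> and -i|--non-admin.
parse_args() {
  NON_ADMIN=0
  CONFIG_FILE=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -i|--non-admin)
        NON_ADMIN=1
        ;;
      -c)
        [ "$#" -ge 2 ] || { echo "-c needs a path" >&2; return 1; }
        CONFIG_FILE="$2"
        shift
        ;;
      *)
        echo "unknown option: $1" >&2
        return 1
        ;;
    esac
    shift
  done
}

# Value a generated helmfile could use for createNamespace:
# non-admin users cannot create namespaces, so it becomes false.
create_namespace_value() {
  if [ "$NON_ADMIN" -eq 1 ]; then echo "false"; else echo "true"; fi
}

# Teardown steps: deleting the ClusterRole is cluster-scoped, so it is skipped for non-admins.
teardown_steps() {
  echo "delete-namespaced-resources"
  if [ "$NON_ADMIN" -eq 0 ]; then
    echo "delete-clusterrole"
  fi
}
```

With `parse_args -c ./env.sh --non-admin`, `create_namespace_value` prints `false` and `teardown_steps` lists only the namespaced step.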

How the PR was tested:

  1. Final commit was made using the pre-commit hook.
  2. Manual verification - running:
    ./setup/standup.sh -c "$(pwd)/setup/example_env.sh" --non-admin
    where example_env.sh contained:
export LLMDBENCH_VLLM_COMMON_NAMESPACE="naomi"
export LLMDBENCH_HARNESS_NAMESPACE="naomi"

# HuggingFace token
export LLMDBENCH_HF_TOKEN="<token>"

# The default PVC in our cluster was stuck on Pending
export LLMDBENCH_VLLM_COMMON_PVC_STORAGE_CLASS="ocs-storagecluster-cephfs"

# The default chart is oci helm - looks like it is a known issue
export LLMDBENCH_VLLM_GAIE_CHART_NAME="oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool"

The result:

oc get all
NAME                                                      READY   STATUS      RESTARTS   AGE
pod/access-to-harness-data-workload-pvc                   1/1     Running     0          14m
pod/download-model-qk227                                  0/1     Completed   0          14m
pod/infra-llmdbench-inference-gateway-586f4b6664-sk5ld    1/1     Running     0          13m
pod/qwen-qwe-1b61a417-en3-0-6b-decode-55ff6ccf56-qnfxh    2/2     Running     0          13m
pod/qwen-qwe-1b61a417-en3-0-6b-gaie-epp-69f88d7c4-zm92z   1/1     Running     0          13m
pod/qwen-qwe-1b61a417-en3-0-6b-prefill-5b48f7fbc9-k9pgf   1/1     Running     0          13m

NAME                                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/infra-llmdbench-inference-gateway             NodePort    172.30.204.154   <none>        80:30186/TCP                 14m
service/llm-d-benchmark-harness                       ClusterIP   172.30.173.38    <none>        20873/TCP                    10d
service/qwen-qwe-1b61a417-en3-0-6b-gaie-epp           ClusterIP   172.30.188.9     <none>        9002/TCP,9090/TCP,5557/TCP   13m
service/qwen-qwe-1b61a417-en3-0-6b-gaie-ip-89c32a87   ClusterIP   None             <none>        54321/TCP                    13m

NAME                                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/infra-llmdbench-inference-gateway     1/1     1            1           14m
deployment.apps/qwen-qwe-1b61a417-en3-0-6b-decode     1/1     1            1           13m
deployment.apps/qwen-qwe-1b61a417-en3-0-6b-gaie-epp   1/1     1            1           13m
deployment.apps/qwen-qwe-1b61a417-en3-0-6b-prefill    1/1     1            1           13m

NAME                                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/infra-llmdbench-inference-gateway-586f4b6664    1         1         1       14m
replicaset.apps/qwen-qwe-1b61a417-en3-0-6b-decode-55ff6ccf56    1         1         1       13m
replicaset.apps/qwen-qwe-1b61a417-en3-0-6b-gaie-epp-69f88d7c4   1         1         1       13m
replicaset.apps/qwen-qwe-1b61a417-en3-0-6b-prefill-5b48f7fbc9   1         1         1       13m

NAME                       STATUS     COMPLETIONS   DURATION   AGE
job.batch/download-model   Complete   1/1           11s        14m

NAME                                                         HOST/PORT                                                                     PATH   SERVICES                            PORT      TERMINATION   WILDCARD
route.route.openshift.io/llmdbench-inference-gateway-route   llmdbench-inference-gateway-route-naomi.apps.fmaas-vllm-d.fmaas.res.ibm.com          infra-llmdbench-inference-gateway   default                 None

Finally, running the teardown:

./setup/teardown.sh -c "$(pwd)/setup/example_env.sh" --non-admin

@maugustosilva
Collaborator

@NaomiEisen This is a really good and welcome addition. Good to see that modelservice is now working for non-admin users. However, I need to understand why the already existing automatic detection of privileges in env.sh (encoded in the variable LLMDBENCH_USER_IS_ADMIN) was not enough/not working.

@NaomiEisen
Contributor Author

You're absolutely right: the automatic detection is working, and I switched to using it. I kept only the modification that avoids the blocking commands when the user is detected as a non-admin :) Thanks!
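
For readers unfamiliar with this kind of check, one common way to detect cluster-level privileges is `kubectl auth can-i` (also available as `oc auth can-i`), which exits with status 0 when the action is allowed. The sketch below is an assumption about how such detection might look, not the actual logic in env.sh, which this thread does not show. The function name `detect_admin` and the 1/0 encoding for LLMDBENCH_USER_IS_ADMIN are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: detect_admin and the 1/0 encoding are assumptions,
# not the detection code that env.sh actually uses.

# Print 1 if the current user may create namespaces (a cluster-scoped action), else 0.
# The first argument names the CLI to call (kubectl by default; oc also works).
detect_admin() {
  local kcmd="${1:-kubectl}"
  if "$kcmd" auth can-i create namespaces >/dev/null 2>&1; then
    echo 1
  else
    echo 0
  fi
}

# Example usage (needs access to a cluster):
#   export LLMDBENCH_USER_IS_ADMIN="$(detect_admin oc)"
```

Passing the CLI name as an argument keeps the sketch usable with either `kubectl` or `oc`, and makes it easy to try against a stub command.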

@maugustosilva merged commit 87e54bb into llm-d:main on Dec 3, 2025
8 checks passed