diff --git a/demos/llm_routing/model_routing_service/README.md b/demos/llm_routing/model_routing_service/README.md
index 676de9e1..72b672f3 100644
--- a/demos/llm_routing/model_routing_service/README.md
+++ b/demos/llm_routing/model_routing_service/README.md
@@ -107,22 +107,23 @@ The response tells you which model would handle this request and which route was
 
 To run Arch-Router in-cluster using vLLM instead of the default hosted endpoint:
 
-**1. Update `vllm-deployment.yaml`** — set `nodeSelector` to match your GPU node's labels:
-
-```yaml
-# Examples:
-# GKE: cloud.google.com/gke-accelerator: nvidia-l4
-# EKS: eks.amazonaws.com/nodegroup: gpu-nodes
-# AKS: kubernetes.azure.com/agentpool: gpupool
-nodeSelector:
-  node.kubernetes.io/instance-type: gpu-node
-```
-
-**2. Deploy Arch-Router and Plano:**
+**0. Check your GPU node labels and taints**
 
 ```bash
+kubectl get nodes --show-labels | grep -i gpu
+kubectl get node -o jsonpath='{.spec.taints}'
+```
+
+GPU nodes commonly have a `nvidia.com/gpu:NoSchedule` taint — `vllm-deployment.yaml` includes a matching toleration. If you have multiple GPU node pools and need to pin to a specific one, uncomment and set the `nodeSelector` in `vllm-deployment.yaml` using the label for your cloud provider.
+
+**1. Deploy Arch-Router and Plano:**
+
+```bash
+
+# arch-router deployment
 kubectl apply -f vllm-deployment.yaml
+# plano deployment
 kubectl create secret generic plano-secrets \
   --from-literal=OPENAI_API_KEY=$OPENAI_API_KEY \
   --from-literal=ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY
diff --git a/demos/llm_routing/model_routing_service/vllm-deployment.yaml b/demos/llm_routing/model_routing_service/vllm-deployment.yaml
index a2f40cf2..1debe15e 100644
--- a/demos/llm_routing/model_routing_service/vllm-deployment.yaml
+++ b/demos/llm_routing/model_routing_service/vllm-deployment.yaml
@@ -18,13 +18,13 @@ spec:
       - key: nvidia.com/gpu
         operator: Exists
         effect: NoSchedule
-      nodeSelector:
-        # Replace with the label that identifies GPU nodes in your cluster
-        # Examples:
-        # GKE: cloud.google.com/gke-accelerator: nvidia-l4
-        # EKS: eks.amazonaws.com/nodegroup: gpu-nodes
-        # AKS: kubernetes.azure.com/agentpool: gpupool
-        node.kubernetes.io/instance-type: gpu-node
+      # Optional: add a nodeSelector to pin to a specific GPU node pool.
+      # The nvidia.com/gpu resource request below is sufficient for most clusters.
+      # nodeSelector:
+      #   DigitalOcean: doks.digitalocean.com/gpu-model: l40s
+      #   GKE: cloud.google.com/gke-accelerator: nvidia-l4
+      #   EKS: eks.amazonaws.com/nodegroup: gpu-nodes
+      #   AKS: kubernetes.azure.com/agentpool: gpupool
       initContainers:
       - name: download-model
         image: python:3.11-slim
diff --git a/docs/source/guides/llm_router.rst b/docs/source/guides/llm_router.rst
index 2fceb112..7c4ad685 100644
--- a/docs/source/guides/llm_router.rst
+++ b/docs/source/guides/llm_router.rst
@@ -362,10 +362,9 @@ The ``demos/llm_routing/model_routing_service/`` directory includes ready-to-use
 Key things to know before deploying:
 
 - GPU nodes commonly have a ``nvidia.com/gpu:NoSchedule`` taint — the ``vllm-deployment.yaml``
-  includes a matching toleration. Update the ``nodeSelector`` to match your cluster's GPU node
-  labels (GKE, EKS, AKS each use different label keys).
-- The ``nvidia.com/gpu: "1"`` resource request alone is sufficient for scheduling, but a
-  ``nodeSelector`` is recommended when you have mixed node pools.
+  includes a matching toleration. The ``nvidia.com/gpu: "1"`` resource request is sufficient
+  for scheduling in most clusters; a ``nodeSelector`` is optional and commented out in the
+  manifest for cases where you need to pin to a specific GPU node pool.
 - Model download takes ~1 minute; vLLM loads the model in ~1-2 minutes after that. The
   ``livenessProbe`` has a 180-second ``initialDelaySeconds`` to avoid premature restarts.
 - The Plano config ConfigMap must use ``--from-file=plano_config.yaml=config_k8s.yaml`` with
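The new step 0 in the README checks labels and taints before anything is scheduled; once `vllm-deployment.yaml` is applied, it can also help to confirm where the pod actually landed and that the GPU was granted. A minimal sketch, assuming the Deployment carries an `app=arch-router` label (substitute whatever selector the manifest actually defines):

```bash
# Sketch: confirm the vLLM pod scheduled onto a GPU node and was allocated the GPU.
# The app=arch-router label is an assumption; use the labels from vllm-deployment.yaml.
kubectl get pods -l app=arch-router -o wide                      # NODE column shows placement
NODE=$(kubectl get pods -l app=arch-router -o jsonpath='{.items[0].spec.nodeName}')
kubectl describe node "$NODE" | grep -A 8 "Allocated resources"  # nvidia.com/gpu should be non-zero
```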
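Because model download and load take a few minutes (hence the 180-second `initialDelaySeconds` called out in the guide), waiting for the rollout and then querying vLLM's OpenAI-compatible API is a reasonable smoke test. A sketch, assuming the Deployment is named `arch-router` and vLLM listens on its default port 8000:

```bash
# Sketch: wait for vLLM to come up, then hit its OpenAI-compatible endpoint.
# "arch-router" as the Deployment name and port 8000 are assumptions.
kubectl rollout status deployment/arch-router --timeout=10m
kubectl port-forward deployment/arch-router 8000:8000 &
sleep 2
curl -s http://localhost:8000/v1/models   # should list the locally served Arch-Router model
kill $!                                    # stop the port-forward when done
```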