trustgraph/trustgraph_configurator/resources/dialog/docs/model/vllm-compose.md

vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs, built on PagedAttention and continuous batching. Running vLLM yourself lets TrustGraph pipelines operate without relying on any external APIs, so no data leaves the host environment or network.

The vLLM service must already be running, with the required model loaded using vllm serve. Provide the vLLM service URL in the VLLM_BASE_URL environment variable:

VLLM_BASE_URL=http://vllm-host:8000/v1

Replace the URL with the address of your own vLLM service, keeping the /v1 suffix.
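
A missing /v1 suffix is an easy mistake to make, so it can help to check the variable before starting a pipeline. The following is a minimal sketch, not part of TrustGraph: the helper names (normalize_base_url, base_url_from_env, list_served_models) are hypothetical, and it assumes the service exposes the OpenAI-compatible model listing at /v1/models, as vLLM's OpenAI-compatible server does.

```python
import json
import os
import urllib.request
from urllib.parse import urlparse


def normalize_base_url(raw: str) -> str:
    """Strip whitespace and a trailing slash; reject URLs without the /v1 suffix."""
    url = raw.strip().rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {raw!r}")
    if not parsed.path.endswith("/v1"):
        raise ValueError(f"URL should end with /v1: {raw!r}")
    return url


def base_url_from_env() -> str:
    """Read VLLM_BASE_URL from the environment and validate it (hypothetical helper)."""
    return normalize_base_url(os.environ.get("VLLM_BASE_URL", ""))


def list_served_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query <base_url>/models and return the IDs of the models being served.

    Assumes an OpenAI-style response: {"data": [{"id": "..."}, ...]}.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]
```

Calling list_served_models(base_url_from_env()) against a running service should return the name of the model that was passed to vllm serve; a connection error or an empty list suggests the service is not up or the URL is wrong.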