2026-04-05 22:44:45 -05:00
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: "trustgraph-ts"

scrape_configs:
  # Prometheus self-monitoring
  - job_name: "prometheus"
    scrape_interval: 15s
    static_configs:
      - targets:
          - "prometheus:9090"
fix: comprehensive QA — resolve 13 bugs, add UX improvements across all services
Client SDK: add .catch() to graphRagStreaming/documentRagStreaming (silent timeout),
null-guard JSON.parse in getPrompts/getSystemPrompt/getPrompt.
Backend: implement "getvalues" config operation for token costs, null-check
createTerm() in FalkorDB triples query, add knowledge-cores service entrypoint
and Docker entry, return proper HTTP 400/404 for gateway error responses.
Workbench: cancel button + elapsed timer for chat, clear agent spinner on error,
flow dialog inline validation, responsive header wrapping, knowledge cores
loading timeout, sidebar/page naming consistency, theme toggle indicator.
Infrastructure: enable Grafana Explore for viewers, add gateway Prometheus
scrape target, fix RAG pipeline dashboard layout (6 panels visible),
filter Service Health to configured targets only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 05:20:10 -05:00
  # NATS monitoring (no scrape job yet; requires prometheus-nats-exporter)
  # NATS exposes JSON at /varz, not the Prometheus exposition format.
  # To get proper Prometheus metrics, deploy a prometheus-nats-exporter sidecar.
  # For now, we rely on the NATS healthcheck and JetStream monitoring via /jsz.
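  # A minimal sketch of the exporter-based setup described above, left commented
  # out because no exporter is deployed yet. Assumptions (not part of this repo):
  # a prometheus-nats-exporter sidecar reachable under the hypothetical service
  # name "nats-exporter", listening on port 7777 (assumed to be its default)
  # and pointed at the NATS monitoring endpoint.
  # - job_name: "nats"
  #   scrape_interval: 15s
  #   static_configs:
  #     - targets:
  #         - "nats-exporter:7777"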

  # OpenTelemetry Collector (exposes Prometheus metrics from OTLP pipeline)
  - job_name: "otel-collector"
    scrape_interval: 15s
    static_configs:
      - targets:
          - "otel-collector:8889"

  # TrustGraph gateway metrics (prom-client)
  - job_name: "gateway"
    scrape_interval: 15s
    metrics_path: "/api/v1/metrics"
    static_configs:
      - targets:
          - "gateway:8088"