plano/docs/source/guides/observability
Syed A. Hashmi 5a652eb666
docs: align signals page with paper taxonomy (#910)
* docs: align signals page with paper taxonomy

Updates docs/source/concepts/signals.rst and the tracing guide's signals
subsection to reflect the three-layer taxonomy shipped in #903:

- Introduces the paper reference (arXiv:2604.00356) and the three layers
  (interaction, execution, environment) with all 20 leaf signal types in
  three reference tables
- Documents the new layered OTel attribute set
  (signals.interaction.*, signals.execution.*, signals.environment.*)
  and marks the legacy aggregate keys (signals.follow_up.repair.*,
  signals.frustration.*, signals.repetition.count,
  signals.escalation.requested, signals.positive_feedback.count) as
  deprecated-but-still-emitted
- Adds a Span Events section describing the per-instance signal.<type>
  events with confidence / snippet / metadata attributes
- Fixes the flag marker reference ([!] in the code vs 🚩 in the old docs)
- Updates all example attributes, dashboard queries, and alert rules to
  use the layered keys
- Updates the tracing guide's behavioral-signals subsection to match
- Notes that the triage sampler is a planned follow-up; today, sampling
  is done consumer-side via observability-platform filters

Build verified locally: sphinx-build produces no warnings on these files.

Made-with: Cursor

* docs: reframe signals intro around the improvement flywheel

Addresses review feedback on #910:

- Replace the triage-only framing at the top with an instrument -> sample
  & triage -> construct data -> optimize -> deploy flywheel that explains
  why signals matter, not just what they surface. The paper's 82% / 1.52x
  figures move into step 2 of the flywheel, where they belong.
- Remove the 'Signals vs Response Quality' section. Per review, signals
  and response quality overlap rather than complement each other, so the
  comparison is misleading.
- Borrow the per-category summaries and leaf-type descriptions verbatim
  from the katanemo/signals reference implementation (module docstrings)
  so the documentation and the detector contract stay in sync. Drops the
  hand-crafted examples that were not strictly accurate (e.g. 'semantic
  overlap is high' for rephrase, 'user explicitly corrects the agent'
  for correction).

Made-with: Cursor

* docs: address signals flywheel review feedback

Addresses review comments on #910:

- Shorten the paper citation to (Chen et al., 2026), following common
  citation practice, in place of the full author-list form.
- Replace the Why Signals Matter section with the review-suggested
  rewrite verbatim: a more formal intro framing, steps renumbered to
  Instrument / Sample & triage / Data Construction / Model Optimization
  / Deploy, 'routing decisions' removed from the data-construction
  step, and DPO/RLHF/SFT added as model-optimization examples.
- Render tau and O(messages) as proper math glyphs via the Sphinx
  built-in :math: role (enabled by adding sphinx.ext.mathjax to
  conf.py). The RST role form is used rather than raw $...$ inline so
  that Sphinx injects MathJax only on pages that actually have math,
  instead of loading ~1MB of JS on every page.

Build verified locally: sphinx-build produces no warnings on the
changed files and the rendered HTML wraps tau and O(messages) in
MathJax-ready <span class="math">\(\tau\)</span> containers.

Made-with: Cursor
2026-04-24 12:31:14 -07:00
access_logging.rst Rename all arch references to plano (#745) 2026-02-13 15:16:56 -08:00
monitoring.rst Add Prometheus metrics endpoint and Grafana dashboard for brightstaff (#904) 2026-04-22 11:19:10 -07:00
observability.rst Doc Update (#129) 2024-10-06 16:54:34 -07:00
tracing.rst docs: align signals page with paper taxonomy (#910) 2026-04-24 12:31:14 -07:00