Compare commits

...

3 commits

Author SHA1 Message Date
Syed Hashmi
041ba75034
signals: restore the pre-port flag marker emoji
#903 inadvertently replaced the legacy FLAG_MARKER (U+1F6A9, '🚩') with
'[!]', which broke any downstream dashboard / alert that searches span
names for the flag emoji. Restores the original marker and updates the
#910 docs pass to match.

- crates/brightstaff/src/signals/analyzer.rs: FLAG_MARKER back to
  "\u{1F6A9}" with a comment noting the backwards-compatibility
  reason so it doesn't drift again.
- docs/source/concepts/signals.rst and docs/source/guides/observability/
  tracing.rst: swap every '[!]' reference (subheading text, example
  span name, tip box, dashboard query hint) back to 🚩.

Verified: cargo test -p brightstaff --lib (162 passed, 1 ignored);
sphinx-build clean on both files; rendered HTML shows 🚩 in all
flag-marker references.

Made-with: Cursor
2026-04-24 13:36:51 -07:00
Syed Hashmi
cea43c5da5
docs: address signals flywheel review feedback
Addresses review comments on #910:

- Shorten the paper citation to (Chen et al., 2026) per common citation
  practice, replacing the full author-list form.
- Replace the Why Signals Matter section with the review-suggested
  rewrite verbatim: more formal intro framing, steps renumbered to
  Instrument / Sample & triage / Data Construction / Model Optimization
  / Deploy, 'routing decisions' removed from the data-construction
  step, and DPO/RLHF/SFT added as model-optimization examples.
- Render tau and O(messages) as proper math glyphs via the Sphinx
  built-in :math: role (enabled by adding sphinx.ext.mathjax to
  conf.py). Using the RST role rather than raw $...$ inline means
  Sphinx only injects MathJax on pages that actually contain math,
  instead of loading ~1MB of JS on every page.

Build verified locally: sphinx-build produces no warnings on the
changed files and the rendered HTML wraps tau and O(messages) in
MathJax-ready <span class="math">\(\tau\)</span> containers.

Made-with: Cursor
2026-04-24 12:05:48 -07:00
Syed Hashmi
ae629d3635
docs: reframe signals intro around the improvement flywheel
Addresses review feedback on #910:

- Replace the triage-only framing at the top with an instrument -> sample
  & triage -> construct data -> optimize -> deploy flywheel that explains
  why signals matter, not just what they surface. The paper's 82% /
  1.52× numbers move into step 2 of the flywheel, where they belong.
- Remove the 'Signals vs Response Quality' section. Per review, signals
  and response quality overlap rather than complement each other, so the
  comparison is misleading.
- Borrow the per-category summaries and leaf-type descriptions verbatim
  from the katanemo/signals reference implementation (module docstrings)
  so the documentation and the detector contract stay in sync. Drop the
  hand-crafted examples that were not strictly accurate (e.g. 'semantic
  overlap is high' for rephrase, 'user explicitly corrects the agent'
  for correction).

Made-with: Cursor
2026-04-24 10:59:37 -07:00
4 changed files with 191 additions and 153 deletions

View file

@@ -21,9 +21,10 @@ use super::schemas::{
use super::text_processing::NormalizedMessage;
/// Marker appended to the span operation name when concerning signals are
/// detected. Kept in sync with the previous implementation for backward
/// compatibility with downstream consumers.
pub const FLAG_MARKER: &str = "[!]";
/// detected. The 🚩 emoji (U+1F6A9) matches the pre-port implementation so
/// downstream consumers that search for flagged traces by span-name emoji
/// keep working.
pub const FLAG_MARKER: &str = "\u{1F6A9}";
/// ShareGPT-shaped row used as the canonical input to the analyzer's
/// detectors. `from` is one of `"human"`, `"gpt"`, `"function_call"`,

View file

@@ -4,39 +4,64 @@
Signals™
========
Agentic Signals are lightweight, model-free behavioral indicators computed from
live interaction trajectories and attached to your existing
OpenTelemetry traces. They make it possible to triage the small fraction of
trajectories that are most likely to be informative — brilliant successes or
**severe failures** — without running an LLM-as-judge on every session.
Agentic Signals are lightweight, model-free behavioral indicators computed
from live interaction trajectories and attached to your existing
OpenTelemetry traces. They are the instrumentation layer of a closed-loop
improvement flywheel for agents — turning raw production traffic into
prioritized data that can drive prompt, routing, and model updates without
running an LLM-as-judge on every session.
The framework implemented here follows the taxonomy and detector design in
*Signals: Trajectory Sampling and Triage for Agentic Interactions* (Chen,
Hafeez, Paracha, 2026; `arXiv:2604.00356
<https://arxiv.org/abs/2604.00356>`_). All detectors are computed without
model calls; the entire pipeline attaches structured attributes and span
events to existing spans so your dashboards and alerts work unmodified.
*Signals: Trajectory Sampling and Triage for Agentic Interactions*
(`Chen et al., 2026 <https://arxiv.org/abs/2604.00356>`_). All detectors
are computed without model calls; the entire pipeline attaches structured
attributes and span events to existing spans so your dashboards and alerts
work unmodified.
The Problem: Knowing What's "Good"
==================================
Why Signals Matter: The Improvement Flywheel
============================================
One of the hardest parts of building agents is measuring how well they
perform in the real world.
Agentic applications are increasingly deployed at scale, yet improving them
after deployment remains difficult. Production trajectories are long,
numerous, and non-deterministic, making exhaustive human review infeasible
and auxiliary LLM evaluation expensive. As a result, teams face a
bottleneck: they cannot score every response, inspect every trace, or
reliably identify which failures and successes should inform the next model
update. Without a low-cost triage layer, the feedback loop from production
behavior to model improvement remains incomplete.
**Offline testing** relies on hand-picked examples and happy-path scenarios,
missing the messy diversity of real usage. Developers manually prompt models,
evaluate responses, and tune prompts by guesswork — a slow, incomplete
feedback loop.
Signals close this loop by cheaply identifying which interactions among
millions are worth inspecting:
**Production debugging** floods developers with traces and logs but provides
little guidance on which interactions actually matter. Finding failures means
painstakingly reconstructing sessions and manually labeling quality issues.
1. **Instrument.** Live trajectories are scored with model-free signals
attached as structured attributes on existing OpenTelemetry spans,
organized under a fixed taxonomy of interaction, execution, and
environment signals. This requires no additional model calls,
infrastructure, or changes to online agent behavior.
2. **Sample & triage.** Signal attributes act as filters: they surface
severe failures, retrieve representative exemplars, and exclude the
uninformative middle. In our experiments, signal-based sampling
achieves 82% informativeness on :math:`\tau`-bench, compared with 54%
for random sampling, yielding a 1.52× efficiency gain per informative
trajectory.
3. **Data Construction.** The triaged subset becomes targeted input for
constructing preference datasets or supervised fine-tuning datasets
from production trajectories.
4. **Model Optimization.** The resulting preference or supervised
fine-tuning data is used to update the model through methods such as
DPO, RLHF, or supervised fine-tuning, so optimization is driven by
targeted production behavior rather than undifferentiated trace noise.
5. **Deploy.** The improved model is deployed and immediately
re-instrumented with the same signals, enabling teams to measure
whether the change improved production behavior and to feed the next
iteration.
You can't score every response with an LLM-as-judge (too expensive, too slow)
or manually review every trace (doesn't scale). What you need are
**behavioral signals** — fast, economical proxies that don't label quality
outright but dramatically shrink the search space, pointing to sessions most
likely to be broken or brilliant.
This loop depends on the first step being nearly free. The framework is
therefore designed around fixed-taxonomy, model-free detectors with
:math:`O(\text{messages})` cost, no online behavior change, and no
dependence on expensive evaluator models. By making production traces
searchable and sampleable at scale, signals turn raw agent telemetry into a
practical model-optimization flywheel.
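The 1.52× figure in step 2 follows directly from the two informativeness rates; a quick arithmetic check (values taken from the section above):

```rust
fn main() {
    // Informativeness rates reported for tau-bench in the cited paper.
    let signal_based = 0.82_f64;
    let random = 0.54_f64;

    // Efficiency gain per informative trajectory is the ratio of the two.
    let gain = signal_based / random;
    assert!((gain - 1.52).abs() < 0.01);
    println!("{gain:.2}x"); // 0.82 / 0.54 rounds to 1.52x
}
```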
What Are Behavioral Signals?
============================
@@ -61,150 +86,159 @@ agent performance. Embedded directly into traces, they make it easy to spot
friction as it happens: where users struggle, where agents loop, where tool
failures cluster, and where escalations occur.
Signals vs Response Quality
===========================
Behavioral signals and response quality are complementary.
**Response Quality**
Domain-specific correctness: did the agent do the right thing given
business rules, user intent, and operational context? This often
requires subject-matter experts or outcome instrumentation and is
time-intensive but irreplaceable.
**Behavioral Signals**
Observable patterns that correlate with quality: misalignment,
stagnation, disengagement, satisfaction, tool failures, loops, and
environment exhaustion. Fast to compute and valuable for prioritizing
which traces deserve inspection.
Used together, signals tell you *where to look*, and quality evaluation tells
you *what went wrong (or right)*.
Signal Taxonomy
===============
Signals are organized into three top-level **layers**, each with its own
intent. Every detected signal belongs to exactly one leaf type under one of
seven categories.
seven categories. The per-category summaries and leaf-type descriptions
below are borrowed verbatim from the reference implementation at
`katanemo/signals <https://github.com/katanemo/signals>`_ to keep the
documentation and the detector contract in sync.
Interaction (user ↔ agent conversational quality)
Interaction — user ↔ agent conversational quality
-------------------------------------------------
Covers how the discourse itself is going: is the user being understood, is
the conversation progressing, is the user engaged, is the user satisfied?
**Misalignment** — Misalignment signals capture semantic or intent mismatch
between the user and the agent, such as rephrasing, corrections,
clarifications, and restated constraints. These signals do not assert that
either party is "wrong"; they only indicate that shared understanding has
not yet been established.
.. list-table::
:header-rows: 1
:widths: 25 25 50
:widths: 30 70
* - Category
- Leaf signal type
- Meaning
* - **Misalignment**
- ``misalignment.correction``
- User explicitly corrects the agent ("No, I meant Paris, France").
* -
- ``misalignment.rephrase``
- User reformulates a previous request; semantic overlap is high.
* -
- ``misalignment.clarification``
- User signals confusion ("I don't understand", "what do you mean").
* - **Stagnation**
- ``stagnation.dragging``
- Conversation length significantly exceeds the expected baseline.
* -
- ``stagnation.repetition``
- Assistant near-duplicates prior turns (bigram Jaccard similarity).
* - **Disengagement**
- ``disengagement.escalation``
- User asks to speak to a human / supervisor / support.
* -
- ``disengagement.quit``
- User expresses intent to give up or abandon the session.
* -
- ``disengagement.negative_stance``
- User expresses frustration: complaints, ALL CAPS, excessive
punctuation, agent-directed profanity.
* - **Satisfaction**
- ``satisfaction.gratitude``
- User expresses thanks or appreciation.
* -
- ``satisfaction.confirmation``
- User confirms the outcome ("got it", "sounds good").
* -
- ``satisfaction.success``
- User confirms task success ("that worked", "perfect").
* - Leaf signal type
- Description
* - ``misalignment.correction``
- Explicit corrections, negations, mistake acknowledgments.
* - ``misalignment.rephrase``
- Rephrasing indicators, alternative explanations.
* - ``misalignment.clarification``
- Confusion expressions, requests for clarification.
Execution (agent-caused action quality)
**Stagnation** — Stagnation signals capture cases where the discourse
continues but fails to make visible progress. This includes near-duplicate
assistant responses, circular explanations, repeated scaffolding, and other
forms of linguistic degeneration.
.. list-table::
:header-rows: 1
:widths: 30 70
* - Leaf signal type
- Description
* - ``stagnation.dragging``
- Excessive turn count, conversation not progressing efficiently.
* - ``stagnation.repetition``
- Near-duplicate or repetitive assistant responses.
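The earlier table notes that ``stagnation.repetition`` fires on bigram Jaccard similarity between assistant turns. The detector's actual tokenization and threshold live in the analyzer; a minimal sketch, assuming whitespace tokenization and an illustrative cutoff:

```rust
use std::collections::HashSet;

/// Collect word bigrams from a message. Whitespace tokenization and
/// lowercasing are assumptions here; the real detector may differ.
fn bigrams(text: &str) -> HashSet<(String, String)> {
    let words: Vec<&str> = text.split_whitespace().collect();
    words
        .windows(2)
        .map(|w| (w[0].to_lowercase(), w[1].to_lowercase()))
        .collect()
}

/// Jaccard similarity between the bigram sets of two assistant turns:
/// |intersection| / |union|, in [0, 1].
fn bigram_jaccard(a: &str, b: &str) -> f64 {
    let (sa, sb) = (bigrams(a), bigrams(b));
    if sa.is_empty() && sb.is_empty() {
        return 1.0;
    }
    let inter = sa.intersection(&sb).count() as f64;
    let union = sa.union(&sb).count() as f64;
    inter / union
}

fn main() {
    let prev = "please check your order status in the app";
    let next = "please check your order status in the mobile app";
    // Near-duplicate turns score high; a threshold (value hypothetical)
    // over consecutive turns would flag stagnation.repetition.
    assert!(bigram_jaccard(prev, next) > 0.5);
    assert!(bigram_jaccard(prev, "totally unrelated reply") < 0.2);
}
```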
**Disengagement** — Disengagement signals mark the withdrawal of
cooperative intent from the interaction. These include explicit requests to
exit the agent flow (e.g., "talk to a human"), strong negative stances, and
abandonment markers.
.. list-table::
:header-rows: 1
:widths: 30 70
* - Leaf signal type
- Description
* - ``disengagement.escalation``
- Requests for human agent or support.
* - ``disengagement.quit``
- Notification to quit or leave.
* - ``disengagement.negative_stance``
- Complaints, frustration, negative sentiment.
**Satisfaction** — Satisfaction signals indicate explicit stabilization and
completion of the interaction. These include expressions of gratitude,
success confirmations, and closing utterances. We use these signals to
sample exemplar traces rather than to assign quality scores.
.. list-table::
:header-rows: 1
:widths: 30 70
* - Leaf signal type
- Description
* - ``satisfaction.gratitude``
- Expressions of thanks and appreciation.
* - ``satisfaction.confirmation``
- Explicit satisfaction expressions.
* - ``satisfaction.success``
- Confirmation of task completion or understanding.
Execution — agent-caused action quality
---------------------------------------
Covers attempts to act in the world that don't yield usable outcomes.
Requires tool-call traces (``function_call`` / ``observation``) to fire.
**Failure** — Detects agent-caused failures in tool/function usage. These
are issues the agent is responsible for (as opposed to environment failures
which are external system issues). Requires tool-call traces
(``function_call`` / ``observation``) to fire.
.. list-table::
:header-rows: 1
:widths: 25 25 50
:widths: 30 70
* - Category
- Leaf signal type
- Meaning
* - **Failure**
- ``failure.invalid_args``
- Tool call rejected due to schema / argument validation failure.
* -
- ``failure.bad_query``
- Downstream query rejected as malformed by the tool.
* -
- ``failure.tool_not_found``
- Agent called a tool that doesn't exist or isn't available.
* -
- ``failure.auth_misuse``
- Authentication / authorization failure on a tool call.
* -
- ``failure.state_error``
- Call-order / state-machine violation (e.g. commit without begin).
* - **Loops**
- ``loops.retry``
- Same tool call repeated with near-identical arguments.
* -
- ``loops.parameter_drift``
- Same tool called with slowly drifting parameters (walk pattern).
* -
- ``loops.oscillation``
- Call A → Call B → Call A → Call B pattern across multiple turns.
* - Leaf signal type
- Description
* - ``execution.failure.invalid_args``
- Wrong type, missing required field.
* - ``execution.failure.bad_query``
- Empty results due to overly narrow/wrong query.
* - ``execution.failure.tool_not_found``
- Agent called non-existent tool.
* - ``execution.failure.auth_misuse``
- Agent didn't pass credentials correctly.
* - ``execution.failure.state_error``
- Tool called in wrong state/order.
Environment (external system / boundary conditions)
**Loops** — Detects behavioral patterns where the agent gets stuck
repeating tool calls. These are distinct from
``interaction.stagnation`` (conversation text repetition) and
``execution.failure`` (single tool errors) — these detect tool-level
behavioral loops.
.. list-table::
:header-rows: 1
:widths: 30 70
* - Leaf signal type
- Description
* - ``execution.loops.retry``
- Same tool with identical args ≥3 times.
* - ``execution.loops.parameter_drift``
- Same tool with varied args ≥3 times.
* - ``execution.loops.oscillation``
- Multi-tool A→B→A→B pattern ≥3 cycles.
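The ``retry`` leaf above can be sketched as a consecutive-repeat check over recorded tool calls. This is a sketch only, with illustrative struct fields and a strictly-consecutive assumption; the real detector also covers drift and oscillation:

```rust
/// A recorded tool call: tool name plus canonicalized arguments.
/// (Field names are illustrative; the analyzer's own types differ.)
#[derive(PartialEq)]
struct ToolCall {
    name: String,
    args: String,
}

/// Detect execution.loops.retry: the same tool invoked with identical
/// arguments at least `min_repeats` times in a row (3 per the table).
fn has_retry_loop(calls: &[ToolCall], min_repeats: usize) -> bool {
    let mut run = 1;
    for pair in calls.windows(2) {
        if pair[0] == pair[1] {
            run += 1;
            if run >= min_repeats {
                return true;
            }
        } else {
            run = 1; // streak broken, start counting again
        }
    }
    false
}

fn main() {
    let call = |args: &str| ToolCall {
        name: "search_orders".into(),
        args: args.into(),
    };
    // Three identical calls in a row: flagged as a retry loop.
    let stuck = [call("{\"id\":42}"), call("{\"id\":42}"), call("{\"id\":42}")];
    assert!(has_retry_loop(&stuck, 3));
    // Varied arguments: not a retry (parameter_drift territory instead).
    let varied = [call("{\"id\":42}"), call("{\"id\":43}"), call("{\"id\":44}")];
    assert!(!has_retry_loop(&varied, 3));
}
```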
Environment — external system / boundary conditions
---------------------------------------------------
Covers failures **outside** the agent's control that still break the
interaction. Useful for separating agent-caused issues from infrastructure.
**Exhaustion** — Detects failures and constraints arising from the
surrounding system rather than the agent's internal policy or reasoning.
These are external issues the agent cannot control.
.. list-table::
:header-rows: 1
:widths: 25 25 50
:widths: 30 70
* - Category
- Leaf signal type
- Meaning
* - **Exhaustion**
- ``exhaustion.api_error``
- Downstream API returned a 5xx or unexpected error.
* -
- ``exhaustion.timeout``
- Tool / API call timed out.
* -
- ``exhaustion.rate_limit``
- Rate-limit response from a tool / API.
* -
- ``exhaustion.network``
- Transient network failure mid-call.
* -
- ``exhaustion.malformed_response``
- Response received but couldn't be parsed.
* -
- ``exhaustion.context_overflow``
- Context window / token budget exceeded.
* - Leaf signal type
- Description
* - ``environment.exhaustion.api_error``
- 5xx errors, service unavailable.
* - ``environment.exhaustion.timeout``
- Connection/read timeouts.
* - ``environment.exhaustion.rate_limit``
- 429, quota exceeded.
* - ``environment.exhaustion.network``
- Connection refused, DNS errors.
* - ``environment.exhaustion.malformed_response``
- Invalid JSON, unexpected schema.
* - ``environment.exhaustion.context_overflow``
- Token/context limit exceeded.
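A classifier over tool observations might map outcomes onto these exhaustion leaves roughly as follows. The matching rules here are purely illustrative (the detector's real patterns live in the analyzer), and the function signature is hypothetical:

```rust
/// Classify a failed tool observation into an environment.exhaustion
/// leaf type. Status-code and substring rules below are assumptions
/// for illustration, not the analyzer's actual patterns.
fn classify_exhaustion(status: Option<u16>, body: &str) -> Option<&'static str> {
    let lower = body.to_lowercase();
    match status {
        Some(429) => Some("environment.exhaustion.rate_limit"),
        Some(s) if s >= 500 => Some("environment.exhaustion.api_error"),
        _ if lower.contains("timed out") || lower.contains("timeout") => {
            Some("environment.exhaustion.timeout")
        }
        _ if lower.contains("connection refused") || lower.contains("dns") => {
            Some("environment.exhaustion.network")
        }
        _ if lower.contains("context length") || lower.contains("token limit") => {
            Some("environment.exhaustion.context_overflow")
        }
        _ => None, // not an environment issue; may be agent-caused instead
    }
}

fn main() {
    assert_eq!(
        classify_exhaustion(Some(503), ""),
        Some("environment.exhaustion.api_error")
    );
    assert_eq!(
        classify_exhaustion(None, "read timed out after 30s"),
        Some("environment.exhaustion.timeout")
    );
    // A successful call maps to no exhaustion signal at all.
    assert_eq!(classify_exhaustion(Some(200), "ok"), None);
}
```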
How It Works
============
@@ -368,7 +402,8 @@ Visual Flag Marker
When concerning signals are detected (disengagement present, stagnation
count > 2, any execution failure / loop, or overall quality ``poor``/
``severe``), the marker ``[!]`` is appended to the span's operation name.
``severe``), the marker 🚩 (U+1F6A9) is appended to the span's operation
name.
This makes flagged sessions immediately visible in trace UIs without
requiring attribute filtering.
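The flag-marker behavior described here, combined with the FLAG_MARKER constant restored in analyzer.rs above, could be sketched like this (the helper name and the idempotence guard are additions for illustration, not the analyzer's actual API):

```rust
/// Matches the constant restored in analyzer.rs: U+1F6A9, the 🚩 emoji.
const FLAG_MARKER: &str = "\u{1F6A9}";

/// Append the flag marker to a span operation name when concerning
/// signals were detected. The ends_with guard keeps the call idempotent.
fn flag_span_name(name: &str, concerning: bool) -> String {
    if concerning && !name.ends_with(FLAG_MARKER) {
        format!("{name} {FLAG_MARKER}")
    } else {
        name.to_string()
    }
}

fn main() {
    let flagged = flag_span_name("POST /v1/chat/completions gpt-5.2", true);
    // Downstream dashboards search span names for the emoji itself.
    assert_eq!(flagged, "POST /v1/chat/completions gpt-5.2 \u{1F6A9}");
    // Unconcerning spans are left untouched.
    assert_eq!(flag_span_name("GET /healthz", false), "GET /healthz");
}
```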
@@ -386,7 +421,7 @@ Example queries against the layered keys::
signals.execution.failure.count > 0
signals.environment.exhaustion.count > 0
For flagged sessions, search for ``[!]`` in span names.
For flagged sessions, search for 🚩 in span names.
.. image:: /_static/img/signals_trace.png
:width: 100%
@@ -473,7 +508,7 @@ Example Span
A concerning session, showing both layered attributes and a per-instance
event::
# Span name: "POST /v1/chat/completions gpt-5.2 [!]"
# Span name: "POST /v1/chat/completions gpt-5.2 🚩"
# Top-level
signals.quality = "severe"
@@ -585,7 +620,7 @@ Mitigation strategies:
causes.
.. tip::
The ``[!]`` marker in the span name provides instant visual feedback in
The 🚩 marker in the span name provides instant visual feedback in
trace UIs, while the structured attributes (``signals.quality``,
``signals.interaction.disengagement.severity``, etc.) and per-instance
span events enable powerful querying and drill-down in your observability

View file

@@ -33,6 +33,7 @@ extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.intersphinx",
"sphinx.ext.extlinks",
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinx_sitemap",
"sphinx_design",
@@ -41,6 +42,7 @@ extensions = [
"provider_models",
]
# Paths that contain templates, relative to this directory.
templates_path = ["_templates"]

View file

@@ -114,11 +114,11 @@ Signals act as early warning indicators embedded in your traces:
**Visual Flag Markers**
When concerning signals are detected (disengagement, execution failures / loops, stagnation > 2, or ``poor`` / ``severe`` quality), Plano automatically appends a ``[!]`` marker to the span's operation name. This makes problematic traces immediately visible in your tracing UI without requiring additional queries.
When concerning signals are detected (disengagement, execution failures / loops, stagnation > 2, or ``poor`` / ``severe`` quality), Plano automatically appends a 🚩 marker to the span's operation name. This makes problematic traces immediately visible in your tracing UI without requiring additional queries.
**Example Span with Signals**::
# Span name: "POST /v1/chat/completions gpt-4 [!]"
# Span name: "POST /v1/chat/completions gpt-4 🚩"
# Standard LLM attributes:
llm.model = "gpt-4"
llm.usage.total_tokens = 225