mirror of
https://github.com/katanemo/plano.git
synced 2026-04-25 00:36:34 +02:00
Compare commits
3 commits
ca4a9e57f2
...
041ba75034
| Author | SHA1 | Date |
|---|---|---|
| | 041ba75034 | |
| | cea43c5da5 | |
| | ae629d3635 | |
4 changed files with 191 additions and 153 deletions
@@ -21,9 +21,10 @@ use super::schemas::{
 use super::text_processing::NormalizedMessage;
 
 /// Marker appended to the span operation name when concerning signals are
-/// detected. Kept in sync with the previous implementation for backward
-/// compatibility with downstream consumers.
-pub const FLAG_MARKER: &str = "[!]";
+/// detected. The 🚩 emoji (U+1F6A9) matches the pre-port implementation so
+/// downstream consumers that search for flagged traces by span-name emoji
+/// keep working.
+pub const FLAG_MARKER: &str = "\u{1F6A9}";
 
 /// ShareGPT-shaped row used as the canonical input to the analyzer's
 /// detectors. `from` is one of `"human"`, `"gpt"`, `"function_call"`,
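This hunk only swaps the marker constant; the code that appends it to span names lives elsewhere in the analyzer and is not shown in this diff. As a rough illustration only, a minimal Rust sketch of applying such a marker idempotently (the helper name and signature are hypothetical, not the repo's API):

```rust
/// Flag marker as introduced in this hunk.
pub const FLAG_MARKER: &str = "\u{1F6A9}";

/// Hypothetical helper (not from the repo): append the marker to a span's
/// operation name when concerning signals were detected. Checking
/// `ends_with` first keeps the operation idempotent if a span is
/// re-processed.
pub fn flag_span_name(name: &str, concerning: bool) -> String {
    if concerning && !name.ends_with(FLAG_MARKER) {
        format!("{name} {FLAG_MARKER}")
    } else {
        name.to_string()
    }
}

fn main() {
    // A flagged chat-completions span, shaped like the docs' example.
    println!("{}", flag_span_name("POST /v1/chat/completions gpt-5.2", true));
}
```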
@@ -4,39 +4,64 @@
 Signals™
 ========
 
-Agentic Signals are lightweight, model-free behavioral indicators computed from
-live interaction trajectories and attached to your existing
-OpenTelemetry traces. They make it possible to triage the small fraction of
-trajectories that are most likely to be informative — brilliant successes or
-**severe failures** — without running an LLM-as-judge on every session.
+Agentic Signals are lightweight, model-free behavioral indicators computed
+from live interaction trajectories and attached to your existing
+OpenTelemetry traces. They are the instrumentation layer of a closed-loop
+improvement flywheel for agents — turning raw production traffic into
+prioritized data that can drive prompt, routing, and model updates without
+running an LLM-as-judge on every session.
 
 The framework implemented here follows the taxonomy and detector design in
-*Signals: Trajectory Sampling and Triage for Agentic Interactions* (Chen,
-Hafeez, Paracha, 2026; `arXiv:2604.00356
-<https://arxiv.org/abs/2604.00356>`_). All detectors are computed without
-model calls; the entire pipeline attaches structured attributes and span
-events to existing spans so your dashboards and alerts work unmodified.
+*Signals: Trajectory Sampling and Triage for Agentic Interactions*
+(`Chen et al., 2026 <https://arxiv.org/abs/2604.00356>`_). All detectors
+are computed without model calls; the entire pipeline attaches structured
+attributes and span events to existing spans so your dashboards and alerts
+work unmodified.
 
-The Problem: Knowing What's "Good"
-==================================
+Why Signals Matter: The Improvement Flywheel
+============================================
 
-One of the hardest parts of building agents is measuring how well they
-perform in the real world.
+Agentic applications are increasingly deployed at scale, yet improving them
+after deployment remains difficult. Production trajectories are long,
+numerous, and non-deterministic, making exhaustive human review infeasible
+and auxiliary LLM evaluation expensive. As a result, teams face a
+bottleneck: they cannot score every response, inspect every trace, or
+reliably identify which failures and successes should inform the next model
+update. Without a low-cost triage layer, the feedback loop from production
+behavior to model improvement remains incomplete.
 
-**Offline testing** relies on hand-picked examples and happy-path scenarios,
-missing the messy diversity of real usage. Developers manually prompt models,
-evaluate responses, and tune prompts by guesswork — a slow, incomplete
-feedback loop.
+Signals close this loop by cheaply identifying which interactions among
+millions are worth inspecting:
 
-**Production debugging** floods developers with traces and logs but provides
-little guidance on which interactions actually matter. Finding failures means
-painstakingly reconstructing sessions and manually labeling quality issues.
+1. **Instrument.** Live trajectories are scored with model-free signals
+   attached as structured attributes on existing OpenTelemetry spans,
+   organized under a fixed taxonomy of interaction, execution, and
+   environment signals. This requires no additional model calls,
+   infrastructure, or changes to online agent behavior.
+2. **Sample & triage.** Signal attributes act as filters: they surface
+   severe failures, retrieve representative exemplars, and exclude the
+   uninformative middle. In our experiments, signal-based sampling
+   achieves 82% informativeness on :math:`\tau`-bench, compared with 54%
+   for random sampling, yielding a 1.52× efficiency gain per informative
+   trajectory.
+3. **Data Construction.** The triaged subset becomes targeted input for
+   constructing preference datasets or supervised fine-tuning datasets
+   from production trajectories.
+4. **Model Optimization.** The resulting preference or supervised
+   fine-tuning data is used to update the model through methods such as
+   DPO, RLHF, or supervised fine-tuning, so optimization is driven by
+   targeted production behavior rather than undifferentiated trace noise.
+5. **Deploy.** The improved model is deployed and immediately
+   re-instrumented with the same signals, enabling teams to measure
+   whether the change improved production behavior and to feed the next
+   iteration.
 
-You can't score every response with an LLM-as-judge (too expensive, too slow)
-or manually review every trace (doesn't scale). What you need are
-**behavioral signals** — fast, economical proxies that don't label quality
-outright but dramatically shrink the search space, pointing to sessions most
-likely to be broken or brilliant.
+This loop depends on the first step being nearly free. The framework is
+therefore designed around fixed-taxonomy, model-free detectors with
+:math:`O(\text{messages})` cost, no online behavior change, and no
+dependence on expensive evaluator models. By making production traces
+searchable and sampleable at scale, signals turn raw agent telemetry into a
+practical model-optimization flywheel.
 
 What Are Behavioral Signals?
 ============================
@@ -61,150 +86,159 @@ agent performance. Embedded directly into traces, they make it easy to spot
 friction as it happens: where users struggle, where agents loop, where tool
 failures cluster, and where escalations occur.
 
 Signals vs Response Quality
 ===========================
 
 Behavioral signals and response quality are complementary.
 
 **Response Quality**
     Domain-specific correctness: did the agent do the right thing given
     business rules, user intent, and operational context? This often
     requires subject-matter experts or outcome instrumentation and is
     time-intensive but irreplaceable.
 
 **Behavioral Signals**
     Observable patterns that correlate with quality: misalignment,
     stagnation, disengagement, satisfaction, tool failures, loops, and
     environment exhaustion. Fast to compute and valuable for prioritizing
     which traces deserve inspection.
 
 Used together, signals tell you *where to look*, and quality evaluation tells
 you *what went wrong (or right)*.
 
 Signal Taxonomy
 ===============
 
 Signals are organized into three top-level **layers**, each with its own
 intent. Every detected signal belongs to exactly one leaf type under one of
-seven categories.
+seven categories. The per-category summaries and leaf-type descriptions
+below are borrowed verbatim from the reference implementation at
+`katanemo/signals <https://github.com/katanemo/signals>`_ to keep the
+documentation and the detector contract in sync.
 
-Interaction (user ↔ agent conversational quality)
+Interaction — user ↔ agent conversational quality
 -------------------------------------------------
 
-Covers how the discourse itself is going: is the user being understood, is
-the conversation progressing, is the user engaged, is the user satisfied?
+**Misalignment** — Misalignment signals capture semantic or intent mismatch
+between the user and the agent, such as rephrasing, corrections,
+clarifications, and restated constraints. These signals do not assert that
+either party is "wrong"; they only indicate that shared understanding has
+not yet been established.
 
 .. list-table::
    :header-rows: 1
-   :widths: 25 25 50
+   :widths: 30 70
 
-   * - Category
-     - Leaf signal type
-     - Meaning
-   * - **Misalignment**
-     - ``misalignment.correction``
-     - User explicitly corrects the agent ("No, I meant Paris, France").
-   * -
-     - ``misalignment.rephrase``
-     - User reformulates a previous request; semantic overlap is high.
-   * -
-     - ``misalignment.clarification``
-     - User signals confusion ("I don't understand", "what do you mean").
-   * - **Stagnation**
-     - ``stagnation.dragging``
-     - Conversation length significantly exceeds the expected baseline.
-   * -
-     - ``stagnation.repetition``
-     - Assistant near-duplicates prior turns (bigram Jaccard similarity).
-   * - **Disengagement**
-     - ``disengagement.escalation``
-     - User asks to speak to a human / supervisor / support.
-   * -
-     - ``disengagement.quit``
-     - User expresses intent to give up or abandon the session.
-   * -
-     - ``disengagement.negative_stance``
-     - User expresses frustration: complaints, ALL CAPS, excessive
-       punctuation, agent-directed profanity.
-   * - **Satisfaction**
-     - ``satisfaction.gratitude``
-     - User expresses thanks or appreciation.
-   * -
-     - ``satisfaction.confirmation``
-     - User confirms the outcome ("got it", "sounds good").
-   * -
-     - ``satisfaction.success``
-     - User confirms task success ("that worked", "perfect").
+   * - Leaf signal type
+     - Description
+   * - ``misalignment.correction``
+     - Explicit corrections, negations, mistake acknowledgments.
+   * - ``misalignment.rephrase``
+     - Rephrasing indicators, alternative explanations.
+   * - ``misalignment.clarification``
+     - Confusion expressions, requests for clarification.
 
-Execution (agent-caused action quality)
+**Stagnation** — Stagnation signals capture cases where the discourse
+continues but fails to make visible progress. This includes near-duplicate
+assistant responses, circular explanations, repeated scaffolding, and other
+forms of linguistic degeneration.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 30 70
+
+   * - Leaf signal type
+     - Description
+   * - ``stagnation.dragging``
+     - Excessive turn count, conversation not progressing efficiently.
+   * - ``stagnation.repetition``
+     - Near-duplicate or repetitive assistant responses.
+
+**Disengagement** — Disengagement signals mark the withdrawal of
+cooperative intent from the interaction. These include explicit requests to
+exit the agent flow (e.g., "talk to a human"), strong negative stances, and
+abandonment markers.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 30 70
+
+   * - Leaf signal type
+     - Description
+   * - ``disengagement.escalation``
+     - Requests for human agent or support.
+   * - ``disengagement.quit``
+     - Notification to quit or leave.
+   * - ``disengagement.negative_stance``
+     - Complaints, frustration, negative sentiment.
+
+**Satisfaction** — Satisfaction signals indicate explicit stabilization and
+completion of the interaction. These include expressions of gratitude,
+success confirmations, and closing utterances. We use these signals to
+sample exemplar traces rather than to assign quality scores.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 30 70
+
+   * - Leaf signal type
+     - Description
+   * - ``satisfaction.gratitude``
+     - Expressions of thanks and appreciation.
+   * - ``satisfaction.confirmation``
+     - Explicit satisfaction expressions.
+   * - ``satisfaction.success``
+     - Confirmation of task completion or understanding.
+
+Execution — agent-caused action quality
 ---------------------------------------
 
-Covers attempts to act in the world that don't yield usable outcomes.
-Requires tool-call traces (``function_call`` / ``observation``) to fire.
+**Failure** — Detects agent-caused failures in tool/function usage. These
+are issues the agent is responsible for (as opposed to environment failures
+which are external system issues). Requires tool-call traces
+(``function_call`` / ``observation``) to fire.
 
 .. list-table::
    :header-rows: 1
-   :widths: 25 25 50
+   :widths: 30 70
 
-   * - Category
-     - Leaf signal type
-     - Meaning
-   * - **Failure**
-     - ``failure.invalid_args``
-     - Tool call rejected due to schema / argument validation failure.
-   * -
-     - ``failure.bad_query``
-     - Downstream query rejected as malformed by the tool.
-   * -
-     - ``failure.tool_not_found``
-     - Agent called a tool that doesn't exist or isn't available.
-   * -
-     - ``failure.auth_misuse``
-     - Authentication / authorization failure on a tool call.
-   * -
-     - ``failure.state_error``
-     - Call-order / state-machine violation (e.g. commit without begin).
-   * - **Loops**
-     - ``loops.retry``
-     - Same tool call repeated with near-identical arguments.
-   * -
-     - ``loops.parameter_drift``
-     - Same tool called with slowly drifting parameters (walk pattern).
-   * -
-     - ``loops.oscillation``
-     - Call A → Call B → Call A → Call B pattern across multiple turns.
+   * - Leaf signal type
+     - Description
+   * - ``execution.failure.invalid_args``
+     - Wrong type, missing required field.
+   * - ``execution.failure.bad_query``
+     - Empty results due to overly narrow/wrong query.
+   * - ``execution.failure.tool_not_found``
+     - Agent called non-existent tool.
+   * - ``execution.failure.auth_misuse``
+     - Agent didn't pass credentials correctly.
+   * - ``execution.failure.state_error``
+     - Tool called in wrong state/order.
 
-Environment (external system / boundary conditions)
+**Loops** — Detects behavioral patterns where the agent gets stuck
+repeating tool calls. These are distinct from
+``interaction.stagnation`` (conversation text repetition) and
+``execution.failure`` (single tool errors) — these detect tool-level
+behavioral loops.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 30 70
+
+   * - Leaf signal type
+     - Description
+   * - ``execution.loops.retry``
+     - Same tool with identical args ≥3 times.
+   * - ``execution.loops.parameter_drift``
+     - Same tool with varied args ≥3 times.
+   * - ``execution.loops.oscillation``
+     - Multi-tool A→B→A→B pattern ≥3 cycles.
+
+Environment — external system / boundary conditions
 ---------------------------------------------------
 
-Covers failures **outside** the agent's control that still break the
-interaction. Useful for separating agent-caused issues from infrastructure.
+**Exhaustion** — Detects failures and constraints arising from the
+surrounding system rather than the agent's internal policy or reasoning.
+These are external issues the agent cannot control.
 
 .. list-table::
    :header-rows: 1
-   :widths: 25 25 50
+   :widths: 30 70
 
-   * - Category
-     - Leaf signal type
-     - Meaning
-   * - **Exhaustion**
-     - ``exhaustion.api_error``
-     - Downstream API returned a 5xx or unexpected error.
-   * -
-     - ``exhaustion.timeout``
-     - Tool / API call timed out.
-   * -
-     - ``exhaustion.rate_limit``
-     - Rate-limit response from a tool / API.
-   * -
-     - ``exhaustion.network``
-     - Transient network failure mid-call.
-   * -
-     - ``exhaustion.malformed_response``
-     - Response received but couldn't be parsed.
-   * -
-     - ``exhaustion.context_overflow``
-     - Context window / token budget exceeded.
+   * - Leaf signal type
+     - Description
+   * - ``environment.exhaustion.api_error``
+     - 5xx errors, service unavailable.
+   * - ``environment.exhaustion.timeout``
+     - Connection/read timeouts.
+   * - ``environment.exhaustion.rate_limit``
+     - 429, quota exceeded.
+   * - ``environment.exhaustion.network``
+     - Connection refused, DNS errors.
+   * - ``environment.exhaustion.malformed_response``
+     - Invalid JSON, unexpected schema.
+   * - ``environment.exhaustion.context_overflow``
+     - Token/context limit exceeded.
 
 How It Works
 ============
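The ``stagnation.repetition`` rows in this hunk mention bigram Jaccard similarity between assistant turns. A minimal Rust sketch of that metric (the whitespace tokenization and any flagging threshold are illustrative assumptions, not the repo's actual detector):

```rust
use std::collections::HashSet;

/// Collect lowercase word bigrams from one message.
fn bigrams(text: &str) -> HashSet<(String, String)> {
    let toks: Vec<String> = text
        .split_whitespace()
        .map(str::to_lowercase)
        .collect();
    toks.windows(2)
        .map(|w| (w[0].clone(), w[1].clone()))
        .collect()
}

/// Jaccard similarity (|A ∩ B| / |A ∪ B|) between the bigram sets of two
/// assistant turns; values near 1.0 indicate a near-duplicate response.
fn bigram_jaccard(a: &str, b: &str) -> f64 {
    let (sa, sb) = (bigrams(a), bigrams(b));
    let union = sa.union(&sb).count();
    if union == 0 {
        return 0.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

fn main() {
    let s = bigram_jaccard(
        "please restart the router and try again",
        "please restart the router and retry",
    );
    println!("similarity = {s:.2}");
}
```

A detector built on this would compare each new assistant turn against recent prior turns and fire when the similarity exceeds some cutoff.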
@@ -368,7 +402,8 @@ Visual Flag Marker
 
 When concerning signals are detected (disengagement present, stagnation
 count > 2, any execution failure / loop, or overall quality ``poor``/
-``severe``), the marker ``[!]`` is appended to the span's operation name.
+``severe``), the marker 🚩 (U+1F6A9) is appended to the span's operation
+name.
 This makes flagged sessions immediately visible in trace UIs without
 requiring attribute filtering.
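The flagging rule quoted in this hunk combines several conditions. As a sketch of that boolean rule in Rust (the struct, field names, and function are hypothetical, not the analyzer's actual types):

```rust
/// Hypothetical per-session summary of detected signals.
struct SignalSummary {
    disengagement_count: u32,
    stagnation_count: u32,
    execution_failure_count: u32,
    loop_count: u32,
    quality: &'static str,
}

/// The documented rule: flag when disengagement is present, stagnation
/// count exceeds 2, any execution failure or loop fired, or overall
/// quality is "poor" or "severe".
fn is_concerning(s: &SignalSummary) -> bool {
    s.disengagement_count > 0
        || s.stagnation_count > 2
        || s.execution_failure_count > 0
        || s.loop_count > 0
        || matches!(s.quality, "poor" | "severe")
}

fn main() {
    let s = SignalSummary {
        disengagement_count: 0,
        stagnation_count: 3,
        execution_failure_count: 0,
        loop_count: 0,
        quality: "fair",
    };
    // stagnation_count > 2 triggers the flag here
    println!("{}", is_concerning(&s));
}
```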
@@ -386,7 +421,7 @@ Example queries against the layered keys::
     signals.execution.failure.count > 0
     signals.environment.exhaustion.count > 0
 
-For flagged sessions, search for ``[!]`` in span names.
+For flagged sessions, search for 🚩 in span names.
 
 .. image:: /_static/img/signals_trace.png
    :width: 100%
@@ -473,7 +508,7 @@ Example Span
 A concerning session, showing both layered attributes and a per-instance
 event::
 
-    # Span name: "POST /v1/chat/completions gpt-5.2 [!]"
+    # Span name: "POST /v1/chat/completions gpt-5.2 🚩"
 
     # Top-level
     signals.quality = "severe"
@@ -585,7 +620,7 @@ Mitigation strategies:
    causes.
 
 .. tip::
-   The ``[!]`` marker in the span name provides instant visual feedback in
+   The 🚩 marker in the span name provides instant visual feedback in
    trace UIs, while the structured attributes (``signals.quality``,
    ``signals.interaction.disengagement.severity``, etc.) and per-instance
    span events enable powerful querying and drill-down in your observability
@@ -33,6 +33,7 @@ extensions = [
     "sphinx.ext.autodoc",
     "sphinx.ext.intersphinx",
     "sphinx.ext.extlinks",
+    "sphinx.ext.mathjax",
     "sphinx.ext.viewcode",
     "sphinx_sitemap",
     "sphinx_design",
@@ -41,6 +42,7 @@ extensions = [
     "provider_models",
 ]
 
+
 # Paths that contain templates, relative to this directory.
 templates_path = ["_templates"]
 
@@ -114,11 +114,11 @@ Signals act as early warning indicators embedded in your traces:
 
 **Visual Flag Markers**
 
-When concerning signals are detected (disengagement, execution failures / loops, stagnation > 2, or ``poor`` / ``severe`` quality), Plano automatically appends a ``[!]`` marker to the span's operation name. This makes problematic traces immediately visible in your tracing UI without requiring additional queries.
+When concerning signals are detected (disengagement, execution failures / loops, stagnation > 2, or ``poor`` / ``severe`` quality), Plano automatically appends a 🚩 marker to the span's operation name. This makes problematic traces immediately visible in your tracing UI without requiring additional queries.
 
 **Example Span with Signals**::
 
-    # Span name: "POST /v1/chat/completions gpt-4 [!]"
+    # Span name: "POST /v1/chat/completions gpt-4 🚩"
     # Standard LLM attributes:
     llm.model = "gpt-4"
     llm.usage.total_tokens = 225