<h1>Signals™</h1>
<p>Agentic Signals are behavioral and execution-quality indicators that act as early warning signs of agent performance, highlighting both brilliant successes and <strong>severe failures</strong>. These signals are computed directly from conversation traces without requiring manual labeling or domain expertise, making them practical for production observability at scale.</p>
<section id="the-problem-knowing-what-s-good">
<h2>The Problem: Knowing What’s “Good”</h2>
<p>One of the hardest parts of building agents is measuring how well they perform in the real world.</p>
<p><strong>Offline testing</strong> relies on hand-picked examples and happy-path scenarios, missing the messy diversity of real usage. Developers manually prompt models, evaluate responses, and tune prompts by guesswork—a slow, incomplete feedback loop.</p>
<p><strong>Production debugging</strong> floods developers with traces and logs but provides little guidance on which interactions actually matter. Finding failures means painstakingly reconstructing sessions and manually labeling quality issues.</p>
<p>You can’t score every response with an LLM-as-judge (too expensive, too slow) or manually review every trace (doesn’t scale). What you need are <strong>behavioral signals</strong>—fast, economical proxies that don’t label quality outright but dramatically shrink the search space, pointing to sessions most likely to be broken or brilliant.</p>
</section>
<section id="what-are-behavioral-signals">
<h2>What Are Behavioral Signals?</h2>
<p>Behavioral signals are canaries in the coal mine—early, objective indicators that something may have gone wrong (or gone exceptionally well). They don’t explain <em>why</em> an agent failed, but they reliably signal <em>where</em> attention is needed.</p>
<p>These signals emerge naturally from the rhythm of interaction:</p>
<ul class="simple">
<li><p>A user rephrasing the same request</p></li>
<li><p>Sharp increases in conversation length</p></li>
<li><p>Expressions of gratitude or satisfaction</p></li>
<li><p>Requests to speak to a human / contact support</p></li>
</ul>
<p>Individually, these clues are shallow; together, they form a fingerprint of agent performance. Embedded directly into traces, they make it easy to spot friction as it happens: where users struggle, where agents loop, and where escalations occur.</p>
</section>
<section id="signals-vs-response-quality">
<h2>Signals vs Response Quality</h2>
<p>Behavioral signals and response quality are complementary.</p>
<dl class="simple">
<dt><strong>Response Quality</strong></dt><dd><p>Domain-specific correctness: did the agent do the right thing given business rules, user intent, and operational context? This often requires subject-matter experts or outcome instrumentation and is time-intensive but irreplaceable.</p>
</dd>
<dt><strong>Behavioral Signals</strong></dt><dd><p>Observable patterns that correlate with quality: high repair frequency, excessive turns, frustration markers, repetition, escalation, and positive feedback. Fast to compute and valuable for prioritizing which traces deserve inspection.</p>
</dd>
</dl>
<p>Used together, signals tell you <em>where to look</em>, and quality evaluation tells you <em>what went wrong (or right)</em>.</p>
</section>
<section id="how-it-works">
<h2>How It Works</h2>
<p>Signals are computed automatically by the gateway and emitted as <strong>OpenTelemetry trace attributes</strong> to your existing observability stack (Jaeger, Honeycomb, Grafana Tempo, etc.). No additional libraries or instrumentation required—just configure your OTEL collector endpoint.</p>
<p>Each conversation trace is enriched with signal attributes that you can query, filter, and visualize in your observability platform. The gateway analyzes message content (performing text normalization, Unicode handling, and pattern matching) to compute behavioral signals in real time.</p>
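<p>The gateway's actual normalization and pattern libraries are not shown here. As a minimal sketch of the general idea, assuming NFKC normalization, case folding, and a single hand-written escalation pattern, the preprocessing and matching steps might look like this. The function names and the regular expression are hypothetical.</p>

```python
import re
import unicodedata

# Hypothetical pattern for illustration; a real phrase library would be much larger.
ESCALATION_PATTERN = re.compile(
    r"\b(?:speak|talk) (?:to|with) (?:a |an )?(?:human|agent|person)\b"
)


def normalize_text(text: str) -> str:
    """Apply Unicode compatibility normalization, case folding, and whitespace collapsing."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return " ".join(folded.split())


def requests_escalation(message: str) -> bool:
    """Return True when the normalized message matches the escalation pattern."""
    return ESCALATION_PATTERN.search(normalize_text(message)) is not None
```

<p>Normalizing before matching keeps the patterns simple: they only need to handle lowercase text with single spaces, regardless of how the user typed the message.</p>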
<p><strong>OTEL Trace Attributes</strong></p>
<p>Signal data is exported as structured span attributes:</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">signals.turn_count</span></code> - Total number of turns in the conversation</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.repair.count</span></code> - Number of repair attempts detected (when present)</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.repair.ratio</span></code> - Ratio of repairs to user turns (when present)</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.frustration.count</span></code> - Number of frustration indicators detected</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.repetition.count</span></code> - Number of repetition instances detected</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.escalation.requested</span></code> - Boolean escalation flag (“true” when present)</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">signals.positive_feedback.count</span></code> - Number of positive feedback indicators</p></li>
</ul>
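<p>As a rough illustration of how these attributes might be consumed downstream, the sketch below reads them from a plain dictionary standing in for an exported span's attribute map. The <code>read_signals</code> helper and the sample values are assumptions for illustration; only the attribute names come from the list above. Attributes that may be absent fall back to neutral defaults.</p>

```python
from typing import Any, Mapping


def read_signals(attrs: Mapping[str, Any]) -> dict:
    """Collect signal attributes from a span's attribute map (hypothetical helper)."""
    return {
        "turn_count": int(attrs.get("signals.turn_count", 0)),
        "repair_count": int(attrs.get("signals.repair.count", 0)),
        "repair_ratio": float(attrs.get("signals.repair.ratio", 0.0)),
        "frustration_count": int(attrs.get("signals.frustration.count", 0)),
        "repetition_count": int(attrs.get("signals.repetition.count", 0)),
        # The flag is exported as the string "true" when present.
        "escalation_requested": attrs.get("signals.escalation.requested") == "true",
        "positive_feedback_count": int(attrs.get("signals.positive_feedback.count", 0)),
    }


# Made-up sample attributes for one span.
sample = {
    "signals.turn_count": 8,
    "signals.repair.count": 2,
    "signals.repair.ratio": 0.5,
    "signals.escalation.requested": "true",
}
signals = read_signals(sample)
print(signals["escalation_requested"], signals["frustration_count"])
```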
<p><strong>Visual Flag Marker</strong></p>
<p>When concerning signals are detected (frustration, looping, escalation, or poor/severe quality), the flag marker <strong>🚩</strong> is automatically appended to the span’s operation name, making problematic traces easy to spot in your trace visualizations.</p>
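<p>The marker logic can be pictured with a small sketch. The helper name and the single-space separator are assumptions; the description above only states that the marker is appended to the operation name.</p>

```python
FLAG = "\U0001F6A9"  # the 🚩 marker


def flag_operation_name(name: str, concerning: bool) -> str:
    """Append the flag marker once when a trace has concerning signals."""
    if concerning and not name.endswith(FLAG):
        return f"{name} {FLAG}"
    return name
```

<p>Checking for an existing marker keeps the operation idempotent, so a span that is processed twice does not accumulate flags.</p>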
<p><strong>Querying in Your Observability Platform</strong></p>
<p>Example queries:</p>
<ul class="simple">
<li><p>Find all severe interactions: <code class="docutils literal notranslate"><span class="pre">signals.quality</span> <span class="pre">=</span> <span class="pre">"Severe"</span></code></p></li>
<li><p>Find flagged traces: search for <strong>🚩</strong> in span names</p></li>
<li><p>Find long conversations: <code class="docutils literal notranslate"><span class="pre">signals.turn_count</span> <span class="pre">&gt;</span> <span class="pre">10</span></code></p></li>
</ul>
</section>
<h2>Core Signal Types</h2>
<p>The signals system tracks six categories of behavioral indicators.</p>
<section id="turn-count-efficiency">
<h3>Turn Count &amp; Efficiency</h3>
<dl class="simple">
<dt><strong>What it measures</strong></dt><dd><p>Number of user–assistant exchanges.</p>
</dd>
<dt><strong>Why it matters</strong></dt><dd><p>Long conversations often indicate unclear intent resolution, confusion, or inefficiency. Very short conversations can correlate with crisp resolution.</p>
</dd>
<dt><strong>Efficiency scoring</strong></dt><dd><p>Baseline expectation is ~5 turns (tunable). Efficiency stays at 1.0 up to the baseline, then declines with an inverse penalty as turns exceed the baseline.</p>
</dd>
</dl>
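<p>The exact penalty formula is not reproduced here. As a minimal sketch, assuming the penalty takes the form 1 / (1 + k × excess turns), the score could be computed as below. The function name, the weight <code>penalty</code>, and its default of 0.3 are hypothetical; only the baseline of about 5 turns and the flat 1.0 up to the baseline come from the description above.</p>

```python
def efficiency_score(turns: int, baseline: int = 5, penalty: float = 0.3) -> float:
    """Return 1.0 up to the baseline, then decay inversely with excess turns.

    Assumed form: 1 / (1 + penalty * excess). The penalty weight is hypothetical.
    """
    excess = max(0, turns - baseline)
    return 1.0 / (1.0 + penalty * excess)


for turns in (3, 5, 8, 15):
    print(turns, round(efficiency_score(turns), 3))
```

<p>With this shape the score never reaches zero, so very long conversations remain comparable to each other rather than all collapsing to the same value.</p>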
<h3>Follow-Up &amp; Repair Frequency</h3>
<dl class="simple">
<dt><strong>What it measures</strong></dt><dd><p>How often users clarify, correct, or rephrase requests. This is a <strong>user signal</strong> tracking query reformulation behavior—when users must repair or rephrase their requests because the agent didn’t understand or respond appropriately.</p>
</dd>
<dt><strong>Why it matters</strong></dt><dd><p>High repair frequency is a proxy for misunderstanding or intent drift. When users repeatedly rephrase the same request, it indicates the agent is failing to grasp or act on the user’s intent.</p>
</dd>
</dl>
<p><strong>Key metrics</strong></p>
<ul class="simple">
<li><p>Repair count and ratio (repairs / user turns)</p></li>
</ul>
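<p>The ratio itself is simple arithmetic. The sketch below assumes repairs have already been counted upstream; the function name and the zero-turn guard are illustrative choices, not the gateway's implementation.</p>

```python
def repair_ratio(repair_count: int, user_turns: int) -> float:
    """Repairs divided by user turns; 0.0 when there are no user turns."""
    if user_turns <= 0:
        return 0.0
    return repair_count / user_turns


# Example: 2 repairs across 5 user turns.
print(repair_ratio(2, 5))
```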
<h3>Repetition &amp; Looping</h3>
<dl class="simple">
<dt><strong>What it measures</strong></dt><dd><p>Assistant repetition / degenerative loops. This is an <strong>assistant signal</strong> tracking when the agent repeats itself, fails to follow instructions, or gets stuck in loops—indicating the agent is not making progress or adapting its responses.</p>
</dd>
<dt><strong>Why it matters</strong></dt><dd><p>Often indicates missing state tracking, broken tool integration, prompt issues, or the agent ignoring user corrections. High repetition means the agent is not learning from the conversation context.</p>
</dd>
</dl>
<p><strong>Detection method</strong></p>
<ul class="simple">
<li><p>Compare assistant messages using <strong>bigram Jaccard similarity</strong></p></li>
</ul>
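<p>Bigram Jaccard similarity is the size of the intersection of two bigram sets divided by the size of their union. The sketch below assumes word-level bigrams and a similarity threshold of 0.8 for counting a message as a repeat; both choices, along with the function names, are assumptions rather than the gateway's documented settings.</p>

```python
import re


def bigrams(text: str) -> set:
    """Return the set of adjacent word pairs in lowercased text."""
    words = re.findall(r"\w+", text.lower())
    return set(zip(words, words[1:]))


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity (intersection over union) of word bigram sets."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def count_repetitions(assistant_messages: list, threshold: float = 0.8) -> int:
    """Count messages that closely match any earlier assistant message."""
    count = 0
    for i, msg in enumerate(assistant_messages):
        earlier = assistant_messages[:i]
        if any(bigram_jaccard(msg, prev) >= threshold for prev in earlier):
            count += 1
    return count
```

<p>Comparing each message against all earlier ones, not just the previous one, also catches loops where the agent alternates between two responses.</p>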
<dl class="simple">
<dt><strong>Severe</strong></dt><dd><p>Critical issues: escalation requested, severe frustration, severe looping, or excessive turns (&gt;12). Requires immediate attention.</p>
</dd>
</dl>
<p>This assessment uses a scoring model that weighs positive factors (efficiency, positive feedback) against negative ones (frustration, repairs, repetition, escalation).</p>
</section>
<section id="sampling-and-prioritization">
<h2>Sampling and Prioritization</h2>
<p>In production, trace data is overwhelming. Signals provide a lightweight first layer of analysis to prioritize which sessions deserve review.</p>
<p>Workflow:</p>
<ol class="arabic simple">
<li><p>Gateway captures conversation messages and computes signals</p></li>
<li><p>Signal attributes are emitted to OTEL spans automatically</p></li>
<li><p>Your observability platform ingests and indexes the attributes</p></li>
<li><p>Query/filter by signal attributes to surface outliers (poor/severe and exemplars)</p></li>
<li><p>Review high-information traces to identify improvement opportunities</p></li>
<li><p>Update prompts, routing, or policies based on findings</p></li>
<li><p>Redeploy and monitor signal metrics to validate improvements</p></li>
</ol>
<p>This creates a reinforcement loop where traces become both diagnostic data and training signal.</p>
</section>
<section id="trace-filtering-and-telemetry">
<h2>Trace Filtering and Telemetry</h2>
<p>Signal attributes are automatically added to OpenTelemetry spans, making them immediately queryable in your observability platform.</p>
<p><strong>Visual Filtering</strong></p>
<p>When concerning signals are detected, the flag marker <strong>🚩</strong> (U+1F6A9) is automatically appended to the span’s operation name. This makes flagged sessions immediately visible in trace visualizations without requiring attribute filtering.</p>
<p>Use signal attributes to build monitoring dashboards in Grafana, Honeycomb, Datadog, etc.:</p>
<ul class="simple">
<li><p><strong>Quality distribution</strong>: Count of traces by <code class="docutils literal notranslate"><span class="pre">signals.quality</span></code></p></li>
<li><p><strong>P95 turn count</strong>: 95th percentile of <code class="docutils literal notranslate"><span class="pre">signals.turn_count</span></code></p></li>
<li><p><strong>Average efficiency</strong>: Mean of <code class="docutils literal notranslate"><span class="pre">signals.efficiency_score</span></code></p></li>
<li><p><strong>High repair rate</strong>: Percentage where <code class="docutils literal notranslate"><span class="pre">signals.repair.ratio</span> <span class="pre">&gt;</span> <span class="pre">0.3</span></code></p></li>
<li><p><strong>Frustration rate</strong>: Percentage where <code class="docutils literal notranslate"><span class="pre">signals.frustration.severity</span> <span class="pre">&gt;=</span> <span class="pre">2</span></code></p></li>
<li><p><strong>Escalation rate</strong>: Percentage where <code class="docutils literal notranslate"><span class="pre">signals.escalation.requested</span> <span class="pre">=</span> <span class="pre">"true"</span></code></p></li>
<li><p><strong>Looping rate</strong>: Percentage where <code class="docutils literal notranslate"><span class="pre">signals.repetition.count</span> <span class="pre">&gt;=</span> <span class="pre">3</span></code></p></li>
<li><p><strong>Positive feedback rate</strong>: Percentage where <code class="docutils literal notranslate"><span class="pre">signals.positive_feedback.count</span> <span class="pre">&gt;=</span> <span class="pre">1</span></code></p></li>
</ul>
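<p>Most platforms compute these aggregates natively, but the arithmetic can be sketched offline. The example below works over a list of attribute dictionaries standing in for exported spans; the helper names, the nearest-rank percentile method, and the sample data are assumptions for illustration.</p>

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rate(traces, predicate):
    """Share of traces (0.0 to 1.0) for which the predicate holds."""
    if not traces:
        return 0.0
    return sum(1 for t in traces if predicate(t)) / len(traces)


# Made-up span attributes for four traces.
traces = [
    {"signals.turn_count": 3, "signals.repair.ratio": 0.0},
    {"signals.turn_count": 6, "signals.repair.ratio": 0.5},
    {"signals.turn_count": 14, "signals.repair.ratio": 0.4,
     "signals.escalation.requested": "true"},
    {"signals.turn_count": 4, "signals.positive_feedback.count": 1},
]

p95_turns = percentile([t["signals.turn_count"] for t in traces], 95)
high_repair = rate(traces, lambda t: t.get("signals.repair.ratio", 0.0) > 0.3)
escalation = rate(traces, lambda t: t.get("signals.escalation.requested") == "true")
print(p95_turns, high_repair, escalation)
```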
<p><strong>Creating Alerts</strong></p>
<p>Set up alerts based on signal thresholds:</p>
<ul class="simple">
<li><p>Alert when severe interaction count exceeds threshold in 1-hour window</p></li>
<li><p>Alert on sudden spike in frustration rate (>2x baseline)</p></li>
<li><p>Alert when escalation rate exceeds 5% of total conversations</p></li>
<li><p>Alert on degraded efficiency (P95 turn count increases >50%)</p></li>
</ul>
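<p>Alert rules are normally configured in the observability platform itself. To make the thresholds above concrete, here is a sketch that evaluates them against aggregated window statistics. The <code>WindowStats</code> type, the alert names, and the default severe-count threshold of 10 are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Aggregates for one time window (for example, one hour)."""
    severe_count: int
    frustration_rate: float
    escalation_rate: float
    p95_turn_count: float


def check_alerts(current: WindowStats, baseline: WindowStats,
                 severe_threshold: int = 10) -> list:
    """Return the names of alert rules that fire for the current window."""
    alerts = []
    if current.severe_count > severe_threshold:
        alerts.append("severe_count")
    # Spike: frustration rate more than 2x the baseline rate.
    if baseline.frustration_rate > 0 and current.frustration_rate > 2 * baseline.frustration_rate:
        alerts.append("frustration_spike")
    # Escalations above 5% of conversations.
    if current.escalation_rate > 0.05:
        alerts.append("escalation_rate")
    # P95 turn count up by more than 50% over baseline.
    if baseline.p95_turn_count > 0 and current.p95_turn_count > 1.5 * baseline.p95_turn_count:
        alerts.append("efficiency_degraded")
    return alerts


baseline = WindowStats(severe_count=2, frustration_rate=0.04,
                       escalation_rate=0.02, p95_turn_count=8)
current = WindowStats(severe_count=3, frustration_rate=0.10,
                      escalation_rate=0.06, p95_turn_count=13)
fired = check_alerts(current, baseline)
print(fired)
```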
</section>
<section id="best-practices">
<h2>Best Practices</h2>
<p>Start simple:</p>
<ul class="simple">
<li><p>Alert or page on <strong>Severe</strong> sessions (or on spikes in Severe rate)</p></li>
<li><p>Review <strong>Poor</strong> sessions within 24 hours</p></li>
<li><p>Sample <strong>Excellent</strong> sessions as exemplars</p></li>
</ul>
<p>Combine multiple signals to infer failure modes:</p>
<ul class="simple">
<li><p>Misunderstood intent: repair ratio &gt; 30% + excessive turns</p></li>
<li><p>Working well: positive feedback + high efficiency + no frustration</p></li>
</ul>
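<p>The two combinations above can be expressed as simple rules. In this sketch the cutoffs for "excessive turns" (more than 12) and "high efficiency" (at least 0.8) are assumptions chosen for illustration, as are the function name and the dictionary keys.</p>

```python
def infer_failure_mode(s: dict) -> str:
    """Map a combination of signals to a coarse label (hypothetical rules)."""
    # Misunderstood intent: repair ratio above 30% together with excessive turns.
    if s.get("repair_ratio", 0.0) > 0.3 and s.get("turn_count", 0) > 12:
        return "misunderstood_intent"
    # Working well: positive feedback, high efficiency, and no frustration.
    if (s.get("positive_feedback_count", 0) >= 1
            and s.get("efficiency_score", 0.0) >= 0.8
            and s.get("frustration_count", 0) == 0):
        return "working_well"
    return "unclassified"


print(infer_failure_mode({"repair_ratio": 0.4, "turn_count": 15}))
print(infer_failure_mode({"positive_feedback_count": 2, "efficiency_score": 1.0}))
```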
</section>
<section id="limitations-and-considerations">
<h2>Limitations and Considerations</h2>
<p>Signals don’t capture:</p>
<ul class="simple">
<li><p>Task completion / real outcomes</p></li>
<li><p>Factual or domain correctness</p></li>
<li><p>Silent abandonment (user leaves without expressing frustration)</p></li>
<li><p>Non-English nuance (pattern libraries are English-oriented)</p></li>
</ul>
<p>Mitigation strategies:</p>
<ul class="simple">
<li><p>Periodically sample flagged sessions and measure false positives/negatives</p></li>
<li><p>Tune baselines per use case and user population</p></li>
<li><p>Add domain-specific phrase libraries where needed</p></li>
<li><p>Combine signals with non-text metrics (tool failures, disconnects, latency)</p></li>
</ul>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Behavioral signals complement—but do not replace—domain-specific response quality evaluation. Use signals to prioritize which traces to inspect, then apply domain expertise and outcome checks to diagnose root causes.</p>
</div>
<div class="admonition tip">
<p class="admonition-title">Tip</p>
<p>The flag marker in the span name provides instant visual feedback in trace UIs, while the structured attributes (<code class="docutils literal notranslate"><span class="pre">signals.quality</span></code>, <code class="docutils literal notranslate"><span class="pre">signals.frustration.severity</span></code>, etc.) enable powerful querying and aggregation in your observability platform.</p>
</div>
</section>
<section id="see-also">
<h2>See Also</h2>
<ul class="simple">
<li><p><a class="reference internal" href="../guides/observability/tracing.html"><span class="doc">Tracing</span></a> - Distributed tracing for agent systems</p></li>
<li><p><a class="reference internal" href="../guides/observability/monitoring.html"><span class="doc">Monitoring</span></a> - Metrics and dashboards</p></li>