Prerelease cleanup (#46)

* feat: Add const_bound_vars tracking to prevent false positives in ownership checks

* feat: Introduce field interner and typed bounded vars for enhanced type tracking

* feat: Add typed_call_receivers and typed_bounded_dto_fields for enhanced type tracking

* feat: Centralize method name extraction with bare_method_name helper

* feat: Implement Phase-6 hierarchy fan-out for runtime virtual dispatch

* feat: Enhance C++ taint tracking with additional container operations and inline method resolution

* feat: Introduce field-sensitive points-to analysis for enhanced resource tracking

* feat: Implement Pointer-Phase 6 subscript handling for enhanced container analysis

* test: Add comprehensive tests for JavaScript control flow constructs and lattice operations

* docs: Update advanced analysis documentation with field-sensitive points-to and hierarchy fan-out details

* test: Add comprehensive tests for lattice algebra laws and SSA edge cases

* feat: Add destructured session user handling and safe user ID access patterns

* feat: Implement row-population reverse-walk for enhanced authorization checks

* feat: Enhance authorization checks with local alias chain for self-actor types

* feat: Introduce ActiveRecord query safety checks and enhance snippet extraction

* feat: Implement chained method call inner-gate rebinding for SSRF prevention

* feat: Add observability and error modules, enhance debug functionality, and implement theme context

* feat: Remove Auth Analysis page and update navigation to redirect to Explorer

* feat: Optimize SSA lowering by sharing results between taint engine and artifact extractor

* feat: Reset path-safe-suppressed spans before lowering to maintain analysis integrity

* fix(ssa): ungate debug_assert_bfs_ordering for release-tests build

The helper in `src/ssa/lower.rs` was gated on `#[cfg(debug_assertions)]`,
while the unit test at the bottom of the file was gated only on
`#[cfg(test)]`. A release build with `--tests` sets `cfg(test)` but not
`cfg(debug_assertions)`, so the test referenced a helper that did not exist
and `cargo build --release --tests` failed with E0425. Removing the gate
fixes the build; the body consists only of `debug_assert!`, so the helper
costs nothing in release builds. Also drop the gate at the call site to
avoid a `dead_code` warning when the lib is built without `--tests`.
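
The gating mismatch can be reproduced in isolation. The sketch below is illustrative only: `assert_sorted` and `visit_in_order` are hypothetical names, not the real helper or its call site, and the comments assume the default release profile (debug assertions off).

```rust
// Hypothetical stand-in for the real helper. Leaving it ungated means the
// symbol exists in every profile. If it carried `#[cfg(debug_assertions)]`,
// any `#[cfg(test)]` code calling it would fail to resolve the name under
// `cargo build --release --tests` with the default release profile.
fn assert_sorted(ids: &[u32]) {
    // `debug_assert!` is not evaluated when debug assertions are disabled,
    // so this body does no work in a release build.
    debug_assert!(ids.windows(2).all(|w| w[0] <= w[1]), "ids must be sorted");
}

/// Hypothetical call site, also ungated so the helper is never dead code.
pub fn visit_in_order(mut ids: Vec<u32>) -> Vec<u32> {
    ids.sort_unstable();
    assert_sorted(&ids);
    ids
}
```

Keeping both the helper and its caller ungated trades a trivially inlinable empty function in release builds for a build that behaves the same in every profile.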

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(closure-capture): flip JS/TS fixtures to required-finding

The JS and TS closure-capture fixtures pinned the old broken behaviour
via `forbidden_findings: [{ "id_prefix": "taint-" }]`. The engine now
correctly traces taint through the closure boundary (env source captured
by an arrow function, sunk via `child_process.exec` inside the body), so
the formerly-forbidden finding is a true positive.

Match the Python sibling's shape — `required_findings` with
`id_prefix` + `min_count` plus a small `noise_budget` — and rewrite the
companion READMEs and the phase8_fragility_tests doc-comments from
"known gap" to "regression guard".

Verified:
- cargo test --release --test phase8_fragility_tests → 8/8 pass
- cargo test --release --lib bfs_assertion → pass
- corpus benchmark F1 = 0.9976 (TP=205, FP=1, FN=0) — unchanged
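
For orientation, the flip from a forbidden finding to a required one can be sketched as a small checker. Everything here is an assumption made for illustration: the `Expectations` and `Required` types, the `check` function, the sample finding id, and in particular the reading of `noise_budget` as a cap on findings that match no required prefix. None of it is the repository's actual fixture harness.

```rust
/// Hypothetical mirror of one `required_findings` entry.
struct Required {
    id_prefix: &'static str,
    min_count: usize,
}

/// Hypothetical expectations for a single fixture.
struct Expectations {
    required: Vec<Required>,
    forbidden_prefixes: Vec<&'static str>,
    /// Assumed meaning: maximum findings that match no required prefix.
    noise_budget: usize,
}

fn check(exp: &Expectations, finding_ids: &[&str]) -> Result<(), String> {
    // Any finding with a forbidden prefix fails the fixture outright.
    for p in &exp.forbidden_prefixes {
        if let Some(id) = finding_ids.iter().find(|id| id.starts_with(*p)) {
            return Err(format!("forbidden finding present: {id}"));
        }
    }
    // Each required prefix must be matched at least `min_count` times.
    for r in &exp.required {
        let n = finding_ids
            .iter()
            .filter(|id| id.starts_with(r.id_prefix))
            .count();
        if n < r.min_count {
            return Err(format!(
                "expected at least {} finding(s) with prefix {}, got {}",
                r.min_count, r.id_prefix, n
            ));
        }
    }
    // Findings outside every required prefix count against the budget.
    let noise = finding_ids
        .iter()
        .filter(|id| !exp.required.iter().any(|r| id.starts_with(r.id_prefix)))
        .count();
    if noise > exp.noise_budget {
        return Err(format!("noise {} exceeds budget {}", noise, exp.noise_budget));
    }
    Ok(())
}
```

Under this reading, a fixture that used to pass only when no `taint-` finding appeared now fails unless at least one appears, which is what turns it into a regression guard.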

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: Add OWASP mapping and baseline mutation hooks for enhanced security analysis

* feat: Introduce health module and enhance health score computation with calibration tests

* feat: Add expectations configuration and cleanup .gitignore for log files

* feat: Implement theme selection and enhance settings panel for triage sync

* feat: Suppress false positives for strcpy calls with literal sources in AST

* feat: Update analyse_function_ssa to return body CFG for accurate analysis

* feat: Add bug report and feature request templates for improved issue tracking

* feat: removed dev scripts

* feat: update README.md for clarity and consistency in fixture descriptions

* feat: removed dev docs

* feat: clean up error handling and UI elements for improved user experience

* feat: adjust button sizes in HeaderBar for better UI consistency

* feat: enhance taint analysis with additional context for sanitizer and taint findings

* cargo fmt

* prettier

* refactor: simplify conditional checks and improve code readability in AST and screenshot capture scripts

* feat: add script to frame PNG screenshots with brand gradient

* feat: add fuzzing support with new targets and CI workflows

* refactor: streamline match expressions and improve formatting in CLI and output handling

* feat: enhance configuration display with detailed output options

* feat: stage demo configuration for improved CLI screenshot output

* feat: expose merge_configs function for user-configurable settings

* refactor: simplify code structure and improve readability in config handling

* refactor: improve descriptions for vulnerability patterns in various languages

* feat: update MIT License section with additional usage details and copyright information

* feat: update screenshots

* refactor: update build process and paths for frontend assets

* feat: add cross-file taint fuzzing target and supporting dictionary

* refactor: clean up formatting and comments in fuzz configuration and example files

* refactor: remove outdated comments and clean up CI configuration files

* chore: update changelog dates and improve formatting in documentation

* refactor: update Cargo.toml and CI configuration for improved packaging and build process

* refactor: enhance quote-stripping logic to prevent panics and add regression tests

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eli Peter 2026-04-29 00:58:38 -04:00 committed by GitHub
parent 79c29b394d
commit 82f18184b1
GPG key ID: B5690EEEBB952194
348 changed files with 48731 additions and 2925 deletions

@ -1,4 +1,6 @@
use crate::server::jobs::JobManager;
use crate::server::models::{FilterValues, FindingSummary, FindingView};
use crate::server::observability;
use crate::server::progress::TimingBreakdown;
use crate::server::routes;
use crate::server::security::LocalServerSecurity;
@ -41,6 +43,21 @@ pub enum ServerEvent {
ConfigChanged,
}
/// Pre-computed views over the latest scan's findings.
///
/// Built once per completed scan and reused across `/findings`,
/// `/findings/summary`, `/findings/filters`, and `/overview` requests so we
/// don't re-walk the diag list (or re-deserialize from SQLite) on every hit.
/// The `job_id` lets readers detect a stale entry without holding a write
/// lock on hot paths.
#[derive(Debug, Clone)]
pub struct CachedFindings {
pub job_id: String,
pub views: Arc<Vec<FindingView>>,
pub summary: Arc<FindingSummary>,
pub filters: Arc<FilterValues>,
}
/// Shared application state accessible to all route handlers.
#[derive(Clone)]
pub struct AppState {
@ -52,6 +69,7 @@ pub struct AppState {
pub job_manager: Arc<JobManager>,
pub event_tx: broadcast::Sender<ServerEvent>,
pub db_pool: Option<Arc<Pool<SqliteConnectionManager>>>,
pub findings_cache: Arc<RwLock<Option<CachedFindings>>>,
}
/// 50 MiB cap on request bodies — generous for config uploads, tight
@ -83,6 +101,7 @@ pub fn build_router(state: AppState) -> Router {
security,
crate::server::security::guard_requests,
))
.layer(middleware::from_fn(observability::observe))
.layer(CompressionLayer::new())
.layer(SetResponseHeaderLayer::overriding(
HeaderName::from_static("x-frame-options"),
@ -124,6 +143,7 @@ mod tests {
job_manager: Arc::new(JobManager::new(4, 8 * 1024 * 1024)),
event_tx,
db_pool: None,
findings_cache: Arc::new(RwLock::new(None)),
}
}
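
The doc comment on `CachedFindings` describes readers detecting a stale entry through `job_id` without taking a write lock on the hot path. A minimal sketch of that read-then-rebuild pattern follows, under stated assumptions: it uses `std::sync::RwLock` (the diff does not show which `RwLock` the server uses), and `CachedViews`, `Cache`, and `views_for` are hypothetical simplifications, not the project's route code.

```rust
use std::sync::{Arc, RwLock};

/// Simplified, hypothetical stand-in for `CachedFindings`.
#[derive(Debug, Clone)]
struct CachedViews {
    job_id: String,
    views: Arc<Vec<String>>,
}

type Cache = Arc<RwLock<Option<CachedViews>>>;

/// Return the cached views for `latest_job_id`, rebuilding only when the
/// cache is empty or belongs to an older job.
fn views_for(
    cache: &Cache,
    latest_job_id: &str,
    rebuild: impl FnOnce() -> Vec<String>,
) -> Arc<Vec<String>> {
    // Fast path: a shared read lock is enough to compare job ids.
    {
        let guard = cache.read().unwrap();
        if let Some(entry) = guard.as_ref() {
            if entry.job_id == latest_job_id {
                return Arc::clone(&entry.views);
            }
        }
    } // read guard dropped here, before the write lock is taken

    // Slow path: build outside the lock, then publish under a write lock.
    let views = Arc::new(rebuild());
    *cache.write().unwrap() = Some(CachedViews {
        job_id: latest_job_id.to_string(),
        views: Arc::clone(&views),
    });
    views
}
```

Two requests racing on the slow path may both rebuild; in this sketch the last writer wins, which is harmless because both results describe the same job.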

@ -6,11 +6,17 @@
//! analysis pipeline on a single file/function for debug inspection.
use crate::ast::build_cfg_for_file;
use crate::auth_analysis::model::{
AnalysisUnit, AuthCheck, AuthorizationModel, CallSite, RouteRegistration, SensitiveOperation,
ValueRef,
};
use crate::callgraph::{CallGraph, CallGraphAnalysis};
use crate::cfg::{Cfg, EdgeKind, FileCfg, FuncSummaries, StmtKind};
use crate::constraint::{CompOp, ConditionExpr, ConstValue, Operand};
use crate::labels::{Cap, DataLabel};
use crate::pointer::{AbsLoc, PointsToFacts};
use crate::ssa::ir::*;
use crate::ssa::type_facts::{TypeFactResult, TypeKind};
use crate::ssa::{self, OptimizeResult};
use crate::state::symbol::SymbolInterner;
use crate::summary::GlobalSummaries;
@ -100,6 +106,13 @@ fn label_str(l: &DataLabel) -> String {
pub struct FunctionInfo {
pub name: String,
pub namespace: String,
/// Enclosing container path (class / impl / module / outer function).
/// Empty for free top-level functions. Surfaced so the UI can render
/// closures as `<anon#N> [in outer_fn]`.
pub container: String,
/// Structural [`crate::symbol::FuncKind`] slug (`"fn"`, `"method"`,
/// `"closure"`, ...). Lets the UI offer a closure-filter toggle.
pub func_kind: String,
pub param_count: usize,
pub line: usize,
pub source_caps: Vec<String>,
@ -298,6 +311,7 @@ fn op_view(op: &SsaOp) -> (String, Vec<String>) {
callee,
args,
receiver,
..
} => {
let mut ops = Vec::new();
if let Some(rv) = receiver {
@ -320,6 +334,18 @@ fn op_view(op: &SsaOp) -> (String, Vec<String>) {
SsaOp::CatchParam => ("CatchParam".into(), vec![]),
SsaOp::Nop => ("Nop".into(), vec![]),
SsaOp::Undef => ("Undef".into(), vec![]),
// FieldProj prints field-id (resolution to name requires the
// owning SsaBody, which the serializer does not have here).
// Debug consumers walk to the owning body when the name matters.
SsaOp::FieldProj {
receiver, field, ..
} => (
"FieldProj".into(),
vec![
format!("recv=v{}", receiver.0),
format!("field={}", field.0),
],
),
}
}
@ -753,6 +779,13 @@ pub struct FuncSummaryView {
pub file_path: String,
pub lang: String,
pub namespace: String,
/// Enclosing container path (class / impl / module / outer function).
/// Empty for free top-level functions.
pub container: String,
/// Structural [`crate::symbol::FuncKind`] slug — `"fn"`, `"method"`,
/// `"closure"`, etc. Lets the UI distinguish anonymous closures from
/// named functions for filtering.
pub func_kind: String,
pub arity: Option<usize>,
pub param_count: usize,
pub source_caps: Vec<String>,
@ -832,6 +865,8 @@ impl FuncSummaryView {
file_path: summary.file_path.clone(),
lang: format!("{:?}", key.lang),
namespace: key.namespace.clone(),
container: key.container.clone(),
func_kind: key.kind.as_str().to_string(),
arity: key.arity,
param_count: summary.param_count,
source_caps: cap_names(Cap::from_bits_truncate(summary.source_caps)),
@ -864,6 +899,480 @@ fn transform_str(t: &TaintTransform) -> String {
}
}
// ── Pointer / Points-to ──────────────────────────────────────────────────────
#[derive(Debug, Serialize)]
pub struct PointerLocationView {
pub id: u32,
pub kind: String,
pub display: String,
/// Parent location id for `Field { parent, field }` chains.
#[serde(skip_serializing_if = "Option::is_none")]
pub parent: Option<u32>,
#[serde(skip_serializing_if = "Option::is_none")]
pub field: Option<String>,
}
#[derive(Debug, Serialize)]
pub struct PointerValueView {
pub ssa_value: u32,
pub var_name: Option<String>,
/// `LocId`s referencing entries in [`PointerView::locations`].
pub points_to: Vec<u32>,
pub is_top: bool,
}
#[derive(Debug, Serialize)]
pub struct PointerFieldEntryView {
/// Parameter index, or `null` for the implicit receiver.
pub param_index: Option<u32>,
pub field: String,
}
#[derive(Debug, Serialize)]
pub struct PointerView {
pub locations: Vec<PointerLocationView>,
pub values: Vec<PointerValueView>,
/// Field reads attributed to params/receiver via the field-points-to
/// extractor (Phase 5).
pub field_reads: Vec<PointerFieldEntryView>,
/// Field writes attributed to params/receiver via the field-points-to
/// extractor (Phase 5).
pub field_writes: Vec<PointerFieldEntryView>,
/// Number of distinct interned locations beyond the reserved Top sentinel.
pub location_count: usize,
}
impl PointerView {
pub fn from_facts(facts: &PointsToFacts, ssa: &SsaBody) -> Self {
// Determine which LocIds are referenced by any pt set so we only
// emit those (plus Top when referenced).
let mut referenced: std::collections::BTreeSet<u32> = std::collections::BTreeSet::new();
for v in 0..ssa.num_values() as u32 {
let set = facts.pt(SsaValue(v));
for loc in set.iter() {
referenced.insert(loc.0);
}
}
// Build location views in interner order so parent ids land before
// child Field locations.
let mut locations: Vec<PointerLocationView> = Vec::new();
for raw_id in 0..facts.interner.len() as u32 {
if !referenced.contains(&raw_id) {
continue;
}
let loc_id = crate::pointer::LocId(raw_id);
let abs = facts.interner.resolve(loc_id);
let (kind, display, parent, field) = match abs {
AbsLoc::Top => ("Top".to_string(), "".to_string(), None, None),
AbsLoc::Alloc(_, ssa_v) => {
("Alloc".to_string(), format!("alloc#v{}", ssa_v), None, None)
}
AbsLoc::Param(_, idx) => {
("Param".to_string(), format!("param[{}]", idx), None, None)
}
AbsLoc::SelfParam(_) => ("SelfParam".to_string(), "self".to_string(), None, None),
AbsLoc::Field { parent, field } => {
let field_name = if *field == FieldId::ELEM {
"<elem>".to_string()
} else if (field.0 as usize) < ssa.field_interner.len() {
ssa.field_interner.resolve(*field).to_string()
} else {
format!("#{}", field.0)
};
(
"Field".to_string(),
format!(".{}", field_name),
Some(parent.0),
Some(field_name),
)
}
};
locations.push(PointerLocationView {
id: raw_id,
kind,
display,
parent,
field,
});
}
// Per-value pt sets — emit only values with non-empty sets to keep
// the payload focused on interesting facts.
let mut values: Vec<PointerValueView> = Vec::new();
for v in 0..ssa.num_values() as u32 {
let set = facts.pt(SsaValue(v));
if set.is_empty() {
continue;
}
values.push(PointerValueView {
ssa_value: v,
var_name: ssa
.value_defs
.get(v as usize)
.and_then(|d| d.var_name.clone()),
points_to: set.iter().map(|loc| loc.0).collect(),
is_top: set.is_top(),
});
}
// Field reads / writes summary derived from the body + facts.
let summary = crate::pointer::extract_field_points_to(ssa, facts);
let to_field_entries = |entries: &[(u32, smallvec::SmallVec<[String; 2]>)]| {
entries
.iter()
.flat_map(|(idx, fields)| {
let pi = if *idx == u32::MAX { None } else { Some(*idx) };
fields.iter().map(move |f| PointerFieldEntryView {
param_index: pi,
field: f.clone(),
})
})
.collect()
};
let field_reads = to_field_entries(&summary.param_field_reads);
let field_writes = to_field_entries(&summary.param_field_writes);
let location_count = facts.interner.len().saturating_sub(1);
PointerView {
locations,
values,
field_reads,
field_writes,
location_count,
}
}
}
// ── Type Facts (standalone view) ─────────────────────────────────────────────
#[derive(Debug, Serialize)]
pub struct DtoFieldView {
pub name: String,
pub kind: String,
}
#[derive(Debug, Serialize)]
pub struct DtoFactView {
pub class_name: String,
pub fields: Vec<DtoFieldView>,
}
#[derive(Debug, Serialize)]
pub struct TypeFactDetailView {
pub ssa_value: u32,
pub var_name: Option<String>,
pub line: usize,
/// Type kind tag — matches the [`TypeKind`] discriminant
/// (`String`, `Int`, `HttpClient`, `Dto`, …).
pub kind: String,
/// True when the value is allowed to be null/None.
pub nullable: bool,
/// Container/class name — set for `HttpClient`, `DatabaseConnection`,
/// `Dto`, etc. Mirrors [`TypeKind::container_name`].
#[serde(skip_serializing_if = "Option::is_none")]
pub container: Option<String>,
/// DTO field shape, populated only when `kind == "Dto"`.
#[serde(skip_serializing_if = "Option::is_none")]
pub dto: Option<DtoFactView>,
}
#[derive(Debug, Serialize)]
pub struct TypeFactsView {
pub facts: Vec<TypeFactDetailView>,
/// Total count of values reaching the analysis (for the "X of Y" header).
pub total_values: usize,
/// Count of values where the inferred type is `Unknown`. Surfaced so
/// the UI can show coverage at a glance.
pub unknown_count: usize,
}
impl TypeFactsView {
pub fn from_optimize(opt: &OptimizeResult, ssa: &SsaBody, bytes: &[u8]) -> Self {
Self::from_type_facts(&opt.type_facts, ssa, bytes)
}
pub fn from_type_facts(tf: &TypeFactResult, ssa: &SsaBody, bytes: &[u8]) -> Self {
let total_values = ssa.num_values();
let unknown_count = tf
.facts
.values()
.filter(|f| matches!(f.kind, TypeKind::Unknown))
.count();
let mut facts: Vec<TypeFactDetailView> = tf
.facts
.iter()
.filter(|(_, f)| !matches!(f.kind, TypeKind::Unknown))
.map(|(sv, fact)| {
// Find the defining instruction for this SSA value so we can
// resolve its source line. Falls back to 0 when no inst
// matches (the value lives only in `value_defs`).
let span: (usize, usize) = ssa
.blocks
.iter()
.find_map(|blk| {
blk.phis
.iter()
.chain(blk.body.iter())
.find(|i| i.value == *sv)
.map(|i| i.span)
})
.unwrap_or_default();
let line = byte_offset_to_line(bytes, span.0);
let dto = match &fact.kind {
TypeKind::Dto(d) => Some(DtoFactView {
class_name: d.class_name.clone(),
fields: d
.fields
.iter()
.map(|(name, k)| DtoFieldView {
name: name.clone(),
kind: type_kind_tag(k),
})
.collect(),
}),
_ => None,
};
TypeFactDetailView {
ssa_value: sv.0,
var_name: ssa
.value_defs
.get(sv.0 as usize)
.and_then(|d| d.var_name.clone()),
line,
kind: type_kind_tag(&fact.kind),
nullable: fact.nullable,
container: fact.kind.container_name(),
dto,
}
})
.collect();
facts.sort_by_key(|v| v.ssa_value);
TypeFactsView {
facts,
total_values,
unknown_count,
}
}
}
/// Stable string tag for a [`TypeKind`] (used by both the TypeFacts view
/// and DTO field rendering). Uses the variant name so the UI can map
/// each tag to a colour without parsing free-form `Debug` strings.
fn type_kind_tag(k: &TypeKind) -> String {
match k {
TypeKind::String => "String".into(),
TypeKind::Int => "Int".into(),
TypeKind::Bool => "Bool".into(),
TypeKind::Object => "Object".into(),
TypeKind::Array => "Array".into(),
TypeKind::Null => "Null".into(),
TypeKind::Unknown => "Unknown".into(),
TypeKind::HttpResponse => "HttpResponse".into(),
TypeKind::DatabaseConnection => "DatabaseConnection".into(),
TypeKind::FileHandle => "FileHandle".into(),
TypeKind::Url => "Url".into(),
TypeKind::HttpClient => "HttpClient".into(),
TypeKind::LocalCollection => "LocalCollection".into(),
TypeKind::Dto(_) => "Dto".into(),
}
}
// ── Auth Analysis ────────────────────────────────────────────────────────────
#[derive(Debug, Serialize)]
pub struct AuthValueRefView {
pub source_kind: String,
pub name: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub base: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub field: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub index: Option<String>,
pub line: usize,
}
#[derive(Debug, Serialize)]
pub struct AuthCheckView {
pub kind: String,
pub callee: String,
pub line: usize,
pub subjects: Vec<AuthValueRefView>,
pub args: Vec<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub condition_text: Option<String>,
}
#[derive(Debug, Serialize)]
pub struct AuthOperationView {
pub kind: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub sink_class: Option<String>,
pub callee: String,
pub line: usize,
pub text: String,
pub subjects: Vec<AuthValueRefView>,
}
#[derive(Debug, Serialize)]
pub struct AuthCallSiteView {
pub name: String,
pub line: usize,
pub args: Vec<String>,
}
#[derive(Debug, Serialize)]
pub struct AuthUnitView {
pub kind: String,
pub name: Option<String>,
pub line: usize,
pub params: Vec<String>,
pub auth_checks: Vec<AuthCheckView>,
pub operations: Vec<AuthOperationView>,
pub call_sites: Vec<AuthCallSiteView>,
pub self_actor_vars: Vec<String>,
pub typed_bounded_vars: Vec<String>,
pub authorized_sql_vars: Vec<String>,
pub const_bound_vars: Vec<String>,
}
#[derive(Debug, Serialize)]
pub struct AuthRouteView {
pub framework: String,
pub method: String,
pub path: String,
pub middleware: Vec<String>,
pub handler_params: Vec<String>,
pub line: usize,
pub unit_idx: usize,
}
#[derive(Debug, Serialize)]
pub struct AuthAnalysisView {
pub routes: Vec<AuthRouteView>,
pub units: Vec<AuthUnitView>,
/// Whether the auth-analysis rule set is enabled for the file's
/// language. When `false`, the model is intentionally empty and the
/// UI should surface that the analysis is skipped (not failing).
pub enabled: bool,
}
impl AuthAnalysisView {
pub fn from_model(model: &AuthorizationModel, bytes: &[u8], enabled: bool) -> Self {
let routes = model.routes.iter().map(|r| route_view(r, bytes)).collect();
let units = model.units.iter().map(|u| unit_view(u, bytes)).collect();
AuthAnalysisView {
routes,
units,
enabled,
}
}
}
fn value_ref_view(vr: &ValueRef, bytes: &[u8]) -> AuthValueRefView {
AuthValueRefView {
source_kind: format!("{:?}", vr.source_kind),
name: vr.name.clone(),
base: vr.base.clone(),
field: vr.field.clone(),
index: vr.index.clone(),
line: byte_offset_to_line(bytes, vr.span.0),
}
}
fn auth_check_view(c: &AuthCheck, bytes: &[u8]) -> AuthCheckView {
AuthCheckView {
kind: format!("{:?}", c.kind),
callee: c.callee.clone(),
line: c.line,
subjects: c
.subjects
.iter()
.map(|s| value_ref_view(s, bytes))
.collect(),
args: c.args.clone(),
condition_text: c.condition_text.clone(),
}
}
fn operation_view(op: &SensitiveOperation, bytes: &[u8]) -> AuthOperationView {
AuthOperationView {
kind: format!("{:?}", op.kind),
sink_class: op.sink_class.map(|c| format!("{:?}", c)),
callee: op.callee.clone(),
line: op.line,
text: op.text.clone(),
subjects: op
.subjects
.iter()
.map(|s| value_ref_view(s, bytes))
.collect(),
}
}
fn call_site_view(c: &CallSite, bytes: &[u8]) -> AuthCallSiteView {
AuthCallSiteView {
name: c.name.clone(),
line: byte_offset_to_line(bytes, c.span.0),
args: c.args.clone(),
}
}
fn unit_view(unit: &AnalysisUnit, bytes: &[u8]) -> AuthUnitView {
let mut self_actor_vars: Vec<String> = unit.self_actor_vars.iter().cloned().collect();
self_actor_vars.sort();
let mut typed_bounded_vars: Vec<String> = unit.typed_bounded_vars.iter().cloned().collect();
typed_bounded_vars.sort();
let mut authorized_sql_vars: Vec<String> = unit.authorized_sql_vars.iter().cloned().collect();
authorized_sql_vars.sort();
let mut const_bound_vars: Vec<String> = unit.const_bound_vars.iter().cloned().collect();
const_bound_vars.sort();
AuthUnitView {
kind: format!("{:?}", unit.kind),
name: unit.name.clone(),
line: unit.line,
params: unit.params.clone(),
auth_checks: unit
.auth_checks
.iter()
.map(|c| auth_check_view(c, bytes))
.collect(),
operations: unit
.operations
.iter()
.map(|op| operation_view(op, bytes))
.collect(),
call_sites: unit
.call_sites
.iter()
.map(|c| call_site_view(c, bytes))
.collect(),
self_actor_vars,
typed_bounded_vars,
authorized_sql_vars,
const_bound_vars,
}
}
fn route_view(r: &RouteRegistration, _bytes: &[u8]) -> AuthRouteView {
AuthRouteView {
framework: format!("{:?}", r.framework),
method: format!("{:?}", r.method),
path: r.path.clone(),
middleware: r.middleware.clone(),
handler_params: r.handler_params.clone(),
line: r.line,
unit_idx: r.unit_idx,
}
}
// ═════════════════════════════════════════════════════════════════════════════
// On-demand analysis pipeline
// ═════════════════════════════════════════════════════════════════════════════
@ -914,6 +1423,8 @@ pub fn function_list(analysis: &FileAnalysis) -> Vec<FunctionInfo> {
.map(|(key, summary)| FunctionInfo {
name: key.name.clone(),
namespace: key.namespace.clone(),
container: key.container.clone(),
func_kind: key.kind.as_str().to_string(),
param_count: summary.param_count,
line: byte_offset_to_line(&analysis.bytes, analysis.cfg()[summary.entry].ast.span.0),
source_caps: cap_names(summary.source_caps),
@ -924,10 +1435,16 @@ pub fn function_list(analysis: &FileAnalysis) -> Vec<FunctionInfo> {
}
/// Lower a single function to SSA and optimize it.
-pub fn analyse_function_ssa(
-analysis: &FileAnalysis,
+///
+/// Returns the per-function body graph alongside the SSA. SSA is lowered
+/// against `body.graph`, whose `NodeIndex` space is body-local — the file's
+/// top-level CFG (`analysis.cfg()`) has a different index space, so any
+/// downstream analysis that indexes by `inst.cfg_node` must use the returned
+/// `&Cfg`, not `analysis.cfg()`.
+pub fn analyse_function_ssa<'a>(
+analysis: &'a FileAnalysis,
func_name: &str,
-) -> Result<(SsaBody, OptimizeResult), StatusCode> {
+) -> Result<(SsaBody, OptimizeResult, &'a Cfg), StatusCode> {
// Find the function body by name from the per-body CFGs.
let body = analysis
.file_cfg
@ -945,9 +1462,48 @@ pub fn analyse_function_ssa(
);
let mut ssa = ssa_result.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
-let opt = ssa::optimize_ssa(&mut ssa, &body.graph, Some(analysis.lang));
+let opt = ssa::optimize_ssa_with_param_types(
+&mut ssa,
+&body.graph,
+Some(analysis.lang),
+&body.meta.param_types,
+);
-Ok((ssa, opt))
+Ok((ssa, opt, &body.graph))
}
/// Lower a function and run the field-sensitive Steensgaard pointer
/// analysis on its body. Returns the SSA body alongside the resulting
/// [`PointsToFacts`] so the debug view can attribute names to SSA values.
pub fn analyse_function_pointer(
analysis: &FileAnalysis,
func_name: &str,
) -> Result<(SsaBody, PointsToFacts), StatusCode> {
let body = analysis
.file_cfg
.bodies
.iter()
.find(|b| b.meta.name.as_deref() == Some(func_name))
.ok_or(StatusCode::NOT_FOUND)?;
let ssa_result = crate::ssa::lower::lower_to_ssa_with_params(
&body.graph,
body.entry,
Some(func_name),
false,
&body.meta.params,
);
let mut ssa = ssa_result.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let _opt = ssa::optimize_ssa_with_param_types(
&mut ssa,
&body.graph,
Some(analysis.lang),
&body.meta.param_types,
);
let facts = crate::pointer::analyse_body(&ssa, body.meta.id);
Ok((ssa, facts))
}
/// Run taint analysis on a function's SSA body.
@ -999,6 +1555,7 @@ pub fn analyse_function_taint(
static_map: None,
auto_seed_handler_params: matches!(lang, Lang::JavaScript | Lang::TypeScript),
cross_file_bodies: global_summaries.and_then(|gs| gs.bodies_by_key()),
pointer_facts: None,
};
crate::taint::ssa_transfer::run_ssa_taint_full_with_exits(ssa, cfg, &transfer)
@ -1078,6 +1635,31 @@ pub fn analyse_file_summaries(
Ok(global)
}
/// Run the file-level authorization extraction pipeline for the debug UI.
///
/// Returns the structured `AuthorizationModel` (routes, units, sensitive
/// operations, auth checks) plus the file bytes and an `enabled` flag —
/// the bytes drive line-number resolution in the view, and `enabled`
/// surfaces "auth analysis is off for this language" without conflating
/// it with an empty result.
pub fn analyse_file_auth(
file_path: &Path,
config: &Config,
) -> Result<(AuthorizationModel, Vec<u8>, bool), StatusCode> {
let bytes = std::fs::read(file_path).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let model = crate::ast::extract_auth_model_for_debug(file_path, config)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?
.ok_or(StatusCode::BAD_REQUEST)?;
// Determine whether the auth rules were actually enabled for this
// file's language — `extract_auth_model_for_debug` returns an empty
// model both when the rules are disabled and when the file just
// happens to have no routes. The view distinguishes the two so the
// UI can show "analysis disabled" instead of "no routes found".
let lang_slug = crate::ast::lang_slug_for_path(file_path).unwrap_or("");
let rules = crate::auth_analysis::config::build_auth_rules(config, lang_slug);
Ok((model, bytes, rules.enabled))
}
/// Format a `ConditionExpr` as a human-readable string.
fn format_condition_expr(cond: &ConditionExpr) -> String {
match cond {
@ -1150,7 +1732,7 @@ function demo() {
let config = Config::default();
let analysis = analyse_file(&path, &config).expect("file should analyse");
-let (ssa, opt) =
+let (ssa, opt, _cfg) =
analyse_function_ssa(&analysis, "demo").expect("function should lower to SSA");
let body = analysis
.file_cfg
@ -1205,7 +1787,7 @@ function sink() {
let config = Config::default();
let analysis = analyse_file(&path, &config).expect("file should analyse");
-let (ssa, opt) =
+let (ssa, opt, _cfg) =
analyse_function_ssa(&analysis, "sink").expect("function should lower to SSA");
let body = analysis
.file_cfg
@ -1249,7 +1831,7 @@ function consume() {
let config = Config::default();
let analysis = analyse_file(&path, &config).expect("file should analyse");
-let (ssa, opt) =
+let (ssa, opt, _cfg) =
analyse_function_ssa(&analysis, "consume").expect("function should lower to SSA");
let body = analysis
.file_cfg
@ -1287,7 +1869,9 @@ function consume() {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
},
);
@ -1373,4 +1957,249 @@ async function recentAuditLogs() {
"sibling function nodes should not appear in writeAuditLog view"
);
}
#[test]
fn pointer_view_serializes_synthetic_facts() {
// The Steensgaard analyser is exercised against synthetic SSA
// bodies in `src/pointer/analysis.rs` because real-world
// lowering can yield bodies whose Param ops have been folded
// away. Here we just pin the view-model wiring: feeding the
// serialiser an SsaBody with one SelfParam + one FieldProj
// produces non-empty locations / values / field_reads sections.
use crate::cfg::BodyId;
use crate::pointer::analyse_body;
use crate::ssa::ir::{
BlockId, FieldInterner, SsaBlock, SsaBody, SsaInst, SsaOp, SsaValue, Terminator,
ValueDef,
};
use petgraph::graph::NodeIndex;
use smallvec::SmallVec;
let mut field_interner = FieldInterner::new();
let mu = field_interner.intern("mu");
let v_self = SsaValue(0);
let v_field = SsaValue(1);
let value_defs = vec![
ValueDef {
var_name: Some("c".into()),
cfg_node: NodeIndex::new(0),
block: BlockId(0),
},
ValueDef {
var_name: Some("c.mu".into()),
cfg_node: NodeIndex::new(0),
block: BlockId(0),
},
];
let body = SsaBody {
blocks: vec![SsaBlock {
id: BlockId(0),
phis: vec![],
body: vec![
SsaInst {
value: v_self,
op: SsaOp::SelfParam,
cfg_node: NodeIndex::new(0),
var_name: Some("c".into()),
span: (0, 0),
},
SsaInst {
value: v_field,
op: SsaOp::FieldProj {
receiver: v_self,
field: mu,
projected_type: None,
},
cfg_node: NodeIndex::new(0),
var_name: Some("c.mu".into()),
span: (0, 0),
},
],
terminator: Terminator::Return(None),
preds: SmallVec::new(),
succs: SmallVec::new(),
}],
entry: BlockId(0),
value_defs,
cfg_node_map: std::collections::HashMap::new(),
exception_edges: vec![],
field_interner,
field_writes: std::collections::HashMap::new(),
};
let facts = analyse_body(&body, BodyId(0));
let view = PointerView::from_facts(&facts, &body);
assert!(
view.location_count > 0,
"synthetic body should produce at least one location"
);
assert!(
view.locations.iter().any(|l| l.kind == "SelfParam"),
"expected a SelfParam location in the serialised view"
);
assert!(
view.locations.iter().any(|l| l.kind == "Field"),
"expected a Field location in the serialised view"
);
assert!(
view.field_reads.iter().any(|e| e.field == "mu"),
"expected a `mu` field read; got {:?}",
view.field_reads,
);
}
/// Regression: `analyse_function_ssa` lowers SSA against `body.graph`
/// (per-function NodeIndex space). Routes used to pass `analysis.cfg()`
/// (the file's top-level CFG) to `analyse_function_taint`, which made
/// every `cfg[inst.cfg_node]` lookup index a foreign graph and panicked
/// with `index out of bounds` on any non-toplevel function whose body
/// had more nodes than the toplevel. Reproduce: a small Rust file with
/// a few top-level items and a `main` whose body branches enough to
/// allocate body-local NodeIndex values past the toplevel's count.
#[test]
fn taint_route_uses_per_function_cfg_for_index_lookups() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("docgen_like.rs");
std::fs::write(
&path,
r#"
use std::env;
use std::fs;
const BEGIN_MARKER: &str = "<!-- BEGIN -->";
const END_MARKER: &str = "<!-- END -->";
fn main() {
let args: Vec<String> = env::args().collect();
let target = args.get(1).cloned().unwrap_or_else(|| "x".to_string());
let original = match fs::read_to_string(&target) {
Ok(s) => s,
Err(_) => return,
};
let begin = match original.find(BEGIN_MARKER) {
Some(i) => i,
None => return,
};
let end = match original.find(END_MARKER) {
Some(i) => i,
None => return,
};
if end < begin {
return;
}
let _ = fs::write(&target, &original);
}
"#,
)
.unwrap();
let config = Config::default();
let analysis = analyse_file(&path, &config).expect("file should analyse");
let (ssa, opt, body_cfg) =
analyse_function_ssa(&analysis, "main").expect("function should lower to SSA");
// Sanity check that this fixture exercises the bug shape: main's body
// graph must have more nodes than the file's top-level CFG, so a
// mistaken `analysis.cfg()` would panic on `cfg[inst.cfg_node]`.
assert!(
body_cfg.node_count() > analysis.cfg().node_count(),
"fixture must have more body nodes than toplevel nodes to exercise the bug"
);
// Must not panic. Pre-fix this would `index out of bounds` inside
// `transfer_inst` because the SSA was lowered against `body_cfg` but
// the engine was given `analysis.cfg()`.
let _ = analyse_function_taint(
&ssa,
body_cfg,
analysis.lang,
analysis.summaries(),
None,
&opt,
);
// Belt-and-suspenders: assert that calling with the wrong (top-level)
// CFG would have panicked. We can't catch the panic across rayon
// worker threads here, but we can confirm at least one `inst.cfg_node`
// index lies outside `analysis.cfg()`'s range — that's what triggers
// the OOB indexing inside `transfer_inst`.
let toplevel_count = analysis.cfg().node_count();
let max_inst_idx = ssa
.blocks
.iter()
.flat_map(|b| b.phis.iter().chain(b.body.iter()))
.map(|inst| inst.cfg_node.index())
.max()
.unwrap_or(0);
assert!(
max_inst_idx >= toplevel_count,
"regression: at least one inst.cfg_node ({max_inst_idx}) must be >= the \
toplevel CFG node count ({toplevel_count}) for this test to exercise the bug"
);
}
#[test]
fn type_facts_view_groups_security_types() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("h.java");
std::fs::write(
&path,
r#"
import java.net.http.HttpClient;
public class Demo {
public void run() {
HttpClient c = HttpClient.newHttpClient();
c.send(null, null);
}
}
"#,
)
.unwrap();
let config = Config::default();
let analysis = analyse_file(&path, &config).expect("file should analyse");
let (ssa, opt, _cfg) = analyse_function_ssa(&analysis, "run").expect("ssa should lower");
let view = TypeFactsView::from_optimize(&opt, &ssa, &analysis.bytes);
assert!(
view.facts.iter().any(|f| f.kind == "HttpClient"),
"expected HttpClient inference for `c = HttpClient.newHttpClient()`; got {:?}",
view.facts.iter().map(|f| &f.kind).collect::<Vec<_>>(),
);
}
#[test]
fn auth_view_renders_routes_for_express_handlers() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("app.js");
std::fs::write(
&path,
r#"
const express = require('express');
const app = express();
app.get('/api/users/:id', (req, res) => {
db.query('SELECT * FROM users WHERE id=$1', [req.params.id]);
});
"#,
)
.unwrap();
let config = Config::default();
let (model, bytes, enabled) =
analyse_file_auth(&path, &config).expect("auth analysis should run");
assert!(enabled, "auth analysis should be enabled for JavaScript");
let view = AuthAnalysisView::from_model(&model, &bytes, enabled);
assert!(view.enabled);
assert!(
view.routes.iter().any(|r| r.path.contains("/api/users")),
"expected the express GET route to surface; got {:?}",
view.routes.iter().map(|r| &r.path).collect::<Vec<_>>(),
);
assert!(
!view.units.is_empty(),
"expected at least one analysis unit for the handler"
);
}
}

src/server/error.rs Normal file

@@ -0,0 +1,182 @@
//! Unified error type for HTTP route handlers.
//!
//! All routes should return [`ApiResult<T>`] (an alias for `Result<T, ApiError>`).
//! `ApiError` serializes as `{ "error": "<human msg>", "code": "<machine code>",
//! "detail"?: ... }` and carries the HTTP status code through `IntoResponse`.
use axum::Json;
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use serde::Serialize;
use serde_json::Value;
/// Machine-readable error codes. Stable strings the frontend can branch on.
#[derive(Debug, Clone, Copy)]
pub enum ApiCode {
BadRequest,
Forbidden,
NotFound,
Conflict,
PayloadTooLarge,
Unprocessable,
Internal,
ServiceUnavailable,
}
impl ApiCode {
fn as_str(self) -> &'static str {
match self {
ApiCode::BadRequest => "bad_request",
ApiCode::Forbidden => "forbidden",
ApiCode::NotFound => "not_found",
ApiCode::Conflict => "conflict",
ApiCode::PayloadTooLarge => "payload_too_large",
ApiCode::Unprocessable => "unprocessable",
ApiCode::Internal => "internal",
ApiCode::ServiceUnavailable => "service_unavailable",
}
}
fn status(self) -> StatusCode {
match self {
ApiCode::BadRequest => StatusCode::BAD_REQUEST,
ApiCode::Forbidden => StatusCode::FORBIDDEN,
ApiCode::NotFound => StatusCode::NOT_FOUND,
ApiCode::Conflict => StatusCode::CONFLICT,
ApiCode::PayloadTooLarge => StatusCode::PAYLOAD_TOO_LARGE,
ApiCode::Unprocessable => StatusCode::UNPROCESSABLE_ENTITY,
ApiCode::Internal => StatusCode::INTERNAL_SERVER_ERROR,
ApiCode::ServiceUnavailable => StatusCode::SERVICE_UNAVAILABLE,
}
}
}
#[derive(Debug)]
pub struct ApiError {
code: ApiCode,
message: String,
detail: Option<Value>,
}
impl ApiError {
pub fn new(code: ApiCode, message: impl Into<String>) -> Self {
Self {
code,
message: message.into(),
detail: None,
}
}
pub fn with_detail(mut self, detail: Value) -> Self {
self.detail = Some(detail);
self
}
pub fn bad_request(msg: impl Into<String>) -> Self {
Self::new(ApiCode::BadRequest, msg)
}
pub fn forbidden(msg: impl Into<String>) -> Self {
Self::new(ApiCode::Forbidden, msg)
}
pub fn not_found(msg: impl Into<String>) -> Self {
Self::new(ApiCode::NotFound, msg)
}
pub fn conflict(msg: impl Into<String>) -> Self {
Self::new(ApiCode::Conflict, msg)
}
pub fn unprocessable(msg: impl Into<String>) -> Self {
Self::new(ApiCode::Unprocessable, msg)
}
pub fn internal(msg: impl Into<String>) -> Self {
Self::new(ApiCode::Internal, msg)
}
pub fn service_unavailable(msg: impl Into<String>) -> Self {
Self::new(ApiCode::ServiceUnavailable, msg)
}
}
#[derive(Serialize)]
struct ApiErrorBody<'a> {
error: &'a str,
code: &'a str,
#[serde(skip_serializing_if = "Option::is_none")]
detail: Option<&'a Value>,
}
impl IntoResponse for ApiError {
fn into_response(self) -> Response {
let body = ApiErrorBody {
error: &self.message,
code: self.code.as_str(),
detail: self.detail.as_ref(),
};
// `Json` serializes the borrowed body while the response is built,
// so no intermediate `Value` (and no `unwrap`) is needed.
(self.code.status(), Json(body)).into_response()
}
}
impl From<StatusCode> for ApiError {
fn from(status: StatusCode) -> Self {
let code = match status {
StatusCode::BAD_REQUEST => ApiCode::BadRequest,
StatusCode::FORBIDDEN => ApiCode::Forbidden,
StatusCode::NOT_FOUND => ApiCode::NotFound,
StatusCode::CONFLICT => ApiCode::Conflict,
StatusCode::PAYLOAD_TOO_LARGE => ApiCode::PayloadTooLarge,
StatusCode::UNPROCESSABLE_ENTITY => ApiCode::Unprocessable,
StatusCode::SERVICE_UNAVAILABLE => ApiCode::ServiceUnavailable,
_ => ApiCode::Internal,
};
Self::new(
code,
status.canonical_reason().unwrap_or("error").to_string(),
)
}
}
impl From<std::io::Error> for ApiError {
fn from(err: std::io::Error) -> Self {
Self::internal(err.to_string())
}
}
impl From<serde_json::Error> for ApiError {
fn from(err: serde_json::Error) -> Self {
Self::bad_request(format!("invalid JSON: {err}"))
}
}
pub type ApiResult<T> = Result<T, ApiError>;
#[cfg(test)]
mod tests {
use super::*;
use axum::body::to_bytes;
#[tokio::test]
async fn serializes_with_error_code_detail() {
let err = ApiError::not_found("scan not found").with_detail(serde_json::json!({"id":"x"}));
let resp = err.into_response();
assert_eq!(resp.status(), StatusCode::NOT_FOUND);
let body = to_bytes(resp.into_body(), 8 * 1024).await.unwrap();
let v: serde_json::Value = serde_json::from_slice(&body).unwrap();
assert_eq!(v["error"], "scan not found");
assert_eq!(v["code"], "not_found");
assert_eq!(v["detail"]["id"], "x");
}
#[test]
fn omits_detail_when_absent() {
let err = ApiError::bad_request("bad input");
let body = ApiErrorBody {
error: &err.message,
code: err.code.as_str(),
detail: err.detail.as_ref(),
};
let s = serde_json::to_string(&body).unwrap();
assert!(!s.contains("detail"), "expected no detail key, got {s}");
}
}
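
Note: the `From` impls above are what let a handler use `?` on I/O or JSON errors and still return an `ApiResult<T>`. The sketch below shows that pattern with the standard library only. `SketchCode`, `SketchError`, `read_fake`, and `load_scan` are hypothetical stand-ins, not part of this PR, and the status is a bare `u16` instead of axum's `StatusCode`.

```rust
use std::io;

/// Hypothetical, trimmed-down analogue of `ApiCode`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchCode {
    NotFound,
    Internal,
}

impl SketchCode {
    /// Stable machine-readable string, mirroring `ApiCode::as_str`.
    fn as_str(self) -> &'static str {
        match self {
            SketchCode::NotFound => "not_found",
            SketchCode::Internal => "internal",
        }
    }

    /// HTTP status as a plain number (the real type returns `StatusCode`).
    fn status(self) -> u16 {
        match self {
            SketchCode::NotFound => 404,
            SketchCode::Internal => 500,
        }
    }
}

/// Hypothetical analogue of `ApiError` without the optional `detail`.
#[derive(Debug)]
struct SketchError {
    code: SketchCode,
    message: String,
}

impl From<io::Error> for SketchError {
    fn from(err: io::Error) -> Self {
        SketchError {
            code: SketchCode::Internal,
            message: err.to_string(),
        }
    }
}

type SketchResult<T> = Result<T, SketchError>;

/// Pretend storage read that fails for one specific id.
fn read_fake(id: &str) -> io::Result<Vec<u8>> {
    if id == "broken" {
        Err(io::Error::new(io::ErrorKind::Other, "disk read failed"))
    } else {
        Ok(id.as_bytes().to_vec())
    }
}

/// Handler-shaped function: explicit error for a missing id, `?` for I/O.
fn load_scan(id: &str) -> SketchResult<String> {
    if id.is_empty() {
        return Err(SketchError {
            code: SketchCode::NotFound,
            message: "scan not found".into(),
        });
    }
    let bytes = read_fake(id)?; // io::Error -> SketchError via `From`
    Ok(String::from_utf8_lossy(&bytes).into_owned())
}
```

In this sketch an I/O failure surfaces as the `internal` code with the underlying message, which matches what `From<std::io::Error>` does in the diff. Whether raw I/O messages should reach API clients is a separate design question.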

src/server/health.rs Normal file

@@ -0,0 +1,927 @@
//! Health-score scoring engine — v3.5.
//!
//! Pure-function scoring over a `HealthInputs` struct. Documented in
//! `docs/health-score-audit.md` (calibration, rationale) and
//! `docs/health-score.md` (customer methodology).
//!
//! ## Conceptual model
//!
//! The score reflects two intersecting forces:
//!
//! 1. **Density of risk.** The *quantitative* axis: per-finding weight
//! that combines severity, confidence, symex verdict, and a
//! test-path discount, divided by a size proxy, mapped through a log
//! curve to a 0-100 base.
//!
//! 2. **HIGH-count guardrails.** The *qualitative* axis: HIGH counts
//! cap the maximum grade and floor "no HIGH" to at least C. These
//! are non-negotiable promises — even a perfect-everywhere-else
//! repo with 6 confirmed HIGHs grades F.
//!
//! Modifiers (triage, trend, stale, regression, suppression hygiene)
//! are nudges totalling at most ±15 within whatever band the
//! guardrails carve out.
//!
//! ## What v3.5 changed vs v2/v3
//!
//! * Verdict-weighted credibility (`Confirmed > NotAttempted >
//! Inconclusive > Infeasible`). This is the structural protection
//! against false-positive-driven F grades while the scanner is
//! still maturing — it auto-tightens as symex coverage grows.
//! * Cross-file vs intra-file vs AST-only weighting via
//! `context_factor`.
//! * Test-path downweight (0.3×) — a HIGH in a test fixture is
//! genuinely less concerning than one in a request handler.
//! * Effective HIGH count for ceilings — the HIGH-count caps key on
//! credibility-adjusted HIGHs, not raw HIGHs. A repo with 5
//! low-confidence HIGHs that got `NotAttempted` from symex doesn't
//! pay the same ceiling cost as a repo with 5 `Confirmed` HIGHs.
//! * Tighter modifier ranges so they can't flip a band.
//! * No `parse_success_rate` (it's actually a cache-miss metric —
//! see `project_parse_success_rate_misnomer.md`).
use crate::commands::scan::Diag;
use crate::evidence::{Confidence, Verdict};
use crate::patterns::Severity;
use crate::server::models::{BacklogStats, FindingSummary, HealthComponent, HealthScore};
// ── Tunables ─────────────────────────────────────────────────────────────────
//
// Calibrated for v0.5.0 scanner FP rate. As Nyx symex coverage and
// rule precision improve, the HIGH ceilings should tighten — see
// `docs/health-score-audit.md` "Calibration trajectory" for the
// roadmap.
/// Below this file count, we floor the size divisor at 1.0 — tiny
/// repos can't claim infinite per-LOC dilution from one finding.
const FILES_FLOOR: f64 = 100.0;
/// Above this file count, no further dilution credit. A 50MLOC
/// monorepo doesn't get a pass on a HIGH because it's "drowned" in
/// other code.
const FILES_CEILING: f64 = 50_000.0;
/// Quality lints saturate fast. 300 quality lints = max drag.
const QUALITY_DRAG_PER_FINDING: f64 = 0.05;
const QUALITY_DRAG_CAP: f64 = 15.0;
/// Below this finding count, the Triage component contributes
/// weight 0 — we don't punish fresh users for not having triaged
/// what didn't need triaging.
const TRIAGE_FLOOR: usize = 20;
/// Stale-HIGH penalty parameters.
const STALE_PENALTY_PER_FINDING: f64 = 2.0;
const STALE_PENALTY_CAP: f64 = 10.0;
// ── Public API ───────────────────────────────────────────────────────────────
/// Pure inputs to the health-score calculation. No app state, no DB
/// handles — those upstream concerns are flattened into primitives the
/// scorer actually consumes.
#[derive(Debug, Clone, Copy)]
pub struct HealthInputs<'a> {
pub summary: &'a FindingSummary,
pub findings: &'a [Diag],
pub triage_coverage: f64,
pub new_since_last: usize,
pub fixed_since_last: usize,
pub reintroduced: usize,
/// Files scanned in the latest scan. Used as a proxy for repo
/// size. `None` disables size adjustment (matches v1 callers).
pub repo_files: Option<u64>,
/// Backlog stats from the overview pipeline. `None` is fine on
/// first scans (no aging data yet).
pub backlog: Option<&'a BacklogStats>,
/// Whether we have ≥2 completed scans. Without history Trend
/// is meaningless and contributes weight 0.
pub has_history: bool,
/// Fraction of suppressions that use blanket (rule/file/
/// rule_in_file) rules instead of fingerprint-level. `None` if
/// no suppressions. Intended to drive a small ±2 modifier, since
/// high blanket rates suggest gaming the score; nothing in
/// `compute` reads this field yet.
pub blanket_suppression_rate: Option<f64>,
}
/// Compute the health score from pure inputs.
pub fn compute(inp: &HealthInputs<'_>) -> HealthScore {
// Step 1: Per-finding credibility-weighted weight, plus the
// bookkeeping we need for the breakdown components.
let weighted = aggregate_findings(inp.findings);
// Step 2: Density adjustment.
let size_divisor = size_divisor(inp.repo_files);
let density_weight = weighted.raw_weight / size_divisor;
// Step 3: Map density to base score via log curve.
let base_score = density_to_base_score(density_weight);
// Step 4: Apply quality-lint drag.
let quality_drag = quality_drag(weighted.quality_count);
let base_after_drag = (base_score - quality_drag).clamp(0.0, 100.0);
// Step 5: HIGH-count guardrails — keyed on *effective* HIGH count
// (credibility-weighted), not raw count. This is what protects
// users from FP-driven F grades while the scanner is maturing.
let ceiling = high_total_ceiling(weighted.effective_high);
let floor = high_total_floor(weighted.effective_high);
let score_clamped = base_after_drag.clamp(floor, ceiling);
// Step 6: Build the breakdown components (also computes their
// sub-scores for transparency).
let components = build_components(inp, &weighted, base_after_drag, size_divisor);
// Step 7: Sum modifiers (already encoded in component weights;
// see `build_components`).
let modifier_sum = components
.iter()
.filter(|c| c.label != "Severity pressure")
.map(signed_modifier_contribution)
.sum::<f64>();
// Reapply ceiling AND floor after modifiers. Ceiling: modifiers
// can't lift past a HIGH cap. Floor: triage/regression
// modifiers can't break the no-HIGH ≥ C guarantee.
let final_uncapped = (score_clamped + modifier_sum).clamp(0.0, 100.0);
let score = final_uncapped.min(ceiling).max(floor).round() as u8;
let grade = grade_for(score).to_string();
HealthScore {
score,
grade,
components,
}
}
// ── Aggregation ──────────────────────────────────────────────────────────────
#[derive(Debug, Default)]
struct WeightedAggregate {
/// Sum of `severity_base × credibility` across security findings,
/// where `credibility = confidence_factor × verdict_factor ×
/// context_factor`, clamped to `0.0..=1.2`. Quality lints are
/// handled separately via `quality_drag`.
raw_weight: f64,
/// Number of `*.quality.*` findings — drives `quality_drag`.
quality_count: usize,
/// Credibility-adjusted HIGH count (rounded) — drives the HIGH
/// ceiling and floor. A low-confidence + Inconclusive HIGH might
/// contribute 0.2; five of them would round to 1.
effective_high: usize,
/// Raw counts (for the breakdown text).
raw_high: usize,
raw_medium: usize,
raw_low_security: usize,
/// Confidence rate (high+medium*0.5)/total — drives the
/// confidence component. 100 if no findings.
confidence_rate: f64,
/// Symex coverage: fraction (0.0..=1.0) of taint findings with any
/// non-NotAttempted verdict. Computed here but not yet surfaced in
/// the component detail or used in the score.
symex_coverage: f64,
}
fn aggregate_findings(findings: &[Diag]) -> WeightedAggregate {
let mut agg = WeightedAggregate::default();
let mut effective_high_sum = 0.0f64;
let mut conf_score_sum = 0.0f64;
let mut taint_total = 0usize;
let mut taint_with_verdict = 0usize;
for f in findings {
let is_quality = f.id.contains(".quality.") || f.id.starts_with("quality.");
if is_quality {
agg.quality_count += 1;
continue;
}
let severity = f.severity;
let conf_factor = confidence_factor(f.confidence);
let verdict_factor = verdict_factor(f);
let context_factor = context_factor(f);
let credibility = (conf_factor * verdict_factor * context_factor).clamp(0.0, 1.2);
let weight = severity_base(severity) * credibility;
agg.raw_weight += weight;
match severity {
Severity::High => {
agg.raw_high += 1;
effective_high_sum += credibility;
}
Severity::Medium => agg.raw_medium += 1,
Severity::Low => agg.raw_low_security += 1,
}
// Confidence component contribution (independent of severity).
conf_score_sum += match f.confidence {
Some(Confidence::High) => 1.0,
Some(Confidence::Medium) => 0.5,
_ => 0.0,
};
// Symex coverage tracking — only meaningful for findings with
// taint-flow evidence (the ones symex even attempts).
if let Some(ev) = f.evidence.as_ref()
&& ev.symbolic.is_some()
{
taint_total += 1;
if !matches!(
ev.symbolic.as_ref().map(|s| s.verdict),
Some(Verdict::NotAttempted) | None
) {
taint_with_verdict += 1;
}
}
}
agg.effective_high = effective_high_sum.round() as usize;
agg.confidence_rate = if findings.is_empty() {
100.0
} else {
let security_total = (findings.len() - agg.quality_count).max(1);
(conf_score_sum / security_total as f64) * 100.0
};
agg.symex_coverage = if taint_total == 0 {
0.0
} else {
taint_with_verdict as f64 / taint_total as f64
};
agg
}
fn severity_base(s: Severity) -> f64 {
match s {
Severity::High => 10.0,
Severity::Medium => 3.0,
Severity::Low => 0.5,
}
}
fn confidence_factor(c: Option<Confidence>) -> f64 {
match c {
Some(Confidence::High) => 1.0,
Some(Confidence::Medium) => 0.6,
Some(Confidence::Low) => 0.3,
None => 0.5,
}
}
/// `verdict_factor` is the heart of the FP protection. An AST-only
/// finding (no taint flow → no symex even attempted) gets the
/// `NotAttempted` baseline of 1.0. A taint finding that symex
/// confirmed gets 1.2 (a credibility boost). A taint finding that
/// symex proved infeasible gets 0.1 (near-suppress).
fn verdict_factor(f: &Diag) -> f64 {
let Some(ev) = f.evidence.as_ref() else {
return 1.0;
};
let Some(sv) = ev.symbolic.as_ref() else {
return 1.0;
};
match sv.verdict {
Verdict::Confirmed => 1.2,
Verdict::NotAttempted => 1.0,
Verdict::Inconclusive => 0.7,
Verdict::Infeasible => 0.1,
}
}
/// Cross-file flow → 1.15. Intra-file taint flow → 1.0. AST-only
/// (no flow_steps) → 0.75. Test path → 0.3 regardless of the others
/// (returns the *minimum* factor so test paths always win over
/// cross-file boosts).
fn context_factor(f: &Diag) -> f64 {
if is_test_path(&f.path) {
return 0.3;
}
let Some(ev) = f.evidence.as_ref() else {
return 0.75; // No evidence at all — pattern match
};
if ev.flow_steps.is_empty() {
return 0.75;
}
if ev.flow_steps.iter().any(|s| s.is_cross_file) || ev.uses_summary {
return 1.15;
}
1.0
}
fn is_test_path(path: &str) -> bool {
let p = path.to_ascii_lowercase();
// Path-segment matches.
p.contains("/test/")
|| p.contains("/tests/")
|| p.contains("/spec/")
|| p.contains("/__tests__/")
|| p.contains("/testdata/")
// Filename suffix conventions.
|| p.ends_with("_test.go")
|| p.ends_with("_spec.rb")
|| p.ends_with(".test.ts")
|| p.ends_with(".test.js")
|| p.ends_with(".spec.ts")
|| p.ends_with(".spec.js")
|| file_basename(&p)
.map(|b| b.starts_with("test_") && b.ends_with(".py"))
.unwrap_or(false)
}
fn file_basename(path: &str) -> Option<&str> {
path.rsplit('/').next()
}
// ── Density math ─────────────────────────────────────────────────────────────
fn size_divisor(repo_files: Option<u64>) -> f64 {
let f = match repo_files {
Some(n) => (n as f64).clamp(FILES_FLOOR, FILES_CEILING),
None => FILES_FLOOR,
};
(f / FILES_FLOOR).sqrt()
}
fn density_to_base_score(density_weight: f64) -> f64 {
if density_weight <= 0.0 {
return 100.0;
}
let raw = 100.0 - 22.0 * (1.0 + density_weight / 4.0).log10();
raw.clamp(0.0, 100.0)
}
fn quality_drag(quality_count: usize) -> f64 {
(quality_count as f64 * QUALITY_DRAG_PER_FINDING).min(QUALITY_DRAG_CAP)
}
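
Note: a worked example may make the density math easier to review. The sketch below copies `size_divisor` and `density_to_base_score` from the diff so it compiles on its own, and adds a hypothetical `base_score` helper (not in the PR) that chains them.

```rust
const FILES_FLOOR: f64 = 100.0;
const FILES_CEILING: f64 = 50_000.0;

/// Square-root dilution, clamped to the 100..=50_000 file range.
fn size_divisor(repo_files: Option<u64>) -> f64 {
    let f = match repo_files {
        Some(n) => (n as f64).clamp(FILES_FLOOR, FILES_CEILING),
        None => FILES_FLOOR,
    };
    (f / FILES_FLOOR).sqrt()
}

/// Log curve from density weight to a 0-100 base score.
fn density_to_base_score(density_weight: f64) -> f64 {
    if density_weight <= 0.0 {
        return 100.0;
    }
    (100.0 - 22.0 * (1.0 + density_weight / 4.0).log10()).clamp(0.0, 100.0)
}

/// Hypothetical helper: raw weight -> density -> base score.
fn base_score(raw_weight: f64, repo_files: Option<u64>) -> f64 {
    density_to_base_score(raw_weight / size_divisor(repo_files))
}
```

With these definitions, 36 weighted points in a 100-file repo give a divisor of 1 and a base of 100 - 22 × log10(10) = 78. The same 36 points in a 10,000-file repo give a divisor of 10, a density of 3.6, and a base of roughly 93.9. Past 50,000 files the divisor stops growing, so a 500,000-file repo scores the same as a 50,000-file one.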
// ── HIGH guardrails — calibrated for v0.5.0 FP rate ──────────────────────────
/// Final-score ceiling keyed on *effective* HIGH count (credibility-
/// weighted, not raw). See module docstring for the rationale.
fn high_total_ceiling(effective_high: usize) -> f64 {
match effective_high {
0 => 100.0,
1 => 85.0, // 1 credible HIGH → max B
2 => 78.0, // 2 → max C+
3..=5 => 68.0, // 3-5 → max D+
6..=10 => 58.0,
_ => 45.0,
}
}
/// Final-score floor keyed on *effective* HIGH count. Zero HIGH never
/// grades below C. This is the structural promise that the score
/// isn't an automated F-machine.
fn high_total_floor(effective_high: usize) -> f64 {
if effective_high == 0 { 70.0 } else { 0.0 }
}
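
Note: the ceiling and floor are applied twice in `compute`, once before and once after the modifiers. The sketch below reproduces that sequence in isolation. `ceiling` and `floor` mirror `high_total_ceiling` and `high_total_floor`; `guarded_score` is a hypothetical helper that takes the drag-adjusted base and the modifier sum as plain numbers.

```rust
/// Mirrors `high_total_ceiling` from the diff.
fn ceiling(effective_high: usize) -> f64 {
    match effective_high {
        0 => 100.0,
        1 => 85.0,
        2 => 78.0,
        3..=5 => 68.0,
        6..=10 => 58.0,
        _ => 45.0,
    }
}

/// Mirrors `high_total_floor` from the diff.
fn floor(effective_high: usize) -> f64 {
    if effective_high == 0 { 70.0 } else { 0.0 }
}

/// Hypothetical helper: clamp, add modifiers, then clamp again.
fn guarded_score(base_after_drag: f64, modifier_sum: f64, effective_high: usize) -> u8 {
    let (lo, hi) = (floor(effective_high), ceiling(effective_high));
    // First clamp: the base cannot escape the guardrail band.
    let clamped = base_after_drag.clamp(lo, hi);
    // Modifiers nudge the score, bounded to the 0-100 scale.
    let with_modifiers = (clamped + modifier_sum).clamp(0.0, 100.0);
    // Second clamp: modifiers cannot cross the ceiling or the floor.
    with_modifiers.min(hi).max(lo).round() as u8
}
```

For example, a base of 95 with +5 in modifiers and one effective HIGH still lands at 85, and a base of 40 with -3 in modifiers and no effective HIGH is lifted to 70.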
// ── Stale-HIGH penalty ──────────────────────────────────────────────────────
fn stale_high_penalty(effective_high: usize, backlog: Option<&BacklogStats>) -> f64 {
let Some(b) = backlog else { return 0.0 };
if effective_high == 0 || b.stale_count == 0 {
return 0.0;
}
(b.stale_count as f64 * STALE_PENALTY_PER_FINDING).min(STALE_PENALTY_CAP)
}
// ── Component breakdown ──────────────────────────────────────────────────────
fn build_components(
inp: &HealthInputs<'_>,
weighted: &WeightedAggregate,
base_after_drag: f64,
size_divisor: f64,
) -> Vec<HealthComponent> {
let total = inp.summary.total;
// Severity component is the primary score-bearing component. Its
// sub-score is the base after quality drag; the HIGH ceiling and
// floor are applied afterwards in `compute`.
let sev_score = base_after_drag.round().clamp(0.0, 100.0) as u8;
let sev_detail = severity_detail(weighted, size_divisor, inp.repo_files, inp.backlog);
// Confidence component — high-conf rate scaled into 0..=100.
let conf_score = weighted.confidence_rate.round().clamp(0.0, 100.0) as u8;
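// `summary.total` and `findings` arrive as separate inputs, so guard
// against underflow if they ever disagree.
let security_total = total.saturating_sub(weighted.quality_count);
let conf_detail = format!(
"High-confidence rate {:.0}% across {} security finding{}",
weighted.confidence_rate,
security_total,
plural_s(security_total)
);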
// Trend component — only contributes weight when has_history.
let net = inp.fixed_since_last as i64 - inp.new_since_last as i64;
let trend_score = (50 + net * 5).clamp(0, 100) as u8;
let trend_weight = if inp.has_history { 0.20 } else { 0.0 };
let trend_detail = if inp.has_history {
format!(
"Net {} since last scan ({} fixed, {} new)",
net, inp.fixed_since_last, inp.new_since_last
)
} else {
"Not applicable: no prior scan to compare against (re-scan to populate)".into()
};
// Triage — drops out when total < TRIAGE_FLOOR.
let triage_active = total >= TRIAGE_FLOOR;
let triage_score = (inp.triage_coverage * 100.0).round().clamp(0.0, 100.0) as u8;
let triage_weight = if triage_active { 0.20 } else { 0.0 };
let triage_detail = if triage_active {
format!(
"{:.0}% of findings have a triage state",
inp.triage_coverage * 100.0
)
} else {
format!(
"Not applicable: only {} finding{} (need ≥{} to evaluate)",
total,
plural_s(total),
TRIAGE_FLOOR
)
};
// Regression resistance.
let stale_penalty = stale_high_penalty(weighted.effective_high, inp.backlog);
let reintro_penalty = (inp.reintroduced as f64 * 5.0).min(10.0);
let regression_score = (100.0 - reintro_penalty - stale_penalty)
.clamp(0.0, 100.0)
.round() as u8;
let regression_detail = match (inp.reintroduced, stale_penalty) {
(0, 0.0) => "No reintroduced or stale-HIGH findings".into(),
(0, p) => format!(
"{} stale finding{} affecting HIGH severity (-{:.0})",
inp.backlog.map(|b| b.stale_count).unwrap_or(0),
plural_s(inp.backlog.map(|b| b.stale_count).unwrap_or(0)),
p
),
(n, 0.0) => format!(
"{} previously-fixed finding{} reintroduced (-{:.0})",
n,
plural_s(n),
(n as f64 * 5.0).min(10.0)
),
(n, p) => format!(
"{} reintroduced (-{:.0}) + stale-HIGH penalty (-{:.0})",
n,
(n as f64 * 5.0).min(10.0),
p
),
};
vec![
HealthComponent {
label: "Severity pressure".into(),
score: sev_score,
weight: 1.0, // Severity is the *base*, not a modifier — full weight in the blend.
detail: sev_detail,
},
HealthComponent {
label: "Confidence quality".into(),
score: conf_score,
weight: 0.0, // Confidence is already baked into raw_weight via confidence_factor.
detail: conf_detail,
},
HealthComponent {
label: "Trend".into(),
score: trend_score,
weight: trend_weight,
detail: trend_detail,
},
HealthComponent {
label: "Triage coverage".into(),
score: triage_score,
weight: triage_weight,
detail: triage_detail,
},
HealthComponent {
label: "Regression resistance".into(),
score: regression_score,
weight: 0.15,
detail: regression_detail,
},
]
}
/// How a non-severity component contributes to the modifier sum.
/// Each component's score (0-100) is mapped to a signed point delta
/// in roughly the [-5, +5] range, gated by the component's weight
/// (which becomes 0 when the component drops out).
fn signed_modifier_contribution(c: &HealthComponent) -> f64 {
if c.weight == 0.0 {
return 0.0;
}
match c.label.as_str() {
"Confidence quality" => {
// Display-only: this component has weight 0 because confidence
// is already baked into raw_weight via confidence_factor, so it
// contributes nothing here. It stays in the breakdown for
// transparency.
0.0
}
"Trend" => {
// Net positive trend → +3 max; negative → -3 max.
// Linear in (score - 50)/50 × 3, clamped.
let centred = (c.score as f64 - 50.0) / 50.0;
(centred * 3.0).clamp(-3.0, 3.0)
}
"Triage coverage" => {
// 50% triaged → 0, rising linearly to +5 at 100%; below 50% it
// falls linearly to -3 at 0%.
if c.score >= 50 {
((c.score as f64 - 50.0) / 50.0 * 5.0).min(5.0)
} else {
-((50.0 - c.score as f64) / 50.0 * 3.0).min(3.0)
}
}
"Regression resistance" => {
// 100 → 0; each point below 100 subtracts 0.15 from the total.
// Map: at score 100 → 0; at score 70 → -4.5; at score 0 → -15.
((c.score as f64 - 100.0) * 0.15).clamp(-15.0, 0.0)
}
_ => 0.0,
}
}
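
Note: the three live arms of `signed_modifier_contribution` are easier to check with concrete numbers. The sketch below restates them as a standalone function. The `Modifier` enum and the `delta` name are hypothetical, and the weight gate from the real function is omitted.

```rust
/// Hypothetical tag replacing the string labels used in the diff.
#[derive(Clone, Copy)]
enum Modifier {
    Trend,
    Triage,
    Regression,
}

/// Signed point delta for a component score in 0..=100.
fn delta(kind: Modifier, score: u8) -> f64 {
    let s = f64::from(score);
    match kind {
        // Linear around 50, bounded to [-3, +3].
        Modifier::Trend => ((s - 50.0) / 50.0 * 3.0).clamp(-3.0, 3.0),
        // At or above 50: 0 rising to +5 at 100.
        Modifier::Triage if score >= 50 => ((s - 50.0) / 50.0 * 5.0).min(5.0),
        // Below 50: 0 falling to -3 at 0.
        Modifier::Triage => -((50.0 - s) / 50.0 * 3.0).min(3.0),
        // Only ever a penalty: 0.15 points per point below 100.
        Modifier::Regression => ((s - 100.0) * 0.15).clamp(-15.0, 0.0),
    }
}
```

One observation from the numbers: `build_components` caps the reintroduction and stale penalties at 10 each, so the regression score does not appear to drop below 80 in this revision. If so, its contribution bottoms out near -3 even though the clamp allows -15.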
fn severity_detail(
w: &WeightedAggregate,
size_divisor: f64,
repo_files: Option<u64>,
backlog: Option<&BacklogStats>,
) -> String {
let mut parts = Vec::new();
parts.push(format!("{:.0} weighted points", w.raw_weight));
parts.push(format!(
"{} High, {} Medium, {} Low",
w.raw_high, w.raw_medium, w.raw_low_security
));
if w.quality_count > 0 {
parts.push(format!("{} quality lints", w.quality_count));
}
if w.effective_high != w.raw_high {
parts.push(format!(
"effective HIGH={} (credibility-adjusted)",
w.effective_high
));
}
if let Some(f) = repo_files
&& (size_divisor - 1.0).abs() > 0.01
{
parts.push(format!("size factor 1/{:.2}× ({} files)", size_divisor, f));
}
let stale = stale_high_penalty(w.effective_high, backlog);
if stale > 0.0
&& let Some(b) = backlog
{
parts.push(format!("-{:.0} stale-HIGH ({} >30d)", stale, b.stale_count));
}
parts.join(" · ")
}
// ── Misc ─────────────────────────────────────────────────────────────────────
fn grade_for(score: u8) -> &'static str {
match score {
90..=100 => "A",
80..=89 => "B",
70..=79 => "C",
60..=69 => "D",
_ => "F",
}
}
fn plural_s(n: usize) -> &'static str {
if n == 1 { "" } else { "s" }
}
// ── Tests ────────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
use crate::patterns::{FindingCategory, Severity};
fn diag(severity: Severity, id: &str, conf: Option<Confidence>) -> Diag {
Diag {
path: "src/lib.rs".into(),
line: 1,
col: 1,
severity,
id: id.into(),
category: FindingCategory::Security,
path_validated: false,
guard_kind: None,
message: None,
labels: Vec::new(),
confidence: conf,
evidence: None,
rank_score: None,
rank_reason: None,
suppressed: false,
suppression: None,
rollup: None,
finding_id: String::new(),
alternative_finding_ids: Vec::new(),
}
}
fn diag_in(path: &str, severity: Severity, conf: Option<Confidence>) -> Diag {
let mut d = diag(severity, "rs.taint.x", conf);
d.path = path.into();
d
}
fn summary_of(findings: &[Diag]) -> FindingSummary {
let mut s = FindingSummary {
total: findings.len(),
..Default::default()
};
for d in findings {
*s.by_severity
.entry(d.severity.as_db_str().to_string())
.or_insert(0) += 1;
}
s
}
fn first_scan<'a>(
summary: &'a FindingSummary,
findings: &'a [Diag],
triage: f64,
files: u64,
) -> HealthInputs<'a> {
HealthInputs {
summary,
findings,
triage_coverage: triage,
new_since_last: 0,
fixed_since_last: 0,
reintroduced: 0,
repo_files: Some(files),
backlog: None,
has_history: false,
blanket_suppression_rate: None,
}
}
#[allow(dead_code)]
fn with_history<'a>(
summary: &'a FindingSummary,
findings: &'a [Diag],
triage: f64,
files: u64,
) -> HealthInputs<'a> {
HealthInputs {
has_history: true,
..first_scan(summary, findings, triage, files)
}
}
#[allow(dead_code)]
fn sev_score(h: &HealthScore) -> u8 {
h.components
.iter()
.find(|c| c.label == "Severity pressure")
.unwrap()
.score
}
// ── Foundational behaviour ───────────────────────────────────────
#[test]
fn clean_repo_first_scan_grades_a() {
let findings: Vec<Diag> = vec![];
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 100));
assert_eq!(h.grade, "A");
assert!(h.score >= 95, "clean first-scan ≥95, got {}", h.score);
}
#[test]
fn no_high_repo_never_grades_below_c() {
// 0 HIGH, lots of mediums + quality.
let mut findings: Vec<Diag> = (0..200)
.map(|_| diag(Severity::Medium, "rs.taint.foo", Some(Confidence::High)))
.collect();
findings.extend(
(0..2000).map(|_| diag(Severity::Low, "rs.quality.unwrap", Some(Confidence::High))),
);
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 200));
assert!(h.score >= 70, "0 HIGH must grade ≥C (70), got {}", h.score);
}
#[test]
fn quality_lints_alone_grade_at_least_b() {
// 1000 quality lints, no security findings. Drag caps at 15
// so base ≈ 100 - 15 = 85. Should grade at worst B-.
let findings: Vec<Diag> = (0..1000)
.map(|_| diag(Severity::Low, "rs.quality.unwrap", Some(Confidence::High)))
.collect();
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 100));
assert!(h.score >= 80, "1000 quality lints → ≥B, got {}", h.score);
}
#[test]
fn one_high_caps_at_b() {
let findings = vec![diag(Severity::High, "rs.taint.x", Some(Confidence::High))];
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 100));
assert!(h.score <= 89, "1 HIGH must not grade A, got {}", h.score);
assert_ne!(h.grade, "A");
}
#[test]
fn many_confirmed_high_grades_f() {
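// 8 HIGHs, all symex-Confirmed but with no flow steps, so each has
// credibility 1.0 × 1.2 × 0.75 = 0.9. Sum = 7.2 → effective_high = 7
// → ceiling 58 → F band.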
let findings: Vec<Diag> = (0..8)
.map(|_| {
let mut d = diag(Severity::High, "rs.taint.x", Some(Confidence::High));
let ev = crate::evidence::Evidence {
symbolic: Some(crate::evidence::SymbolicVerdict {
verdict: crate::evidence::Verdict::Confirmed,
constraints_checked: 0,
paths_explored: 0,
witness: None,
interproc_call_chains: Vec::new(),
cutoff_notes: Vec::new(),
}),
..Default::default()
};
d.evidence = Some(ev);
d
})
.collect();
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 1000));
assert_eq!(h.grade, "F");
}
#[test]
fn low_credibility_high_does_not_count_as_full() {
// 5 raw HIGHs, all Low confidence, all NotAttempted (no
// evidence). Each has credibility ≈ 0.3 × 1.0 × 0.75 = 0.225.
// Sum = 1.125 → effective_high = 1. Ceiling 85.
let findings: Vec<Diag> = (0..5)
.map(|_| {
let mut d = diag(Severity::High, "rs.taint.x", Some(Confidence::Low));
// Force AST-only: no evidence at all.
d.evidence = None;
d
})
.collect();
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 100));
// The score reflects credibility — should NOT crater to F.
assert!(
h.score >= 60,
"low-credibility HIGHs shouldn't crater to F, got {}",
h.score
);
}
#[test]
fn test_path_findings_are_discounted() {
let in_test = vec![diag_in(
"src/feature/__tests__/handler.test.ts",
Severity::High,
Some(Confidence::High),
)];
let in_prod = vec![diag_in(
"src/feature/handler.ts",
Severity::High,
Some(Confidence::High),
)];
let st = summary_of(&in_test);
let sp = summary_of(&in_prod);
let h_test = compute(&first_scan(&st, &in_test, 0.0, 50));
let h_prod = compute(&first_scan(&sp, &in_prod, 0.0, 50));
assert!(
h_test.score > h_prod.score,
"test-path HIGH ({}) should grade better than prod HIGH ({})",
h_test.score,
h_prod.score
);
}
#[test]
fn density_dampens_for_large_repos_but_caps() {
let findings: Vec<Diag> = (0..3)
.map(|_| diag(Severity::Medium, "rs.taint.x", Some(Confidence::High)))
.collect();
let s = summary_of(&findings);
let small = compute(&first_scan(&s, &findings, 0.0, 100));
let mid = compute(&first_scan(&s, &findings, 0.0, 5000));
let big = compute(&first_scan(&s, &findings, 0.0, 50_000));
let huge = compute(&first_scan(&s, &findings, 0.0, 500_000));
assert!(
small.score <= mid.score,
"small {} mid {}",
small.score,
mid.score
);
assert!(
mid.score <= big.score,
"mid {} big {}",
mid.score,
big.score
);
assert!(
(big.score as i32 - huge.score as i32).abs() <= 1,
"size cap broken: big {} huge {}",
big.score,
huge.score
);
}
#[test]
fn triage_drops_when_total_under_floor() {
let findings: Vec<Diag> = (0..5)
.map(|_| diag(Severity::Low, "rs.x", Some(Confidence::High)))
.collect();
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.0, 100));
let triage = h
.components
.iter()
.find(|c| c.label == "Triage coverage")
.unwrap();
assert_eq!(triage.weight, 0.0);
assert!(triage.detail.contains("Not applicable"));
}
#[test]
fn trend_drops_on_first_scan() {
let findings: Vec<Diag> = (0..30)
.map(|_| diag(Severity::Medium, "rs.x", Some(Confidence::High)))
.collect();
let s = summary_of(&findings);
let h = compute(&first_scan(&s, &findings, 0.5, 100));
let trend = h.components.iter().find(|c| c.label == "Trend").unwrap();
assert_eq!(trend.weight, 0.0);
assert!(trend.detail.contains("Not applicable"));
}
#[test]
fn stale_high_penalty_lowers_regression_component() {
let findings = vec![diag(Severity::High, "rs.taint.x", Some(Confidence::High))];
let s = summary_of(&findings);
let backlog_clean = BacklogStats {
oldest_open_days: Some(2),
median_age_days: Some(1),
stale_count: 0,
age_buckets: vec![],
};
let backlog_stale = BacklogStats {
oldest_open_days: Some(120),
median_age_days: Some(60),
stale_count: 3,
age_buckets: vec![],
};
let fresh_inputs = HealthInputs {
backlog: Some(&backlog_clean),
has_history: true,
..first_scan(&s, &findings, 0.0, 100)
};
let rotting_inputs = HealthInputs {
backlog: Some(&backlog_stale),
has_history: true,
..first_scan(&s, &findings, 0.0, 100)
};
let fresh = compute(&fresh_inputs);
let rotting = compute(&rotting_inputs);
let fresh_reg = fresh
.components
.iter()
.find(|c| c.label == "Regression resistance")
.unwrap()
.score;
let rot_reg = rotting
.components
.iter()
.find(|c| c.label == "Regression resistance")
.unwrap()
.score;
assert!(
rot_reg < fresh_reg,
"stale should lower regression score: fresh {} vs rotting {}",
fresh_reg,
rot_reg
);
}
#[test]
fn grade_thresholds() {
assert_eq!(grade_for(100), "A");
assert_eq!(grade_for(90), "A");
assert_eq!(grade_for(89), "B");
assert_eq!(grade_for(80), "B");
assert_eq!(grade_for(79), "C");
assert_eq!(grade_for(70), "C");
assert_eq!(grade_for(69), "D");
assert_eq!(grade_for(60), "D");
assert_eq!(grade_for(59), "F");
assert_eq!(grade_for(0), "F");
}
}
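
Note on `grade_for`: the `grade_thresholds` test pins the band edges, but the body of `grade_for` is not part of this hunk. A minimal sketch consistent with those assertions, assuming the score is a `u8` as in `HealthScore.score` (the real implementation may be written differently):

```rust
/// Map a 0-100 health score to a letter grade.
/// Band edges follow the `grade_thresholds` test: 90+ A, 80+ B, 70+ C, 60+ D, else F.
/// Sketch only: the actual function body is not shown in this diff.
fn grade_for(score: u8) -> &'static str {
    match score {
        90..=u8::MAX => "A",
        80..=89 => "B",
        70..=79 => "C",
        60..=69 => "D",
        _ => "F",
    }
}
```

Closed ranges in a `match` keep the boundaries (89/90, 79/80, and so on) visible in one place, which is exactly what the test exercises.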


@ -1,8 +1,12 @@
pub mod app;
pub mod assets;
pub mod debug;
pub mod error;
pub mod health;
pub mod jobs;
pub mod models;
pub mod observability;
pub mod owasp;
pub mod progress;
pub mod routes;
pub mod scan_log;


@ -582,6 +582,187 @@ pub struct OverviewResponse {
pub noisy_rules: Vec<NoisyRule>,
pub recent_scans: Vec<ScanSummary>,
pub insights: Vec<Insight>,
// ── Tier 1 ──
#[serde(skip_serializing_if = "Option::is_none")]
pub health: Option<HealthScore>,
#[serde(skip_serializing_if = "Option::is_none")]
pub posture: Option<PostureSummary>,
#[serde(skip_serializing_if = "Option::is_none")]
pub backlog: Option<BacklogStats>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub weighted_top_files: Vec<WeightedFile>,
#[serde(skip_serializing_if = "Option::is_none")]
pub confidence_distribution: Option<ConfidenceDistribution>,
// ── Tier 2 ──
#[serde(skip_serializing_if = "Option::is_none")]
pub scanner_quality: Option<ScannerQuality>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub issue_categories: Vec<IssueCategoryBucket>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub hot_sinks: Vec<HotSink>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub owasp_buckets: Vec<OwaspBucket>,
#[serde(skip_serializing_if = "Option::is_none")]
pub cross_file_ratio: Option<f64>,
// ── Tier 3 ──
#[serde(skip_serializing_if = "Option::is_none")]
pub baseline: Option<BaselineInfo>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
pub language_health: Vec<LanguageHealth>,
#[serde(skip_serializing_if = "Option::is_none")]
pub suppression_hygiene: Option<SuppressionHygiene>,
}
/// Composite repo-health rollup.
#[derive(Debug, Clone, Serialize)]
pub struct HealthScore {
/// 0–100 score; higher is better.
pub score: u8,
/// Letter grade A–F derived from score.
pub grade: String,
/// Sub-component contributions (0–100 each) for transparency.
pub components: Vec<HealthComponent>,
}
/// Single line item in the health-score breakdown.
#[derive(Debug, Clone, Serialize)]
pub struct HealthComponent {
/// Human label (e.g. "Severity pressure", "Trend", "Triage").
pub label: String,
/// 0–100; already inverted so higher = healthier.
pub score: u8,
/// Weight applied when blending into the final score (0.0–1.0).
pub weight: f64,
/// Short rationale shown in tooltip.
pub detail: String,
}
/// One-line trend posture for the page header.
#[derive(Debug, Clone, Serialize)]
pub struct PostureSummary {
/// "improving" | "regressing" | "stable" | "unknown"
pub trend: String,
/// "success" | "warning" | "danger" | "info"
pub severity: String,
/// Short message shown verbatim in the banner.
pub message: String,
/// Findings that were previously fixed and have re-appeared.
pub reintroduced_count: usize,
}
/// Backlog age statistics computed from finding_first_seen.
#[derive(Debug, Clone, Serialize)]
pub struct BacklogStats {
/// Days since the oldest still-open finding was first seen.
pub oldest_open_days: Option<u32>,
/// Median age of currently-open findings, in days.
pub median_age_days: Option<u32>,
/// Findings older than 30 days that remain open.
pub stale_count: usize,
/// Histogram buckets (label, count) — fixed 5 buckets.
pub age_buckets: Vec<OverviewCount>,
}
/// Top-file row including severity stack for the weighted ranking.
#[derive(Debug, Clone, Serialize)]
pub struct WeightedFile {
pub name: String,
pub score: u32,
pub high: usize,
pub medium: usize,
pub low: usize,
pub total: usize,
}
/// Confidence-level distribution.
#[derive(Debug, Clone, Serialize, Default)]
pub struct ConfidenceDistribution {
pub high: usize,
pub medium: usize,
pub low: usize,
pub none: usize,
}
/// Engine-quality metrics that describe analysis depth/coverage.
#[derive(Debug, Clone, Serialize)]
pub struct ScannerQuality {
pub files_scanned: u64,
pub files_skipped: u64,
/// 0.0–1.0: files_scanned / (files_scanned + files_skipped).
pub parse_success_rate: f64,
pub functions_analyzed: u64,
pub call_edges: u64,
pub unresolved_calls: u64,
/// 0.0–1.0: call_edges / (call_edges + unresolved_calls).
pub call_resolution_rate: f64,
/// % of taint findings that received a symbolic verdict (Confirmed|Infeasible|Inconclusive).
pub symex_verified_rate: f64,
/// Count broken down by symbolic verdict label.
pub symex_breakdown: HashMap<String, usize>,
}
/// One issue-category bucket (rule-family derived). Broader than OWASP, with
/// engine-friendly labels like "Tainted Flow" or "Code Quality".
#[derive(Debug, Clone, Serialize)]
pub struct IssueCategoryBucket {
pub label: String,
pub count: usize,
}
/// "Hot sink" — a single callee that absorbs many findings.
#[derive(Debug, Clone, Serialize)]
pub struct HotSink {
/// Callee name (best-effort; from flow_steps last Sink).
pub callee: String,
pub count: usize,
}
/// One OWASP Top-10 (2021) bucket.
#[derive(Debug, Clone, Serialize)]
pub struct OwaspBucket {
/// Short category code, e.g. "A01"; the name ("Broken Access Control") goes in `label`.
pub code: String,
pub label: String,
pub count: usize,
}
/// Per-language posture.
#[derive(Debug, Clone, Serialize)]
pub struct LanguageHealth {
pub language: String,
pub findings: usize,
pub high: usize,
pub medium: usize,
pub low: usize,
}
/// Suppression-quality breakdown.
#[derive(Debug, Clone, Serialize)]
pub struct SuppressionHygiene {
/// Findings explicitly triaged by fingerprint.
pub fingerprint_level: usize,
/// Findings suppressed by rule-level suppression.
pub rule_level: usize,
/// Findings suppressed by file-level suppression.
pub file_level: usize,
/// Findings suppressed by rule-in-file suppression.
pub rule_in_file_level: usize,
/// % of suppressed findings using low-specificity (rule/file/rule_in_file) rules.
pub blanket_rate: f64,
}
/// Pinned baseline scan and current drift relative to it.
#[derive(Debug, Clone, Serialize)]
pub struct BaselineInfo {
pub scan_id: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub started_at: Option<String>,
pub baseline_total: usize,
pub drift_new: usize,
pub drift_fixed: usize,
}
/// A name + count pair for overview top-N lists.
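
Note on the rate fields: `parse_success_rate` and `call_resolution_rate` are documented as `a / (a + b)`, but the computation itself is not in this hunk. A minimal sketch of such a ratio, where the helper name `rate` and the choice to return 0.0 for an empty denominator are assumptions rather than the PR's behaviour:

```rust
/// Fraction `hits / (hits + misses)` in the range 0.0 to 1.0.
/// Assumption: returns 0.0 when nothing was counted, so the result is never NaN.
fn rate(hits: u64, misses: u64) -> f64 {
    let total = hits.saturating_add(misses);
    if total == 0 {
        return 0.0;
    }
    hits as f64 / total as f64
}
```

With this shape, `parse_success_rate` would be `rate(files_scanned, files_skipped)` and `call_resolution_rate` would be `rate(call_edges, unresolved_calls)`. Whether an empty scan should report 0.0 or 1.0 is a product decision the diff does not settle.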

src/server/observability.rs (new file, 140 lines)

@ -0,0 +1,140 @@
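//! Per-request observability: request IDs + structured access logs.
//!
//! Layered above the security guard. Reuses the caller's `X-Request-Id` when one
//! is supplied, otherwise generates a short request id; attaches it as the
//! `X-Request-Id` response header and emits one log record per request (INFO,
//! WARN for 4xx, ERROR for 5xx) with method, path, status, and duration.
//! The `/api/events` SSE stream is not logged.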
use axum::extract::Request;
use axum::http::{HeaderName, HeaderValue};
use axum::middleware::Next;
use axum::response::Response;
use std::time::Instant;
use uuid::Uuid;
const REQUEST_ID_HEADER: HeaderName = HeaderName::from_static("x-request-id");
pub async fn observe(mut request: Request, next: Next) -> Response {
let request_id = request
.headers()
.get(&REQUEST_ID_HEADER)
.and_then(|v| v.to_str().ok())
.map(|s| s.to_string())
.unwrap_or_else(|| Uuid::new_v4().as_simple().to_string()[..12].to_string());
if let Ok(value) = HeaderValue::from_str(&request_id) {
request.headers_mut().insert(REQUEST_ID_HEADER, value);
}
let method = request.method().clone();
let path = request
.uri()
.path_and_query()
.map(|p| p.as_str().to_string())
.unwrap_or_else(|| request.uri().path().to_string());
let started = Instant::now();
let mut response = next.run(request).await;
let elapsed_ms = started.elapsed().as_secs_f64() * 1000.0;
let status = response.status();
if let Ok(value) = HeaderValue::from_str(&request_id) {
response.headers_mut().insert(REQUEST_ID_HEADER, value);
}
// Skip noisy SSE channel — long-lived stream pollutes logs.
if path != "/api/events" {
if status.is_server_error() {
tracing::error!(
request_id = %request_id,
method = %method,
path = %path,
status = status.as_u16(),
elapsed_ms = format!("{elapsed_ms:.1}"),
"request"
);
} else if status.is_client_error() {
tracing::warn!(
request_id = %request_id,
method = %method,
path = %path,
status = status.as_u16(),
elapsed_ms = format!("{elapsed_ms:.1}"),
"request"
);
} else {
tracing::info!(
request_id = %request_id,
method = %method,
path = %path,
status = status.as_u16(),
elapsed_ms = format!("{elapsed_ms:.1}"),
"request"
);
}
}
response
}
#[cfg(test)]
mod tests {
use super::*;
use axum::Router;
use axum::body::Body;
use axum::http::{Request as HttpRequest, StatusCode};
use axum::middleware;
use axum::routing::get;
use tower::util::ServiceExt;
#[tokio::test]
async fn adds_request_id_header_when_absent() {
let app: Router = Router::new()
.route("/ping", get(|| async { "pong" }))
.layer(middleware::from_fn(observe));
let resp = app
.oneshot(
HttpRequest::builder()
.uri("/ping")
.body(Body::empty())
.unwrap(),
)
.await
.unwrap();
assert_eq!(resp.status(), StatusCode::OK);
let id = resp
.headers()
.get("x-request-id")
.unwrap()
.to_str()
.unwrap();
assert!(!id.is_empty());
assert_eq!(id.len(), 12);
}
#[tokio::test]
async fn preserves_caller_supplied_request_id() {
let app: Router = Router::new()
.route("/ping", get(|| async { "pong" }))
.layer(middleware::from_fn(observe));
let resp = app
.oneshot(
HttpRequest::builder()
.uri("/ping")
.header("x-request-id", "abc-123")
.body(Body::empty())
.unwrap(),
)
.await
.unwrap();
assert_eq!(
resp.headers()
.get("x-request-id")
.unwrap()
.to_str()
.unwrap(),
"abc-123"
);
}
}
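
Note on `observe`: the two decisions the middleware makes (which id to use, and which log level a status maps to) can be isolated from axum and tracing. A std-only sketch; `Level`, `level_for`, and `resolve_request_id` are hypothetical names introduced for illustration and do not exist in the PR:

```rust
/// Severity chosen for an access-log record, by HTTP status class.
#[derive(Debug, PartialEq, Eq)]
enum Level {
    Info,
    Warn,
    Error,
}

/// 5xx logs at ERROR, 4xx at WARN, everything else at INFO,
/// matching the branch order in `observe`.
fn level_for(status: u16) -> Level {
    match status {
        500..=599 => Level::Error,
        400..=499 => Level::Warn,
        _ => Level::Info,
    }
}

/// Reuse the caller's id unchanged when present; otherwise keep the first
/// 12 characters of a freshly generated one (`generate` stands in for the
/// UUID call in the real middleware).
fn resolve_request_id(incoming: Option<&str>, generate: impl FnOnce() -> String) -> String {
    match incoming {
        Some(id) => id.to_string(),
        None => generate().chars().take(12).collect(),
    }
}
```

Only generated ids are truncated to 12 characters; a caller-supplied id is passed through at whatever length it arrives, which is what `preserves_caller_supplied_request_id` checks.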

src/server/owasp.rs (new file, 236 lines)

@ -0,0 +1,236 @@
//! Static rule-id → OWASP Top-10 (2021) mapping for the dashboard.
//!
//! Rule IDs usually follow the convention `{lang}.{family}.{name}` (e.g. `js.xss.outer_html`);
//! engine-emitted IDs use a dashed `family-subname` form instead. The family segment
//! determines the bucket. The mapping is conservative: ambiguous families go to the
//! closest fit, and families with no reasonable fit are left unbucketed.
use crate::server::models::OwaspBucket;
use std::collections::HashMap;
/// Extract the family token from a rule ID. Handles two ID shapes:
/// 1. `lang.family.name` — typical (e.g. `js.xss.outer_html`)
/// 2. `family-subname` or single-segment — engine-emitted (e.g.
/// `state-resource-leak`, `taint-unsanitised-flow`, `cfg-error-fallthrough`)
fn extract_family(rule_id: &str) -> &str {
if let Some(idx) = rule_id.find('.') {
let after = &rule_id[idx + 1..];
return match after.find('.') {
Some(n) => &after[..n],
None => after,
};
}
if let Some(idx) = rule_id.find('-') {
return &rule_id[..idx];
}
rule_id
}
/// Return the OWASP 2021 (code, label) pair for a given rule id, or `None` if unmapped.
pub fn owasp_bucket_for(rule_id: &str) -> Option<(&'static str, &'static str)> {
let family = extract_family(rule_id);
if family.is_empty() {
return None;
}
Some(match family {
// A01 — Broken Access Control
"auth" | "csrf" | "mass_assign" | "path" | "redirect" => ("A01", "Broken Access Control"),
// A02 — Cryptographic Failures
"crypto" | "secrets" => ("A02", "Cryptographic Failures"),
// A03 — Injection (covers SQLi, XSS, command, code-eval, template, NoSQL, LDAP, reflection,
// and engine-level taint findings without a more specific family tag).
"sqli" | "xss" | "cmdi" | "code_exec" | "template" | "nosql" | "ldap" | "reflection"
| "taint" => ("A03", "Injection"),
// A05 — Security Misconfiguration (TLS verify off, cookie flags, prototype pollution)
"config" | "transport" | "prototype" => ("A05", "Security Misconfiguration"),
// A08 — Software and Data Integrity Failures
"deser" => ("A08", "Software and Data Integrity Failures"),
// A09: Security Logging and Monitoring Failures (official OWASP 2021 name)
"log" => ("A09", "Security Logging and Monitoring Failures"),
// A10 — SSRF
"ssrf" => ("A10", "Server-Side Request Forgery"),
// Memory-safety + state-machine resource lifecycle bugs — closest OWASP fit is
// A04 Insecure Design (defensive depth).
"memory" | "state" => ("A04", "Insecure Design"),
// Quality findings (e.g. rs.quality.unwrap) and CFG structural issues
// (cfg-error-fallthrough) are reliability / code-health, not direct OWASP
// categories. We return None so they don't pollute the security buckets.
_ => return None,
})
}
/// Bucket all rule-id counts into OWASP categories, returning sorted-desc.
pub fn bucket_findings(by_rule: &HashMap<String, usize>) -> Vec<OwaspBucket> {
let mut totals: HashMap<&'static str, (&'static str, usize)> = HashMap::new();
for (rule_id, &count) in by_rule {
if let Some((code, label)) = owasp_bucket_for(rule_id) {
let entry = totals.entry(code).or_insert((label, 0));
entry.1 += count;
}
}
let mut out: Vec<OwaspBucket> = totals
.into_iter()
.map(|(code, (label, count))| OwaspBucket {
code: code.to_string(),
label: label.to_string(),
count,
})
.collect();
out.sort_by(|a, b| b.count.cmp(&a.count).then_with(|| a.code.cmp(&b.code)));
out
}
/// Bucket rule-id counts into issue categories using the family segment.
/// Broader than OWASP, with friendlier labels (e.g. "Tainted Flow", "Code Quality").
pub fn issue_categories(
by_rule: &HashMap<String, usize>,
) -> Vec<crate::server::models::IssueCategoryBucket> {
let mut totals: HashMap<&'static str, usize> = HashMap::new();
for (rule_id, &count) in by_rule {
let label = issue_category_label(rule_id);
*totals.entry(label).or_insert(0) += count;
}
let mut out: Vec<_> = totals
.into_iter()
.map(
|(label, count)| crate::server::models::IssueCategoryBucket {
label: label.to_string(),
count,
},
)
.collect();
out.sort_by(|a, b| b.count.cmp(&a.count).then_with(|| a.label.cmp(&b.label)));
out
}
fn issue_category_label(rule_id: &str) -> &'static str {
match extract_family(rule_id) {
"sqli" => "SQL Injection",
"xss" => "Cross-Site Scripting",
"cmdi" => "Command Injection",
"code_exec" => "Code Execution",
"deser" => "Deserialization",
"ssrf" => "SSRF",
"path" => "Path Traversal",
"auth" => "Access Control",
"csrf" => "CSRF",
"mass_assign" => "Mass Assignment",
"crypto" => "Weak Crypto",
"secrets" => "Hardcoded Secrets",
"config" => "Misconfiguration",
"transport" => "Insecure Transport",
"prototype" => "Prototype Pollution",
"memory" => "Memory Safety",
"reflection" => "Reflection",
"redirect" => "Open Redirect",
"log" => "Logging",
"template" => "Template Injection",
"taint" => "Tainted Flow",
"state" => "Resource Lifecycle",
"cfg" => "Control-Flow",
"quality" => "Code Quality",
_ => "Other",
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn maps_xss_to_a03() {
assert_eq!(
owasp_bucket_for("js.xss.outer_html"),
Some(("A03", "Injection"))
);
}
#[test]
fn maps_auth_to_a01() {
assert_eq!(
owasp_bucket_for("rs.auth.missing_ownership_check"),
Some(("A01", "Broken Access Control"))
);
}
#[test]
fn unknown_family_returns_none() {
assert_eq!(owasp_bucket_for("js.weirdthing.foo"), None);
}
#[test]
fn malformed_rule_returns_none() {
// dashed id "not-a-rule" → family "not" → unmapped → None
assert_eq!(owasp_bucket_for("not-a-rule"), None);
// "js.onlytwo" — family is "onlytwo" which is unmapped
assert_eq!(owasp_bucket_for("js.onlytwo"), None);
}
#[test]
fn extract_family_handles_dashed_ids() {
assert_eq!(extract_family("state-resource-leak"), "state");
assert_eq!(extract_family("taint-unsanitised-flow"), "taint");
assert_eq!(extract_family("cfg-error-fallthrough"), "cfg");
assert_eq!(extract_family("rs.quality.unwrap"), "quality");
assert_eq!(extract_family(""), "");
}
#[test]
fn taint_findings_bucket_to_a03() {
assert_eq!(
owasp_bucket_for("taint-unsanitised-flow"),
Some(("A03", "Injection"))
);
}
#[test]
fn quality_and_cfg_are_not_owasp() {
assert_eq!(owasp_bucket_for("rs.quality.unwrap"), None);
assert_eq!(owasp_bucket_for("cfg-error-fallthrough"), None);
}
#[test]
fn issue_category_handles_engine_ids() {
assert_eq!(issue_category_label("rs.quality.unwrap"), "Code Quality");
assert_eq!(
issue_category_label("state-resource-leak"),
"Resource Lifecycle"
);
assert_eq!(
issue_category_label("cfg-error-fallthrough"),
"Control-Flow"
);
assert_eq!(
issue_category_label("taint-unsanitised-flow"),
"Tainted Flow"
);
}
#[test]
fn bucket_findings_sorts_desc() {
let mut m = HashMap::new();
m.insert("js.xss.outer_html".to_string(), 3);
m.insert("rs.auth.missing_ownership_check".to_string(), 5);
m.insert("js.crypto.math_random".to_string(), 2);
let out = bucket_findings(&m);
assert_eq!(out[0].code, "A01");
assert_eq!(out[0].count, 5);
assert_eq!(out[1].code, "A03");
assert_eq!(out[1].count, 3);
assert_eq!(out[2].code, "A02");
assert_eq!(out[2].count, 2);
}
#[test]
fn issue_category_label_recognises_simple_families() {
assert_eq!(
issue_category_label("js.xss.outer_html"),
"Cross-Site Scripting"
);
assert_eq!(
issue_category_label("py.cmdi.os_system"),
"Command Injection"
);
assert_eq!(issue_category_label("garbage"), "Other");
}
}
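
Note on `extract_family` edge cases: the tests above cover the two documented ID shapes. The function is reproduced below from the diff so that it compiles on its own, to make a few less obvious inputs easy to check; the extra inputs are illustrative and not taken from the PR:

```rust
/// Copied from the diff above so this block is self-contained.
fn extract_family(rule_id: &str) -> &str {
    // Dotted IDs win: take the segment between the first and second dot.
    if let Some(idx) = rule_id.find('.') {
        let after = &rule_id[idx + 1..];
        return match after.find('.') {
            Some(n) => &after[..n],
            None => after,
        };
    }
    // Otherwise treat everything before the first dash as the family.
    if let Some(idx) = rule_id.find('-') {
        return &rule_id[..idx];
    }
    rule_id
}
```

Two consequences follow from the branch order. A dotted ID whose family contains a dash keeps the dash (`py.my-family.rule` yields `my-family`), and an ID with two adjacent dots yields an empty family, which is the case the `family.is_empty()` guard in `owasp_bucket_for` handles.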


@ -1,16 +1,17 @@
use crate::commands::config as config_cmd;
use crate::labels;
use crate::server::app::{AppState, ServerEvent};
use crate::server::models::{LabelEntryView, ProfileView, RuleView, TerminatorView};
use crate::utils::config::{CapName, RuleKind, ScanProfile};
use crate::utils::config::{CapName, Config, RuleKind, ScanProfile};
use axum::extract::{Path, State};
use axum::http::StatusCode;
use axum::routing::get;
use axum::{Json, Router};
use std::fs;
pub fn routes() -> Router<AppState> {
Router::new()
.route("/config", get(get_config))
.route("/config/raw", get(get_config_raw).put(put_config_raw))
.route(
"/config/rules",
get(list_rules).post(add_rule).delete(remove_rule),
@ -55,6 +56,67 @@ async fn get_config(State(state): State<AppState>) -> Json<serde_json::Value> {
Json(serde_json::to_value(&*config).unwrap_or_default())
}
// ── Raw nyx.local read/write ─────────────────────────────────────────────────
async fn get_config_raw(State(state): State<AppState>) -> Json<serde_json::Value> {
let local_path = state.config_dir.join("nyx.local");
let exists = local_path.exists();
let content = if exists {
fs::read_to_string(&local_path).unwrap_or_default()
} else {
String::new()
};
Json(serde_json::json!({
"path": local_path.display().to_string(),
"exists": exists,
"content": content,
}))
}
async fn put_config_raw(
State(state): State<AppState>,
Json(body): Json<serde_json::Value>,
) -> Result<Json<serde_json::Value>, (StatusCode, Json<serde_json::Value>)> {
let content = body
.get("content")
.and_then(|v| v.as_str())
.ok_or_else(|| bad_request("missing content field"))?
.to_string();
// Validate by parsing into Config (round-trip check).
let parsed: Config =
toml::from_str(&content).map_err(|e| bad_request(&format!("invalid TOML: {e}")))?;
if let Err(errs) = parsed.validate() {
let joined = errs
.iter()
.map(|e| e.to_string())
.collect::<Vec<_>>()
.join("; ");
return Err(bad_request(&format!("config validation failed: {joined}")));
}
let local_path = state.config_dir.join("nyx.local");
fs::write(&local_path, &content)
.map_err(|e| bad_request(&format!("failed to write {}: {e}", local_path.display())))?;
// Reload the merged config so live state matches the file.
match Config::load(&state.config_dir) {
Ok((reloaded, _note)) => {
*state.config.write() = reloaded;
}
Err(e) => return Err(bad_request(&format!("config reload failed: {e}"))),
}
let _ = state.event_tx.send(ServerEvent::ConfigChanged);
Ok(Json(serde_json::json!({
"status": "ok",
"path": local_path.display().to_string(),
"bytes": content.len(),
})))
}
// ── Custom rules (existing endpoints) ────────────────────────────────────────
async fn list_rules(State(state): State<AppState>) -> Json<Vec<RuleView>> {
@ -220,29 +282,17 @@ async fn remove_terminator(
// ── Sources / Sinks / Sanitizers (by kind) ───────────────────────────────────
fn list_by_kind(state: &AppState, target_kind: &str) -> Vec<LabelEntryView> {
let builtins = labels::enumerate_builtin_rules();
let config = state.config.read();
let mut out: Vec<LabelEntryView> = builtins
.iter()
.filter(|r| r.kind == target_kind && !r.is_gated)
.map(|r| LabelEntryView {
lang: r.language.clone(),
matchers: r.matchers.clone(),
cap: r.cap.clone(),
case_sensitive: r.case_sensitive,
is_builtin: true,
})
.collect();
// Add custom rules of the target kind
// Built-in rules live on /api/rules — keep this endpoint focused on the
// user's own additions in nyx.local.
let target_rule_kind = match target_kind {
"source" => RuleKind::Source,
"sanitizer" => RuleKind::Sanitizer,
"sink" => RuleKind::Sink,
_ => return out,
_ => return Vec::new(),
};
let config = state.config.read();
let mut out: Vec<LabelEntryView> = Vec::new();
for (lang, lang_cfg) in &config.analysis.languages {
for cr in &lang_cfg.rules {
if cr.kind == target_rule_kind {
@ -256,7 +306,6 @@ fn list_by_kind(state: &AppState, target_kind: &str) -> Vec<LabelEntryView> {
}
}
}
out
}
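
Note on `put_config_raw`: the handler validates the submitted text before touching the file, then writes with `fs::write`. The validate-first ordering can be sketched in std-only Rust as follows. `write_validated` and its `validate` callback are hypothetical names, and the temp-file-plus-rename step is an assumed hardening variant, not what the PR does:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Validate `content` first; only if that succeeds, write it to a sibling
/// temp file and rename it over `target`, so a rejected or half-written
/// payload never replaces the existing file.
fn write_validated(
    target: &Path,
    content: &str,
    validate: impl Fn(&str) -> Result<(), String>,
) -> io::Result<()> {
    // Reject before any filesystem side effect.
    validate(content).map_err(|msg| io::Error::new(io::ErrorKind::InvalidInput, msg))?;

    // Write next to the target so the rename stays on one filesystem.
    let tmp = target.with_extension("tmp");
    fs::write(&tmp, content)?;
    fs::rename(&tmp, target)
}
```

In the handler, the callback would correspond to the `toml::from_str` parse plus `Config::validate()` round-trip; the reload and `ConfigChanged` broadcast would still happen after the write succeeds.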


@ -26,6 +26,9 @@ pub fn routes() -> Router<AppState> {
.route("/debug/call-graph", get(get_call_graph))
.route("/debug/abstract-interp", get(get_abstract_interp))
.route("/debug/symex", get(get_symex))
.route("/debug/pointer", get(get_pointer))
.route("/debug/type-facts", get(get_type_facts))
.route("/debug/auth", get(get_auth))
}
// ── Query params ─────────────────────────────────────────────────────────────
@ -117,7 +120,7 @@ async fn get_ssa(
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, _opt) = debug::analyse_function_ssa(&analysis, &q.function)?;
let (ssa, _opt, _cfg) = debug::analyse_function_ssa(&analysis, &q.function)?;
Ok(Json(SsaBodyView::from_ssa(&ssa, &analysis.bytes)))
}
@ -130,7 +133,7 @@ async fn get_taint(
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, opt) = debug::analyse_function_ssa(&analysis, &q.function)?;
let (ssa, opt, body_cfg) = debug::analyse_function_ssa(&analysis, &q.function)?;
// Try to load global summaries from DB for cross-file context
let global = load_global_summaries(&state);
@ -141,7 +144,7 @@ async fn get_taint(
let (events, _entry_states, exit_states) = debug::analyse_function_taint(
&ssa,
analysis.cfg(),
body_cfg,
analysis.lang,
analysis.summaries(),
global.as_ref(),
@ -168,13 +171,13 @@ async fn get_abstract_interp(
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, opt) = debug::analyse_function_ssa(&analysis, &q.function)?;
let (ssa, opt, body_cfg) = debug::analyse_function_ssa(&analysis, &q.function)?;
let global = load_global_summaries(&state);
let (_events, block_states, _exit_states) = debug::analyse_function_taint(
&ssa,
analysis.cfg(),
body_cfg,
analysis.lang,
analysis.summaries(),
global.as_ref(),
@ -262,16 +265,59 @@ async fn get_symex(
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, opt) = debug::analyse_function_ssa(&analysis, &q.function)?;
let (ssa, opt, body_cfg) = debug::analyse_function_ssa(&analysis, &q.function)?;
let global = load_global_summaries(&state);
let sym_state =
debug::analyse_function_symex(&ssa, analysis.cfg(), analysis.lang, &opt, global.as_ref());
debug::analyse_function_symex(&ssa, body_cfg, analysis.lang, &opt, global.as_ref());
Ok(Json(SymexView::from_symbolic_state(&sym_state, &ssa)))
}
/// GET /api/debug/pointer?file=<path>&function=<name>
/// Return the field-sensitive Steensgaard points-to facts for a function.
async fn get_pointer(
State(state): State<AppState>,
Query(q): Query<FileFunctionQuery>,
) -> Result<Json<PointerView>, StatusCode> {
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, facts) = debug::analyse_function_pointer(&analysis, &q.function)?;
Ok(Json(PointerView::from_facts(&facts, &ssa)))
}
/// GET /api/debug/type-facts?file=<path>&function=<name>
/// Return per-function type-fact details derived from the SSA optimiser.
async fn get_type_facts(
State(state): State<AppState>,
Query(q): Query<FileFunctionQuery>,
) -> Result<Json<TypeFactsView>, StatusCode> {
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let analysis = debug::analyse_file(&path, &config)?;
let (ssa, opt, _cfg) = debug::analyse_function_ssa(&analysis, &q.function)?;
Ok(Json(TypeFactsView::from_optimize(
&opt,
&ssa,
&analysis.bytes,
)))
}
/// GET /api/debug/auth?file=<path>
/// Return the file-scoped authorization model — routes, units,
/// sensitive operations, and auth checks — for the debug UI.
async fn get_auth(
State(state): State<AppState>,
Query(q): Query<FileQuery>,
) -> Result<Json<AuthAnalysisView>, StatusCode> {
let path = validate_and_resolve(&state.scan_root, &q.file)?;
let config = state.config.read();
let (model, bytes, enabled) = debug::analyse_file_auth(&path, &config)?;
Ok(Json(AuthAnalysisView::from_model(&model, &bytes, enabled)))
}
// ── Helpers ──────────────────────────────────────────────────────────────────
/// Load global summaries from DB if available.
@ -396,7 +442,9 @@ mod tests {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
},
)],
)
@ -466,6 +514,8 @@ mod tests {
value_defs: vec![],
cfg_node_map: std::collections::HashMap::new(),
exception_edges: vec![],
field_interner: crate::ssa::ir::FieldInterner::default(),
field_writes: std::collections::HashMap::new(),
},
false,
false,
@ -486,6 +536,8 @@ mod tests {
value_defs: vec![],
cfg_node_map: std::collections::HashMap::new(),
exception_edges: vec![],
field_interner: crate::ssa::ir::FieldInterner::default(),
field_writes: std::collections::HashMap::new(),
},
true,
true,
@ -506,6 +558,8 @@ mod tests {
value_defs: vec![],
cfg_node_map: std::collections::HashMap::new(),
exception_edges: vec![],
field_interner: crate::ssa::ir::FieldInterner::default(),
field_writes: std::collections::HashMap::new(),
},
true,
false,
@ -599,7 +653,9 @@ mod tests {
abstract_transfer: vec![],
param_return_paths: vec![],
points_to: Default::default(),
field_points_to: Default::default(),
return_path_facts: smallvec::SmallVec::new(),
typed_call_receivers: vec![],
},
)],
)


@ -54,7 +54,19 @@ struct TreeEntry {
#[derive(Debug, Serialize)]
struct SymbolEntry {
name: String,
/// Legacy display kind (`"function"` / `"method"`) used by existing CSS
/// classes in the frontend. Kept for backward-compat — new consumers
/// should prefer `func_kind`.
kind: String,
/// Structural [`crate::symbol::FuncKind`] slug (`"fn"`, `"method"`,
/// `"closure"`, `"ctor"`, `"getter"`, `"setter"`, `"toplevel"`). Lets
/// the UI distinguish anonymous closures (`<anon#N>`) from named
/// functions and offer a default-hide toggle.
func_kind: String,
/// Enclosing container path (class / impl / module / outer function).
/// Empty for free top-level functions. Surfaced so the UI can render
/// closures as `<anon#N> [in outer_fn]`.
container: String,
line: Option<usize>,
finding_count: usize,
namespace: Option<String>,
@ -278,16 +290,21 @@ async fn get_symbols(
let entries: Vec<SymbolEntry> = symbols
.into_iter()
.map(|(name, arity, _lang, namespace)| {
let kind = if !namespace.is_empty() && namespace != name {
"method".to_string()
} else {
"function".to_string()
.map(|(name, arity, _lang, namespace, container, func_kind)| {
// Legacy `kind` field — still used by existing CSS classes
// (`symbol-kind-method`, `symbol-kind-function`). Map any
// method-like FuncKind onto `"method"` and everything else
// onto `"function"` so the rendered icon stays sensible.
let kind = match func_kind.as_str() {
"method" | "ctor" | "getter" | "setter" => "method".to_string(),
_ => "function".to_string(),
};
let finding_count = func_finding_counts.get(&name).copied().unwrap_or(0);
SymbolEntry {
name,
kind,
func_kind,
container,
line: None,
finding_count,
namespace: if namespace.is_empty() {


@ -1,7 +1,7 @@
use crate::server::app::AppState;
use crate::server::error::{ApiError, ApiResult};
use crate::utils::path::{DEFAULT_UI_MAX_FILE_BYTES, RepoPathError, open_repo_text_file};
use axum::extract::{Query, State};
use axum::http::StatusCode;
use axum::routing::get;
use axum::{Json, Router};
use serde::{Deserialize, Serialize};
@ -33,9 +33,9 @@ struct FileResponse {
async fn get_file(
State(state): State<AppState>,
Query(query): Query<FileQuery>,
) -> Result<Json<FileResponse>, StatusCode> {
) -> ApiResult<Json<FileResponse>> {
let opened = open_repo_text_file(&state.scan_root, &query.path, DEFAULT_UI_MAX_FILE_BYTES)
.map_err(map_path_error)?;
.map_err(|e| map_path_error(e, &query.path))?;
let content = opened.content;
let all_lines: Vec<&str> = content.lines().collect();
let total_lines = all_lines.len();
@ -64,14 +64,25 @@ async fn get_file(
}))
}
fn map_path_error(err: RepoPathError) -> StatusCode {
fn map_path_error(err: RepoPathError, path: &str) -> ApiError {
match err {
RepoPathError::InvalidPath | RepoPathError::OutsideRoot => StatusCode::FORBIDDEN,
RepoPathError::NotFound => StatusCode::NOT_FOUND,
RepoPathError::TooLarge
| RepoPathError::InvalidText
| RepoPathError::NotFile
| RepoPathError::NotDirectory => StatusCode::BAD_REQUEST,
RepoPathError::Io => StatusCode::INTERNAL_SERVER_ERROR,
RepoPathError::InvalidPath => ApiError::forbidden(format!("invalid path: {path}")),
RepoPathError::OutsideRoot => {
ApiError::forbidden(format!("path outside scan root: {path}"))
}
RepoPathError::NotFound => ApiError::not_found(format!("file not found: {path}")),
RepoPathError::TooLarge => {
ApiError::bad_request(format!("file too large to display: {path}"))
}
RepoPathError::InvalidText => {
ApiError::bad_request(format!("file is not valid UTF-8 text: {path}"))
}
RepoPathError::NotFile => {
ApiError::bad_request(format!("path is not a regular file: {path}"))
}
RepoPathError::NotDirectory => {
ApiError::bad_request(format!("path is not a directory: {path}"))
}
RepoPathError::Io => ApiError::internal(format!("I/O error reading: {path}")),
}
}


@ -2,13 +2,13 @@
use crate::commands::scan::Diag;
use crate::database::index::Indexer;
use crate::server::app::AppState;
use crate::server::app::{AppState, CachedFindings};
use crate::server::error::{ApiError, ApiResult};
use crate::server::models::{
FilterValues, FindingSummary, FindingView, collect_filter_values, finding_from_diag,
finding_from_diag_with_detail, overlay_triage_states, summarize_findings,
};
use axum::extract::{Path, Query, State};
use axum::http::StatusCode;
use axum::routing::get;
use axum::{Json, Router};
use serde::Deserialize;
@ -22,16 +22,30 @@ pub fn routes() -> Router<AppState> {
.route("/findings/{index}", get(get_finding))
}
/// Sentinel job id for "we read this from SQLite, not from JobManager."
/// Used as the cache-key prefix (`__db_fallback__:<scan id>`) when no in-memory
/// job exists (e.g. fresh server boot), and on its own when no scan was found.
const DB_FALLBACK_KEY: &str = "__db_fallback__";
/// Bundle returned by [`load_latest_findings`]: the raw diags plus the cache
/// key under which their derived views should be stored. The cache key is the
/// in-memory job id when available, or [`DB_FALLBACK_KEY`] when we fell back
/// to SQLite.
struct LoadedFindings {
cache_key: String,
findings: Arc<Vec<Diag>>,
}
/// Load findings for the latest completed scan, falling back to DB if no
/// in-memory completed scan exists (e.g. after a server restart).
pub fn load_latest_findings(state: &AppState) -> Arc<Vec<Diag>> {
// In-memory first
fn load_latest_findings_internal(state: &AppState) -> LoadedFindings {
if let Some(job) = state.job_manager.get_latest_completed() {
if let Some(ref findings) = job.findings {
return Arc::clone(findings);
return LoadedFindings {
cache_key: job.id.clone(),
findings: Arc::clone(findings),
};
}
}
// DB fallback — find the most recent completed scan with findings
if let Some(ref pool) = state.db_pool {
if let Ok(idx) = Indexer::from_pool("_scans", pool) {
if let Ok(scans) = idx.list_scans(20) {
@ -39,7 +53,10 @@ pub fn load_latest_findings(state: &AppState) -> Arc<Vec<Diag>> {
if scan.status == "completed" {
if let Some(json) = scan.findings_json.as_deref() {
if let Ok(diags) = serde_json::from_str::<Vec<Diag>>(json) {
return Arc::new(diags);
return LoadedFindings {
cache_key: format!("{DB_FALLBACK_KEY}:{}", scan.id),
findings: Arc::new(diags),
};
}
}
}
@ -47,10 +64,61 @@ pub fn load_latest_findings(state: &AppState) -> Arc<Vec<Diag>> {
}
}
}
Arc::new(Vec::new())
LoadedFindings {
cache_key: DB_FALLBACK_KEY.to_string(),
findings: Arc::new(Vec::new()),
}
}
/// Build (or fetch from cache) the per-scan derived views.
///
/// Returns clones of `Arc`s so callers can drop the lock immediately and work
/// without contention. Triage state is *not* baked into the cached views — it
/// changes on a different cadence and is overlaid per request.
fn cached_for_latest(state: &AppState) -> CachedFindings {
let loaded = load_latest_findings_internal(state);
// Fast path: cache hit for the same job id.
if let Some(cached) = state.findings_cache.read().as_ref() {
if cached.job_id == loaded.cache_key {
return cached.clone();
}
}
// Slow path: rebuild. Guard against concurrent rebuilds of the same key —
// a second writer that finds the cache already populated for our key
// simply returns it.
let mut guard = state.findings_cache.write();
if let Some(existing) = guard.as_ref() {
if existing.job_id == loaded.cache_key {
return existing.clone();
}
}
let views: Vec<FindingView> = loaded
.findings
.iter()
.enumerate()
.map(|(i, d)| finding_from_diag(i, d))
.collect();
let summary = summarize_findings(&loaded.findings);
let filters = collect_filter_values(&loaded.findings);
let entry = CachedFindings {
job_id: loaded.cache_key,
views: Arc::new(views),
summary: Arc::new(summary),
filters: Arc::new(filters),
};
*guard = Some(entry.clone());
entry
}
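The read-then-write check in `cached_for_latest` can be sketched in isolation. This is a minimal illustration, not the PR's code: it assumes `std::sync::RwLock` (the diff's `.read()` without `unwrap` suggests a different lock type), and `Cached` and `get_or_build` are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for `CachedFindings`: a key plus one shared derived view.
#[derive(Clone)]
struct Cached {
    key: String,
    views: Arc<Vec<String>>,
}

// Return the cached entry for `key`, building it at most once per key.
fn get_or_build(
    cache: &RwLock<Option<Cached>>,
    key: &str,
    build: impl FnOnce() -> Vec<String>,
) -> Cached {
    // Fast path: a shared read lock is enough on a cache hit. The read guard
    // is a temporary and is released when this `if let` statement ends.
    if let Some(hit) = cache.read().unwrap().as_ref() {
        if hit.key == key {
            return hit.clone();
        }
    }
    // Slow path: take the write lock, then re-check, because another writer
    // may have populated the same key while we waited for the lock.
    let mut guard = cache.write().unwrap();
    if let Some(existing) = guard.as_ref() {
        if existing.key == key {
            return existing.clone();
        }
    }
    let entry = Cached {
        key: key.to_string(),
        views: Arc::new(build()),
    };
    *guard = Some(entry.clone());
    entry
}
```

Cloning the entry only bumps `Arc` reference counts, so callers can release the lock before doing any per-request work.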
/// Load triage states and suppression rules from DB, apply to views.
///
/// Triage state is overlaid onto a freshly-cloned `Vec` rather than mutating
/// the cached views so concurrent readers see consistent data and the cache
/// stays valid across triage edits.
fn apply_triage_overlay(state: &AppState, views: &mut [FindingView]) {
if let Some(ref pool) = state.db_pool {
if let Ok(idx) = Indexer::from_pool("_triage", pool) {
@ -80,19 +148,11 @@ struct FindingsQuery {
async fn list_findings(
State(state): State<AppState>,
Query(query): Query<FindingsQuery>,
) -> Result<Json<serde_json::Value>, StatusCode> {
let findings = load_latest_findings(&state);
let mut views: Vec<FindingView> = findings
.iter()
.enumerate()
.map(|(i, d)| finding_from_diag(i, d))
.collect();
// Overlay triage states from DB before filtering
) -> ApiResult<Json<serde_json::Value>> {
let cached = cached_for_latest(&state);
let mut views: Vec<FindingView> = (*cached.views).clone();
apply_triage_overlay(&state, &mut views);
// Apply filters.
if let Some(ref sev) = query.severity {
let sev_upper = sev.to_ascii_uppercase();
views.retain(|f| f.severity.as_db_str() == sev_upper);
@ -138,7 +198,6 @@ async fn list_findings(
});
}
// Sort.
match query.sort_by.as_deref() {
Some("severity") => views.sort_by_key(|a| a.severity),
Some("path") | Some("file") => views.sort_by(|a, b| a.path.cmp(&b.path)),
@ -163,13 +222,12 @@ async fn list_findings(
}),
Some("status") => views.sort_by(|a, b| a.status.cmp(&b.status)),
Some("category") => views.sort_by_key(|a| a.category.to_string()),
_ => {} // default order (by index)
_ => {}
}
if query.sort_dir.as_deref() == Some("desc") {
views.reverse();
}
// Paginate.
let total = views.len();
let page = query.page.unwrap_or(1).max(1);
let per_page = query.per_page.unwrap_or(50).clamp(1, 10000);
@ -185,22 +243,28 @@ async fn list_findings(
}
async fn findings_summary(State(state): State<AppState>) -> Json<FindingSummary> {
let findings = load_latest_findings(&state);
Json(summarize_findings(&findings))
Json((*cached_for_latest(&state).summary).clone())
}
async fn findings_filters(State(state): State<AppState>) -> Json<FilterValues> {
let findings = load_latest_findings(&state);
Json(collect_filter_values(&findings))
Json((*cached_for_latest(&state).filters).clone())
}
async fn get_finding(
State(state): State<AppState>,
Path(index): Path<usize>,
) -> Result<Json<FindingView>, StatusCode> {
let findings = load_latest_findings(&state);
let diag = findings.get(index).ok_or(StatusCode::NOT_FOUND)?;
) -> ApiResult<Json<FindingView>> {
let findings = load_latest_findings_internal(&state).findings;
let diag = findings
.get(index)
.ok_or_else(|| ApiError::not_found(format!("finding {index} not found")))?;
let mut view = finding_from_diag_with_detail(index, diag, &state.scan_root, &findings);
apply_triage_overlay(&state, std::slice::from_mut(&mut view));
Ok(Json(view))
}
/// Public alias for callers (overview, explorer, triage) that just want
/// the raw diag list. Kept as `load_latest_findings` for source-compat.
pub fn load_latest_findings(state: &AppState) -> Arc<Vec<Diag>> {
load_latest_findings_internal(state).findings
}


@ -2,21 +2,31 @@
use crate::commands::scan::Diag;
use crate::database::index::{Indexer, ScanRecord};
use crate::evidence::Confidence;
use crate::evidence::{Confidence, Verdict};
use crate::server::app::AppState;
use crate::server::models::{
Insight, NoisyRule, OverviewResponse, ScanSummary, TrendPoint, by_language_from_findings,
compute_fingerprint, summarize_findings, top_directories_from_findings, top_n_from_map,
BacklogStats, BaselineInfo, ConfidenceDistribution, HotSink, Insight, LanguageHealth,
NoisyRule, OverviewCount, OverviewResponse, PostureSummary, ScanSummary, ScannerQuality,
SuppressionHygiene, TrendPoint, WeightedFile, by_language_from_findings, compute_fingerprint,
lang_for_finding_path, summarize_findings, top_directories_from_findings, top_n_from_map,
};
use axum::extract::State;
use axum::routing::get;
use crate::server::owasp;
use axum::extract::{Path as AxPath, State};
use axum::http::StatusCode;
use axum::routing::{delete, get, post};
use axum::{Json, Router};
use serde::Deserialize;
use std::collections::{HashMap, HashSet};
const BASELINE_KEY: &str = "baseline_scan_id";
pub fn routes() -> Router<AppState> {
Router::new()
.route("/overview", get(overview))
.route("/overview/trends", get(overview_trends))
.route("/overview/baseline", post(set_baseline))
.route("/overview/baseline", delete(clear_baseline))
.route("/overview/baseline/{scan_id}", post(set_baseline_path))
}
/// GET /api/overview — aggregated dashboard data.
@ -25,7 +35,7 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
let findings = crate::server::routes::findings::load_latest_findings(&state);
// 2. Collect recent scans (in-memory + DB, deduped)
let recent_scans = collect_recent_scans(&state, 10);
let recent_scans = collect_recent_scans(&state, 20);
// 3. Basic summary
let summary = summarize_findings(&findings);
@ -37,8 +47,10 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
let latest_scan_at = latest_completed.and_then(|s| s.started_at.clone());
let latest_scan_duration = latest_completed.and_then(|s| s.duration_secs);
// 5. New/fixed since last scan
let (new_since_last, fixed_since_last) = compute_delta(&state, &findings);
// 5. Walk historical scans once for delta + posture + backlog + drift.
let history = ScanHistory::load(&state, 20);
let (new_since_last, fixed_since_last, reintroduced_count) =
history.compare_to_current(&findings);
// 6. High confidence rate
let high_confidence_rate = if findings.is_empty() {
@ -67,6 +79,7 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
&summary,
new_since_last,
fixed_since_last,
reintroduced_count,
triage_coverage,
&noisy_rules,
);
@ -80,6 +93,51 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
"normal".to_string()
};
// ── New (Tier 1/2/3) ──
let confidence_distribution = Some(compute_confidence_distribution(&findings));
let weighted_top_files = compute_weighted_top_files(&findings, 10);
let cross_file_ratio = Some(compute_cross_file_ratio(&findings));
let hot_sinks = compute_hot_sinks(&findings, 5);
let owasp_buckets = owasp::bucket_findings(&summary.by_rule);
let issue_categories = owasp::issue_categories(&summary.by_rule);
let scanner_quality =
compute_scanner_quality(&state, &findings, latest_completed.map(|s| s.id.as_str()));
let language_health = compute_language_health(&findings);
let suppression_hygiene = Some(compute_suppression_hygiene(&state, &findings));
let backlog = Some(compute_backlog(&state, &findings, &history));
let baseline = compute_baseline_info(&state, &findings);
let posture = Some(build_posture(
new_since_last,
fixed_since_last,
reintroduced_count,
&history,
summary.total,
));
let health = Some(crate::server::health::compute(
&crate::server::health::HealthInputs {
summary: &summary,
findings: &findings,
triage_coverage,
new_since_last,
fixed_since_last,
reintroduced: reintroduced_count,
// Files-scanned proxy for repo size — used for size-aware
// severity dampening in `health::compute`. See
// `docs/health-score-audit.md` for calibration data.
repo_files: scanner_quality
.as_ref()
.map(|q| q.files_scanned)
.filter(|&f| f > 0),
backlog: backlog.as_ref(),
// Trend is meaningless without ≥2 completed scans —
// matches the first-scan check `compare_to_current` uses.
has_history: history.scans.len() >= 2,
// Suppression-hygiene modifier — populated when the
// suppression panel was computable for this scan.
blanket_suppression_rate: suppression_hygiene.as_ref().map(|s| s.blanket_rate),
},
));
Json(OverviewResponse {
state: state_str,
total_findings: summary.total,
@ -90,8 +148,8 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
latest_scan_duration_secs: latest_scan_duration,
latest_scan_id,
latest_scan_at,
by_severity: summary.by_severity,
by_category: summary.by_category,
by_severity: summary.by_severity.clone(),
by_category: summary.by_category.clone(),
by_language,
top_files,
top_directories,
@ -99,6 +157,19 @@ async fn overview(State(state): State<AppState>) -> Json<OverviewResponse> {
noisy_rules,
recent_scans: recent_scans.into_iter().take(10).collect(),
insights,
health,
posture,
backlog,
weighted_top_files,
confidence_distribution,
scanner_quality,
issue_categories,
hot_sinks,
owasp_buckets,
cross_file_ratio,
baseline,
language_health,
suppression_hygiene,
})
}
@ -142,8 +213,198 @@ async fn overview_trends(State(state): State<AppState>) -> Json<Vec<TrendPoint>>
Json(points)
}
#[derive(Debug, Deserialize)]
struct BaselineBody {
scan_id: String,
}
/// POST /api/overview/baseline { scan_id } — pin a scan as the baseline for drift comparison.
async fn set_baseline(
State(state): State<AppState>,
Json(body): Json<BaselineBody>,
) -> Result<StatusCode, StatusCode> {
set_baseline_inner(&state, &body.scan_id)
}
/// POST /api/overview/baseline/{scan_id} — convenience path-form for clients without a JSON body.
async fn set_baseline_path(
State(state): State<AppState>,
AxPath(scan_id): AxPath<String>,
) -> Result<StatusCode, StatusCode> {
set_baseline_inner(&state, &scan_id)
}
fn set_baseline_inner(state: &AppState, scan_id: &str) -> Result<StatusCode, StatusCode> {
if scan_id.is_empty() {
return Err(StatusCode::BAD_REQUEST);
}
let pool = state
.db_pool
.as_ref()
.ok_or(StatusCode::SERVICE_UNAVAILABLE)?;
let idx = Indexer::from_pool("_scans", pool).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
idx.set_metadata(BASELINE_KEY, scan_id)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(StatusCode::NO_CONTENT)
}
/// DELETE /api/overview/baseline — clear the pinned baseline.
async fn clear_baseline(State(state): State<AppState>) -> Result<StatusCode, StatusCode> {
let pool = state
.db_pool
.as_ref()
.ok_or(StatusCode::SERVICE_UNAVAILABLE)?;
let idx = Indexer::from_pool("_scans", pool).map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
idx.delete_metadata(BASELINE_KEY)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(StatusCode::NO_CONTENT)
}
// ── Helpers ──────────────────────────────────────────────────────────────────
/// Cached view of recent completed scans' fingerprints + timestamps. Built once
/// per overview request and reused by delta / posture / backlog / drift.
struct ScanHistory {
/// Completed scans, oldest → newest.
scans: Vec<HistoricScan>,
/// fingerprint → earliest started_at (RFC-3339) seen across history.
first_seen: HashMap<String, String>,
}
struct HistoricScan {
#[allow(dead_code)]
id: String,
#[allow(dead_code)]
started_at: Option<String>,
fingerprints: HashSet<String>,
total: usize,
}
impl ScanHistory {
fn load(state: &AppState, limit: usize) -> Self {
let mut scans = Vec::new();
let mut first_seen: HashMap<String, String> = HashMap::new();
let Some(ref pool) = state.db_pool else {
return Self { scans, first_seen };
};
let Ok(idx) = Indexer::from_pool("_scans", pool) else {
return Self { scans, first_seen };
};
let mut records = idx.list_scans(limit as i64).unwrap_or_default();
// Filter to completed and reverse to oldest-first.
records.retain(|r| r.status == "completed");
records.reverse();
let mut bulk_inserts: Vec<(String, String)> = Vec::new();
for r in records {
let fps: HashSet<String> = r
.findings_json
.as_deref()
.and_then(|j| serde_json::from_str::<Vec<Diag>>(j).ok())
.map(|diags| diags.iter().map(compute_fingerprint).collect())
.unwrap_or_default();
let total = fps.len();
let started_at = r.started_at.clone();
// Seed first_seen for new fingerprints.
if let Some(ref ts) = started_at {
for fp in &fps {
first_seen.entry(fp.clone()).or_insert_with(|| {
bulk_inserts.push((fp.clone(), ts.clone()));
ts.clone()
});
}
}
scans.push(HistoricScan {
id: r.id,
started_at,
fingerprints: fps,
total,
});
}
// Persist newly observed first-seen entries (best-effort; ignore errors).
if !bulk_inserts.is_empty() {
let _ = idx.record_finding_first_seen_bulk(&bulk_inserts);
}
Self { scans, first_seen }
}
/// Compare current findings against the most-recent historical scan and
/// against all earlier scans for regression detection.
/// Returns (new_count, fixed_count, reintroduced_count).
fn compare_to_current(&self, current: &[Diag]) -> (usize, usize, usize) {
if self.scans.is_empty() {
return (0, 0, 0);
}
let current_fps: HashSet<String> = current.iter().map(compute_fingerprint).collect();
// For new/fixed delta, compare against the *previous* completed scan
// (i.e. the one before the latest, since the latest is "current" in DB
// most of the time). If only one scan exists, report no delta (0, 0).
let (new_count, fixed_count) = if self.scans.len() >= 2 {
let prev = &self.scans[self.scans.len() - 2];
let new_count = current_fps.difference(&prev.fingerprints).count();
let fixed_count = prev.fingerprints.difference(&current_fps).count();
(new_count, fixed_count)
} else {
(0, 0)
};
// Regression: fingerprints that were present in some past scan, were
// absent in the immediately-preceding scan, and are present now.
let reintroduced = if self.scans.len() >= 2 {
let prev_fps = &self.scans[self.scans.len() - 2].fingerprints;
let mut count = 0usize;
for fp in current_fps.iter() {
if prev_fps.contains(fp) {
continue;
}
// Was present in any earlier scan?
let earlier = self
.scans
.iter()
.take(self.scans.len() - 2)
.any(|s| s.fingerprints.contains(fp));
if earlier {
count += 1;
}
}
count
} else {
0
};
(new_count, fixed_count, reintroduced)
}
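As a rough illustration of the set arithmetic in `compare_to_current`, the sketch below works on plain fingerprint sets. The free function `compare` and its `&str` fingerprints are assumptions made for the example; the PR operates on `HistoricScan` values.

```rust
use std::collections::HashSet;

// `history` holds fingerprint sets oldest -> newest; the last entry is assumed
// to be the latest stored scan, so the one before it is "previous".
// Returns (new, fixed, reintroduced).
fn compare<'a>(
    history: &[HashSet<&'a str>],
    current: &HashSet<&'a str>,
) -> (usize, usize, usize) {
    if history.len() < 2 {
        // Nothing to diff against yet.
        return (0, 0, 0);
    }
    let prev = &history[history.len() - 2];
    let earlier = &history[..history.len() - 2];

    let new_count = current.difference(prev).count();
    let fixed_count = prev.difference(current).count();
    // Reintroduced: present now, absent in the previous scan, but seen in
    // some scan before that.
    let reintroduced = current
        .iter()
        .filter(|fp| !prev.contains(*fp))
        .filter(|fp| earlier.iter().any(|scan| scan.contains(*fp)))
        .count();
    (new_count, fixed_count, reintroduced)
}
```

Every reintroduced fingerprint is also counted as new, since both are measured against the same previous scan.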
/// Normalized change between the oldest and newest of the last five scan
/// totals, clamped to [-1.0, 1.0]: positive means the total fell (improving),
/// negative means it rose (regressing), 0.0 unchanged. Returns None with <3 scans.
fn trend_slope(&self) -> Option<f64> {
if self.scans.len() < 3 {
return None;
}
let tail: Vec<f64> = self
.scans
.iter()
.rev()
.take(5)
.map(|s| s.total as f64)
.collect();
let first = *tail.last()?;
let last = *tail.first()?;
if first <= 0.0 && last <= 0.0 {
return Some(0.0);
}
// Improving = total decreased → positive score. Normalize by max.
let max = first.max(last).max(1.0);
Some(((first - last) / max).clamp(-1.0, 1.0))
}
}
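The normalization in `trend_slope` is easier to check with concrete numbers. A minimal standalone sketch, assuming the totals arrive as a plain slice ordered oldest to newest (the free-function signature is hypothetical):

```rust
// Compare the oldest and newest of the last five totals and normalize by the
// larger of the two, so the result lies in [-1.0, 1.0].
fn trend_slope(totals: &[usize]) -> Option<f64> {
    if totals.len() < 3 {
        return None;
    }
    let tail = &totals[totals.len().saturating_sub(5)..];
    let first = *tail.first()? as f64; // oldest total in the window
    let last = *tail.last()? as f64; // newest total in the window
    if first <= 0.0 && last <= 0.0 {
        return Some(0.0);
    }
    // Positive when the total decreased (improving).
    let max = first.max(last).max(1.0);
    Some(((first - last) / max).clamp(-1.0, 1.0))
}
```

For example, totals of 10, 8, 5 give (10 - 5) / 10 = 0.5, while 5, 7, 10 give -0.5. Only the two endpoints of the window matter; the scans in between do not affect the score.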
/// Collect recent scans from in-memory jobs + DB, deduped by ID.
fn collect_recent_scans(state: &AppState, limit: usize) -> Vec<ScanSummary> {
let mut seen = HashSet::new();
@ -181,55 +442,11 @@ fn collect_recent_scans(state: &AppState, limit: usize) -> Vec<ScanSummary> {
}
}
// Sort by started_at descending
scans.sort_by(|a, b| b.started_at.cmp(&a.started_at));
scans.truncate(limit);
scans
}
/// Compute new/fixed finding counts by comparing the two most recent completed scans.
fn compute_delta(state: &AppState, current_findings: &[Diag]) -> (usize, usize) {
if current_findings.is_empty() {
return (0, 0);
}
let current_fps: HashSet<String> = current_findings.iter().map(compute_fingerprint).collect();
// Find previous completed scan's findings
let previous_fps = load_previous_scan_fingerprints(state);
if previous_fps.is_empty() {
return (0, 0);
}
let new_count = current_fps.difference(&previous_fps).count();
let fixed_count = previous_fps.difference(&current_fps).count();
(new_count, fixed_count)
}
/// Load fingerprints from the second-most-recent completed scan.
fn load_previous_scan_fingerprints(state: &AppState) -> HashSet<String> {
if let Some(ref pool) = state.db_pool {
if let Ok(idx) = Indexer::from_pool("_scans", pool) {
if let Ok(scans) = idx.list_scans(10) {
let completed: Vec<&ScanRecord> = scans
.iter()
.filter(|s| s.status == "completed" && s.findings_json.is_some())
.collect();
// Skip the first (latest) completed scan — we want the previous one
if let Some(prev) = completed.get(1) {
if let Some(json) = prev.findings_json.as_deref() {
if let Ok(diags) = serde_json::from_str::<Vec<Diag>>(json) {
return diags.iter().map(compute_fingerprint).collect();
}
}
}
}
}
}
HashSet::new()
}
/// Compute triage coverage: fraction of findings with non-"open" triage state.
fn compute_triage_coverage(state: &AppState, findings: &[Diag]) -> f64 {
if findings.is_empty() {
@ -249,24 +466,19 @@ fn compute_triage_coverage(state: &AppState, findings: &[Diag]) -> f64 {
let mut non_open = 0usize;
for d in findings {
let fp = compute_fingerprint(d);
// Check explicit triage state
if let Some((triage_state, _, _)) = triage_map.get(&fp) {
if triage_state != "open" {
non_open += 1;
continue;
}
}
// Check suppression rules
let path = &d.path;
let rule_id = &d.id;
for rule in &suppression_rules {
let matches = match rule.suppress_by.as_str() {
"fingerprint" => fp == rule.match_value,
"rule" => *rule_id == rule.match_value,
"rule_in_file" => {
let key = format!("{rule_id}:{path}");
key == rule.match_value
}
"rule_in_file" => format!("{rule_id}:{path}") == rule.match_value,
"file" => *path == rule.match_value,
_ => false,
};
@ -296,7 +508,6 @@ fn compute_noisy_rules(
let triage_map = idx.get_all_triage_states().unwrap_or_default();
let suppression_rules = idx.get_suppression_rules().unwrap_or_default();
// Count suppressed findings per rule
let mut suppressed_per_rule: HashMap<String, usize> = HashMap::new();
for d in findings {
let fp = compute_fingerprint(d);
@ -347,12 +558,12 @@ fn generate_insights(
summary: &crate::server::models::FindingSummary,
new_since_last: usize,
fixed_since_last: usize,
reintroduced: usize,
triage_coverage: f64,
noisy_rules: &[NoisyRule],
) -> Vec<Insight> {
let mut insights = Vec::new();
// Untriaged high findings
let high_count = summary.by_severity.get("HIGH").copied().unwrap_or(0);
if high_count > 0 {
insights.push(Insight {
@ -366,7 +577,18 @@ fn generate_insights(
});
}
// New findings since last scan
if reintroduced > 0 {
insights.push(Insight {
kind: "regression".into(),
message: format!(
"{reintroduced} previously-fixed finding{} reintroduced",
if reintroduced == 1 { "" } else { "s" }
),
severity: "danger".into(),
action_url: Some("/findings".into()),
});
}
if new_since_last > 0 {
insights.push(Insight {
kind: "new_findings".into(),
@ -379,7 +601,6 @@ fn generate_insights(
});
}
// Fixed findings since last scan
if fixed_since_last > 0 {
insights.push(Insight {
kind: "fixed_findings".into(),
@ -392,7 +613,6 @@ fn generate_insights(
});
}
// Noisy rules
for rule in noisy_rules.iter().take(3) {
insights.push(Insight {
kind: "noisy_rule".into(),
@ -407,7 +627,6 @@ fn generate_insights(
});
}
// Low triage coverage
if triage_coverage < 0.1 && summary.total > 20 {
insights.push(Insight {
kind: "low_triage".into(),
@ -435,3 +654,481 @@ fn is_fresh_scan(scan: Option<&ScanSummary>) -> bool {
}
false
}
// ── Tier 1/2/3 computations ──────────────────────────────────────────────────
fn compute_confidence_distribution(findings: &[Diag]) -> ConfidenceDistribution {
let mut d = ConfidenceDistribution::default();
for f in findings {
match f.confidence {
Some(Confidence::High) => d.high += 1,
Some(Confidence::Medium) => d.medium += 1,
Some(Confidence::Low) => d.low += 1,
None => d.none += 1,
}
}
d
}
fn compute_weighted_top_files(findings: &[Diag], limit: usize) -> Vec<WeightedFile> {
use crate::patterns::Severity;
let mut per_file: HashMap<String, [usize; 3]> = HashMap::new(); // [high, medium, low]
for f in findings {
let entry = per_file.entry(f.path.clone()).or_insert([0, 0, 0]);
match f.severity {
Severity::High => entry[0] += 1,
Severity::Medium => entry[1] += 1,
Severity::Low => entry[2] += 1,
}
}
let mut rows: Vec<WeightedFile> = per_file
.into_iter()
.map(|(name, [h, m, l])| WeightedFile {
name,
score: (h * 10 + m * 3 + l) as u32,
high: h,
medium: m,
low: l,
total: h + m + l,
})
.collect();
rows.sort_by(|a, b| b.score.cmp(&a.score).then_with(|| b.total.cmp(&a.total)));
rows.truncate(limit);
rows
}
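The weighting used by `compute_weighted_top_files` (high = 10, medium = 3, low = 1) can be tried out on simplified types. In this sketch the `Severity` enum, the reduced `WeightedFile` struct, and the tuple input are stand-ins, and the final name tie-break is an addition for deterministic output that the PR does not have.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    High,
    Medium,
    Low,
}

#[derive(Debug, PartialEq)]
struct WeightedFile {
    name: String,
    score: u32,
    total: usize,
}

fn weighted_top_files(findings: &[(&str, Severity)], limit: usize) -> Vec<WeightedFile> {
    // Per-file counters: [high, medium, low].
    let mut per_file: HashMap<&str, [usize; 3]> = HashMap::new();
    for (path, sev) in findings {
        let entry = per_file.entry(*path).or_insert([0; 3]);
        match sev {
            Severity::High => entry[0] += 1,
            Severity::Medium => entry[1] += 1,
            Severity::Low => entry[2] += 1,
        }
    }
    let mut rows: Vec<WeightedFile> = per_file
        .into_iter()
        .map(|(name, [h, m, l])| WeightedFile {
            name: name.to_string(),
            score: (h * 10 + m * 3 + l) as u32,
            total: h + m + l,
        })
        .collect();
    // Highest score first, then most findings; the name comparison is the
    // extra tie-break added for this example.
    rows.sort_by(|a, b| {
        b.score
            .cmp(&a.score)
            .then_with(|| b.total.cmp(&a.total))
            .then_with(|| a.name.cmp(&b.name))
    });
    rows.truncate(limit);
    rows
}
```

With these weights, a file with three medium and one low finding scores the same as a file with a single high finding (10 each), and the larger finding count decides the order.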
fn compute_cross_file_ratio(findings: &[Diag]) -> f64 {
if findings.is_empty() {
return 0.0;
}
let mut cross = 0usize;
for f in findings {
if let Some(ev) = f.evidence.as_ref() {
if ev.uses_summary || ev.flow_steps.iter().any(|s| s.is_cross_file) {
cross += 1;
}
}
}
cross as f64 / findings.len() as f64
}
/// Hot sinks are *only* meaningful for taint findings — counting AST rule IDs
/// (e.g. `rs.quality.unwrap`) here just duplicates the Top Rules table. So we
/// deliberately require a real Sink-step callee (or a parsable sink snippet)
/// and skip everything else. Empty result → frontend hides the card.
fn compute_hot_sinks(findings: &[Diag], limit: usize) -> Vec<HotSink> {
let mut counts: HashMap<String, usize> = HashMap::new();
for f in findings {
let Some(ev) = f.evidence.as_ref() else {
continue;
};
let from_flow = ev
.flow_steps
.iter()
.rev()
.find(|s| matches!(s.kind, crate::evidence::FlowStepKind::Sink))
.and_then(|s| s.callee.clone())
.filter(|c| !c.trim().is_empty());
let from_sink_snippet = ev
.sink
.as_ref()
.and_then(|s| s.snippet.as_ref())
.and_then(|s| {
let c = extract_callee_from_snippet(s);
if c.is_empty() { None } else { Some(c) }
});
let Some(callee) = from_flow.or(from_sink_snippet) else {
continue;
};
*counts.entry(callee).or_insert(0) += 1;
}
let mut rows: Vec<HotSink> = counts
.into_iter()
.map(|(callee, count)| HotSink { callee, count })
.collect();
rows.sort_by(|a, b| b.count.cmp(&a.count).then_with(|| a.callee.cmp(&b.callee)));
rows.truncate(limit);
rows
}
/// Pull the leading identifier from a sink snippet — a best-effort heuristic
/// for the dashboard's "hot sinks" list.
fn extract_callee_from_snippet(s: &str) -> String {
let trimmed = s.trim();
let end = trimmed
.find('(')
.or_else(|| trimmed.find(char::is_whitespace))
.unwrap_or(trimmed.len());
trimmed[..end].trim().to_string()
}
fn compute_scanner_quality(
state: &AppState,
findings: &[Diag],
latest_scan_id: Option<&str>,
) -> Option<ScannerQuality> {
let pool = state.db_pool.as_ref()?;
let idx = Indexer::from_pool("_scans", pool).ok()?;
let mut files_scanned = 0u64;
let mut files_skipped = 0u64;
if let Some(scan_id) = latest_scan_id {
let scans = idx.list_scans(20).unwrap_or_default();
if let Some(rec) = scans.into_iter().find(|s| s.id == scan_id) {
files_scanned = rec.files_scanned.unwrap_or(0).max(0) as u64;
files_skipped = rec.files_skipped.unwrap_or(0).max(0) as u64;
}
}
let parse_success_rate = if files_scanned + files_skipped > 0 {
files_scanned as f64 / (files_scanned + files_skipped) as f64
} else {
0.0
};
// Engine metrics from scan_metrics table (if available via Indexer).
let (functions_analyzed, call_edges, unresolved_calls) = latest_scan_id
.and_then(|id| idx.get_scan_metrics(id).ok().flatten())
.map(|m| (m.functions_analyzed, m.call_edges, m.unresolved_calls))
.unwrap_or((0, 0, 0));
let call_resolution_rate = if call_edges + unresolved_calls > 0 {
call_edges as f64 / (call_edges + unresolved_calls) as f64
} else {
0.0
};
// Symex coverage from current findings.
let mut breakdown: HashMap<String, usize> = HashMap::new();
let mut taint_total = 0usize;
for f in findings {
let Some(ev) = f.evidence.as_ref() else {
continue;
};
let Some(sv) = ev.symbolic.as_ref() else {
continue;
};
taint_total += 1;
let label = match sv.verdict {
Verdict::Confirmed => "confirmed",
Verdict::Infeasible => "infeasible",
Verdict::Inconclusive => "inconclusive",
Verdict::NotAttempted => "not_attempted",
};
*breakdown.entry(label.to_string()).or_insert(0) += 1;
}
let symex_verified_rate = if taint_total > 0 {
let attempted = breakdown
.iter()
.filter(|(k, _)| k.as_str() != "not_attempted")
.map(|(_, v)| *v)
.sum::<usize>();
attempted as f64 / taint_total as f64
} else {
0.0
};
Some(ScannerQuality {
files_scanned,
files_skipped,
parse_success_rate,
functions_analyzed,
call_edges,
unresolved_calls,
call_resolution_rate,
symex_verified_rate,
symex_breakdown: breakdown,
})
}
fn compute_language_health(findings: &[Diag]) -> Vec<LanguageHealth> {
use crate::patterns::Severity;
let mut per_lang: HashMap<String, [usize; 4]> = HashMap::new(); // [total, h, m, l]
for f in findings {
let Some(lang) = lang_for_finding_path(&f.path) else {
continue;
};
let entry = per_lang.entry(lang).or_insert([0; 4]);
entry[0] += 1;
match f.severity {
Severity::High => entry[1] += 1,
Severity::Medium => entry[2] += 1,
Severity::Low => entry[3] += 1,
}
}
let mut rows: Vec<LanguageHealth> = per_lang
.into_iter()
.map(|(language, [t, h, m, l])| LanguageHealth {
language,
findings: t,
high: h,
medium: m,
low: l,
})
.collect();
rows.sort_by(|a, b| {
b.high
.cmp(&a.high)
.then_with(|| b.findings.cmp(&a.findings))
});
rows
}
fn compute_suppression_hygiene(state: &AppState, findings: &[Diag]) -> SuppressionHygiene {
let mut hygiene = SuppressionHygiene {
fingerprint_level: 0,
rule_level: 0,
file_level: 0,
rule_in_file_level: 0,
blanket_rate: 0.0,
};
if findings.is_empty() {
return hygiene;
}
let Some(ref pool) = state.db_pool else {
return hygiene;
};
let Ok(idx) = Indexer::from_pool("_scans", pool) else {
return hygiene;
};
let triage_map = idx.get_all_triage_states().unwrap_or_default();
let suppression_rules = idx.get_suppression_rules().unwrap_or_default();
let mut total_suppressed = 0usize;
for d in findings {
let fp = compute_fingerprint(d);
if let Some((s, _, _)) = triage_map.get(&fp) {
if s == "suppressed" || s == "false_positive" {
hygiene.fingerprint_level += 1;
total_suppressed += 1;
continue;
}
}
for rule in &suppression_rules {
let matched = match rule.suppress_by.as_str() {
"fingerprint" => fp == rule.match_value,
"rule" => d.id == rule.match_value,
"rule_in_file" => format!("{}:{}", d.id, d.path) == rule.match_value,
"file" => d.path == rule.match_value,
_ => false,
};
if matched {
match rule.suppress_by.as_str() {
"fingerprint" => hygiene.fingerprint_level += 1,
"rule" => hygiene.rule_level += 1,
"file" => hygiene.file_level += 1,
"rule_in_file" => hygiene.rule_in_file_level += 1,
_ => {}
}
total_suppressed += 1;
break;
}
}
}
if total_suppressed > 0 {
let blanket = hygiene.rule_level + hygiene.file_level + hygiene.rule_in_file_level;
hygiene.blanket_rate = blanket as f64 / total_suppressed as f64;
}
hygiene
}
fn compute_backlog(state: &AppState, findings: &[Diag], history: &ScanHistory) -> BacklogStats {
// No useful aging data on the first scan — every fingerprint was first-seen
// today by definition. Avoid the misleading "0d / 0d / 0" display.
if history.scans.len() <= 1 {
return BacklogStats {
oldest_open_days: None,
median_age_days: None,
stale_count: 0,
age_buckets: Vec::new(),
};
}
let now = chrono::Utc::now();
// Pull DB-cached first_seen first; fall back to in-memory history map.
let fingerprints: Vec<String> = findings.iter().map(compute_fingerprint).collect();
let mut cached: HashMap<String, String> = HashMap::new();
if let Some(ref pool) = state.db_pool {
if let Ok(idx) = Indexer::from_pool("_scans", pool) {
cached = idx.get_first_seen_map(&fingerprints).unwrap_or_default();
}
}
// Merge history's view (already persisted as we walked).
for (fp, ts) in &history.first_seen {
cached.entry(fp.clone()).or_insert_with(|| ts.clone());
}
let mut ages_days: Vec<u32> = Vec::with_capacity(fingerprints.len());
for fp in &fingerprints {
let Some(ts) = cached.get(fp) else {
continue;
};
if let Ok(dt) = chrono::DateTime::parse_from_rfc3339(ts) {
let elapsed = now - dt.with_timezone(&chrono::Utc);
let days = elapsed.num_days().max(0) as u32;
ages_days.push(days);
}
}
let oldest_open_days = ages_days.iter().copied().max();
let median_age_days = if ages_days.is_empty() {
None
} else {
let mut sorted = ages_days.clone();
sorted.sort_unstable();
Some(sorted[sorted.len() / 2])
};
let stale_count = ages_days.iter().filter(|d| **d > 30).count();
// Buckets: ≤1d, ≤7d, ≤30d, ≤90d, >90d
let mut b = [0usize; 5];
for d in &ages_days {
let i = match *d {
0..=1 => 0,
2..=7 => 1,
8..=30 => 2,
31..=90 => 3,
_ => 4,
};
b[i] += 1;
}
let labels = ["≤1d", "≤7d", "≤30d", "≤90d", ">90d"];
let age_buckets = labels
.iter()
.zip(b.iter())
.map(|(l, c)| OverviewCount {
name: (*l).to_string(),
count: *c,
})
.collect();
BacklogStats {
oldest_open_days,
median_age_days,
stale_count,
age_buckets,
}
}
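// Illustrative sketch, not part of the original module: the bucketing and
// median rules from `compute_backlog`, pulled out as pure functions so the
// boundary cases are easy to check. The names `sketch_age_bucket` and
// `sketch_upper_median` are hypothetical.
#[allow(dead_code)]
fn sketch_age_bucket(days: u32) -> usize {
    match days {
        0..=1 => 0,   // "≤1d"
        2..=7 => 1,   // "≤7d"
        8..=30 => 2,  // "≤30d"
        31..=90 => 3, // "≤90d"
        _ => 4,       // ">90d"
    }
}

#[allow(dead_code)]
fn sketch_upper_median(ages: &[u32]) -> Option<u32> {
    if ages.is_empty() {
        return None;
    }
    let mut sorted = ages.to_vec();
    sorted.sort_unstable();
    // Index `len / 2` picks the upper of the two middle values for even lengths.
    Some(sorted[sorted.len() / 2])
}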
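/// Compares the current findings against the pinned baseline scan, if one is set.
/// Returns `None` when there is no DB pool, no baseline key, or the baseline
/// scan record cannot be found.
fn compute_baseline_info(state: &AppState, findings: &[Diag]) -> Option<BaselineInfo> {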
let pool = state.db_pool.as_ref()?;
let idx = Indexer::from_pool("_scans", pool).ok()?;
let scan_id = idx.get_metadata(BASELINE_KEY).ok().flatten()?;
if scan_id.is_empty() {
return None;
}
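// Look up the baseline scan record (separate from history, since history is capped at 20).
// Only the scans returned by `list_scans(200)` are searched; a baseline outside
// that set yields `None`.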
let scans = idx.list_scans(200).ok()?;
let baseline = scans.into_iter().find(|s| s.id == scan_id)?;
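// Missing or unparseable `findings_json` yields an empty baseline set, so
// every current finding then counts as `drift_new`.
let baseline_fps: HashSet<String> = baseline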
.findings_json
.as_deref()
.and_then(|j| serde_json::from_str::<Vec<Diag>>(j).ok())
.map(|diags| diags.iter().map(compute_fingerprint).collect())
.unwrap_or_default();
let current_fps: HashSet<String> = findings.iter().map(compute_fingerprint).collect();
let drift_new = current_fps.difference(&baseline_fps).count();
let drift_fixed = baseline_fps.difference(&current_fps).count();
Some(BaselineInfo {
scan_id: baseline.id,
started_at: baseline.started_at,
baseline_total: baseline_fps.len(),
drift_new,
drift_fixed,
})
}
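// Illustrative sketch, not part of the original module: baseline drift as two
// set differences over fingerprints, mirroring `compute_baseline_info`.
// `sketch_drift` is a hypothetical name, and plain `&str` values stand in for
// the real fingerprint strings.
#[allow(dead_code)]
fn sketch_drift(baseline: &[&str], current: &[&str]) -> (usize, usize) {
    use std::collections::HashSet;
    let baseline: HashSet<&str> = baseline.iter().copied().collect();
    let current: HashSet<&str> = current.iter().copied().collect();
    // New: present now but absent from the baseline.
    let drift_new = current.difference(&baseline).count();
    // Fixed: present in the baseline but absent now.
    let drift_fixed = baseline.difference(&current).count();
    (drift_new, drift_fixed)
}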
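/// Builds the headline posture summary (trend, severity, message) from the
/// diff against the previous scan and the scan history.
fn build_posture(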
new_since_last: usize,
fixed_since_last: usize,
reintroduced: usize,
history: &ScanHistory,
current_total: usize,
) -> PostureSummary {
// First-scan case: no prior data to diff against. Saying "stable / no change"
// is misleading — we genuinely don't know yet.
if history.scans.len() <= 1 {
return PostureSummary {
trend: "unknown".into(),
severity: "info".into(),
message: format!(
"First scan: {current_total} finding{} detected. Re-scan to compare.",
plural(current_total)
),
reintroduced_count: 0,
};
}
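// Positive net means more findings were fixed than introduced.
let net = fixed_since_last as i64 - new_since_last as i64;
let trend_slope = history.trend_slope();
// Branch priority: reintroduced findings first, then the net change since the
// last scan, then the longer-run slope when net is zero.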
let (trend, severity, message) = if reintroduced > 0 {
(
"regressing",
"danger",
format!(
"Regressed: {reintroduced} previously-fixed finding{} returned",
plural(reintroduced)
),
)
} else if net > 0 {
(
"improving",
"success",
format!(
"Improving: net {net:+} since last scan ({fixed_since_last} fixed, {new_since_last} new)"
),
)
} else if net < 0 {
(
"regressing",
"warning",
format!(
"Regressing: net {net:+} since last scan ({new_since_last} new, {fixed_since_last} fixed)"
),
)
} else {
// Net is zero: fall back to the longer-run slope. A positive slope is
// reported as a declining finding count, per the messages below.
match trend_slope {
Some(slope) if slope > 0.1 => (
"improving",
"success",
"Improving: gradual decline in finding count over the last 5 scans".to_string(),
),
Some(slope) if slope < -0.1 => (
"regressing",
"warning",
"Regressing: gradual rise in finding count over the last 5 scans".to_string(),
),
_ => (
"stable",
"info",
"Stable: no net change since last scan".to_string(),
),
}
};
PostureSummary {
trend: trend.to_string(),
severity: severity.to_string(),
message,
reintroduced_count: reintroduced,
}
}
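// Illustrative sketch, not part of the original module: the branch priority
// in `build_posture`, reduced to the trend label. It deliberately omits the
// first-scan case and the slope fallback. `sketch_trend` is a hypothetical name.
#[allow(dead_code)]
fn sketch_trend(new: usize, fixed: usize, reintroduced: usize) -> &'static str {
    let net = fixed as i64 - new as i64;
    if reintroduced > 0 {
        // Reintroduced findings override even a positive net.
        "regressing"
    } else if net > 0 {
        "improving"
    } else if net < 0 {
        "regressing"
    } else {
        "stable"
    }
}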
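/// Returns the English plural suffix for a count: `""` for exactly 1,
/// `"s"` otherwise (including 0).
fn plural(n: usize) -> &'static str {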
if n == 1 { "" } else { "s" }
}
// `compute_health_score` moved to `crate::server::health::compute`
// after the v2 audit (2026-04-28). See `docs/health-score-audit.md`
// for calibration data and the rationale, and `docs/health-score.md`
// for the customer-facing methodology.