fix(security): scope tabular-review document_ids by access (CWE-639)

The tabular-review routes accept user-supplied document_ids in
request bodies (POST /tabular-review, PATCH /:reviewId) and trust
stale cell rows on byte-fetching paths
(POST /:reviewId/regenerate-cell, POST /:reviewId/generate). None of
those paths checked whether the caller can read the referenced
documents. A free-account attacker could plant foreign UUIDs in their
own review and have the server fetch the bytes from R2 and run an LLM
extraction over them, returning verbatim text via the standard review
GET.

Adds filterAccessibleDocumentIds(documentIds, userId, userEmail, db)
next to the existing access helpers (owner-of-doc OR project member),
and applies it at the four entry points:

- POST /tabular-review               drop unauthorised on insert
- PATCH /:reviewId                   drop newly-added unauthorised; keep
                                     already-attached cells so non-owner
                                     collaborators don't accidentally
                                     orphan rows they can't directly
                                     access
- POST /:reviewId/regenerate-cell    refuse byte fetch when caller has
                                     no access to the underlying doc
- POST /:reviewId/generate           filter docIds before parallel LLM
                                     fetch (defense-in-depth for legacy
                                     cells planted before this fix)
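
A minimal sketch of the PATCH rule above, assuming the PATCH body
carries the full desired id list. mergeDocumentIds and its parameter
names are hypothetical and not part of this commit; `accessible` stands
in for the result of filterAccessibleDocumentIds.

```typescript
// Hypothetical helper (not in this commit): decides which document ids
// a PATCH may leave attached to a review.
export function mergeDocumentIds(
  existing: string[], // ids already attached to the review
  requested: string[], // ids supplied in the PATCH body (assumed full list)
  accessible: string[], // ids from `requested` the caller may read
): string[] {
  const attached = new Set(existing);
  const allowed = new Set(accessible);
  const seen = new Set<string>();
  const out: string[] = [];
  for (const id of requested) {
    if (seen.has(id)) continue; // ignore duplicates in the request
    seen.add(id);
    // Keep ids that were already attached (so collaborators do not
    // orphan rows) and newly added ids the caller is allowed to read.
    if (attached.has(id) || allowed.has(id)) out.push(id);
  }
  return out;
}
```

Under these assumptions an id the caller cannot read survives only if
it was attached before the PATCH; it can never be newly introduced.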

Fails closed silently rather than 403'ing so legacy clients that pass
stale ids don't error out the whole review.
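
As an illustration of the silent fail-closed behaviour, here is a
hedged sketch of how the generate path could narrow its work list. The
Cell shape and selectFetchableCells are assumptions made for this
example, not code from the commit.

```typescript
// Hypothetical cell shape: only the fields this sketch needs.
interface Cell {
  id: string;
  document_id: string;
}

// Returns only the cells whose document the caller may read. Cells that
// point at inaccessible documents are skipped instead of raising an
// error, so one stale id does not fail the whole request.
export function selectFetchableCells(
  cells: Cell[],
  accessibleIds: string[],
): Cell[] {
  const allowed = new Set(accessibleIds);
  return cells.filter((cell) => allowed.has(cell.document_id));
}
```

A caller would fetch bytes and run extraction only for the returned
cells; the skipped ones are left untouched.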

Detected by Aeon + manual review.
Severity: high
CWE-639 (Authorization Bypass Through User-Controlled Key)
Aeon (aaronjmars) 2026-05-10 04:50:21 +00:00
parent f40c25d07f
commit e261d2e4bd
2 changed files with 97 additions and 6 deletions

@@ -119,6 +119,47 @@ export async function ensureReviewAccess(
  return { ok: false };
}
/**
* Filter a list of document IDs down to those the caller is actually
* authorised to read: owners pass, plus any document whose `project_id`
* the caller has access to (own project or `shared_with` member).
*
* The tabular-review routes accept user-supplied `document_ids` from
* request bodies; without this filter an attacker who has any review of
* their own can plant arbitrary doc UUIDs and have the server fetch + run
* an LLM extraction over their bytes (CWE-639).
*/
export async function filterAccessibleDocumentIds(
  documentIds: string[],
  userId: string,
  userEmail: string | null | undefined,
  db: Db,
): Promise<string[]> {
  if (documentIds.length === 0) return [];
  const { data: docs } = await db
    .from("documents")
    .select("id, user_id, project_id")
    .in("id", documentIds);
  const rows = (docs ?? []) as {
    id: string;
    user_id: string;
    project_id: string | null;
  }[];
  if (rows.length === 0) return [];
  const accessibleProjectIds = new Set(
    await listAccessibleProjectIds(userId, userEmail, db),
  );
  const out: string[] = [];
  for (const d of rows) {
    if (d.user_id === userId) {
      out.push(d.id);
    } else if (d.project_id && accessibleProjectIds.has(d.project_id)) {
      out.push(d.id);
    }
  }
  return out;
}
/**
* Returns the set of project IDs the user can access: own projects plus
* any project where their email is in `shared_with`. Used to scope chat