feat: render html files in knowledge view via sandboxed iframe

Gagancreates 2026-05-08 00:26:57 +05:30
parent eb6a7ac466
commit 9014c79f2c
5 changed files with 481 additions and 0 deletions


@@ -0,0 +1,246 @@
# Knowledge File Viewer — Research & Implementation Plan
## Current State
The gap is a single `<pre>` fallback in `App.tsx:4523-4527`. The decision tree today:
```
selectedPath ends in .md → MarkdownEditor (full ProseMirror, works great)
selectedPath is anything else → <pre> raw text dump ← THIS IS THE ENTIRE GAP
```
Everything else needed already exists:
| What's needed | What exists | Where |
|---|---|---|
| Read binary files | `shell:readFileBase64` IPC handler | `apps/main/src/ipc.ts:648-667` |
| Read text files | `workspace:readFile` with `encoding` param | `packages/shared/src/ipc.ts:55-67` |
| File type detection | `attachment-presentation.ts` utilities | `renderer/src/lib/attachment-presentation.ts` |
| Audio player component | `AudioFileCard` (base64 → `<audio>`) | `renderer/src/components/ai-elements/file-path-card.tsx` |
| Image thumbnail | `SystemFileCard` (base64 → `<img>`) | Same file as above |
| Navigate to knowledge path | `onOpenKnowledgeFile` context | `renderer/src/contexts/file-card-context.tsx` |
The 10MB cap on `shell:readFileBase64` is the main constraint to watch.
---
## Recommended Architecture
### The Core Idea: `app://` Custom Protocol
**Never use `file://` for serving local content.** In Electron, `file://` has elevated same-origin privileges — an HTML file loaded that way can read other files from the filesystem.
Register a custom scheme **before `app.whenReady()`** in `apps/main/src/main.ts`:
```typescript
protocol.registerSchemesAsPrivileged([{
  scheme: 'app',
  privileges: {
    standard: true,
    secure: true,
    supportFetchAPI: true,
    stream: true // CRITICAL for video seeking (byte-range requests)
  }
}]);
```
Then in the handler, resolve paths inside the workspace root and block traversal:
```typescript
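import { net, protocol } from 'electron';
import { pathToFileURL } from 'node:url';

// resolveAndGuard is a helper that maps the URL to a path under WORKSPACE_ROOT, or returns null.
protocol.handle('app', (req) => {
  const filePath = resolveAndGuard(req.url, WORKSPACE_ROOT);
  if (!filePath) return new Response('Forbidden', { status: 403 });
  return net.fetch(pathToFileURL(filePath).toString());
});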
```
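The handler above relies on `resolveAndGuard`, which the plan does not define. A minimal sketch of what it might look like follows, assuming URLs of the form `app://local/<encodeURIComponent(relativePath)>` as used in the viewer snippets. The function body is an illustration, not code from this PR.

```typescript
import * as path from 'node:path';

// Hypothetical helper: maps app://local/<encoded relative path> to an absolute
// path inside workspaceRoot, or returns null when the path escapes the root.
function resolveAndGuard(requestUrl: string, workspaceRoot: string): string | null {
  let relative: string;
  try {
    // Drop the leading "/" and undo the encodeURIComponent applied by the renderer.
    relative = decodeURIComponent(new URL(requestUrl).pathname.slice(1));
  } catch {
    return null; // malformed URL or invalid percent-encoding
  }
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, relative);
  // path.relative yields a ".."-prefixed result for anything outside root,
  // including sibling directories that merely share root's string prefix.
  const rel = path.relative(root, target);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (rel === '' || escapes) return null;
  return target;
}
```

Comparing via `path.relative` rather than a plain string prefix avoids accepting a sibling such as `<root>-backup`. This sketch does not resolve symlinks; a stricter version could run `fs.realpath` on the target before comparing.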
This single protocol handles images, video, DOCX, and HTML all from one place.
---
## File Type Strategy
### Images (PNG, JPG, WEBP, GIF, SVG, AVIF)
**Approach:** Native `<img>` via `app://` protocol.
```tsx
<img src={`app://local/${encodeURIComponent(relativePath)}`} className="max-w-full" />
```
- Chromium renders all of these natively. Zero dependencies.
- HEIC/HEIF is not decoded by Chromium on any platform, so convert to JPEG in the main process first (for example with `sharp`).
- Strip EXIF before sending to LLM (GPS data). `sharp` does this automatically on JPEG output.
---
### Video (MP4, WebM, MOV)
**Approach:** Native `<video>` via `app://` protocol with `stream: true`.
```tsx
<video controls src={`app://local/${encodeURIComponent(relativePath)}`} className="w-full" />
```
`stream: true` is the only non-obvious requirement — it enables HTTP byte-range requests so scrubbing/seeking works. Without it, the entire file downloads before playback starts.
**Supported formats:** H.264/AAC in MP4, WebM (VP8/VP9/AV1). MKV partially. For WMV/AVI on Windows, fall back to "Open in system."
**Do NOT route through `shell:readFileBase64`** — 10MB cap will silently fail on real video files. The custom protocol streams directly from disk.
---
### PDF
**Approach:** Chromium's built-in PDFium renderer via `<webview>` with `plugins: true`.
```tsx
<webview
src={`app://local/${encodeURIComponent(relativePath)}`}
webpreferences="plugins=on,javascript=off,contextIsolation=on"
sandbox
style={{ width: '100%', height: '100%' }}
/>
```
Requires `webviewTag: true` in the parent BrowserWindow's `webPreferences`. Zero bundle size cost — Chromium already ships PDFium. Native zoom, scroll, print.
**Alternative if you need text extraction / annotations:** `pdfjs-dist` in a sandboxed iframe. ~35MB bundle cost, but gives you page events, text selection, and highlight APIs. Overkill unless annotation features are planned.
---
### HTML Files
**Approach:** Sandboxed `<webview>` in an isolated session partition, with **all network blocked**.
```tsx
<webview
src={`app://local/${encodeURIComponent(relativePath)}`}
partition="sandbox-html"
webpreferences="contextIsolation=on,nodeIntegration=off"
sandbox
/>
```
In `main.ts`, create the partition and block all outbound network:
```typescript
import { session } from 'electron';

// Call after app.whenReady(): sessions are unavailable before the ready event.
const sandboxSession = session.fromPartition('sandbox-html', { cache: false });
sandboxSession.setPermissionRequestHandler((_, __, cb) => cb(false));
sandboxSession.webRequest.onBeforeRequest({ urls: ['*://*/*'] }, (_, cb) =>
  cb({ cancel: true })
);
```
Relative assets (`./style.css`, `./images/photo.jpg`) served via the `app://` handler still work. External requests are silently blocked.
---
### DOCX / DOC
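**Approach:** `docx-preview` for display, `mammoth.js` for LLM text extraction. They solve different problems, so do not treat them as alternatives. Both parse only the OOXML `.docx` format; legacy binary `.doc` files should fall back to "Open in system."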
- **`docx-preview`** — reproduces Word's visual layout in the DOM (tables, fonts, headings, images as base64). High fidelity for reading.
- **`mammoth.js`** — converts to clean semantic HTML, strips all visual formatting. For feeding document content to the model.
```typescript
import { renderAsync } from 'docx-preview';
import mammoth from 'mammoth';

const buffer = await window.api.readFileBytes(filePath); // needs new IPC handler

// display
await renderAsync(buffer, containerElement);

// LLM extraction
const { value: html } = await mammoth.convertToHtml({ arrayBuffer: buffer });
```
A new `read-file-bytes` IPC handler is needed in `main/src/ipc.ts` that returns a raw `Uint8Array` — the existing `shell:readFileBase64` returns a base64 string which would need decoding.
---
## Split-Pane Layout
**Recommended library: `react-resizable-panels`** (Brian Vaughn, React core team alum). Powers `shadcn/ui`'s `<Resizable>` component. Used in production by OpenAI and Adobe.
```tsx
import { Panel, PanelGroup, PanelResizeHandle } from 'react-resizable-panels';

<PanelGroup direction="horizontal" autoSaveId="knowledge-chat-layout">
  <Panel defaultSize={55} minSize={30}>
    <FileViewer path={selectedPath} />
  </Panel>
  <PanelResizeHandle className="w-1.5 bg-border hover:bg-primary/50 transition-colors" />
  <Panel defaultSize={45} minSize={25}>
    <ChatView />
  </Panel>
</PanelGroup>
```
`autoSaveId` persists the split ratio to `localStorage` automatically across sessions.
**Alternative: `allotment`**, derived from VS Code's split-view implementation (which is TypeScript, not C++). Looks and behaves like VS Code's panes. Slightly less React-idiomatic API.
---
## Security Model
| Concern | Pattern |
|---|---|
| Local file access | Main process only via `ipcMain.handle`. Renderer never reads filesystem directly. |
| Protocol | Custom `app://` scheme, not `file://`. All local resources routed through validated handler. |
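| Path traversal | Every path resolved to absolute, then checked to lie inside `WORKSPACE_ROOT` on a path-segment boundary (a bare `startsWith(WORKSPACE_ROOT)` also accepts sibling directories such as `<root>-backup`). |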
| Renderer isolation | `contextIsolation: true`, `nodeIntegration: false`, `sandbox: true`. |
| Untrusted HTML | Separate `session.fromPartition('sandbox-html')` with network blocked. |
---
## Implementation Steps
### Step 1 — Register `app://` protocol in `main.ts`
Before `app.whenReady()`. One change, covers images, video, PDF, and HTML.
### Step 2 — Add `read-file-bytes` IPC handler in `ipc.ts`
Returns raw `Uint8Array` for DOCX rendering. Avoids base64 encode/decode overhead for large files.
### Step 3 — Create `KnowledgeFileViewer` component
`apps/x/apps/renderer/src/components/knowledge-file-viewer.tsx`
Extension routing:
| Extensions | Renderer |
|---|---|
| `.png .jpg .jpeg .webp .gif .svg .avif` | `<img>` via `app://` |
| `.mp4 .mov .webm` | `<video>` via `app://` |
| `.pdf` | `<webview plugins sandbox>` |
| `.html .htm` | `<webview partition="sandbox-html">` |
| `.docx` | `docx-preview` in sandboxed iframe (legacy `.doc` falls through to "Open in system") |
| `.mp3 .wav .m4a` | Reuse existing `AudioFileCard` |
| everything else | "Open in system" button (`shell.openPath`) |
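The routing table can be expressed as a small lookup function. The sketch below is one possible shape; the `RendererKind` names and `pickRenderer` are illustrative and not part of the plan. It assumes `.md` files are handled by `MarkdownEditor` before this function is reached, and it sends legacy `.doc` to the system fallback because `docx-preview` parses only `.docx`.

```typescript
// Renderer kinds mirror the rows of the routing table; the names are illustrative.
type RendererKind = 'image' | 'video' | 'pdf' | 'html' | 'docx' | 'audio' | 'system';

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const EXTENSION_ROUTES = new Map<string, RendererKind>([
  ['png', 'image'], ['jpg', 'image'], ['jpeg', 'image'], ['webp', 'image'],
  ['gif', 'image'], ['svg', 'image'], ['avif', 'image'],
  ['mp4', 'video'], ['mov', 'video'], ['webm', 'video'],
  ['pdf', 'pdf'],
  ['html', 'html'], ['htm', 'html'],
  ['docx', 'docx'],
  ['mp3', 'audio'], ['wav', 'audio'], ['m4a', 'audio'],
]);

function pickRenderer(filePath: string): RendererKind {
  // Use only the final path segment so dots in directory names are ignored.
  const name = filePath.split(/[\\/]/).pop() ?? '';
  const dot = name.lastIndexOf('.');
  // No extension, or a dotfile such as ".gitignore": fall back to the system opener.
  if (dot <= 0) return 'system';
  const ext = name.slice(dot + 1).toLowerCase();
  return EXTENSION_ROUTES.get(ext) ?? 'system';
}
```

Keeping the mapping in one table means a new file type is a one-line addition, and `KnowledgeFileViewer` can switch on the returned kind.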
### Step 4 — Replace `<pre>` fallback in `App.tsx:4522-4527`
One-line swap. All routing logic lives in `KnowledgeFileViewer`.
### Step 5 — Add split-pane layout
Install `react-resizable-panels`, wrap knowledge view (file viewer + chat) in `PanelGroup`.
---
## Dependencies to Add
| Package | Purpose | Bundle cost |
|---|---|---|
| `react-resizable-panels` | Split pane layout | ~15KB |
| `docx-preview` | DOCX visual rendering | ~500KB |
| `mammoth` | DOCX → semantic HTML for LLM | ~300KB |
| `pdfjs-dist` | PDF with text extraction (optional) | ~35MB — only if PDFium isn't enough |
Images, video, PDF (via PDFium), and HTML have zero additional dependencies.
---
## What to Avoid
- **`<iframe src="file:///...">` for anything** — always use `app://`.
- **Routing large files through `shell:readFileBase64`** — 10MB cap silently fails.
- **Using `mammoth` for display** — it strips all formatting. LLM extraction only.
- **Assuming `webviewTag` is enabled** — check `main.ts` BrowserWindow creation before shipping PDF/HTML webviews.

apps/x/PLAN.md

@@ -0,0 +1,83 @@
# HTML File Rendering — Implementation Plan
## Goal
Replace the `<pre>` raw text fallback in the knowledge view with a proper HTML file renderer using `<iframe srcdoc sandbox="allow-scripts">`.
## Scope
- Only HTML file rendering for now
- No layout changes, no split pane
- No other file types in this PR
---
## Phase 1 — IPC: Read HTML file content and pass to renderer
### What
Add an IPC handler in the main process that reads a local HTML file and returns its content as a string to the renderer.
### Work
1. Add `knowledge:readHtmlFile` handler in `apps/main/src/ipc.ts`
- Accepts a workspace-relative path
- Resolves to absolute path, validates it stays inside workspace root (path traversal guard)
- Reads file as UTF-8 string
- Returns the HTML string to renderer
2. Add the channel type to `packages/shared/src/ipc.ts`
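The work items above could be sketched roughly as follows. This is an illustration under assumptions rather than the handler in this PR: `readHtmlFile` and its error message are hypothetical names, and the `ipcMain.handle` wiring is shown only as a comment so the block stays free of Electron imports.

```typescript
import { promises as fs } from 'node:fs';
import * as path from 'node:path';

// Hypothetical core of a `knowledge:readHtmlFile` handler: resolve the
// workspace-relative path, refuse anything outside the root, read as UTF-8.
async function readHtmlFile(workspaceRoot: string, relativePath: string): Promise<string> {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, relativePath);
  const rel = path.relative(root, target);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (rel === '' || escapes) {
    throw new Error(`Path escapes workspace: ${relativePath}`);
  }
  return fs.readFile(target, 'utf8');
}

// In the main process this might be registered as (WORKSPACE_ROOT assumed):
//   ipcMain.handle('knowledge:readHtmlFile', (_event, relPath: string) =>
//     readHtmlFile(WORKSPACE_ROOT, relPath));
```

A rejected promise from the function reaches the renderer as a rejected `invoke` call, which is what the `../../secret.txt` check in the test list expects.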
### Test ✅
- Open a `.html` file from the knowledge tree
- Console log the returned string in the renderer
- Verify: correct HTML content is returned, no errors
- Verify: attempting a path like `../../secret.txt` is rejected with an error
---
## Phase 2 — Renderer: Detect `.html` files and render in iframe
### What
In `App.tsx`, detect when `selectedPath` is an `.html` file and render it in a sandboxed `<iframe srcdoc>` instead of the `<pre>` fallback.
### Work
1. In the file loading logic (`App.tsx:1284-1357`), when extension is `.html`:
- Call `knowledge:readHtmlFile` via IPC
- Store the HTML string in state
2. In the knowledge view render switch (`App.tsx:4522-4527`):
- Add a condition: if extension is `.html` → render `<HtmlFileViewer html={htmlContent} />`
- Otherwise fall through to existing `<pre>` fallback
3. Create `apps/renderer/src/components/html-file-viewer.tsx`:
- Accepts `html: string` prop
- Renders `<iframe srcdoc={html} sandbox="allow-scripts" />` with full width/height, no border
### Test ✅
- Open a real `.html` file from the knowledge tree
- Verify: file renders visually in the iframe (not raw text)
- Verify: a non-html file still shows the `<pre>` fallback (no regression)
- Verify: an HTML file with a `<script>` tag runs its JS (allow-scripts works)
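- Verify: an HTML file with `<script src="https://evil.com">`: open the network tab and record whether a request is made. Omitting `allow-same-origin` only gives the frame an opaque origin (no access to the parent, cookies, or storage); it does not block network loads. The request is blocked only if a Content Security Policy covers the frame (a `srcdoc` document inherits the embedding page's CSP).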
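If the preview should block remote loads regardless of the embedding page's policy, one option is to inject a restrictive CSP `<meta>` tag into the HTML string before it reaches `srcdoc`. The sketch below is an assumption about how that could be done; `withPreviewCsp` and the policy string are illustrative and not part of this PR.

```typescript
// Illustrative policy: inline scripts and styles still run, data: images load,
// and everything else (remote scripts, fetch, XHR, remote images) is refused.
const PREVIEW_CSP =
  "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'; img-src data:";

// Hypothetical helper: returns the HTML with a CSP <meta> tag placed as early
// as possible, since a meta-delivered policy only governs content after it.
function withPreviewCsp(html: string): string {
  const meta = `<meta http-equiv="Content-Security-Policy" content="${PREVIEW_CSP}">`;
  const headOpen = /<head(\s[^>]*)?>/i;
  if (headOpen.test(html)) {
    return html.replace(headOpen, (match) => match + meta);
  }
  // No explicit <head>: keep the doctype first so the page stays in standards mode.
  const doctype = /^\s*<!doctype[^>]*>/i;
  if (doctype.test(html)) {
    return html.replace(doctype, (match) => match + meta);
  }
  return meta + html;
}
```

Usage would be `<iframe srcDoc={withPreviewCsp(html)} sandbox="allow-scripts" />`. The regex-based insertion is deliberately simple and can be fooled by a literal `<head>` inside an earlier comment or script, so it is a starting point rather than a hardened sanitizer.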
---
## Phase 3 — Polish: Loading state, error state, empty file handling
### What
Handle edge cases so the viewer never shows a broken or confusing UI.
### Work
1. Loading state — show a spinner while the IPC call is in flight
2. Error state — if `knowledge:readHtmlFile` throws (file deleted, permission error), show a clean error message with the file path
3. Empty file — if HTML string is empty, show "This file is empty" instead of a blank iframe
4. Large files — if file is over a reasonable size limit (e.g. 5MB), show "File too large to preview — Open in system" button that calls `shell.openPath`
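The four cases above can be folded into a single state-resolution function that the viewer switches on. The sketch below assumes a 5MB limit as suggested in item 4; `PreviewState`, `LoadResult`, and `resolvePreviewState` are hypothetical names.

```typescript
type PreviewState =
  | { kind: 'loading' }
  | { kind: 'error'; message: string }
  | { kind: 'empty' }
  | { kind: 'too-large'; bytes: number }
  | { kind: 'ready'; html: string };

// Assumed limit, taken from the "e.g. 5MB" suggestion in the plan.
const MAX_PREVIEW_BYTES = 5 * 1024 * 1024;

interface LoadResult {
  loading: boolean;
  error?: Error;
  html?: string;
}

function resolvePreviewState(result: LoadResult, filePath: string): PreviewState {
  if (result.loading) return { kind: 'loading' };
  if (result.error) {
    return { kind: 'error', message: `Could not open ${filePath}: ${result.error.message}` };
  }
  const html = result.html ?? '';
  if (html.length === 0) return { kind: 'empty' };
  // Measure UTF-8 bytes rather than string length so the limit matches file size.
  const bytes = new TextEncoder().encode(html).length;
  if (bytes > MAX_PREVIEW_BYTES) return { kind: 'too-large', bytes };
  return { kind: 'ready', html };
}
```

Checking size in the renderer still transfers the whole string over IPC first. A cheaper variant would have the main-process handler call `fs.stat` and refuse oversized files before reading them.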
### Test ✅
- Open a valid HTML file → renders correctly
- Delete the file while it's open, trigger a reload → error state shown cleanly
- Open an empty `.html` file → "This file is empty" message shown
- Simulate a file over 5MB → "File too large" message with open button shown
- Verify: no console errors in any of the above scenarios
---
## Out of scope for this PR
- PDF, DOCX, image, video rendering
- Split pane / resizable layout
- Relative asset loading (`./style.css`): Phase 2 uses `srcdoc`, whose relative URLs resolve against the embedding page rather than the file's directory, so assets will not load. Acceptable for now, documented as known limitation.
- `app://` custom protocol — not needed until we handle relative assets


@@ -13,6 +13,7 @@ import { ChatInputWithMentions, type StagedAttachment } from './components/chat-
import { ChatMessageAttachments } from '@/components/chat-message-attachments'
import { GraphView, type GraphEdge, type GraphNode } from '@/components/graph-view';
import { BasesView, type BaseConfig, DEFAULT_BASE_CONFIG } from '@/components/bases-view';
import { HtmlFileViewer } from '@/components/html-file-viewer';
import { useDebounce } from './hooks/use-debounce';
import { SidebarContentPanel } from '@/components/sidebar-content';
import { SuggestedTopicsView } from '@/components/suggested-topics-view';
@@ -1424,6 +1425,11 @@ function App() {
}
const requestId = (fileLoadRequestIdRef.current += 1)
const pathToLoad = selectedPath
// For HTML files, clear stale content immediately so the viewer shows
// its loading state instead of rendering the previous file's bytes.
if (pathToLoad.toLowerCase().endsWith('.html') || pathToLoad.toLowerCase().endsWith('.htm')) {
setFileContent('')
}
let cancelled = false
;(async () => {
try {
@@ -4819,6 +4825,10 @@ function App() {
/>
)}
</div>
) : selectedPath?.toLowerCase().endsWith('.html') || selectedPath?.toLowerCase().endsWith('.htm') ? (
<div className="flex-1 min-h-0 overflow-hidden">
<HtmlFileViewer html={fileContent} path={selectedPath} />
</div>
) : (
<div className="flex-1 overflow-auto p-4">
<pre className="text-sm font-mono text-foreground whitespace-pre-wrap">


@@ -0,0 +1,38 @@
import { useEffect, useState } from 'react'
import { Loader2Icon } from 'lucide-react'

interface HtmlFileViewerProps {
  html: string
  path: string
}

export function HtmlFileViewer({ html, path }: HtmlFileViewerProps) {
  const [iframeLoaded, setIframeLoaded] = useState(false)

  useEffect(() => {
    setIframeLoaded(false)
  }, [path, html])

  const showSpinner = !html || !iframeLoaded

  return (
    <div className="relative h-full w-full">
      {html && (
        <iframe
          key={path}
          srcDoc={html}
          sandbox="allow-scripts"
          className="h-full w-full border-0 bg-white"
          title="HTML preview"
          onLoad={() => setIframeLoaded(true)}
        />
      )}
      {showSpinner && (
        <div className="absolute inset-0 flex flex-col items-center justify-center gap-3 bg-background text-muted-foreground">
          <Loader2Icon className="size-6 animate-spin" />
          <p className="text-sm">Rendering preview</p>
        </div>
      )}
    </div>
  )
}


@@ -0,0 +1,104 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>HTML Viewer Test</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, sans-serif;
max-width: 720px;
margin: 40px auto;
padding: 24px;
color: #1a1a1a;
background: #fafafa;
}
h1 { color: #2563eb; border-bottom: 2px solid #2563eb; padding-bottom: 8px; }
.card {
background: white;
padding: 16px;
border-radius: 8px;
box-shadow: 0 1px 3px rgba(0,0,0,0.1);
margin: 16px 0;
}
button {
background: #2563eb;
color: white;
border: none;
padding: 10px 20px;
border-radius: 6px;
cursor: pointer;
font-size: 14px;
}
button:hover { background: #1d4ed8; }
.counter { font-size: 24px; font-weight: bold; color: #2563eb; }
.danger { color: #dc2626; font-weight: 600; }
.ok { color: #16a34a; font-weight: 600; }
table { width: 100%; border-collapse: collapse; margin-top: 12px; }
th, td { padding: 8px; text-align: left; border-bottom: 1px solid #e5e7eb; }
th { background: #f3f4f6; }
</style>
</head>
<body>
<h1>HTML Viewer Test Page</h1>
<div class="card">
<h2>1. Inline Styles</h2>
<p>If this page looks <span class="ok">styled and clean</span>, inline CSS works.</p>
</div>
<div class="card">
<h2>2. Interactive JavaScript</h2>
<p>Counter: <span class="counter" id="count">0</span></p>
<button onclick="document.getElementById('count').textContent = parseInt(document.getElementById('count').textContent) + 1">
Click to increment
</button>
<p style="margin-top:8px;font-size:13px;color:#666;">If clicking increments the counter, allow-scripts works.</p>
</div>
<div class="card">
<h2>3. Sandbox Verification</h2>
<p>The next button tries to access <code>window.parent</code>:</p>
<button onclick="
try {
const x = window.parent.location.href;
document.getElementById('parentResult').innerHTML = '<span class=&quot;danger&quot;>FAIL: parent accessible — ' + x + '</span>';
} catch (e) {
document.getElementById('parentResult').innerHTML = '<span class=&quot;ok&quot;>OK: parent blocked — ' + e.message + '</span>';
}
">Test parent access</button>
<p id="parentResult" style="margin-top:8px;">Click to test.</p>
</div>
<div class="card">
<h2>4. External Network Request</h2>
<p>Tries to fetch from a remote URL (the sandbox alone does not block this; it fails only if a CSP or network filter blocks it):</p>
<button onclick="
fetch('https://api.github.com/zen')
.then(r => r.text())
.then(t => document.getElementById('netResult').innerHTML = '<span class=&quot;danger&quot;>Network worked: ' + t + '</span>')
.catch(e => document.getElementById('netResult').innerHTML = '<span class=&quot;ok&quot;>Blocked or failed: ' + e.message + '</span>')
">Test fetch</button>
<p id="netResult" style="margin-top:8px;">Click to test.</p>
</div>
<div class="card">
<h2>5. Table Rendering</h2>
<table>
<thead><tr><th>Feature</th><th>Status</th></tr></thead>
<tbody>
<tr><td>Inline CSS</td><td class="ok">Working</td></tr>
<tr><td>Inline JS</td><td class="ok">Working</td></tr>
<tr><td>Sandbox isolation</td><td class="ok">Active</td></tr>
</tbody>
</table>
</div>
<div class="card">
<h2>6. Auto-run script on load</h2>
<p id="loaded" style="font-size:13px;color:#666;">Script did not run.</p>
<script>
document.getElementById('loaded').innerHTML = '<span class="ok">Script ran on load at ' + new Date().toLocaleTimeString() + '</span>';
</script>
</div>
</body>
</html>