Merge pull request #898 from MODSetter/dev

feat: SearXNG search, Electron desktop app, video agent & UI overhaul
Rohan Verma 2026-03-22 01:28:58 -07:00 committed by GitHub
commit 1013586506
220 changed files with 12155 additions and 3102 deletions

@ -0,0 +1,136 @@
---
name: system-architecture
description: Design systems with appropriate complexity - no more, no less. Use when the user asks to architect applications, design system boundaries, plan service decomposition, evaluate monolith vs microservices, make scaling decisions, or review structural trade-offs. Applies to new system design, refactoring, and migration planning.
---
# System Architecture
Design real structures with clear boundaries, explicit trade-offs, and appropriate complexity. Match architecture to actual requirements, not imagined future needs.
## Workflow
When the user requests an architecture, follow these steps:
```
Task Progress:
- [ ] Step 1: Clarify constraints
- [ ] Step 2: Identify domains
- [ ] Step 3: Map data flow
- [ ] Step 4: Draw boundaries with rationale
- [ ] Step 5: Run complexity checklist
- [ ] Step 6: Present architecture with trade-offs
```
**Step 1 - Clarify constraints.** Ask about:
| Constraint | Question | Why it matters |
|------------|----------|----------------|
| Scale | What's the real load? (users, requests/sec, data size) | Design for 10x current, not 1000x |
| Team | How many developers? How many teams? | Deployable units ≤ number of teams |
| Lifespan | Prototype? MVP? Long-term product? | Temporary systems need temporary solutions |
| Change vectors | What actually varies? | Abstract only where you have evidence of variation |
**Step 2 - Identify domains.** Group by business capability, not technical layer. Look for things that change for different reasons and at different rates.
**Step 3 - Map data flow.** Trace: where does data enter → how does it transform → where does it exit? Make the flow obvious.
**Step 4 - Draw boundaries.** Every boundary needs a reason: different team, different change rate, different compliance requirement, or different scaling need.
**Step 5 - Run complexity checklist.** Before adding any non-trivial pattern:
```
[ ] Have I tried the simple solution?
[ ] Do I have evidence it's insufficient?
[ ] Can my team operate this?
[ ] Will this still make sense in 6 months?
[ ] Can I explain why this complexity is necessary?
```
If any answer is "no", keep it simple.
**Step 6 - Present the architecture** using the output template below.
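As an illustration of the Step 5 gate, the checklist can be treated as a simple all-or-nothing check. This is a minimal sketch, not part of the skill itself: the class name, field names, and helper functions are hypothetical and only mirror the five questions listed in Step 5.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ComplexityChecklist:
    """Answers to the five Step 5 questions (field names are illustrative)."""

    tried_simple_solution: bool
    evidence_simple_is_insufficient: bool
    team_can_operate_it: bool
    still_sensible_in_six_months: bool
    can_explain_necessity: bool


def should_add_pattern(answers: ComplexityChecklist) -> bool:
    """Allow the non-trivial pattern only if every answer is yes."""
    return all(getattr(answers, f.name) for f in fields(answers))


def failed_questions(answers: ComplexityChecklist) -> list[str]:
    """Names of the questions answered "no", in declaration order."""
    return [f.name for f in fields(answers) if not getattr(answers, f.name)]
```

A single "no" is enough to fall back to the simple solution, and `failed_questions` shows which answer blocked the pattern.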
## Output Template
```markdown
### System: [Name]
**Constraints**:
- Scale: [current and expected load]
- Team: [size and structure]
- Lifespan: [prototype / MVP / long-term]
**Architecture**:
[Component diagram or description of components and their relationships]
**Data Flow**:
[How data enters → transforms → exits]
**Key Boundaries**:
| Boundary | Reason | Change Rate |
|----------|--------|-------------|
| ... | ... | ... |
**Trade-offs**:
- Chose X over Y because [reason]
- Accepted [limitation] to gain [benefit]
**Complexity Justification**:
- [Each non-trivial pattern] → [why it's needed, with evidence]
```
## Core Principles
1. **Boundaries at real differences.** Separate concerns that change for different reasons and at different rates.
2. **Dependencies flow inward.** Core logic depends on nothing. Infrastructure depends on core.
3. **Follow the data.** Architecture should make data flow obvious.
4. **Design for failure.** Networks fail. Databases time out. Build compensation into the structure.
5. **Design for operations.** You will debug this at 3am. Every request needs a trace. Every error needs context for replay.
For concrete good/bad examples of each principle, see [examples.md](examples.md).
## Anti-Patterns
| Don't | Do Instead |
|-------|------------|
| Microservices for a 3-person team | Well-structured monolith |
| Event sourcing for CRUD | Simple state storage |
| Message queues within the same process | Just call the function |
| Distributed transactions | Redesign to avoid, or accept eventual consistency |
| Repository wrapping an ORM | Use the ORM directly |
| Interfaces with one implementation | Mock at boundaries only |
| AbstractFactoryFactoryBean | Just instantiate the thing |
| DI containers for simple graphs | Constructor injection is enough |
| Clean Architecture for a TODO app | Match layers to actual complexity |
| DDD tactics without strategic design | Aggregates need bounded contexts |
| Hexagonal ports with one adapter | Just call the database |
| CQRS when reads = writes | Add when they diverge |
| "We might swap databases" | You won't; rewrite if you do |
| "Multi-tenant someday" | Build it when you have tenant #2 |
| "Microservices for team scale" | Helps at 50+ engineers, not 4 |
## Success Criteria
Your architecture is right-sized when:
1. **You can draw it** - dependency graph fits on a whiteboard
2. **You can explain it** - new team member understands data flow in 30 minutes
3. **You can change it** - adding a feature touches 1-3 modules, not 10
4. **You can delete it** - removing a component needs no archaeology
5. **You can debug it** - tracing a request takes minutes, not hours
6. **It matches your team** - deployable units ≤ number of teams
## When the Simple Solution Isn't Enough
If the complexity checklist says "yes, scale is real", see [scaling-checklist.md](scaling-checklist.md) for concrete techniques covering caching, async processing, partitioning, horizontal scaling, and multi-region.
## Iterative Architecture
Architecture is discovered, not designed upfront:
1. **Start obvious** - group by domain, not by technical layer
2. **Let hotspots emerge** - monitor which modules change together
3. **Extract when painful** - split only when the current form causes measurable problems
4. **Document decisions** - record why boundaries exist so future you knows what's load-bearing
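One way to let hotspots emerge is to count which modules are touched by the same commits. The sketch below is an assumption-laden illustration: it expects the commit history to be already exported as lists of changed file paths (for example parsed from `git log --name-only`), and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_level_module(path: str) -> str:
    """Map 'billing/api.py' to 'billing'; root-level files map to themselves."""
    return path.split("/", 1)[0]


def co_change_counts(commits: list[list[str]]) -> Counter:
    """Count how often each pair of modules is touched by the same commit."""
    pairs: Counter = Counter()
    for changed_files in commits:
        modules = sorted({top_level_module(p) for p in changed_files})
        pairs.update(combinations(modules, 2))
    return pairs


history = [
    ["users/api.py", "billing/services.py"],
    ["users/models.py", "billing/models.py"],
    ["users/api.py"],
]
print(co_change_counts(history).most_common(1))  # [(('billing', 'users'), 2)]
```

Pairs with high counts are candidates for merging or for a clearer boundary; pairs that never change together suggest the split is holding.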
Every senior engineer has a graveyard of over-engineered systems they regret. Learn from their pain. Build boring systems that work.

@ -0,0 +1,120 @@
# Architecture Examples
Concrete good/bad examples for each core principle in SKILL.md.
---
## Boundaries at Real Differences
**Good** - Meaningful boundary:
```
# Users and Billing are separate bounded contexts
# - Different teams own them
# - Different change cadences (users: weekly, billing: quarterly)
# - Different compliance requirements
src/
  users/           # User management domain
    models.py
    services.py
    api.py
  billing/         # Billing domain
    models.py
    services.py
    api.py
  shared/          # Truly shared utilities
    auth.py
```
**Bad** - Ceremony without purpose:
```
# UserService → UserRepository → UserRepositoryImpl
# ...when you'll never swap the database
src/
  interfaces/
    IUserRepository.py      # One implementation exists
  repositories/
    UserRepositoryImpl.py   # Wraps SQLAlchemy, which is already a repository
  services/
    UserService.py          # Just calls the repository
```
---
## Dependencies Flow Inward
**Good** - Clear dependency direction:
```
# Dependency flows inward: infrastructure → application → domain
domain/              # Pure business logic, no imports from outer layers
  order.py           # Order entity with business rules
application/         # Use cases, orchestrates domain
  place_order.py     # Imports from domain/, not infrastructure/
infrastructure/      # External concerns
  postgres.py        # Implements persistence, imports from application/
  stripe.py          # Implements payments
```
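The layout above can be collapsed into a single file to show the direction of the imports. This is a minimal sketch under stated assumptions: the `Order` fields, the `SaveOrder` callable type, and `InMemoryOrderStore` are hypothetical stand-ins, and a plain injected function is used instead of an interface.

```python
from dataclasses import dataclass
from typing import Callable


# --- domain/order.py: pure business rules, no outer-layer imports ---
@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int

    def validate(self) -> None:
        if self.total_cents <= 0:
            raise ValueError("order total must be positive")


# --- application/place_order.py: orchestrates the domain ---
SaveOrder = Callable[[Order], None]


def place_order(order: Order, save: SaveOrder) -> Order:
    """Validate with domain rules, then persist via whatever was injected."""
    order.validate()
    save(order)
    return order


# --- infrastructure/: outermost layer, knows about storage details ---
class InMemoryOrderStore:
    """Stand-in for postgres.py; a real adapter would talk to the database."""

    def __init__(self) -> None:
        self.rows: dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self.rows[order.order_id] = order
```

Nothing in the domain or application section refers to the store class; only the caller that wires things together passes `store.save` in.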
---
## Follow the Data
**Good** - Obvious data flow:
```
Request → Validate → Transform → Store → Respond
# Each step is a clear function/module:
api/routes.py # Request enters
validators.py # Validation
transformers.py # Business logic transformation
repositories.py # Storage
serializers.py # Response shaping
```
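The same flow can be written as one function per stage so that the handler reads in the order the data moves. This is an illustrative sketch only: the signup request, the list used as a table, and all function names are assumptions, not code from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupRequest:
    email: str


def validate(request: SignupRequest) -> SignupRequest:
    if "@" not in request.email:
        raise ValueError("invalid email")
    return request


def transform(request: SignupRequest) -> dict:
    return {"email": request.email.strip().lower()}


def store(record: dict, table: list) -> dict:
    saved = {"id": len(table) + 1, **record}
    table.append(saved)
    return saved


def respond(saved: dict) -> dict:
    return {"status": 201, "body": {"id": saved["id"]}}


def handle(request: SignupRequest, table: list) -> dict:
    """Request -> Validate -> Transform -> Store -> Respond."""
    return respond(store(transform(validate(request)), table))
```

Each stage takes the previous stage's output, so a reader can trace a request by following one call chain.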
---
## Design for Failure
**Good** - Failure-aware design with compensation:
```python
class OrderService:
    def place_order(self, order: Order) -> Result:
        inventory = self.inventory.reserve(order.items)
        if inventory.failed:
            return Result.failure("Items unavailable", retry=False)
        payment = self.payments.charge(order.total)
        if payment.failed:
            self.inventory.release(inventory.reservation_id)  # Compensate
            return Result.failure("Payment failed", retry=True)
        return Result.success(order)
```
---
## Design for Operations
**Good** - Observable architecture:
```python
@trace
def handle_request(request):
    log.info("Processing", request_id=request.id, user=request.user_id)
    try:
        result = process(request)
        log.info("Completed", request_id=request.id, result=result.status)
        return result
    except Exception as e:
        log.error("Failed", request_id=request.id, error=str(e),
                  context=request.to_dict())  # Full context for replay
        raise
```
Key elements:
- Every request gets a correlation ID
- Every service logs with that ID
- Every error includes full context for reproduction
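The key elements above can also be approximated with only the standard library. The sketch below is one possible approach, not the one used in the example: it swaps the `@trace` decorator and keyword-argument logger for `logging` plus `contextvars`, and the logger name, payload shape, and function names are assumptions.

```python
import contextvars
import logging
import uuid

# Holds the correlation ID for whichever request is currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def build_logger(handler: logging.Handler) -> logging.Logger:
    handler.setFormatter(
        logging.Formatter("%(request_id)s %(levelname)s %(message)s")
    )
    handler.addFilter(RequestIdFilter())
    logger = logging.getLogger("orders")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, payload: dict) -> str:
    """Assign a correlation ID, then log with it on success and on failure."""
    request_id = str(uuid.uuid4())
    token = request_id_var.set(request_id)
    try:
        logger.info("Processing")
        if "item" not in payload:
            raise KeyError("item")
        logger.info("Completed")
        return request_id
    except Exception:
        logger.exception("Failed payload=%r", payload)  # context for replay
        raise
    finally:
        request_id_var.reset(token)
```

Because the ID lives in a context variable, any function called while the request is being handled logs with the same ID without having it passed as an argument.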

@ -0,0 +1,76 @@
# Scaling Checklist
Concrete techniques for when the complexity checklist in SKILL.md confirms scale is a real problem. Apply in order - each level addresses the bottleneck that remains after the previous one.
---
## Level 0: Optimize First
Before adding infrastructure, exhaust these:
- [ ] Database queries have proper indexes
- [ ] N+1 queries eliminated
- [ ] Connection pooling configured
- [ ] Slow endpoints profiled and optimized
- [ ] Static assets served via CDN
## Level 1: Read-Heavy
**Symptom**: Database reads are the bottleneck.
| Technique | When | Trade-off |
|-----------|------|-----------|
| Application cache (in-memory) | Small, frequently accessed data | Stale data, memory pressure |
| Redis/Memcached | Shared cache across instances | Network hop, cache invalidation complexity |
| Read replicas | High read volume, slight staleness OK | Replication lag, eventual consistency |
| CDN | Static or semi-static content | Cache invalidation delay |
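To make the staleness trade-off of an application cache concrete, here is a minimal in-process cache with a time-to-live. It is a sketch under stated assumptions: the class name, the `get_or_load` method, and the injectable clock are hypothetical, and there is no eviction, so memory grows with the number of keys.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-process cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, load: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]  # may be stale relative to the database
        value = load()  # miss or expired: go back to the source
        self._entries[key] = (now, value)
        return value
```

Within the TTL window a reader sees the cached value even if the database has changed, which is exactly the "stale data" cost listed in the table.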
## Level 2: Write-Heavy
**Symptom**: Database writes or processing are the bottleneck.
| Technique | When | Trade-off |
|-----------|------|-----------|
| Async task queue (Celery, SQS) | Work can be deferred | Eventual consistency, failure handling |
| Write-behind cache | Batch frequent writes | Data loss risk on crash |
| Event streaming (Kafka) | Multiple consumers of same data | Operational complexity, ordering guarantees |
| CQRS | Reads and writes have diverged significantly | Two models to maintain |
## Level 3: Traffic Spikes
**Symptom**: Individual instances can't handle peak load.
| Technique | When | Trade-off |
|-----------|------|-----------|
| Horizontal scaling + load balancer | Stateless services | Session management, deploy complexity |
| Auto-scaling | Unpredictable traffic patterns | Cold start latency, cost spikes |
| Rate limiting | Protect against abuse/spikes | Legitimate users may be throttled |
| Circuit breakers | Downstream services degrade | Partial functionality during failures |
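Rate limiting from the table above is often implemented with a token bucket; the sketch below shows that one technique, which is not prescribed by the checklist. The class name, parameters, and injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, capacity: float, rate_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate_per_second = rate_per_second
        self.clock = clock
        self.tokens = capacity
        self.updated_at = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill for the time that passed, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.rate_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller would typically answer HTTP 429
```

A burst larger than `capacity` is rejected even when it comes from a legitimate user, which is the throttling trade-off named in the table.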
## Level 4: Data Growth
**Symptom**: Single database can't hold or query all the data efficiently.
| Technique | When | Trade-off |
|-----------|------|-----------|
| Table partitioning | Time-series or naturally partitioned data | Query complexity, partition management |
| Archival / cold storage | Old data rarely accessed | Access latency for archived data |
| Database sharding | Partitioning insufficient, clear shard key exists | Cross-shard queries, operational burden |
| Search index (Elasticsearch) | Full-text or complex queries on large datasets | Index lag, another system to operate |
## Level 5: Multi-Region
**Symptom**: Users are geographically distributed and latency matters.
| Technique | When | Trade-off |
|-----------|------|-----------|
| CDN + edge caching | Static/semi-static content | Cache invalidation |
| Read replicas per region | Read-heavy, slight staleness OK | Replication lag |
| Active-passive failover | Disaster recovery | Failover time, cost of standby |
| Active-active multi-region | True global low-latency required | Conflict resolution, extreme complexity |
---
## Decision Rule
Always start at Level 0. Move to the next level only when you have **measured evidence** that the current level is insufficient. Skipping levels is how you end up with Kafka for a TODO app.

.github/workflows/desktop-release.yml

@ -0,0 +1,78 @@
name: Desktop Release
on:
  push:
    tags:
      - 'v*'
      - 'beta-v*'
permissions:
  contents: write
jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - os: macos-latest
            platform: --mac
          - os: ubuntu-latest
            platform: --linux
          - os: windows-latest
            platform: --win
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Extract version from tag
        id: version
        shell: bash
        run: |
          TAG=${GITHUB_REF#refs/tags/}
          VERSION=${TAG#beta-}
          VERSION=${VERSION#v}
          echo "VERSION=$VERSION" >> "$GITHUB_OUTPUT"
      - name: Setup pnpm
        uses: pnpm/action-setup@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'pnpm'
          cache-dependency-path: |
            surfsense_web/pnpm-lock.yaml
            surfsense_desktop/pnpm-lock.yaml
      - name: Install web dependencies
        run: pnpm install
        working-directory: surfsense_web
      - name: Build Next.js standalone
        run: pnpm build
        working-directory: surfsense_web
        env:
          NEXT_PUBLIC_FASTAPI_BACKEND_URL: ${{ vars.NEXT_PUBLIC_FASTAPI_BACKEND_URL }}
          NEXT_PUBLIC_ELECTRIC_URL: ${{ vars.NEXT_PUBLIC_ELECTRIC_URL }}
          NEXT_PUBLIC_DEPLOYMENT_MODE: ${{ vars.NEXT_PUBLIC_DEPLOYMENT_MODE }}
          NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE: ${{ vars.NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE }}
      - name: Install desktop dependencies
        run: pnpm install
        working-directory: surfsense_desktop
      - name: Build Electron
        run: pnpm build
        working-directory: surfsense_desktop
        env:
          HOSTED_FRONTEND_URL: ${{ vars.HOSTED_FRONTEND_URL }}
      - name: Package & Publish
        run: pnpm exec electron-builder ${{ matrix.platform }} --config electron-builder.yml --publish always -c.extraMetadata.version=${{ steps.version.outputs.VERSION }}
        working-directory: surfsense_desktop
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.gitignore

@ -5,4 +5,4 @@ node_modules/
.ruff_cache/
.venv
.pnpm-store
.DS_Store
.DS_Store

.vscode/launch.json

@ -22,7 +22,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
"python": "${command:python.interpreterPath}"
"python": "uv",
"pythonArgs": [
"run",
"python"
]
},
{
"name": "Backend: FastAPI (No Reload)",
@ -32,7 +36,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
"python": "${command:python.interpreterPath}"
"python": "uv",
"pythonArgs": [
"run",
"python"
]
},
{
"name": "Backend: FastAPI (main.py)",
@ -41,14 +49,19 @@
"program": "${workspaceFolder}/surfsense_backend/main.py",
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend"
"cwd": "${workspaceFolder}/surfsense_backend",
"python": "uv",
"pythonArgs": [
"run",
"python"
]
},
{
"name": "Frontend: Next.js",
"type": "node",
"request": "launch",
"cwd": "${workspaceFolder}/surfsense_web",
"runtimeExecutable": "npm",
"runtimeExecutable": "pnpm",
"runtimeArgs": ["run", "dev"],
"console": "integratedTerminal",
"serverReadyAction": {
@ -62,7 +75,7 @@
"type": "node",
"request": "launch",
"cwd": "${workspaceFolder}/surfsense_web",
"runtimeExecutable": "npm",
"runtimeExecutable": "pnpm",
"runtimeArgs": ["run", "debug:server"],
"console": "integratedTerminal",
"serverReadyAction": {
@ -87,7 +100,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
"python": "${command:python.interpreterPath}"
"python": "uv",
"pythonArgs": [
"run",
"python"
]
},
{
"name": "Celery: Beat Scheduler",
@ -103,7 +120,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
"python": "${command:python.interpreterPath}"
"python": "uv",
"pythonArgs": [
"run",
"python"
]
}
],
"compounds": [

@ -1,3 +1,4 @@
{
"biome.configurationPath": "./surfsense_web/biome.json"
"biome.configurationPath": "./surfsense_web/biome.json",
"deepscan.ignoreConfirmWarning": true
}

@ -27,11 +27,18 @@ SurfSense es un agente de investigación de IA altamente personalizable, conecta
# Video
# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
## Podcast Example
## Video Agent Example
https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
## Podcast Agent Example
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Connect your connectors and sync. Enable periodic syncing to keep them up to date.
<p align="center"><img src="https://github.com/user-attachments/assets/59da61d7-da05-4576-b7c0-dbc09f5985e8" alt="Connectors" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/0740f351-23fa-4909-9880-70aa1dcc1df7" alt="Connectors" /></p>
3. While the connector data is being indexed, upload documents.
<p align="center"><img src="https://github.com/user-attachments/assets/d1e8b2e2-9eac-41d8-bdc0-f0cdc405d128" alt="Upload Documents" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/daf3dbae-ef86-4e86-82ea-fcbcad988761" alt="Upload Documents" /></p>
4. Once everything is indexed, ask anything you like (use cases):
- Video generation
<p align="center"><img src="https://github.com/user-attachments/assets/af85c0f3-6cfd-4757-9706-07fd5e32c857" alt="Video Generation" /></p>
- Basic search and citations
<p align="center"><img src="https://github.com/user-attachments/assets/81e797a1-e01a-4003-8e60-0a0b3a9789df" alt="Search and Citation" /></p>
- Q&A with document mentions
<p align="center"><img src="https://github.com/user-attachments/assets/65c3bf06-1d46-4dd5-b169-4d934c9b6798" alt="Q&A with Document Mentions" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/be958295-0a8c-4707-998c-9fe1f1c007be" alt="Q&A with Document Mentions" /></p>
- Report generation and exports (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text)
@ -133,6 +145,8 @@ Para Docker Compose, instalación manual y otras opciones de despliegue, consult
| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers via OpenAI spec and LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama); your data is yours |
| Team Collaboration | RBAC with Owner / Admin / Editor / Viewer roles, real-time chat and comment threads |
| Video Generation | Generate videos with narration and visuals |
| Presentation Generation | Create editable, slide-based presentations |
| Podcast Generation | 3 min podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross-browser extension to save any web page, including authentication-protected pages |
| 25+ Connectors | Search engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord and [more](#fuentes-externas) |

@ -27,11 +27,18 @@ SurfSense एक अत्यधिक अनुकूलन योग्य AI
# Video
# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
## Podcast Sample
## Video Agent Sample
https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
## Podcast Agent Sample
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Add your connectors and sync. Enable periodic syncing to keep the connectors up to date.
<p align="center"><img src="https://github.com/user-attachments/assets/59da61d7-da05-4576-b7c0-dbc09f5985e8" alt="Connectors" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/0740f351-23fa-4909-9880-70aa1dcc1df7" alt="Connectors" /></p>
3. While the connector data is being indexed, upload documents.
<p align="center"><img src="https://github.com/user-attachments/assets/d1e8b2e2-9eac-41d8-bdc0-f0cdc405d128" alt="Upload Documents" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/daf3dbae-ef86-4e86-82ea-fcbcad988761" alt="Upload Documents" /></p>
4. Once everything is indexed, ask anything (use cases):
- Video generation
<p align="center"><img src="https://github.com/user-attachments/assets/af85c0f3-6cfd-4757-9706-07fd5e32c857" alt="Video Generation" /></p>
- Basic search and citation
<p align="center"><img src="https://github.com/user-attachments/assets/81e797a1-e01a-4003-8e60-0a0b3a9789df" alt="Search and Citation" /></p>
- Document mention Q&A
<p align="center"><img src="https://github.com/user-attachments/assets/65c3bf06-1d46-4dd5-b169-4d934c9b6798" alt="Document Mention Q&A" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/be958295-0a8c-4707-998c-9fe1f1c007be" alt="Document Mention Q&A" /></p>
- Report generation and export (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text)
@ -133,6 +145,8 @@ Docker Compose, मैनुअल इंस्टॉलेशन और अन
| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers via OpenAI spec and LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama); your data stays yours |
| Team Collaboration | RBAC with Owner / Admin / Editor / Viewer roles, real-time chat and comment threads |
| Video Generation | Create videos with narration and visuals |
| Presentation Generation | Create editable, slide-based presentations |
| Podcast Generation | 3-minute podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross-browser extension to save any web page, including authentication-protected pages |
| 25+ Connectors | Search engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord and [more](#बाहरी-स्रोत) |

@ -27,11 +27,18 @@ SurfSense is a highly customizable AI research agent, connected to external sour
# Video
# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
## Podcast Sample
## Video Agent Sample
https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
## Podcast Agent Sample
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Connect your connectors and sync. Enable periodic syncing to keep connectors synced.
<p align="center"><img src="https://github.com/user-attachments/assets/59da61d7-da05-4576-b7c0-dbc09f5985e8" alt="Connectors" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/0740f351-23fa-4909-9880-70aa1dcc1df7" alt="Connectors" /></p>
3. While connector data is being indexed, upload documents.
<p align="center"><img src="https://github.com/user-attachments/assets/d1e8b2e2-9eac-41d8-bdc0-f0cdc405d128" alt="Upload Documents" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/daf3dbae-ef86-4e86-82ea-fcbcad988761" alt="Upload Documents" /></p>
4. Once everything is indexed, Ask Away (Use Cases):
- Video Generation
<p align="center"><img src="https://github.com/user-attachments/assets/af85c0f3-6cfd-4757-9706-07fd5e32c857" alt="Video Generation" /></p>
- Basic search and citation
<p align="center"><img src="https://github.com/user-attachments/assets/81e797a1-e01a-4003-8e60-0a0b3a9789df" alt="Search and Citation" /></p>
- Document Mention QNA
<p align="center"><img src="https://github.com/user-attachments/assets/65c3bf06-1d46-4dd5-b169-4d934c9b6798" alt="Document Mention QNA" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/be958295-0a8c-4707-998c-9fe1f1c007be" alt="Document Mention QNA" /></p>
- Report Generation and Exports (PDF, DOCX, HTML, LaTeX, EPUB, ODT, Plain Text)
@ -133,6 +145,8 @@ For Docker Compose, manual installation, and other deployment options, see the [
| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers via OpenAI spec & LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama); your data stays yours |
| Team Collaboration | RBAC with Owner / Admin / Editor / Viewer roles, real-time chat & comment threads |
| Video Generation | Generate videos with narration and visuals |
| Presentation Generation | Create editable, slide-based presentations |
| Podcast Generation | 3 min podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross-browser extension to save any webpage, including auth-protected pages |
| 25+ Connectors | Search Engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord & [more](#external-sources) |

@ -27,11 +27,18 @@ SurfSense é um agente de pesquisa de IA altamente personalizável, conectado a
# Video
# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
## Podcast Example
## Video Agent Example
https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
## Podcast Agent Example
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Connect your connectors and sync. Enable periodic syncing to keep the connectors up to date.
<p align="center"><img src="https://github.com/user-attachments/assets/59da61d7-da05-4576-b7c0-dbc09f5985e8" alt="Connectors" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/0740f351-23fa-4909-9880-70aa1dcc1df7" alt="Connectors" /></p>
3. While the connector data is being indexed, upload documents.
<p align="center"><img src="https://github.com/user-attachments/assets/d1e8b2e2-9eac-41d8-bdc0-f0cdc405d128" alt="Document Upload" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/daf3dbae-ef86-4e86-82ea-fcbcad988761" alt="Document Upload" /></p>
4. When everything is indexed, ask anything you like (use cases):
- Video generation
<p align="center"><img src="https://github.com/user-attachments/assets/af85c0f3-6cfd-4757-9706-07fd5e32c857" alt="Video Generation" /></p>
- Basic search and citations
<p align="center"><img src="https://github.com/user-attachments/assets/81e797a1-e01a-4003-8e60-0a0b3a9789df" alt="Search and Citation" /></p>
- Q&A with document mentions
<p align="center"><img src="https://github.com/user-attachments/assets/65c3bf06-1d46-4dd5-b169-4d934c9b6798" alt="Q&A with Document Mentions" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/be958295-0a8c-4707-998c-9fe1f1c007be" alt="Q&A with Document Mentions" /></p>
- Report generation and exports (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text)
@ -133,6 +145,8 @@ Para Docker Compose, instalação manual e outras opções de implantação, con
| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers via OpenAI spec and LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama); your data stays with you |
| Team Collaboration | RBAC with Owner / Admin / Editor / Viewer roles, real-time chat and comment threads |
| Video Generation | Generates videos with narration and visuals |
| Presentation Generation | Creates editable, slide-based presentations |
| Podcast Generation | 3 min podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross-browser extension to save any web page, including authentication-protected pages |
| 25+ Connectors | Search engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord and [more](#fontes-externas) |

@ -27,11 +27,18 @@ SurfSense 是一个高度可定制的 AI 研究助手,可以连接外部数据
# Video
# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
## Podcast Example
## Video Agent Example
https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
## Podcast Agent Example
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Connect your connectors and sync. Enable periodic syncing to keep connector data up to date.
<p align="center"><img src="https://github.com/user-attachments/assets/59da61d7-da05-4576-b7c0-dbc09f5985e8" alt="Connectors" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/0740f351-23fa-4909-9880-70aa1dcc1df7" alt="Connectors" /></p>
3. While the connector data is being indexed, upload documents.
<p align="center"><img src="https://github.com/user-attachments/assets/d1e8b2e2-9eac-41d8-bdc0-f0cdc405d128" alt="Upload Documents" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/daf3dbae-ef86-4e86-82ea-fcbcad988761" alt="Upload Documents" /></p>
4. Once everything is indexed, ask away (use cases):
- Video generation
<p align="center"><img src="https://github.com/user-attachments/assets/af85c0f3-6cfd-4757-9706-07fd5e32c857" alt="Video Generation" /></p>
- Basic search and citation
<p align="center"><img src="https://github.com/user-attachments/assets/81e797a1-e01a-4003-8e60-0a0b3a9789df" alt="Search and Citation" /></p>
- Document mention Q&A
<p align="center"><img src="https://github.com/user-attachments/assets/65c3bf06-1d46-4dd5-b169-4d934c9b6798" alt="Document Mention Q&A" /></p>
<p align="center"><img src="https://github.com/user-attachments/assets/be958295-0a8c-4707-998c-9fe1f1c007be" alt="Document Mention Q&A" /></p>
- Report generation and export (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text)
@ -133,6 +145,8 @@ irm https://raw.githubusercontent.com/MODSetter/SurfSense/main/docker/scripts/in
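| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers, via OpenAI spec and LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama); you control your data |
| Team Collaboration | RBAC role control (Owner / Admin / Editor / Viewer), real-time chat and comment threads |
| Video Generation | Generate videos with narration and visuals |
| Presentation Generation | Create editable, slide-style presentations |
| Podcast Generation | Generate a 3-minute podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross-browser extension that saves any web page, including pages that require authentication |
| 25+ Connectors | Search engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord and [more](#外部数据源) |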

@ -36,6 +36,7 @@ EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# BACKEND_PORT=8929
# FRONTEND_PORT=3929
# ELECTRIC_PORT=5929
# SEARXNG_PORT=8888
# FLOWER_PORT=5555
# ==============================================================================
@ -199,6 +200,16 @@ STT_SERVICE=local/base
# COMPOSIO_ENABLED=TRUE
# COMPOSIO_REDIRECT_URI=http://localhost:8000/api/v1/auth/composio/connector/callback
# ------------------------------------------------------------------------------
# SearXNG (bundled web search — works out of the box, no config needed)
# ------------------------------------------------------------------------------
# SearXNG provides web search to all search spaces automatically.
# To access the SearXNG UI directly: http://localhost:8888
# To disable the service entirely: docker compose up --scale searxng=0
# To point at your own SearXNG instance instead of the bundled one:
# SEARXNG_DEFAULT_HOST=http://your-searxng:8080
# SEARXNG_SECRET=surfsense-searxng-secret
# ------------------------------------------------------------------------------
# Daytona Sandbox (optional — cloud code execution for the deep agent)
# ------------------------------------------------------------------------------

@ -57,6 +57,20 @@ services:
      timeout: 5s
      retries: 5
  searxng:
    image: searxng/searxng:2026.3.13-3c1f68c59
    ports:
      - "${SEARXNG_PORT:-8888}:8080"
    volumes:
      - ./searxng:/etc/searxng
    environment:
      - SEARXNG_SECRET=${SEARXNG_SECRET:-surfsense-searxng-secret}
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/healthz"]
      interval: 10s
      timeout: 5s
      retries: 5
  backend:
    build: ../surfsense_backend
    ports:
@ -81,6 +95,7 @@ services:
- ELECTRIC_DB_PASSWORD=${ELECTRIC_DB_PASSWORD:-electric_password}
- AUTH_TYPE=${AUTH_TYPE:-LOCAL}
- NEXT_FRONTEND_URL=${NEXT_FRONTEND_URL:-http://localhost:3000}
- SEARXNG_DEFAULT_HOST=${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
# Daytona Sandbox uncomment and set credentials to enable cloud code execution
# - DAYTONA_SANDBOX_ENABLED=TRUE
# - DAYTONA_API_KEY=${DAYTONA_API_KEY:-}
@ -92,6 +107,8 @@ services:
condition: service_healthy
redis:
condition: service_healthy
searxng:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 15s
@ -115,6 +132,7 @@ services:
- PYTHONPATH=/app
- ELECTRIC_DB_USER=${ELECTRIC_DB_USER:-electric}
- ELECTRIC_DB_PASSWORD=${ELECTRIC_DB_PASSWORD:-electric_password}
- SEARXNG_DEFAULT_HOST=${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
- SERVICE_ROLE=worker
depends_on:
db:

View file

@ -42,6 +42,19 @@ services:
timeout: 5s
retries: 5
searxng:
image: searxng/searxng:2026.3.13-3c1f68c59
volumes:
- ./searxng:/etc/searxng
environment:
SEARXNG_SECRET: ${SEARXNG_SECRET:-surfsense-searxng-secret}
restart: unless-stopped
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/healthz"]
interval: 10s
timeout: 5s
retries: 5
backend:
image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}
ports:
@ -62,6 +75,7 @@ services:
ELECTRIC_DB_USER: ${ELECTRIC_DB_USER:-electric}
ELECTRIC_DB_PASSWORD: ${ELECTRIC_DB_PASSWORD:-electric_password}
NEXT_FRONTEND_URL: ${NEXT_FRONTEND_URL:-http://localhost:${FRONTEND_PORT:-3929}}
SEARXNG_DEFAULT_HOST: ${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
# Daytona Sandbox uncomment and set credentials to enable cloud code execution
# DAYTONA_SANDBOX_ENABLED: "TRUE"
# DAYTONA_API_KEY: ${DAYTONA_API_KEY:-}
@ -75,6 +89,8 @@ services:
condition: service_healthy
redis:
condition: service_healthy
searxng:
condition: service_healthy
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
@ -98,6 +114,7 @@ services:
PYTHONPATH: /app
ELECTRIC_DB_USER: ${ELECTRIC_DB_USER:-electric}
ELECTRIC_DB_PASSWORD: ${ELECTRIC_DB_PASSWORD:-electric_password}
SEARXNG_DEFAULT_HOST: ${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
SERVICE_ROLE: worker
depends_on:
db:

View file

@ -103,6 +103,7 @@ Write-Step "Downloading SurfSense files"
Write-Info "Installation directory: $InstallDir"
New-Item -ItemType Directory -Path "$InstallDir\scripts" -Force | Out-Null
New-Item -ItemType Directory -Path "$InstallDir\searxng" -Force | Out-Null
$Files = @(
@{ Src = "docker/docker-compose.yml"; Dest = "docker-compose.yml" }
@ -110,6 +111,8 @@ $Files = @(
@{ Src = "docker/postgresql.conf"; Dest = "postgresql.conf" }
@{ Src = "docker/scripts/init-electric-user.sh"; Dest = "scripts/init-electric-user.sh" }
@{ Src = "docker/scripts/migrate-database.ps1"; Dest = "scripts/migrate-database.ps1" }
@{ Src = "docker/searxng/settings.yml"; Dest = "searxng/settings.yml" }
@{ Src = "docker/searxng/limiter.toml"; Dest = "searxng/limiter.toml" }
)
foreach ($f in $Files) {

View file

@ -102,6 +102,7 @@ wait_for_pg() {
step "Downloading SurfSense files"
info "Installation directory: ${INSTALL_DIR}"
mkdir -p "${INSTALL_DIR}/scripts"
mkdir -p "${INSTALL_DIR}/searxng"
FILES=(
"docker/docker-compose.yml:docker-compose.yml"
@ -109,6 +110,8 @@ FILES=(
"docker/postgresql.conf:postgresql.conf"
"docker/scripts/init-electric-user.sh:scripts/init-electric-user.sh"
"docker/scripts/migrate-database.sh:scripts/migrate-database.sh"
"docker/searxng/settings.yml:searxng/settings.yml"
"docker/searxng/limiter.toml:searxng/limiter.toml"
)
for entry in "${FILES[@]}"; do

View file

@ -0,0 +1,5 @@
[botdetection.ip_limit]
link_token = false
[botdetection.ip_lists]
pass_ip = ["0.0.0.0/0"]

View file

@ -0,0 +1,90 @@
use_default_settings:
  engines:
    remove:
      - ahmia
      - torch
      - qwant
      - qwant news
      - qwant images
      - qwant videos
      - mojeek
      - mojeek images
      - mojeek news
server:
  secret_key: "override-me-via-env"
  limiter: false
  image_proxy: false
  method: "GET"
  default_http_headers:
    X-Robots-Tag: "noindex, nofollow"
search:
  formats:
    - html
    - json
  default_lang: "auto"
  autocomplete: ""
  safe_search: 0
  ban_time_on_fail: 5
  max_ban_time_on_fail: 120
  suspended_times:
    SearxEngineAccessDenied: 3600
    SearxEngineCaptcha: 3600
    SearxEngineTooManyRequests: 600
    cf_SearxEngineCaptcha: 7200
    cf_SearxEngineAccessDenied: 3600
    recaptcha_SearxEngineCaptcha: 7200
ui:
  static_use_hash: true
outgoing:
  request_timeout: 12.0
  max_request_timeout: 20.0
  pool_connections: 100
  pool_maxsize: 20
  enable_http2: true
  extra_proxy_timeout: 10
  retries: 1
  # Uncomment and set your residential proxy URL to route search engine requests through it.
  # Format: http://<username>:<base64_password>@<hostname>:<port>/
  #
  # proxies:
  #   all://:
  #     - http://user:pass@proxy-host:port/
engines:
  - name: google
    disabled: false
    weight: 1.2
    retry_on_http_error: [429, 503]
  - name: duckduckgo
    disabled: false
    weight: 1.1
    retry_on_http_error: [429, 503]
  - name: brave
    disabled: false
    weight: 1.0
    retry_on_http_error: [429, 503]
  - name: bing
    disabled: false
    weight: 0.9
    retry_on_http_error: [429, 503]
  - name: wikipedia
    disabled: false
    weight: 0.8
  - name: stackoverflow
    disabled: false
    weight: 0.7
  - name: yahoo
    disabled: false
    weight: 0.7
    retry_on_http_error: [429, 503]
  - name: wikidata
    disabled: false
    weight: 0.6
  - name: currency
    disabled: false
  - name: ddg definitions
    disabled: false
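
Because `search.formats` in this settings file enables `json`, the bundled instance can be queried programmatically. The following is a minimal sketch, not SurfSense's actual client: it assumes SearXNG's standard `/search` endpoint with `q` and `format` query parameters and a top-level `results` list in the JSON response. The function names `build_searxng_url` and `search_searxng` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_searxng_url(host: str, query: str) -> str:
    """Build a SearXNG search URL that requests JSON output (hypothetical helper)."""
    params = urlencode({"q": query, "format": "json"})
    return f"{host.rstrip('/')}/search?{params}"


def search_searxng(host: str, query: str, top_k: int = 10) -> list[dict]:
    """Fetch up to top_k results; needs a reachable SearXNG instance."""
    with urlopen(build_searxng_url(host, query), timeout=20) as resp:
        payload = json.load(resp)
    # Assumption: the response carries a top-level "results" list.
    return payload.get("results", [])[:top_k]
```

Inside the Compose network the host would presumably be the `SEARXNG_DEFAULT_HOST` default, `http://searxng:8080`; from the host machine it would be `http://localhost:8888` unless `SEARXNG_PORT` is overridden.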

View file

@ -14,6 +14,7 @@ SurfSense now supports the following Chinese LLMs
- ✅ **Alibaba Qwen (Tongyi Qianwen)** - Alibaba Cloud's Qwen large language models
- ✅ **Moonshot Kimi** - Moonshot AI's Kimi large language models
- ✅ **Zhipu AI GLM** - Zhipu AI's GLM model series
- ✅ **MiniMax** - MiniMax large language models (M2.5 series, 204K context)
---
@ -197,6 +198,52 @@ API Base URL: https://open.bigmodel.cn/api/paas/v4
---
## 5️⃣ MiniMax Configuration
### Get an API Key
1. Visit the [MiniMax Open Platform](https://platform.minimaxi.com/)
2. Register and log in
3. Go to the **API Keys** page
4. Create a new API Key
5. Copy the API Key
### Configure in SurfSense
| Field | Value | Notes |
|------|-----|------|
| **Configuration Name** | `MiniMax M2.5` | Configuration name (your choice) |
| **Provider** | `MINIMAX` | Select MiniMax |
| **Model Name** | `MiniMax-M2.5` | Recommended model<br>Other option: `MiniMax-M2.5-highspeed` |
| **API Key** | `eyJ...` | Your MiniMax API Key |
| **API Base URL** | `https://api.minimax.io/v1` | MiniMax API endpoint |
| **Parameters** | `{"temperature": 1.0}` | Note: temperature must be in the range (0.0, 1.0] and cannot be 0 |
### Example Configuration
```
Configuration Name: MiniMax M2.5
Provider: MINIMAX
Model Name: MiniMax-M2.5
API Key: eyJxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
API Base URL: https://api.minimax.io/v1
```
### Available Models
- **MiniMax-M2.5**: High-performance general-purpose model with a 204K context window (recommended)
- **MiniMax-M2.5-highspeed**: High-speed inference variant with a 204K context window
### Notes
- **temperature parameter**: MiniMax requires temperature to be in the range (0.0, 1.0]; it cannot be set to 0. A value of 1.0 is recommended.
- Both models support a 204K long-context window, which suits long-document tasks.
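
The temperature constraint above is easy to violate, since many clients default to `0` for deterministic output. A minimal sketch of a pre-flight check follows; `validate_minimax_temperature` is a hypothetical helper, not part of SurfSense or the MiniMax SDK.

```python
def validate_minimax_temperature(temperature: float) -> float:
    """Return the temperature unchanged if it lies in (0.0, 1.0], else raise."""
    if not (0.0 < temperature <= 1.0):
        raise ValueError(
            f"MiniMax temperature must be in (0.0, 1.0], got {temperature}"
        )
    return temperature
```

Running this check before saving a configuration would surface the problem at setup time instead of as an API error on the first request.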
### Pricing
- See the [MiniMax pricing page](https://platform.minimaxi.com/document/Price) for current prices
---
## ⚙️ Advanced Configuration
### Custom Parameters
@ -268,8 +315,8 @@ docker compose logs backend | grep -i "error"
|---------|---------|------|
| **Document summarization** | Qwen-Plus, GLM-4 | Balances performance and cost |
| **Code analysis** | DeepSeek-Coder | Code-specialized |
| **Long-text processing** | Kimi 128K | Very long context |
| **Fast response** | Qwen-Turbo, GLM-4-Flash | Speed first |
| **Long-text processing** | Kimi 128K, MiniMax-M2.5 (204K) | Very long context |
| **Fast response** | Qwen-Turbo, GLM-4-Flash, MiniMax-M2.5-highspeed | Speed first |
### 2. Cost Optimization
@ -294,6 +341,7 @@ docker compose logs backend | grep -i "error"
- [Alibaba Cloud Model Studio (Bailian) documentation](https://help.aliyun.com/zh/model-studio/)
- [Moonshot AI documentation](https://platform.moonshot.cn/docs)
- [Zhipu AI documentation](https://open.bigmodel.cn/dev/api)
- [MiniMax documentation](https://platform.minimaxi.com/document/Guides)
### SurfSense Documentation

View file

@ -12,6 +12,11 @@ REDIS_APP_URL=redis://localhost:6379/0
# Optional: TTL in seconds for connector indexing lock key
# CONNECTOR_INDEXING_LOCK_TTL_SECONDS=28800
# Platform Web Search (SearXNG)
# Set this to enable built-in web search. Docker Compose sets it automatically.
# Only uncomment if running the backend outside Docker (e.g. uvicorn on host).
# SEARXNG_DEFAULT_HOST=http://localhost:8888
#Electric(for migrations only)
ELECTRIC_DB_USER=electric
ELECTRIC_DB_PASSWORD=electric_password

View file

@ -6,6 +6,7 @@ __pycache__/
.flashrank_cache
surf_new_backend.egg-info/
podcasts/
video_presentation_audio/
sandbox_files/
temp_audio/
celerybeat-schedule*

View file

@ -0,0 +1,23 @@
"""Add MINIMAX to LiteLLMProvider enum
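
Revision ID: 106
Revises: 105
"""

from collections.abc import Sequence

from alembic import op

revision: str = "106"
down_revision: str | None = "105"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None


def upgrade() -> None:
    # End the migration's open transaction first: older PostgreSQL versions
    # reject ALTER TYPE ... ADD VALUE inside a transaction block.
    op.execute("COMMIT")
    op.execute("ALTER TYPE litellmprovider ADD VALUE IF NOT EXISTS 'MINIMAX'")


def downgrade() -> None:
    # PostgreSQL cannot drop a single enum value, so this is a no-op.
    pass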

View file

@ -0,0 +1,85 @@
"""Add video_presentations table and video_presentation_status enum

Revision ID: 107
Revises: 106
"""

from collections.abc import Sequence

import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB

from alembic import op

revision: str = "107"
down_revision: str | None = "106"
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None

video_presentation_status_enum = sa.Enum(
    "pending",
    "generating",
    "ready",
    "failed",
    name="video_presentation_status",
)


def upgrade() -> None:
    video_presentation_status_enum.create(op.get_bind(), checkfirst=True)
    op.create_table(
        "video_presentations",
        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
        sa.Column("title", sa.String(length=500), nullable=False),
        sa.Column("slides", JSONB(), nullable=True),
        sa.Column("scene_codes", JSONB(), nullable=True),
        sa.Column(
            "status",
            video_presentation_status_enum,
            server_default="ready",
            nullable=False,
        ),
        sa.Column("search_space_id", sa.Integer(), nullable=False),
        sa.Column("thread_id", sa.Integer(), nullable=True),
        sa.Column(
            "created_at",
            sa.TIMESTAMP(timezone=True),
            server_default=sa.text("now()"),
            nullable=False,
        ),
        sa.ForeignKeyConstraint(
            ["search_space_id"],
            ["searchspaces.id"],
            ondelete="CASCADE",
        ),
        sa.ForeignKeyConstraint(
            ["thread_id"],
            ["new_chat_threads.id"],
            ondelete="SET NULL",
        ),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_index(
        "ix_video_presentations_status",
        "video_presentations",
        ["status"],
    )
    op.create_index(
        "ix_video_presentations_thread_id",
        "video_presentations",
        ["thread_id"],
    )
    op.create_index(
        "ix_video_presentations_created_at",
        "video_presentations",
        ["created_at"],
    )


def downgrade() -> None:
    op.drop_index("ix_video_presentations_created_at", table_name="video_presentations")
    op.drop_index("ix_video_presentations_thread_id", table_name="video_presentations")
    op.drop_index("ix_video_presentations_status", table_name="video_presentations")
    op.drop_table("video_presentations")
    video_presentation_status_enum.drop(op.get_bind(), checkfirst=True)

View file

@ -37,13 +37,15 @@ _perf_log = get_perf_logger()
# =============================================================================
# Maps SearchSourceConnectorType enum values to the searchable document/connector types
# used by the knowledge_base tool. Some connectors map to different document types.
# used by the knowledge_base and web_search tools.
# Live search connectors (TAVILY_API, LINKUP_API, BAIDU_SEARCH_API) are routed to
# the web_search tool; all others go to search_knowledge_base.
_CONNECTOR_TYPE_TO_SEARCHABLE: dict[str, str] = {
# Direct mappings (connector type == searchable type)
# Live search connectors (handled by web_search tool)
"TAVILY_API": "TAVILY_API",
"SEARXNG_API": "SEARXNG_API",
"LINKUP_API": "LINKUP_API",
"BAIDU_SEARCH_API": "BAIDU_SEARCH_API",
# Local/indexed connectors (handled by search_knowledge_base tool)
"SLACK_CONNECTOR": "SLACK_CONNECTOR",
"TEAMS_CONNECTOR": "TEAMS_CONNECTOR",
"NOTION_CONNECTOR": "NOTION_CONNECTOR",
@ -233,6 +235,7 @@ async def create_surfsense_deep_agent(
available_document_types = await connector_service.get_available_document_types(
search_space_id
)
except Exception as e:
logging.warning(f"Failed to discover available connectors/document types: {e}")
_perf_log.info(

View file

@ -59,6 +59,7 @@ PROVIDER_MAP = {
"DATABRICKS": "databricks",
"COMETAPI": "cometapi",
"HUGGINGFACE": "huggingface",
"MINIMAX": "openai",
"CUSTOM": "custom",
}
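
Mapping `MINIMAX` to `"openai"` suggests MiniMax is reached through LiteLLM's OpenAI-compatible route, with the MiniMax endpoint supplied as the API base. A minimal sketch of how such a map could become a `provider/model` string follows; `build_model_string` is a hypothetical helper, and the lowercase fallback for unmapped providers is an assumption rather than behavior confirmed by this diff.

```python
# Excerpt of the provider map shown above.
PROVIDER_MAP = {
    "HUGGINGFACE": "huggingface",
    "MINIMAX": "openai",
    "CUSTOM": "custom",
}


def build_model_string(provider: str, model_name: str) -> str:
    """Combine a provider enum value and a model name into 'prefix/model'."""
    # Assumption: unmapped providers fall back to their lowercase name.
    prefix = PROVIDER_MAP.get(provider, provider.lower())
    return f"{prefix}/{model_name}"
```

With this shape, a MiniMax configuration would resolve to `openai/MiniMax-M2.5`, and the request would be routed to MiniMax only because the configured API base points at `https://api.minimax.io/v1`.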

View file

@ -99,14 +99,8 @@ _TOOL_INSTRUCTIONS["search_knowledge_base"] = """
- IMPORTANT: When searching for information (meetings, schedules, notes, tasks, etc.), ALWAYS search broadly
across ALL sources first by omitting connectors_to_search. The user may store information in various places
including calendar apps, note-taking apps (Obsidian, Notion), chat apps (Slack, Discord), and more.
- IMPORTANT (REAL-TIME / PUBLIC WEB QUERIES): For questions that require current public web data
(e.g., live exchange rates, stock prices, breaking news, weather, current events), you MUST call
`search_knowledge_base` using live web connectors via `connectors_to_search`:
["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"].
- For these real-time/public web queries, DO NOT answer from memory and DO NOT say you lack internet
access before attempting a live connector search.
- If the live connectors return no relevant results, explain that live web sources did not return enough
data and ask the user if they want you to retry with a refined query.
- This tool searches ONLY local/indexed data (uploaded files, Notion, Slack, browser extension captures, etc.).
For real-time web search (current events, news, live data), use the `web_search` tool instead.
- FALLBACK BEHAVIOR: If the search returns no relevant results, you MAY then answer using your own
general knowledge, but clearly indicate that no matching information was found in the knowledge base.
- Only narrow to specific connectors if the user explicitly asks (e.g., "check my Slack" or "in my calendar").
@ -138,6 +132,17 @@ _TOOL_INSTRUCTIONS["generate_podcast"] = """
- After calling this tool, inform the user that podcast generation has started and they will see the player when it's ready (takes 3-5 minutes).
"""
_TOOL_INSTRUCTIONS["generate_video_presentation"] = """
- generate_video_presentation: Generate a video presentation from provided content.
- Use this when the user asks to create a video, presentation, slides, or slide deck.
- Trigger phrases: "give me a presentation", "create slides", "generate a video", "make a slide deck", "turn this into a presentation"
- Args:
- source_content: The text content to turn into a presentation. The more detailed, the better.
- video_title: Optional title (default: "SurfSense Presentation")
- user_prompt: Optional style instructions (e.g., "Make it technical and detailed")
- After calling this tool, inform the user that generation has started and they will see the presentation when it's ready.
"""
_TOOL_INSTRUCTIONS["generate_report"] = """
- generate_report: Generate or revise a structured Markdown report artifact.
- WHEN TO CALL THIS TOOL the message must contain a creation or modification VERB directed at producing a deliverable:
@ -271,6 +276,24 @@ _TOOL_INSTRUCTIONS["scrape_webpage"] = """
* Don't show every image - just the most relevant 1-3 images that enhance understanding.
"""
_TOOL_INSTRUCTIONS["web_search"] = """
- web_search: Search the web for real-time information using all configured search engines.
- Use this for current events, news, prices, weather, public facts, or any question requiring
up-to-date information from the internet.
- This tool dispatches to all configured search engines (SearXNG, Tavily, Linkup, Baidu) in
parallel and merges the results.
- IMPORTANT (REAL-TIME / PUBLIC WEB QUERIES): For questions that require current public web data
(e.g., live exchange rates, stock prices, breaking news, weather, current events), you MUST call
`web_search` instead of answering from memory.
- For these real-time/public web queries, DO NOT answer from memory and DO NOT say you lack internet
access before attempting a web search.
- If the search returns no relevant results, explain that web sources did not return enough
data and ask the user if they want you to retry with a refined query.
- Args:
- query: The search query - use specific, descriptive terms
- top_k: Number of results to retrieve (default: 10, max: 50)
"""
# Memory tool instructions have private and shared variants.
# We store them keyed as "save_memory" / "recall_memory" with sub-keys.
_MEMORY_TOOL_INSTRUCTIONS: dict[str, dict[str, str]] = {
@ -401,7 +424,7 @@ _TOOL_EXAMPLES["search_knowledge_base"] = """
- User: "Check my Obsidian notes for meeting notes"
- Call: `search_knowledge_base(query="meeting notes", connectors_to_search=["OBSIDIAN_CONNECTOR"])`
- User: "search me current usd to inr rate"
- Call: `search_knowledge_base(query="current USD to INR exchange rate", connectors_to_search=["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"])`
- Call: `web_search(query="current USD to INR exchange rate")`
- Then answer using the returned live web results with citations.
"""
@ -426,6 +449,16 @@ _TOOL_EXAMPLES["generate_podcast"] = """
- Then: `generate_podcast(source_content="Key insights about quantum computing from the knowledge base:\\n\\n[Comprehensive summary of all relevant search results with key facts, concepts, and findings]", podcast_title="Quantum Computing Explained")`
"""
_TOOL_EXAMPLES["generate_video_presentation"] = """
- User: "Give me a presentation about AI trends based on what we discussed"
- First search for relevant content, then call: `generate_video_presentation(source_content="Based on our conversation and search results: [detailed summary of chat + search findings]", video_title="AI Trends Presentation")`
- User: "Create slides summarizing this conversation"
- Call: `generate_video_presentation(source_content="Complete conversation summary:\\n\\nUser asked about [topic 1]:\\n[Your detailed response]\\n\\nUser then asked about [topic 2]:\\n[Your detailed response]\\n\\n[Continue for all exchanges in the conversation]", video_title="Conversation Summary")`
- User: "Make a video presentation about quantum computing"
- First search: `search_knowledge_base(query="quantum computing")`
- Then: `generate_video_presentation(source_content="Key insights about quantum computing from the knowledge base:\\n\\n[Comprehensive summary of all relevant search results with key facts, concepts, and findings]", video_title="Quantum Computing Explained")`
"""
_TOOL_EXAMPLES["generate_report"] = """
- User: "Generate a report about AI trends"
- Call: `generate_report(topic="AI Trends Report", source_strategy="kb_search", search_queries=["AI trends recent developments", "artificial intelligence industry trends", "AI market growth and predictions"], report_style="detailed")`
@ -471,11 +504,23 @@ _TOOL_EXAMPLES["generate_image"] = """
- Step 2: `display_image(src="<returned_url>", alt="Bean Dream coffee shop logo", title="Generated Image")`
"""
_TOOL_EXAMPLES["web_search"] = """
- User: "What's the current USD to INR exchange rate?"
- Call: `web_search(query="current USD to INR exchange rate")`
- Then answer using the returned web results with citations.
- User: "What's the latest news about AI?"
- Call: `web_search(query="latest AI news today")`
- User: "What's the weather in New York?"
- Call: `web_search(query="weather New York today")`
"""
# All tool names that have prompt instructions (order matters for prompt readability)
_ALL_TOOL_NAMES_ORDERED = [
"search_surfsense_docs",
"search_knowledge_base",
"web_search",
"generate_podcast",
"generate_video_presentation",
"generate_report",
"link_preview",
"display_image",
@ -543,7 +588,7 @@ DISABLED TOOLS (by user):
The following tools are available in SurfSense but have been disabled by the user for this session: {disabled_list}.
You do NOT have access to these tools and MUST NOT claim you can use them.
If the user asks about a capability provided by a disabled tool, let them know the relevant tool
is currently disabled and they can re-enable it from the tools menu (wrench icon) in the composer toolbar.
is currently disabled and they can re-enable it.
""")
parts.append("\n</tools>\n")
@ -595,11 +640,10 @@ The documents you receive are structured like this:
</document_content>
</document>
**Live web search results (URL chunk IDs):**
**Web search results (URL chunk IDs):**
<document>
<document_metadata>
<document_id>TAVILY_API::Some Title::https://example.com/article</document_id>
<document_type>TAVILY_API</document_type>
<document_type>WEB_SEARCH</document_type>
<title><![CDATA[Some web search result]]></title>
<url><![CDATA[https://example.com/article]]></url>
</document_metadata>

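The `web_search` instructions in this file say the tool dispatches to all configured engines in parallel and merges the results. The actual implementation (`web_search.py`) is not part of this excerpt, so the following is only a sketch of that pattern under stated assumptions: each engine is an async callable taking `(query, top_k)`, results are dicts with a `url` key, and duplicates are dropped by URL. The name `web_search_all` is hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Callable

SearchFn = Callable[[str, int], Awaitable[list[dict]]]


async def web_search_all(
    engines: dict[str, SearchFn], query: str, top_k: int = 10
) -> list[dict]:
    """Query every engine concurrently and merge results, deduplicating by URL."""
    top_k = max(1, min(top_k, 50))  # the prompt documents default 10, max 50
    batches = await asyncio.gather(
        *(fn(query, top_k) for fn in engines.values()),
        return_exceptions=True,
    )
    merged: list[dict] = []
    seen: set[str] = set()
    for batch in batches:
        if isinstance(batch, BaseException):
            continue  # one failing engine should not sink the whole search
        for hit in batch:
            url = hit.get("url", "")
            if url and url not in seen:
                seen.add(url)
                merged.append(hit)
    return merged[:top_k]
```

Using `return_exceptions=True` keeps a single unreachable engine from failing the whole call, which matches the per-connector error handling the old knowledge-base path had.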
View file

@ -8,6 +8,7 @@ Available tools:
- search_knowledge_base: Search the user's personal knowledge base
- search_surfsense_docs: Search Surfsense documentation for usage help
- generate_podcast: Generate audio podcasts from content
- generate_video_presentation: Generate video presentations with slides and narration
- generate_image: Generate images from text descriptions using AI models
- link_preview: Fetch rich previews for URLs
- display_image: Display images in chat
@ -39,6 +40,7 @@ from .registry import (
from .scrape_webpage import create_scrape_webpage_tool
from .search_surfsense_docs import create_search_surfsense_docs_tool
from .user_memory import create_recall_memory_tool, create_save_memory_tool
from .video_presentation import create_generate_video_presentation_tool
__all__ = [
# Registry
@ -51,6 +53,7 @@ __all__ = [
"create_display_image_tool",
"create_generate_image_tool",
"create_generate_podcast_tool",
"create_generate_video_presentation_tool",
"create_link_preview_tool",
"create_recall_memory_tool",
"create_save_memory_tool",

View file

@ -23,11 +23,10 @@ from app.db import shielded_async_session
from app.services.connector_service import ConnectorService
from app.utils.perf import get_perf_logger
# Connectors that call external live-search APIs (no local DB / embedding needed).
# These are never filtered by available_document_types.
# Connectors that call external live-search APIs. These are handled by the
# ``web_search`` tool and must be excluded from knowledge-base searches.
_LIVE_SEARCH_CONNECTORS: set[str] = {
"TAVILY_API",
"SEARXNG_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
}
@ -190,10 +189,6 @@ _ALL_CONNECTORS: list[str] = [
"GOOGLE_DRIVE_FILE",
"DISCORD_CONNECTOR",
"AIRTABLE_CONNECTOR",
"TAVILY_API",
"SEARXNG_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
"LUMA_CONNECTOR",
"NOTE",
"BOOKSTACK_CONNECTOR",
@ -227,10 +222,6 @@ CONNECTOR_DESCRIPTIONS: dict[str, str] = {
"GOOGLE_DRIVE_FILE": "Google Drive files and documents (personal cloud storage)",
"DISCORD_CONNECTOR": "Discord server conversations and shared content (personal community)",
"AIRTABLE_CONNECTOR": "Airtable records, tables, and database content (personal data)",
"TAVILY_API": "Tavily web search API results (real-time web search)",
"SEARXNG_API": "SearxNG search API results (privacy-focused web search)",
"LINKUP_API": "Linkup search API results (web search)",
"BAIDU_SEARCH_API": "Baidu search API results (Chinese web search)",
"LUMA_CONNECTOR": "Luma events and meetings",
"WEBCRAWLER_CONNECTOR": "Webpages indexed by SurfSense (personally selected websites)",
"CRAWLED_URL": "Webpages indexed by SurfSense (personally selected websites)",
@ -268,14 +259,15 @@ def _normalize_connectors(
valid_set = (
set(available_connectors) if available_connectors else set(_ALL_CONNECTORS)
)
valid_set -= _LIVE_SEARCH_CONNECTORS
if not connectors_to_search:
# Search all available connectors if none specified
return (
base = (
list(available_connectors)
if available_connectors
else list(_ALL_CONNECTORS)
)
return [c for c in base if c not in _LIVE_SEARCH_CONNECTORS]
normalized: list[str] = []
for raw in connectors_to_search:
@ -302,15 +294,14 @@ def _normalize_connectors(
out.append(c)
# Fallback to all available if nothing matched
return (
out
if out
else (
if not out:
base = (
list(available_connectors)
if available_connectors
else list(_ALL_CONNECTORS)
)
)
return [c for c in base if c not in _LIVE_SEARCH_CONNECTORS]
return out
# =============================================================================
@ -479,7 +470,6 @@ def format_documents_for_context(
# a numeric chunk_id (the numeric IDs are meaningless auto-incremented counters).
live_search_connectors = {
"TAVILY_API",
"SEARXNG_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
}
@ -623,13 +613,11 @@ async def search_knowledge_base_async(
connectors = _normalize_connectors(connectors_to_search, available_connectors)
# --- Optimization 1: skip local connectors that have zero indexed documents ---
# --- Optimization 1: skip connectors that have zero indexed documents ---
if available_document_types:
doc_types_set = set(available_document_types)
before_count = len(connectors)
connectors = [
c for c in connectors if c in _LIVE_SEARCH_CONNECTORS or c in doc_types_set
]
connectors = [c for c in connectors if c in doc_types_set]
skipped = before_count - len(connectors)
if skipped:
perf.info(
@ -664,9 +652,7 @@ async def search_knowledge_base_async(
"[kb_search] degenerate query %r detected - falling back to recency browse",
query,
)
local_connectors = [c for c in connectors if c not in _LIVE_SEARCH_CONNECTORS]
if not local_connectors:
local_connectors = [None] # type: ignore[list-item]
browse_connectors = connectors if connectors else [None] # type: ignore[list-item]
browse_results = await asyncio.gather(
*[
@ -677,7 +663,7 @@ async def search_knowledge_base_async(
start_date=resolved_start_date,
end_date=resolved_end_date,
)
for c in local_connectors
for c in browse_connectors
]
)
for docs in browse_results:
@ -702,66 +688,20 @@ async def search_knowledge_base_async(
)
return result
# Specs for live-search connectors (external APIs, no local DB/embedding).
live_connector_specs: dict[str, tuple[str, bool, bool, dict[str, Any]]] = {
"TAVILY_API": ("search_tavily", False, True, {}),
"SEARXNG_API": ("search_searxng", False, True, {}),
"LINKUP_API": ("search_linkup", False, False, {"mode": "standard"}),
"BAIDU_SEARCH_API": ("search_baidu", False, True, {}),
}
# --- Optimization 2: compute the query embedding once, share across all local searches ---
precomputed_embedding: list[float] | None = None
has_local_connectors = any(c not in _LIVE_SEARCH_CONNECTORS for c in connectors)
if has_local_connectors:
from app.config import config as app_config
from app.config import config as app_config
t_embed = time.perf_counter()
precomputed_embedding = app_config.embedding_model_instance.embed(query)
perf.info(
"[kb_search] shared embedding computed in %.3fs",
time.perf_counter() - t_embed,
)
t_embed = time.perf_counter()
precomputed_embedding = app_config.embedding_model_instance.embed(query)
perf.info(
"[kb_search] shared embedding computed in %.3fs",
time.perf_counter() - t_embed,
)
max_parallel_searches = 4
semaphore = asyncio.Semaphore(max_parallel_searches)
async def _search_one_connector(connector: str) -> list[dict[str, Any]]:
is_live = connector in _LIVE_SEARCH_CONNECTORS
if is_live:
spec = live_connector_specs.get(connector)
if spec is None:
return []
method_name, includes_date_range, includes_top_k, extra_kwargs = spec
kwargs: dict[str, Any] = {
"user_query": query,
"search_space_id": search_space_id,
**extra_kwargs,
}
if includes_top_k:
kwargs["top_k"] = top_k
if includes_date_range:
kwargs["start_date"] = resolved_start_date
kwargs["end_date"] = resolved_end_date
try:
t_conn = time.perf_counter()
async with semaphore, shielded_async_session() as isolated_session:
svc = ConnectorService(isolated_session, search_space_id)
_, chunks = await getattr(svc, method_name)(**kwargs)
perf.info(
"[kb_search] connector=%s results=%d in %.3fs",
connector,
len(chunks),
time.perf_counter() - t_conn,
)
return chunks
except Exception as e:
perf.warning("[kb_search] connector=%s FAILED: %s", connector, e)
return []
# --- Optimization 3: call _combined_rrf_search directly with shared embedding ---
try:
t_conn = time.perf_counter()
async with semaphore, shielded_async_session() as isolated_session:
@ -967,7 +907,9 @@ Focus searches on these types for best results."""
# This is what the LLM sees when deciding whether/how to use the tool
dynamic_description = f"""Search the user's personal knowledge base for relevant information.
Use this tool to find documents, notes, files, web pages, and other content that may help answer the user's question.
Use this tool to find documents, notes, files, web pages, and other content the user has indexed.
This searches ONLY local/indexed data (uploaded files, Notion, Slack, browser extension captures, etc.).
For real-time web search (current events, news, live data), use the `web_search` tool instead.
IMPORTANT:
- Always craft specific, descriptive search queries using natural language keywords.
@ -977,9 +919,6 @@ IMPORTANT:
- If the user requests a specific source type (e.g. "my notes", "Slack messages"), pass `connectors_to_search=[...]` using the enums below.
- If `connectors_to_search` is omitted/empty, the system will search broadly.
- Only connectors that are enabled/configured for this search space are available.{doc_types_info}
- For real-time/public web queries (e.g., current exchange rates, stock prices, breaking news, weather),
explicitly include live web connectors in `connectors_to_search`, prioritizing:
["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"].
## Available connector enums for `connectors_to_search`

View file

@ -4,60 +4,15 @@ Podcast generation tool for the SurfSense agent.
This module provides a factory function for creating the generate_podcast tool
that submits a Celery task for background podcast generation. The frontend
polls for completion and auto-updates when the podcast is ready.
Duplicate request prevention:
- Only one podcast can be generated at a time per search space
- Uses Redis to track active podcast tasks
- Returns a friendly message if a podcast is already being generated
"""
from typing import Any
import redis
from langchain_core.tools import tool
from sqlalchemy.ext.asyncio import AsyncSession
from app.config import config
from app.db import Podcast, PodcastStatus
# Redis connection for tracking active podcast tasks
# Defaults to the Celery broker when REDIS_APP_URL is not set
REDIS_URL = config.REDIS_APP_URL
_redis_client: redis.Redis | None = None
def get_redis_client() -> redis.Redis:
"""Get or create Redis client for podcast task tracking."""
global _redis_client
if _redis_client is None:
_redis_client = redis.from_url(REDIS_URL, decode_responses=True)
return _redis_client
def _redis_key(search_space_id: int) -> str:
return f"podcast:generating:{search_space_id}"
def get_generating_podcast_id(search_space_id: int) -> int | None:
"""Get the podcast ID currently being generated for this search space."""
try:
client = get_redis_client()
value = client.get(_redis_key(search_space_id))
return int(value) if value else None
except Exception:
return None
def set_generating_podcast(search_space_id: int, podcast_id: int) -> None:
"""Mark a podcast as currently generating for this search space."""
try:
client = get_redis_client()
client.setex(_redis_key(search_space_id), 1800, str(podcast_id))
except Exception as e:
print(
f"[generate_podcast] Warning: Could not set generating podcast in Redis: {e}"
)
def create_generate_podcast_tool(
search_space_id: int,
@ -109,18 +64,6 @@ def create_generate_podcast_tool(
- message: Status message (or "error" field if status is failed)
"""
try:
generating_podcast_id = get_generating_podcast_id(search_space_id)
if generating_podcast_id:
print(
f"[generate_podcast] Blocked duplicate request. Generating podcast: {generating_podcast_id}"
)
return {
"status": PodcastStatus.GENERATING.value,
"podcast_id": generating_podcast_id,
"title": podcast_title,
"message": "A podcast is already being generated. Please wait for it to complete.",
}
podcast = Podcast(
title=podcast_title,
status=PodcastStatus.PENDING,
@ -142,8 +85,6 @@ def create_generate_podcast_tool(
user_prompt=user_prompt,
)
set_generating_podcast(search_space_id, podcast.id)
print(f"[generate_podcast] Created podcast {podcast.id}, task: {task.id}")
return {

View file

@ -73,6 +73,8 @@ from .shared_memory import (
create_save_shared_memory_tool,
)
from .user_memory import create_recall_memory_tool, create_save_memory_tool
from .video_presentation import create_generate_video_presentation_tool
from .web_search import create_web_search_tool
# =============================================================================
# Tool Definition
@ -135,6 +137,17 @@ BUILTIN_TOOLS: list[ToolDefinition] = [
),
requires=["search_space_id", "db_session", "thread_id"],
),
# Video presentation generation tool
ToolDefinition(
name="generate_video_presentation",
description="Generate a video presentation with slides and narration from provided content",
factory=lambda deps: create_generate_video_presentation_tool(
search_space_id=deps["search_space_id"],
db_session=deps["db_session"],
thread_id=deps["thread_id"],
),
requires=["search_space_id", "db_session", "thread_id"],
),
# Report generation tool (inline, short-lived sessions for DB ops)
# Supports internal KB search via source_strategy so the agent doesn't
# need to call search_knowledge_base separately before generating.
@ -186,7 +199,16 @@ BUILTIN_TOOLS: list[ToolDefinition] = [
),
requires=[], # firecrawl_api_key is optional
),
# Note: write_todos is now provided by TodoListMiddleware from deepagents
# Web search tool — real-time web search via SearXNG + user-configured engines
ToolDefinition(
name="web_search",
description="Search the web for real-time information using configured search engines",
factory=lambda deps: create_web_search_tool(
search_space_id=deps.get("search_space_id"),
available_connectors=deps.get("available_connectors"),
),
requires=[],
),
# Surfsense documentation search tool
ToolDefinition(
name="search_surfsense_docs",


@ -0,0 +1,87 @@
"""
Video presentation generation tool for the SurfSense agent.
This module provides a factory function for creating the generate_video_presentation
tool that submits a Celery task for background video presentation generation.
The frontend polls for completion and auto-updates when the presentation is ready.
"""
from typing import Any
from langchain_core.tools import tool
from sqlalchemy.ext.asyncio import AsyncSession
from app.db import VideoPresentation, VideoPresentationStatus
def create_generate_video_presentation_tool(
search_space_id: int,
db_session: AsyncSession,
thread_id: int | None = None,
):
"""
Factory function to create the generate_video_presentation tool with injected dependencies.
Pre-creates video presentation record with pending status so the ID is available
immediately for frontend polling.
"""
@tool
async def generate_video_presentation(
source_content: str,
video_title: str = "SurfSense Presentation",
user_prompt: str | None = None,
) -> dict[str, Any]:
"""Generate a video presentation from the provided content.
Use this tool when the user asks to create a video, presentation, slides, or slide deck.
Args:
source_content: The text content to turn into a presentation.
video_title: Title for the presentation (default: "SurfSense Presentation")
user_prompt: Optional style/tone instructions.
"""
try:
video_pres = VideoPresentation(
title=video_title,
status=VideoPresentationStatus.PENDING,
search_space_id=search_space_id,
thread_id=thread_id,
)
db_session.add(video_pres)
await db_session.commit()
await db_session.refresh(video_pres)
from app.tasks.celery_tasks.video_presentation_tasks import (
generate_video_presentation_task,
)
task = generate_video_presentation_task.delay(
video_presentation_id=video_pres.id,
source_content=source_content,
search_space_id=search_space_id,
user_prompt=user_prompt,
)
print(
f"[generate_video_presentation] Created video presentation {video_pres.id}, task: {task.id}"
)
return {
"status": VideoPresentationStatus.PENDING.value,
"video_presentation_id": video_pres.id,
"title": video_title,
"message": "Video presentation generation started. This may take a few minutes.",
}
except Exception as e:
error_message = str(e)
print(f"[generate_video_presentation] Error: {error_message}")
return {
"status": VideoPresentationStatus.FAILED.value,
"error": error_message,
"title": video_title,
"video_presentation_id": None,
}
return generate_video_presentation


@ -0,0 +1,247 @@
"""
Web search tool for the SurfSense agent.
Provides a unified tool for real-time web searches that dispatches to all
configured search engines: the platform SearXNG instance (always available)
plus any user-configured live-search connectors (Tavily, Linkup, Baidu).
"""
import asyncio
import json
import time
from typing import Any
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field
from app.db import shielded_async_session
from app.services.connector_service import ConnectorService
from app.utils.perf import get_perf_logger
_LIVE_SEARCH_CONNECTORS: set[str] = {
"TAVILY_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
}
_LIVE_CONNECTOR_SPECS: dict[str, tuple[str, bool, bool, dict[str, Any]]] = {
"TAVILY_API": ("search_tavily", False, True, {}),
"LINKUP_API": ("search_linkup", False, False, {"mode": "standard"}),
"BAIDU_SEARCH_API": ("search_baidu", False, True, {}),
}
_CONNECTOR_LABELS: dict[str, str] = {
"TAVILY_API": "Tavily",
"LINKUP_API": "Linkup",
"BAIDU_SEARCH_API": "Baidu",
}
class WebSearchInput(BaseModel):
"""Input schema for the web_search tool."""
query: str = Field(
description="The search query to look up on the web. Use specific, descriptive terms.",
)
top_k: int = Field(
default=10,
description="Number of results to retrieve (default: 10, max: 50).",
)
def _format_web_results(
documents: list[dict[str, Any]],
*,
max_chars: int = 50_000,
) -> str:
"""Format web search results into XML suitable for the LLM context."""
if not documents:
return "No web search results found."
parts: list[str] = []
total_chars = 0
for doc in documents:
doc_info = doc.get("document") or {}
metadata = doc_info.get("metadata") or {}
title = doc_info.get("title") or "Web Result"
url = metadata.get("url") or ""
content = (doc.get("content") or "").strip()
source = metadata.get("document_type") or doc.get("source") or "WEB_SEARCH"
if not content:
continue
metadata_json = json.dumps(metadata, ensure_ascii=False)
doc_xml = "\n".join(
[
"<document>",
"<document_metadata>",
f" <document_type>{source}</document_type>",
f" <title><![CDATA[{title}]]></title>",
f" <url><![CDATA[{url}]]></url>",
f" <metadata_json><![CDATA[{metadata_json}]]></metadata_json>",
"</document_metadata>",
"<document_content>",
f" <chunk id='{url}'><![CDATA[{content}]]></chunk>",
"</document_content>",
"</document>",
"",
]
)
if total_chars + len(doc_xml) > max_chars:
parts.append("<!-- Output truncated to fit context window -->")
break
parts.append(doc_xml)
total_chars += len(doc_xml)
return "\n".join(parts).strip() or "No web search results found."
async def _search_live_connector(
connector: str,
query: str,
search_space_id: int,
top_k: int,
semaphore: asyncio.Semaphore,
) -> list[dict[str, Any]]:
"""Dispatch a single live-search connector (Tavily / Linkup / Baidu)."""
perf = get_perf_logger()
spec = _LIVE_CONNECTOR_SPECS.get(connector)
if spec is None:
return []
method_name, _includes_date_range, includes_top_k, extra_kwargs = spec
kwargs: dict[str, Any] = {
"user_query": query,
"search_space_id": search_space_id,
**extra_kwargs,
}
if includes_top_k:
kwargs["top_k"] = top_k
try:
t0 = time.perf_counter()
async with semaphore, shielded_async_session() as session:
svc = ConnectorService(session, search_space_id)
_, chunks = await getattr(svc, method_name)(**kwargs)
perf.info(
"[web_search] connector=%s results=%d in %.3fs",
connector,
len(chunks),
time.perf_counter() - t0,
)
return chunks
except Exception as e:
perf.warning("[web_search] connector=%s FAILED: %s", connector, e)
return []
def create_web_search_tool(
search_space_id: int | None = None,
available_connectors: list[str] | None = None,
) -> StructuredTool:
"""Factory for the ``web_search`` tool.
Dispatches in parallel to the platform SearXNG instance and any
user-configured live-search connectors (Tavily, Linkup, Baidu).
"""
active_live_connectors: list[str] = []
if available_connectors:
active_live_connectors = [
c for c in available_connectors if c in _LIVE_SEARCH_CONNECTORS
]
engine_names = ["SearXNG (platform default)"]
engine_names.extend(_CONNECTOR_LABELS.get(c, c) for c in active_live_connectors)
engines_summary = ", ".join(engine_names)
description = (
"Search the web for real-time information. "
"Use this for current events, news, prices, weather, public facts, or any "
"question that requires up-to-date information from the internet.\n\n"
f"Active search engines: {engines_summary}.\n"
"All configured engines are queried in parallel and results are merged."
)
_search_space_id = search_space_id
_active_live = active_live_connectors
async def _web_search_impl(query: str, top_k: int = 10) -> str:
from app.services import web_search_service
perf = get_perf_logger()
t0 = time.perf_counter()
clamped_top_k = min(max(1, top_k), 50)
semaphore = asyncio.Semaphore(4)
tasks: list[asyncio.Task[list[dict[str, Any]]]] = []
if web_search_service.is_available():
async def _searxng() -> list[dict[str, Any]]:
async with semaphore:
_result_obj, docs = await web_search_service.search(
query=query,
top_k=clamped_top_k,
)
return docs
tasks.append(asyncio.ensure_future(_searxng()))
if _search_space_id is not None:
for connector in _active_live:
tasks.append(
asyncio.ensure_future(
_search_live_connector(
connector=connector,
query=query,
search_space_id=_search_space_id,
top_k=clamped_top_k,
semaphore=semaphore,
)
)
)
if not tasks:
return "Web search is not available — no search engines are configured."
results_lists = await asyncio.gather(*tasks, return_exceptions=True)
all_documents: list[dict[str, Any]] = []
for result in results_lists:
if isinstance(result, BaseException):
perf.warning("[web_search] a search engine failed: %s", result)
continue
all_documents.extend(result)
seen_urls: set[str] = set()
deduplicated: list[dict[str, Any]] = []
for doc in all_documents:
url = ((doc.get("document") or {}).get("metadata") or {}).get("url", "")
if url and url in seen_urls:
continue
if url:
seen_urls.add(url)
deduplicated.append(doc)
formatted = _format_web_results(deduplicated)
perf.info(
"[web_search] query=%r engines=%d results=%d deduped=%d chars=%d in %.3fs",
query[:60],
len(tasks),
len(all_documents),
len(deduplicated),
len(formatted),
time.perf_counter() - t0,
)
return formatted
return StructuredTool(
name="web_search",
description=description,
coroutine=_web_search_impl,
args_schema=WebSearchInput,
)
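
The merge step of `_web_search_impl` can be tried in isolation. The following is a minimal sketch, not the project's code: `fake_engine`, `_doc`, and `merge_results` are hypothetical names, and the documents are hand-written stand-ins for what SearXNG or a live connector might return. It shows how `asyncio.gather(..., return_exceptions=True)` lets a failing engine be skipped, and how URL-based de-duplication keeps the first occurrence while passing URL-less documents through.

```python
import asyncio
from typing import Any


async def fake_engine(
    docs: list[dict[str, Any]], fail: bool = False
) -> list[dict[str, Any]]:
    """Hypothetical stand-in for one search engine call."""
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError("engine unavailable")
    return docs


def _doc(url: str, content: str) -> dict[str, Any]:
    """Build a document shaped like the ones the tool de-duplicates."""
    return {"document": {"metadata": {"url": url}}, "content": content}


async def merge_results() -> list[dict[str, Any]]:
    # One engine fails; gather returns the exception instead of raising it.
    results = await asyncio.gather(
        fake_engine([_doc("https://a.example", "A1"), _doc("", "no-url")]),
        fake_engine([_doc("https://a.example", "A2"), _doc("https://b.example", "B")]),
        fake_engine([], fail=True),
        return_exceptions=True,
    )
    merged: list[dict[str, Any]] = []
    for result in results:
        if isinstance(result, BaseException):
            continue  # skip the failed engine, keep the others
        merged.extend(result)

    # Keep the first document per URL; documents without a URL are always kept.
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for doc in merged:
        url = ((doc.get("document") or {}).get("metadata") or {}).get("url", "")
        if url and url in seen:
            continue
        if url:
            seen.add(url)
        unique.append(doc)
    return unique
```

Because `gather` returns results in the order the awaitables were passed, the engine listed first wins when two engines return the same URL.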


@ -0,0 +1,10 @@
"""Video Presentation LangGraph Agent.

This module defines a graph for generating video presentations
from source content, similar to the podcaster agent but producing
slide-based video presentations with TTS narration.
"""

from .graph import graph

__all__ = ["graph"]


@ -0,0 +1,25 @@
"""Define the configurable parameters for the video presentation agent."""

from __future__ import annotations

from dataclasses import dataclass, fields

from langchain_core.runnables import RunnableConfig


@dataclass(kw_only=True)
class Configuration:
    """The configuration for the video presentation agent."""

    video_title: str
    search_space_id: int
    user_prompt: str | None = None

    @classmethod
    def from_runnable_config(
        cls, config: RunnableConfig | None = None
    ) -> Configuration:
        """Create a Configuration instance from a RunnableConfig object."""
        configurable = (config.get("configurable") or {}) if config else {}
        _fields = {f.name for f in fields(cls) if f.init}
        return cls(**{k: v for k, v in configurable.items() if k in _fields})


@ -0,0 +1,39 @@
from langgraph.graph import StateGraph

from .configuration import Configuration
from .nodes import (
    assign_slide_themes,
    create_presentation_slides,
    create_slide_audio,
    generate_slide_scene_codes,
)
from .state import State


def build_graph():
    workflow = StateGraph(State, config_schema=Configuration)

    workflow.add_node("create_presentation_slides", create_presentation_slides)
    workflow.add_node("create_slide_audio", create_slide_audio)
    workflow.add_node("assign_slide_themes", assign_slide_themes)
    workflow.add_node("generate_slide_scene_codes", generate_slide_scene_codes)

    # Fan-out: after slides are parsed, run audio generation and theme
    # assignment in parallel (themes only need slide metadata, not audio).
    workflow.add_edge("__start__", "create_presentation_slides")
    workflow.add_edge("create_presentation_slides", "create_slide_audio")
    workflow.add_edge("create_presentation_slides", "assign_slide_themes")

    # Fan-in: scene code generation waits for both audio and themes.
    workflow.add_edge("create_slide_audio", "generate_slide_scene_codes")
    workflow.add_edge("assign_slide_themes", "generate_slide_scene_codes")
    workflow.add_edge("generate_slide_scene_codes", "__end__")

    graph = workflow.compile()
    graph.name = "Surfsense Video Presentation"
    return graph


graph = build_graph()
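
The fan-out/fan-in shape of this graph can be pictured with plain `asyncio`. The sketch below is only an analogy, not LangGraph code: every function name is hypothetical and the return values are placeholders. It shows the same ordering constraint, where audio and themes both start once slides exist and scene generation waits for both.

```python
import asyncio

order: list[str] = []  # records the order in which steps finish


async def make_slides() -> list[str]:
    order.append("slides")
    return ["intro", "summary"]


async def make_audio(slides: list[str]) -> dict[str, str]:
    await asyncio.sleep(0.01)  # pretend TTS takes a while
    order.append("audio")
    return {s: f"{s}.mp3" for s in slides}


async def pick_themes(slides: list[str]) -> dict[str, str]:
    order.append("themes")
    return {s: "OCEAN" for s in slides}


async def make_scenes(
    slides: list[str], audio: dict[str, str], themes: dict[str, str]
) -> list[tuple[str, str, str]]:
    order.append("scenes")
    return [(s, audio[s], themes[s]) for s in slides]


async def run_pipeline() -> list[tuple[str, str, str]]:
    slides = await make_slides()
    # Fan-out: both branches depend only on the slides.
    audio, themes = await asyncio.gather(make_audio(slides), pick_themes(slides))
    # Fan-in: scenes need the results of both branches.
    return await make_scenes(slides, audio, themes)
```

The two middle steps may finish in either order; only the first and last positions are fixed.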


@ -0,0 +1,580 @@
import asyncio
import contextlib
import json
import math
import os
import shutil
import uuid
from pathlib import Path
from typing import Any
from ffmpeg.asyncio import FFmpeg
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableConfig
from litellm import aspeech
from app.config import config as app_config
from app.services.kokoro_tts_service import get_kokoro_tts_service
from app.services.llm_service import get_agent_llm
from .configuration import Configuration
from .prompts import (
DEFAULT_DURATION_IN_FRAMES,
FPS,
REFINE_SCENE_SYSTEM_PROMPT,
REMOTION_SCENE_SYSTEM_PROMPT,
THEME_PRESETS,
build_scene_generation_user_prompt,
build_theme_assignment_user_prompt,
get_slide_generation_prompt,
get_theme_assignment_system_prompt,
pick_theme_and_mode_fallback,
)
from .state import (
PresentationSlides,
SlideAudioResult,
SlideContent,
SlideSceneCode,
State,
)
from .utils import get_voice_for_provider
MAX_REFINE_ATTEMPTS = 3
async def create_presentation_slides(
state: State, config: RunnableConfig
) -> dict[str, Any]:
"""Parse source content into structured presentation slides using LLM."""
configuration = Configuration.from_runnable_config(config)
search_space_id = configuration.search_space_id
user_prompt = configuration.user_prompt
llm = await get_agent_llm(state.db_session, search_space_id)
if not llm:
error_message = f"No LLM configured for search space {search_space_id}"
print(error_message)
raise RuntimeError(error_message)
prompt = get_slide_generation_prompt(user_prompt)
messages = [
SystemMessage(content=prompt),
HumanMessage(
content=f"<source_content>{state.source_content}</source_content>"
),
]
llm_response = await llm.ainvoke(messages)
try:
presentation = PresentationSlides.model_validate(
json.loads(llm_response.content)
)
except (json.JSONDecodeError, ValueError) as e:
print(f"Direct JSON parsing failed, trying fallback approach: {e!s}")
try:
content = llm_response.content
json_start = content.find("{")
json_end = content.rfind("}") + 1
if json_start >= 0 and json_end > json_start:
json_str = content[json_start:json_end]
parsed_data = json.loads(json_str)
presentation = PresentationSlides.model_validate(parsed_data)
print("Successfully parsed presentation slides using fallback approach")
else:
error_message = f"Could not find valid JSON in LLM response. Raw response: {content}"
print(error_message)
raise ValueError(error_message)
except (json.JSONDecodeError, ValueError) as e2:
error_message = f"Error parsing LLM response (fallback also failed): {e2!s}"
print(error_message)
print(f"Raw response: {llm_response.content}")
raise
return {"slides": presentation.slides}
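
The two-stage parse used above (strict `json.loads` first, then a slice from the first `{` to the last `}`) can be sketched as a standalone helper. `extract_json_object` is a hypothetical name introduced for illustration, and the sample reply is invented; the real node additionally validates the result with `PresentationSlides.model_validate`.

```python
import json
from typing import Any


def extract_json_object(text: str) -> Any:
    """Parse *text* as JSON, falling back to the outermost {...} slice."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass  # fall through to the slice-based attempt

    start = text.find("{")
    end = text.rfind("}") + 1
    if start < 0 or end <= start:
        raise ValueError("no JSON object found in response")
    # May still raise json.JSONDecodeError (a ValueError subclass).
    return json.loads(text[start:end])


reply = 'Here you go:\n```json\n{"slides": [{"slide_number": 1}]}\n```'
parsed = extract_json_object(reply)
```

The slice approach tolerates markdown fences and chatty preambles, but it assumes the reply contains a single top-level object; two separate objects in one reply would produce an invalid slice.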
async def create_slide_audio(state: State, config: RunnableConfig) -> dict[str, Any]:
"""Generate TTS audio for each slide.
Each slide's speaker_transcripts are generated as individual TTS chunks,
then concatenated with ffmpeg (matching the POC in RemotionTets/api/tts).
"""
session_id = str(uuid.uuid4())
temp_dir = Path("temp_audio")
temp_dir.mkdir(exist_ok=True)
output_dir = Path("video_presentation_audio")
output_dir.mkdir(exist_ok=True)
slides = state.slides or []
voice = get_voice_for_provider(app_config.TTS_SERVICE, speaker_id=0)
ext = "wav" if app_config.TTS_SERVICE == "local/kokoro" else "mp3"
async def _generate_tts_chunk(text: str, chunk_path: str) -> str:
"""Generate a single TTS chunk and write it to *chunk_path*."""
if app_config.TTS_SERVICE == "local/kokoro":
kokoro_service = await get_kokoro_tts_service(lang_code="a")
await kokoro_service.generate_speech(
text=text,
voice=voice,
speed=1.0,
output_path=chunk_path,
)
else:
kwargs: dict[str, Any] = {
"model": app_config.TTS_SERVICE,
"api_key": app_config.TTS_SERVICE_API_KEY,
"voice": voice,
"input": text,
"max_retries": 2,
"timeout": 600,
}
if app_config.TTS_SERVICE_API_BASE:
kwargs["api_base"] = app_config.TTS_SERVICE_API_BASE
response = await aspeech(**kwargs)
with open(chunk_path, "wb") as f:
f.write(response.content)
return chunk_path
async def _concat_with_ffmpeg(chunk_paths: list[str], output_file: str) -> None:
"""Concatenate multiple audio chunks into one file using async ffmpeg."""
ffmpeg = FFmpeg().option("y")
for chunk in chunk_paths:
ffmpeg = ffmpeg.input(chunk)
filter_parts = [f"[{i}:0]" for i in range(len(chunk_paths))]
filter_str = (
"".join(filter_parts) + f"concat=n={len(chunk_paths)}:v=0:a=1[outa]"
)
ffmpeg = ffmpeg.option("filter_complex", filter_str)
ffmpeg = ffmpeg.output(output_file, map="[outa]")
await ffmpeg.execute()
async def generate_audio_for_slide(slide: SlideContent) -> SlideAudioResult:
has_transcripts = (
slide.speaker_transcripts and len(slide.speaker_transcripts) > 0
)
if not has_transcripts:
print(
f"Slide {slide.slide_number}: no speaker_transcripts, "
f"using default duration ({DEFAULT_DURATION_IN_FRAMES} frames)"
)
return SlideAudioResult(
slide_number=slide.slide_number,
audio_file="",
duration_seconds=DEFAULT_DURATION_IN_FRAMES / FPS,
duration_in_frames=DEFAULT_DURATION_IN_FRAMES,
)
output_file = str(output_dir / f"{session_id}_slide_{slide.slide_number}.{ext}")
chunk_paths: list[str] = []
try:
chunk_paths = [
str(
temp_dir
/ f"{session_id}_slide_{slide.slide_number}_chunk_{i}.{ext}"
)
for i in range(len(slide.speaker_transcripts))
]
for i, text in enumerate(slide.speaker_transcripts):
print(
f" Slide {slide.slide_number} chunk {i + 1}/"
f"{len(slide.speaker_transcripts)}: "
f'"{text[:60]}..."'
)
await asyncio.gather(
*[
_generate_tts_chunk(text, path)
for text, path in zip(
slide.speaker_transcripts, chunk_paths, strict=False
)
]
)
if len(chunk_paths) == 1:
shutil.move(chunk_paths[0], output_file)
else:
print(
f" Concatenating {len(chunk_paths)} chunks for slide "
f"{slide.slide_number} with ffmpeg"
)
await _concat_with_ffmpeg(chunk_paths, output_file)
duration_seconds = await _get_audio_duration(output_file)
duration_in_frames = math.ceil(duration_seconds * FPS)
return SlideAudioResult(
slide_number=slide.slide_number,
audio_file=output_file,
duration_seconds=duration_seconds,
duration_in_frames=max(duration_in_frames, DEFAULT_DURATION_IN_FRAMES),
)
except Exception as e:
print(f"Error generating audio for slide {slide.slide_number}: {e!s}")
raise
finally:
for p in chunk_paths:
with contextlib.suppress(OSError):
os.remove(p)
tasks = [generate_audio_for_slide(slide) for slide in slides]
audio_results = await asyncio.gather(*tasks)
audio_results_sorted = sorted(audio_results, key=lambda r: r.slide_number)
print(
f"Generated audio for {len(audio_results_sorted)} slides "
f"(total duration: {sum(r.duration_seconds for r in audio_results_sorted):.1f}s)"
)
return {"slide_audio_results": audio_results_sorted}
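
The `filter_complex` string assembled in `_concat_with_ffmpeg` is easier to read when built on its own. In this minimal sketch `build_concat_filter` is a hypothetical helper, not part of the module. Each `[i:0]` label selects stream 0 of input `i`, and `concat=n=N:v=0:a=1` joins `N` segments into one audio stream labelled `[outa]`.

```python
def build_concat_filter(num_inputs: int) -> str:
    """Return an ffmpeg filter_complex string that concatenates audio inputs."""
    if num_inputs < 1:
        raise ValueError("need at least one input")
    inputs = "".join(f"[{i}:0]" for i in range(num_inputs))
    return f"{inputs}concat=n={num_inputs}:v=0:a=1[outa]"


three_chunk_filter = build_concat_filter(3)
```

The node skips ffmpeg entirely when a slide has a single chunk and simply moves that file into place.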
async def _get_audio_duration(file_path: str) -> float:
"""Get audio duration in seconds by running the ffprobe CLI as a subprocess.
Falls back to a file-size estimate if ffprobe fails, and to the default
slide duration if the file size cannot be read.
"""
try:
import subprocess
proc = await asyncio.create_subprocess_exec(
"ffprobe",
"-v",
"error",
"-show_entries",
"format=duration",
"-of",
"default=noprint_wrappers=1:nokey=1",
file_path,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10)
if proc.returncode == 0 and stdout.strip():
return float(stdout.strip())
except Exception as e:
print(f"ffprobe failed for {file_path}: {e!s}, using file-size estimation")
try:
file_size = os.path.getsize(file_path)
if file_path.endswith(".wav"):
return file_size / (16000 * 2)
else:
return file_size / 16000
except Exception:
return DEFAULT_DURATION_IN_FRAMES / FPS
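
The duration arithmetic spread across `create_slide_audio` and `_get_audio_duration` can be checked with a few numbers. This is a sketch under stated assumptions: `frames_for_audio` and `estimate_duration_from_size` are hypothetical helper names, and the constants are copied from the prompts module in this diff. The WAV divisor of `16000 * 2` bytes per second would correspond to 16 kHz, 16-bit mono PCM, so the estimate is rough for other encodings.

```python
import math

FPS = 30
DEFAULT_DURATION_IN_FRAMES = 300


def frames_for_audio(duration_seconds: float) -> int:
    """Round up to whole frames, never going below the default slide length."""
    return max(math.ceil(duration_seconds * FPS), DEFAULT_DURATION_IN_FRAMES)


def estimate_duration_from_size(file_size: int, is_wav: bool) -> float:
    """Rough fallback: assume a fixed byte rate for the audio file."""
    bytes_per_second = 16000 * 2 if is_wav else 16000
    return file_size / bytes_per_second
```

With these constants a slide is always at least 10 seconds long, so short narration is padded rather than cut.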
async def _assign_themes_with_llm(
llm, slides: list[SlideContent]
) -> dict[int, tuple[str, str]]:
"""Ask the LLM to assign a theme+mode to each slide in one call.
Returns a dict mapping slide_number -> (theme, mode).
Falls back to round-robin if the LLM response can't be parsed.
"""
total = len(slides)
slide_summaries = [
{
"slide_number": s.slide_number,
"title": s.title,
"subtitle": s.subtitle or "",
"background_explanation": s.background_explanation or "",
}
for s in slides
]
system = get_theme_assignment_system_prompt()
user = build_theme_assignment_user_prompt(slide_summaries)
try:
response = await llm.ainvoke(
[
SystemMessage(content=system),
HumanMessage(content=user),
]
)
text = response.content.strip()
if text.startswith("```"):
lines = text.split("\n")
text = "\n".join(
line for line in lines if not line.strip().startswith("```")
).strip()
assignments = json.loads(text)
valid_themes = set(THEME_PRESETS)
result: dict[int, tuple[str, str]] = {}
for entry in assignments:
sn = entry.get("slide_number")
theme = entry.get("theme", "").upper()
mode = entry.get("mode", "dark").lower()
if sn and theme in valid_themes and mode in ("dark", "light"):
result[sn] = (theme, mode)
if len(result) == total:
print(
"LLM theme assignment: "
+ ", ".join(f"S{sn}={t}/{m}" for sn, (t, m) in sorted(result.items()))
)
return result
print(
f"LLM returned {len(result)}/{total} valid assignments, "
"filling gaps with fallback"
)
for s in slides:
if s.slide_number not in result:
result[s.slide_number] = pick_theme_and_mode_fallback(
s.slide_number - 1, total
)
return result
except Exception as e:
print(f"LLM theme assignment failed ({e!s}), using fallback")
return {
s.slide_number: pick_theme_and_mode_fallback(s.slide_number - 1, total)
for s in slides
}
async def assign_slide_themes(state: State, config: RunnableConfig) -> dict[str, Any]:
"""Assign a theme preset + dark/light mode to every slide via a single LLM call.
Runs in parallel with audio generation since it only needs slide metadata.
"""
configuration = Configuration.from_runnable_config(config)
search_space_id = configuration.search_space_id
llm = await get_agent_llm(state.db_session, search_space_id)
if not llm:
raise RuntimeError(f"No LLM configured for search space {search_space_id}")
slides = state.slides or []
assignments = await _assign_themes_with_llm(llm, slides)
return {"slide_theme_assignments": assignments}
async def generate_slide_scene_codes(
state: State, config: RunnableConfig
) -> dict[str, Any]:
"""Generate Remotion component code for each slide using LLM.
Reads pre-assigned themes from state (produced by the parallel
assign_slide_themes node) and generates scene code concurrently.
"""
configuration = Configuration.from_runnable_config(config)
search_space_id = configuration.search_space_id
llm = await get_agent_llm(state.db_session, search_space_id)
if not llm:
raise RuntimeError(f"No LLM configured for search space {search_space_id}")
slides = state.slides or []
audio_results = state.slide_audio_results or []
audio_map: dict[int, SlideAudioResult] = {r.slide_number: r for r in audio_results}
total_slides = len(slides)
theme_assignments = state.slide_theme_assignments or {}
async def _generate_scene_for_slide(slide: SlideContent) -> SlideSceneCode:
audio = audio_map.get(slide.slide_number)
duration = audio.duration_in_frames if audio else DEFAULT_DURATION_IN_FRAMES
theme, mode = theme_assignments.get(
slide.slide_number,
pick_theme_and_mode_fallback(slide.slide_number - 1, total_slides),
)
user_prompt = build_scene_generation_user_prompt(
slide_number=slide.slide_number,
total_slides=total_slides,
title=slide.title,
subtitle=slide.subtitle,
content_in_markdown=slide.content_in_markdown,
background_explanation=slide.background_explanation,
duration_in_frames=duration,
theme=theme,
mode=mode,
)
messages = [
SystemMessage(content=REMOTION_SCENE_SYSTEM_PROMPT),
HumanMessage(content=user_prompt),
]
print(
f"Generating scene code for slide {slide.slide_number}/{total_slides}: "
f'"{slide.title}" ({duration} frames)'
)
llm_response = await llm.ainvoke(messages)
code, scene_title = _extract_code_and_title(llm_response.content)
code = await _refine_if_needed(llm, code, slide.slide_number)
print(f"Scene code ready for slide {slide.slide_number} ({len(code)} chars)")
return SlideSceneCode(
slide_number=slide.slide_number,
code=code,
title=scene_title or slide.title,
)
scene_codes = list(
await asyncio.gather(*[_generate_scene_for_slide(s) for s in slides])
)
return {"slide_scene_codes": scene_codes}
def _extract_code_and_title(content: str) -> tuple[str, str | None]:
"""Extract code and optional title from LLM response.
The LLM may return a JSON object like the POC's structured output:
{ "code": "...", "title": "..." }
Or it may return raw code (with optional markdown fences).
Returns (code, title) where title may be None.
"""
text = content.strip()
if text.startswith("{"):
try:
parsed = json.loads(text)
if isinstance(parsed, dict) and "code" in parsed:
return parsed["code"], parsed.get("title")
except (json.JSONDecodeError, ValueError):
pass
json_start = text.find("{")
json_end = text.rfind("}") + 1
if json_start >= 0 and json_end > json_start:
try:
parsed = json.loads(text[json_start:json_end])
if isinstance(parsed, dict) and "code" in parsed:
return parsed["code"], parsed.get("title")
except (json.JSONDecodeError, ValueError):
pass
code = text
if code.startswith("```"):
lines = code.split("\n")
start = 1
end = len(lines)
for i in range(len(lines) - 1, 0, -1):
if lines[i].strip().startswith("```"):
end = i
break
code = "\n".join(lines[start:end]).strip()
return code, None
async def _refine_if_needed(llm, code: str, slide_number: int) -> str:
"""Attempt basic syntax validation and auto-repair via LLM if needed.
Raises RuntimeError if the code is still invalid after MAX_REFINE_ATTEMPTS,
matching the POC's behavior where a failed slide aborts the pipeline.
"""
error = _basic_syntax_check(code)
if error is None:
return code
for attempt in range(1, MAX_REFINE_ATTEMPTS + 1):
print(
f"Slide {slide_number}: syntax issue (attempt {attempt}/{MAX_REFINE_ATTEMPTS}): {error}"
)
messages = [
SystemMessage(content=REFINE_SCENE_SYSTEM_PROMPT),
HumanMessage(
content=(
f"Here is the broken Remotion component code:\n\n{code}\n\n"
f"Compilation error:\n{error}\n\nFix the code."
)
),
]
response = await llm.ainvoke(messages)
code, _ = _extract_code_and_title(response.content)
error = _basic_syntax_check(code)
if error is None:
print(f"Slide {slide_number}: fixed on attempt {attempt}")
return code
raise RuntimeError(
f"Slide {slide_number} failed to compile after {MAX_REFINE_ATTEMPTS} "
f"refine attempts. Last error: {error}"
)
def _basic_syntax_check(code: str) -> str | None:
    """Run a lightweight syntax check on the generated code.

    Full Babel-based compilation happens on the frontend. This backend check
    catches the most common LLM code-generation mistakes so the refine loop
    can fix them before persisting.

    Returns an error description or None if the code looks valid.
    """
    if not code or not code.strip():
        return "Empty code"

    if "export" not in code and "MyComposition" not in code:
        return "Missing exported component (expected 'export const MyComposition')"

    brace_count = 0
    paren_count = 0
    bracket_count = 0
    for ch in code:
        if ch == "{":
            brace_count += 1
        elif ch == "}":
            brace_count -= 1
        elif ch == "(":
            paren_count += 1
        elif ch == ")":
            paren_count -= 1
        elif ch == "[":
            bracket_count += 1
        elif ch == "]":
            bracket_count -= 1

        if brace_count < 0:
            return "Unmatched closing brace '}'"
        if paren_count < 0:
            return "Unmatched closing parenthesis ')'"
        if bracket_count < 0:
            return "Unmatched closing bracket ']'"

    if brace_count != 0:
        return f"Unbalanced braces: {brace_count} unclosed"
    if paren_count != 0:
        return f"Unbalanced parentheses: {paren_count} unclosed"
    if bracket_count != 0:
        return f"Unbalanced brackets: {bracket_count} unclosed"

    if "useCurrentFrame" not in code:
        return "Missing useCurrentFrame() — required for Remotion animations"
    if "AbsoluteFill" not in code:
        return "Missing AbsoluteFill — required as the root layout component"

    return None


@ -0,0 +1,509 @@
import datetime
# TODO: move these to config file
MAX_SLIDES = 5
FPS = 30
DEFAULT_DURATION_IN_FRAMES = 300
THEME_PRESETS = [
"TERRA",
"OCEAN",
"SUNSET",
"EMERALD",
"ECLIPSE",
"ROSE",
"FROST",
"NEBULA",
"AURORA",
"CORAL",
"MIDNIGHT",
"AMBER",
"LAVENDER",
"STEEL",
"CITRUS",
"CHERRY",
]
THEME_DESCRIPTIONS: dict[str, str] = {
"TERRA": "Warm earthy tones — terracotta, olive. Heritage, tradition, organic warmth.",
"OCEAN": "Cool oceanic depth — teal, coral accents. Calm, marine, fluid elegance.",
"SUNSET": "Vibrant warm energy — orange, purple. Passion, creativity, bold expression.",
"EMERALD": "Fresh natural life — green, mint. Growth, health, sustainability.",
"ECLIPSE": "Dramatic luxury — black, gold. Premium, power, prestige.",
"ROSE": "Soft elegance — dusty pink, mauve. Beauty, care, refined femininity.",
"FROST": "Crisp clarity — ice blue, silver. Tech, data, precision analytics.",
"NEBULA": "Cosmic mystery — magenta, deep purple. AI, innovation, cutting-edge future.",
"AURORA": "Ethereal northern lights — green-teal, violet. Mystical, transformative, wonder.",
"CORAL": "Tropical warmth — coral, turquoise. Inviting, lively, community.",
"MIDNIGHT": "Deep sophistication — navy, silver. Contemplative, trust, authority.",
"AMBER": "Rich honey warmth — amber, brown. Comfort, wisdom, organic richness.",
"LAVENDER": "Gentle dreaminess — purple, lilac. Calm, imaginative, serene.",
"STEEL": "Industrial strength — gray, steel blue. Modern professional, reliability.",
"CITRUS": "Bright optimism — yellow, lime. Energy, joy, fresh starts.",
"CHERRY": "Bold impact — deep red, dark. Power, urgency, passionate conviction.",
}
# ---------------------------------------------------------------------------
# LLM-based theme assignment (replaces keyword-based pick_theme_and_mode)
# ---------------------------------------------------------------------------
THEME_ASSIGNMENT_SYSTEM_PROMPT = """You are a visual design director assigning color themes to presentation slides.
Given a list of slides, assign each slide a theme preset and color mode (dark or light).
Available themes (name: description):
{theme_list}
Rules:
1. Pick the theme that best matches each slide's mood, content, and visual direction.
2. Maximize visual variety: avoid repeating the same theme on consecutive slides.
3. Mix dark and light modes across the presentation for contrast and rhythm.
4. Opening slides often benefit from a bold dark theme; closing/summary slides can go either way.
5. The "background_explanation" field is the primary signal: it describes the intended mood and color direction.
Return ONLY a JSON array (no markdown fences, no explanation):
[
{{"slide_number": 1, "theme": "THEME_NAME", "mode": "dark"}},
{{"slide_number": 2, "theme": "THEME_NAME", "mode": "light"}}
]
""".strip()
def build_theme_assignment_user_prompt(
slides: list[dict[str, str]],
) -> str:
"""Build the user prompt for LLM theme assignment.
*slides* is a list of dicts with keys: slide_number, title, subtitle,
background_explanation (mood).
"""
lines = ["Assign a theme and mode to each of these slides:", ""]
for s in slides:
lines.append(
f'Slide {s["slide_number"]}: "{s["title"]}" '
f'(subtitle: "{s.get("subtitle", "")}") — '
f'Mood: "{s.get("background_explanation", "neutral")}"'
)
return "\n".join(lines)
def get_theme_assignment_system_prompt() -> str:
"""Return the theme assignment system prompt with the full theme list injected."""
theme_list = "\n".join(
f"- {name}: {desc}" for name, desc in THEME_DESCRIPTIONS.items()
)
return THEME_ASSIGNMENT_SYSTEM_PROMPT.format(theme_list=theme_list)
def pick_theme_and_mode_fallback(
slide_index: int, total_slides: int
) -> tuple[str, str]:
"""Simple round-robin fallback when LLM theme assignment fails."""
theme = THEME_PRESETS[slide_index % len(THEME_PRESETS)]
mode = "dark" if slide_index % 2 == 0 else "light"
if total_slides == 1:
mode = "dark"
return theme, mode
def get_slide_generation_prompt(user_prompt: str | None = None) -> str:
return f"""
Today's date: {datetime.datetime.now().strftime("%Y-%m-%d")}
<video_presentation_system>
You are a content-to-slides converter. You receive raw source content (articles, notes, transcripts,
product descriptions, chat conversations, etc.) and break it into a sequence of presentation slides
for a video presentation with voiceover narration.
{
f'''
You **MUST** strictly adhere to the following user instruction while generating the slides:
<user_instruction>
{user_prompt}
</user_instruction>
'''
if user_prompt
else ""
}
<input>
- '<source_content>': A block of text containing the information to be presented. This could be
research findings, an article summary, a detailed outline, user chat history, or any relevant
raw information. The content serves as the factual basis for the video presentation.
</input>
<output_format>
A JSON object containing the presentation slides:
{{
"slides": [
{{
"slide_number": 1,
"title": "Concise slide title",
"subtitle": "One-line subtitle or tagline",
"content_in_markdown": "## Heading\\n- Bullet point 1\\n- **Bold text**\\n- Bullet point 3",
"speaker_transcripts": [
"First narration sentence for this slide.",
"Second narration sentence expanding on the point.",
"Third sentence wrapping up this slide."
],
"background_explanation": "Emotional mood and color direction for this slide"
}}
]
}}
</output_format>
<guidelines>
=== SLIDE COUNT ===
Dynamically decide the number of slides between 1 and {MAX_SLIDES} (inclusive).
Base your decision entirely on the content's depth, richness, and how many distinct ideas it contains.
Thin or simple content should produce fewer slides; dense or multi-faceted content may use more.
Do NOT inflate or pad slides to reach {MAX_SLIDES}; only use what the content genuinely warrants.
Do NOT treat {MAX_SLIDES} as a target; it is a hard ceiling, not a goal.
=== SLIDE STRUCTURE ===
- Each slide should cover ONE distinct key idea or section.
- Keep slides focused: 2-5 bullet points of content per slide max.
- The first slide should be a title/intro slide.
- The last slide should be a summary or closing slide ONLY if there are 3+ slides.
For 1-2 slides, skip the closing slide; just cover the content.
- Do NOT create a separate closing slide if its content would just repeat earlier slides.
=== CONTENT FIELDS ===
- Write speaker_transcripts as if a human presenter is narrating: natural, conversational, 2-4 sentences per slide.
These will be converted to TTS audio, so write in a way that sounds great when spoken aloud.
- background_explanation should describe a visual style matching the slide's mood:
- Describe the emotional feel: "warm and organic", "dramatic and urgent", "clean and optimistic",
"technical and precise", "celebratory", "earthy and grounded", "cosmic and futuristic"
- Mention color direction: warm tones, cool tones, earth tones, neon accents, gold/black, etc.
- Vary the mood across slides; do NOT always say "dark blue gradient".
- content_in_markdown should use proper markdown: ## headings, **bold**, - bullets, etc.
=== NARRATION QUALITY ===
- Speaker transcripts should explain the slide content in an engaging, presenter-like voice.
- Keep narration concise: 2-4 sentences per slide (targeting ~10-15 seconds of audio per slide).
- The narration should add context beyond what's on the slide — don't just read the bullets.
- Use natural language: contractions, conversational tone, occasional enthusiasm.
</guidelines>
<examples>
Input: "Quantum computing uses quantum bits or qubits which can exist in multiple states simultaneously due to superposition."
Output:
{{
"slides": [
{{
"slide_number": 1,
"title": "Quantum Computing",
"subtitle": "Beyond Classical Bits",
"content_in_markdown": "## The Quantum Leap\\n- Classical computers use **bits** (0 or 1)\\n- Quantum computers use **qubits**\\n- Qubits leverage **superposition**",
"speaker_transcripts": [
"Let's explore quantum computing, a technology that's fundamentally different from the computers we use every day.",
"While traditional computers work with bits that are either zero or one, quantum computers use something called qubits.",
"The magic of qubits is superposition — they can exist in multiple states at the same time."
],
"background_explanation": "Cosmic and futuristic with deep purple and magenta tones, evoking the mystery of quantum mechanics"
}}
]
}}
</examples>
Transform the source material into well-structured presentation slides with engaging narration.
Ensure each slide has a clear visual mood and natural-sounding speaker transcripts.
</video_presentation_system>
"""
# ---------------------------------------------------------------------------
# Remotion scene code generation prompt
# Ported from RemotionTets POC /api/generate system prompt
# ---------------------------------------------------------------------------
REMOTION_SCENE_SYSTEM_PROMPT = """
You are a Remotion component generator that creates cinematic, modern motion graphics.
Generate a single self-contained React component that uses Remotion.
=== THEME PRESETS (pick ONE per slide; see user prompt for which to use) ===
Each slide MUST use a DIFFERENT preset. The user prompt will tell you which preset to use.
Use ALL colors from that preset: background, surface, text, accent, glow. Do NOT mix presets.
TERRA (warm earth - terracotta + olive):
dark: bg #1C1510 surface #261E16 border #3D3024 text #E8DDD0 muted #9A8A78 accent #C2623D secondary #7D8C52 glow rgba(194,98,61,0.12)
light: bg #F7F0E8 surface #FFF8F0 border #DDD0BF text #2C1D0E muted #8A7A68 accent #B85430 secondary #6B7A42 glow rgba(184,84,48,0.08)
gradient-dark: radial-gradient(ellipse at 30% 80%, rgba(194,98,61,0.18), transparent 60%), linear-gradient(180deg, #1C1510, #261E16)
gradient-light: radial-gradient(ellipse at 70% 20%, rgba(107,122,66,0.12), transparent 55%), linear-gradient(180deg, #F7F0E8, #FFF8F0)
OCEAN (cool depth - teal + coral):
dark: bg #0B1A1E surface #122428 border #1E3740 text #D5EAF0 muted #6A9AA8 accent #1DB6A8 secondary #E87461 glow rgba(29,182,168,0.12)
light: bg #F0F8FA surface #FFFFFF border #C8E0E8 text #0E2830 muted #5A8A98 accent #0EA69A secondary #D05F4E glow rgba(14,166,154,0.08)
gradient-dark: radial-gradient(ellipse at 80% 30%, rgba(29,182,168,0.20), transparent 55%), radial-gradient(circle at 20% 80%, rgba(232,116,97,0.10), transparent 50%), #0B1A1E
gradient-light: radial-gradient(ellipse at 20% 40%, rgba(14,166,154,0.10), transparent 55%), linear-gradient(180deg, #F0F8FA, #FFFFFF)
SUNSET (warm energy - orange + purple):
dark: bg #1E130F surface #2A1B14 border #42291C text #F0DDD0 muted #A08878 accent #E86A20 secondary #A855C0 glow rgba(232,106,32,0.12)
light: bg #FFF5ED surface #FFFFFF border #EADAC8 text #2E1508 muted #907860 accent #D05A18 secondary #9045A8 glow rgba(208,90,24,0.08)
gradient-dark: linear-gradient(135deg, rgba(232,106,32,0.15) 0%, transparent 40%), radial-gradient(circle at 80% 70%, rgba(168,85,192,0.15), transparent 50%), #1E130F
gradient-light: linear-gradient(135deg, rgba(208,90,24,0.08) 0%, rgba(144,69,168,0.06) 100%), #FFF5ED
EMERALD (fresh life - green + mint):
dark: bg #0B1E14 surface #12281A border #1E3C28 text #D0F0E0 muted #5EA880 accent #10B981 secondary #84CC16 glow rgba(16,185,129,0.12)
light: bg #F0FAF5 surface #FFFFFF border #C0E8D0 text #0E2C18 muted #489068 accent #059669 secondary #65A30D glow rgba(5,150,105,0.08)
gradient-dark: radial-gradient(ellipse at 50% 50%, rgba(16,185,129,0.18), transparent 60%), linear-gradient(180deg, #0B1E14, #12281A)
gradient-light: radial-gradient(ellipse at 60% 30%, rgba(101,163,13,0.10), transparent 55%), linear-gradient(180deg, #F0FAF5, #FFFFFF)
ECLIPSE (dramatic - black + gold):
dark: bg #100C05 surface #1A1508 border #2E2510 text #D4B96A muted #8A7840 accent #E8B830 secondary #C09020 glow rgba(232,184,48,0.14)
light: bg #FAF6ED surface #FFFFFF border #E0D8C0 text #1A1408 muted #7A6818 accent #C09820 secondary #A08018 glow rgba(192,152,32,0.08)
gradient-dark: radial-gradient(circle at 50% 40%, rgba(232,184,48,0.20), transparent 50%), radial-gradient(ellipse at 50% 90%, rgba(192,144,32,0.08), transparent 50%), #100C05
gradient-light: radial-gradient(circle at 50% 40%, rgba(192,152,32,0.10), transparent 55%), linear-gradient(180deg, #FAF6ED, #FFFFFF)
ROSE (soft elegance - dusty pink + mauve):
dark: bg #1E1018 surface #281820 border #3D2830 text #F0D8E0 muted #A08090 accent #E4508C secondary #B06498 glow rgba(228,80,140,0.12)
light: bg #FDF2F5 surface #FFFFFF border #F0D0D8 text #2C1018 muted #906878 accent #D43D78 secondary #9A5080 glow rgba(212,61,120,0.08)
gradient-dark: radial-gradient(ellipse at 70% 30%, rgba(228,80,140,0.18), transparent 55%), radial-gradient(circle at 20% 80%, rgba(176,100,152,0.10), transparent 50%), #1E1018
gradient-light: radial-gradient(ellipse at 30% 60%, rgba(212,61,120,0.08), transparent 55%), linear-gradient(180deg, #FDF2F5, #FFFFFF)
FROST (crisp clarity - ice blue + silver):
dark: bg #0A1520 surface #101D2A border #1A3040 text #D0E5F5 muted #6090B0 accent #5AB4E8 secondary #8BA8C0 glow rgba(90,180,232,0.12)
light: bg #F0F6FC surface #FFFFFF border #C8D8E8 text #0C1820 muted #5080A0 accent #3A96D0 secondary #7090A8 glow rgba(58,150,208,0.08)
gradient-dark: radial-gradient(ellipse at 40% 20%, rgba(90,180,232,0.16), transparent 55%), linear-gradient(180deg, #0A1520, #101D2A)
gradient-light: radial-gradient(ellipse at 50% 50%, rgba(58,150,208,0.08), transparent 55%), linear-gradient(180deg, #F0F6FC, #FFFFFF)
NEBULA (cosmic - magenta + deep purple):
dark: bg #150A1E surface #1E1028 border #351A48 text #E0D0F0 muted #8060A0 accent #C850E0 secondary #8030C0 glow rgba(200,80,224,0.14)
light: bg #F8F0FF surface #FFFFFF border #E0C8F0 text #1A0A24 muted #7050A0 accent #A840C0 secondary #6820A0 glow rgba(168,64,192,0.08)
gradient-dark: radial-gradient(circle at 60% 40%, rgba(200,80,224,0.18), transparent 50%), radial-gradient(ellipse at 30% 80%, rgba(128,48,192,0.12), transparent 50%), #150A1E
gradient-light: radial-gradient(circle at 40% 30%, rgba(168,64,192,0.10), transparent 55%), linear-gradient(180deg, #F8F0FF, #FFFFFF)
AURORA (ethereal lights - green-teal + violet):
dark: bg #0A1A1A surface #102020 border #1A3838 text #D0F0F0 muted #60A0A0 accent #30D0B0 secondary #8040D0 glow rgba(48,208,176,0.12)
light: bg #F0FAF8 surface #FFFFFF border #C0E8E0 text #0A2020 muted #508080 accent #20B090 secondary #6830B0 glow rgba(32,176,144,0.08)
gradient-dark: radial-gradient(ellipse at 30% 70%, rgba(48,208,176,0.18), transparent 55%), radial-gradient(circle at 70% 30%, rgba(128,64,208,0.12), transparent 50%), #0A1A1A
gradient-light: radial-gradient(ellipse at 50% 40%, rgba(32,176,144,0.10), transparent 55%), linear-gradient(180deg, #F0FAF8, #FFFFFF)
CORAL (tropical warmth - coral + turquoise):
dark: bg #1E0F0F surface #281818 border #402828 text #F0D8D8 muted #A07070 accent #F06050 secondary #30B8B0 glow rgba(240,96,80,0.12)
light: bg #FFF5F3 surface #FFFFFF border #F0D0C8 text #2E1010 muted #906060 accent #E04838 secondary #20A098 glow rgba(224,72,56,0.08)
gradient-dark: radial-gradient(ellipse at 60% 60%, rgba(240,96,80,0.18), transparent 55%), radial-gradient(circle at 30% 30%, rgba(48,184,176,0.10), transparent 50%), #1E0F0F
gradient-light: radial-gradient(ellipse at 40% 50%, rgba(224,72,56,0.08), transparent 55%), linear-gradient(180deg, #FFF5F3, #FFFFFF)
MIDNIGHT (deep sophistication - navy + silver):
dark: bg #080C18 surface #0E1420 border #1A2438 text #C8D8F0 muted #5070A0 accent #4080E0 secondary #A0B0D0 glow rgba(64,128,224,0.12)
light: bg #F0F2F8 surface #FFFFFF border #C8D0E0 text #101828 muted #506080 accent #3060C0 secondary #8090B0 glow rgba(48,96,192,0.08)
gradient-dark: radial-gradient(ellipse at 50% 30%, rgba(64,128,224,0.16), transparent 55%), linear-gradient(180deg, #080C18, #0E1420)
gradient-light: radial-gradient(ellipse at 50% 50%, rgba(48,96,192,0.08), transparent 55%), linear-gradient(180deg, #F0F2F8, #FFFFFF)
AMBER (rich honey warmth - amber + brown):
dark: bg #1A1208 surface #221A0E border #3A2C18 text #F0E0C0 muted #A09060 accent #E0A020 secondary #C08030 glow rgba(224,160,32,0.12)
light: bg #FFF8E8 surface #FFFFFF border #E8D8B8 text #2A1C08 muted #907840 accent #C88810 secondary #A86820 glow rgba(200,136,16,0.08)
gradient-dark: radial-gradient(ellipse at 40% 60%, rgba(224,160,32,0.18), transparent 55%), linear-gradient(180deg, #1A1208, #221A0E)
gradient-light: radial-gradient(ellipse at 60% 40%, rgba(200,136,16,0.10), transparent 55%), linear-gradient(180deg, #FFF8E8, #FFFFFF)
LAVENDER (gentle dreaminess - purple + lilac):
dark: bg #14101E surface #1C1628 border #302840 text #E0D8F0 muted #8070A0 accent #A060E0 secondary #C090D0 glow rgba(160,96,224,0.12)
light: bg #F8F0FF surface #FFFFFF border #E0D0F0 text #1C1028 muted #706090 accent #8848C0 secondary #A878B8 glow rgba(136,72,192,0.08)
gradient-dark: radial-gradient(ellipse at 60% 40%, rgba(160,96,224,0.18), transparent 55%), radial-gradient(circle at 30% 70%, rgba(192,144,208,0.10), transparent 50%), #14101E
gradient-light: radial-gradient(ellipse at 40% 30%, rgba(136,72,192,0.10), transparent 55%), linear-gradient(180deg, #F8F0FF, #FFFFFF)
STEEL (industrial strength - gray + steel blue):
dark: bg #101214 surface #181C20 border #282E38 text #D0D8E0 muted #708090 accent #5088B0 secondary #90A0B0 glow rgba(80,136,176,0.12)
light: bg #F2F4F6 surface #FFFFFF border #D0D8E0 text #181C24 muted #607080 accent #3870A0 secondary #708898 glow rgba(56,112,160,0.08)
gradient-dark: radial-gradient(ellipse at 50% 50%, rgba(80,136,176,0.14), transparent 55%), linear-gradient(180deg, #101214, #181C20)
gradient-light: radial-gradient(ellipse at 50% 40%, rgba(56,112,160,0.08), transparent 55%), linear-gradient(180deg, #F2F4F6, #FFFFFF)
CITRUS (bright optimism - yellow + lime):
dark: bg #181808 surface #202010 border #383818 text #F0F0C0 muted #A0A060 accent #E8D020 secondary #90D030 glow rgba(232,208,32,0.12)
light: bg #FFFFF0 surface #FFFFFF border #E8E8C0 text #282808 muted #808040 accent #C8B010 secondary #70B020 glow rgba(200,176,16,0.08)
gradient-dark: radial-gradient(ellipse at 40% 40%, rgba(232,208,32,0.18), transparent 55%), radial-gradient(circle at 70% 70%, rgba(144,208,48,0.10), transparent 50%), #181808
gradient-light: radial-gradient(ellipse at 50% 30%, rgba(200,176,16,0.10), transparent 55%), linear-gradient(180deg, #FFFFF0, #FFFFFF)
CHERRY (bold impact - deep red + dark):
dark: bg #1A0808 surface #241010 border #401818 text #F0D0D0 muted #A06060 accent #D02030 secondary #E05060 glow rgba(208,32,48,0.14)
light: bg #FFF0F0 surface #FFFFFF border #F0C8C8 text #280808 muted #904848 accent #B01828 secondary #C83848 glow rgba(176,24,40,0.08)
gradient-dark: radial-gradient(ellipse at 50% 40%, rgba(208,32,48,0.20), transparent 50%), linear-gradient(180deg, #1A0808, #241010)
gradient-light: radial-gradient(ellipse at 50% 50%, rgba(176,24,40,0.10), transparent 55%), linear-gradient(180deg, #FFF0F0, #FFFFFF)
=== SHARED TOKENS (use with any theme above) ===
SPACING: xs 8px, sm 16px, md 24px, lg 32px, xl 48px, 2xl 64px, 3xl 96px, 4xl 128px
TYPOGRAPHY: fontFamily "Inter, system-ui, -apple-system, sans-serif"
caption 14px/1.4, body 18px/1.6, subhead 24px/1.4, title 40px/1.2 w600, headline 64px/1.1 w700, display 96px/1.0 w800
letterSpacing: tight "-0.02em", normal "0", wide "0.05em"
BORDER RADIUS: 12px (cards), 8px (buttons), 9999px (pills)
=== VISUAL VARIETY (CRITICAL) ===
The user prompt assigns each slide a specific theme preset AND mode (dark/light).
You MUST use EXACTLY the assigned preset and mode. Additionally:
1. Use the preset's gradient as the AbsoluteFill background.
2. Use the preset's accent/secondary colors for highlights, pill badges, and card accents.
3. Use the preset's glow value for all boxShadow effects.
4. LAYOUT VARIATION: Vary layout between slides:
- One slide: bold centered headline + subtle stat
- Another: two-column card layout
- Another: single large number or quote as hero
Do NOT use the same layout pattern for every slide.
=== LAYOUT RULES (CRITICAL: elements must NEVER overlap) ===
The canvas is 1920x1080. You MUST use a SINGLE-LAYER layout. NO stacking, NO multiple AbsoluteFill layers.
STRUCTURE: every component must follow this exact pattern:
<AbsoluteFill style={{ backgroundColor: "...", display: "flex", flexDirection: "column", justifyContent: "center", alignItems: "center", padding: 80 }}>
{/* ALL content goes here as direct children in normal flow */}
</AbsoluteFill>
ABSOLUTE RULES:
- Use exactly ONE AbsoluteFill as the root. Set its background color/gradient via its style prop.
- NEVER nest AbsoluteFill inside AbsoluteFill.
- NEVER use position "absolute" or position "fixed" on ANY element.
- NEVER use multiple layers or z-index.
- ALL elements must be in normal document flow inside the single root AbsoluteFill.
SPACING:
- Root padding: 80px on all sides (safe area).
- Use flexDirection "column" with gap for vertical stacking, flexDirection "row" with gap for horizontal.
- Minimum gap between elements: 24px vertical, 32px horizontal.
- Text hierarchy gaps: headline -> subheading 16px, subheading -> body 12px, body -> button 32px.
- Cards/panels: padding 32px-48px, borderRadius 12px.
- NEVER use margin to space siblings; always use the parent's gap property.
=== DESIGN STYLE ===
- Premium aesthetic: use the exact colors from the assigned theme preset (do NOT invent your own)
- Background: use the preset's gradient-dark or gradient-light value directly as the AbsoluteFill's background
- Card/surface backgrounds: use the preset's surface color
- Text colors: use the preset's text, muted values
- Borders: use the preset's border color
- Glows: use the preset's glow value for all boxShadow — do NOT substitute other colors
- Generous whitespace: less is more, let elements breathe
- NO decorative background shapes, blurs, or overlapping ornaments
=== REMOTION RULES ===
- Export the component as: export const MyComposition = () => { ... }
- Use useCurrentFrame() and useVideoConfig() from "remotion"
- Do NOT use Sequence
- Do NOT manually calculate animation timings or frame offsets
=== ANIMATION (use the stagger() helper for ALL element animations) ===
A pre-built helper function called stagger() is available globally.
It handles enter, hold, and exit phases automatically; you MUST use it.
Signature:
stagger(frame, fps, index, total) -> { opacity: number, transform: string }
Parameters:
frame - from useCurrentFrame()
fps - from useVideoConfig()
index - 0-based index of this element in the entrance order
total - total number of animated elements in the scene
It returns a style object with opacity and transform that you spread onto the element.
Timing is handled for you: staggered spring entrances, ambient hold motion, and a graceful exit.
Usage pattern:
const frame = useCurrentFrame();
const { fps } = useVideoConfig();
<div style={stagger(frame, fps, 0, 4)}>Headline</div>
<div style={stagger(frame, fps, 1, 4)}>Subtitle</div>
<div style={stagger(frame, fps, 2, 4)}>Card</div>
<div style={stagger(frame, fps, 3, 4)}>Footer</div>
Rules:
- Count ALL animated elements in your scene and pass that count as the "total" parameter.
- Assign each element a sequential index starting from 0.
- You can merge stagger's return with additional styles:
<div style={{ ...stagger(frame, fps, 0, 3), fontSize: 64, color: "#fafafa" }}>
- For non-animated static elements (backgrounds, borders), just use normal styles without stagger.
- You may still use spring() and interpolate() for EXTRA custom effects (e.g., a number counter,
color shift, or typewriter effect), but stagger() must drive all entrance/exit animations.
=== AVAILABLE GLOBALS (injected at runtime, do NOT import anything else) ===
- React (available globally)
- AbsoluteFill, useCurrentFrame, useVideoConfig, spring, interpolate, Easing from "remotion"
- stagger(frame, fps, index, total): animation helper described above
=== CODE RULES ===
- Output ONLY the raw code, no markdown fences, no explanations
- Keep it fully self-contained, no external dependencies or images
- Use inline styles only (no CSS imports, no className)
- Target 1920x1080 resolution
- Every container must use display "flex" with explicit gap values
- NEVER use marginTop/marginBottom to space siblings; use the parent's gap instead
""".strip()
def build_scene_generation_user_prompt(
slide_number: int,
total_slides: int,
title: str,
subtitle: str,
content_in_markdown: str,
background_explanation: str,
duration_in_frames: int,
theme: str,
mode: str,
) -> str:
"""Build the user prompt for generating a single slide's Remotion scene code.
*theme* and *mode* are pre-assigned (by LLM or fallback) before this is called.
"""
return "\n".join(
[
"Create a cinematic, visually striking Remotion scene.",
f"The video is {duration_in_frames} frames at {FPS}fps ({duration_in_frames / FPS:.1f}s total).",
"",
f"This is slide {slide_number} of {total_slides} in the video.",
"",
f"=== ASSIGNED THEME: {theme} / {mode.upper()} mode ===",
f"You MUST use the {theme} preset in {mode} mode from the theme presets above.",
f"Use its exact background gradient (gradient-{mode}), surface, text, accent, secondary, border, and glow colors.",
"Do NOT substitute, invent, or default to blue/violet colors.",
"",
f'The scene should communicate this message: "{title} - {subtitle}"',
"",
"Key ideas to convey (use as creative inspiration, NOT literal text to dump on screen):",
content_in_markdown,
"",
"Pick only the 1-2 most impactful phrases or numbers to display as text.",
"",
f"Mood & tone: {background_explanation}",
]
)
REFINE_SCENE_SYSTEM_PROMPT = """
You are a code repair assistant. You will receive a Remotion React component that failed to compile,
along with the exact error message from the Babel transpiler.
Your job is to fix the code so it compiles and runs correctly.
RULES:
- Output ONLY the fixed raw code as a string: no markdown fences, no explanations.
- Preserve the original intent, design, and animations as closely as possible.
- The component must be exported as: export const MyComposition = () => { ... }
- Only these globals are available at runtime (they are injected, not actually imported):
React, AbsoluteFill, useCurrentFrame, useVideoConfig, spring, interpolate, Easing,
stagger (a helper: stagger(frame, fps, index, total) -> { opacity, transform })
- Keep import statements at the top (they get stripped by the compiler) but do NOT import anything
other than "react" and "remotion".
- Use inline styles only (no CSS, no className).
- Common fixes:
- Mismatched braces/brackets in JSX style objects (e.g. }}, instead of }}>)
- Missing closing tags
- Trailing commas before > in JSX
- Undefined variables or typos
- Invalid JSX expressions
- After fixing, mentally walk through every brace pair { } and JSX tag to verify they match.
""".strip()
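The theme-assignment prompt in this file asks the model for a bare JSON array, and `pick_theme_and_mode_fallback` covers the failure case. The sketch below shows one way the two could be combined: parse the reply, keep only valid entries, and fill every other slide from the round-robin fallback. The names `parse_theme_assignments` and `fallback` are hypothetical, and `THEME_PRESETS` here is a shortened stand-in for the real list, so treat this as an illustration rather than the code the agent actually runs.

```python
import json

# Hypothetical, shortened stand-in for the module's real THEME_PRESETS list.
THEME_PRESETS = ["TERRA", "OCEAN", "SUNSET", "EMERALD"]


def fallback(slide_index: int, total_slides: int) -> tuple[str, str]:
    """Round-robin choice, mirroring pick_theme_and_mode_fallback."""
    theme = THEME_PRESETS[slide_index % len(THEME_PRESETS)]
    mode = "dark" if slide_index % 2 == 0 or total_slides == 1 else "light"
    return theme, mode


def parse_theme_assignments(raw: str, total_slides: int) -> dict[int, tuple[str, str]]:
    """Parse the LLM reply; fill missing or invalid slides from the fallback."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        items = []
    if not isinstance(items, list):
        items = []

    assignments: dict[int, tuple[str, str]] = {}
    for item in items:
        if not isinstance(item, dict):
            continue
        number = item.get("slide_number")
        theme = str(item.get("theme", "")).upper()
        mode = str(item.get("mode", "")).lower()
        # Keep an entry only if every field is usable.
        if isinstance(number, int) and theme in THEME_PRESETS and mode in ("dark", "light"):
            assignments[number] = (theme, mode)

    # Slide numbers are 1-based; the fallback takes a 0-based index.
    for number in range(1, total_slides + 1):
        assignments.setdefault(number, fallback(number - 1, total_slides))
    return assignments
```

The return type matches the `dict[int, tuple[str, str]]` shape used for `slide_theme_assignments` in the agent state, so a malformed or partial reply still yields one theme per slide.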


@ -0,0 +1,73 @@
"""Define the state structures for the video presentation agent."""
from __future__ import annotations
from dataclasses import dataclass
from pydantic import BaseModel, Field
from sqlalchemy.ext.asyncio import AsyncSession
class SlideContent(BaseModel):
"""Represents a single parsed slide from content analysis."""
slide_number: int = Field(..., description="1-based slide number")
title: str = Field(..., description="Concise slide title")
subtitle: str = Field(..., description="One-line subtitle or tagline")
content_in_markdown: str = Field(
..., description="Slide body content formatted as markdown"
)
speaker_transcripts: list[str] = Field(
...,
description="2-4 short sentences a presenter would say while this slide is shown",
)
background_explanation: str = Field(
...,
description="Emotional mood and color direction for this slide",
)
class PresentationSlides(BaseModel):
"""Represents the full set of parsed slides from the LLM."""
slides: list[SlideContent] = Field(
..., description="Ordered array of presentation slides"
)
class SlideAudioResult(BaseModel):
"""Audio generation result for a single slide."""
slide_number: int
audio_file: str = Field(..., description="Path to the per-slide audio file")
duration_seconds: float = Field(..., description="Audio duration in seconds")
duration_in_frames: int = Field(
..., description="Audio duration in frames (at 30fps)"
)
class SlideSceneCode(BaseModel):
"""Generated Remotion component code for a single slide."""
slide_number: int
code: str = Field(
..., description="Raw Remotion React component source code for this slide"
)
title: str = Field(..., description="Short title for the composition")
@dataclass
class State:
"""State for the video presentation agent graph.
Pipeline: parse slides -> (TTS audio + theme assignment) -> generate Remotion code
The frontend receives the slides + code + audio and handles compilation/rendering.
"""
db_session: AsyncSession
source_content: str
slides: list[SlideContent] | None = None
slide_audio_results: list[SlideAudioResult] | None = None
slide_theme_assignments: dict[int, tuple[str, str]] | None = None
slide_scene_codes: list[SlideSceneCode] | None = None
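`SlideAudioResult` stores each clip's length both in seconds and in frames at 30fps. A minimal sketch of that conversion follows, assuming the frame count is rounded up so a scene never ends before its narration. The helper name `seconds_to_frames` and the rounding policy are assumptions, not taken from the source.

```python
import math

FPS = 30  # assumed to match the "at 30fps" note on duration_in_frames


def seconds_to_frames(duration_seconds: float, fps: int = FPS) -> int:
    """Convert an audio duration to whole frames, rounding up, minimum one frame."""
    return max(1, math.ceil(duration_seconds * fps))
```

Rounding up costs at most one extra frame per slide, while rounding down could clip the last syllable of the narration.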


@ -0,0 +1,30 @@
def get_voice_for_provider(provider: str, speaker_id: int = 0) -> dict | str:
"""
Get the appropriate voice configuration based on the TTS provider.
Currently single-speaker only (speaker_id=0). Multi-speaker support
will be added in a future iteration.
Args:
provider: The TTS provider (e.g., "openai/tts-1", "vertex_ai/test")
speaker_id: The ID of the speaker (default 0, single speaker for now)
Returns:
Voice configuration - string for OpenAI, dict for Vertex AI
"""
if provider == "local/kokoro":
return "af_heart"
provider_type = (
provider.split("/")[0].lower() if "/" in provider else provider.lower()
)
voices = {
"openai": "alloy",
"vertex_ai": {
"languageCode": "en-US",
"name": "en-US-Studio-O",
},
"azure": "alloy",
}
return voices.get(provider_type, {})


@ -340,20 +340,17 @@ if config.NEXT_FRONTEND_URL:
if www_url not in allowed_origins:
allowed_origins.append(www_url)
# For local development, also allow common localhost origins
if not config.BACKEND_URL or (
config.NEXT_FRONTEND_URL and "localhost" in config.NEXT_FRONTEND_URL
):
allowed_origins.extend(
[
"http://localhost:3000",
"http://127.0.0.1:3000",
]
)
allowed_origins.extend(
[ # For local development and desktop app
"http://localhost:3000",
"http://127.0.0.1:3000",
]
)
app.add_middleware(
CORSMiddleware,
allow_origins=allowed_origins,
allow_origin_regex=r"^https?://(localhost|127\.0\.0\.1)(:\d+)?$",
allow_credentials=True,
allow_methods=["*"], # Allows all methods
allow_headers=["*"], # Allows all headers
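The `allow_origin_regex` added in this hunk is meant to admit any localhost or 127.0.0.1 origin regardless of port. The sketch below tests the same pattern directly with Python's `re` module. It only approximates the middleware's behavior, and the helper name `is_local_origin` is hypothetical.

```python
import re

# Same pattern as the allow_origin_regex argument in the hunk above.
ALLOW_ORIGIN_REGEX = r"^https?://(localhost|127\.0\.0\.1)(:\d+)?$"
_pattern = re.compile(ALLOW_ORIGIN_REGEX)


def is_local_origin(origin: str) -> bool:
    """Return True when the whole origin string matches the local-origin pattern."""
    return _pattern.fullmatch(origin) is not None
```

Because the pattern is anchored at both ends, a host such as `localhost.evil.com` is rejected even though it starts with `localhost`.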


@ -77,6 +77,7 @@ celery_app = Celery(
include=[
"app.tasks.celery_tasks.document_tasks",
"app.tasks.celery_tasks.podcast_tasks",
"app.tasks.celery_tasks.video_presentation_tasks",
"app.tasks.celery_tasks.connector_tasks",
"app.tasks.celery_tasks.schedule_checker_task",
"app.tasks.celery_tasks.document_reindex_tasks",


@ -224,6 +224,9 @@ class Config:
os.getenv("CONNECTOR_INDEXING_LOCK_TTL_SECONDS", str(8 * 60 * 60))
)
# Platform web search (SearXNG)
SEARXNG_DEFAULT_HOST = os.getenv("SEARXNG_DEFAULT_HOST")
NEXT_FRONTEND_URL = os.getenv("NEXT_FRONTEND_URL")
# Backend URL to override the http to https in the OAuth redirect URI
BACKEND_URL = os.getenv("BACKEND_URL")


@ -183,6 +183,23 @@ global_llm_configs:
use_default_system_instructions: true
citations_enabled: true
# Example: MiniMax M2.5 - High-performance with 204K context window
- id: -8
name: "Global MiniMax M2.5"
description: "MiniMax M2.5 with 204K context window and competitive pricing"
provider: "MINIMAX"
model_name: "MiniMax-M2.5"
api_key: "your-minimax-api-key-here"
api_base: "https://api.minimax.io/v1"
rpm: 60
tpm: 100000
litellm_params:
temperature: 1.0 # MiniMax requires temperature in (0.0, 1.0], cannot be 0
max_tokens: 4000
system_instructions: ""
use_default_system_instructions: true
citations_enabled: true
# =============================================================================
# Image Generation Configuration
# =============================================================================


@ -463,7 +463,7 @@ async def _process_gmail_messages_phase2(
"connector_id": connector_id,
"source": "composio",
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()


@ -477,7 +477,7 @@ async def index_composio_google_calendar(
"connector_id": connector_id,
"source": "composio",
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()


@ -1112,7 +1112,7 @@ async def _index_composio_drive_delta_sync(
"connector_id": connector_id,
"source": "composio",
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()
@ -1520,7 +1520,7 @@ async def _index_composio_drive_full_scan(
"connector_id": connector_id,
"source": "composio",
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()


@ -103,6 +103,13 @@ class PodcastStatus(StrEnum):
FAILED = "failed"
class VideoPresentationStatus(StrEnum):
PENDING = "pending"
GENERATING = "generating"
READY = "ready"
FAILED = "failed"
class DocumentStatus:
"""
Helper class for document processing status (stored as JSONB).
@ -215,6 +222,7 @@ class LiteLLMProvider(StrEnum):
COMETAPI = "COMETAPI"
HUGGINGFACE = "HUGGINGFACE"
GITHUB_MODELS = "GITHUB_MODELS"
MINIMAX = "MINIMAX"
CUSTOM = "CUSTOM"
@ -336,6 +344,12 @@ class Permission(StrEnum):
PODCASTS_UPDATE = "podcasts:update"
PODCASTS_DELETE = "podcasts:delete"
# Video Presentations
VIDEO_PRESENTATIONS_CREATE = "video_presentations:create"
VIDEO_PRESENTATIONS_READ = "video_presentations:read"
VIDEO_PRESENTATIONS_UPDATE = "video_presentations:update"
VIDEO_PRESENTATIONS_DELETE = "video_presentations:delete"
# Image Generations
IMAGE_GENERATIONS_CREATE = "image_generations:create"
IMAGE_GENERATIONS_READ = "image_generations:read"
@ -402,6 +416,10 @@ DEFAULT_ROLE_PERMISSIONS = {
Permission.PODCASTS_CREATE.value,
Permission.PODCASTS_READ.value,
Permission.PODCASTS_UPDATE.value,
# Video Presentations (no delete)
Permission.VIDEO_PRESENTATIONS_CREATE.value,
Permission.VIDEO_PRESENTATIONS_READ.value,
Permission.VIDEO_PRESENTATIONS_UPDATE.value,
# Image Generations (create and read, no delete)
Permission.IMAGE_GENERATIONS_CREATE.value,
Permission.IMAGE_GENERATIONS_READ.value,
@ -434,6 +452,8 @@ DEFAULT_ROLE_PERMISSIONS = {
Permission.LLM_CONFIGS_READ.value,
# Podcasts (read only)
Permission.PODCASTS_READ.value,
# Video Presentations (read only)
Permission.VIDEO_PRESENTATIONS_READ.value,
# Image Generations (read only)
Permission.IMAGE_GENERATIONS_READ.value,
# Connectors (read only)
@ -1043,6 +1063,46 @@ class Podcast(BaseModel, TimestampMixin):
thread = relationship("NewChatThread")
class VideoPresentation(BaseModel, TimestampMixin):
"""Video presentation model for storing AI-generated video presentations.
The slides JSONB stores per-slide data including Remotion component code,
audio file paths, and durations. The frontend compiles the code and renders
the video using Remotion Player.
"""
__tablename__ = "video_presentations"
title = Column(String(500), nullable=False)
slides = Column(JSONB, nullable=True)
scene_codes = Column(JSONB, nullable=True)
status = Column(
SQLAlchemyEnum(
VideoPresentationStatus,
name="video_presentation_status",
create_type=False,
values_callable=lambda x: [e.value for e in x],
),
nullable=False,
default=VideoPresentationStatus.READY,
server_default="ready",
index=True,
)
search_space_id = Column(
Integer, ForeignKey("searchspaces.id", ondelete="CASCADE"), nullable=False
)
search_space = relationship("SearchSpace", back_populates="video_presentations")
thread_id = Column(
Integer,
ForeignKey("new_chat_threads.id", ondelete="SET NULL"),
nullable=True,
index=True,
)
thread = relationship("NewChatThread")
class Report(BaseModel, TimestampMixin):
"""Report model for storing generated Markdown reports."""
@ -1227,6 +1287,12 @@ class SearchSpace(BaseModel, TimestampMixin):
order_by="Podcast.id.desc()",
cascade="all, delete-orphan",
)
video_presentations = relationship(
"VideoPresentation",
back_populates="search_space",
order_by="VideoPresentation.id.desc()",
cascade="all, delete-orphan",
)
reports = relationship(
"Report",
back_populates="search_space",

View file

@ -1,3 +1,4 @@
import asyncio
import time
from datetime import datetime
@ -49,7 +50,7 @@ class ChucksHybridSearchRetriever:
# Get embedding for the query
embedding_model = config.embedding_model_instance
t_embed = time.perf_counter()
query_embedding = embedding_model.embed(query_text)
query_embedding = await asyncio.to_thread(embedding_model.embed, query_text)
perf.debug(
"[chunk_search] vector_search embedding in %.3fs",
time.perf_counter() - t_embed,
@ -195,7 +196,7 @@ class ChucksHybridSearchRetriever:
if query_embedding is None:
embedding_model = config.embedding_model_instance
t_embed = time.perf_counter()
query_embedding = embedding_model.embed(query_text)
query_embedding = await asyncio.to_thread(embedding_model.embed, query_text)
perf.debug(
"[chunk_search] hybrid_search embedding in %.3fs",
time.perf_counter() - t_embed,
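The two hunks above move the synchronous `embedding_model.embed` call off the event loop with `asyncio.to_thread`. The following is a minimal sketch of the effect, assuming a hypothetical `slow_embed` function as a stand-in for the real embedding model (it is not part of this codebase):

```python
import asyncio
import time


def slow_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a blocking embedding call."""
    time.sleep(0.2)  # simulates work that blocks the calling thread
    return [float(len(text))]


async def heartbeat(ticks: list[float]) -> None:
    """Record timestamps while the event loop is free to run other tasks."""
    for _ in range(3):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def main() -> tuple[list[float], int]:
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in a worker thread, so heartbeat keeps ticking.
    embedding = await asyncio.to_thread(slow_embed, "hello")
    await beat
    return embedding, len(ticks)


embedding, tick_count = asyncio.run(main())
```

Calling `slow_embed("hello")` directly inside `main` would instead stall every other coroutine on the loop for the duration of the call.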

View file

@ -42,6 +42,7 @@ from .search_spaces_routes import router as search_spaces_router
from .slack_add_connector_route import router as slack_add_connector_router
from .surfsense_docs_routes import router as surfsense_docs_router
from .teams_add_connector_route import router as teams_add_connector_router
from .video_presentations_routes import router as video_presentations_router
from .youtube_routes import router as youtube_router
router = APIRouter()
@ -55,6 +56,9 @@ router.include_router(new_chat_router) # Chat with assistant-ui persistence
router.include_router(sandbox_router) # Sandbox file downloads (Daytona)
router.include_router(chat_comments_router)
router.include_router(podcasts_router) # Podcast task status and audio
router.include_router(
video_presentations_router
) # Video presentation status and streaming
router.include_router(reports_router) # Report CRUD and multi-format export
router.include_router(image_generation_router) # Image generation via litellm
router.include_router(search_source_connectors_router)

View file

@ -21,6 +21,7 @@ from app.services.public_chat_service import (
get_public_chat,
get_snapshot_podcast,
get_snapshot_report,
get_snapshot_video_presentation,
)
from app.users import current_active_user
@ -117,6 +118,119 @@ async def stream_public_podcast(
)
@router.get("/{share_token}/video-presentations/{video_presentation_id}")
async def get_public_video_presentation(
share_token: str,
video_presentation_id: int,
session: AsyncSession = Depends(get_async_session),
):
"""
Get video presentation details from a public chat snapshot.
No authentication required - the share_token provides access.
Returns slide data (with public audio URLs) and scene codes.
"""
vp_info = await get_snapshot_video_presentation(
session, share_token, video_presentation_id
)
if not vp_info:
raise HTTPException(status_code=404, detail="Video presentation not found")
slides = vp_info.get("slides") or []
public_slides = _replace_audio_paths_with_public_urls(
share_token, video_presentation_id, slides
)
return {
"id": vp_info.get("original_id"),
"title": vp_info.get("title"),
"status": "ready",
"slides": public_slides,
"scene_codes": vp_info.get("scene_codes"),
"slide_count": len(slides) if slides else None,
}
@router.get(
"/{share_token}/video-presentations/{video_presentation_id}/slides/{slide_number}/audio"
)
async def stream_public_slide_audio(
share_token: str,
video_presentation_id: int,
slide_number: int,
session: AsyncSession = Depends(get_async_session),
):
"""
Stream a slide's audio from a public chat snapshot.
No authentication required - the share_token provides access.
"""
from pathlib import Path
vp_info = await get_snapshot_video_presentation(
session, share_token, video_presentation_id
)
if not vp_info:
raise HTTPException(status_code=404, detail="Video presentation not found")
slides = vp_info.get("slides") or []
slide_data = None
for s in slides:
if s.get("slide_number") == slide_number:
slide_data = s
break
if not slide_data:
raise HTTPException(status_code=404, detail=f"Slide {slide_number} not found")
file_path = slide_data.get("audio_file")
if not file_path or not os.path.isfile(file_path):
raise HTTPException(status_code=404, detail="Slide audio file not found")
ext = Path(file_path).suffix.lower()
media_type = "audio/wav" if ext == ".wav" else "audio/mpeg"
def iterfile():
with open(file_path, mode="rb") as file_like:
yield from file_like
return StreamingResponse(
iterfile(),
media_type=media_type,
headers={
"Accept-Ranges": "bytes",
"Content-Disposition": f"inline; filename={Path(file_path).name}",
},
)
def _replace_audio_paths_with_public_urls(
share_token: str,
video_presentation_id: int,
slides: list[dict],
) -> list[dict]:
"""Replace server-local audio_file paths with public streaming API URLs."""
result = []
for slide in slides:
slide_copy = dict(slide)
slide_number = slide_copy.get("slide_number")
audio_file = slide_copy.pop("audio_file", None)
if audio_file and slide_number is not None:
slide_copy["audio_url"] = (
f"/api/v1/public/{share_token}"
f"/video-presentations/{video_presentation_id}"
f"/slides/{slide_number}/audio"
)
else:
slide_copy["audio_url"] = None
result.append(slide_copy)
return result
@router.get("/{share_token}/reports/{report_id}/content")
async def get_public_report_content(
share_token: str,

View file

@ -0,0 +1,242 @@
"""
Video presentation routes for CRUD operations and per-slide audio streaming.
These routes support the video presentation generation feature in new-chat.
The frontend polls GET /video-presentations/{id} to check the status field.
When ready, the slides JSONB contains per-slide Remotion code and audio file paths.
The frontend compiles the Remotion code via Babel and renders with Remotion Player.
"""
import os
from pathlib import Path
from fastapi import APIRouter, Depends, HTTPException
from fastapi.responses import StreamingResponse
from sqlalchemy import select
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.ext.asyncio import AsyncSession
from app.db import (
Permission,
SearchSpace,
SearchSpaceMembership,
User,
VideoPresentation,
get_async_session,
)
from app.schemas import VideoPresentationRead
from app.users import current_active_user
from app.utils.rbac import check_permission
router = APIRouter()
@router.get("/video-presentations", response_model=list[VideoPresentationRead])
async def read_video_presentations(
skip: int = 0,
limit: int = 100,
search_space_id: int | None = None,
session: AsyncSession = Depends(get_async_session),
user: User = Depends(current_active_user),
):
"""
List video presentations the user has access to.
Requires VIDEO_PRESENTATIONS_READ permission for the search space(s).
"""
if skip < 0 or limit < 1:
raise HTTPException(status_code=400, detail="Invalid pagination parameters")
try:
if search_space_id is not None:
await check_permission(
session,
user,
search_space_id,
Permission.VIDEO_PRESENTATIONS_READ.value,
"You don't have permission to read video presentations in this search space",
)
result = await session.execute(
select(VideoPresentation)
.filter(VideoPresentation.search_space_id == search_space_id)
.offset(skip)
.limit(limit)
)
else:
result = await session.execute(
select(VideoPresentation)
.join(SearchSpace)
.join(SearchSpaceMembership)
.filter(SearchSpaceMembership.user_id == user.id)
.offset(skip)
.limit(limit)
)
return [
VideoPresentationRead.from_orm_with_slides(vp)
for vp in result.scalars().all()
]
except HTTPException:
raise
except SQLAlchemyError:
raise HTTPException(
status_code=500,
detail="Database error occurred while fetching video presentations",
) from None
@router.get(
"/video-presentations/{video_presentation_id}",
response_model=VideoPresentationRead,
)
async def read_video_presentation(
video_presentation_id: int,
session: AsyncSession = Depends(get_async_session),
user: User = Depends(current_active_user),
):
"""
Get a specific video presentation by ID.
Requires authentication with VIDEO_PRESENTATIONS_READ permission.
When status is "ready", the response includes:
- slides: parsed slide data with per-slide audio_url and durations
- scene_codes: Remotion component source code per slide
"""
try:
result = await session.execute(
select(VideoPresentation).filter(
VideoPresentation.id == video_presentation_id
)
)
video_pres = result.scalars().first()
if not video_pres:
raise HTTPException(status_code=404, detail="Video presentation not found")
await check_permission(
session,
user,
video_pres.search_space_id,
Permission.VIDEO_PRESENTATIONS_READ.value,
"You don't have permission to read video presentations in this search space",
)
return VideoPresentationRead.from_orm_with_slides(video_pres)
except HTTPException as he:
raise he
except SQLAlchemyError:
raise HTTPException(
status_code=500,
detail="Database error occurred while fetching video presentation",
) from None
@router.delete("/video-presentations/{video_presentation_id}", response_model=dict)
async def delete_video_presentation(
video_presentation_id: int,
session: AsyncSession = Depends(get_async_session),
user: User = Depends(current_active_user),
):
"""
Delete a video presentation.
Requires VIDEO_PRESENTATIONS_DELETE permission for the search space.
"""
try:
result = await session.execute(
select(VideoPresentation).filter(
VideoPresentation.id == video_presentation_id
)
)
db_video_pres = result.scalars().first()
if not db_video_pres:
raise HTTPException(status_code=404, detail="Video presentation not found")
await check_permission(
session,
user,
db_video_pres.search_space_id,
Permission.VIDEO_PRESENTATIONS_DELETE.value,
"You don't have permission to delete video presentations in this search space",
)
await session.delete(db_video_pres)
await session.commit()
return {"message": "Video presentation deleted successfully"}
except HTTPException as he:
raise he
except SQLAlchemyError:
await session.rollback()
raise HTTPException(
status_code=500,
detail="Database error occurred while deleting video presentation",
) from None
@router.get("/video-presentations/{video_presentation_id}/slides/{slide_number}/audio")
async def stream_slide_audio(
video_presentation_id: int,
slide_number: int,
session: AsyncSession = Depends(get_async_session),
user: User = Depends(current_active_user),
):
"""
Stream the audio file for a specific slide in a video presentation.
The slide_number is 1-based. Audio path is read from the slides JSONB.
"""
try:
result = await session.execute(
select(VideoPresentation).filter(
VideoPresentation.id == video_presentation_id
)
)
video_pres = result.scalars().first()
if not video_pres:
raise HTTPException(status_code=404, detail="Video presentation not found")
await check_permission(
session,
user,
video_pres.search_space_id,
Permission.VIDEO_PRESENTATIONS_READ.value,
"You don't have permission to access video presentations in this search space",
)
slides = video_pres.slides or []
slide_data = None
for s in slides:
if s.get("slide_number") == slide_number:
slide_data = s
break
if not slide_data:
raise HTTPException(
status_code=404,
detail=f"Slide {slide_number} not found",
)
file_path = slide_data.get("audio_file")
if not file_path or not os.path.isfile(file_path):
raise HTTPException(status_code=404, detail="Slide audio file not found")
ext = Path(file_path).suffix.lower()
media_type = "audio/wav" if ext == ".wav" else "audio/mpeg"
def iterfile():
with open(file_path, mode="rb") as file_like:
yield from file_like
return StreamingResponse(
iterfile(),
media_type=media_type,
headers={
"Accept-Ranges": "bytes",
"Content-Disposition": f"inline; filename={Path(file_path).name}",
},
)
except HTTPException as he:
raise he
except Exception as e:
raise HTTPException(
status_code=500,
detail=f"Error streaming slide audio: {e!s}",
) from e
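The `iterfile` helper above iterates the binary file object directly, which splits on newline bytes and therefore yields irregularly sized chunks for audio data. One possible alternative, shown only as a sketch, reads fixed-size blocks. The `iter_file_chunks` name and the 64 KiB default are hypothetical choices, not part of this commit:

```python
import os
import tempfile
from collections.abc import Iterator


def iter_file_chunks(path: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file's bytes in fixed-size blocks (the last may be shorter)."""
    with open(path, mode="rb") as file_like:
        while chunk := file_like.read(chunk_size):
            yield chunk


# Usage: write a small temporary file and read it back in 4-byte chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    tmp_path = tmp.name

chunks = list(iter_file_chunks(tmp_path, chunk_size=4))
os.remove(tmp_path)
```

A generator like this could be passed to `StreamingResponse` in place of `iterfile()`; whether the difference matters in practice depends on the audio files being served.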

View file

@ -101,6 +101,12 @@ from .search_space import (
SearchSpaceWithStats,
)
from .users import UserCreate, UserRead, UserUpdate
from .video_presentations import (
VideoPresentationBase,
VideoPresentationCreate,
VideoPresentationRead,
VideoPresentationUpdate,
)
__all__ = [
# Chat schemas (assistant-ui integration)
@ -220,4 +226,9 @@ __all__ = [
"UserRead",
"UserSearchSpaceAccess",
"UserUpdate",
# Video Presentation schemas
"VideoPresentationBase",
"VideoPresentationCreate",
"VideoPresentationRead",
"VideoPresentationUpdate",
]

View file

@ -12,13 +12,11 @@ class SearchSpaceBase(BaseModel):
class SearchSpaceCreate(SearchSpaceBase):
# Optional on create, will use defaults if not provided
citations_enabled: bool = True
qna_custom_instructions: str | None = None
class SearchSpaceUpdate(BaseModel):
# All fields optional on update - only send what you want to change
name: str | None = None
description: str | None = None
citations_enabled: bool | None = None
@ -29,7 +27,6 @@ class SearchSpaceRead(SearchSpaceBase, IDModel, TimestampModel):
id: int
created_at: datetime
user_id: uuid.UUID
# QnA configuration
citations_enabled: bool
qna_custom_instructions: str | None = None

View file

@ -0,0 +1,103 @@
"""Video presentation schemas for API responses."""
from datetime import datetime
from enum import StrEnum
from typing import Any
from pydantic import BaseModel
class VideoPresentationStatusEnum(StrEnum):
PENDING = "pending"
GENERATING = "generating"
READY = "ready"
FAILED = "failed"
class VideoPresentationBase(BaseModel):
"""Base video presentation schema."""
title: str
slides: list[dict[str, Any]] | None = None
scene_codes: list[dict[str, Any]] | None = None
search_space_id: int
class VideoPresentationCreate(VideoPresentationBase):
"""Schema for creating a video presentation."""
pass
class VideoPresentationUpdate(BaseModel):
"""Schema for updating a video presentation."""
title: str | None = None
slides: list[dict[str, Any]] | None = None
scene_codes: list[dict[str, Any]] | None = None
class VideoPresentationRead(VideoPresentationBase):
"""Schema for reading a video presentation."""
id: int
status: VideoPresentationStatusEnum = VideoPresentationStatusEnum.READY
created_at: datetime
slide_count: int | None = None
class Config:
from_attributes = True
@classmethod
def from_orm_with_slides(cls, obj):
"""Create VideoPresentationRead with slide_count computed.
Replaces raw server file paths in `audio_file` with API streaming
URLs so the frontend can use them directly in Remotion <Audio />.
"""
slides = obj.slides
if slides:
slides = _replace_audio_paths_with_urls(obj.id, slides)
data = {
"id": obj.id,
"title": obj.title,
"slides": slides,
"scene_codes": obj.scene_codes,
"search_space_id": obj.search_space_id,
"status": obj.status,
"created_at": obj.created_at,
"slide_count": len(obj.slides) if obj.slides else None,
}
return cls(**data)
def _replace_audio_paths_with_urls(
video_presentation_id: int,
slides: list[dict[str, Any]],
) -> list[dict[str, Any]]:
"""Replace server-local audio_file paths with streaming API URLs.
Transforms:
"audio_file": "video_presentation_audio/abc_slide_1.mp3"
Into:
"audio_url": "/api/v1/video-presentations/42/slides/1/audio"
The frontend passes this URL to Remotion's <Audio src={slide.audio_url} />.
"""
result = []
for slide in slides:
slide_copy = dict(slide)
slide_number = slide_copy.get("slide_number")
audio_file = slide_copy.pop("audio_file", None)
if audio_file and slide_number is not None:
slide_copy["audio_url"] = (
f"/api/v1/video-presentations/{video_presentation_id}"
f"/slides/{slide_number}/audio"
)
else:
slide_copy["audio_url"] = None
result.append(slide_copy)
return result
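To illustrate the transformation that `_replace_audio_paths_with_urls` performs, the sketch below restates the same logic in standalone form and applies it to sample data. The sample slides and the presentation ID 42 are made up for illustration.

```python
from typing import Any


def replace_audio_paths_with_urls(
    video_presentation_id: int,
    slides: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Standalone restatement of the helper above, for illustration only."""
    result = []
    for slide in slides:
        slide_copy = dict(slide)  # shallow copy so the input is not mutated
        slide_number = slide_copy.get("slide_number")
        audio_file = slide_copy.pop("audio_file", None)
        if audio_file and slide_number is not None:
            slide_copy["audio_url"] = (
                f"/api/v1/video-presentations/{video_presentation_id}"
                f"/slides/{slide_number}/audio"
            )
        else:
            slide_copy["audio_url"] = None
        result.append(slide_copy)
    return result


sample_slides = [
    {"slide_number": 1, "audio_file": "video_presentation_audio/abc_slide_1.mp3"},
    {"slide_number": 2},  # no audio generated for this slide
]
converted = replace_audio_paths_with_urls(42, sample_slides)
```

The server-local path never reaches the client: `audio_file` is dropped from every slide, and slides without audio get an explicit `audio_url` of `None`.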

View file

@ -2,7 +2,6 @@ import asyncio
import time
from datetime import datetime
from typing import Any
from urllib.parse import urljoin
import httpx
from linkup import LinkupClient
@ -577,185 +576,27 @@ class ConnectorService:
search_space_id: int,
top_k: int = 20,
) -> tuple:
"""Search using the platform SearXNG instance.
Delegates to the ``web_search_service`` module, which handles caching, circuit
breaking, and retries. SearXNG configuration comes from the
docker/searxng/settings.yml file.
"""
Search using a configured SearxNG instance and return both sources and documents.
"""
searx_connector = await self.get_connector_by_type(
SearchSourceConnectorType.SEARXNG_API, search_space_id
from app.services import web_search_service
if not web_search_service.is_available():
return {
"id": 11,
"name": "Web Search",
"type": "SEARXNG_API",
"sources": [],
}, []
return await web_search_service.search(
query=user_query,
top_k=top_k,
)
if not searx_connector:
return {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": [],
}, []
config = searx_connector.config or {}
host = config.get("SEARXNG_HOST")
if not host:
print("SearxNG connector is missing SEARXNG_HOST configuration")
return {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": [],
}, []
api_key = config.get("SEARXNG_API_KEY")
engines = config.get("SEARXNG_ENGINES")
categories = config.get("SEARXNG_CATEGORIES")
language = config.get("SEARXNG_LANGUAGE")
safesearch = config.get("SEARXNG_SAFESEARCH")
def _parse_bool(value: Any, default: bool = True) -> bool:
if isinstance(value, bool):
return value
if isinstance(value, str):
lowered = value.strip().lower()
if lowered in {"true", "1", "yes", "on"}:
return True
if lowered in {"false", "0", "no", "off"}:
return False
return default
verify_ssl = _parse_bool(config.get("SEARXNG_VERIFY_SSL", True))
safesearch_value: int | None = None
if isinstance(safesearch, str):
safesearch_clean = safesearch.strip()
if safesearch_clean.isdigit():
safesearch_value = int(safesearch_clean)
elif isinstance(safesearch, int | float):
safesearch_value = int(safesearch)
if safesearch_value is not None and not (0 <= safesearch_value <= 2):
safesearch_value = None
def _format_list(value: Any) -> str | None:
if value is None:
return None
if isinstance(value, str):
value = value.strip()
return value or None
if isinstance(value, list | tuple | set):
cleaned = [str(item).strip() for item in value if str(item).strip()]
return ",".join(cleaned) if cleaned else None
return str(value)
params: dict[str, Any] = {
"q": user_query,
"format": "json",
"language": language or "",
"limit": max(1, min(top_k, 50)),
}
engines_param = _format_list(engines)
if engines_param:
params["engines"] = engines_param
categories_param = _format_list(categories)
if categories_param:
params["categories"] = categories_param
if safesearch_value is not None:
params["safesearch"] = safesearch_value
if not params.get("language"):
params.pop("language")
headers = {"Accept": "application/json"}
if api_key:
headers["X-API-KEY"] = api_key
searx_endpoint = urljoin(host if host.endswith("/") else f"{host}/", "search")
try:
async with httpx.AsyncClient(timeout=20.0, verify=verify_ssl) as client:
response = await client.get(
searx_endpoint,
params=params,
headers=headers,
)
response.raise_for_status()
except httpx.HTTPError as exc:
print(f"Error searching with SearxNG: {exc!s}")
return {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": [],
}, []
try:
data = response.json()
except ValueError:
print("Failed to decode JSON response from SearxNG")
return {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": [],
}, []
searx_results = data.get("results", [])
if not searx_results:
return {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": [],
}, []
sources_list: list[dict[str, Any]] = []
documents: list[dict[str, Any]] = []
async with self.counter_lock:
for result in searx_results:
description = result.get("content") or result.get("snippet") or ""
if len(description) > 160:
description = f"{description}"
source = {
"id": self.source_id_counter,
"title": result.get("title", "SearxNG Result"),
"description": description,
"url": result.get("url", ""),
}
sources_list.append(source)
metadata = {
"url": result.get("url", ""),
"engines": result.get("engines", []),
"category": result.get("category"),
"source": "SEARXNG_API",
}
document = {
"chunk_id": self.source_id_counter,
"content": description or result.get("content", ""),
"score": result.get("score", 0.0),
"document": {
"id": self.source_id_counter,
"title": result.get("title", "SearxNG Result"),
"document_type": "SEARXNG_API",
"metadata": metadata,
},
}
documents.append(document)
self.source_id_counter += 1
result_object = {
"id": 11,
"name": "SearxNG Search",
"type": "SEARXNG_API",
"sources": sources_list,
}
return result_object, documents
async def search_baidu(
self,
user_query: str,

View file

@ -1,11 +1,10 @@
import logging
from datetime import datetime
from sqlalchemy import delete
from sqlalchemy.ext.asyncio import AsyncSession
from app.connectors.linear_connector import LinearConnector
from app.db import Chunk, Document
from app.db import Document
from app.services.llm_service import get_user_long_context_llm
from app.utils.document_converters import (
create_document_chunks,
@ -105,10 +104,6 @@ class LinearKBSyncService:
)
summary_embedding = embed_text(summary_content)
await self.db_session.execute(
delete(Chunk).where(Chunk.document_id == document.id)
)
chunks = await create_document_chunks(issue_content)
document.title = f"{issue_identifier}: {issue_title}"
@ -131,7 +126,7 @@ class LinearKBSyncService:
"connector_id": connector_id,
}
flag_modified(document, "document_metadata")
safe_set_chunks(document, chunks)
await safe_set_chunks(self.db_session, document, chunks)
document.updated_at = get_current_timestamp()
await self.db_session.commit()

View file

@ -85,6 +85,7 @@ PROVIDER_MAP = {
"ZHIPU": "openai",
"GITHUB_MODELS": "github",
"HUGGINGFACE": "huggingface",
"MINIMAX": "openai",
"CUSTOM": "custom",
}

View file

@ -127,6 +127,7 @@ async def validate_llm_config(
"ALIBABA_QWEN": "openai",
"MOONSHOT": "openai",
"ZHIPU": "openai", # GLM needs special handling
"MINIMAX": "openai",
"GITHUB_MODELS": "github",
}
provider_prefix = provider_map.get(provider, provider.lower())
@ -277,6 +278,7 @@ async def get_search_space_llm_instance(
"ALIBABA_QWEN": "openai",
"MOONSHOT": "openai",
"ZHIPU": "openai",
"MINIMAX": "openai",
}
provider_prefix = provider_map.get(
global_config["provider"], global_config["provider"].lower()
@ -350,6 +352,7 @@ async def get_search_space_llm_instance(
"ALIBABA_QWEN": "openai",
"MOONSHOT": "openai",
"ZHIPU": "openai",
"MINIMAX": "openai",
"GITHUB_MODELS": "github",
}
provider_prefix = provider_map.get(
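Each hunk above adds `MINIMAX` to a provider map that routes OpenAI-compatible providers to the `openai` prefix, with unmapped providers falling back to their lowercased name. The following is a minimal sketch of that lookup. The `resolve_provider_prefix` helper is hypothetical, and the map is trimmed to entries visible in the diff:

```python
# Trimmed copy of the mapping; only entries visible in the diff are included.
PROVIDER_MAP = {
    "ALIBABA_QWEN": "openai",
    "MOONSHOT": "openai",
    "ZHIPU": "openai",
    "MINIMAX": "openai",
    "GITHUB_MODELS": "github",
}


def resolve_provider_prefix(provider: str) -> str:
    """Hypothetical helper: mapped prefix, else the lowercased provider name."""
    return PROVIDER_MAP.get(provider, provider.lower())


minimax_prefix = resolve_provider_prefix("MINIMAX")
fallback_prefix = resolve_provider_prefix("SOME_PROVIDER")
```

Because the same map is repeated in several functions, a single shared helper along these lines might reduce the chance of one copy missing a new provider, though that refactor is outside the scope of this commit.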

View file

@ -1,10 +1,9 @@
import logging
from datetime import datetime
from sqlalchemy import delete
from sqlalchemy.ext.asyncio import AsyncSession
from app.db import Chunk, Document
from app.db import Document
from app.services.llm_service import get_user_long_context_llm
from app.utils.document_converters import (
create_document_chunks,
@ -130,11 +129,6 @@ class NotionKBSyncService:
summary_content = f"Notion Page: {document.document_metadata.get('page_title')}\n\n{full_content}"
summary_embedding = embed_text(summary_content)
logger.debug(f"Deleting old chunks for document {document_id}")
await self.db_session.execute(
delete(Chunk).where(Chunk.document_id == document.id)
)
logger.debug("Creating new chunks")
chunks = await create_document_chunks(full_content)
logger.debug(f"Created {len(chunks)} chunks")
@ -147,7 +141,7 @@ class NotionKBSyncService:
**document.document_metadata,
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
}
safe_set_chunks(document, chunks)
await safe_set_chunks(self.db_session, document, chunks)
document.updated_at = get_current_timestamp()
logger.debug("Committing changes to database")

View file

@ -32,6 +32,8 @@ from app.db import (
Report,
SearchSpaceMembership,
User,
VideoPresentation,
VideoPresentationStatus,
)
from app.utils.rbac import check_permission
@ -40,6 +42,7 @@ UI_TOOLS = {
"link_preview",
"generate_podcast",
"generate_report",
"generate_video_presentation",
"scrape_webpage",
"multi_link_preview",
}
@ -199,6 +202,8 @@ async def create_snapshot(
podcast_ids_seen: set[int] = set()
reports_data = []
report_ids_seen: set[int] = set()
video_presentations_data = []
video_presentation_ids_seen: set[int] = set()
for msg in sorted(thread.messages, key=lambda m: m.created_at):
author = await get_author_display(session, msg.author_id, user_cache)
@ -225,6 +230,18 @@ async def create_snapshot(
# Update status to "ready" so frontend renders PodcastPlayer
part["result"] = {**result_data, "status": "ready"}
elif tool_name == "generate_video_presentation":
result_data = part.get("result", {})
vp_id = result_data.get("video_presentation_id")
if vp_id and vp_id not in video_presentation_ids_seen:
vp_info = await _get_video_presentation_for_snapshot(
session, vp_id
)
if vp_info:
video_presentations_data.append(vp_info)
video_presentation_ids_seen.add(vp_id)
part["result"] = {**result_data, "status": "ready"}
elif tool_name == "generate_report":
result_data = part.get("result", {})
report_id = result_data.get("report_id")
@ -283,6 +300,7 @@ async def create_snapshot(
"messages": messages_data,
"podcasts": podcasts_data,
"reports": reports_data,
"video_presentations": video_presentations_data,
}
# Create new snapshot
@ -326,6 +344,27 @@ async def _get_podcast_for_snapshot(
}
async def _get_video_presentation_for_snapshot(
session: AsyncSession,
video_presentation_id: int,
) -> dict | None:
"""Get video presentation info for embedding in snapshot_data."""
result = await session.execute(
select(VideoPresentation).filter(VideoPresentation.id == video_presentation_id)
)
vp = result.scalars().first()
if not vp or vp.status != VideoPresentationStatus.READY:
return None
return {
"original_id": vp.id,
"title": vp.title,
"slides": vp.slides,
"scene_codes": vp.scene_codes,
}
async def _get_report_for_snapshot(
session: AsyncSession,
report_id: int,
@ -769,6 +808,31 @@ async def get_snapshot_podcast(
return None
async def get_snapshot_video_presentation(
session: AsyncSession,
share_token: str,
video_presentation_id: int,
) -> dict | None:
"""
Get video presentation info from a snapshot by original video presentation ID.
Used to render a video presentation in the public view.
Looks up the presentation by its original_id in the snapshot's video_presentations array.
"""
snapshot = await get_snapshot_by_token(session, share_token)
if not snapshot:
return None
video_presentations = snapshot.snapshot_data.get("video_presentations", [])
for vp in video_presentations:
if vp.get("original_id") == video_presentation_id:
return vp
return None
async def get_snapshot_report(
session: AsyncSession,
share_token: str,

View file

@ -0,0 +1,290 @@
"""
Platform-level web search service backed by SearXNG.
Redis is used only for result caching (graceful degradation if unavailable).
The circuit breaker is fully in-process: no external dependency, zero
latency overhead.
"""
from __future__ import annotations
import contextlib
import hashlib
import json
import logging
import threading
import time
from typing import Any
from urllib.parse import urljoin
import httpx
import redis
from app.config import config
logger = logging.getLogger(__name__)
_EMPTY_RESULT: dict[str, Any] = {
"id": 11,
"name": "Web Search",
"type": "SEARXNG_API",
"sources": [],
}
# ---------------------------------------------------------------------------
# Redis — used only for result caching
# ---------------------------------------------------------------------------
_redis_client: redis.Redis | None = None
def _get_redis() -> redis.Redis:
global _redis_client
if _redis_client is None:
_redis_client = redis.from_url(config.REDIS_APP_URL, decode_responses=True)
return _redis_client
# ---------------------------------------------------------------------------
# In-process Circuit Breaker (no Redis dependency)
# ---------------------------------------------------------------------------
_CB_FAILURE_THRESHOLD = 5
_CB_FAILURE_WINDOW_SECONDS = 60
_CB_COOLDOWN_SECONDS = 30
_cb_lock = threading.Lock()
_cb_failure_count: int = 0
_cb_last_failure_time: float = 0.0
_cb_open_until: float = 0.0
def _circuit_is_open() -> bool:
return time.monotonic() < _cb_open_until
def _record_failure() -> None:
global _cb_failure_count, _cb_last_failure_time, _cb_open_until
now = time.monotonic()
with _cb_lock:
if now - _cb_last_failure_time > _CB_FAILURE_WINDOW_SECONDS:
_cb_failure_count = 0
_cb_failure_count += 1
_cb_last_failure_time = now
if _cb_failure_count >= _CB_FAILURE_THRESHOLD:
_cb_open_until = now + _CB_COOLDOWN_SECONDS
logger.warning(
"Circuit breaker OPENED after %d failures — "
"SearXNG calls paused for %ds",
_cb_failure_count,
_CB_COOLDOWN_SECONDS,
)
def _record_success() -> None:
global _cb_failure_count, _cb_open_until
with _cb_lock:
_cb_failure_count = 0
_cb_open_until = 0.0
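The module-level breaker above depends on globals and `time.monotonic()`, which makes it awkward to exercise in isolation. The sketch below restates the same rules (failure threshold, failure window, cooldown) as a small class with an injectable clock. The `CircuitBreaker` class and its `clock` parameter are hypothetical and not part of this commit.

```python
import threading
import time
from collections.abc import Callable


class CircuitBreaker:
    """Hypothetical class form of the module-level breaker above."""

    def __init__(
        self,
        threshold: int = 5,
        window: float = 60.0,
        cooldown: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = threshold
        self._window = window
        self._cooldown = cooldown
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        self._failures = 0
        self._last_failure = 0.0
        self._open_until = 0.0

    def is_open(self) -> bool:
        return self._clock() < self._open_until

    def record_failure(self) -> None:
        now = self._clock()
        with self._lock:
            # Failures older than the window no longer count.
            if now - self._last_failure > self._window:
                self._failures = 0
            self._failures += 1
            self._last_failure = now
            if self._failures >= self._threshold:
                self._open_until = now + self._cooldown

    def record_success(self) -> None:
        with self._lock:
            self._failures = 0
            self._open_until = 0.0


# Usage with a fake clock: two failures open the circuit, cooldown closes it.
fake_now = [0.0]
breaker = CircuitBreaker(threshold=2, cooldown=30.0, clock=lambda: fake_now[0])
breaker.record_failure()
open_after_one = breaker.is_open()
breaker.record_failure()
open_after_two = breaker.is_open()
fake_now[0] = 31.0
open_after_cooldown = breaker.is_open()
```

As in the original, the breaker has no explicit half-open state: once the cooldown elapses, calls are simply allowed again, and the next success resets the failure count.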
# ---------------------------------------------------------------------------
# Result Caching (Redis, graceful degradation)
# ---------------------------------------------------------------------------
_CACHE_TTL_SECONDS = 300 # 5 minutes
_CACHE_PREFIX = "websearch:cache:"
def _cache_key(query: str, engines: str | None, language: str | None) -> str:
raw = f"{query}|{engines or ''}|{language or ''}"
digest = hashlib.sha256(raw.encode()).hexdigest()[:24]
return f"{_CACHE_PREFIX}{digest}"
def _cache_get(key: str) -> dict | None:
try:
data = _get_redis().get(key)
if data:
return json.loads(data)
except (redis.RedisError, json.JSONDecodeError):
pass
return None
def _cache_set(key: str, value: dict) -> None:
with contextlib.suppress(redis.RedisError):
_get_redis().setex(key, _CACHE_TTL_SECONDS, json.dumps(value))
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------


def is_available() -> bool:
    """Return ``True`` when the platform SearXNG host is configured."""
    return bool(config.SEARXNG_DEFAULT_HOST)


async def health_check() -> dict[str, Any]:
    """Ping the SearXNG ``/healthz`` endpoint and return status info."""
    host = config.SEARXNG_DEFAULT_HOST
    if not host:
        return {"status": "unavailable", "error": "SEARXNG_DEFAULT_HOST not set"}

    healthz_url = urljoin(host if host.endswith("/") else f"{host}/", "healthz")
    t0 = time.perf_counter()
    try:
        async with httpx.AsyncClient(timeout=5.0, verify=False) as client:
            resp = await client.get(healthz_url)
            resp.raise_for_status()
        elapsed_ms = round((time.perf_counter() - t0) * 1000)
        return {
            "status": "healthy",
            "response_time_ms": elapsed_ms,
            "circuit_breaker": "open" if _circuit_is_open() else "closed",
        }
    except Exception as exc:
        elapsed_ms = round((time.perf_counter() - t0) * 1000)
        return {
            "status": "unhealthy",
            "error": str(exc),
            "response_time_ms": elapsed_ms,
            "circuit_breaker": "open" if _circuit_is_open() else "closed",
        }


async def search(
    query: str,
    top_k: int = 20,
    *,
    engines: str | None = None,
    language: str | None = None,
    safesearch: int | None = None,
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Execute a web search against the platform SearXNG instance.

    Returns the standard ``(result_object, documents)`` tuple expected by
    ``ConnectorService.search_searxng``.
    """
    host = config.SEARXNG_DEFAULT_HOST
    if not host:
        return dict(_EMPTY_RESULT), []

    if _circuit_is_open():
        logger.info("Web search skipped — circuit breaker is open")
        result = dict(_EMPTY_RESULT)
        result["error"] = "Web search temporarily unavailable (circuit open)"
        result["status"] = "degraded"
        return result, []

    ck = _cache_key(query, engines, language)
    cached = _cache_get(ck)
    if cached is not None:
        logger.debug("Web search cache HIT for query=%r", query[:60])
        return cached["result"], cached["documents"]

    params: dict[str, Any] = {
        "q": query,
        "format": "json",
        "limit": max(1, min(top_k, 50)),
    }
    if engines:
        params["engines"] = engines
    if language:
        params["language"] = language
    if safesearch is not None and 0 <= safesearch <= 2:
        params["safesearch"] = safesearch

    searx_endpoint = urljoin(host if host.endswith("/") else f"{host}/", "search")
    headers = {"Accept": "application/json"}

    data: dict[str, Any] | None = None
    last_error: Exception | None = None

    for attempt in range(2):
        try:
            async with httpx.AsyncClient(timeout=15.0, verify=False) as client:
                response = await client.get(
                    searx_endpoint,
                    params=params,
                    headers=headers,
                )
                response.raise_for_status()
                data = response.json()
                break
        except (httpx.HTTPStatusError, httpx.TimeoutException) as exc:
            last_error = exc
            if attempt == 0 and (
                isinstance(exc, httpx.TimeoutException)
                or (
                    isinstance(exc, httpx.HTTPStatusError)
                    and exc.response.status_code >= 500
                )
            ):
                continue
            break
        except httpx.HTTPError as exc:
            last_error = exc
            break
        except ValueError as exc:
            last_error = exc
            break

    if data is None:
        _record_failure()
        logger.warning("Web search failed after retries: %s", last_error)
        return dict(_EMPTY_RESULT), []

    _record_success()

    searx_results = data.get("results", [])
    if not searx_results:
        return dict(_EMPTY_RESULT), []

    sources_list: list[dict[str, Any]] = []
    documents: list[dict[str, Any]] = []

    for idx, result in enumerate(searx_results):
        source_id = 200_000 + idx
        description = result.get("content") or result.get("snippet") or ""

        sources_list.append(
            {
                "id": source_id,
                "title": result.get("title", "Web Search Result"),
                "description": description,
                "url": result.get("url", ""),
            }
        )
        documents.append(
            {
                "chunk_id": source_id,
                "content": description or result.get("content", ""),
                "score": result.get("score", 0.0),
                "document": {
                    "id": source_id,
                    "title": result.get("title", "Web Search Result"),
                    "document_type": "SEARXNG_API",
                    "metadata": {
                        "url": result.get("url", ""),
                        "engines": result.get("engines", []),
                        "category": result.get("category"),
                        "source": "SEARXNG_API",
                    },
                },
            }
        )

    result_object: dict[str, Any] = {
        "id": 11,
        "name": "Web Search",
        "type": "SEARXNG_API",
        "sources": sources_list,
    }

    _cache_set(ck, {"result": result_object, "documents": documents})
    return result_object, documents
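Both `health_check` and `search` append a trailing slash to the host before calling `urljoin`. The sketch below, using only the standard library, shows why that normalization matters when SearXNG is served under a sub-path. The helper name `build_endpoint` and the example host are hypothetical and not part of the module.

```python
from urllib.parse import urljoin


def build_endpoint(host: str, path: str) -> str:
    """Join ``path`` onto ``host`` without dropping the host's last path segment."""
    base = host if host.endswith("/") else f"{host}/"
    return urljoin(base, path)


# Hypothetical host that serves SearXNG under a sub-path.
host = "http://searxng.internal:8080/sx"

# Without the trailing slash, urljoin treats "sx" as a file name and replaces it.
naive = urljoin(host, "search")
# With the normalization, the sub-path is preserved.
normalized = build_endpoint(host, "search")

print(naive)       # http://searxng.internal:8080/search
print(normalized)  # http://searxng.internal:8080/sx/search
```

For a host with no path component the two forms give the same URL, so the normalization only changes behavior for sub-path deployments.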

View file

@ -9,7 +9,6 @@ from sqlalchemy import select
from app.agents.podcaster.graph import graph as podcaster_graph
from app.agents.podcaster.state import State as PodcasterState
from app.celery_app import celery_app
from app.config import config
from app.db import Podcast, PodcastStatus
from app.tasks.celery_tasks import get_celery_session_maker
@ -29,21 +28,6 @@ if sys.platform.startswith("win"):
# =============================================================================
def _clear_generating_podcast(search_space_id: int) -> None:
"""Clear the generating podcast marker from Redis when task completes."""
import redis
try:
client = redis.from_url(config.REDIS_APP_URL, decode_responses=True)
key = f"podcast:generating:{search_space_id}"
client.delete(key)
logger.info(
f"Cleared generating podcast key for search_space_id={search_space_id}"
)
except Exception as e:
logger.warning(f"Could not clear generating podcast key: {e}")
@celery_app.task(name="generate_content_podcast", bind=True)
def generate_content_podcast_task(
self,
@ -75,7 +59,6 @@ def generate_content_podcast_task(
loop.run_until_complete(_mark_podcast_failed(podcast_id))
return {"status": "failed", "podcast_id": podcast_id}
finally:
_clear_generating_podcast(search_space_id)
asyncio.set_event_loop(None)
loop.close()
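The Celery tasks in this change run coroutines from a synchronous worker function by creating a private event loop and tearing it down in `finally`. A minimal stdlib sketch of that pattern follows; `do_work`, `mark_failed`, and `run_job` are hypothetical stand-ins for the real task, graph, and database code.

```python
import asyncio


async def do_work(value: int) -> dict:
    """Stand-in for the real async job (hypothetical)."""
    await asyncio.sleep(0)
    if value < 0:
        raise ValueError("negative input")
    return {"status": "ready", "value": value}


async def mark_failed(job_id: int) -> None:
    """Stand-in for the database update that records a failure (hypothetical)."""
    await asyncio.sleep(0)


def run_job(job_id: int, value: int) -> dict:
    """Drive coroutines from synchronous code on a private event loop."""
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        result = loop.run_until_complete(do_work(value))
        loop.run_until_complete(loop.shutdown_asyncgens())
        return result
    except Exception:
        # The loop is still open here, so it can run the failure handler too.
        loop.run_until_complete(mark_failed(job_id))
        return {"status": "failed", "job_id": job_id}
    finally:
        # Detach and close the loop so nothing leaks into the next task.
        asyncio.set_event_loop(None)
        loop.close()


print(run_job(1, 5))   # {'status': 'ready', 'value': 5}
print(run_job(2, -1))  # {'status': 'failed', 'job_id': 2}
```

Because cleanup lives in `finally`, it runs on both the success and failure paths, which is presumably why the removed `_clear_generating_podcast` call sat in the same block.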

View file

@ -0,0 +1,161 @@
"""Celery tasks for video presentation generation."""
import asyncio
import logging
import sys

from sqlalchemy import select

from app.agents.video_presentation.graph import graph as video_presentation_graph
from app.agents.video_presentation.state import State as VideoPresentationState
from app.celery_app import celery_app
from app.db import VideoPresentation, VideoPresentationStatus
from app.tasks.celery_tasks import get_celery_session_maker

logger = logging.getLogger(__name__)

if sys.platform.startswith("win"):
    try:
        asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
    except AttributeError:
        logger.warning(
            "WindowsProactorEventLoopPolicy is unavailable; async subprocess support may fail."
        )


@celery_app.task(name="generate_video_presentation", bind=True)
def generate_video_presentation_task(
    self,
    video_presentation_id: int,
    source_content: str,
    search_space_id: int,
    user_prompt: str | None = None,
) -> dict:
    """
    Celery task to generate video presentation from source content.
    Updates existing video presentation record created by the tool.
    """
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)

    try:
        result = loop.run_until_complete(
            _generate_video_presentation(
                video_presentation_id,
                source_content,
                search_space_id,
                user_prompt,
            )
        )
        loop.run_until_complete(loop.shutdown_asyncgens())
        return result
    except Exception as e:
        logger.error(f"Error generating video presentation: {e!s}")
        loop.run_until_complete(_mark_video_presentation_failed(video_presentation_id))
        return {"status": "failed", "video_presentation_id": video_presentation_id}
    finally:
        asyncio.set_event_loop(None)
        loop.close()


async def _mark_video_presentation_failed(video_presentation_id: int) -> None:
    """Mark a video presentation as failed in the database."""
    async with get_celery_session_maker()() as session:
        try:
            result = await session.execute(
                select(VideoPresentation).filter(
                    VideoPresentation.id == video_presentation_id
                )
            )
            video_pres = result.scalars().first()
            if video_pres:
                video_pres.status = VideoPresentationStatus.FAILED
                await session.commit()
        except Exception as e:
            logger.error(f"Failed to mark video presentation as failed: {e}")


async def _generate_video_presentation(
    video_presentation_id: int,
    source_content: str,
    search_space_id: int,
    user_prompt: str | None = None,
) -> dict:
    """Generate video presentation and update existing record."""
    async with get_celery_session_maker()() as session:
        result = await session.execute(
            select(VideoPresentation).filter(
                VideoPresentation.id == video_presentation_id
            )
        )
        video_pres = result.scalars().first()

        if not video_pres:
            raise ValueError(f"VideoPresentation {video_presentation_id} not found")

        try:
            video_pres.status = VideoPresentationStatus.GENERATING
            await session.commit()

            graph_config = {
                "configurable": {
                    "video_title": video_pres.title,
                    "search_space_id": search_space_id,
                    "user_prompt": user_prompt,
                }
            }

            initial_state = VideoPresentationState(
                source_content=source_content,
                db_session=session,
            )

            graph_result = await video_presentation_graph.ainvoke(
                initial_state, config=graph_config
            )

            # Serialize slides (parsed content + audio info merged)
            slides_raw = graph_result.get("slides", [])
            audio_results_raw = graph_result.get("slide_audio_results", [])
            scene_codes_raw = graph_result.get("slide_scene_codes", [])

            audio_map = {}
            for ar in audio_results_raw:
                data = ar.model_dump() if hasattr(ar, "model_dump") else ar
                audio_map[data.get("slide_number", 0)] = data

            serializable_slides = []
            for slide in slides_raw:
                slide_data = (
                    slide.model_dump() if hasattr(slide, "model_dump") else dict(slide)
                )
                audio_data = audio_map.get(slide_data.get("slide_number", 0), {})
                slide_data["audio_file"] = audio_data.get("audio_file")
                slide_data["duration_seconds"] = audio_data.get("duration_seconds")
                slide_data["duration_in_frames"] = audio_data.get("duration_in_frames")
                serializable_slides.append(slide_data)

            serializable_scene_codes = []
            for sc in scene_codes_raw:
                sc_data = sc.model_dump() if hasattr(sc, "model_dump") else dict(sc)
                serializable_scene_codes.append(sc_data)

            video_pres.slides = serializable_slides
            video_pres.scene_codes = serializable_scene_codes
            video_pres.status = VideoPresentationStatus.READY
            await session.commit()

            logger.info(f"Successfully generated video presentation: {video_pres.id}")

            return {
                "status": "ready",
                "video_presentation_id": video_pres.id,
                "title": video_pres.title,
                "slide_count": len(serializable_slides),
            }
        except Exception as e:
            logger.error(f"Error in _generate_video_presentation: {e!s}")
            video_pres.status = VideoPresentationStatus.FAILED
            await session.commit()
            raise
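The serialization step in `_generate_video_presentation` joins each slide with its audio result on `slide_number`. The sketch below isolates that merge with plain dictionaries. The function name `merge_slides_with_audio` and the sample data are hypothetical, and the real code additionally calls `model_dump()` on objects that provide it.

```python
from typing import Any


def merge_slides_with_audio(
    slides: list[dict[str, Any]], audio_results: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Attach audio fields to each slide, matching on ``slide_number``."""
    audio_by_number = {a.get("slide_number", 0): a for a in audio_results}
    merged = []
    for slide in slides:
        row = dict(slide)  # copy so the input is not mutated
        audio = audio_by_number.get(row.get("slide_number", 0), {})
        # Missing audio yields None for each field rather than raising.
        row["audio_file"] = audio.get("audio_file")
        row["duration_seconds"] = audio.get("duration_seconds")
        row["duration_in_frames"] = audio.get("duration_in_frames")
        merged.append(row)
    return merged


slides = [
    {"slide_number": 1, "title": "Intro"},
    {"slide_number": 2, "title": "Details"},
]
audio = [
    {
        "slide_number": 1,
        "audio_file": "s1.mp3",
        "duration_seconds": 4.0,
        "duration_in_frames": 120,
    }
]
merged = merge_slides_with_audio(slides, audio)
print(merged[0]["audio_file"])  # s1.mp3
print(merged[1]["audio_file"])  # None
```

A slide without a matching audio entry keeps its content and gets `None` for the three audio fields, so a consumer of the stored slides should be prepared for missing audio.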

View file

@ -613,6 +613,41 @@ async def _stream_agent_events(
status="completed",
items=completed_items,
)
elif tool_name == "generate_video_presentation":
vp_status = (
tool_output.get("status", "unknown")
if isinstance(tool_output, dict)
else "unknown"
)
vp_title = (
tool_output.get("title", "Presentation")
if isinstance(tool_output, dict)
else "Presentation"
)
if vp_status in ("pending", "generating"):
completed_items = [
f"Title: {vp_title}",
"Presentation generation started",
"Processing in background...",
]
elif vp_status == "failed":
error_msg = (
tool_output.get("error", "Unknown error")
if isinstance(tool_output, dict)
else "Unknown error"
)
completed_items = [
f"Title: {vp_title}",
f"Error: {error_msg[:50]}",
]
else:
completed_items = last_active_step_items
yield streaming_service.format_thinking_step(
step_id=original_step_id,
title="Generating video presentation",
status="completed",
items=completed_items,
)
elif tool_name == "generate_report":
report_status = (
tool_output.get("status", "unknown")
@ -756,6 +791,34 @@ async def _stream_agent_events(
f"Podcast generation failed: {error_msg}",
"error",
)
elif tool_name == "generate_video_presentation":
yield streaming_service.format_tool_output_available(
tool_call_id,
tool_output
if isinstance(tool_output, dict)
else {"result": tool_output},
)
if (
isinstance(tool_output, dict)
and tool_output.get("status") == "pending"
):
yield streaming_service.format_terminal_info(
f"Video presentation queued: {tool_output.get('title', 'Presentation')}",
"success",
)
elif (
isinstance(tool_output, dict)
and tool_output.get("status") == "failed"
):
error_msg = (
tool_output.get("error", "Unknown error")
if isinstance(tool_output, dict)
else "Unknown error"
)
yield streaming_service.format_terminal_info(
f"Presentation generation failed: {error_msg}",
"error",
)
elif tool_name == "link_preview":
yield streaming_service.format_tool_output_available(
tool_call_id,
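The `generate_video_presentation` branches above repeat the `isinstance(tool_output, dict)` guard for every field they read. As a rough sketch of the same decision logic in one place, the hypothetical helper below normalizes the output once and returns the step items. It is an illustration, not code from this change, and it additionally coerces the error to `str` before slicing.

```python
from typing import Any


def summarize_presentation_output(tool_output: Any, fallback: list[str]) -> list[str]:
    """Build the step items for a finished ``generate_video_presentation`` call."""
    # Non-dict outputs (for example a bare string) fall back to defaults.
    out = tool_output if isinstance(tool_output, dict) else {}
    status = out.get("status", "unknown")
    title = out.get("title", "Presentation")

    if status in ("pending", "generating"):
        return [
            f"Title: {title}",
            "Presentation generation started",
            "Processing in background...",
        ]
    if status == "failed":
        error = str(out.get("error", "Unknown error"))
        # Truncate long errors to 50 characters, as the streaming code does.
        return [f"Title: {title}", f"Error: {error[:50]}"]
    # Any other status keeps whatever items the step already showed.
    return fallback


print(summarize_presentation_output({"status": "pending", "title": "Q3 review"}, []))
print(summarize_presentation_output("unexpected string", ["kept"]))  # ['kept']
```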

View file

@ -432,7 +432,7 @@ async def index_airtable_records(
"table_name": item["table_name"],
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -28,45 +28,35 @@ def get_current_timestamp() -> datetime:
return datetime.now(UTC)
def safe_set_chunks(document: Document, chunks: list) -> None:
async def safe_set_chunks(
session: "AsyncSession", document: Document, chunks: list
) -> None:
"""
Safely assign chunks to a document without triggering lazy loading.
Delete old chunks and assign new ones to a document.
ALWAYS use this instead of `document.chunks = chunks` to avoid
SQLAlchemy async errors (MissingGreenlet / greenlet_spawn).
Why this is needed:
- Direct assignment `document.chunks = chunks` triggers SQLAlchemy to
load the OLD chunks first (for comparison/orphan detection)
- This lazy loading fails in async context with asyncpg driver
- set_committed_value bypasses this by setting the value directly
This function is safe regardless of how the document was loaded
(with or without selectinload).
This replaces direct ``document.chunks = chunks``, which triggers lazy
loading (and MissingGreenlet errors in async contexts). It also
explicitly deletes pre-existing chunks so they don't accumulate across
repeated re-indexes, because ``set_committed_value`` bypasses SQLAlchemy's
delete-orphan cascade.
Args:
document: The Document object to update
chunks: List of Chunk objects to assign
Example:
# Instead of: document.chunks = chunks (DANGEROUS!)
safe_set_chunks(document, chunks) # Always safe
session: The current async database session.
document: The Document object to update.
chunks: List of Chunk objects to assign.
"""
from sqlalchemy.orm import object_session
from sqlalchemy import delete
from sqlalchemy.orm.attributes import set_committed_value
# Keep relationship assignment lazy-load-safe.
set_committed_value(document, "chunks", chunks)
from app.db import Chunk
# Ensure chunk rows are actually persisted.
# set_committed_value bypasses normal unit-of-work tracking, so we need to
# explicitly attach chunk objects to the current session.
session = object_session(document)
if session is not None:
if document.id is not None:
for chunk in chunks:
chunk.document_id = document.id
session.add_all(chunks)
if document.id is not None:
await session.execute(delete(Chunk).where(Chunk.document_id == document.id))
for chunk in chunks:
chunk.document_id = document.id
set_committed_value(document, "chunks", chunks)
session.add_all(chunks)
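The new `safe_set_chunks` deletes a document's existing chunk rows before attaching the new ones. The sketch below shows the accumulation problem that this avoids. It deliberately swaps in the standard-library `sqlite3` module instead of SQLAlchemy, and the table layout and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, document_id INTEGER, content TEXT)"
)


def append_chunks(document_id: int, contents: list[str]) -> None:
    """Insert new chunks without touching existing rows (the leaky variant)."""
    conn.executemany(
        "INSERT INTO chunks (document_id, content) VALUES (?, ?)",
        [(document_id, c) for c in contents],
    )


def replace_chunks(document_id: int, contents: list[str]) -> None:
    """Delete a document's old chunks, then insert the new ones."""
    conn.execute("DELETE FROM chunks WHERE document_id = ?", (document_id,))
    append_chunks(document_id, contents)


def count_chunks(document_id: int) -> int:
    """Return how many chunk rows belong to the given document."""
    row = conn.execute(
        "SELECT COUNT(*) FROM chunks WHERE document_id = ?", (document_id,)
    ).fetchone()
    return row[0]


# Re-indexing document 1 twice without deleting doubles its rows.
append_chunks(1, ["a", "b"])
append_chunks(1, ["a", "b"])
leaky_count = count_chunks(1)

# Re-indexing document 2 twice with delete-then-insert keeps only the latest set.
replace_chunks(2, ["a", "b"])
replace_chunks(2, ["a", "b", "c"])
replaced_count = count_chunks(2)

print(leaky_count, replaced_count)  # 4 3
```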
def parse_date_flexible(date_str: str) -> datetime:

View file

@ -430,7 +430,7 @@ async def index_bookstack_pages(
document.content_hash = item["content_hash"]
document.embedding = summary_embedding
document.document_metadata = doc_metadata
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -439,7 +439,7 @@ async def index_clickup_tasks(
"connector_id": connector_id,
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -413,7 +413,7 @@ async def index_confluence_pages(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -690,7 +690,7 @@ async def index_discord_messages(
"indexed_at": datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -386,7 +386,7 @@ async def index_elasticsearch_documents(
document.content_hash = item["content_hash"]
document.unique_identifier_hash = item["unique_identifier_hash"]
document.document_metadata = metadata
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -415,7 +415,7 @@ async def index_github_repos(
document.content_hash = item["content_hash"]
document.embedding = summary_embedding
document.document_metadata = doc_metadata
safe_set_chunks(document, chunks_data)
await safe_set_chunks(session, document, chunks_data)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -528,7 +528,7 @@ async def index_google_calendar_events(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -451,7 +451,7 @@ async def index_google_gmail_messages(
"date": item["date_str"],
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -393,7 +393,7 @@ async def index_jira_issues(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -431,7 +431,7 @@ async def index_linear_issues(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -488,7 +488,7 @@ async def index_luma_events(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -479,7 +479,7 @@ async def index_notion_pages(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -571,7 +571,7 @@ async def index_obsidian_vault(
document.content_hash = content_hash
document.embedding = embedding
document.document_metadata = document_metadata
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -564,7 +564,7 @@ async def index_slack_messages(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -603,7 +603,7 @@ async def index_teams_messages(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.updated_at = get_current_timestamp()
document.status = DocumentStatus.ready()

View file

@ -410,7 +410,7 @@ async def index_crawled_urls(
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
"connector_id": connector_id,
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.status = DocumentStatus.ready() # READY status
document.updated_at = get_current_timestamp()

View file

@ -14,45 +14,35 @@ from app.db import Document
md = MarkdownifyTransformer()
def safe_set_chunks(document: Document, chunks: list) -> None:
async def safe_set_chunks(
session: "AsyncSession", document: Document, chunks: list
) -> None:
"""
Safely assign chunks to a document without triggering lazy loading.
Delete old chunks and assign new ones to a document.
ALWAYS use this instead of `document.chunks = chunks` to avoid
SQLAlchemy async errors (MissingGreenlet / greenlet_spawn).
Why this is needed:
- Direct assignment `document.chunks = chunks` triggers SQLAlchemy to
load the OLD chunks first (for comparison/orphan detection)
- This lazy loading fails in async context with asyncpg driver
- set_committed_value bypasses this by setting the value directly
This function is safe regardless of how the document was loaded
(with or without selectinload).
This replaces direct ``document.chunks = chunks``, which triggers lazy
loading (and MissingGreenlet errors in async contexts). It also
explicitly deletes pre-existing chunks so they don't accumulate across
repeated re-indexes, because ``set_committed_value`` bypasses SQLAlchemy's
delete-orphan cascade.
Args:
document: The Document object to update
chunks: List of Chunk objects to assign
Example:
# Instead of: document.chunks = chunks (DANGEROUS!)
safe_set_chunks(document, chunks) # Always safe
session: The current async database session.
document: The Document object to update.
chunks: List of Chunk objects to assign.
"""
from sqlalchemy.orm import object_session
from sqlalchemy import delete
from sqlalchemy.orm.attributes import set_committed_value
# Keep relationship assignment lazy-load-safe.
set_committed_value(document, "chunks", chunks)
from app.db import Chunk
# Ensure chunk rows are actually persisted.
# set_committed_value bypasses normal unit-of-work tracking, so we need to
# explicitly attach chunk objects to the current session.
session = object_session(document)
if session is not None:
if document.id is not None:
for chunk in chunks:
chunk.document_id = document.id
session.add_all(chunks)
if document.id is not None:
await session.execute(delete(Chunk).where(Chunk.document_id == document.id))
for chunk in chunks:
chunk.document_id = document.id
set_committed_value(document, "chunks", chunks)
session.add_all(chunks)
def get_current_timestamp() -> datetime:

View file

@ -227,7 +227,7 @@ async def add_circleback_meeting_document(
if summary_embedding is not None:
document.embedding = summary_embedding
document.document_metadata = document_metadata
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.source_markdown = markdown_content
document.content_needs_reindexing = False
document.updated_at = get_current_timestamp()

View file

@ -21,6 +21,7 @@ from app.utils.document_converters import (
from .base import (
check_document_by_unique_identifier,
get_current_timestamp,
safe_set_chunks,
)
@ -154,7 +155,7 @@ async def add_extension_received_document(
existing_document.content_hash = content_hash
existing_document.embedding = summary_embedding
existing_document.document_metadata = content.metadata.model_dump()
existing_document.chunks = chunks
await safe_set_chunks(session, existing_document, chunks)
existing_document.source_markdown = combined_document_string
existing_document.updated_at = get_current_timestamp()

View file

@ -35,6 +35,7 @@ from .base import (
check_document_by_unique_identifier,
check_duplicate_document,
get_current_timestamp,
safe_set_chunks,
)
from .markdown_processor import add_received_markdown_file_document
@ -488,7 +489,7 @@ async def add_received_file_document_using_unstructured(
"FILE_NAME": file_name,
"ETL_SERVICE": "UNSTRUCTURED",
}
existing_document.chunks = chunks
await safe_set_chunks(session, existing_document, chunks)
existing_document.source_markdown = file_in_markdown
existing_document.content_needs_reindexing = False
existing_document.updated_at = get_current_timestamp()
@ -622,7 +623,7 @@ async def add_received_file_document_using_llamacloud(
"FILE_NAME": file_name,
"ETL_SERVICE": "LLAMACLOUD",
}
existing_document.chunks = chunks
await safe_set_chunks(session, existing_document, chunks)
existing_document.source_markdown = file_in_markdown
existing_document.content_needs_reindexing = False
existing_document.updated_at = get_current_timestamp()
@ -777,7 +778,7 @@ async def add_received_file_document_using_docling(
"FILE_NAME": file_name,
"ETL_SERVICE": "DOCLING",
}
existing_document.chunks = chunks
await safe_set_chunks(session, existing_document, chunks)
existing_document.source_markdown = file_in_markdown
existing_document.content_needs_reindexing = False
existing_document.updated_at = get_current_timestamp()

View file

@ -21,6 +21,7 @@ from .base import (
check_document_by_unique_identifier,
check_duplicate_document,
get_current_timestamp,
safe_set_chunks,
)
@ -258,7 +259,7 @@ async def add_received_markdown_file_document(
existing_document.document_metadata = {
"FILE_NAME": file_name,
}
existing_document.chunks = chunks
await safe_set_chunks(session, existing_document, chunks)
existing_document.source_markdown = file_in_markdown
existing_document.updated_at = get_current_timestamp()
existing_document.status = DocumentStatus.ready() # Mark as ready

View file

@ -419,7 +419,7 @@ async def add_youtube_video_document(
"author": video_data.get("author_name", "Unknown"),
"thumbnail": video_data.get("thumbnail_url", ""),
}
safe_set_chunks(document, chunks)
await safe_set_chunks(session, document, chunks)
document.source_markdown = combined_document_string
document.status = DocumentStatus.ready() # READY status - fully processed
document.updated_at = get_current_timestamp()

View file

@ -9,9 +9,10 @@ import re
from datetime import UTC, datetime
from pathlib import Path
from sqlalchemy import select
from sqlalchemy import delete as sa_delete, select
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload
from sqlalchemy.orm.attributes import set_committed_value
from app.config import config
from app.db import SurfsenseDocsChunk, SurfsenseDocsDocument, async_session_maker
@ -19,6 +20,24 @@ from app.utils.document_converters import embed_text
logger = logging.getLogger(__name__)
async def _safe_set_docs_chunks(
    session: AsyncSession, document: SurfsenseDocsDocument, chunks: list
) -> None:
    """safe_set_chunks variant for the SurfsenseDocsDocument/Chunk models."""
    if document.id is not None:
        await session.execute(
            sa_delete(SurfsenseDocsChunk).where(
                SurfsenseDocsChunk.document_id == document.id
            )
        )
        for chunk in chunks:
            chunk.document_id = document.id

    set_committed_value(document, "chunks", chunks)
    session.add_all(chunks)
# Path to docs relative to project root
DOCS_DIR = (
Path(__file__).resolve().parent.parent.parent.parent
@ -156,7 +175,7 @@ async def index_surfsense_docs(session: AsyncSession) -> tuple[int, int, int, in
existing_doc.content = content
existing_doc.content_hash = content_hash
existing_doc.embedding = embed_text(content)
existing_doc.chunks = chunks
await _safe_set_docs_chunks(session, existing_doc, chunks)
existing_doc.updated_at = datetime.now(UTC)
updated += 1

View file

@ -6,6 +6,9 @@ requires-python = ">=3.12"
dependencies = [
"alembic>=1.13.0",
"asyncpg>=0.30.0",
"authlib>=1.6.9",
"PyJWT>=2.12.0",
"tornado>=6.5.5",
"datasets>=2.21.0",
"pyarrow>=15.0.0,<19.0.0",
"discord-py>=2.5.2",

View file

@ -413,14 +413,14 @@ wheels = [
[[package]]
name = "authlib"
version = "1.6.8"
version = "1.6.9"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "cryptography" },
]
sdist = { url = "https://files.pythonhosted.org/packages/6b/6c/c88eac87468c607f88bc24df1f3b31445ee6fc9ba123b09e666adf687cd9/authlib-1.6.8.tar.gz", hash = "sha256:41ae180a17cf672bc784e4a518e5c82687f1fe1e98b0cafaeda80c8e4ab2d1cb", size = 165074 }
sdist = { url = "https://files.pythonhosted.org/packages/af/98/00d3dd826d46959ad8e32af2dbb2398868fd9fd0683c26e56d0789bd0e68/authlib-1.6.9.tar.gz", hash = "sha256:d8f2421e7e5980cc1ddb4e32d3f5fa659cfaf60d8eaf3281ebed192e4ab74f04", size = 165134 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/9b/73/f7084bf12755113cd535ae586782ff3a6e710bfbe6a0d13d1c2f81ffbbfa/authlib-1.6.8-py2.py3-none-any.whl", hash = "sha256:97286fd7a15e6cfefc32771c8ef9c54f0ed58028f1322de6a2a7c969c3817888", size = 244116 },
{ url = "https://files.pythonhosted.org/packages/53/23/b65f568ed0c22f1efacb744d2db1a33c8068f384b8c9b482b52ebdbc3ef6/authlib-1.6.9-py2.py3-none-any.whl", hash = "sha256:f08b4c14e08f0861dc18a32357b33fbcfd2ea86cfe3fe149484b4d764c4a0ac3", size = 244197 },
]
[[package]]
@ -6353,11 +6353,11 @@ wheels = [
[[package]]
name = "pyjwt"
version = "2.11.0"
version = "2.12.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/5c/5a/b46fa56bf322901eee5b0454a34343cdbdae202cd421775a8ee4e42fd519/pyjwt-2.11.0.tar.gz", hash = "sha256:35f95c1f0fbe5d5ba6e43f00271c275f7a1a4db1dab27bf708073b75318ea623", size = 98019 }
sdist = { url = "https://files.pythonhosted.org/packages/c2/27/a3b6e5bf6ff856d2509292e95c8f57f0df7017cf5394921fc4e4ef40308a/pyjwt-2.12.1.tar.gz", hash = "sha256:c74a7a2adf861c04d002db713dd85f84beb242228e671280bf709d765b03672b", size = 102564 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/6f/01/c26ce75ba460d5cd503da9e13b21a33804d38c2165dec7b716d06b13010c/pyjwt-2.11.0-py3-none-any.whl", hash = "sha256:94a6bde30eb5c8e04fee991062b534071fd1439ef58d2adc9ccb823e7bcd0469", size = 28224 },
{ url = "https://files.pythonhosted.org/packages/e5/7a/8dd906bd22e79e47397a61742927f6747fe93242ef86645ee9092e610244/pyjwt-2.12.1-py3-none-any.whl", hash = "sha256:28ca37c070cad8ba8cd9790cd940535d40274d22f80ab87f3ac6a713e6e8454c", size = 29726 },
]
[package.optional-dependencies]
@ -7854,6 +7854,7 @@ source = { editable = "." }
dependencies = [
{ name = "alembic" },
{ name = "asyncpg" },
{ name = "authlib" },
{ name = "boto3" },
{ name = "celery", extra = ["redis"] },
{ name = "chonkie", extra = ["all"] },
@ -7894,6 +7895,7 @@ dependencies = [
{ name = "playwright" },
{ name = "psycopg", extra = ["binary", "pool"] },
{ name = "pyarrow" },
{ name = "pyjwt" },
{ name = "pypandoc" },
{ name = "pypandoc-binary" },
{ name = "pypdf" },
@ -7909,6 +7911,7 @@ dependencies = [
{ name = "starlette" },
{ name = "static-ffmpeg" },
{ name = "tavily-python" },
{ name = "tornado" },
{ name = "trafilatura" },
{ name = "typst" },
{ name = "unstructured", extra = ["all-docs"] },
@ -7931,6 +7934,7 @@ dev = [
requires-dist = [
{ name = "alembic", specifier = ">=1.13.0" },
{ name = "asyncpg", specifier = ">=0.30.0" },
{ name = "authlib", specifier = ">=1.6.9" },
{ name = "boto3", specifier = ">=1.35.0" },
{ name = "celery", extras = ["redis"], specifier = ">=5.5.3" },
{ name = "chonkie", extras = ["all"], specifier = ">=1.5.0" },
@ -7971,6 +7975,7 @@ requires-dist = [
{ name = "playwright", specifier = ">=1.50.0" },
{ name = "psycopg", extras = ["binary", "pool"], specifier = ">=3.3.2" },
{ name = "pyarrow", specifier = ">=15.0.0,<19.0.0" },
{ name = "pyjwt", specifier = ">=2.12.0" },
{ name = "pypandoc", specifier = ">=1.16.2" },
{ name = "pypandoc-binary", specifier = ">=1.16.2" },
{ name = "pypdf", specifier = ">=5.1.0" },
@ -7986,6 +7991,7 @@ requires-dist = [
{ name = "starlette", specifier = ">=0.40.0,<0.51.0" },
{ name = "static-ffmpeg", specifier = ">=2.13" },
{ name = "tavily-python", specifier = ">=0.3.2" },
{ name = "tornado", specifier = ">=6.5.5" },
{ name = "trafilatura", specifier = ">=2.0.0" },
{ name = "typst", specifier = ">=0.14.0" },
{ name = "unstructured", extras = ["all-docs"], specifier = ">=0.18.31" },
@ -8252,6 +8258,11 @@ dependencies = [
wheels = [
{ url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202 },
{ url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254 },
{ url = "https://files.pythonhosted.org/packages/b3/7a/abada41517ce0011775f0f4eacc79659bc9bc6c361e6bfe6f7052a6b9363/torch-2.10.0-3-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:98c01b8bb5e3240426dcde1446eed6f40c778091c8544767ef1168fc663a05a6", size = 915622781 },
{ url = "https://files.pythonhosted.org/packages/ab/c6/4dfe238342ffdcec5aef1c96c457548762d33c40b45a1ab7033bb26d2ff2/torch-2.10.0-3-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:80b1b5bfe38eb0e9f5ff09f206dcac0a87aadd084230d4a36eea5ec5232c115b", size = 915627275 },
{ url = "https://files.pythonhosted.org/packages/d8/f0/72bf18847f58f877a6a8acf60614b14935e2f156d942483af1ffc081aea0/torch-2.10.0-3-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:46b3574d93a2a8134b3f5475cfb98e2eb46771794c57015f6ad1fb795ec25e49", size = 915523474 },
{ url = "https://files.pythonhosted.org/packages/f4/39/590742415c3030551944edc2ddc273ea1fdfe8ffb2780992e824f1ebee98/torch-2.10.0-3-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:b1d5e2aba4eb7f8e87fbe04f86442887f9167a35f092afe4c237dfcaaef6e328", size = 915632474 },
{ url = "https://files.pythonhosted.org/packages/b6/8e/34949484f764dde5b222b7fe3fede43e4a6f0da9d7f8c370bb617d629ee2/torch-2.10.0-3-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:0228d20b06701c05a8f978357f657817a4a63984b0c90745def81c18aedfa591", size = 915523882 },
{ url = "https://files.pythonhosted.org/packages/cc/af/758e242e9102e9988969b5e621d41f36b8f258bb4a099109b7a4b4b50ea4/torch-2.10.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:5fd4117d89ffd47e3dcc71e71a22efac24828ad781c7e46aaaf56bf7f2796acf", size = 145996088 },
{ url = "https://files.pythonhosted.org/packages/23/8e/3c74db5e53bff7ed9e34c8123e6a8bfef718b2450c35eefab85bb4a7e270/torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:787124e7db3b379d4f1ed54dd12ae7c741c16a4d29b49c0226a89bea50923ffb", size = 915711952 },
{ url = "https://files.pythonhosted.org/packages/6e/01/624c4324ca01f66ae4c7cd1b74eb16fb52596dce66dbe51eff95ef9e7a4c/torch-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:2c66c61f44c5f903046cc696d088e21062644cbe541c7f1c4eaae88b2ad23547", size = 113757972 },
@ -8308,21 +8319,19 @@ wheels = [
[[package]]
name = "tornado"
version = "6.5.4"
version = "6.5.5"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/37/1d/0a336abf618272d53f62ebe274f712e213f5a03c0b2339575430b8362ef2/tornado-6.5.4.tar.gz", hash = "sha256:a22fa9047405d03260b483980635f0b041989d8bcc9a313f8fe18b411d84b1d7", size = 513632 }
sdist = { url = "https://files.pythonhosted.org/packages/f8/f1/3173dfa4a18db4a9b03e5d55325559dab51ee653763bb8745a75af491286/tornado-6.5.5.tar.gz", hash = "sha256:192b8f3ea91bd7f1f50c06955416ed76c6b72f96779b962f07f911b91e8d30e9", size = 516006 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/ab/a9/e94a9d5224107d7ce3cc1fab8d5dc97f5ea351ccc6322ee4fb661da94e35/tornado-6.5.4-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:d6241c1a16b1c9e4cc28148b1cda97dd1c6cb4fb7068ac1bedc610768dff0ba9", size = 443909 },
{ url = "https://files.pythonhosted.org/packages/db/7e/f7b8d8c4453f305a51f80dbb49014257bb7d28ccb4bbb8dd328ea995ecad/tornado-6.5.4-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:2d50f63dda1d2cac3ae1fa23d254e16b5e38153758470e9956cbc3d813d40843", size = 442163 },
{ url = "https://files.pythonhosted.org/packages/ba/b5/206f82d51e1bfa940ba366a8d2f83904b15942c45a78dd978b599870ab44/tornado-6.5.4-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d1cf66105dc6acb5af613c054955b8137e34a03698aa53272dbda4afe252be17", size = 445746 },
{ url = "https://files.pythonhosted.org/packages/8e/9d/1a3338e0bd30ada6ad4356c13a0a6c35fbc859063fa7eddb309183364ac1/tornado-6.5.4-cp39-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:50ff0a58b0dc97939d29da29cd624da010e7f804746621c78d14b80238669335", size = 445083 },
{ url = "https://files.pythonhosted.org/packages/50/d4/e51d52047e7eb9a582da59f32125d17c0482d065afd5d3bc435ff2120dc5/tornado-6.5.4-cp39-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e5fb5e04efa54cf0baabdd10061eb4148e0be137166146fff835745f59ab9f7f", size = 445315 },
{ url = "https://files.pythonhosted.org/packages/27/07/2273972f69ca63dbc139694a3fc4684edec3ea3f9efabf77ed32483b875c/tornado-6.5.4-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9c86b1643b33a4cd415f8d0fe53045f913bf07b4a3ef646b735a6a86047dda84", size = 446003 },
{ url = "https://files.pythonhosted.org/packages/d1/83/41c52e47502bf7260044413b6770d1a48dda2f0246f95ee1384a3cd9c44a/tornado-6.5.4-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:6eb82872335a53dd063a4f10917b3efd28270b56a33db69009606a0312660a6f", size = 445412 },
{ url = "https://files.pythonhosted.org/packages/10/c7/bc96917f06cbee182d44735d4ecde9c432e25b84f4c2086143013e7b9e52/tornado-6.5.4-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:6076d5dda368c9328ff41ab5d9dd3608e695e8225d1cd0fd1e006f05da3635a8", size = 445392 },
{ url = "https://files.pythonhosted.org/packages/0c/1a/d7592328d037d36f2d2462f4bc1fbb383eec9278bc786c1b111cbbd44cfa/tornado-6.5.4-cp39-abi3-win32.whl", hash = "sha256:1768110f2411d5cd281bac0a090f707223ce77fd110424361092859e089b38d1", size = 446481 },
{ url = "https://files.pythonhosted.org/packages/d6/6d/c69be695a0a64fd37a97db12355a035a6d90f79067a3cf936ec2b1dc38cd/tornado-6.5.4-cp39-abi3-win_amd64.whl", hash = "sha256:fa07d31e0cd85c60713f2b995da613588aa03e1303d75705dca6af8babc18ddc", size = 446886 },
{ url = "https://files.pythonhosted.org/packages/50/49/8dc3fd90902f70084bd2cd059d576ddb4f8bb44c2c7c0e33a11422acb17e/tornado-6.5.4-cp39-abi3-win_arm64.whl", hash = "sha256:053e6e16701eb6cbe641f308f4c1a9541f91b6261991160391bfc342e8a551a1", size = 445910 },
{ url = "https://files.pythonhosted.org/packages/59/8c/77f5097695f4dd8255ecbd08b2a1ed8ba8b953d337804dd7080f199e12bf/tornado-6.5.5-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:487dc9cc380e29f58c7ab88f9e27cdeef04b2140862e5076a66fb6bb68bb1bfa", size = 445983 },
{ url = "https://files.pythonhosted.org/packages/ab/5e/7625b76cd10f98f1516c36ce0346de62061156352353ef2da44e5c21523c/tornado-6.5.5-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:65a7f1d46d4bb41df1ac99f5fcb685fb25c7e61613742d5108b010975a9a6521", size = 444246 },
{ url = "https://files.pythonhosted.org/packages/b2/04/7b5705d5b3c0fab088f434f9c83edac1573830ca49ccf29fb83bf7178eec/tornado-6.5.5-cp39-abi3-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:e74c92e8e65086b338fd56333fb9a68b9f6f2fe7ad532645a290a464bcf46be5", size = 447229 },
{ url = "https://files.pythonhosted.org/packages/34/01/74e034a30ef59afb4097ef8659515e96a39d910b712a89af76f5e4e1f93c/tornado-6.5.5-cp39-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:435319e9e340276428bbdb4e7fa732c2d399386d1de5686cb331ec8eee754f07", size = 448192 },
{ url = "https://files.pythonhosted.org/packages/be/00/fe9e02c5a96429fce1a1d15a517f5d8444f9c412e0bb9eadfbe3b0fc55bf/tornado-6.5.5-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:3f54aa540bdbfee7b9eb268ead60e7d199de5021facd276819c193c0fb28ea4e", size = 448039 },
{ url = "https://files.pythonhosted.org/packages/82/9e/656ee4cec0398b1d18d0f1eb6372c41c6b889722641d84948351ae19556d/tornado-6.5.5-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:36abed1754faeb80fbd6e64db2758091e1320f6bba74a4cf8c09cd18ccce8aca", size = 447445 },
{ url = "https://files.pythonhosted.org/packages/5a/76/4921c00511f88af86a33de770d64141170f1cfd9c00311aea689949e274e/tornado-6.5.5-cp39-abi3-win32.whl", hash = "sha256:dd3eafaaeec1c7f2f8fdcd5f964e8907ad788fe8a5a32c4426fbbdda621223b7", size = 448582 },
{ url = "https://files.pythonhosted.org/packages/2c/23/f6c6112a04d28eed765e374435fb1a9198f73e1ec4b4024184f21faeb1ad/tornado-6.5.5-cp39-abi3-win_amd64.whl", hash = "sha256:6443a794ba961a9f619b1ae926a2e900ac20c34483eea67be4ed8f1e58d3ef7b", size = 448990 },
{ url = "https://files.pythonhosted.org/packages/b7/c8/876602cbc96469911f0939f703453c1157b0c826ecb05bdd32e023397d4e/tornado-6.5.5-cp39-abi3-win_arm64.whl", hash = "sha256:2c9a876e094109333f888539ddb2de4361743e5d21eece20688e3e351e4990a6", size = 448016 },
]
[[package]]

surfsense_desktop/.env Normal file

@ -0,0 +1,6 @@
# Electron-specific build-time configuration.
# Set before running pnpm dist:mac / dist:win / dist:linux.
# The hosted web frontend URL. Used to intercept OAuth redirects and keep them
# inside the desktop app. Set to your production frontend domain.
HOSTED_FRONTEND_URL=https://surfsense.net

surfsense_desktop/.gitignore vendored Normal file

@ -0,0 +1,3 @@
node_modules/
dist/
release/


@ -0,0 +1,58 @@
# SurfSense Desktop
Electron wrapper around the SurfSense web app. Packages the Next.js standalone build into a native desktop application with OAuth support, deep linking, and system browser integration.
## Prerequisites
- Node.js 18+
- pnpm 10+
- The `surfsense_web` project dependencies installed (`pnpm install` in `surfsense_web/`)
## Development
```bash
pnpm install
pnpm dev
```
This starts the Next.js dev server and Electron concurrently. Hot reload works — edit the web app and changes appear immediately.
## Configuration
Two `.env` files control the build:
**`surfsense_web/.env`**: Next.js environment variables, baked into the frontend at build time.
**`surfsense_desktop/.env`**: Electron-specific configuration, such as `HOSTED_FRONTEND_URL`.
Set these before building.
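
As a rough illustration of how a value like `HOSTED_FRONTEND_URL` could be used to keep OAuth redirects inside the desktop app, the sketch below checks whether a navigation target belongs to the hosted frontend and rewrites it onto a local origin. The helper names (`isHostedFrontendUrl`, `toLocalUrl`) and the local origin `http://localhost:3000` are assumptions made for this example, not code from the SurfSense source.

```typescript
// Hypothetical helpers for illustration; not taken from the SurfSense source.
const HOSTED_FRONTEND_URL = "https://surfsense.net"; // value from surfsense_desktop/.env

/** Returns true when `target` has the same origin as the hosted frontend. */
function isHostedFrontendUrl(
  target: string,
  hosted: string = HOSTED_FRONTEND_URL,
): boolean {
  try {
    return new URL(target).origin === new URL(hosted).origin;
  } catch {
    return false; // `target` (or `hosted`) is not a parseable absolute URL
  }
}

/** Rewrites a hosted-frontend URL onto a local origin, keeping path, query and hash. */
function toLocalUrl(target: string, localOrigin: string): string {
  const u = new URL(target);
  return new URL(u.pathname + u.search + u.hash, localOrigin).toString();
}

// Example: an OAuth callback aimed at the hosted site is redirected to the local server.
const callback = "https://surfsense.net/auth/callback?token=abc";
if (isHostedFrontendUrl(callback)) {
  console.log(toLocalUrl(callback, "http://localhost:3000"));
}
```

In an Electron main process, a check like this could plausibly be called from a `will-navigate` or `will-redirect` handler on the window's `webContents`. How the actual app wires it up is not shown in this diff.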
## Build & Package
**Step 1** — Build the Next.js standalone output:
```bash
cd ../surfsense_web
pnpm build
```
**Step 2** — Compile Electron and prepare the standalone output:
```bash
cd ../surfsense_desktop
pnpm build
```
**Step 3** — Package into a distributable:
```bash
pnpm dist:mac # macOS (.dmg + .zip)
pnpm dist:win # Windows (.exe)
pnpm dist:linux # Linux (.deb + .AppImage)
```
**Step 4** — Find the output:
```bash
ls release/
```
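
The macOS and Linux sections of `electron-builder.yml` set `artifactName` to `${productName}-${version}-${arch}.${ext}`. As a simplified sketch of how such a template maps to a file name in `release/`, the hypothetical helper below substitutes the placeholders. electron-builder's own expansion has more rules (for example, per-format architecture names), so treat the output as an approximation.

```typescript
// Hypothetical helper for illustration; electron-builder uses its own expansion logic.
function expandArtifactName(
  template: string,
  vars: Record<string, string>,
): string {
  // Replace each ${key} with vars[key]; leave unknown placeholders untouched.
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : match,
  );
}

// productName and version as given in electron-builder.yml and package.json.
const macDmg = expandArtifactName("${productName}-${version}-${arch}.${ext}", {
  productName: "SurfSense",
  version: "0.1.0",
  arch: "arm64",
  ext: "dmg",
});
console.log(macDmg); // SurfSense-0.1.0-arm64.dmg
```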

Binary files not shown: three files, two with image previews (151 KiB and 6.3 MiB).


@ -0,0 +1,67 @@
appId: com.surfsense.desktop
productName: SurfSense
publish:
  provider: github
  owner: MODSetter
  repo: SurfSense
directories:
  output: release
files:
  - dist/**/*
  - "!node_modules"
  - "!src"
  - "!scripts"
  - "!release"
extraResources:
  - from: ../surfsense_web/.next/standalone/surfsense_web/
    to: standalone/
    filter:
      - "**/*"
      - "!**/node_modules"
  - from: ../surfsense_web/.next/standalone/surfsense_web/node_modules/
    to: standalone/node_modules/
    filter: ["**/*"]
  - from: ../surfsense_web/.next/static/
    to: standalone/.next/static/
    filter: ["**/*"]
  - from: ../surfsense_web/public/
    to: standalone/public/
    filter: ["**/*"]
asarUnpack:
  - "**/*.node"
mac:
  icon: assets/icon.icns
  category: public.app-category.productivity
  artifactName: "${productName}-${version}-${arch}.${ext}"
  hardenedRuntime: true
  gatekeeperAssess: false
  target:
    - target: dmg
      arch: [x64, arm64]
    - target: zip
      arch: [x64, arm64]
win:
  icon: assets/icon.ico
  target:
    - target: nsis
      arch: [x64, arm64]
nsis:
  oneClick: false
  perMachine: false
  allowToChangeInstallationDirectory: true
  createDesktopShortcut: true
  createStartMenuShortcut: true
linux:
  icon: assets/icon.png
  category: Utility
  artifactName: "${productName}-${version}-${arch}.${ext}"
  mimeTypes:
    - x-scheme-handler/surfsense
  desktop:
    entry:
      Name: SurfSense
      Comment: AI-powered research assistant
      Categories: Utility;Office;
  target:
    - deb
    - AppImage


@ -0,0 +1,33 @@
{
  "name": "surfsense-desktop",
  "version": "0.1.0",
  "description": "SurfSense Desktop App",
  "main": "dist/main.js",
  "scripts": {
    "dev": "concurrently -k \"pnpm --dir ../surfsense_web dev\" \"wait-on http://localhost:3000 && electron .\"",
    "build": "node scripts/build-electron.mjs",
    "pack:dir": "pnpm build && electron-builder --dir --config electron-builder.yml",
    "dist": "pnpm build && electron-builder --config electron-builder.yml",
    "dist:mac": "pnpm build && electron-builder --mac --config electron-builder.yml",
    "dist:win": "pnpm build && electron-builder --win --config electron-builder.yml",
    "dist:linux": "pnpm build && electron-builder --linux --config electron-builder.yml",
    "typecheck": "tsc --noEmit"
  },
  "author": "MODSetter",
  "license": "MIT",
  "packageManager": "pnpm@10.24.0",
  "devDependencies": {
    "@types/node": "^25.5.0",
    "concurrently": "^9.2.1",
    "dotenv": "^17.3.1",
    "electron": "^41.0.2",
    "electron-builder": "^26.8.1",
    "esbuild": "^0.27.4",
    "typescript": "^5.9.3",
    "wait-on": "^9.0.4"
  },
  "dependencies": {
    "electron-updater": "^6.8.3",
    "get-port-please": "^3.2.0"
  }
}

Some files were not shown because too many files have changed in this diff.