diff --git a/.cursor/skills/system-architecture/SKILL.md b/.cursor/skills/system-architecture/SKILL.md
new file mode 100755
index 000000000..70683a7ac
--- /dev/null
+++ b/.cursor/skills/system-architecture/SKILL.md
@@ -0,0 +1,136 @@
+---
+name: system-architecture
+description: Design systems with appropriate complexity - no more, no less. Use when the user asks to architect applications, design system boundaries, plan service decomposition, evaluate monolith vs microservices, make scaling decisions, or review structural trade-offs. Applies to new system design, refactoring, and migration planning.
+---
+
+# System Architecture
+
+Design real structures with clear boundaries, explicit trade-offs, and appropriate complexity. Match architecture to actual requirements, not imagined future needs.
+
+## Workflow
+
+When the user requests an architecture, follow these steps:
+
+```
+Task Progress:
+- [ ] Step 1: Clarify constraints
+- [ ] Step 2: Identify domains
+- [ ] Step 3: Map data flow
+- [ ] Step 4: Draw boundaries with rationale
+- [ ] Step 5: Run complexity checklist
+- [ ] Step 6: Present architecture with trade-offs
+```
+
+**Step 1 - Clarify constraints.** Ask about:
+
+| Constraint | Question | Why it matters |
+|------------|----------|----------------|
+| Scale | What's the real load? (users, requests/sec, data size) | Design for 10x current, not 1000x |
+| Team | How many developers? How many teams? | Deployable units ≤ number of teams |
+| Lifespan | Prototype? MVP? Long-term product? | Temporary systems need temporary solutions |
+| Change vectors | What actually varies? | Abstract only where you have evidence of variation |
+
+**Step 2 - Identify domains.** Group by business capability, not technical layer. Look for things that change for different reasons and at different rates.
+
+**Step 3 - Map data flow.** Trace: where does data enter → how does it transform → where does it exit? Make the flow obvious.
+
+**Step 4 - Draw boundaries.** Every boundary needs a reason: different team, different change rate, different compliance requirement, or different scaling need.
+
+**Step 5 - Run complexity checklist.** Before adding any non-trivial pattern:
+
+```
+[ ] Have I tried the simple solution?
+[ ] Do I have evidence it's insufficient?
+[ ] Can my team operate this?
+[ ] Will this still make sense in 6 months?
+[ ] Can I explain why this complexity is necessary?
+```
+
+If any answer is "no", keep it simple.
+
+**Step 6 - Present the architecture** using the output template below.
+
+## Output Template
+
+```markdown
+### System: [Name]
+
+**Constraints**:
+- Scale: [current and expected load]
+- Team: [size and structure]
+- Lifespan: [prototype / MVP / long-term]
+
+**Architecture**:
+[Component diagram or description of components and their relationships]
+
+**Data Flow**:
+[How data enters → transforms → exits]
+
+**Key Boundaries**:
+| Boundary | Reason | Change Rate |
+|----------|--------|-------------|
+| ... | ... | ... |
+
+**Trade-offs**:
+- Chose X over Y because [reason]
+- Accepted [limitation] to gain [benefit]
+
+**Complexity Justification**:
+- [Each non-trivial pattern] → [why it's needed, with evidence]
+```
+
+## Core Principles
+
+1. **Boundaries at real differences.** Separate concerns that change for different reasons and at different rates.
+2. **Dependencies flow inward.** Core logic depends on nothing. Infrastructure depends on core.
+3. **Follow the data.** Architecture should make data flow obvious.
+4. **Design for failure.** Networks fail. Databases time out. Build compensation into the structure.
+5. **Design for operations.** You will debug this at 3am. Every request needs a trace. Every error needs context for replay.
+
+For concrete good/bad examples of each principle, see [examples.md](examples.md).
+
+## Anti-Patterns
+
+| Don't | Do Instead |
+|-------|------------|
+| Microservices for a 3-person team | Well-structured monolith |
+| Event sourcing for CRUD | Simple state storage |
+| Message queues within the same process | Just call the function |
+| Distributed transactions | Redesign to avoid, or accept eventual consistency |
+| Repository wrapping an ORM | Use the ORM directly |
+| Interfaces with one implementation | Mock at boundaries only |
+| AbstractFactoryFactoryBean | Just instantiate the thing |
+| DI containers for simple graphs | Constructor injection is enough |
+| Clean Architecture for a TODO app | Match layers to actual complexity |
+| DDD tactics without strategic design | Aggregates need bounded contexts |
+| Hexagonal ports with one adapter | Just call the database |
+| CQRS when reads = writes | Add when they diverge |
+| "We might swap databases" | You won't; rewrite if you do |
+| "Multi-tenant someday" | Build it when you have tenant #2 |
+| "Microservices for team scale" | Helps at 50+ engineers, not 4 |
+
+## Success Criteria
+
+Your architecture is right-sized when:
+
+1. **You can draw it** - dependency graph fits on a whiteboard
+2. **You can explain it** - new team member understands data flow in 30 minutes
+3. **You can change it** - adding a feature touches 1-3 modules, not 10
+4. **You can delete it** - removing a component needs no archaeology
+5. **You can debug it** - tracing a request takes minutes, not hours
+6. **It matches your team** - deployable units ≤ number of teams
+
+## When the Simple Solution Isn't Enough
+
+If the complexity checklist says "yes, scale is real", see [scaling-checklist.md](scaling-checklist.md) for concrete techniques covering caching, async processing, partitioning, horizontal scaling, and multi-region.
+
+## Iterative Architecture
+
+Architecture is discovered, not designed upfront:
+
+1. **Start obvious** - group by domain, not by technical layer
+2. **Let hotspots emerge** - monitor which modules change together
+3. **Extract when painful** - split only when the current form causes measurable problems
+4. **Document decisions** - record why boundaries exist so future you knows what's load-bearing
+
+Every senior engineer has a graveyard of over-engineered systems they regret. Learn from their pain. Build boring systems that work.
diff --git a/.cursor/skills/system-architecture/examples.md b/.cursor/skills/system-architecture/examples.md
new file mode 100644
index 000000000..fa72f92ce
--- /dev/null
+++ b/.cursor/skills/system-architecture/examples.md
@@ -0,0 +1,120 @@
+# Architecture Examples
+
+Concrete good/bad examples for each core principle in SKILL.md.
+
+---
+
+## Boundaries at Real Differences
+
+**Good** - Meaningful boundary:
+```
+# Users and Billing are separate bounded contexts
+# - Different teams own them
+# - Different change cadences (users: weekly, billing: quarterly)
+# - Different compliance requirements
+
+src/
+  users/           # User management domain
+    models.py
+    services.py
+    api.py
+  billing/         # Billing domain
+    models.py
+    services.py
+    api.py
+  shared/          # Truly shared utilities
+    auth.py
+```
+
+**Bad** - Ceremony without purpose:
+```
+# UserService → UserRepository → UserRepositoryImpl
+# ...when you'll never swap the database
+
+src/
+  interfaces/
+    IUserRepository.py      # One implementation exists
+  repositories/
+    UserRepositoryImpl.py   # Wraps SQLAlchemy, which is already a repository
+  services/
+    UserService.py          # Just calls the repository
+```
+
+---
+
+## Dependencies Flow Inward
+
+**Good** - Clear dependency direction:
+```
+# Dependency flows inward: infrastructure → application → domain
+
+domain/              # Pure business logic, no imports from outer layers
+  order.py           # Order entity with business rules
+
+application/         # Use cases, orchestrates domain
+  place_order.py     # Imports from domain/, not infrastructure/
+
+infrastructure/      # External concerns
+  postgres.py        # Implements persistence, imports from application/
+  stripe.py          # Implements payments
+```
+
+---
+
+## Follow the Data
+
+**Good** - Obvious data flow:
+```
+Request → Validate → Transform → Store → Respond
+
+# Each step is a clear function/module:
+api/routes.py # Request enters
+validators.py # Validation
+transformers.py # Business logic transformation
+repositories.py # Storage
+serializers.py # Response shaping
+```
+
+---
+
+## Design for Failure
+
+**Good** - Failure-aware design with compensation:
+```python
+class OrderService:
+    def place_order(self, order: Order) -> Result:
+        inventory = self.inventory.reserve(order.items)
+        if inventory.failed:
+            return Result.failure("Items unavailable", retry=False)
+
+        payment = self.payments.charge(order.total)
+        if payment.failed:
+            self.inventory.release(inventory.reservation_id)  # Compensate
+            return Result.failure("Payment failed", retry=True)
+
+        return Result.success(order)
+```
+
+---
+
+## Design for Operations
+
+**Good** - Observable architecture:
+```python
+@trace
+def handle_request(request):
+    log.info("Processing", request_id=request.id, user=request.user_id)
+    try:
+        result = process(request)
+        log.info("Completed", request_id=request.id, result=result.status)
+        return result
+    except Exception as e:
+        log.error("Failed", request_id=request.id, error=str(e),
+                  context=request.to_dict())  # Full context for replay
+        raise
+```
+
+Key elements:
+- Every request gets a correlation ID
+- Every service logs with that ID
+- Every error includes full context for reproduction
diff --git a/.cursor/skills/system-architecture/scaling-checklist.md b/.cursor/skills/system-architecture/scaling-checklist.md
new file mode 100644
index 000000000..d9cfdce43
--- /dev/null
+++ b/.cursor/skills/system-architecture/scaling-checklist.md
@@ -0,0 +1,76 @@
+# Scaling Checklist
+
+Concrete techniques for when the complexity checklist in SKILL.md confirms scale is a real problem. Apply in order - each level solves the previous level's bottleneck.
+
+---
+
+## Level 0: Optimize First
+
+Before adding infrastructure, exhaust these:
+
+- [ ] Database queries have proper indexes
+- [ ] N+1 queries eliminated
+- [ ] Connection pooling configured
+- [ ] Slow endpoints profiled and optimized
+- [ ] Static assets served via CDN
+
+## Level 1: Read-Heavy
+
+**Symptom**: Database reads are the bottleneck.
+
+| Technique | When | Trade-off |
+|-----------|------|-----------|
+| Application cache (in-memory) | Small, frequently accessed data | Stale data, memory pressure |
+| Redis/Memcached | Shared cache across instances | Network hop, cache invalidation complexity |
+| Read replicas | High read volume, slight staleness OK | Replication lag, eventual consistency |
+| CDN | Static or semi-static content | Cache invalidation delay |
+
+## Level 2: Write-Heavy
+
+**Symptom**: Database writes or processing are the bottleneck.
+
+| Technique | When | Trade-off |
+|-----------|------|-----------|
+| Async task queue (Celery, SQS) | Work can be deferred | Eventual consistency, failure handling |
+| Write-behind cache | Batch frequent writes | Data loss risk on crash |
+| Event streaming (Kafka) | Multiple consumers of same data | Operational complexity, ordering guarantees |
+| CQRS | Reads and writes have diverged significantly | Two models to maintain |
+
+## Level 3: Traffic Spikes
+
+**Symptom**: Individual instances can't handle peak load.
+
+| Technique | When | Trade-off |
+|-----------|------|-----------|
+| Horizontal scaling + load balancer | Stateless services | Session management, deploy complexity |
+| Auto-scaling | Unpredictable traffic patterns | Cold start latency, cost spikes |
+| Rate limiting | Protect against abuse/spikes | Legitimate users may be throttled |
+| Circuit breakers | Downstream services degrade | Partial functionality during failures |
+
+## Level 4: Data Growth
+
+**Symptom**: Single database can't hold or query all the data efficiently.
+
+| Technique | When | Trade-off |
+|-----------|------|-----------|
+| Table partitioning | Time-series or naturally partitioned data | Query complexity, partition management |
+| Archival / cold storage | Old data rarely accessed | Access latency for archived data |
+| Database sharding | Partitioning insufficient, clear shard key exists | Cross-shard queries, operational burden |
+| Search index (Elasticsearch) | Full-text or complex queries on large datasets | Index lag, another system to operate |
+
+## Level 5: Multi-Region
+
+**Symptom**: Users are geographically distributed, latency matters.
+
+| Technique | When | Trade-off |
+|-----------|------|-----------|
+| CDN + edge caching | Static/semi-static content | Cache invalidation |
+| Read replicas per region | Read-heavy, slight staleness OK | Replication lag |
+| Active-passive failover | Disaster recovery | Failover time, cost of standby |
+| Active-active multi-region | True global low-latency required | Conflict resolution, extreme complexity |
+
+---
+
+## Decision Rule
+
+Always start at Level 0. Move to the next level only when you have **measured evidence** that the current level is insufficient. Skipping levels is how you end up with Kafka for a TODO app.
diff --git a/.github/workflows/desktop-release.yml b/.github/workflows/desktop-release.yml
new file mode 100644
index 000000000..7119fcb6d
--- /dev/null
+++ b/.github/workflows/desktop-release.yml
@@ -0,0 +1,78 @@
+name: Desktop Release
+
+on:
+  push:
+    tags:
+      - 'v*'
+      - 'beta-v*'
+
+permissions:
+  contents: write
+
+jobs:
+  build:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - os: macos-latest
+            platform: --mac
+          - os: ubuntu-latest
+            platform: --linux
+          - os: windows-latest
+            platform: --win
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Extract version from tag
+        id: version
+        shell: bash
+        run: |
+          TAG=${GITHUB_REF#refs/tags/}
+          VERSION=${TAG#beta-}
+          VERSION=${VERSION#v}
+          echo "VERSION=$VERSION" >> "$GITHUB_OUTPUT"
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: 20
+          cache: 'pnpm'
+          cache-dependency-path: |
+            surfsense_web/pnpm-lock.yaml
+            surfsense_desktop/pnpm-lock.yaml
+
+      - name: Install web dependencies
+        run: pnpm install
+        working-directory: surfsense_web
+
+      - name: Build Next.js standalone
+        run: pnpm build
+        working-directory: surfsense_web
+        env:
+          NEXT_PUBLIC_FASTAPI_BACKEND_URL: ${{ vars.NEXT_PUBLIC_FASTAPI_BACKEND_URL }}
+          NEXT_PUBLIC_ELECTRIC_URL: ${{ vars.NEXT_PUBLIC_ELECTRIC_URL }}
+          NEXT_PUBLIC_DEPLOYMENT_MODE: ${{ vars.NEXT_PUBLIC_DEPLOYMENT_MODE }}
+          NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE: ${{ vars.NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE }}
+
+      - name: Install desktop dependencies
+        run: pnpm install
+        working-directory: surfsense_desktop
+
+      - name: Build Electron
+        run: pnpm build
+        working-directory: surfsense_desktop
+        env:
+          HOSTED_FRONTEND_URL: ${{ vars.HOSTED_FRONTEND_URL }}
+
+      - name: Package & Publish
+        run: pnpm exec electron-builder ${{ matrix.platform }} --config electron-builder.yml --publish always -c.extraMetadata.version=${{ steps.version.outputs.VERSION }}
+        working-directory: surfsense_desktop
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
diff --git a/.gitignore b/.gitignore
index 559918a61..a5c44ce73 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,4 +5,4 @@ node_modules/
.ruff_cache/
.venv
.pnpm-store
-.DS_Store
+.DS_Store
\ No newline at end of file
diff --git a/.vscode/launch.json b/.vscode/launch.json
index 2c4784c0e..029e7c647 100644
--- a/.vscode/launch.json
+++ b/.vscode/launch.json
@@ -22,7 +22,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
- "python": "${command:python.interpreterPath}"
+ "python": "uv",
+ "pythonArgs": [
+ "run",
+ "python"
+ ]
},
{
"name": "Backend: FastAPI (No Reload)",
@@ -32,7 +36,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
- "python": "${command:python.interpreterPath}"
+ "python": "uv",
+ "pythonArgs": [
+ "run",
+ "python"
+ ]
},
{
"name": "Backend: FastAPI (main.py)",
@@ -41,14 +49,19 @@
"program": "${workspaceFolder}/surfsense_backend/main.py",
"console": "integratedTerminal",
"justMyCode": false,
- "cwd": "${workspaceFolder}/surfsense_backend"
+ "cwd": "${workspaceFolder}/surfsense_backend",
+ "python": "uv",
+ "pythonArgs": [
+ "run",
+ "python"
+ ]
},
{
"name": "Frontend: Next.js",
"type": "node",
"request": "launch",
"cwd": "${workspaceFolder}/surfsense_web",
- "runtimeExecutable": "npm",
+ "runtimeExecutable": "pnpm",
"runtimeArgs": ["run", "dev"],
"console": "integratedTerminal",
"serverReadyAction": {
@@ -62,7 +75,7 @@
"type": "node",
"request": "launch",
"cwd": "${workspaceFolder}/surfsense_web",
- "runtimeExecutable": "npm",
+ "runtimeExecutable": "pnpm",
"runtimeArgs": ["run", "debug:server"],
"console": "integratedTerminal",
"serverReadyAction": {
@@ -87,7 +100,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
- "python": "${command:python.interpreterPath}"
+ "python": "uv",
+ "pythonArgs": [
+ "run",
+ "python"
+ ]
},
{
"name": "Celery: Beat Scheduler",
@@ -103,7 +120,11 @@
"console": "integratedTerminal",
"justMyCode": false,
"cwd": "${workspaceFolder}/surfsense_backend",
- "python": "${command:python.interpreterPath}"
+ "python": "uv",
+ "pythonArgs": [
+ "run",
+ "python"
+ ]
}
],
"compounds": [
diff --git a/.vscode/settings.json b/.vscode/settings.json
index f134660b6..05bd30702 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -1,3 +1,4 @@
{
- "biome.configurationPath": "./surfsense_web/biome.json"
+ "biome.configurationPath": "./surfsense_web/biome.json",
+ "deepscan.ignoreConfirmWarning": true
}
\ No newline at end of file
diff --git a/README.es.md b/README.es.md
index a1f5b80d8..e5bc9be7e 100644
--- a/README.es.md
+++ b/README.es.md
@@ -27,11 +27,18 @@ SurfSense es un agente de investigación de IA altamente personalizable, conecta
-# Video
+# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
-## Ejemplo de Podcast
+## Ejemplo de Agente de Video
+
+
+https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
+
+
+
+## Ejemplo de Agente de Podcast
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Conecta tus conectores y sincroniza. Activa la sincronización periódica para mantenerlos actualizados.
-
+
3. Mientras se indexan los datos de los conectores, sube documentos.
-
+
4. Una vez que todo esté indexado, pregunta lo que quieras (Casos de uso):
+ - Generación de videos
+
+
+
- Búsqueda básica y citaciones
- QNA con mención de documentos
+
- Generación de informes y exportaciones (PDF, DOCX, HTML, LaTeX, EPUB, ODT, texto plano)
@@ -133,6 +145,8 @@ Para Docker Compose, instalación manual y otras opciones de despliegue, consult
| Soporte Universal de LLM | 100+ LLMs, 6000+ modelos de embeddings, todos los principales rerankers vía OpenAI spec y LiteLLM |
| Privacidad Primero | Soporte completo de LLM local (vLLM, Ollama) tus datos son tuyos |
| Colaboración en Equipo | RBAC con roles de Propietario / Admin / Editor / Visor, chat en tiempo real e hilos de comentarios |
+| Generación de Videos | Genera videos con narración y visuales |
+| Generación de Presentaciones | Crea presentaciones editables basadas en diapositivas |
| Generación de Podcasts | Podcast de 3 min en menos de 20 segundos; múltiples proveedores TTS (OpenAI, Azure, Kokoro) |
| Extensión de Navegador | Extensión multi-navegador para guardar cualquier página web, incluyendo páginas protegidas por autenticación |
| 25+ Conectores | Motores de búsqueda, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord y [más](#fuentes-externas) |
diff --git a/README.hi.md b/README.hi.md
index 7a4822e68..2966ef4a3 100644
--- a/README.hi.md
+++ b/README.hi.md
@@ -27,11 +27,18 @@ SurfSense एक अत्यधिक अनुकूलन योग्य AI
-# वीडियो
+# डेमो
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
-## पॉडकास्ट नमूना
+## वीडियो एजेंट नमूना
+
+
+https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
+
+
+
+## पॉडकास्ट एजेंट नमूना
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. अपने कनेक्टर जोड़ें और सिंक करें। कनेक्टर्स को अपडेट रखने के लिए आवधिक सिंकिंग सक्षम करें।
-
+
3. जब तक कनेक्टर्स का डेटा इंडेक्स हो रहा है, दस्तावेज़ अपलोड करें।
-
+
4. सब कुछ इंडेक्स हो जाने के बाद, कुछ भी पूछें (उपयोग के मामले):
+ - वीडियो जनरेशन
+
+
+
- बेसिक सर्च और उद्धरण
- दस्तावेज़ मेंशन QNA
+
- रिपोर्ट जनरेशन और एक्सपोर्ट (PDF, DOCX, HTML, LaTeX, EPUB, ODT, सादा टेक्स्ट)
@@ -133,6 +145,8 @@ Docker Compose, मैनुअल इंस्टॉलेशन और अन
| यूनिवर्सल LLM सपोर्ट | 100+ LLMs, 6000+ एम्बेडिंग मॉडल, सभी प्रमुख रीरैंकर्स OpenAI spec और LiteLLM के माध्यम से |
| प्राइवेसी फर्स्ट | पूर्ण लोकल LLM सपोर्ट (vLLM, Ollama) आपका डेटा आपका रहता है |
| टीम सहयोग | मालिक / एडमिन / संपादक / दर्शक भूमिकाओं के साथ RBAC, रीयल-टाइम चैट और कमेंट थ्रेड |
+| वीडियो जनरेशन | नैरेशन और विज़ुअल के साथ वीडियो बनाएं |
+| प्रेजेंटेशन जनरेशन | संपादन योग्य, स्लाइड आधारित प्रेजेंटेशन बनाएं |
| पॉडकास्ट जनरेशन | 20 सेकंड से कम में 3 मिनट का पॉडकास्ट; कई TTS प्रदाता (OpenAI, Azure, Kokoro) |
| ब्राउज़र एक्सटेंशन | किसी भी वेबपेज को सहेजने के लिए क्रॉस-ब्राउज़र एक्सटेंशन, प्रमाणीकरण सुरक्षित पेज सहित |
| 25+ कनेक्टर्स | सर्च इंजन, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord और [अधिक](#बाहरी-स्रोत) |
diff --git a/README.md b/README.md
index f37664dd7..7ad66e0d9 100644
--- a/README.md
+++ b/README.md
@@ -27,11 +27,18 @@ SurfSense is a highly customizable AI research agent, connected to external sour
-# Video
+# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
-## Podcast Sample
+## Video Agent Sample
+
+
+https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
+
+
+
+## Podcast Agent Sample
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Connect your connectors and sync. Enable periodic syncing to keep connectors synced.
-
+
3. Till connectors data index, upload Documents.
-
+
4. Once everything is indexed, Ask Away (Use Cases):
+ - Video Generation
+
+
+
- Basic search and citation
- Document Mention QNA
+
- Report Generations and Exports (PDF, DOCX, HTML, LaTeX, EPUB, ODT, Plain Text)
@@ -133,6 +145,8 @@ For Docker Compose, manual installation, and other deployment options, see the [
| Universal LLM Support | 100+ LLMs, 6000+ embedding models, all major rerankers via OpenAI spec & LiteLLM |
| Privacy First | Full local LLM support (vLLM, Ollama) your data stays yours |
| Team Collaboration | RBAC with Owner / Admin / Editor / Viewer roles, real time chat & comment threads |
+| Video Generation | Generate videos with narration and visuals |
+| Presentation Generation | Create editable, slide based presentations |
| Podcast Generation | 3 min podcast in under 20 seconds; multiple TTS providers (OpenAI, Azure, Kokoro) |
| Browser Extension | Cross browser extension to save any webpage, including auth protected pages |
| 25+ Connectors | Search Engines, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord & [more](#external-sources) |
diff --git a/README.pt-BR.md b/README.pt-BR.md
index 5461d8824..4b93a8036 100644
--- a/README.pt-BR.md
+++ b/README.pt-BR.md
@@ -27,11 +27,18 @@ SurfSense é um agente de pesquisa de IA altamente personalizável, conectado a
-# Vídeo
+# Demo
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
-## Exemplo de Podcast
+## Exemplo de Agente de Vídeo
+
+
+https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
+
+
+
+## Exemplo de Agente de Podcast
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. Conecte seus conectores e sincronize. Ative a sincronização periódica para manter os conectores atualizados.
-
+
3. Enquanto os dados dos conectores são indexados, faça upload de documentos.
-
+
4. Quando tudo estiver indexado, pergunte o que quiser (Casos de uso):
+ - Geração de vídeos
+
+
+
- Busca básica e citações
- QNA com menção de documentos
+
- Geração de relatórios e exportações (PDF, DOCX, HTML, LaTeX, EPUB, ODT, texto simples)
@@ -133,6 +145,8 @@ Para Docker Compose, instalação manual e outras opções de implantação, con
| Suporte Universal de LLM | 100+ LLMs, 6000+ modelos de embeddings, todos os principais rerankers via OpenAI spec e LiteLLM |
| Privacidade em Primeiro Lugar | Suporte completo a LLM local (vLLM, Ollama) seus dados ficam com você |
| Colaboração em Equipe | RBAC com papéis de Proprietário / Admin / Editor / Visualizador, chat em tempo real e threads de comentários |
+| Geração de Vídeos | Gera vídeos com narração e visuais |
+| Geração de Apresentações | Cria apresentações editáveis baseadas em slides |
| Geração de Podcasts | Podcast de 3 min em menos de 20 segundos; múltiplos provedores TTS (OpenAI, Azure, Kokoro) |
| Extensão de Navegador | Extensão multi-navegador para salvar qualquer página web, incluindo páginas protegidas por autenticação |
| 25+ Conectores | Mecanismos de busca, Google Drive, Slack, Teams, Jira, Notion, GitHub, Discord e [mais](#fontes-externas) |
diff --git a/README.zh-CN.md b/README.zh-CN.md
index 9333348b6..5230a5b80 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -27,11 +27,18 @@ SurfSense 是一个高度可定制的 AI 研究助手,可以连接外部数据
-# 视频
+# 演示
https://github.com/user-attachments/assets/cc0c84d3-1f2f-4f7a-b519-2ecce22310b1
-## 播客示例
+## 视频代理示例
+
+
+https://github.com/user-attachments/assets/cc977e6d-8292-4ffe-abb8-3b0560ef5562
+
+
+
+## 播客代理示例
https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
@@ -46,20 +53,25 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7
2. 连接您的连接器并同步。启用定期同步以保持连接器数据更新。
-
+
3. 在连接器数据索引期间,上传文档。
-
+
4. 一切索引完成后,尽管提问(使用场景):
+ - 视频生成
+
+
+
- 基本搜索和引用
- 文档提及问答
+
- 报告生成和导出(PDF、DOCX、HTML、LaTeX、EPUB、ODT、纯文本)
@@ -133,6 +145,8 @@ irm https://raw.githubusercontent.com/MODSetter/SurfSense/main/docker/scripts/in
| 通用 LLM 支持 | 100+ LLM、6000+ 嵌入模型、所有主流重排序器,通过 OpenAI spec 和 LiteLLM |
| 隐私优先 | 完整本地 LLM 支持(vLLM、Ollama),您的数据由您掌控 |
| 团队协作 | RBAC 角色控制(所有者/管理员/编辑者/查看者),实时聊天和评论线程 |
+| 视频生成 | 生成带有旁白和视觉效果的视频 |
+| 演示文稿生成 | 创建可编辑的幻灯片式演示文稿 |
| 播客生成 | 20 秒内生成 3 分钟播客;多种 TTS 提供商(OpenAI、Azure、Kokoro) |
| 浏览器扩展 | 跨浏览器扩展,保存任何网页,包括需要身份验证的页面 |
| 25+ 连接器 | 搜索引擎、Google Drive、Slack、Teams、Jira、Notion、GitHub、Discord 等[更多](#外部数据源) |
diff --git a/docker/.env.example b/docker/.env.example
index c31b87185..a226c2624 100644
--- a/docker/.env.example
+++ b/docker/.env.example
@@ -36,6 +36,7 @@ EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# BACKEND_PORT=8929
# FRONTEND_PORT=3929
# ELECTRIC_PORT=5929
+# SEARXNG_PORT=8888
# FLOWER_PORT=5555
# ==============================================================================
@@ -199,6 +200,16 @@ STT_SERVICE=local/base
# COMPOSIO_ENABLED=TRUE
# COMPOSIO_REDIRECT_URI=http://localhost:8000/api/v1/auth/composio/connector/callback
+# ------------------------------------------------------------------------------
+# SearXNG (bundled web search — works out of the box, no config needed)
+# ------------------------------------------------------------------------------
+# SearXNG provides web search to all search spaces automatically.
+# To access the SearXNG UI directly: http://localhost:8888
+# To disable the service entirely: docker compose up --scale searxng=0
+# To point at your own SearXNG instance instead of the bundled one:
+# SEARXNG_DEFAULT_HOST=http://your-searxng:8080
+# SEARXNG_SECRET=surfsense-searxng-secret
+
# ------------------------------------------------------------------------------
# Daytona Sandbox (optional — cloud code execution for the deep agent)
# ------------------------------------------------------------------------------
diff --git a/docker/docker-compose.dev.yml b/docker/docker-compose.dev.yml
index 4d602f584..15531bf55 100644
--- a/docker/docker-compose.dev.yml
+++ b/docker/docker-compose.dev.yml
@@ -57,6 +57,20 @@ services:
timeout: 5s
retries: 5
+ searxng:
+ image: searxng/searxng:2026.3.13-3c1f68c59
+ ports:
+ - "${SEARXNG_PORT:-8888}:8080"
+ volumes:
+ - ./searxng:/etc/searxng
+ environment:
+ - SEARXNG_SECRET=${SEARXNG_SECRET:-surfsense-searxng-secret}
+ healthcheck:
+ test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/healthz"]
+ interval: 10s
+ timeout: 5s
+ retries: 5
+
backend:
build: ../surfsense_backend
ports:
@@ -81,6 +95,7 @@ services:
- ELECTRIC_DB_PASSWORD=${ELECTRIC_DB_PASSWORD:-electric_password}
- AUTH_TYPE=${AUTH_TYPE:-LOCAL}
- NEXT_FRONTEND_URL=${NEXT_FRONTEND_URL:-http://localhost:3000}
+ - SEARXNG_DEFAULT_HOST=${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
# Daytona Sandbox – uncomment and set credentials to enable cloud code execution
# - DAYTONA_SANDBOX_ENABLED=TRUE
# - DAYTONA_API_KEY=${DAYTONA_API_KEY:-}
@@ -92,6 +107,8 @@ services:
condition: service_healthy
redis:
condition: service_healthy
+ searxng:
+ condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 15s
@@ -115,6 +132,7 @@ services:
- PYTHONPATH=/app
- ELECTRIC_DB_USER=${ELECTRIC_DB_USER:-electric}
- ELECTRIC_DB_PASSWORD=${ELECTRIC_DB_PASSWORD:-electric_password}
+ - SEARXNG_DEFAULT_HOST=${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
- SERVICE_ROLE=worker
depends_on:
db:
diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
index ca20e3ed4..8c85248d2 100644
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -42,6 +42,19 @@ services:
timeout: 5s
retries: 5
+ searxng:
+ image: searxng/searxng:2026.3.13-3c1f68c59
+ volumes:
+ - ./searxng:/etc/searxng
+ environment:
+ SEARXNG_SECRET: ${SEARXNG_SECRET:-surfsense-searxng-secret}
+ restart: unless-stopped
+ healthcheck:
+ test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/healthz"]
+ interval: 10s
+ timeout: 5s
+ retries: 5
+
backend:
image: ghcr.io/modsetter/surfsense-backend:${SURFSENSE_VERSION:-latest}
ports:
@@ -62,6 +75,7 @@ services:
ELECTRIC_DB_USER: ${ELECTRIC_DB_USER:-electric}
ELECTRIC_DB_PASSWORD: ${ELECTRIC_DB_PASSWORD:-electric_password}
NEXT_FRONTEND_URL: ${NEXT_FRONTEND_URL:-http://localhost:${FRONTEND_PORT:-3929}}
+ SEARXNG_DEFAULT_HOST: ${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
# Daytona Sandbox – uncomment and set credentials to enable cloud code execution
# DAYTONA_SANDBOX_ENABLED: "TRUE"
# DAYTONA_API_KEY: ${DAYTONA_API_KEY:-}
@@ -75,6 +89,8 @@ services:
condition: service_healthy
redis:
condition: service_healthy
+ searxng:
+ condition: service_healthy
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
@@ -98,6 +114,7 @@ services:
PYTHONPATH: /app
ELECTRIC_DB_USER: ${ELECTRIC_DB_USER:-electric}
ELECTRIC_DB_PASSWORD: ${ELECTRIC_DB_PASSWORD:-electric_password}
+ SEARXNG_DEFAULT_HOST: ${SEARXNG_DEFAULT_HOST:-http://searxng:8080}
SERVICE_ROLE: worker
depends_on:
db:
diff --git a/docker/scripts/install.ps1 b/docker/scripts/install.ps1
index 5f41ef7d6..b7004bae2 100644
--- a/docker/scripts/install.ps1
+++ b/docker/scripts/install.ps1
@@ -103,6 +103,7 @@ Write-Step "Downloading SurfSense files"
Write-Info "Installation directory: $InstallDir"
New-Item -ItemType Directory -Path "$InstallDir\scripts" -Force | Out-Null
+New-Item -ItemType Directory -Path "$InstallDir\searxng" -Force | Out-Null
$Files = @(
@{ Src = "docker/docker-compose.yml"; Dest = "docker-compose.yml" }
@@ -110,6 +111,8 @@ $Files = @(
@{ Src = "docker/postgresql.conf"; Dest = "postgresql.conf" }
@{ Src = "docker/scripts/init-electric-user.sh"; Dest = "scripts/init-electric-user.sh" }
@{ Src = "docker/scripts/migrate-database.ps1"; Dest = "scripts/migrate-database.ps1" }
+ @{ Src = "docker/searxng/settings.yml"; Dest = "searxng/settings.yml" }
+ @{ Src = "docker/searxng/limiter.toml"; Dest = "searxng/limiter.toml" }
)
foreach ($f in $Files) {
diff --git a/docker/scripts/install.sh b/docker/scripts/install.sh
index eb6aeb83d..7a68a9bd1 100644
--- a/docker/scripts/install.sh
+++ b/docker/scripts/install.sh
@@ -102,6 +102,7 @@ wait_for_pg() {
step "Downloading SurfSense files"
info "Installation directory: ${INSTALL_DIR}"
mkdir -p "${INSTALL_DIR}/scripts"
+mkdir -p "${INSTALL_DIR}/searxng"
FILES=(
"docker/docker-compose.yml:docker-compose.yml"
@@ -109,6 +110,8 @@ FILES=(
"docker/postgresql.conf:postgresql.conf"
"docker/scripts/init-electric-user.sh:scripts/init-electric-user.sh"
"docker/scripts/migrate-database.sh:scripts/migrate-database.sh"
+ "docker/searxng/settings.yml:searxng/settings.yml"
+ "docker/searxng/limiter.toml:searxng/limiter.toml"
)
for entry in "${FILES[@]}"; do
diff --git a/docker/searxng/limiter.toml b/docker/searxng/limiter.toml
new file mode 100644
index 000000000..dce84146f
--- /dev/null
+++ b/docker/searxng/limiter.toml
@@ -0,0 +1,5 @@
+[botdetection.ip_limit]
+link_token = false
+
+[botdetection.ip_lists]
+pass_ip = ["0.0.0.0/0"]
diff --git a/docker/searxng/settings.yml b/docker/searxng/settings.yml
new file mode 100644
index 000000000..0b805b6aa
--- /dev/null
+++ b/docker/searxng/settings.yml
@@ -0,0 +1,90 @@
+use_default_settings:
+ engines:
+ remove:
+ - ahmia
+ - torch
+ - qwant
+ - qwant news
+ - qwant images
+ - qwant videos
+ - mojeek
+ - mojeek images
+ - mojeek news
+
+server:
+ secret_key: "override-me-via-env"
+ limiter: false
+ image_proxy: false
+ method: "GET"
+ default_http_headers:
+ X-Robots-Tag: "noindex, nofollow"
+
+search:
+ formats:
+ - html
+ - json
+ default_lang: "auto"
+ autocomplete: ""
+ safe_search: 0
+ ban_time_on_fail: 5
+ max_ban_time_on_fail: 120
+ suspended_times:
+ SearxEngineAccessDenied: 3600
+ SearxEngineCaptcha: 3600
+ SearxEngineTooManyRequests: 600
+ cf_SearxEngineCaptcha: 7200
+ cf_SearxEngineAccessDenied: 3600
+ recaptcha_SearxEngineCaptcha: 7200
+
+ui:
+ static_use_hash: true
+
+outgoing:
+ request_timeout: 12.0
+ max_request_timeout: 20.0
+ pool_connections: 100
+ pool_maxsize: 20
+ enable_http2: true
+ extra_proxy_timeout: 10
+ retries: 1
+ # Uncomment and set your residential proxy URL to route search engine requests through it.
+ # Format: http://<user>:<pass>@<proxy-host>:<port>/
+ #
+ # proxies:
+ # all://:
+ # - http://user:pass@proxy-host:port/
+
+engines:
+ - name: google
+ disabled: false
+ weight: 1.2
+ retry_on_http_error: [429, 503]
+ - name: duckduckgo
+ disabled: false
+ weight: 1.1
+ retry_on_http_error: [429, 503]
+ - name: brave
+ disabled: false
+ weight: 1.0
+ retry_on_http_error: [429, 503]
+ - name: bing
+ disabled: false
+ weight: 0.9
+ retry_on_http_error: [429, 503]
+ - name: wikipedia
+ disabled: false
+ weight: 0.8
+ - name: stackoverflow
+ disabled: false
+ weight: 0.7
+ - name: yahoo
+ disabled: false
+ weight: 0.7
+ retry_on_http_error: [429, 503]
+ - name: wikidata
+ disabled: false
+ weight: 0.6
+ - name: currency
+ disabled: false
+ - name: ddg definitions
+ disabled: false
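For orientation, here is a minimal sketch of how a client might build a query URL against this SearXNG instance. It relies on the `json` entry under `search.formats` above and uses the `http://searxng:8080` default from the compose file. `build_searxng_url` is a hypothetical helper for illustration, not a function from this PR.

```python
from urllib.parse import urlencode


def build_searxng_url(host: str, query: str) -> str:
    """Build a SearXNG search URL that requests JSON output (hypothetical helper)."""
    # `format=json` only works when "json" is listed under search.formats.
    params = {"q": query, "format": "json"}
    return f"{host.rstrip('/')}/search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_searxng_url("http://searxng:8080", "usd to inr"))
```

When running the backend on the host instead of inside Compose, the same helper would take the `SEARXNG_DEFAULT_HOST` value from the environment.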
diff --git a/docs/chinese-llm-setup.md b/docs/chinese-llm-setup.md
index 2a184608f..37042aa2f 100644
--- a/docs/chinese-llm-setup.md
+++ b/docs/chinese-llm-setup.md
@@ -14,6 +14,7 @@ SurfSense now supports the following Chinese LLMs:
- ✅ **Alibaba Qwen** - Alibaba Cloud Tongyi Qianwen large language models
- ✅ **Moonshot Kimi** - Moonshot AI Kimi large language models
- ✅ **Zhipu AI GLM** - Zhipu AI GLM model series
+- ✅ **MiniMax** - MiniMax large language models (M2.5 series, 204K context)
---
@@ -197,6 +198,52 @@ API Base URL: https://open.bigmodel.cn/api/paas/v4
---
+## 5️⃣ MiniMax Configuration
+
+### Get an API Key
+
+1. Visit the [MiniMax Open Platform](https://platform.minimaxi.com/)
+2. Register and log in
+3. Go to the **API Keys** page
+4. Create a new API Key
+5. Copy the API Key
+
+### Configure in SurfSense
+
+| Field | Value | Notes |
+|------|-----|------|
+| **Configuration Name** | `MiniMax M2.5` | Configuration name (your choice) |
+| **Provider** | `MINIMAX` | Select MiniMax |
+| **Model Name** | `MiniMax-M2.5` | Recommended model; other option: `MiniMax-M2.5-highspeed` |
+| **API Key** | `eyJ...` | Your MiniMax API Key |
+| **API Base URL** | `https://api.minimax.io/v1` | MiniMax API endpoint |
+| **Parameters** | `{"temperature": 1.0}` | Note: temperature must be in the range (0.0, 1.0] and cannot be 0 |
+
+### Example Configuration
+
+```
+Configuration Name: MiniMax M2.5
+Provider: MINIMAX
+Model Name: MiniMax-M2.5
+API Key: eyJxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+API Base URL: https://api.minimax.io/v1
+```
+
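+### Available Models
+
+- **MiniMax-M2.5**: High-performance general-purpose model, 204K context window (recommended)
+- **MiniMax-M2.5-highspeed**: High-speed inference variant, 204K context window
+
+### Notes
+
+- **temperature parameter**: MiniMax requires temperature to be in the range (0.0, 1.0]; it cannot be set to 0. A value of 1.0 is recommended.
+- Both models support a 204K context window, which suits long-document tasks.
+
+### Pricing
+- See the [MiniMax pricing page](https://platform.minimaxi.com/document/Price) for current prices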
+
+---
+
## ⚙️ Advanced Configuration
### Custom Parameters
@@ -268,8 +315,8 @@ docker compose logs backend | grep -i "error"
|---------|---------|------|
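| **Document summarization** | Qwen-Plus, GLM-4 | Balances performance and cost |
| **Code analysis** | DeepSeek-Coder | Code-specialized |
-| **Long-text processing** | Kimi 128K | Very long context |
-| **Fast responses** | Qwen-Turbo, GLM-4-Flash | Speed first |
+| **Long-text processing** | Kimi 128K, MiniMax-M2.5 (204K) | Very long context |
+| **Fast responses** | Qwen-Turbo, GLM-4-Flash, MiniMax-M2.5-highspeed | Speed first |
### 2. Cost Optimization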
@@ -294,6 +341,7 @@ docker compose logs backend | grep -i "error"
- [Alibaba Cloud Bailian documentation](https://help.aliyun.com/zh/model-studio/)
- [Moonshot AI documentation](https://platform.moonshot.cn/docs)
- [Zhipu AI documentation](https://open.bigmodel.cn/dev/api)
+- [MiniMax documentation](https://platform.minimaxi.com/document/Guides)
### SurfSense Documentation
diff --git a/surfsense_backend/.env.example b/surfsense_backend/.env.example
index 413be03c4..621c8cf99 100644
--- a/surfsense_backend/.env.example
+++ b/surfsense_backend/.env.example
@@ -12,6 +12,11 @@ REDIS_APP_URL=redis://localhost:6379/0
# Optional: TTL in seconds for connector indexing lock key
# CONNECTOR_INDEXING_LOCK_TTL_SECONDS=28800
+# Platform Web Search (SearXNG)
+# Set this to enable built-in web search. Docker Compose sets it automatically.
+# Only uncomment if running the backend outside Docker (e.g. uvicorn on host).
+# SEARXNG_DEFAULT_HOST=http://localhost:8888
+
#Electric(for migrations only)
ELECTRIC_DB_USER=electric
ELECTRIC_DB_PASSWORD=electric_password
diff --git a/surfsense_backend/.gitignore b/surfsense_backend/.gitignore
index 443c85e9c..1cd7fd32c 100644
--- a/surfsense_backend/.gitignore
+++ b/surfsense_backend/.gitignore
@@ -6,6 +6,7 @@ __pycache__/
.flashrank_cache
surf_new_backend.egg-info/
podcasts/
+video_presentation_audio/
sandbox_files/
temp_audio/
celerybeat-schedule*
diff --git a/surfsense_backend/alembic/versions/106_add_minimax_to_litellmprovider_enum.py b/surfsense_backend/alembic/versions/106_add_minimax_to_litellmprovider_enum.py
new file mode 100644
index 000000000..fed3bc7c3
--- /dev/null
+++ b/surfsense_backend/alembic/versions/106_add_minimax_to_litellmprovider_enum.py
@@ -0,0 +1,23 @@
+"""Add MINIMAX to LiteLLMProvider enum
+
+Revision ID: 106
+Revises: 105
+"""
+
+from collections.abc import Sequence
+
+from alembic import op
+
+revision: str = "106"
+down_revision: str | None = "105"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+
+def upgrade() -> None:
+ op.execute("COMMIT")
+ op.execute("ALTER TYPE litellmprovider ADD VALUE IF NOT EXISTS 'MINIMAX'")
+
+
+def downgrade() -> None:
+ pass
diff --git a/surfsense_backend/alembic/versions/107_add_video_presentations_table.py b/surfsense_backend/alembic/versions/107_add_video_presentations_table.py
new file mode 100644
index 000000000..e6f928b50
--- /dev/null
+++ b/surfsense_backend/alembic/versions/107_add_video_presentations_table.py
@@ -0,0 +1,85 @@
+"""Add video_presentations table and video_presentation_status enum
+
+Revision ID: 107
+Revises: 106
+"""
+
+from collections.abc import Sequence
+
+import sqlalchemy as sa
+from sqlalchemy.dialects.postgresql import JSONB
+
+from alembic import op
+
+revision: str = "107"
+down_revision: str | None = "106"
+branch_labels: str | Sequence[str] | None = None
+depends_on: str | Sequence[str] | None = None
+
+video_presentation_status_enum = sa.Enum(
+ "pending",
+ "generating",
+ "ready",
+ "failed",
+ name="video_presentation_status",
+)
+
+
+def upgrade() -> None:
+ video_presentation_status_enum.create(op.get_bind(), checkfirst=True)
+
+ op.create_table(
+ "video_presentations",
+ sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
+ sa.Column("title", sa.String(length=500), nullable=False),
+ sa.Column("slides", JSONB(), nullable=True),
+ sa.Column("scene_codes", JSONB(), nullable=True),
+ sa.Column(
+ "status",
+ video_presentation_status_enum,
+ server_default="ready",
+ nullable=False,
+ ),
+ sa.Column("search_space_id", sa.Integer(), nullable=False),
+ sa.Column("thread_id", sa.Integer(), nullable=True),
+ sa.Column(
+ "created_at",
+ sa.TIMESTAMP(timezone=True),
+ server_default=sa.text("now()"),
+ nullable=False,
+ ),
+ sa.ForeignKeyConstraint(
+ ["search_space_id"],
+ ["searchspaces.id"],
+ ondelete="CASCADE",
+ ),
+ sa.ForeignKeyConstraint(
+ ["thread_id"],
+ ["new_chat_threads.id"],
+ ondelete="SET NULL",
+ ),
+ sa.PrimaryKeyConstraint("id"),
+ )
+ op.create_index(
+ "ix_video_presentations_status",
+ "video_presentations",
+ ["status"],
+ )
+ op.create_index(
+ "ix_video_presentations_thread_id",
+ "video_presentations",
+ ["thread_id"],
+ )
+ op.create_index(
+ "ix_video_presentations_created_at",
+ "video_presentations",
+ ["created_at"],
+ )
+
+
+def downgrade() -> None:
+ op.drop_index("ix_video_presentations_created_at", table_name="video_presentations")
+ op.drop_index("ix_video_presentations_thread_id", table_name="video_presentations")
+ op.drop_index("ix_video_presentations_status", table_name="video_presentations")
+ op.drop_table("video_presentations")
+ video_presentation_status_enum.drop(op.get_bind(), checkfirst=True)
diff --git a/surfsense_backend/app/agents/new_chat/chat_deepagent.py b/surfsense_backend/app/agents/new_chat/chat_deepagent.py
index f3d988e5b..c247ada61 100644
--- a/surfsense_backend/app/agents/new_chat/chat_deepagent.py
+++ b/surfsense_backend/app/agents/new_chat/chat_deepagent.py
@@ -37,13 +37,15 @@ _perf_log = get_perf_logger()
# =============================================================================
# Maps SearchSourceConnectorType enum values to the searchable document/connector types
-# used by the knowledge_base tool. Some connectors map to different document types.
+# used by the knowledge_base and web_search tools.
+# Live search connectors (TAVILY_API, LINKUP_API, BAIDU_SEARCH_API) are routed to
+# the web_search tool; all others go to search_knowledge_base.
_CONNECTOR_TYPE_TO_SEARCHABLE: dict[str, str] = {
- # Direct mappings (connector type == searchable type)
+ # Live search connectors (handled by web_search tool)
"TAVILY_API": "TAVILY_API",
- "SEARXNG_API": "SEARXNG_API",
"LINKUP_API": "LINKUP_API",
"BAIDU_SEARCH_API": "BAIDU_SEARCH_API",
+ # Local/indexed connectors (handled by search_knowledge_base tool)
"SLACK_CONNECTOR": "SLACK_CONNECTOR",
"TEAMS_CONNECTOR": "TEAMS_CONNECTOR",
"NOTION_CONNECTOR": "NOTION_CONNECTOR",
@@ -233,6 +235,7 @@ async def create_surfsense_deep_agent(
available_document_types = await connector_service.get_available_document_types(
search_space_id
)
+
except Exception as e:
logging.warning(f"Failed to discover available connectors/document types: {e}")
_perf_log.info(
diff --git a/surfsense_backend/app/agents/new_chat/llm_config.py b/surfsense_backend/app/agents/new_chat/llm_config.py
index 4ddb47330..60cd2a452 100644
--- a/surfsense_backend/app/agents/new_chat/llm_config.py
+++ b/surfsense_backend/app/agents/new_chat/llm_config.py
@@ -59,6 +59,7 @@ PROVIDER_MAP = {
"DATABRICKS": "databricks",
"COMETAPI": "cometapi",
"HUGGINGFACE": "huggingface",
+ "MINIMAX": "openai",
"CUSTOM": "custom",
}
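Mapping `MINIMAX` to `openai` presumably treats MiniMax as an OpenAI-compatible endpoint reached through a custom API base. The sketch below illustrates that idea, assuming a LiteLLM-style `provider/model` string. `build_model_string` and `check_temperature` are hypothetical helpers, not functions from this PR, and the (0.0, 1.0] temperature range is taken from the MiniMax setup notes in this PR.

```python
# Excerpt of the provider map; only the entries needed for the example.
PROVIDER_MAP = {"MINIMAX": "openai", "HUGGINGFACE": "huggingface"}


def build_model_string(provider: str, model_name: str) -> str:
    """Return a 'prefix/model' string (assumed LiteLLM-style naming)."""
    prefix = PROVIDER_MAP.get(provider, provider.lower())
    return f"{prefix}/{model_name}"


def check_temperature(value: float) -> float:
    """Reject temperatures outside (0.0, 1.0], the range the docs state for MiniMax."""
    if not 0.0 < value <= 1.0:
        raise ValueError(f"temperature must be in (0.0, 1.0], got {value}")
    return value


if __name__ == "__main__":
    print(build_model_string("MINIMAX", "MiniMax-M2.5"))
    print(check_temperature(1.0))
```

With this mapping, the API base URL (`https://api.minimax.io/v1` in the docs) decides which service receives the request, since the `openai` prefix alone does not.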
diff --git a/surfsense_backend/app/agents/new_chat/system_prompt.py b/surfsense_backend/app/agents/new_chat/system_prompt.py
index b042f75c3..f8ac62787 100644
--- a/surfsense_backend/app/agents/new_chat/system_prompt.py
+++ b/surfsense_backend/app/agents/new_chat/system_prompt.py
@@ -99,14 +99,8 @@ _TOOL_INSTRUCTIONS["search_knowledge_base"] = """
- IMPORTANT: When searching for information (meetings, schedules, notes, tasks, etc.), ALWAYS search broadly
across ALL sources first by omitting connectors_to_search. The user may store information in various places
including calendar apps, note-taking apps (Obsidian, Notion), chat apps (Slack, Discord), and more.
- - IMPORTANT (REAL-TIME / PUBLIC WEB QUERIES): For questions that require current public web data
- (e.g., live exchange rates, stock prices, breaking news, weather, current events), you MUST call
- `search_knowledge_base` using live web connectors via `connectors_to_search`:
- ["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"].
- - For these real-time/public web queries, DO NOT answer from memory and DO NOT say you lack internet
- access before attempting a live connector search.
- - If the live connectors return no relevant results, explain that live web sources did not return enough
- data and ask the user if they want you to retry with a refined query.
+ - This tool searches ONLY local/indexed data (uploaded files, Notion, Slack, browser extension captures, etc.).
+ For real-time web search (current events, news, live data), use the `web_search` tool instead.
- FALLBACK BEHAVIOR: If the search returns no relevant results, you MAY then answer using your own
general knowledge, but clearly indicate that no matching information was found in the knowledge base.
- Only narrow to specific connectors if the user explicitly asks (e.g., "check my Slack" or "in my calendar").
@@ -138,6 +132,17 @@ _TOOL_INSTRUCTIONS["generate_podcast"] = """
- After calling this tool, inform the user that podcast generation has started and they will see the player when it's ready (takes 3-5 minutes).
"""
+_TOOL_INSTRUCTIONS["generate_video_presentation"] = """
+- generate_video_presentation: Generate a video presentation from provided content.
+ - Use this when the user asks to create a video, presentation, slides, or slide deck.
+ - Trigger phrases: "give me a presentation", "create slides", "generate a video", "make a slide deck", "turn this into a presentation"
+ - Args:
+ - source_content: The text content to turn into a presentation. The more detailed, the better.
+ - video_title: Optional title (default: "SurfSense Presentation")
+ - user_prompt: Optional style instructions (e.g., "Make it technical and detailed")
+ - After calling this tool, inform the user that generation has started and they will see the presentation when it's ready.
+"""
+
_TOOL_INSTRUCTIONS["generate_report"] = """
- generate_report: Generate or revise a structured Markdown report artifact.
- WHEN TO CALL THIS TOOL — the message must contain a creation or modification VERB directed at producing a deliverable:
@@ -271,6 +276,24 @@ _TOOL_INSTRUCTIONS["scrape_webpage"] = """
* Don't show every image - just the most relevant 1-3 images that enhance understanding.
"""
+_TOOL_INSTRUCTIONS["web_search"] = """
+- web_search: Search the web for real-time information using all configured search engines.
+ - Use this for current events, news, prices, weather, public facts, or any question requiring
+ up-to-date information from the internet.
+ - This tool dispatches to all configured search engines (SearXNG, Tavily, Linkup, Baidu) in
+ parallel and merges the results.
+ - IMPORTANT (REAL-TIME / PUBLIC WEB QUERIES): For questions that require current public web data
+ (e.g., live exchange rates, stock prices, breaking news, weather, current events), you MUST call
+ `web_search` instead of answering from memory.
+ - For these real-time/public web queries, DO NOT answer from memory and DO NOT say you lack internet
+ access before attempting a web search.
+ - If the search returns no relevant results, explain that web sources did not return enough
+ data and ask the user if they want you to retry with a refined query.
+ - Args:
+ - query: The search query - use specific, descriptive terms
+ - top_k: Number of results to retrieve (default: 10, max: 50)
+"""
+
# Memory tool instructions have private and shared variants.
# We store them keyed as "save_memory" / "recall_memory" with sub-keys.
_MEMORY_TOOL_INSTRUCTIONS: dict[str, dict[str, str]] = {
@@ -401,7 +424,7 @@ _TOOL_EXAMPLES["search_knowledge_base"] = """
- User: "Check my Obsidian notes for meeting notes"
- Call: `search_knowledge_base(query="meeting notes", connectors_to_search=["OBSIDIAN_CONNECTOR"])`
- User: "search me current usd to inr rate"
- - Call: `search_knowledge_base(query="current USD to INR exchange rate", connectors_to_search=["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"])`
+ - Call: `web_search(query="current USD to INR exchange rate")`
- Then answer using the returned live web results with citations.
"""
@@ -426,6 +449,16 @@ _TOOL_EXAMPLES["generate_podcast"] = """
- Then: `generate_podcast(source_content="Key insights about quantum computing from the knowledge base:\\n\\n[Comprehensive summary of all relevant search results with key facts, concepts, and findings]", podcast_title="Quantum Computing Explained")`
"""
+_TOOL_EXAMPLES["generate_video_presentation"] = """
+- User: "Give me a presentation about AI trends based on what we discussed"
+ - First search for relevant content, then call: `generate_video_presentation(source_content="Based on our conversation and search results: [detailed summary of chat + search findings]", video_title="AI Trends Presentation")`
+- User: "Create slides summarizing this conversation"
+ - Call: `generate_video_presentation(source_content="Complete conversation summary:\\n\\nUser asked about [topic 1]:\\n[Your detailed response]\\n\\nUser then asked about [topic 2]:\\n[Your detailed response]\\n\\n[Continue for all exchanges in the conversation]", video_title="Conversation Summary")`
+- User: "Make a video presentation about quantum computing"
+ - First search: `search_knowledge_base(query="quantum computing")`
+ - Then: `generate_video_presentation(source_content="Key insights about quantum computing from the knowledge base:\\n\\n[Comprehensive summary of all relevant search results with key facts, concepts, and findings]", video_title="Quantum Computing Explained")`
+"""
+
_TOOL_EXAMPLES["generate_report"] = """
- User: "Generate a report about AI trends"
- Call: `generate_report(topic="AI Trends Report", source_strategy="kb_search", search_queries=["AI trends recent developments", "artificial intelligence industry trends", "AI market growth and predictions"], report_style="detailed")`
@@ -471,11 +504,23 @@ _TOOL_EXAMPLES["generate_image"] = """
- Step 2: `display_image(src="", alt="Bean Dream coffee shop logo", title="Generated Image")`
"""
+_TOOL_EXAMPLES["web_search"] = """
+- User: "What's the current USD to INR exchange rate?"
+ - Call: `web_search(query="current USD to INR exchange rate")`
+ - Then answer using the returned web results with citations.
+- User: "What's the latest news about AI?"
+ - Call: `web_search(query="latest AI news today")`
+- User: "What's the weather in New York?"
+ - Call: `web_search(query="weather New York today")`
+"""
+
# All tool names that have prompt instructions (order matters for prompt readability)
_ALL_TOOL_NAMES_ORDERED = [
"search_surfsense_docs",
"search_knowledge_base",
+ "web_search",
"generate_podcast",
+ "generate_video_presentation",
"generate_report",
"link_preview",
"display_image",
@@ -543,7 +588,7 @@ DISABLED TOOLS (by user):
The following tools are available in SurfSense but have been disabled by the user for this session: {disabled_list}.
You do NOT have access to these tools and MUST NOT claim you can use them.
If the user asks about a capability provided by a disabled tool, let them know the relevant tool
-is currently disabled and they can re-enable it from the tools menu (wrench icon) in the composer toolbar.
+is currently disabled and they can re-enable it.
""")
parts.append("\n\n")
@@ -595,11 +640,10 @@ The documents you receive are structured like this:
-**Live web search results (URL chunk IDs):**
+**Web search results (URL chunk IDs):**
- TAVILY_API::Some Title::https://example.com/article
- TAVILY_API
+ WEB_SEARCH
diff --git a/surfsense_backend/app/agents/new_chat/tools/__init__.py b/surfsense_backend/app/agents/new_chat/tools/__init__.py
index 0a11951f0..5002e69bb 100644
--- a/surfsense_backend/app/agents/new_chat/tools/__init__.py
+++ b/surfsense_backend/app/agents/new_chat/tools/__init__.py
@@ -8,6 +8,7 @@ Available tools:
- search_knowledge_base: Search the user's personal knowledge base
- search_surfsense_docs: Search Surfsense documentation for usage help
- generate_podcast: Generate audio podcasts from content
+- generate_video_presentation: Generate video presentations with slides and narration
- generate_image: Generate images from text descriptions using AI models
- link_preview: Fetch rich previews for URLs
- display_image: Display images in chat
@@ -39,6 +40,7 @@ from .registry import (
from .scrape_webpage import create_scrape_webpage_tool
from .search_surfsense_docs import create_search_surfsense_docs_tool
from .user_memory import create_recall_memory_tool, create_save_memory_tool
+from .video_presentation import create_generate_video_presentation_tool
__all__ = [
# Registry
@@ -51,6 +53,7 @@ __all__ = [
"create_display_image_tool",
"create_generate_image_tool",
"create_generate_podcast_tool",
+ "create_generate_video_presentation_tool",
"create_link_preview_tool",
"create_recall_memory_tool",
"create_save_memory_tool",
diff --git a/surfsense_backend/app/agents/new_chat/tools/knowledge_base.py b/surfsense_backend/app/agents/new_chat/tools/knowledge_base.py
index 4596d5efd..a683b1c17 100644
--- a/surfsense_backend/app/agents/new_chat/tools/knowledge_base.py
+++ b/surfsense_backend/app/agents/new_chat/tools/knowledge_base.py
@@ -23,11 +23,10 @@ from app.db import shielded_async_session
from app.services.connector_service import ConnectorService
from app.utils.perf import get_perf_logger
-# Connectors that call external live-search APIs (no local DB / embedding needed).
-# These are never filtered by available_document_types.
+# Connectors that call external live-search APIs. These are handled by the
+# ``web_search`` tool and must be excluded from knowledge-base searches.
_LIVE_SEARCH_CONNECTORS: set[str] = {
"TAVILY_API",
- "SEARXNG_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
}
@@ -190,10 +189,6 @@ _ALL_CONNECTORS: list[str] = [
"GOOGLE_DRIVE_FILE",
"DISCORD_CONNECTOR",
"AIRTABLE_CONNECTOR",
- "TAVILY_API",
- "SEARXNG_API",
- "LINKUP_API",
- "BAIDU_SEARCH_API",
"LUMA_CONNECTOR",
"NOTE",
"BOOKSTACK_CONNECTOR",
@@ -227,10 +222,6 @@ CONNECTOR_DESCRIPTIONS: dict[str, str] = {
"GOOGLE_DRIVE_FILE": "Google Drive files and documents (personal cloud storage)",
"DISCORD_CONNECTOR": "Discord server conversations and shared content (personal community)",
"AIRTABLE_CONNECTOR": "Airtable records, tables, and database content (personal data)",
- "TAVILY_API": "Tavily web search API results (real-time web search)",
- "SEARXNG_API": "SearxNG search API results (privacy-focused web search)",
- "LINKUP_API": "Linkup search API results (web search)",
- "BAIDU_SEARCH_API": "Baidu search API results (Chinese web search)",
"LUMA_CONNECTOR": "Luma events and meetings",
"WEBCRAWLER_CONNECTOR": "Webpages indexed by SurfSense (personally selected websites)",
"CRAWLED_URL": "Webpages indexed by SurfSense (personally selected websites)",
@@ -268,14 +259,15 @@ def _normalize_connectors(
valid_set = (
set(available_connectors) if available_connectors else set(_ALL_CONNECTORS)
)
+ valid_set -= _LIVE_SEARCH_CONNECTORS
if not connectors_to_search:
- # Search all available connectors if none specified
- return (
+ base = (
list(available_connectors)
if available_connectors
else list(_ALL_CONNECTORS)
)
+ return [c for c in base if c not in _LIVE_SEARCH_CONNECTORS]
normalized: list[str] = []
for raw in connectors_to_search:
@@ -302,15 +294,14 @@ def _normalize_connectors(
out.append(c)
# Fallback to all available if nothing matched
- return (
- out
- if out
- else (
+ if not out:
+ base = (
list(available_connectors)
if available_connectors
else list(_ALL_CONNECTORS)
)
- )
+ return [c for c in base if c not in _LIVE_SEARCH_CONNECTORS]
+ return out
# =============================================================================
@@ -479,7 +470,6 @@ def format_documents_for_context(
# a numeric chunk_id (the numeric IDs are meaningless auto-incremented counters).
live_search_connectors = {
"TAVILY_API",
- "SEARXNG_API",
"LINKUP_API",
"BAIDU_SEARCH_API",
}
@@ -623,13 +613,11 @@ async def search_knowledge_base_async(
connectors = _normalize_connectors(connectors_to_search, available_connectors)
- # --- Optimization 1: skip local connectors that have zero indexed documents ---
+ # --- Optimization 1: skip connectors that have zero indexed documents ---
if available_document_types:
doc_types_set = set(available_document_types)
before_count = len(connectors)
- connectors = [
- c for c in connectors if c in _LIVE_SEARCH_CONNECTORS or c in doc_types_set
- ]
+ connectors = [c for c in connectors if c in doc_types_set]
skipped = before_count - len(connectors)
if skipped:
perf.info(
@@ -664,9 +652,7 @@ async def search_knowledge_base_async(
"[kb_search] degenerate query %r detected - falling back to recency browse",
query,
)
- local_connectors = [c for c in connectors if c not in _LIVE_SEARCH_CONNECTORS]
- if not local_connectors:
- local_connectors = [None] # type: ignore[list-item]
+ browse_connectors = connectors if connectors else [None] # type: ignore[list-item]
browse_results = await asyncio.gather(
*[
@@ -677,7 +663,7 @@ async def search_knowledge_base_async(
start_date=resolved_start_date,
end_date=resolved_end_date,
)
- for c in local_connectors
+ for c in browse_connectors
]
)
for docs in browse_results:
@@ -702,66 +688,20 @@ async def search_knowledge_base_async(
)
return result
- # Specs for live-search connectors (external APIs, no local DB/embedding).
- live_connector_specs: dict[str, tuple[str, bool, bool, dict[str, Any]]] = {
- "TAVILY_API": ("search_tavily", False, True, {}),
- "SEARXNG_API": ("search_searxng", False, True, {}),
- "LINKUP_API": ("search_linkup", False, False, {"mode": "standard"}),
- "BAIDU_SEARCH_API": ("search_baidu", False, True, {}),
- }
-
# --- Optimization 2: compute the query embedding once, share across all local searches ---
- precomputed_embedding: list[float] | None = None
- has_local_connectors = any(c not in _LIVE_SEARCH_CONNECTORS for c in connectors)
- if has_local_connectors:
- from app.config import config as app_config
+ from app.config import config as app_config
- t_embed = time.perf_counter()
- precomputed_embedding = app_config.embedding_model_instance.embed(query)
- perf.info(
- "[kb_search] shared embedding computed in %.3fs",
- time.perf_counter() - t_embed,
- )
+ t_embed = time.perf_counter()
+ precomputed_embedding = app_config.embedding_model_instance.embed(query)
+ perf.info(
+ "[kb_search] shared embedding computed in %.3fs",
+ time.perf_counter() - t_embed,
+ )
max_parallel_searches = 4
semaphore = asyncio.Semaphore(max_parallel_searches)
async def _search_one_connector(connector: str) -> list[dict[str, Any]]:
- is_live = connector in _LIVE_SEARCH_CONNECTORS
-
- if is_live:
- spec = live_connector_specs.get(connector)
- if spec is None:
- return []
- method_name, includes_date_range, includes_top_k, extra_kwargs = spec
- kwargs: dict[str, Any] = {
- "user_query": query,
- "search_space_id": search_space_id,
- **extra_kwargs,
- }
- if includes_top_k:
- kwargs["top_k"] = top_k
- if includes_date_range:
- kwargs["start_date"] = resolved_start_date
- kwargs["end_date"] = resolved_end_date
-
- try:
- t_conn = time.perf_counter()
- async with semaphore, shielded_async_session() as isolated_session:
- svc = ConnectorService(isolated_session, search_space_id)
- _, chunks = await getattr(svc, method_name)(**kwargs)
- perf.info(
- "[kb_search] connector=%s results=%d in %.3fs",
- connector,
- len(chunks),
- time.perf_counter() - t_conn,
- )
- return chunks
- except Exception as e:
- perf.warning("[kb_search] connector=%s FAILED: %s", connector, e)
- return []
-
- # --- Optimization 3: call _combined_rrf_search directly with shared embedding ---
try:
t_conn = time.perf_counter()
async with semaphore, shielded_async_session() as isolated_session:
@@ -967,7 +907,9 @@ Focus searches on these types for best results."""
# This is what the LLM sees when deciding whether/how to use the tool
dynamic_description = f"""Search the user's personal knowledge base for relevant information.
-Use this tool to find documents, notes, files, web pages, and other content that may help answer the user's question.
+Use this tool to find documents, notes, files, web pages, and other content the user has indexed.
+This searches ONLY local/indexed data (uploaded files, Notion, Slack, browser extension captures, etc.).
+For real-time web search (current events, news, live data), use the `web_search` tool instead.
IMPORTANT:
- Always craft specific, descriptive search queries using natural language keywords.
@@ -977,9 +919,6 @@ IMPORTANT:
- If the user requests a specific source type (e.g. "my notes", "Slack messages"), pass `connectors_to_search=[...]` using the enums below.
- If `connectors_to_search` is omitted/empty, the system will search broadly.
- Only connectors that are enabled/configured for this search space are available.{doc_types_info}
-- For real-time/public web queries (e.g., current exchange rates, stock prices, breaking news, weather),
- explicitly include live web connectors in `connectors_to_search`, prioritizing:
- ["LINKUP_API", "TAVILY_API", "SEARXNG_API", "BAIDU_SEARCH_API"].
## Available connector enums for `connectors_to_search`
diff --git a/surfsense_backend/app/agents/new_chat/tools/podcast.py b/surfsense_backend/app/agents/new_chat/tools/podcast.py
index 8ac537f9a..248a4f450 100644
--- a/surfsense_backend/app/agents/new_chat/tools/podcast.py
+++ b/surfsense_backend/app/agents/new_chat/tools/podcast.py
@@ -4,60 +4,15 @@ Podcast generation tool for the SurfSense agent.
This module provides a factory function for creating the generate_podcast tool
that submits a Celery task for background podcast generation. The frontend
polls for completion and auto-updates when the podcast is ready.
-
-Duplicate request prevention:
-- Only one podcast can be generated at a time per search space
-- Uses Redis to track active podcast tasks
-- Returns a friendly message if a podcast is already being generated
"""
from typing import Any
-import redis
from langchain_core.tools import tool
from sqlalchemy.ext.asyncio import AsyncSession
-from app.config import config
from app.db import Podcast, PodcastStatus
-# Redis connection for tracking active podcast tasks
-# Defaults to the Celery broker when REDIS_APP_URL is not set
-REDIS_URL = config.REDIS_APP_URL
-_redis_client: redis.Redis | None = None
-
-
-def get_redis_client() -> redis.Redis:
- """Get or create Redis client for podcast task tracking."""
- global _redis_client
- if _redis_client is None:
- _redis_client = redis.from_url(REDIS_URL, decode_responses=True)
- return _redis_client
-
-
-def _redis_key(search_space_id: int) -> str:
- return f"podcast:generating:{search_space_id}"
-
-
-def get_generating_podcast_id(search_space_id: int) -> int | None:
- """Get the podcast ID currently being generated for this search space."""
- try:
- client = get_redis_client()
- value = client.get(_redis_key(search_space_id))
- return int(value) if value else None
- except Exception:
- return None
-
-
-def set_generating_podcast(search_space_id: int, podcast_id: int) -> None:
- """Mark a podcast as currently generating for this search space."""
- try:
- client = get_redis_client()
- client.setex(_redis_key(search_space_id), 1800, str(podcast_id))
- except Exception as e:
- print(
- f"[generate_podcast] Warning: Could not set generating podcast in Redis: {e}"
- )
-
def create_generate_podcast_tool(
search_space_id: int,
@@ -109,18 +64,6 @@ def create_generate_podcast_tool(
- message: Status message (or "error" field if status is failed)
"""
try:
- generating_podcast_id = get_generating_podcast_id(search_space_id)
- if generating_podcast_id:
- print(
- f"[generate_podcast] Blocked duplicate request. Generating podcast: {generating_podcast_id}"
- )
- return {
- "status": PodcastStatus.GENERATING.value,
- "podcast_id": generating_podcast_id,
- "title": podcast_title,
- "message": "A podcast is already being generated. Please wait for it to complete.",
- }
-
podcast = Podcast(
title=podcast_title,
status=PodcastStatus.PENDING,
@@ -142,8 +85,6 @@ def create_generate_podcast_tool(
user_prompt=user_prompt,
)
- set_generating_podcast(search_space_id, podcast.id)
-
print(f"[generate_podcast] Created podcast {podcast.id}, task: {task.id}")
return {
diff --git a/surfsense_backend/app/agents/new_chat/tools/registry.py b/surfsense_backend/app/agents/new_chat/tools/registry.py
index 030cbf239..4feff7d90 100644
--- a/surfsense_backend/app/agents/new_chat/tools/registry.py
+++ b/surfsense_backend/app/agents/new_chat/tools/registry.py
@@ -73,6 +73,8 @@ from .shared_memory import (
create_save_shared_memory_tool,
)
from .user_memory import create_recall_memory_tool, create_save_memory_tool
+from .video_presentation import create_generate_video_presentation_tool
+from .web_search import create_web_search_tool
# =============================================================================
# Tool Definition
@@ -135,6 +137,17 @@ BUILTIN_TOOLS: list[ToolDefinition] = [
),
requires=["search_space_id", "db_session", "thread_id"],
),
+ # Video presentation generation tool
+ ToolDefinition(
+ name="generate_video_presentation",
+ description="Generate a video presentation with slides and narration from provided content",
+ factory=lambda deps: create_generate_video_presentation_tool(
+ search_space_id=deps["search_space_id"],
+ db_session=deps["db_session"],
+ thread_id=deps["thread_id"],
+ ),
+ requires=["search_space_id", "db_session", "thread_id"],
+ ),
# Report generation tool (inline, short-lived sessions for DB ops)
# Supports internal KB search via source_strategy so the agent doesn't
# need to call search_knowledge_base separately before generating.
@@ -186,7 +199,16 @@ BUILTIN_TOOLS: list[ToolDefinition] = [
),
requires=[], # firecrawl_api_key is optional
),
- # Note: write_todos is now provided by TodoListMiddleware from deepagents
+ # Web search tool — real-time web search via SearXNG + user-configured engines
+ ToolDefinition(
+ name="web_search",
+ description="Search the web for real-time information using configured search engines",
+ factory=lambda deps: create_web_search_tool(
+ search_space_id=deps.get("search_space_id"),
+ available_connectors=deps.get("available_connectors"),
+ ),
+ requires=[],
+ ),
# Surfsense documentation search tool
ToolDefinition(
name="search_surfsense_docs",
diff --git a/surfsense_backend/app/agents/new_chat/tools/video_presentation.py b/surfsense_backend/app/agents/new_chat/tools/video_presentation.py
new file mode 100644
index 000000000..a90e08ac3
--- /dev/null
+++ b/surfsense_backend/app/agents/new_chat/tools/video_presentation.py
@@ -0,0 +1,87 @@
+"""
+Video presentation generation tool for the SurfSense agent.
+
+This module provides a factory function for creating the generate_video_presentation
+tool that submits a Celery task for background video presentation generation.
+The frontend polls for completion and auto-updates when the presentation is ready.
+"""
+
+from typing import Any
+
+from langchain_core.tools import tool
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from app.db import VideoPresentation, VideoPresentationStatus
+
+
+def create_generate_video_presentation_tool(
+ search_space_id: int,
+ db_session: AsyncSession,
+ thread_id: int | None = None,
+):
+ """
+ Factory function to create the generate_video_presentation tool with injected dependencies.
+
+ Pre-creates video presentation record with pending status so the ID is available
+ immediately for frontend polling.
+ """
+
+ @tool
+ async def generate_video_presentation(
+ source_content: str,
+ video_title: str = "SurfSense Presentation",
+ user_prompt: str | None = None,
+ ) -> dict[str, Any]:
+ """Generate a video presentation from the provided content.
+
+ Use this tool when the user asks to create a video, presentation, slides, or slide deck.
+
+ Args:
+ source_content: The text content to turn into a presentation.
+ video_title: Title for the presentation (default: "SurfSense Presentation")
+ user_prompt: Optional style/tone instructions.
+ """
+ try:
+ video_pres = VideoPresentation(
+ title=video_title,
+ status=VideoPresentationStatus.PENDING,
+ search_space_id=search_space_id,
+ thread_id=thread_id,
+ )
+ db_session.add(video_pres)
+ await db_session.commit()
+ await db_session.refresh(video_pres)
+
+ from app.tasks.celery_tasks.video_presentation_tasks import (
+ generate_video_presentation_task,
+ )
+
+ task = generate_video_presentation_task.delay(
+ video_presentation_id=video_pres.id,
+ source_content=source_content,
+ search_space_id=search_space_id,
+ user_prompt=user_prompt,
+ )
+
+ print(
+ f"[generate_video_presentation] Created video presentation {video_pres.id}, task: {task.id}"
+ )
+
+ return {
+ "status": VideoPresentationStatus.PENDING.value,
+ "video_presentation_id": video_pres.id,
+ "title": video_title,
+ "message": "Video presentation generation started. This may take a few minutes.",
+ }
+
+ except Exception as e:
+ error_message = str(e)
+ print(f"[generate_video_presentation] Error: {error_message}")
+ return {
+ "status": VideoPresentationStatus.FAILED.value,
+ "error": error_message,
+ "title": video_title,
+ "video_presentation_id": None,
+ }
+
+ return generate_video_presentation
diff --git a/surfsense_backend/app/agents/new_chat/tools/web_search.py b/surfsense_backend/app/agents/new_chat/tools/web_search.py
new file mode 100644
index 000000000..c67db541c
--- /dev/null
+++ b/surfsense_backend/app/agents/new_chat/tools/web_search.py
@@ -0,0 +1,247 @@
+"""
+Web search tool for the SurfSense agent.
+
+Provides a unified tool for real-time web searches that dispatches to all
+configured search engines: the platform SearXNG instance (always available)
+plus any user-configured live-search connectors (Tavily, Linkup, Baidu).
+"""
+
+import asyncio
+import json
+import time
+from typing import Any
+
+from langchain_core.tools import StructuredTool
+from pydantic import BaseModel, Field
+
+from app.db import shielded_async_session
+from app.services.connector_service import ConnectorService
+from app.utils.perf import get_perf_logger
+
+_LIVE_SEARCH_CONNECTORS: set[str] = {
+ "TAVILY_API",
+ "LINKUP_API",
+ "BAIDU_SEARCH_API",
+}
+
+_LIVE_CONNECTOR_SPECS: dict[str, tuple[str, bool, bool, dict[str, Any]]] = {
+ "TAVILY_API": ("search_tavily", False, True, {}),
+ "LINKUP_API": ("search_linkup", False, False, {"mode": "standard"}),
+ "BAIDU_SEARCH_API": ("search_baidu", False, True, {}),
+}
+
+_CONNECTOR_LABELS: dict[str, str] = {
+ "TAVILY_API": "Tavily",
+ "LINKUP_API": "Linkup",
+ "BAIDU_SEARCH_API": "Baidu",
+}
+
+
+class WebSearchInput(BaseModel):
+ """Input schema for the web_search tool."""
+
+ query: str = Field(
+ description="The search query to look up on the web. Use specific, descriptive terms.",
+ )
+ top_k: int = Field(
+ default=10,
+ description="Number of results to retrieve (default: 10, max: 50).",
+ )
+
+
+def _format_web_results(
+ documents: list[dict[str, Any]],
+ *,
+ max_chars: int = 50_000,
+) -> str:
+ """Format web search results into XML suitable for the LLM context."""
+ if not documents:
+ return "No web search results found."
+
+ parts: list[str] = []
+ total_chars = 0
+
+ for doc in documents:
+ doc_info = doc.get("document") or {}
+ metadata = doc_info.get("metadata") or {}
+ title = doc_info.get("title") or "Web Result"
+ url = metadata.get("url") or ""
+ content = (doc.get("content") or "").strip()
+ source = metadata.get("document_type") or doc.get("source") or "WEB_SEARCH"
+ if not content:
+ continue
+
+ metadata_json = json.dumps(metadata, ensure_ascii=False)
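+ doc_xml = "\n".join(
+ [
+ "<document>",
+ "<document_metadata>",
+ f" <document_type>{source}</document_type>",
+ f" <title><![CDATA[{title}]]></title>",
+ f" <url><![CDATA[{url}]]></url>",
+ f" <metadata_json><![CDATA[{metadata_json}]]></metadata_json>",
+ "</document_metadata>",
+ "<document_content>",
+ f" <![CDATA[{content}]]>",
+ "</document_content>",
+ "</document>",
+ "",
+ ]
+ )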
+
+ if total_chars + len(doc_xml) > max_chars:
+ parts.append("<!-- Output truncated to fit context window -->")
+ break
+
+ parts.append(doc_xml)
+ total_chars += len(doc_xml)
+
+ return "\n".join(parts).strip() or "No web search results found."
+
+
+async def _search_live_connector(
+ connector: str,
+ query: str,
+ search_space_id: int,
+ top_k: int,
+ semaphore: asyncio.Semaphore,
+) -> list[dict[str, Any]]:
+ """Dispatch a single live-search connector (Tavily / Linkup / Baidu)."""
+ perf = get_perf_logger()
+ spec = _LIVE_CONNECTOR_SPECS.get(connector)
+ if spec is None:
+ return []
+
+ method_name, _includes_date_range, includes_top_k, extra_kwargs = spec
+ kwargs: dict[str, Any] = {
+ "user_query": query,
+ "search_space_id": search_space_id,
+ **extra_kwargs,
+ }
+ if includes_top_k:
+ kwargs["top_k"] = top_k
+
+ try:
+ t0 = time.perf_counter()
+ async with semaphore, shielded_async_session() as session:
+ svc = ConnectorService(session, search_space_id)
+ _, chunks = await getattr(svc, method_name)(**kwargs)
+ perf.info(
+ "[web_search] connector=%s results=%d in %.3fs",
+ connector,
+ len(chunks),
+ time.perf_counter() - t0,
+ )
+ return chunks
+ except Exception as e:
+ perf.warning("[web_search] connector=%s FAILED: %s", connector, e)
+ return []
+
+
+def create_web_search_tool(
+ search_space_id: int | None = None,
+ available_connectors: list[str] | None = None,
+) -> StructuredTool:
+ """Factory for the ``web_search`` tool.
+
+ Dispatches in parallel to the platform SearXNG instance and any
+ user-configured live-search connectors (Tavily, Linkup, Baidu).
+ """
+ active_live_connectors: list[str] = []
+ if available_connectors:
+ active_live_connectors = [
+ c for c in available_connectors if c in _LIVE_SEARCH_CONNECTORS
+ ]
+
+ engine_names = ["SearXNG (platform default)"]
+ engine_names.extend(_CONNECTOR_LABELS.get(c, c) for c in active_live_connectors)
+ engines_summary = ", ".join(engine_names)
+
+ description = (
+ "Search the web for real-time information. "
+ "Use this for current events, news, prices, weather, public facts, or any "
+ "question that requires up-to-date information from the internet.\n\n"
+ f"Active search engines: {engines_summary}.\n"
+ "All configured engines are queried in parallel and results are merged."
+ )
+
+ _search_space_id = search_space_id
+ _active_live = active_live_connectors
+
+ async def _web_search_impl(query: str, top_k: int = 10) -> str:
+ from app.services import web_search_service
+
+ perf = get_perf_logger()
+ t0 = time.perf_counter()
+ clamped_top_k = min(max(1, top_k), 50)
+
+ semaphore = asyncio.Semaphore(4)
+ tasks: list[asyncio.Task[list[dict[str, Any]]]] = []
+
+ if web_search_service.is_available():
+
+ async def _searxng() -> list[dict[str, Any]]:
+ async with semaphore:
+ _result_obj, docs = await web_search_service.search(
+ query=query,
+ top_k=clamped_top_k,
+ )
+ return docs
+
+ tasks.append(asyncio.ensure_future(_searxng()))
+
+ if _search_space_id is not None:
+ for connector in _active_live:
+ tasks.append(
+ asyncio.ensure_future(
+ _search_live_connector(
+ connector=connector,
+ query=query,
+ search_space_id=_search_space_id,
+ top_k=clamped_top_k,
+ semaphore=semaphore,
+ )
+ )
+ )
+
+ if not tasks:
+ return "Web search is not available — no search engines are configured."
+
+ results_lists = await asyncio.gather(*tasks, return_exceptions=True)
+
+ all_documents: list[dict[str, Any]] = []
+ for result in results_lists:
+ if isinstance(result, BaseException):
+ perf.warning("[web_search] a search engine failed: %s", result)
+ continue
+ all_documents.extend(result)
+
+ seen_urls: set[str] = set()
+ deduplicated: list[dict[str, Any]] = []
+ for doc in all_documents:
+ url = ((doc.get("document") or {}).get("metadata") or {}).get("url", "")
+ if url and url in seen_urls:
+ continue
+ if url:
+ seen_urls.add(url)
+ deduplicated.append(doc)
+
+ formatted = _format_web_results(deduplicated)
+
+ perf.info(
+ "[web_search] query=%r engines=%d results=%d deduped=%d chars=%d in %.3fs",
+ query[:60],
+ len(tasks),
+ len(all_documents),
+ len(deduplicated),
+ len(formatted),
+ time.perf_counter() - t0,
+ )
+ return formatted
+
+ return StructuredTool(
+ name="web_search",
+ description=description,
+ coroutine=_web_search_impl,
+ args_schema=WebSearchInput,
+ )
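Reviewer note: the merge step in `_web_search_impl` can be exercised in isolation. The following is a minimal sketch, assuming result dicts shaped like the ones read above (`document` → `metadata` → `url`); the helper names `dedupe_by_url` and `clamp_top_k` are hypothetical and do not exist in the patch.

```python
from typing import Any


def clamp_top_k(top_k: int, lower: int = 1, upper: int = 50) -> int:
    """Clamp a requested result count into [lower, upper]."""
    return min(max(lower, top_k), upper)


def dedupe_by_url(documents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first document seen for each non-empty URL.

    Documents without a URL are always kept, because there is no key
    to compare them on.
    """
    seen_urls: set[str] = set()
    unique: list[dict[str, Any]] = []
    for doc in documents:
        url = ((doc.get("document") or {}).get("metadata") or {}).get("url", "")
        if url and url in seen_urls:
            continue
        if url:
            seen_urls.add(url)
        unique.append(doc)
    return unique


# Hypothetical sample data: two engines returned the same page.
sample = [
    {"content": "a", "document": {"metadata": {"url": "https://example.com/x"}}},
    {"content": "b", "document": {"metadata": {"url": "https://example.com/x"}}},
    {"content": "c", "document": {"metadata": {}}},
    {"content": "d", "document": None},
]
merged = dedupe_by_url(sample)
```

Because the first occurrence wins, the order in which engine results are concatenated decides which engine's copy of a shared URL survives.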
diff --git a/surfsense_backend/app/agents/video_presentation/__init__.py b/surfsense_backend/app/agents/video_presentation/__init__.py
new file mode 100644
index 000000000..caf885218
--- /dev/null
+++ b/surfsense_backend/app/agents/video_presentation/__init__.py
@@ -0,0 +1,10 @@
+"""Video Presentation LangGraph Agent.
+
+This module defines a graph for generating video presentations
+from source content, similar to the podcaster agent but producing
+slide-based video presentations with TTS narration.
+"""
+
+from .graph import graph
+
+__all__ = ["graph"]
diff --git a/surfsense_backend/app/agents/video_presentation/configuration.py b/surfsense_backend/app/agents/video_presentation/configuration.py
new file mode 100644
index 000000000..18724a2ab
--- /dev/null
+++ b/surfsense_backend/app/agents/video_presentation/configuration.py
@@ -0,0 +1,25 @@
+"""Define the configurable parameters for the video presentation agent."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, fields
+
+from langchain_core.runnables import RunnableConfig
+
+
+@dataclass(kw_only=True)
+class Configuration:
+ """The configuration for the video presentation agent."""
+
+ video_title: str
+ search_space_id: int
+ user_prompt: str | None = None
+
+ @classmethod
+ def from_runnable_config(
+ cls, config: RunnableConfig | None = None
+ ) -> Configuration:
+ """Create a Configuration instance from a RunnableConfig object."""
+ configurable = (config.get("configurable") or {}) if config else {}
+ _fields = {f.name for f in fields(cls) if f.init}
+ return cls(**{k: v for k, v in configurable.items() if k in _fields})
diff --git a/surfsense_backend/app/agents/video_presentation/graph.py b/surfsense_backend/app/agents/video_presentation/graph.py
new file mode 100644
index 000000000..1d87bcd76
--- /dev/null
+++ b/surfsense_backend/app/agents/video_presentation/graph.py
@@ -0,0 +1,39 @@
+from langgraph.graph import StateGraph
+
+from .configuration import Configuration
+from .nodes import (
+ assign_slide_themes,
+ create_presentation_slides,
+ create_slide_audio,
+ generate_slide_scene_codes,
+)
+from .state import State
+
+
+def build_graph():
+ workflow = StateGraph(State, config_schema=Configuration)
+
+ workflow.add_node("create_presentation_slides", create_presentation_slides)
+ workflow.add_node("create_slide_audio", create_slide_audio)
+ workflow.add_node("assign_slide_themes", assign_slide_themes)
+ workflow.add_node("generate_slide_scene_codes", generate_slide_scene_codes)
+
+ # Fan-out: after slides are parsed, run audio generation and theme
+ # assignment in parallel (themes only need slide metadata, not audio).
+ workflow.add_edge("__start__", "create_presentation_slides")
+ workflow.add_edge("create_presentation_slides", "create_slide_audio")
+ workflow.add_edge("create_presentation_slides", "assign_slide_themes")
+
+ # Fan-in: scene code generation waits for both audio and themes.
+ workflow.add_edge("create_slide_audio", "generate_slide_scene_codes")
+ workflow.add_edge("assign_slide_themes", "generate_slide_scene_codes")
+
+ workflow.add_edge("generate_slide_scene_codes", "__end__")
+
+ graph = workflow.compile()
+ graph.name = "Surfsense Video Presentation"
+
+ return graph
+
+
+graph = build_graph()
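Reviewer note: the fan-out/fan-in shape wired above can be illustrated with plain `asyncio`. This is only an analogue of the control flow, not LangGraph code; every function below is a hypothetical stand-in for the node of the same name, and the durations and themes are made-up placeholder values.

```python
import asyncio

THEMES = ["TERRA", "OCEAN"]  # placeholder subset, for illustration only


async def create_presentation_slides(source: str) -> list[str]:
    """Stand-in: split the source into one 'slide' per sentence."""
    return [part.strip() for part in source.split(".") if part.strip()]


async def create_slide_audio(slides: list[str]) -> dict[int, int]:
    """Stand-in: pretend every slide yields 300 frames of audio."""
    await asyncio.sleep(0)  # yield control, as real TTS I/O would
    return {n: 300 for n in range(1, len(slides) + 1)}


async def assign_slide_themes(slides: list[str]) -> dict[int, str]:
    """Stand-in: round-robin themes; needs only slide metadata, not audio."""
    await asyncio.sleep(0)
    return {n: THEMES[(n - 1) % len(THEMES)] for n in range(1, len(slides) + 1)}


async def run_pipeline(source: str) -> list[dict]:
    slides = await create_presentation_slides(source)
    # Fan-out: both branches depend only on the slides, so run them together.
    audio, themes = await asyncio.gather(
        create_slide_audio(slides), assign_slide_themes(slides)
    )
    # Fan-in: scene generation needs the results of both branches.
    return [
        {"slide_number": n, "duration_in_frames": audio[n], "theme": themes[n]}
        for n in sorted(audio)
    ]


scenes = asyncio.run(run_pipeline("Intro. Details."))
```

The point of the split is that theme assignment does not have to wait for the slower audio branch before it starts.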
diff --git a/surfsense_backend/app/agents/video_presentation/nodes.py b/surfsense_backend/app/agents/video_presentation/nodes.py
new file mode 100644
index 000000000..1b3d71e84
--- /dev/null
+++ b/surfsense_backend/app/agents/video_presentation/nodes.py
@@ -0,0 +1,580 @@
+import asyncio
+import contextlib
+import json
+import math
+import os
+import shutil
+import uuid
+from pathlib import Path
+from typing import Any
+
+from ffmpeg.asyncio import FFmpeg
+from langchain_core.messages import HumanMessage, SystemMessage
+from langchain_core.runnables import RunnableConfig
+from litellm import aspeech
+
+from app.config import config as app_config
+from app.services.kokoro_tts_service import get_kokoro_tts_service
+from app.services.llm_service import get_agent_llm
+
+from .configuration import Configuration
+from .prompts import (
+ DEFAULT_DURATION_IN_FRAMES,
+ FPS,
+ REFINE_SCENE_SYSTEM_PROMPT,
+ REMOTION_SCENE_SYSTEM_PROMPT,
+ THEME_PRESETS,
+ build_scene_generation_user_prompt,
+ build_theme_assignment_user_prompt,
+ get_slide_generation_prompt,
+ get_theme_assignment_system_prompt,
+ pick_theme_and_mode_fallback,
+)
+from .state import (
+ PresentationSlides,
+ SlideAudioResult,
+ SlideContent,
+ SlideSceneCode,
+ State,
+)
+from .utils import get_voice_for_provider
+
+MAX_REFINE_ATTEMPTS = 3
+
+
+async def create_presentation_slides(
+ state: State, config: RunnableConfig
+) -> dict[str, Any]:
+ """Parse source content into structured presentation slides using LLM."""
+
+ configuration = Configuration.from_runnable_config(config)
+ search_space_id = configuration.search_space_id
+ user_prompt = configuration.user_prompt
+
+ llm = await get_agent_llm(state.db_session, search_space_id)
+ if not llm:
+ error_message = f"No LLM configured for search space {search_space_id}"
+ print(error_message)
+ raise RuntimeError(error_message)
+
+ prompt = get_slide_generation_prompt(user_prompt)
+
+ messages = [
+ SystemMessage(content=prompt),
+ HumanMessage(
+ content=f"<source_content>{state.source_content}</source_content>"
+ ),
+ ]
+
+ llm_response = await llm.ainvoke(messages)
+
+ try:
+ presentation = PresentationSlides.model_validate(
+ json.loads(llm_response.content)
+ )
+ except (json.JSONDecodeError, ValueError) as e:
+ print(f"Direct JSON parsing failed, trying fallback approach: {e!s}")
+
+ try:
+ content = llm_response.content
+ json_start = content.find("{")
+ json_end = content.rfind("}") + 1
+ if json_start >= 0 and json_end > json_start:
+ json_str = content[json_start:json_end]
+ parsed_data = json.loads(json_str)
+ presentation = PresentationSlides.model_validate(parsed_data)
+ print("Successfully parsed presentation slides using fallback approach")
+ else:
+ error_message = f"Could not find valid JSON in LLM response. Raw response: {content}"
+ print(error_message)
+ raise ValueError(error_message)
+
+ except (json.JSONDecodeError, ValueError) as e2:
+ error_message = f"Error parsing LLM response (fallback also failed): {e2!s}"
+ print(error_message)
+ print(f"Raw response: {llm_response.content}")
+ raise
+
+ return {"slides": presentation.slides}
+
+
+async def create_slide_audio(state: State, config: RunnableConfig) -> dict[str, Any]:
+ """Generate TTS audio for each slide.
+
+ Each slide's speaker_transcripts are generated as individual TTS chunks,
+ then concatenated with ffmpeg (matching the POC in RemotionTets/api/tts).
+ """
+
+ session_id = str(uuid.uuid4())
+ temp_dir = Path("temp_audio")
+ temp_dir.mkdir(exist_ok=True)
+ output_dir = Path("video_presentation_audio")
+ output_dir.mkdir(exist_ok=True)
+
+ slides = state.slides or []
+ voice = get_voice_for_provider(app_config.TTS_SERVICE, speaker_id=0)
+ ext = "wav" if app_config.TTS_SERVICE == "local/kokoro" else "mp3"
+
+ async def _generate_tts_chunk(text: str, chunk_path: str) -> str:
+ """Generate a single TTS chunk and write it to *chunk_path*."""
+ if app_config.TTS_SERVICE == "local/kokoro":
+ kokoro_service = await get_kokoro_tts_service(lang_code="a")
+ await kokoro_service.generate_speech(
+ text=text,
+ voice=voice,
+ speed=1.0,
+ output_path=chunk_path,
+ )
+ else:
+ kwargs: dict[str, Any] = {
+ "model": app_config.TTS_SERVICE,
+ "api_key": app_config.TTS_SERVICE_API_KEY,
+ "voice": voice,
+ "input": text,
+ "max_retries": 2,
+ "timeout": 600,
+ }
+ if app_config.TTS_SERVICE_API_BASE:
+ kwargs["api_base"] = app_config.TTS_SERVICE_API_BASE
+
+ response = await aspeech(**kwargs)
+ with open(chunk_path, "wb") as f:
+ f.write(response.content)
+
+ return chunk_path
+
+ async def _concat_with_ffmpeg(chunk_paths: list[str], output_file: str) -> None:
+ """Concatenate multiple audio chunks into one file using async ffmpeg."""
+ ffmpeg = FFmpeg().option("y")
+ for chunk in chunk_paths:
+ ffmpeg = ffmpeg.input(chunk)
+
+ filter_parts = [f"[{i}:0]" for i in range(len(chunk_paths))]
+ filter_str = (
+ "".join(filter_parts) + f"concat=n={len(chunk_paths)}:v=0:a=1[outa]"
+ )
+ ffmpeg = ffmpeg.option("filter_complex", filter_str)
+ ffmpeg = ffmpeg.output(output_file, map="[outa]")
+ await ffmpeg.execute()
+
+ async def generate_audio_for_slide(slide: SlideContent) -> SlideAudioResult:
+ has_transcripts = (
+ slide.speaker_transcripts and len(slide.speaker_transcripts) > 0
+ )
+
+ if not has_transcripts:
+ print(
+ f"Slide {slide.slide_number}: no speaker_transcripts, "
+ f"using default duration ({DEFAULT_DURATION_IN_FRAMES} frames)"
+ )
+ return SlideAudioResult(
+ slide_number=slide.slide_number,
+ audio_file="",
+ duration_seconds=DEFAULT_DURATION_IN_FRAMES / FPS,
+ duration_in_frames=DEFAULT_DURATION_IN_FRAMES,
+ )
+
+ output_file = str(output_dir / f"{session_id}_slide_{slide.slide_number}.{ext}")
+
+ chunk_paths: list[str] = []
+ try:
+ chunk_paths = [
+ str(
+ temp_dir
+ / f"{session_id}_slide_{slide.slide_number}_chunk_{i}.{ext}"
+ )
+ for i in range(len(slide.speaker_transcripts))
+ ]
+
+ for i, text in enumerate(slide.speaker_transcripts):
+ print(
+ f" Slide {slide.slide_number} chunk {i + 1}/"
+ f"{len(slide.speaker_transcripts)}: "
+ f'"{text[:60]}..."'
+ )
+
+ await asyncio.gather(
+ *[
+ _generate_tts_chunk(text, path)
+ for text, path in zip(
+ slide.speaker_transcripts, chunk_paths, strict=False
+ )
+ ]
+ )
+
+ if len(chunk_paths) == 1:
+ shutil.move(chunk_paths[0], output_file)
+ else:
+ print(
+ f" Concatenating {len(chunk_paths)} chunks for slide "
+ f"{slide.slide_number} with ffmpeg"
+ )
+ await _concat_with_ffmpeg(chunk_paths, output_file)
+
+ duration_seconds = await _get_audio_duration(output_file)
+ duration_in_frames = math.ceil(duration_seconds * FPS)
+
+ return SlideAudioResult(
+ slide_number=slide.slide_number,
+ audio_file=output_file,
+ duration_seconds=duration_seconds,
+ duration_in_frames=max(duration_in_frames, DEFAULT_DURATION_IN_FRAMES),
+ )
+
+ except Exception as e:
+ print(f"Error generating audio for slide {slide.slide_number}: {e!s}")
+ raise
+ finally:
+ for p in chunk_paths:
+ with contextlib.suppress(OSError):
+ os.remove(p)
+
+ tasks = [generate_audio_for_slide(slide) for slide in slides]
+ audio_results = await asyncio.gather(*tasks)
+
+ audio_results_sorted = sorted(audio_results, key=lambda r: r.slide_number)
+
+ print(
+ f"Generated audio for {len(audio_results_sorted)} slides "
+ f"(total duration: {sum(r.duration_seconds for r in audio_results_sorted):.1f}s)"
+ )
+
+ return {"slide_audio_results": audio_results_sorted}
+
+
+async def _get_audio_duration(file_path: str) -> float:
+ """Get audio duration in seconds by running the ffprobe CLI as a subprocess.
+
+ Falls back to file-size estimation if ffprobe fails.
+ """
+ try:
+ import subprocess
+
+ proc = await asyncio.create_subprocess_exec(
+ "ffprobe",
+ "-v",
+ "error",
+ "-show_entries",
+ "format=duration",
+ "-of",
+ "default=noprint_wrappers=1:nokey=1",
+ file_path,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.PIPE,
+ )
+ stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=10)
+ if proc.returncode == 0 and stdout.strip():
+ return float(stdout.strip())
+ except Exception as e:
+ print(f"ffprobe failed for {file_path}: {e!s}, using file-size estimation")
+
+ try:
+ file_size = os.path.getsize(file_path)
+ if file_path.endswith(".wav"):
+ return file_size / (16000 * 2)
+ else:
+ return file_size / 16000
+ except Exception:
+ return DEFAULT_DURATION_IN_FRAMES / FPS
+
+
+async def _assign_themes_with_llm(
+ llm, slides: list[SlideContent]
+) -> dict[int, tuple[str, str]]:
+ """Ask the LLM to assign a theme+mode to each slide in one call.
+
+ Returns a dict mapping slide_number → (theme, mode).
+ Falls back to round-robin if the LLM response can't be parsed.
+ """
+ total = len(slides)
+ slide_summaries = [
+ {
+ "slide_number": s.slide_number,
+ "title": s.title,
+ "subtitle": s.subtitle or "",
+ "background_explanation": s.background_explanation or "",
+ }
+ for s in slides
+ ]
+
+ system = get_theme_assignment_system_prompt()
+ user = build_theme_assignment_user_prompt(slide_summaries)
+
+ try:
+ response = await llm.ainvoke(
+ [
+ SystemMessage(content=system),
+ HumanMessage(content=user),
+ ]
+ )
+
+ text = response.content.strip()
+ if text.startswith("```"):
+ lines = text.split("\n")
+ text = "\n".join(
+ line for line in lines if not line.strip().startswith("```")
+ ).strip()
+
+ assignments = json.loads(text)
+ valid_themes = set(THEME_PRESETS)
+ result: dict[int, tuple[str, str]] = {}
+ for entry in assignments:
+ sn = entry.get("slide_number")
+ theme = entry.get("theme", "").upper()
+ mode = entry.get("mode", "dark").lower()
+ if sn and theme in valid_themes and mode in ("dark", "light"):
+ result[sn] = (theme, mode)
+
+ if len(result) == total:
+ print(
+ "LLM theme assignment: "
+ + ", ".join(f"S{sn}={t}/{m}" for sn, (t, m) in sorted(result.items()))
+ )
+ return result
+
+ print(
+ f"LLM returned {len(result)}/{total} valid assignments, "
+ "filling gaps with fallback"
+ )
+ for s in slides:
+ if s.slide_number not in result:
+ result[s.slide_number] = pick_theme_and_mode_fallback(
+ s.slide_number - 1, total
+ )
+ return result
+
+ except Exception as e:
+ print(f"LLM theme assignment failed ({e!s}), using fallback")
+ return {
+ s.slide_number: pick_theme_and_mode_fallback(s.slide_number - 1, total)
+ for s in slides
+ }
+
+
+async def assign_slide_themes(state: State, config: RunnableConfig) -> dict[str, Any]:
+ """Assign a theme preset + dark/light mode to every slide via a single LLM call.
+
+ Runs in parallel with audio generation since it only needs slide metadata.
+ """
+ configuration = Configuration.from_runnable_config(config)
+ search_space_id = configuration.search_space_id
+
+ llm = await get_agent_llm(state.db_session, search_space_id)
+ if not llm:
+ raise RuntimeError(f"No LLM configured for search space {search_space_id}")
+
+ slides = state.slides or []
+ assignments = await _assign_themes_with_llm(llm, slides)
+ return {"slide_theme_assignments": assignments}
+
+
+async def generate_slide_scene_codes(
+ state: State, config: RunnableConfig
+) -> dict[str, Any]:
+ """Generate Remotion component code for each slide using LLM.
+
+ Reads pre-assigned themes from state (produced by the parallel
+ assign_slide_themes node) and generates scene code concurrently.
+ """
+
+ configuration = Configuration.from_runnable_config(config)
+ search_space_id = configuration.search_space_id
+
+ llm = await get_agent_llm(state.db_session, search_space_id)
+ if not llm:
+ raise RuntimeError(f"No LLM configured for search space {search_space_id}")
+
+ slides = state.slides or []
+ audio_results = state.slide_audio_results or []
+
+ audio_map: dict[int, SlideAudioResult] = {r.slide_number: r for r in audio_results}
+ total_slides = len(slides)
+
+ theme_assignments = state.slide_theme_assignments or {}
+
+ async def _generate_scene_for_slide(slide: SlideContent) -> SlideSceneCode:
+ audio = audio_map.get(slide.slide_number)
+ duration = audio.duration_in_frames if audio else DEFAULT_DURATION_IN_FRAMES
+
+ theme, mode = theme_assignments.get(
+ slide.slide_number,
+ pick_theme_and_mode_fallback(slide.slide_number - 1, total_slides),
+ )
+
+ user_prompt = build_scene_generation_user_prompt(
+ slide_number=slide.slide_number,
+ total_slides=total_slides,
+ title=slide.title,
+ subtitle=slide.subtitle,
+ content_in_markdown=slide.content_in_markdown,
+ background_explanation=slide.background_explanation,
+ duration_in_frames=duration,
+ theme=theme,
+ mode=mode,
+ )
+
+ messages = [
+ SystemMessage(content=REMOTION_SCENE_SYSTEM_PROMPT),
+ HumanMessage(content=user_prompt),
+ ]
+
+ print(
+ f"Generating scene code for slide {slide.slide_number}/{total_slides}: "
+ f'"{slide.title}" ({duration} frames)'
+ )
+
+ llm_response = await llm.ainvoke(messages)
+ code, scene_title = _extract_code_and_title(llm_response.content)
+
+ code = await _refine_if_needed(llm, code, slide.slide_number)
+
+ print(f"Scene code ready for slide {slide.slide_number} ({len(code)} chars)")
+
+ return SlideSceneCode(
+ slide_number=slide.slide_number,
+ code=code,
+ title=scene_title or slide.title,
+ )
+
+ scene_codes = list(
+ await asyncio.gather(*[_generate_scene_for_slide(s) for s in slides])
+ )
+
+ return {"slide_scene_codes": scene_codes}
+
+
+def _extract_code_and_title(content: str) -> tuple[str, str | None]:
+ """Extract code and optional title from LLM response.
+
+ The LLM may return a JSON object like the POC's structured output:
+ { "code": "...", "title": "..." }
+ Or it may return raw code (with optional markdown fences).
+
+ Returns (code, title) where title may be None.
+ """
+ text = content.strip()
+
+ if text.startswith("{"):
+ try:
+ parsed = json.loads(text)
+ if isinstance(parsed, dict) and "code" in parsed:
+ return parsed["code"], parsed.get("title")
+ except (json.JSONDecodeError, ValueError):
+ pass
+
+ json_start = text.find("{")
+ json_end = text.rfind("}") + 1
+ if json_start >= 0 and json_end > json_start:
+ try:
+ parsed = json.loads(text[json_start:json_end])
+ if isinstance(parsed, dict) and "code" in parsed:
+ return parsed["code"], parsed.get("title")
+ except (json.JSONDecodeError, ValueError):
+ pass
+
+ code = text
+ if code.startswith("```"):
+ lines = code.split("\n")
+ start = 1
+ end = len(lines)
+ for i in range(len(lines) - 1, 0, -1):
+ if lines[i].strip().startswith("```"):
+ end = i
+ break
+ code = "\n".join(lines[start:end]).strip()
+
+ return code, None
+
+
+async def _refine_if_needed(llm, code: str, slide_number: int) -> str:
+ """Attempt basic syntax validation and auto-repair via LLM if needed.
+
+ Raises RuntimeError if the code is still invalid after MAX_REFINE_ATTEMPTS,
+ matching the POC's behavior where a failed slide aborts the pipeline.
+ """
+ error = _basic_syntax_check(code)
+ if error is None:
+ return code
+
+ for attempt in range(1, MAX_REFINE_ATTEMPTS + 1):
+ print(
+ f"Slide {slide_number}: syntax issue (attempt {attempt}/{MAX_REFINE_ATTEMPTS}): {error}"
+ )
+
+ messages = [
+ SystemMessage(content=REFINE_SCENE_SYSTEM_PROMPT),
+ HumanMessage(
+ content=(
+ f"Here is the broken Remotion component code:\n\n{code}\n\n"
+ f"Compilation error:\n{error}\n\nFix the code."
+ )
+ ),
+ ]
+
+ response = await llm.ainvoke(messages)
+ code, _ = _extract_code_and_title(response.content)
+
+ error = _basic_syntax_check(code)
+ if error is None:
+ print(f"Slide {slide_number}: fixed on attempt {attempt}")
+ return code
+
+ raise RuntimeError(
+ f"Slide {slide_number} failed to compile after {MAX_REFINE_ATTEMPTS} "
+ f"refine attempts. Last error: {error}"
+ )
+
+
+def _basic_syntax_check(code: str) -> str | None:
+ """Run a lightweight syntax check on the generated code.
+
+ Full Babel-based compilation happens on the frontend. This backend check
+ catches the most common LLM code-generation mistakes so the refine loop
+ can fix them before persisting.
+
+ Returns an error description or None if the code looks valid.
+ """
+ if not code or not code.strip():
+ return "Empty code"
+
+ if "export" not in code and "MyComposition" not in code:
+ return "Missing exported component (expected 'export const MyComposition')"
+
+ brace_count = 0
+ paren_count = 0
+ bracket_count = 0
+ for ch in code:
+ if ch == "{":
+ brace_count += 1
+ elif ch == "}":
+ brace_count -= 1
+ elif ch == "(":
+ paren_count += 1
+ elif ch == ")":
+ paren_count -= 1
+ elif ch == "[":
+ bracket_count += 1
+ elif ch == "]":
+ bracket_count -= 1
+
+ if brace_count < 0:
+ return "Unmatched closing brace '}'"
+ if paren_count < 0:
+ return "Unmatched closing parenthesis ')'"
+ if bracket_count < 0:
+ return "Unmatched closing bracket ']'"
+
+ if brace_count != 0:
+ return f"Unbalanced braces: {brace_count} unclosed"
+ if paren_count != 0:
+ return f"Unbalanced parentheses: {paren_count} unclosed"
+ if bracket_count != 0:
+ return f"Unbalanced brackets: {bracket_count} unclosed"
+
+ if "useCurrentFrame" not in code:
+ return "Missing useCurrentFrame() — required for Remotion animations"
+
+ if "AbsoluteFill" not in code:
+ return "Missing AbsoluteFill — required as the root layout component"
+
+ return None
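The counter-based check in `_basic_syntax_check` tracks only running totals per bracket type, so an interleaved sequence such as `([)]` passes. A stack-based variant catches that case. The following is a minimal sketch, not part of the patch: `find_bracket_error` and its messages are hypothetical, and like the original it does not skip brackets inside string literals or comments.

```python
from typing import Optional

# Map each closing bracket to the opener it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def find_bracket_error(code: str) -> Optional[str]:
    """Return a description of the first bracket problem, or None if balanced.

    Hypothetical helper: uses a stack so that nesting order is verified,
    not just the per-type totals.
    """
    stack = []
    for ch in code:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            if not stack:
                return f"Unmatched closing '{ch}'"
            opener = stack.pop()
            if opener != PAIRS[ch]:
                return f"'{ch}' closes '{opener}'"
    if stack:
        return f"{len(stack)} unclosed, innermost '{stack[-1]}'"
    return None
```

Because the error names the offending pair, the refine prompt could pass the LLM a more specific hint than a bare count, under the assumption that the message is forwarded the same way `error` is today.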
diff --git a/surfsense_backend/app/agents/video_presentation/prompts.py b/surfsense_backend/app/agents/video_presentation/prompts.py
new file mode 100644
index 000000000..5533bb01c
--- /dev/null
+++ b/surfsense_backend/app/agents/video_presentation/prompts.py
@@ -0,0 +1,509 @@
+import datetime
+
+# TODO: move these to config file
+MAX_SLIDES = 5
+FPS = 30
+DEFAULT_DURATION_IN_FRAMES = 300
+
+THEME_PRESETS = [
+ "TERRA",
+ "OCEAN",
+ "SUNSET",
+ "EMERALD",
+ "ECLIPSE",
+ "ROSE",
+ "FROST",
+ "NEBULA",
+ "AURORA",
+ "CORAL",
+ "MIDNIGHT",
+ "AMBER",
+ "LAVENDER",
+ "STEEL",
+ "CITRUS",
+ "CHERRY",
+]
+
+THEME_DESCRIPTIONS: dict[str, str] = {
+ "TERRA": "Warm earthy tones — terracotta, olive. Heritage, tradition, organic warmth.",
+ "OCEAN": "Cool oceanic depth — teal, coral accents. Calm, marine, fluid elegance.",
+ "SUNSET": "Vibrant warm energy — orange, purple. Passion, creativity, bold expression.",
+ "EMERALD": "Fresh natural life — green, mint. Growth, health, sustainability.",
+ "ECLIPSE": "Dramatic luxury — black, gold. Premium, power, prestige.",
+ "ROSE": "Soft elegance — dusty pink, mauve. Beauty, care, refined femininity.",
+ "FROST": "Crisp clarity — ice blue, silver. Tech, data, precision analytics.",
+ "NEBULA": "Cosmic mystery — magenta, deep purple. AI, innovation, cutting-edge future.",
+ "AURORA": "Ethereal northern lights — green-teal, violet. Mystical, transformative, wonder.",
+ "CORAL": "Tropical warmth — coral, turquoise. Inviting, lively, community.",
+ "MIDNIGHT": "Deep sophistication — navy, silver. Contemplative, trust, authority.",
+ "AMBER": "Rich honey warmth — amber, brown. Comfort, wisdom, organic richness.",
+ "LAVENDER": "Gentle dreaminess — purple, lilac. Calm, imaginative, serene.",
+ "STEEL": "Industrial strength — gray, steel blue. Modern professional, reliability.",
+ "CITRUS": "Bright optimism — yellow, lime. Energy, joy, fresh starts.",
+ "CHERRY": "Bold impact — deep red, dark. Power, urgency, passionate conviction.",
+}
+
+
+# ---------------------------------------------------------------------------
+# LLM-based theme assignment (replaces keyword-based pick_theme_and_mode)
+# ---------------------------------------------------------------------------
+
+THEME_ASSIGNMENT_SYSTEM_PROMPT = """You are a visual design director assigning color themes to presentation slides.
+Given a list of slides, assign each slide a theme preset and color mode (dark or light).
+
+Available themes (name — description):
+{theme_list}
+
+Rules:
+1. Pick the theme that best matches each slide's mood, content, and visual direction.
+2. Maximize visual variety — avoid repeating the same theme on consecutive slides.
+3. Mix dark and light modes across the presentation for contrast and rhythm.
+4. Opening slides often benefit from a bold dark theme; closing/summary slides can go either way.
+5. The "background_explanation" field is the primary signal — it describes the intended mood and color direction.
+
+Return ONLY a JSON array (no markdown fences, no explanation):
+[
+ {{"slide_number": 1, "theme": "THEME_NAME", "mode": "dark"}},
+ {{"slide_number": 2, "theme": "THEME_NAME", "mode": "light"}}
+]
+""".strip()
+
+
+def build_theme_assignment_user_prompt(
+    slides: list[dict[str, str]],
+) -> str:
+    """Build the user prompt for LLM theme assignment.
+
+    *slides* is a list of dicts with keys: slide_number, title, subtitle,
+    background_explanation (mood).
+    """
+    lines = ["Assign a theme and mode to each of these slides:", ""]
+    for s in slides:
+        lines.append(
+            f'Slide {s["slide_number"]}: "{s["title"]}" '
+            f'(subtitle: "{s.get("subtitle", "")}") — '
+            f'Mood: "{s.get("background_explanation", "neutral")}"'
+        )
+    return "\n".join(lines)
+
+
+def get_theme_assignment_system_prompt() -> str:
+    """Return the theme assignment system prompt with the full theme list injected."""
+    theme_list = "\n".join(
+        f"- {name}: {desc}" for name, desc in THEME_DESCRIPTIONS.items()
+    )
+    return THEME_ASSIGNMENT_SYSTEM_PROMPT.format(theme_list=theme_list)
+
+
+def pick_theme_and_mode_fallback(
+    slide_index: int, total_slides: int
+) -> tuple[str, str]:
+    """Simple round-robin fallback when LLM theme assignment fails."""
+    theme = THEME_PRESETS[slide_index % len(THEME_PRESETS)]
+    mode = "dark" if slide_index % 2 == 0 else "light"
+    if total_slides == 1:
+        mode = "dark"
+    return theme, mode
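The theme-assignment prompt asks for a bare JSON array, and `pick_theme_and_mode_fallback` covers the case where that reply is unusable. One way the two might be combined per slide is sketched below. This is an assumption about the calling code, which is not shown here: `resolve_themes` is a hypothetical helper, and `THEMES` and `fallback` are shortened stand-ins for `THEME_PRESETS` and `pick_theme_and_mode_fallback` so the block runs on its own.

```python
import json

# Shortened stand-in for THEME_PRESETS so the sketch is self-contained.
THEMES = ["TERRA", "OCEAN", "SUNSET"]


def fallback(slide_index: int, total_slides: int) -> tuple:
    """Stand-in mirroring pick_theme_and_mode_fallback (round-robin)."""
    theme = THEMES[slide_index % len(THEMES)]
    mode = "dark" if slide_index % 2 == 0 or total_slides == 1 else "light"
    return theme, mode


def resolve_themes(raw_reply: str, total_slides: int) -> list:
    """Hypothetical helper: one (theme, mode) per slide.

    Entries from the LLM reply are used only when they are well-formed;
    every other slide falls back to the round-robin choice.
    """
    try:
        parsed = json.loads(raw_reply)
    except json.JSONDecodeError:
        parsed = []

    by_number = {}
    if isinstance(parsed, list):
        for item in parsed:
            if (
                isinstance(item, dict)
                and isinstance(item.get("slide_number"), int)
                and item.get("theme") in THEMES
                and item.get("mode") in ("dark", "light")
            ):
                by_number[item["slide_number"]] = (item["theme"], item["mode"])

    # slide_number is 1-based in the prompt; slide_index is 0-based in the fallback.
    return [
        by_number.get(i + 1) or fallback(i, total_slides)
        for i in range(total_slides)
    ]
```

Validating each entry separately means one unknown theme name does not discard the rest of an otherwise usable reply.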
+
+
+def get_slide_generation_prompt(user_prompt: str | None = None) -> str:
+ return f"""
+Today's date: {datetime.datetime.now().strftime("%Y-%m-%d")}
+
+You are a content-to-slides converter. You receive raw source content (articles, notes, transcripts,
+product descriptions, chat conversations, etc.) and break it into a sequence of presentation slides
+for a video presentation with voiceover narration.
+
+{
+ f'''
+You **MUST** strictly adhere to the following user instruction while generating the slides:
+
+{user_prompt}
+
+'''
+ if user_prompt
+ else ""
+ }
+
+
+- '': A block of text containing the information to be presented. This could be
+ research findings, an article summary, a detailed outline, user chat history, or any relevant
+ raw information. The content serves as the factual basis for the video presentation.
+
+
+
+A JSON object containing the presentation slides:
+{{
+ "slides": [
+ {{
+ "slide_number": 1,
+ "title": "Concise slide title",
+ "subtitle": "One-line subtitle or tagline",
+ "content_in_markdown": "## Heading\\n- Bullet point 1\\n- **Bold text**\\n- Bullet point 3",
+ "speaker_transcripts": [
+ "First narration sentence for this slide.",
+ "Second narration sentence expanding on the point.",
+ "Third sentence wrapping up this slide."
+ ],
+ "background_explanation": "Emotional mood and color direction for this slide"
+ }}
+ ]
+}}
+
+
+
+=== SLIDE COUNT ===
+
+Dynamically decide the number of slides between 1 and {MAX_SLIDES} (inclusive).
+Base your decision entirely on the content's depth, richness, and how many distinct ideas it contains.
+Thin or simple content should produce fewer slides; dense or multi-faceted content may use more.
+Do NOT inflate or pad slides to reach {
+ MAX_SLIDES
+ } — only use what the content genuinely warrants.
+Do NOT treat {MAX_SLIDES} as a target; it is a hard ceiling, not a goal.
+
+=== SLIDE STRUCTURE ===
+
+- Each slide should cover ONE distinct key idea or section.
+- Keep slides focused: 2-5 bullet points of content per slide max.
+- The first slide should be a title/intro slide.
+- The last slide should be a summary or closing slide ONLY if there are 3+ slides.
+ For 1-2 slides, skip the closing slide — just cover the content.
+- Do NOT create a separate closing slide if its content would just repeat earlier slides.
+
+=== CONTENT FIELDS ===
+
+- Write speaker_transcripts as if a human presenter is narrating — natural, conversational, 2-4 sentences per slide.
+ These will be converted to TTS audio, so write in a way that sounds great when spoken aloud.
+- background_explanation should describe a visual style matching the slide's mood:
+ - Describe the emotional feel: "warm and organic", "dramatic and urgent", "clean and optimistic",
+ "technical and precise", "celebratory", "earthy and grounded", "cosmic and futuristic"
+ - Mention color direction: warm tones, cool tones, earth tones, neon accents, gold/black, etc.
+ - Vary the mood across slides — do NOT always say "dark blue gradient".
+- content_in_markdown should use proper markdown: ## headings, **bold**, - bullets, etc.
+
+=== NARRATION QUALITY ===
+
+- Speaker transcripts should explain the slide content in an engaging, presenter-like voice.
+- Keep narration concise: 2-4 sentences per slide (targeting ~10-15 seconds of audio per slide).
+- The narration should add context beyond what's on the slide — don't just read the bullets.
+- Use natural language: contractions, conversational tone, occasional enthusiasm.
+
+
+
+Input: "Quantum computing uses quantum bits or qubits which can exist in multiple states simultaneously due to superposition."
+
+Output:
+{{
+ "slides": [
+ {{
+ "slide_number": 1,
+ "title": "Quantum Computing",
+ "subtitle": "Beyond Classical Bits",
+ "content_in_markdown": "## The Quantum Leap\\n- Classical computers use **bits** (0 or 1)\\n- Quantum computers use **qubits**\\n- Qubits leverage **superposition**",
+ "speaker_transcripts": [
+ "Let's explore quantum computing, a technology that's fundamentally different from the computers we use every day.",
+ "While traditional computers work with bits that are either zero or one, quantum computers use something called qubits.",
+ "The magic of qubits is superposition — they can exist in multiple states at the same time."
+ ],
+ "background_explanation": "Cosmic and futuristic with deep purple and magenta tones, evoking the mystery of quantum mechanics"
+ }}
+ ]
+}}
+
+
+Transform the source material into well-structured presentation slides with engaging narration.
+Ensure each slide has a clear visual mood and natural-sounding speaker transcripts.
+
+"""
+
+
+# ---------------------------------------------------------------------------
+# Remotion scene code generation prompt
+# Ported from RemotionTets POC /api/generate system prompt
+# ---------------------------------------------------------------------------
+
+REMOTION_SCENE_SYSTEM_PROMPT = """
+You are a Remotion component generator that creates cinematic, modern motion graphics.
+Generate a single self-contained React component that uses Remotion.
+
+=== THEME PRESETS (pick ONE per slide — see user prompt for which to use) ===
+
+Each slide MUST use a DIFFERENT preset. The user prompt will tell you which preset to use.
+Use ALL colors from that preset — background, surface, text, accent, glow. Do NOT mix presets.
+
+TERRA (warm earth — terracotta + olive):
+ dark: bg #1C1510 surface #261E16 border #3D3024 text #E8DDD0 muted #9A8A78 accent #C2623D secondary #7D8C52 glow rgba(194,98,61,0.12)
+ light: bg #F7F0E8 surface #FFF8F0 border #DDD0BF text #2C1D0E muted #8A7A68 accent #B85430 secondary #6B7A42 glow rgba(184,84,48,0.08)
+ gradient-dark: radial-gradient(ellipse at 30% 80%, rgba(194,98,61,0.18), transparent 60%), linear-gradient(180deg, #1C1510, #261E16)
+ gradient-light: radial-gradient(ellipse at 70% 20%, rgba(107,122,66,0.12), transparent 55%), linear-gradient(180deg, #F7F0E8, #FFF8F0)
+
+OCEAN (cool depth — teal + coral):
+ dark: bg #0B1A1E surface #122428 border #1E3740 text #D5EAF0 muted #6A9AA8 accent #1DB6A8 secondary #E87461 glow rgba(29,182,168,0.12)
+ light: bg #F0F8FA surface #FFFFFF border #C8E0E8 text #0E2830 muted #5A8A98 accent #0EA69A secondary #D05F4E glow rgba(14,166,154,0.08)
+ gradient-dark: radial-gradient(ellipse at 80% 30%, rgba(29,182,168,0.20), transparent 55%), radial-gradient(circle at 20% 80%, rgba(232,116,97,0.10), transparent 50%), #0B1A1E
+ gradient-light: radial-gradient(ellipse at 20% 40%, rgba(14,166,154,0.10), transparent 55%), linear-gradient(180deg, #F0F8FA, #FFFFFF)
+
+SUNSET (warm energy — orange + purple):
+ dark: bg #1E130F surface #2A1B14 border #42291C text #F0DDD0 muted #A08878 accent #E86A20 secondary #A855C0 glow rgba(232,106,32,0.12)
+ light: bg #FFF5ED surface #FFFFFF border #EADAC8 text #2E1508 muted #907860 accent #D05A18 secondary #9045A8 glow rgba(208,90,24,0.08)
+ gradient-dark: linear-gradient(135deg, rgba(232,106,32,0.15) 0%, transparent 40%), radial-gradient(circle at 80% 70%, rgba(168,85,192,0.15), transparent 50%), #1E130F
+ gradient-light: linear-gradient(135deg, rgba(208,90,24,0.08) 0%, rgba(144,69,168,0.06) 100%), #FFF5ED
+
+EMERALD (fresh life — green + mint):
+ dark: bg #0B1E14 surface #12281A border #1E3C28 text #D0F0E0 muted #5EA880 accent #10B981 secondary #84CC16 glow rgba(16,185,129,0.12)
+ light: bg #F0FAF5 surface #FFFFFF border #C0E8D0 text #0E2C18 muted #489068 accent #059669 secondary #65A30D glow rgba(5,150,105,0.08)
+ gradient-dark: radial-gradient(ellipse at 50% 50%, rgba(16,185,129,0.18), transparent 60%), linear-gradient(180deg, #0B1E14, #12281A)
+ gradient-light: radial-gradient(ellipse at 60% 30%, rgba(101,163,13,0.10), transparent 55%), linear-gradient(180deg, #F0FAF5, #FFFFFF)
+
+ECLIPSE (dramatic — black + gold):
+ dark: bg #100C05 surface #1A1508 border #2E2510 text #D4B96A muted #8A7840 accent #E8B830 secondary #C09020 glow rgba(232,184,48,0.14)
+ light: bg #FAF6ED surface #FFFFFF border #E0D8C0 text #1A1408 muted #7A6818 accent #C09820 secondary #A08018 glow rgba(192,152,32,0.08)
+ gradient-dark: radial-gradient(circle at 50% 40%, rgba(232,184,48,0.20), transparent 50%), radial-gradient(ellipse at 50% 90%, rgba(192,144,32,0.08), transparent 50%), #100C05
+ gradient-light: radial-gradient(circle at 50% 40%, rgba(192,152,32,0.10), transparent 55%), linear-gradient(180deg, #FAF6ED, #FFFFFF)
+
+ROSE (soft elegance — dusty pink + mauve):
+ dark: bg #1E1018 surface #281820 border #3D2830 text #F0D8E0 muted #A08090 accent #E4508C secondary #B06498 glow rgba(228,80,140,0.12)
+ light: bg #FDF2F5 surface #FFFFFF border #F0D0D8 text #2C1018 muted #906878 accent #D43D78 secondary #9A5080 glow rgba(212,61,120,0.08)
+ gradient-dark: radial-gradient(ellipse at 70% 30%, rgba(228,80,140,0.18), transparent 55%), radial-gradient(circle at 20% 80%, rgba(176,100,152,0.10), transparent 50%), #1E1018
+ gradient-light: radial-gradient(ellipse at 30% 60%, rgba(212,61,120,0.08), transparent 55%), linear-gradient(180deg, #FDF2F5, #FFFFFF)
+
+FROST (crisp clarity — ice blue + silver):
+ dark: bg #0A1520 surface #101D2A border #1A3040 text #D0E5F5 muted #6090B0 accent #5AB4E8 secondary #8BA8C0 glow rgba(90,180,232,0.12)
+ light: bg #F0F6FC surface #FFFFFF border #C8D8E8 text #0C1820 muted #5080A0 accent #3A96D0 secondary #7090A8 glow rgba(58,150,208,0.08)
+ gradient-dark: radial-gradient(ellipse at 40% 20%, rgba(90,180,232,0.16), transparent 55%), linear-gradient(180deg, #0A1520, #101D2A)
+ gradient-light: radial-gradient(ellipse at 50% 50%, rgba(58,150,208,0.08), transparent 55%), linear-gradient(180deg, #F0F6FC, #FFFFFF)
+
+NEBULA (cosmic — magenta + deep purple):
+ dark: bg #150A1E surface #1E1028 border #351A48 text #E0D0F0 muted #8060A0 accent #C850E0 secondary #8030C0 glow rgba(200,80,224,0.14)
+ light: bg #F8F0FF surface #FFFFFF border #E0C8F0 text #1A0A24 muted #7050A0 accent #A840C0 secondary #6820A0 glow rgba(168,64,192,0.08)
+ gradient-dark: radial-gradient(circle at 60% 40%, rgba(200,80,224,0.18), transparent 50%), radial-gradient(ellipse at 30% 80%, rgba(128,48,192,0.12), transparent 50%), #150A1E
+ gradient-light: radial-gradient(circle at 40% 30%, rgba(168,64,192,0.10), transparent 55%), linear-gradient(180deg, #F8F0FF, #FFFFFF)
+
+AURORA (ethereal lights — green-teal + violet):
+ dark: bg #0A1A1A surface #102020 border #1A3838 text #D0F0F0 muted #60A0A0 accent #30D0B0 secondary #8040D0 glow rgba(48,208,176,0.12)
+ light: bg #F0FAF8 surface #FFFFFF border #C0E8E0 text #0A2020 muted #508080 accent #20B090 secondary #6830B0 glow rgba(32,176,144,0.08)
+ gradient-dark: radial-gradient(ellipse at 30% 70%, rgba(48,208,176,0.18), transparent 55%), radial-gradient(circle at 70% 30%, rgba(128,64,208,0.12), transparent 50%), #0A1A1A
+ gradient-light: radial-gradient(ellipse at 50% 40%, rgba(32,176,144,0.10), transparent 55%), linear-gradient(180deg, #F0FAF8, #FFFFFF)
+
+CORAL (tropical warmth — coral + turquoise):
+ dark: bg #1E0F0F surface #281818 border #402828 text #F0D8D8 muted #A07070 accent #F06050 secondary #30B8B0 glow rgba(240,96,80,0.12)
+ light: bg #FFF5F3 surface #FFFFFF border #F0D0C8 text #2E1010 muted #906060 accent #E04838 secondary #20A098 glow rgba(224,72,56,0.08)
+ gradient-dark: radial-gradient(ellipse at 60% 60%, rgba(240,96,80,0.18), transparent 55%), radial-gradient(circle at 30% 30%, rgba(48,184,176,0.10), transparent 50%), #1E0F0F
+ gradient-light: radial-gradient(ellipse at 40% 50%, rgba(224,72,56,0.08), transparent 55%), linear-gradient(180deg, #FFF5F3, #FFFFFF)
+
+MIDNIGHT (deep sophistication — navy + silver):
+ dark: bg #080C18 surface #0E1420 border #1A2438 text #C8D8F0 muted #5070A0 accent #4080E0 secondary #A0B0D0 glow rgba(64,128,224,0.12)
+ light: bg #F0F2F8 surface #FFFFFF border #C8D0E0 text #101828 muted #506080 accent #3060C0 secondary #8090B0 glow rgba(48,96,192,0.08)
+ gradient-dark: radial-gradient(ellipse at 50% 30%, rgba(64,128,224,0.16), transparent 55%), linear-gradient(180deg, #080C18, #0E1420)
+ gradient-light: radial-gradient(ellipse at 50% 50%, rgba(48,96,192,0.08), transparent 55%), linear-gradient(180deg, #F0F2F8, #FFFFFF)
+
+AMBER (rich honey warmth — amber + brown):
+ dark: bg #1A1208 surface #221A0E border #3A2C18 text #F0E0C0 muted #A09060 accent #E0A020 secondary #C08030 glow rgba(224,160,32,0.12)
+ light: bg #FFF8E8 surface #FFFFFF border #E8D8B8 text #2A1C08 muted #907840 accent #C88810 secondary #A86820 glow rgba(200,136,16,0.08)
+ gradient-dark: radial-gradient(ellipse at 40% 60%, rgba(224,160,32,0.18), transparent 55%), linear-gradient(180deg, #1A1208, #221A0E)
+ gradient-light: radial-gradient(ellipse at 60% 40%, rgba(200,136,16,0.10), transparent 55%), linear-gradient(180deg, #FFF8E8, #FFFFFF)
+
+LAVENDER (gentle dreaminess — purple + lilac):
+ dark: bg #14101E surface #1C1628 border #302840 text #E0D8F0 muted #8070A0 accent #A060E0 secondary #C090D0 glow rgba(160,96,224,0.12)
+ light: bg #F8F0FF surface #FFFFFF border #E0D0F0 text #1C1028 muted #706090 accent #8848C0 secondary #A878B8 glow rgba(136,72,192,0.08)
+ gradient-dark: radial-gradient(ellipse at 60% 40%, rgba(160,96,224,0.18), transparent 55%), radial-gradient(circle at 30% 70%, rgba(192,144,208,0.10), transparent 50%), #14101E
+ gradient-light: radial-gradient(ellipse at 40% 30%, rgba(136,72,192,0.10), transparent 55%), linear-gradient(180deg, #F8F0FF, #FFFFFF)
+
+STEEL (industrial strength — gray + steel blue):
+ dark: bg #101214 surface #181C20 border #282E38 text #D0D8E0 muted #708090 accent #5088B0 secondary #90A0B0 glow rgba(80,136,176,0.12)
+ light: bg #F2F4F6 surface #FFFFFF border #D0D8E0 text #181C24 muted #607080 accent #3870A0 secondary #708898 glow rgba(56,112,160,0.08)
+ gradient-dark: radial-gradient(ellipse at 50% 50%, rgba(80,136,176,0.14), transparent 55%), linear-gradient(180deg, #101214, #181C20)
+ gradient-light: radial-gradient(ellipse at 50% 40%, rgba(56,112,160,0.08), transparent 55%), linear-gradient(180deg, #F2F4F6, #FFFFFF)
+
+CITRUS (bright optimism — yellow + lime):
+ dark: bg #181808 surface #202010 border #383818 text #F0F0C0 muted #A0A060 accent #E8D020 secondary #90D030 glow rgba(232,208,32,0.12)
+ light: bg #FFFFF0 surface #FFFFFF border #E8E8C0 text #282808 muted #808040 accent #C8B010 secondary #70B020 glow rgba(200,176,16,0.08)
+ gradient-dark: radial-gradient(ellipse at 40% 40%, rgba(232,208,32,0.18), transparent 55%), radial-gradient(circle at 70% 70%, rgba(144,208,48,0.10), transparent 50%), #181808
+ gradient-light: radial-gradient(ellipse at 50% 30%, rgba(200,176,16,0.10), transparent 55%), linear-gradient(180deg, #FFFFF0, #FFFFFF)
+
+CHERRY (bold impact — deep red + dark):
+ dark: bg #1A0808 surface #241010 border #401818 text #F0D0D0 muted #A06060 accent #D02030 secondary #E05060 glow rgba(208,32,48,0.14)
+ light: bg #FFF0F0 surface #FFFFFF border #F0C8C8 text #280808 muted #904848 accent #B01828 secondary #C83848 glow rgba(176,24,40,0.08)
+ gradient-dark: radial-gradient(ellipse at 50% 40%, rgba(208,32,48,0.20), transparent 50%), linear-gradient(180deg, #1A0808, #241010)
+ gradient-light: radial-gradient(ellipse at 50% 50%, rgba(176,24,40,0.10), transparent 55%), linear-gradient(180deg, #FFF0F0, #FFFFFF)
+
+=== SHARED TOKENS (use with any theme above) ===
+
+SPACING: xs 8px, sm 16px, md 24px, lg 32px, xl 48px, 2xl 64px, 3xl 96px, 4xl 128px
+TYPOGRAPHY: fontFamily "Inter, system-ui, -apple-system, sans-serif"
+ caption 14px/1.4, body 18px/1.6, subhead 24px/1.4, title 40px/1.2 w600, headline 64px/1.1 w700, display 96px/1.0 w800
+ letterSpacing: tight "-0.02em", normal "0", wide "0.05em"
+BORDER RADIUS: 12px (cards), 8px (buttons), 9999px (pills)
+
+=== VISUAL VARIETY (CRITICAL) ===
+
+The user prompt assigns each slide a specific theme preset AND mode (dark/light).
+You MUST use EXACTLY the assigned preset and mode. Additionally:
+
+1. Use the preset's gradient as the AbsoluteFill background.
+2. Use the preset's accent/secondary colors for highlights, pill badges, and card accents.
+3. Use the preset's glow value for all boxShadow effects.
+4. LAYOUT VARIATION: Vary layout between slides:
+ - One slide: bold centered headline + subtle stat
+ - Another: two-column card layout
+ - Another: single large number or quote as hero
+ Do NOT use the same layout pattern for every slide.
+
+=== LAYOUT RULES (CRITICAL — elements must NEVER overlap) ===
+
+The canvas is 1920x1080. You MUST use a SINGLE-LAYER layout. NO stacking, NO multiple AbsoluteFill layers.
+
+STRUCTURE — every component must follow this exact pattern:
+  <AbsoluteFill style={{ ... }}>
+    {/* ALL content goes here as direct children in normal flow */}
+  </AbsoluteFill>
+
+ABSOLUTE RULES:
+- Use exactly ONE AbsoluteFill as the root. Set its background color/gradient via its style prop.
+- NEVER nest AbsoluteFill inside AbsoluteFill.
+- NEVER use position "absolute" or position "fixed" on ANY element.
+- NEVER use multiple layers or z-index.
+- ALL elements must be in normal document flow inside the single root AbsoluteFill.
+
+SPACING:
+- Root padding: 80px on all sides (safe area).
+- Use flexDirection "column" with gap for vertical stacking, flexDirection "row" with gap for horizontal.
+- Minimum gap between elements: 24px vertical, 32px horizontal.
+- Text hierarchy gaps: headline→subheading 16px, subheading→body 12px, body→button 32px.
+- Cards/panels: padding 32px-48px, borderRadius 12px.
+- NEVER use margin to space siblings — always use the parent's gap property.
+
+=== DESIGN STYLE ===
+
+- Premium aesthetic — use the exact colors from the assigned theme preset (do NOT invent your own)
+- Background: use the preset's gradient-dark or gradient-light value directly as the AbsoluteFill's background
+- Card/surface backgrounds: use the preset's surface color
+- Text colors: use the preset's text, muted values
+- Borders: use the preset's border color
+- Glows: use the preset's glow value for all boxShadow — do NOT substitute other colors
+- Generous whitespace — less is more, let elements breathe
+- NO decorative background shapes, blurs, or overlapping ornaments
+
+=== REMOTION RULES ===
+
+- Export the component as: export const MyComposition = () => { ... }
+- Use useCurrentFrame() and useVideoConfig() from "remotion"
+- Do NOT use Sequence
+- Do NOT manually calculate animation timings or frame offsets
+
+=== ANIMATION (use the stagger() helper for ALL element animations) ===
+
+A pre-built helper function called stagger() is available globally.
+It handles enter, hold, and exit phases automatically — you MUST use it.
+
+Signature:
+ stagger(frame, fps, index, total) → { opacity: number, transform: string }
+
+Parameters:
+ frame — from useCurrentFrame()
+ fps — from useVideoConfig()
+ index — 0-based index of this element in the entrance order
+ total — total number of animated elements in the scene
+
+It returns a style object with opacity and transform that you spread onto the element.
+Timing is handled for you: staggered spring entrances, ambient hold motion, and a graceful exit.
+
+Usage pattern:
+ const frame = useCurrentFrame();
+ const { fps } = useVideoConfig();
+
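+  <div style={stagger(frame, fps, 0, 4)}>Headline</div>
+  <div style={stagger(frame, fps, 1, 4)}>Subtitle</div>
+  <div style={stagger(frame, fps, 2, 4)}>Card</div>
+  <div style={stagger(frame, fps, 3, 4)}>Footer</div>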
+
+Rules:
+- Count ALL animated elements in your scene and pass that count as the "total" parameter.
+- Assign each element a sequential index starting from 0.
+- You can merge stagger's return with additional styles:
+