Merge branch 'MODSetter:main' into main

Anshul Sharma 2025-04-26 23:06:33 +05:30 committed by GitHub
commit 7e639108c6
GPG key ID: B5690EEEBB952194
19 changed files with 899 additions and 376 deletions

130
README.md

@@ -72,97 +72,23 @@ Join the [SurfSense Discord](https://discord.gg/ejRNvftDp9) and help shape the f
## How to get started?
### PRE-START CHECKS
### Installation Options
#### PGVector
Make sure pgvector extension is installed on your machine. Setup Guide https://github.com/pgvector/pgvector?tab=readme-ov-file#installation
SurfSense provides two installation methods:
#### File Uploading Support
For File uploading you need Unstructured.io API key. You can get it at http://platform.unstructured.io/
1. **[Docker Installation](https://www.surfsense.net/docs/docker-installation)** - The easiest way to get SurfSense up and running with all dependencies containerized. Less Customization.
#### Auth
SurfSense now only works with Google OAuth. Make sure to set your OAuth Client at https://developers.google.com/identity/protocols/oauth2 . We need client id and client secret for backend. Make sure to enable people api and add the required scopes under data access (openid, userinfo.email, userinfo.profile)
2. **[Manual Installation (Recommended)](https://www.surfsense.net/docs/manual-installation)** - For users who prefer more control over their setup or need to customize their deployment.
![gauth](https://github.com/user-attachments/assets/80d60fe5-889b-48a6-b947-200fdaf544c1)
Both installation guides include detailed OS-specific instructions for Windows, macOS, and Linux.
#### LLM Observability
One easy way to observe SurfSense Researcher Agent is to use LangSmith. Get its API KEY from https://smith.langchain.com/
Before installation, make sure to complete the [prerequisite setup steps](https://www.surfsense.net/docs/) including:
- PGVector setup
- Google OAuth configuration
- Unstructured.io API key
- Other required API keys
**Open AI LLMS**
![openai_langraph](https://github.com/user-attachments/assets/b1f4c7a1-0a66-4d21-9053-2e09a5634f95)
**Ollama LLMS**
![ollama_langgraph](https://github.com/user-attachments/assets/5b6c870e-095c-4368-86e6-f7488e0fca28)
#### Crawler Support
SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/). Playwright crawler support will be added soon.
## Quick Start
### Preferred Method: Docker Setup
The recommended way to run SurfSense is using Docker, which ensures consistent environment across different systems.
1. Make sure you have Docker and Docker Compose installed
2. Follow the detailed instructions in our [Docker Setup Guide](DOCKER_SETUP.md)
```bash
# Start all services with one command
docker-compose up --build
```
---
### Alternative: Manual Setup
### Backend (./surfsense_backend)
This is the core of SurfSense. Before we begin, let's look at the `.env` variables we need to set up SurfSense successfully.
|ENV VARIABLE|DESCRIPTION|
|--|--|
| DATABASE_URL| Your PostgreSQL database connection string. Eg. `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`|
| SECRET_KEY| JWT Secret key used for authentication. Should be a secure random string. Eg. `SURFSENSE_SECRET_KEY_123456789`|
| GOOGLE_OAUTH_CLIENT_ID| Google OAuth client ID obtained from Google Cloud Console when setting up OAuth authentication|
| GOOGLE_OAUTH_CLIENT_SECRET| Google OAuth client secret obtained from Google Cloud Console when setting up OAuth authentication|
| NEXT_FRONTEND_URL| URL where your frontend application is hosted. Eg. `http://localhost:3000`|
| EMBEDDING_MODEL| Name of the embedding model to use for vector embeddings. Currently works with Sentence Transformers only. Expect other embeddings soon. Eg. `mixedbread-ai/mxbai-embed-large-v1`|
| RERANKERS_MODEL_NAME| Name of the reranker model for search result reranking. Eg. `ms-marco-MiniLM-L-12-v2`|
| RERANKERS_MODEL_TYPE| Type of reranker model being used. Eg. `flashrank`|
| FAST_LLM| LiteLLM routed Smaller, faster LLM for quick responses. Eg. `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`|
| STRATEGIC_LLM| LiteLLM routed Advanced LLM for complex reasoning tasks. Eg. `openai/gpt-4o`, `ollama/gemma3:12b`|
| LONG_CONTEXT_LLM| LiteLLM routed LLM capable of handling longer context windows. Eg. `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`|
| UNSTRUCTURED_API_KEY| API key for Unstructured.io service for document parsing|
| FIRECRAWL_API_KEY| API key for Firecrawl service for web crawling and data extraction|
IMPORTANT: Since LLM calls are routed through LiteLLM make sure to include API keys of LLM models you are using. For example if you used `openai/gpt-4o` make sure to include OpenAI API Key `OPENAI_API_KEY` or if you use `gemini/gemini-2.0-flash` then you include `GEMINI_API_KEY`.
You can also integrate any LLM just follow this https://docs.litellm.ai/docs/providers
Now once you have everything let's proceed to run SurfSense.
1. Install `uv` : https://docs.astral.sh/uv/getting-started/installation/
2. Now just run this command to install dependencies i.e `uv sync`
3. That's it. Now just run the `main.py` file using `uv run main.py`. You can also optionally pass `--reload` as an argument to enable hot reloading.
4. If everything worked fine you should see screen like this.
![backend](https://i.ibb.co/542Vhqw/backendrunning.png)
---
### FrontEnd (./surfsense_web)
For local frontend setup just fill out the `.env` file of frontend.
|ENV VARIABLE|DESCRIPTION|
|--|--|
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Give hosted backend url here. Eg. `http://localhost:8000`|
1. Now install dependencies using `pnpm install`
2. Run it using `pnpm run dev`
You should see your Next.js frontend running at `localhost:3000`
#### Some FrontEnd Screens
## Screenshots
**Search Spaces**
@@ -180,43 +106,13 @@ You should see your Next.js frontend running at `localhost:3000`
![chat](https://github.com/user-attachments/assets/bb352d52-1c6d-4020-926b-722d0b98b491)
---
### Extension (./surfsense_browser_extension)
The extension is built with Plasmo, a cross-browser extension framework. Its main use case is saving webpages that are protected behind authentication.
To build the extension, fill out the frontend `.env` file.
|ENV VARIABLE|DESCRIPTION|
|--|--|
| PLASMO_PUBLIC_BACKEND_URL| SurfSense Backend URL eg. "http://127.0.0.1:8000" |
Build the extension for your favorite browser using this guide: https://docs.plasmo.com/framework/workflows/build#with-a-specific-target
When you load and start the extension you should see a Apu page like this
**Browser Extension**
![ext1](https://github.com/user-attachments/assets/1f042b7a-6349-422b-94fb-d40d0df16c40)
After filling in your SurfSense API key you should be able to use extension now.
![ext2](https://github.com/user-attachments/assets/a9b9f1aa-2677-404d-b0a0-c1b2dddf24a7)
|Options|Explanations|
|--|--|
| Search Space | Search Space to save your dynamic bookmarks. |
| Clear Inactive History Sessions | It clears the saved content for Inactive Tab Sessions. |
| Save Current Webpage Snapshot | Stores the current webpage session info into SurfSense history store|
| Save to SurfSense | Processes the SurfSense History Store & Initiates a Save Job |
## Tech Stack
## Tech Stack
### **BackEnd**


@@ -0,0 +1,5 @@
from app.agents.researcher.graph import graph as researcher_graph
from app.agents.researcher.sub_section_writer.graph import graph as sub_section_writer_graph
print(researcher_graph.get_graph().draw_mermaid())
print(sub_section_writer_graph.get_graph().draw_mermaid())


@@ -8,8 +8,15 @@ RUN npm install -g pnpm
# Copy package files
COPY package.json pnpm-lock.yaml ./
# Install dependencies
RUN pnpm install
# First copy the config file to avoid fumadocs-mdx postinstall error
COPY source.config.ts ./
COPY content ./content
# Install dependencies with --ignore-scripts to skip postinstall
RUN pnpm install --ignore-scripts
# Now run the postinstall script manually
RUN pnpm fumadocs-mdx
# Copy source code
COPY . .


@@ -240,7 +240,7 @@ const SourcesDialogContent = ({
const ChatPage = () => {
const [token, setToken] = React.useState<string | null>(null);
const [activeTab, setActiveTab] = useState("");
const [dialogOpen, setDialogOpen] = useState(false);
const [dialogOpenId, setDialogOpenId] = useState<number | null>(null);
const [sourcesPage, setSourcesPage] = useState(1);
const [expandedSources, setExpandedSources] = useState(false);
const [canScrollLeft, setCanScrollLeft] = useState(false);
@@ -260,6 +260,13 @@ const ChatPage = () => {
const { search_space_id, chat_id } = useParams();
// Function to scroll terminal to bottom
const scrollTerminalToBottom = () => {
if (terminalMessagesRef.current) {
terminalMessagesRef.current.scrollTop = terminalMessagesRef.current.scrollHeight;
}
};
// Get token from localStorage on client side only
React.useEffect(() => {
setToken(localStorage.getItem('surfsense_bearer_token'));
@@ -469,54 +476,60 @@ const ChatPage = () => {
updateChat();
}, [messages, status, chat_id, researchMode, selectedConnectors, search_space_id]);
// Log messages whenever they update and extract annotations from the latest assistant message if available
useEffect(() => {
console.log('Messages updated:', messages);
// Extract annotations from the latest assistant message if available
// Memoize connector sources to prevent excessive re-renders
const processedConnectorSources = React.useMemo(() => {
if (messages.length === 0) return connectorSources;
// Only process when we have a complete message (not streaming)
if (status !== 'ready') return connectorSources;
// Find the latest assistant message
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
if (assistantMessages.length > 0) {
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
if (latestAssistantMessage?.annotations) {
const annotations = latestAssistantMessage.annotations as any[];
// Debug log to track streaming annotations
if (process.env.NODE_ENV === 'development') {
console.log('Streaming annotations:', annotations);
// Log counts of each annotation type
const terminalInfoCount = annotations.filter(a => a.type === 'TERMINAL_INFO').length;
const sourcesCount = annotations.filter(a => a.type === 'SOURCES').length;
const answerCount = annotations.filter(a => a.type === 'ANSWER').length;
console.log(`Annotation counts - Terminal: ${terminalInfoCount}, Sources: ${sourcesCount}, Answer: ${answerCount}`);
}
// Process SOURCES annotation - get the last one to ensure we have the latest
const sourcesAnnotations = annotations.filter(
(annotation) => annotation.type === 'SOURCES'
);
if (sourcesAnnotations.length > 0) {
// Get the last SOURCES annotation to ensure we have the most recent one
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
if (latestSourcesAnnotation.content) {
setConnectorSources(latestSourcesAnnotation.content);
}
}
// Check for terminal info annotations and scroll terminal to bottom if they exist
const terminalInfoAnnotations = annotations.filter(
(annotation) => annotation.type === 'TERMINAL_INFO'
);
if (terminalInfoAnnotations.length > 0) {
// Schedule scrolling after the DOM has been updated
setTimeout(scrollTerminalToBottom, 100);
}
}
if (assistantMessages.length === 0) return connectorSources;
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
if (!latestAssistantMessage?.annotations) return connectorSources;
// Find the latest SOURCES annotation
const annotations = latestAssistantMessage.annotations as any[];
const sourcesAnnotations = annotations.filter(a => a.type === 'SOURCES');
if (sourcesAnnotations.length === 0) return connectorSources;
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
if (!latestSourcesAnnotation.content) return connectorSources;
// Use this content if it differs from current
return latestSourcesAnnotation.content;
}, [messages, status, connectorSources]);
// Update connector sources when processed value changes
useEffect(() => {
if (processedConnectorSources !== connectorSources) {
setConnectorSources(processedConnectorSources);
}
}, [messages]);
}, [processedConnectorSources, connectorSources]);
// Check and scroll terminal when terminal info is available
useEffect(() => {
if (messages.length === 0 || status !== 'ready') return;
// Find the latest assistant message
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
if (assistantMessages.length === 0) return;
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
if (!latestAssistantMessage?.annotations) return;
// Check for terminal info annotations
const annotations = latestAssistantMessage.annotations as any[];
const terminalInfoAnnotations = annotations.filter(a => a.type === 'TERMINAL_INFO');
if (terminalInfoAnnotations.length > 0) {
// Schedule scrolling after the DOM has been updated
setTimeout(scrollTerminalToBottom, 100);
}
}, [messages, status]);
// Custom handleSubmit function to include selected connectors and answer type
const handleSubmit = (e: React.FormEvent) => {
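Review note: the two hooks in this hunk both locate the latest assistant message and then filter its annotations by type. A minimal sketch of a shared helper follows. It assumes simplified `ChatMessage` and `Annotation` shapes, which are hypothetical stand-ins for the page's real types, and `latestAnnotation` is an illustrative name that is not part of this commit.

```typescript
// Hypothetical, simplified shapes; the real message type comes from the chat library.
interface Annotation {
  type: string;
  content?: unknown;
}

interface ChatMessage {
  role: string;
  annotations?: Annotation[];
}

// Return the last annotation of the given type on the latest assistant message,
// or null when there is no assistant message or no matching annotation.
function latestAnnotation(messages: ChatMessage[], type: string): Annotation | null {
  const assistantMessages = messages.filter((msg) => msg.role === "assistant");
  const latest = assistantMessages[assistantMessages.length - 1];
  if (!latest?.annotations) return null;

  const matches = latest.annotations.filter((a) => a.type === type);
  return matches[matches.length - 1] ?? null;
}
```

With such a helper, the sources memo and the terminal-scroll effect could each reduce to a single call (`latestAnnotation(messages, 'SOURCES')` and `latestAnnotation(messages, 'TERMINAL_INFO')`), which keeps the two lookups from drifting apart.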
@@ -543,24 +556,22 @@ const ChatPage = () => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
};
// Function to scroll terminal to bottom
const scrollTerminalToBottom = () => {
if (terminalMessagesRef.current) {
terminalMessagesRef.current.scrollTop = terminalMessagesRef.current.scrollHeight;
}
};
// Scroll to bottom when messages change
useEffect(() => {
scrollToBottom();
}, [messages]);
// Set activeTab when connectorSources change
useEffect(() => {
if (connectorSources.length > 0) {
setActiveTab(connectorSources[0].type);
}
// Set activeTab when connectorSources change using a memoized value
const activeTabValue = React.useMemo(() => {
return connectorSources.length > 0 ? connectorSources[0].type : "";
}, [connectorSources]);
// Update activeTab when the memoized value changes
useEffect(() => {
if (activeTabValue && activeTabValue !== activeTab) {
setActiveTab(activeTabValue);
}
}, [activeTabValue, activeTab]);
// Scroll terminal to bottom when expanded
useEffect(() => {
@@ -617,49 +628,89 @@ const ChatPage = () => {
};
// Function to get a citation source by ID
const getCitationSource = (citationId: number): Source | null => {
const getCitationSource = React.useCallback((citationId: number, messageIndex?: number): Source | null => {
if (!messages || messages.length === 0) return null;
// Find the latest assistant message
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
if (assistantMessages.length === 0) return null;
// If no specific message index is provided, use the latest assistant message
if (messageIndex === undefined) {
// Find the latest assistant message
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
if (assistantMessages.length === 0) return null;
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
if (!latestAssistantMessage?.annotations) return null;
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
if (!latestAssistantMessage?.annotations) return null;
// Find all SOURCES annotations
const annotations = latestAssistantMessage.annotations as any[];
const sourcesAnnotations = annotations.filter(
(annotation) => annotation.type === 'SOURCES'
);
// Find all SOURCES annotations
const annotations = latestAssistantMessage.annotations as any[];
const sourcesAnnotations = annotations.filter(
(annotation) => annotation.type === 'SOURCES'
);
// Get the latest SOURCES annotation
if (sourcesAnnotations.length === 0) return null;
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
// Get the latest SOURCES annotation
if (sourcesAnnotations.length === 0) return null;
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
if (!latestSourcesAnnotation.content) return null;
if (!latestSourcesAnnotation.content) return null;
// Flatten all sources from all connectors
const allSources: Source[] = [];
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
if (connector.sources && Array.isArray(connector.sources)) {
connector.sources.forEach((source: SourceItem) => {
allSources.push({
id: source.id,
title: source.title,
description: source.description,
url: source.url,
connectorType: connector.type
// Flatten all sources from all connectors
const allSources: Source[] = [];
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
if (connector.sources && Array.isArray(connector.sources)) {
connector.sources.forEach((source: SourceItem) => {
allSources.push({
id: source.id,
title: source.title,
description: source.description,
url: source.url,
connectorType: connector.type
});
});
});
}
});
}
});
// Find the source with the matching ID
const foundSource = allSources.find(source => source.id === citationId);
// Find the source with the matching ID
const foundSource = allSources.find(source => source.id === citationId);
return foundSource || null;
};
return foundSource || null;
} else {
// Use the specific message by index
const message = messages[messageIndex];
if (!message || message.role !== 'assistant' || !message.annotations) return null;
// Find all SOURCES annotations
const annotations = message.annotations as any[];
const sourcesAnnotations = annotations.filter(
(annotation) => annotation.type === 'SOURCES'
);
// Get the latest SOURCES annotation
if (sourcesAnnotations.length === 0) return null;
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
if (!latestSourcesAnnotation.content) return null;
// Flatten all sources from all connectors
const allSources: Source[] = [];
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
if (connector.sources && Array.isArray(connector.sources)) {
connector.sources.forEach((source: SourceItem) => {
allSources.push({
id: source.id,
title: source.title,
description: source.description,
url: source.url,
connectorType: connector.type
});
});
}
});
// Find the source with the matching ID
const foundSource = allSources.find(source => source.id === citationId);
return foundSource || null;
}
}, [messages]);
return (
<>
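Review note: both branches of the new `getCitationSource` repeat the same flatten-and-find logic and differ only in which message they read. The sketch below shows one way that shared part might be factored out. The interfaces are hypothetical reductions of `SourceItem`, `ConnectorSource`, and `Source` to the fields this hunk touches, and `flattenSources` and `findCitationSource` are illustrative names, not functions from the codebase.

```typescript
// Hypothetical shapes mirroring only the fields read in this hunk.
interface SourceItem {
  id: number;
  title: string;
  description: string;
  url: string;
}

interface ConnectorSource {
  type: string;
  sources?: SourceItem[];
}

interface Source extends SourceItem {
  connectorType: string;
}

// Flatten every connector's sources into one list, tagging each with its connector type.
function flattenSources(connectors: ConnectorSource[]): Source[] {
  const all: Source[] = [];
  for (const connector of connectors) {
    if (!Array.isArray(connector.sources)) continue;
    for (const source of connector.sources) {
      all.push({
        id: source.id,
        title: source.title,
        description: source.description,
        url: source.url,
        connectorType: connector.type,
      });
    }
  }
  return all;
}

// Look up a single citation by id; returns null when no source matches.
function findCitationSource(connectors: ConnectorSource[], citationId: number): Source | null {
  for (const source of flattenSources(connectors)) {
    if (source.id === citationId) return source;
  }
  return null;
}
```

Each branch would then only need to pick the message, take the content of its latest `SOURCES` annotation, and pass that content to `findCitationSource`.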
@@ -685,7 +736,11 @@ const ChatPage = () => {
<div className="flex-1">
<Card className="border-gray-300 dark:border-gray-700">
<CardContent className="p-3">
<MarkdownViewer content={message.content} getCitationSource={getCitationSource} className="text-sm" />
<MarkdownViewer
content={message.content}
getCitationSource={(id) => getCitationSource(id, index)}
className="text-sm"
/>
</CardContent>
</Card>
</div>
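Review note: the inline arrow `(id) => getCitationSource(id, index)` is a new function on every render, so a child that memoizes on the identity of its `getCitationSource` prop (as `MarkdownViewer` does in this commit) would likely recompute each time the parent re-renders. One possible mitigation, sketched here without React, is to cache one bound callback per message index. `createIndexedLookupCache` is a hypothetical helper, and in the component the cache would presumably need to be rebuilt whenever the underlying lookup changes.

```typescript
type Lookup<T> = (id: number) => T | null;

// Build a factory that hands out one stable callback per message index.
// Repeated calls with the same index return the same function object.
function createIndexedLookupCache<T>(
  lookup: (id: number, messageIndex: number) => T | null
): (messageIndex: number) => Lookup<T> {
  const cache = new Map<number, Lookup<T>>();
  return (messageIndex: number): Lookup<T> => {
    let bound = cache.get(messageIndex);
    if (!bound) {
      bound = (id: number) => lookup(id, messageIndex);
      cache.set(messageIndex, bound);
    }
    return bound;
  };
}
```

Whether this is worth the extra indirection depends on how often the chat page re-renders while streaming; it is offered as an option rather than a required change.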
@@ -856,7 +911,7 @@ const ChatPage = () => {
))}
{connector.sources.length > INITIAL_SOURCES_DISPLAY && (
<Dialog open={dialogOpen && activeTab === connector.type} onOpenChange={(open) => setDialogOpen(open)}>
<Dialog open={dialogOpenId === connector.id} onOpenChange={(open) => setDialogOpenId(open ? connector.id : null)}>
<DialogTrigger asChild>
<Button variant="ghost" className="w-full text-sm text-gray-500 dark:text-gray-400">
Show {connector.sources.length - INITIAL_SOURCES_DISPLAY} More Sources
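Review note: replacing the boolean `dialogOpen` with `dialogOpenId` means at most one connector's dialog can be open, identified by its id. The state transition can be written as a small pure function, sketched below. `nextDialogOpenId` is a hypothetical name, and the guard on close is a variation on the commit, which sets the state to `null` on any close event.

```typescript
// null means no dialog is open; otherwise the value is the open connector's id.
type DialogOpenId = number | null;

// Compute the next state from an onOpenChange event raised by one connector's dialog.
function nextDialogOpenId(current: DialogOpenId, connectorId: number, open: boolean): DialogOpenId {
  if (open) return connectorId;
  // Variation on the commit: a close event from a dialog that is not the
  // currently open one leaves the state unchanged.
  return current === connectorId ? null : current;
}
```

In the component this would be used as `setDialogOpenId((current) => nextDialogOpenId(current, connector.id, open))`, assuming `connector.id` is a number as the `useState<number | null>` declaration suggests.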
@@ -901,13 +956,16 @@ const ChatPage = () => {
return (
<MarkdownViewer
content={latestAnswer.content.join('\n')}
getCitationSource={getCitationSource}
getCitationSource={(id) => getCitationSource(id, index)}
/>
);
}
// Fallback to the message content if no ANSWER annotation is available
return <MarkdownViewer content={message.content} getCitationSource={getCitationSource} />;
return <MarkdownViewer
content={message.content}
getCitationSource={(id) => getCitationSource(id, index)}
/>;
})()}
</div>
}


@@ -1,6 +1,6 @@
"use client";
import { cn } from "@/lib/utils";
import { IconArrowRight, IconBrandGithub, IconBrandDiscord } from "@tabler/icons-react";
import { IconFileTypeDoc, IconBrandGithub, IconBrandDiscord } from "@tabler/icons-react";
import Link from "next/link";
import React from "react";
import { motion } from "framer-motion";
@@ -20,11 +20,11 @@ export function ModernHeroWithGradients() {
<div className="relative z-20 flex flex-col items-center justify-center overflow-hidden rounded-3xl p-4 md:p-12 lg:p-16">
<Link
href="https://github.com/MODSetter/SurfSense"
href="/docs"
className="flex items-center gap-1 rounded-full border border-gray-200 bg-gradient-to-b from-gray-50 to-gray-100 px-4 py-1 text-center text-sm text-gray-800 shadow-sm dark:border-[#404040] dark:bg-gradient-to-b dark:from-[#5B5B5D] dark:to-[#262627] dark:text-white dark:shadow-inner dark:shadow-purple-500/10"
>
<span>SurfSense v0.0.6 Released</span>
<IconArrowRight className="h-4 w-4 text-gray-800 dark:text-white" />
<IconFileTypeDoc className="h-4 w-4 text-gray-800 dark:text-white" />
<span>Documentation</span>
</Link>
{/* Import the Logo component or define it in this file */}
<div className="flex items-center justify-center gap-4 mt-10 mb-2">


@@ -24,8 +24,8 @@ interface NavbarProps {
export const Navbar = () => {
const navItems = [
{
name: "",
link: "/",
name: "Docs",
link: "/docs",
},
// {
// name: "Product",
@@ -118,53 +118,52 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {
<Logo className="h-8 w-8 rounded-md" />
<span className="dark:text-white/90 text-gray-800 text-lg font-bold">SurfSense</span>
</div>
<motion.div
className="lg:flex flex-row flex-1 items-center justify-center space-x-1 text-sm"
animate={{
scale: visible ? 0.9 : 1,
justifyContent: visible ? "flex-end" : "center",
}}
>
{navItems.map((navItem, idx) => (
<motion.div
key={`nav-item-${idx}`}
onHoverStart={() => setHoveredIndex(idx)}
className="relative"
>
<Link
className="dark:text-white/90 text-gray-800 relative px-3 py-1.5 transition-colors"
href={navItem.link}
<div className="flex items-center gap-4">
<motion.div
className="lg:flex flex-row items-center justify-end space-x-1 text-sm"
animate={{
scale: visible ? 0.9 : 1,
}}
>
{navItems.map((navItem, idx) => (
<motion.div
key={`nav-item-${idx}`}
onHoverStart={() => setHoveredIndex(idx)}
className="relative"
>
<span className="relative z-10">{navItem.name}</span>
{hoveredIndex === idx && (
<motion.div
layoutId="menu-hover"
className="absolute inset-0 rounded-full dark:bg-gradient-to-r dark:from-white/10 dark:to-white/20 bg-gradient-to-r from-gray-200 to-gray-300"
initial={{ opacity: 0, scale: 0.8 }}
animate={{
opacity: 1,
scale: 1.1,
background: "var(--tw-dark) ? radial-gradient(circle at center, rgba(255,255,255,0.2) 0%, rgba(255,255,255,0.1) 50%, transparent 100%) : radial-gradient(circle at center, rgba(0,0,0,0.05) 0%, rgba(0,0,0,0.03) 50%, transparent 100%)",
}}
exit={{
opacity: 0,
scale: 0.8,
transition: {
duration: 0.2,
},
}}
transition={{
type: "spring",
bounce: 0.4,
duration: 0.4,
}}
/>
)}
</Link>
</motion.div>
))}
</motion.div>
<div className="flex items-center gap-2">
<Link
className="dark:text-white/90 text-gray-800 relative px-3 py-1.5 transition-colors"
href={navItem.link}
>
<span className="relative z-10">{navItem.name}</span>
{hoveredIndex === idx && (
<motion.div
layoutId="menu-hover"
className="absolute inset-0 rounded-full dark:bg-gradient-to-r dark:from-white/10 dark:to-white/20 bg-gradient-to-r from-gray-200 to-gray-300"
initial={{ opacity: 0, scale: 0.8 }}
animate={{
opacity: 1,
scale: 1.1,
background: "var(--tw-dark) ? radial-gradient(circle at center, rgba(255,255,255,0.2) 0%, rgba(255,255,255,0.1) 50%, transparent 100%) : radial-gradient(circle at center, rgba(0,0,0,0.05) 0%, rgba(0,0,0,0.03) 50%, transparent 100%)",
}}
exit={{
opacity: 0,
scale: 0.8,
transition: {
duration: 0.2,
},
}}
transition={{
type: "spring",
bounce: 0.4,
duration: 0.4,
}}
/>
)}
</Link>
</motion.div>
))}
</motion.div>
<ThemeTogglerComponent />
<AnimatePresence mode="popLayout" initial={false}>
{!visible && (


@@ -20,7 +20,7 @@ type CitationProps = {
/**
* Citation component to handle individual citations
*/
export const Citation = ({ citationId, citationText, position, source }: CitationProps) => {
export const Citation = React.memo(({ citationId, citationText, position, source }: CitationProps) => {
const [open, setOpen] = useState(false);
const citationKey = `citation-${citationId}-${position}`;
@@ -38,37 +38,41 @@ export const Citation = ({ citationId, citationText, position, source }: Citatio
</span>
</sup>
</DropdownMenuTrigger>
<DropdownMenuContent align="start" className="w-80 p-0">
<Card className="border-0 shadow-none">
<div className="p-3 flex items-start gap-3">
<div className="flex-shrink-0 w-7 h-7 flex items-center justify-center bg-muted rounded-full">
{getConnectorIcon(source.connectorType || '')}
</div>
<div className="flex-1">
<div className="flex items-center gap-2 mb-1">
<h3 className="font-medium text-sm text-card-foreground">{source.title}</h3>
{open && (
<DropdownMenuContent align="start" className="w-80 p-0" forceMount>
<Card className="border-0 shadow-none">
<div className="p-3 flex items-start gap-3">
<div className="flex-shrink-0 w-7 h-7 flex items-center justify-center bg-muted rounded-full">
{getConnectorIcon(source.connectorType || '')}
</div>
<p className="text-sm text-muted-foreground mt-0.5">{source.description}</p>
<div className="mt-2 flex items-center text-xs text-muted-foreground">
<span className="truncate max-w-[200px]">{source.url}</span>
<div className="flex-1">
<div className="flex items-center gap-2 mb-1">
<h3 className="font-medium text-sm text-card-foreground">{source.title}</h3>
</div>
<p className="text-sm text-muted-foreground mt-0.5">{source.description}</p>
<div className="mt-2 flex items-center text-xs text-muted-foreground">
<span className="truncate max-w-[200px]">{source.url}</span>
</div>
</div>
<Button
variant="ghost"
size="icon"
className="h-7 w-7 rounded-full"
onClick={() => window.open(source.url, '_blank', 'noopener,noreferrer')}
title="Open in new tab"
>
<ExternalLink className="h-3.5 w-3.5" />
</Button>
</div>
<Button
variant="ghost"
size="icon"
className="h-7 w-7 rounded-full"
onClick={() => window.open(source.url, '_blank')}
title="Open in new tab"
>
<ExternalLink className="h-3.5 w-3.5" />
</Button>
</div>
</Card>
</DropdownMenuContent>
</Card>
</DropdownMenuContent>
)}
</DropdownMenu>
</span>
);
};
});
Citation.displayName = 'Citation';
/**
* Function to render text with citations


@@ -1,4 +1,4 @@
import React from "react";
import React, { useMemo } from "react";
import ReactMarkdown from "react-markdown";
import rehypeRaw from "rehype-raw";
import rehypeSanitize from "rehype-sanitize";
@@ -14,75 +14,87 @@ interface MarkdownViewerProps {
}
export function MarkdownViewer({ content, className, getCitationSource }: MarkdownViewerProps) {
// Memoize the markdown components to prevent unnecessary re-renders
const components = useMemo(() => {
return {
// Define custom components for markdown elements
p: ({node, children, ...props}: any) => {
// If there's no getCitationSource function, just render normally
if (!getCitationSource) {
return <p className="my-2" {...props}>{children}</p>;
}
// Process citations within paragraph content
return <p className="my-2" {...props}>{processCitationsInReactChildren(children, getCitationSource)}</p>;
},
a: ({node, children, ...props}: any) => {
// Process citations within link content if needed
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <a className="text-primary hover:underline" {...props}>{processedChildren}</a>;
},
li: ({node, children, ...props}: any) => {
// Process citations within list item content
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <li {...props}>{processedChildren}</li>;
},
ul: ({node, ...props}: any) => <ul className="list-disc pl-5 my-2" {...props} />,
ol: ({node, ...props}: any) => <ol className="list-decimal pl-5 my-2" {...props} />,
h1: ({node, children, ...props}: any) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h1 className="text-2xl font-bold mt-6 mb-2" {...props}>{processedChildren}</h1>;
},
h2: ({node, children, ...props}: any) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h2 className="text-xl font-bold mt-5 mb-2" {...props}>{processedChildren}</h2>;
},
h3: ({node, children, ...props}: any) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h3 className="text-lg font-bold mt-4 mb-2" {...props}>{processedChildren}</h3>;
},
h4: ({node, children, ...props}: any) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h4 className="text-base font-bold mt-3 mb-1" {...props}>{processedChildren}</h4>;
},
blockquote: ({node, ...props}: any) => <blockquote className="border-l-4 border-muted pl-4 italic my-2" {...props} />,
hr: ({node, ...props}: any) => <hr className="my-4 border-muted" {...props} />,
img: ({node, ...props}: any) => <img className="max-w-full h-auto my-4 rounded" {...props} />,
table: ({node, ...props}: any) => <div className="overflow-x-auto my-4"><table className="min-w-full divide-y divide-border" {...props} /></div>,
th: ({node, ...props}: any) => <th className="px-3 py-2 text-left font-medium bg-muted" {...props} />,
td: ({node, ...props}: any) => <td className="px-3 py-2 border-t border-border" {...props} />,
code: ({node, className, children, ...props}: any) => {
const match = /language-(\w+)/.exec(className || '');
const isInline = !match;
return isInline
? <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>
: (
<div className="relative my-4">
<pre className="bg-muted p-4 rounded-md overflow-x-auto">
<code className="text-xs" {...props}>{children}</code>
</pre>
</div>
);
}
};
}, [getCitationSource]);
return (
<div className={cn("prose prose-sm dark:prose-invert max-w-none", className)}>
<ReactMarkdown
rehypePlugins={[rehypeRaw, rehypeSanitize]}
remarkPlugins={[remarkGfm]}
components={{
// Define custom components for markdown elements
p: ({node, children, ...props}) => {
// If there's no getCitationSource function, just render normally
if (!getCitationSource) {
return <p className="my-2" {...props}>{children}</p>;
}
// Process citations within paragraph content
return <p className="my-2" {...props}>{processCitationsInReactChildren(children, getCitationSource)}</p>;
},
a: ({node, children, ...props}) => {
// Process citations within link content if needed
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <a className="text-primary hover:underline" {...props}>{processedChildren}</a>;
},
ul: ({node, ...props}) => <ul className="list-disc pl-5 my-2" {...props} />,
ol: ({node, ...props}) => <ol className="list-decimal pl-5 my-2" {...props} />,
h1: ({node, children, ...props}) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h1 className="text-2xl font-bold mt-6 mb-2" {...props}>{processedChildren}</h1>;
},
h2: ({node, children, ...props}) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h2 className="text-xl font-bold mt-5 mb-2" {...props}>{processedChildren}</h2>;
},
h3: ({node, children, ...props}) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h3 className="text-lg font-bold mt-4 mb-2" {...props}>{processedChildren}</h3>;
},
h4: ({node, children, ...props}) => {
const processedChildren = getCitationSource
? processCitationsInReactChildren(children, getCitationSource)
: children;
return <h4 className="text-base font-bold mt-3 mb-1" {...props}>{processedChildren}</h4>;
},
blockquote: ({node, ...props}) => <blockquote className="border-l-4 border-muted pl-4 italic my-2" {...props} />,
hr: ({node, ...props}) => <hr className="my-4 border-muted" {...props} />,
img: ({node, ...props}) => <img className="max-w-full h-auto my-4 rounded" {...props} />,
table: ({node, ...props}) => <div className="overflow-x-auto my-4"><table className="min-w-full divide-y divide-border" {...props} /></div>,
th: ({node, ...props}) => <th className="px-3 py-2 text-left font-medium bg-muted" {...props} />,
td: ({node, ...props}) => <td className="px-3 py-2 border-t border-border" {...props} />,
code: ({node, className, children, ...props}: any) => {
const match = /language-(\w+)/.exec(className || '');
const isInline = !match;
return isInline
? <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>
: (
<div className="relative my-4">
<pre className="bg-muted p-4 rounded-md overflow-x-auto">
<code className="text-xs" {...props}>{children}</code>
</pre>
</div>
);
}
}}
components={components}
>
{content}
</ReactMarkdown>
@ -91,7 +103,7 @@ export function MarkdownViewer({ content, className, getCitationSource }: Markdo
}
// Helper function to process citations within React children
function processCitationsInReactChildren(children: React.ReactNode, getCitationSource: (id: number) => Source | null): React.ReactNode {
const processCitationsInReactChildren = (children: React.ReactNode, getCitationSource: (id: number) => Source | null): React.ReactNode => {
// If children is not an array or string, just return it
if (!children || (typeof children !== 'string' && !Array.isArray(children))) {
return children;
@ -113,10 +125,10 @@ function processCitationsInReactChildren(children: React.ReactNode, getCitationS
}
return children;
}
};
// Process citation references in text content
function processCitationsInText(text: string, getCitationSource: (id: number) => Source | null): React.ReactNode[] {
const processCitationsInText = (text: string, getCitationSource: (id: number) => Source | null): React.ReactNode[] => {
// Use improved regex to catch citation numbers more reliably
// This will match patterns like [1], [42], etc. including when they appear at the end of a line or sentence
const citationRegex = /\[(\d+)\]/g;
@ -124,14 +136,8 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
let lastIndex = 0;
let match;
let position = 0;
// Debug log for troubleshooting
console.log("Processing citations in text:", text);
while ((match = citationRegex.exec(text)) !== null) {
// Log each match for debugging
console.log("Citation match found:", match[0], "at index", match.index);
// Add text before the citation
if (match.index > lastIndex) {
parts.push(text.substring(lastIndex, match.index));
@ -141,9 +147,6 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
const citationId = parseInt(match[1], 10);
const source = getCitationSource(citationId);
// Log the citation details
console.log("Citation ID:", citationId, "Source:", source ? "found" : "not found");
parts.push(
<Citation
key={`citation-${citationId}-${position}`}
@ -164,4 +167,4 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
}
return parts;
}
};
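
To illustrate the citation matching above outside of React, here is a minimal sketch that splits a string on `[n]` markers using the same regular expression. The `TextPart` type and the `splitCitations` name are hypothetical and not part of the SurfSense code; the real function builds React nodes (including `<Citation>` elements) rather than plain objects.

```typescript
// Hypothetical plain-data stand-in for the React nodes built in processCitationsInText.
type TextPart =
  | { kind: "text"; value: string }
  | { kind: "citation"; id: number };

// Split text into plain segments and citation markers such as [1] or [42].
function splitCitations(text: string): TextPart[] {
  const citationRegex = /\[(\d+)\]/g; // same pattern as in the diff above
  const parts: TextPart[] = [];
  let lastIndex = 0;
  let match: RegExpExecArray | null;

  while ((match = citationRegex.exec(text)) !== null) {
    // Keep any text that precedes this citation.
    if (match.index > lastIndex) {
      parts.push({ kind: "text", value: text.substring(lastIndex, match.index) });
    }
    parts.push({ kind: "citation", id: parseInt(match[1], 10) });
    lastIndex = match.index + match[0].length;
  }

  // Keep any trailing text after the last citation.
  if (lastIndex < text.length) {
    parts.push({ kind: "text", value: text.substring(lastIndex) });
  }
  return parts;
}

// Yields five parts: text, citation 1, text, citation 12, text.
const example = splitCitations("Vectors are indexed [1] and reranked [12].");
```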

@ -0,0 +1,168 @@
---
title: Docker Installation
description: Setting up SurfSense using Docker
full: true
---
## Known Limitations
⚠️ **Important Note:** Currently, the following features have limited functionality when running in Docker:
- **Ollama integration:** Local Ollama models do not work when running SurfSense in Docker. Please use other LLM providers like OpenAI or Gemini instead.
- **Web crawler functionality:** The web crawler feature currently doesn't work properly within the Docker environment.
We're actively working to resolve these limitations in future releases.
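
Because of the Ollama limitation, it can help to check which LLM settings point at Ollama before starting the containers. The sketch below assumes the `FAST_LLM`, `STRATEGIC_LLM`, and `LONG_CONTEXT_LLM` variables described later in this guide and the `ollama/` prefix used in their examples; `findOllamaSettings` is a hypothetical helper, not part of SurfSense.

```typescript
// Variables that hold LiteLLM model names, as listed in the backend table of this guide.
const LLM_VARIABLES = ["FAST_LLM", "STRATEGIC_LLM", "LONG_CONTEXT_LLM"];

// Return the names of the LLM variables whose value uses the "ollama/" prefix.
function findOllamaSettings(env: Record<string, string | undefined>): string[] {
  return LLM_VARIABLES.filter((name) => {
    const value = env[name];
    return value !== undefined && value.indexOf("ollama/") === 0;
  });
}

// Example values taken from the tables in this guide.
const flagged = findOllamaSettings({
  FAST_LLM: "openai/gpt-4o-mini",
  STRATEGIC_LLM: "ollama/gemma3:12b",
  LONG_CONTEXT_LLM: "gemini/gemini-2.0-flash",
});
// flagged is ["STRATEGIC_LLM"]
```

Any variable reported here would need to be switched to another provider before running SurfSense in Docker.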
# Docker Installation
This guide explains how to run SurfSense using Docker Compose, the easiest way to get SurfSense up and running with all dependencies containerized.
## Prerequisites
Before you begin, ensure you have:
- [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) installed on your machine
- [Git](https://git-scm.com/downloads) (to clone the repository)
- Completed all the [prerequisite setup steps](/docs) including:
- PGVector setup
- Google OAuth configuration
- Unstructured.io API key
- Other required API keys
## Installation Steps
1. **Configure Environment Variables**
Set up the necessary environment variables:
**Linux/macOS:**
```bash
# Copy example environment files
cp surfsense_backend/.env.example surfsense_backend/.env
cp surfsense_web/.env.example surfsense_web/.env
```
**Windows (Command Prompt):**
```cmd
copy surfsense_backend\.env.example surfsense_backend\.env
copy surfsense_web\.env.example surfsense_web\.env
```
**Windows (PowerShell):**
```powershell
Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env
Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env
```
Edit both `.env` files and fill in the required values:
**Backend Environment Variables:**
| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID obtained from Google Cloud Console |
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret obtained from Google Cloud Console |
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
| FAST_LLM | LiteLLM routed smaller, faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
| STRATEGIC_LLM | LiteLLM routed advanced LLM for complex tasks (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
| LONG_CONTEXT_LLM | LiteLLM routed LLM for longer context windows (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing |
| FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling |
Include API keys for the LLM providers you're using. For example:
- `OPENAI_API_KEY`: If using OpenAI models
- `GEMINI_API_KEY`: If using Google Gemini models
For other LLM providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
**Frontend Environment Variables:**
| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend service (e.g., `http://localhost:8000`) |
2. **Build and Start Containers**
Start the Docker containers:
**Linux/macOS/Windows:**
```bash
docker-compose up --build
```
To run in detached mode (in the background):
**Linux/macOS/Windows:**
```bash
docker-compose up -d
```
**Note:** Newer Docker releases ship Compose V2 as a plugin, which is invoked as `docker compose` (with a space). If the `docker-compose` command is not found on your system, use `docker compose` instead.
3. **Access the Applications**
Once the containers are running, you can access:
- Frontend: [http://localhost:3000](http://localhost:3000)
- Backend API: [http://localhost:8000](http://localhost:8000)
- API Documentation: [http://localhost:8000/docs](http://localhost:8000/docs)
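
The `DATABASE_URL` example from step 1 packs the driver, credentials, host, port, and database name into one string. As a rough illustration of how those pieces line up, the sketch below splits such a URL with a regular expression. `parseDatabaseUrl` is a hypothetical helper written for this guide, and the pattern only covers the simple `scheme://user:password@host:port/database` shape used in the example.

```typescript
interface DatabaseUrlParts {
  scheme: string; // e.g. "postgresql+asyncpg"
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Split a simple scheme://user:password@host:port/database URL into its parts.
// Returns null when the string does not match that shape.
function parseDatabaseUrl(url: string): DatabaseUrlParts | null {
  const match = /^([a-z0-9+.-]+):\/\/([^:@\/]+):([^@\/]*)@([^:\/]+):(\d+)\/(.+)$/i.exec(url);
  if (match === null) {
    return null;
  }
  return {
    scheme: match[1],
    user: match[2],
    password: match[3],
    host: match[4],
    port: parseInt(match[5], 10),
    database: match[6],
  };
}

const parts = parseDatabaseUrl(
  "postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense"
);
// parts.host is "localhost", parts.port is 5432, parts.database is "surfsense"
```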
## Useful Docker Commands
### Container Management
- **Stop containers:**
**Linux/macOS/Windows:**
```bash
docker-compose down
```
- **View logs:**
**Linux/macOS/Windows:**
```bash
# All services
docker-compose logs -f
# Specific service
docker-compose logs -f backend
docker-compose logs -f frontend
docker-compose logs -f db
```
- **Restart a specific service:**
**Linux/macOS/Windows:**
```bash
docker-compose restart backend
```
- **Execute commands in a running container:**
**Linux/macOS/Windows:**
```bash
# Backend
docker-compose exec backend python -m pytest
# Frontend
docker-compose exec frontend pnpm lint
```
## Troubleshooting
- **Linux/macOS:** If you encounter permission errors, you may need to run the docker commands with `sudo`.
- **Windows:** If you see access denied errors, make sure you're running Command Prompt or PowerShell as Administrator.
- If ports are already in use, modify the port mappings in the `docker-compose.yml` file.
- For backend dependency issues, check the `Dockerfile` in the backend directory.
- For frontend dependency issues, check the `Dockerfile` in the frontend directory.
- **Windows-specific:** If you encounter line ending issues (CRLF vs LF), configure Git to keep LF line endings in your working tree with `git config --global core.autocrlf input` before cloning the repository.
## Next Steps
Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and log in using your Google account.

@ -1,7 +1,99 @@
---
title: Welcome Docs
title: Prerequisites
description: Required setup steps before installing SurfSense
full: true
---
## Introduction
I love Docs.
## PGVector Installation Guide
SurfSense requires the pgvector extension for PostgreSQL:
### Linux and Mac
Compile and install the extension (supports Postgres 13+):
```sh
cd /tmp
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install # may need sudo
```
See the [installation notes](https://github.com/pgvector/pgvector/tree/master#installation-notes---linux-and-mac) if you run into issues.
### Windows
Ensure [C++ support in Visual Studio](https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170#download-and-install-the-tools) is installed, and run:
```cmd
call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"
```
Note: The exact path will vary depending on your Visual Studio version and edition.
Then use `nmake` to build:
```cmd
set "PGROOT=C:\Program Files\PostgreSQL\16"
cd %TEMP%
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git
cd pgvector
nmake /F Makefile.win
nmake /F Makefile.win install
```
See the [installation notes](https://github.com/pgvector/pgvector/tree/master#installation-notes---windows) if you run into issues.
---
## Google OAuth Setup
SurfSense user management and authentication rely on Google OAuth. Let's set it up.
1. Log in to your [Google Developer Console](https://console.cloud.google.com/)
2. Enable People API.
![Google Developer Console People API](/docs/google_oauth_people_api.png)
3. Set up OAuth consent screen.
![Google Developer Console OAuth consent screen](/docs/google_oauth_screen.png)
4. Create OAuth client ID and secret.
![Google Developer Console OAuth client ID](/docs/google_oauth_client.png)
5. It should look like this.
![Google Developer Console Config](/docs/google_oauth_config.png)
---
## File Uploads
Files are converted to LLM-friendly formats using [Unstructured](https://github.com/Unstructured-IO/unstructured).
1. Get an Unstructured.io API key from the [Unstructured Platform](https://platform.unstructured.io/).
2. You should be able to generate API keys once registered.
![Unstructured Dashboard](/docs/unstructured.png)
---
## LLM Observability (Optional)
This is not required for SurfSense to work, but it is always a good idea to monitor LLM interactions so that we do not have those WTH moments.
1. Get a LangSmith API key from [smith.langchain.com](https://smith.langchain.com/)
2. This helps you observe the SurfSense Researcher Agent.
![LangSmith](/docs/langsmith.png)
---
## Crawler
SurfSense has two options for saving webpages:
- [SurfSense Extension](https://github.com/MODSetter/SurfSense/tree/main/surfsense_browser_extension) (Overall better experience & ability to save private webpages, recommended)
- Crawler (If you want to save public webpages)
**NOTE:** SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/) for web crawling. If you plan on using the crawler, you will need to create a Firecrawl account and get an API key.
---
## Next Steps
Once you have all prerequisites in place, proceed to the [installation guide](/docs/installation) to set up SurfSense.

@ -0,0 +1,21 @@
---
title: Installation
description: Current ways to use SurfSense
full: true
---
# Installing SurfSense
There are two ways to install SurfSense, but both require the repository to be cloned first. Clone [SurfSense](https://github.com/MODSetter/SurfSense) and then:
## Docker Installation
This method provides a containerized environment with all dependencies pre-configured, at the cost of less customization.
[Learn more about Docker installation](/docs/docker-installation)
## Manual Installation (Preferred)
For users who prefer more control over the installation process or need to customize their setup, we also provide manual installation instructions.
[Learn more about Manual installation](/docs/manual-installation)

@ -0,0 +1,258 @@
---
title: Manual Installation
description: Setting up SurfSense manually for customized deployments (Preferred)
full: true
---
# Manual Installation (Preferred)
This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment.
## Prerequisites
Before beginning the manual installation, ensure you have completed all the [prerequisite setup steps](/docs), including:
- PGVector installation
- Google OAuth setup
- Unstructured.io API key
- LLM observability (optional)
- Crawler setup (if needed)
## Backend Setup
The backend is the core of SurfSense. Follow these steps to set it up:
### 1. Environment Configuration
First, create and configure your environment variables by copying the example file:
**Linux/macOS:**
```bash
cd surfsense_backend
cp .env.example .env
```
**Windows (Command Prompt):**
```cmd
cd surfsense_backend
copy .env.example .env
```
**Windows (PowerShell):**
```powershell
cd surfsense_backend
Copy-Item -Path .env.example -Destination .env
```
Edit the `.env` file and set the following variables:
| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID |
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret |
| NEXT_FRONTEND_URL | Frontend application URL (e.g., `http://localhost:3000`) |
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
| FAST_LLM | LiteLLM routed faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
| STRATEGIC_LLM | LiteLLM routed advanced LLM (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
| LONG_CONTEXT_LLM | LiteLLM routed long-context LLM (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service |
| FIRECRAWL_API_KEY | API key for Firecrawl service (if using crawler) |
**Important**: Since LLM calls are routed through LiteLLM, include API keys for the LLM providers you're using:
- For OpenAI models: `OPENAI_API_KEY`
- For Google Gemini models: `GEMINI_API_KEY`
- For other providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers)
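
The rule above, that each LiteLLM provider in use needs its own API key, can be sketched as a small check. The mapping below covers only the two providers named in this guide; `PROVIDER_KEYS` and `missingProviderKeys` are hypothetical names, and treating other prefixes (such as `ollama`) as needing no key is an assumption made for this sketch, so consult the LiteLLM documentation for anything else.

```typescript
// Provider prefix -> API key variable, limited to the providers named in this guide.
const PROVIDER_KEYS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

const LLM_VARIABLES = ["FAST_LLM", "STRATEGIC_LLM", "LONG_CONTEXT_LLM"];

// Return the API key variables that the configured models need but the env lacks.
function missingProviderKeys(env: Record<string, string | undefined>): string[] {
  const missing: string[] = [];
  for (const name of LLM_VARIABLES) {
    const model = env[name];
    if (!model) continue;
    // The model examples in the table use the "provider/model" form.
    const provider = model.split("/")[0];
    // Assumption: providers missing from PROVIDER_KEYS need no key in this sketch.
    if (!Object.prototype.hasOwnProperty.call(PROVIDER_KEYS, provider)) continue;
    const keyName = PROVIDER_KEYS[provider];
    if (!env[keyName] && missing.indexOf(keyName) === -1) {
      missing.push(keyName);
    }
  }
  return missing;
}

const missing = missingProviderKeys({
  FAST_LLM: "openai/gpt-4o-mini",
  STRATEGIC_LLM: "openai/gpt-4o",
  LONG_CONTEXT_LLM: "gemini/gemini-2.0-flash",
  OPENAI_API_KEY: "sk-placeholder",
});
// missing is ["GEMINI_API_KEY"]
```

Passing the parsed contents of the backend `.env` file to a check like this before starting the server would surface a missing key early instead of at the first LLM call.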
### 2. Install Dependencies
Install the backend dependencies using `uv`:
**Linux/macOS:**
```bash
# Install uv if you don't have it
curl -fsSL https://astral.sh/uv/install.sh | bash
# Install dependencies
uv sync
```
**Windows (PowerShell):**
```powershell
# Install uv if you don't have it
iwr -useb https://astral.sh/uv/install.ps1 | iex
# Install dependencies
uv sync
```
**Windows (Command Prompt):**
```cmd
REM Install dependencies with uv (after installing uv)
uv sync
```
### 3. Run the Backend
Start the backend server:
**Linux/macOS/Windows:**
```bash
# Run without hot reloading
uv run main.py
# Or with hot reloading for development
uv run main.py --reload
```
If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.
## Frontend Setup
### 1. Environment Configuration
Set up the frontend environment:
**Linux/macOS:**
```bash
cd surfsense_web
cp .env.example .env
```
**Windows (Command Prompt):**
```cmd
cd surfsense_web
copy .env.example .env
```
**Windows (PowerShell):**
```powershell
cd surfsense_web
Copy-Item -Path .env.example -Destination .env
```
Edit the `.env` file and set:
| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
### 2. Install Dependencies
Install the frontend dependencies:
**Linux/macOS:**
```bash
# Install pnpm if you don't have it
npm install -g pnpm
# Install dependencies
pnpm install
```
**Windows:**
```powershell
# Install pnpm if you don't have it
npm install -g pnpm
# Install dependencies
pnpm install
```
### 3. Run the Frontend
Start the Next.js development server:
**Linux/macOS/Windows:**
```bash
pnpm run dev
```
The frontend should now be running at `http://localhost:3000`.
## Browser Extension Setup (Optional)
The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.
### 1. Environment Configuration
**Linux/macOS:**
```bash
cd surfsense_browser_extension
cp .env.example .env
```
**Windows (Command Prompt):**
```cmd
cd surfsense_browser_extension
copy .env.example .env
```
**Windows (PowerShell):**
```powershell
cd surfsense_browser_extension
Copy-Item -Path .env.example -Destination .env
```
Edit the `.env` file:
| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |
### 2. Build the Extension
Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).
**Linux/macOS/Windows:**
```bash
# Install dependencies
pnpm install
# Build for Chrome (default)
pnpm build
# Or for other browsers
pnpm build --target=firefox
pnpm build --target=edge
```
### 3. Load the Extension
Load the extension in your browser's developer mode and configure it with your SurfSense API key.
## Verification
To verify your installation:
1. Open your browser and navigate to `http://localhost:3000`
2. Sign in with your Google account
3. Create a search space and try uploading a document
4. Test the chat functionality with your uploaded content
## Troubleshooting
- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed
- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly
- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible
- **File Upload Failures**: Validate your Unstructured.io API key
- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`)
- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands
## Next Steps
Now that you have SurfSense running locally, you can explore its features:
- Create search spaces for organizing your content
- Upload documents or use the browser extension to save webpages
- Ask questions about your saved content
- Explore the advanced RAG capabilities
For production deployments, consider setting up:
- A reverse proxy like Nginx
- SSL certificates for secure connections
- Proper database backups
- User access controls

@ -0,0 +1,12 @@
{
"title": "Setup",
"description": "The setup guide for SurfSense",
"root": true,
"pages": [
"---Setup---",
"index",
"installation",
"docker-installation",
"manual-installation"
]
}

Binary files not shown: 6 images added (98 KiB, 47 KiB, 68 KiB, 71 KiB, 301 KiB, 108 KiB).