mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-06-08 20:25:19 +02:00
Merge branch 'MODSetter:main' into main
commit
7e639108c6
19 changed files with 899 additions and 376 deletions
130
README.md
|
|
@ -72,97 +72,23 @@ Join the [SurfSense Discord](https://discord.gg/ejRNvftDp9) and help shape the f
|
|||
|
||||
## How to get started?
|
||||
|
||||
### PRE-START CHECKS
|
||||
### Installation Options
|
||||
|
||||
#### PGVector
|
||||
Make sure the pgvector extension is installed on your machine. Setup guide: https://github.com/pgvector/pgvector?tab=readme-ov-file#installation
|
||||
SurfSense provides two installation methods:
|
||||
|
||||
#### File Uploading Support
|
||||
For file uploads you need an Unstructured.io API key. You can get one at http://platform.unstructured.io/
|
||||
1. **[Docker Installation](https://www.surfsense.net/docs/docker-installation)** - The easiest way to get SurfSense up and running with all dependencies containerized. Less Customization.
|
||||
|
||||
#### Auth
|
||||
SurfSense currently works only with Google OAuth. Set up your OAuth client by following https://developers.google.com/identity/protocols/oauth2. The backend needs the client ID and client secret. Make sure to enable the People API and add the required scopes under Data Access (openid, userinfo.email, userinfo.profile).
|
||||
2. **[Manual Installation (Recommended)](https://www.surfsense.net/docs/manual-installation)** - For users who prefer more control over their setup or need to customize their deployment.
|
||||
|
||||

|
||||
Both installation guides include detailed OS-specific instructions for Windows, macOS, and Linux.
|
||||
|
||||
#### LLM Observability
|
||||
One easy way to observe the SurfSense Researcher Agent is to use LangSmith. Get an API key from https://smith.langchain.com/
|
||||
Before installation, make sure to complete the [prerequisite setup steps](https://www.surfsense.net/docs/) including:
|
||||
- PGVector setup
|
||||
- Google OAuth configuration
|
||||
- Unstructured.io API key
|
||||
- Other required API keys
|
||||
|
||||
**Open AI LLMS**
|
||||

|
||||
|
||||
|
||||
**Ollama LLMS**
|
||||

|
||||
|
||||
|
||||
#### Crawler Support
|
||||
SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/). Playwright crawler support will be added soon.
|
||||
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Preferred Method: Docker Setup
|
||||
The recommended way to run SurfSense is with Docker, which ensures a consistent environment across different systems.
|
||||
|
||||
1. Make sure you have Docker and Docker Compose installed
|
||||
2. Follow the detailed instructions in our [Docker Setup Guide](DOCKER_SETUP.md)
|
||||
|
||||
```bash
|
||||
# Start all services with one command
|
||||
docker-compose up --build
|
||||
```
|
||||
|
||||
---
|
||||
### Alternative: Manual Setup
|
||||
|
||||
### Backend (./surfsense_backend)
|
||||
This is the core of SurfSense. Before we begin, let's look at the `.env` variables that we need to set up SurfSense successfully.
|
||||
|
||||
|ENV VARIABLE|DESCRIPTION|
|
||||
|--|--|
|
||||
| DATABASE_URL| Your PostgreSQL database connection string. Eg. `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`|
|
||||
| SECRET_KEY| JWT Secret key used for authentication. Should be a secure random string. Eg. `SURFSENSE_SECRET_KEY_123456789`|
|
||||
| GOOGLE_OAUTH_CLIENT_ID| Google OAuth client ID obtained from Google Cloud Console when setting up OAuth authentication|
|
||||
| GOOGLE_OAUTH_CLIENT_SECRET| Google OAuth client secret obtained from Google Cloud Console when setting up OAuth authentication|
|
||||
| NEXT_FRONTEND_URL| URL where your frontend application is hosted. Eg. `http://localhost:3000`|
|
||||
| EMBEDDING_MODEL| Name of the embedding model to use for vector embeddings. Currently works with Sentence Transformers only. Expect other embeddings soon. Eg. `mixedbread-ai/mxbai-embed-large-v1`|
|
||||
| RERANKERS_MODEL_NAME| Name of the reranker model for search result reranking. Eg. `ms-marco-MiniLM-L-12-v2`|
|
||||
| RERANKERS_MODEL_TYPE| Type of reranker model being used. Eg. `flashrank`|
|
||||
| FAST_LLM| LiteLLM routed Smaller, faster LLM for quick responses. Eg. `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`|
|
||||
| STRATEGIC_LLM| LiteLLM routed Advanced LLM for complex reasoning tasks. Eg. `openai/gpt-4o`, `ollama/gemma3:12b`|
|
||||
| LONG_CONTEXT_LLM| LiteLLM routed LLM capable of handling longer context windows. Eg. `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`|
|
||||
| UNSTRUCTURED_API_KEY| API key for Unstructured.io service for document parsing|
|
||||
| FIRECRAWL_API_KEY| API key for Firecrawl service for web crawling and data extraction|
|
||||
|
||||
IMPORTANT: Since LLM calls are routed through LiteLLM, make sure to include the API keys for the LLM providers you are using. For example, if you use `openai/gpt-4o`, include the OpenAI API key as `OPENAI_API_KEY`; if you use `gemini/gemini-2.0-flash`, include `GEMINI_API_KEY`.
|
||||
|
||||
You can also integrate any other LLM by following https://docs.litellm.ai/docs/providers
|
||||
|
||||
Now once you have everything let's proceed to run SurfSense.
|
||||
1. Install `uv` : https://docs.astral.sh/uv/getting-started/installation/
|
||||
2. Install the dependencies by running `uv sync`
|
||||
3. That's it. Now run `main.py` with `uv run main.py`. You can optionally pass `--reload` to enable hot reloading.
|
||||
4. If everything worked, you should see a screen like this.
|
||||
|
||||

|
||||
|
||||
---
|
||||
|
||||
### FrontEnd (./surfsense_web)
|
||||
|
||||
For local frontend setup just fill out the `.env` file of frontend.
|
||||
|
||||
|ENV VARIABLE|DESCRIPTION|
|
||||
|--|--|
|
||||
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Give hosted backend url here. Eg. `http://localhost:8000`|
|
||||
|
||||
1. Now install dependencies using `pnpm install`
|
||||
2. Run it using `pnpm run dev`
|
||||
|
||||
You should see your Next.js frontend running at `localhost:3000`
|
||||
|
||||
#### Some FrontEnd Screens
|
||||
## Screenshots
|
||||
|
||||
**Search Spaces**
|
||||
|
||||
|
|
@ -180,43 +106,13 @@ You should see your Next.js frontend running at `localhost:3000`
|
|||
|
||||

|
||||
|
||||
---
|
||||
|
||||
### Extension (./surfsense_browser_extension)
|
||||
|
||||
The extension is built with Plasmo, a cross-browser extension framework. Its main use case is saving webpages that are protected behind authentication.
|
||||
|
||||
For building extension just fill out the `.env` file of frontend.
|
||||
|
||||
|ENV VARIABLE|DESCRIPTION|
|
||||
|--|--|
|
||||
| PLASMO_PUBLIC_BACKEND_URL| SurfSense Backend URL eg. "http://127.0.0.1:8000" |
|
||||
|
||||
Build the extension for your favorite browser using this guide: https://docs.plasmo.com/framework/workflows/build#with-a-specific-target
|
||||
|
||||
When you load and start the extension you should see a Apu page like this
|
||||
**Browser Extension**
|
||||
|
||||

|
||||
|
||||
|
||||
|
||||
After filling in your SurfSense API key, you should be able to use the extension.
|
||||
|
||||
|
||||

|
||||
|
||||
|
||||
|Options|Explanations|
|
||||
|--|--|
|
||||
| Search Space | Search Space to save your dynamic bookmarks. |
|
||||
| Clear Inactive History Sessions | It clears the saved content for Inactive Tab Sessions. |
|
||||
| Save Current Webpage Snapshot | Stores the current webpage session info into SurfSense history store|
|
||||
| Save to SurfSense | Processes the SurfSense History Store & Initiates a Save Job |
|
||||
|
||||
|
||||
|
||||
|
||||
## Tech Stack
|
||||
## Tech Stack
|
||||
|
||||
|
||||
### **BackEnd**
|
||||
|
|
|
|||
5
surfsense_backend/draw.py
Normal file
@@ -0,0 +1,5 @@
+from app.agents.researcher.graph import graph as researcher_graph
+from app.agents.researcher.sub_section_writer.graph import graph as sub_section_writer_graph
+
+print(researcher_graph.get_graph().draw_mermaid())
+print(sub_section_writer_graph.get_graph().draw_mermaid())
@@ -8,8 +8,15 @@ RUN npm install -g pnpm
 # Copy package files
 COPY package.json pnpm-lock.yaml ./
 
-# Install dependencies
-RUN pnpm install
+# First copy the config file to avoid fumadocs-mdx postinstall error
+COPY source.config.ts ./
+COPY content ./content
+
+# Install dependencies with --ignore-scripts to skip postinstall
+RUN pnpm install --ignore-scripts
+
+# Now run the postinstall script manually
+RUN pnpm fumadocs-mdx
 
 # Copy source code
 COPY . .
|
|
|
|||
|
|
@@ -240,7 +240,7 @@ const SourcesDialogContent = ({
 const ChatPage = () => {
   const [token, setToken] = React.useState<string | null>(null);
   const [activeTab, setActiveTab] = useState("");
-  const [dialogOpen, setDialogOpen] = useState(false);
+  const [dialogOpenId, setDialogOpenId] = useState<number | null>(null);
   const [sourcesPage, setSourcesPage] = useState(1);
   const [expandedSources, setExpandedSources] = useState(false);
   const [canScrollLeft, setCanScrollLeft] = useState(false);
@@ -260,6 +260,13 @@ const ChatPage = () => {
 
   const { search_space_id, chat_id } = useParams();
 
+  // Function to scroll terminal to bottom
+  const scrollTerminalToBottom = () => {
+    if (terminalMessagesRef.current) {
+      terminalMessagesRef.current.scrollTop = terminalMessagesRef.current.scrollHeight;
+    }
+  };
+
   // Get token from localStorage on client side only
   React.useEffect(() => {
     setToken(localStorage.getItem('surfsense_bearer_token'));
@ -469,54 +476,60 @@ const ChatPage = () => {
|
|||
updateChat();
|
||||
}, [messages, status, chat_id, researchMode, selectedConnectors, search_space_id]);
|
||||
|
||||
// Log messages whenever they update and extract annotations from the latest assistant message if available
|
||||
useEffect(() => {
|
||||
console.log('Messages updated:', messages);
|
||||
|
||||
// Extract annotations from the latest assistant message if available
|
||||
// Memoize connector sources to prevent excessive re-renders
|
||||
const processedConnectorSources = React.useMemo(() => {
|
||||
if (messages.length === 0) return connectorSources;
|
||||
|
||||
// Only process when we have a complete message (not streaming)
|
||||
if (status !== 'ready') return connectorSources;
|
||||
|
||||
// Find the latest assistant message
|
||||
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
|
||||
if (assistantMessages.length > 0) {
|
||||
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
|
||||
if (latestAssistantMessage?.annotations) {
|
||||
const annotations = latestAssistantMessage.annotations as any[];
|
||||
|
||||
// Debug log to track streaming annotations
|
||||
if (process.env.NODE_ENV === 'development') {
|
||||
console.log('Streaming annotations:', annotations);
|
||||
|
||||
// Log counts of each annotation type
|
||||
const terminalInfoCount = annotations.filter(a => a.type === 'TERMINAL_INFO').length;
|
||||
const sourcesCount = annotations.filter(a => a.type === 'SOURCES').length;
|
||||
const answerCount = annotations.filter(a => a.type === 'ANSWER').length;
|
||||
|
||||
console.log(`Annotation counts - Terminal: ${terminalInfoCount}, Sources: ${sourcesCount}, Answer: ${answerCount}`);
|
||||
}
|
||||
|
||||
// Process SOURCES annotation - get the last one to ensure we have the latest
|
||||
const sourcesAnnotations = annotations.filter(
|
||||
(annotation) => annotation.type === 'SOURCES'
|
||||
);
|
||||
|
||||
if (sourcesAnnotations.length > 0) {
|
||||
// Get the last SOURCES annotation to ensure we have the most recent one
|
||||
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
|
||||
if (latestSourcesAnnotation.content) {
|
||||
setConnectorSources(latestSourcesAnnotation.content);
|
||||
}
|
||||
}
|
||||
|
||||
// Check for terminal info annotations and scroll terminal to bottom if they exist
|
||||
const terminalInfoAnnotations = annotations.filter(
|
||||
(annotation) => annotation.type === 'TERMINAL_INFO'
|
||||
);
|
||||
|
||||
if (terminalInfoAnnotations.length > 0) {
|
||||
// Schedule scrolling after the DOM has been updated
|
||||
setTimeout(scrollTerminalToBottom, 100);
|
||||
}
|
||||
}
|
||||
if (assistantMessages.length === 0) return connectorSources;
|
||||
|
||||
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
|
||||
if (!latestAssistantMessage?.annotations) return connectorSources;
|
||||
|
||||
// Find the latest SOURCES annotation
|
||||
const annotations = latestAssistantMessage.annotations as any[];
|
||||
const sourcesAnnotations = annotations.filter(a => a.type === 'SOURCES');
|
||||
|
||||
if (sourcesAnnotations.length === 0) return connectorSources;
|
||||
|
||||
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
|
||||
if (!latestSourcesAnnotation.content) return connectorSources;
|
||||
|
||||
// Use this content if it differs from current
|
||||
return latestSourcesAnnotation.content;
|
||||
}, [messages, status, connectorSources]);
|
||||
|
||||
// Update connector sources when processed value changes
|
||||
useEffect(() => {
|
||||
if (processedConnectorSources !== connectorSources) {
|
||||
setConnectorSources(processedConnectorSources);
|
||||
}
|
||||
}, [messages]);
|
||||
}, [processedConnectorSources, connectorSources]);
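The early-return chain inside the `processedConnectorSources` memo can be read as a pure selection function. The sketch below is illustrative only: `selectLatestSources`, `Annotation`, and `ChatMessage` are hypothetical names whose shapes are inferred from the fields this hunk reads, not taken from the project. It returns the fallback by reference whenever nothing newer is available, which is what keeps the `processedConnectorSources !== connectorSources` check in the follow-up effect cheap.

```typescript
// Hypothetical shapes, inferred from the fields the hunk accesses.
type Annotation = { type: string; content?: unknown };
type ChatMessage = { role: string; annotations?: Annotation[] };

// Returns the content of the last SOURCES annotation on the last assistant
// message, or `fallback` (the same reference) when there is nothing to use.
function selectLatestSources<T>(
  messages: ChatMessage[],
  status: string,
  fallback: T,
): T {
  // Skip all work while there are no messages or a response is still streaming.
  if (messages.length === 0 || status !== "ready") return fallback;

  // Only the most recent assistant message is considered.
  const assistantMessages = messages.filter((m) => m.role === "assistant");
  const latest = assistantMessages[assistantMessages.length - 1];
  if (!latest?.annotations) return fallback;

  // Several SOURCES annotations may exist; the last one wins.
  const sources = latest.annotations.filter((a) => a.type === "SOURCES");
  const last = sources[sources.length - 1];
  if (!last?.content) return fallback;

  return last.content as T;
}
```

Inside the component, a function like this could be called from the `useMemo` callback with `connectorSources` as the fallback. Whether that split is worthwhile depends on how much of the chain needs to be unit-tested outside React.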
|
||||
|
||||
// Check and scroll terminal when terminal info is available
|
||||
useEffect(() => {
|
||||
if (messages.length === 0 || status !== 'ready') return;
|
||||
|
||||
// Find the latest assistant message
|
||||
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
|
||||
if (assistantMessages.length === 0) return;
|
||||
|
||||
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
|
||||
if (!latestAssistantMessage?.annotations) return;
|
||||
|
||||
// Check for terminal info annotations
|
||||
const annotations = latestAssistantMessage.annotations as any[];
|
||||
const terminalInfoAnnotations = annotations.filter(a => a.type === 'TERMINAL_INFO');
|
||||
|
||||
if (terminalInfoAnnotations.length > 0) {
|
||||
// Schedule scrolling after the DOM has been updated
|
||||
setTimeout(scrollTerminalToBottom, 100);
|
||||
}
|
||||
}, [messages, status]);
|
||||
|
||||
// Custom handleSubmit function to include selected connectors and answer type
|
||||
const handleSubmit = (e: React.FormEvent) => {
|
||||
|
|
@ -543,24 +556,22 @@ const ChatPage = () => {
|
|||
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
|
||||
};
|
||||
|
||||
// Function to scroll terminal to bottom
|
||||
const scrollTerminalToBottom = () => {
|
||||
if (terminalMessagesRef.current) {
|
||||
terminalMessagesRef.current.scrollTop = terminalMessagesRef.current.scrollHeight;
|
||||
}
|
||||
};
|
||||
|
||||
// Scroll to bottom when messages change
|
||||
useEffect(() => {
|
||||
scrollToBottom();
|
||||
}, [messages]);
|
||||
|
||||
// Set activeTab when connectorSources change
|
||||
useEffect(() => {
|
||||
if (connectorSources.length > 0) {
|
||||
setActiveTab(connectorSources[0].type);
|
||||
}
|
||||
// Set activeTab when connectorSources change using a memoized value
|
||||
const activeTabValue = React.useMemo(() => {
|
||||
return connectorSources.length > 0 ? connectorSources[0].type : "";
|
||||
}, [connectorSources]);
|
||||
|
||||
// Update activeTab when the memoized value changes
|
||||
useEffect(() => {
|
||||
if (activeTabValue && activeTabValue !== activeTab) {
|
||||
setActiveTab(activeTabValue);
|
||||
}
|
||||
}, [activeTabValue, activeTab]);
|
||||
|
||||
// Scroll terminal to bottom when expanded
|
||||
useEffect(() => {
|
||||
|
|
@ -617,49 +628,89 @@ const ChatPage = () => {
|
|||
};
|
||||
|
||||
// Function to get a citation source by ID
|
||||
const getCitationSource = (citationId: number): Source | null => {
|
||||
const getCitationSource = React.useCallback((citationId: number, messageIndex?: number): Source | null => {
|
||||
if (!messages || messages.length === 0) return null;
|
||||
|
||||
// Find the latest assistant message
|
||||
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
|
||||
if (assistantMessages.length === 0) return null;
|
||||
// If no specific message index is provided, use the latest assistant message
|
||||
if (messageIndex === undefined) {
|
||||
// Find the latest assistant message
|
||||
const assistantMessages = messages.filter(msg => msg.role === 'assistant');
|
||||
if (assistantMessages.length === 0) return null;
|
||||
|
||||
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
|
||||
if (!latestAssistantMessage?.annotations) return null;
|
||||
const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
|
||||
if (!latestAssistantMessage?.annotations) return null;
|
||||
|
||||
// Find all SOURCES annotations
|
||||
const annotations = latestAssistantMessage.annotations as any[];
|
||||
const sourcesAnnotations = annotations.filter(
|
||||
(annotation) => annotation.type === 'SOURCES'
|
||||
);
|
||||
// Find all SOURCES annotations
|
||||
const annotations = latestAssistantMessage.annotations as any[];
|
||||
const sourcesAnnotations = annotations.filter(
|
||||
(annotation) => annotation.type === 'SOURCES'
|
||||
);
|
||||
|
||||
// Get the latest SOURCES annotation
|
||||
if (sourcesAnnotations.length === 0) return null;
|
||||
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
|
||||
// Get the latest SOURCES annotation
|
||||
if (sourcesAnnotations.length === 0) return null;
|
||||
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
|
||||
|
||||
if (!latestSourcesAnnotation.content) return null;
|
||||
if (!latestSourcesAnnotation.content) return null;
|
||||
|
||||
// Flatten all sources from all connectors
|
||||
const allSources: Source[] = [];
|
||||
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
|
||||
if (connector.sources && Array.isArray(connector.sources)) {
|
||||
connector.sources.forEach((source: SourceItem) => {
|
||||
allSources.push({
|
||||
id: source.id,
|
||||
title: source.title,
|
||||
description: source.description,
|
||||
url: source.url,
|
||||
connectorType: connector.type
|
||||
// Flatten all sources from all connectors
|
||||
const allSources: Source[] = [];
|
||||
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
|
||||
if (connector.sources && Array.isArray(connector.sources)) {
|
||||
connector.sources.forEach((source: SourceItem) => {
|
||||
allSources.push({
|
||||
id: source.id,
|
||||
title: source.title,
|
||||
description: source.description,
|
||||
url: source.url,
|
||||
connectorType: connector.type
|
||||
});
|
||||
});
|
||||
});
|
||||
}
|
||||
});
|
||||
}
|
||||
});
|
||||
|
||||
// Find the source with the matching ID
|
||||
const foundSource = allSources.find(source => source.id === citationId);
|
||||
// Find the source with the matching ID
|
||||
const foundSource = allSources.find(source => source.id === citationId);
|
||||
|
||||
return foundSource || null;
|
||||
};
|
||||
return foundSource || null;
|
||||
} else {
|
||||
// Use the specific message by index
|
||||
const message = messages[messageIndex];
|
||||
if (!message || message.role !== 'assistant' || !message.annotations) return null;
|
||||
|
||||
// Find all SOURCES annotations
|
||||
const annotations = message.annotations as any[];
|
||||
const sourcesAnnotations = annotations.filter(
|
||||
(annotation) => annotation.type === 'SOURCES'
|
||||
);
|
||||
|
||||
// Get the latest SOURCES annotation
|
||||
if (sourcesAnnotations.length === 0) return null;
|
||||
const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
|
||||
|
||||
if (!latestSourcesAnnotation.content) return null;
|
||||
|
||||
// Flatten all sources from all connectors
|
||||
const allSources: Source[] = [];
|
||||
latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
|
||||
if (connector.sources && Array.isArray(connector.sources)) {
|
||||
connector.sources.forEach((source: SourceItem) => {
|
||||
allSources.push({
|
||||
id: source.id,
|
||||
title: source.title,
|
||||
description: source.description,
|
||||
url: source.url,
|
||||
connectorType: connector.type
|
||||
});
|
||||
});
|
||||
}
|
||||
});
|
||||
|
||||
// Find the source with the matching ID
|
||||
const foundSource = allSources.find(source => source.id === citationId);
|
||||
|
||||
return foundSource || null;
|
||||
}
|
||||
}, [messages]);
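The two branches of the new `getCitationSource` differ only in how the message is chosen; everything after that (filter SOURCES, take the last one, flatten, find by ID) is repeated. As a rough sketch of how the lookup could be expressed once, assuming the shapes below (`findCitationSource` and `pickMessage` are hypothetical helpers, and the interfaces only mirror the fields used in this hunk):

```typescript
// Hypothetical shapes mirroring the fields read in the hunk above.
interface SourceItem { id: number; title: string; description: string; url: string }
interface ConnectorSource { type: string; sources?: SourceItem[] }
interface Source extends SourceItem { connectorType: string }
interface Annotation { type: string; content?: ConnectorSource[] }
interface ChatMessage { role: string; annotations?: Annotation[] }

// Choose the message to search: the one at `messageIndex` if given,
// otherwise the latest assistant message.
function pickMessage(messages: ChatMessage[], messageIndex?: number): ChatMessage | undefined {
  if (messageIndex !== undefined) return messages[messageIndex];
  const assistants = messages.filter((m) => m.role === "assistant");
  return assistants[assistants.length - 1];
}

// Look up a citation by ID in the last SOURCES annotation of the chosen message.
function findCitationSource(
  messages: ChatMessage[],
  citationId: number,
  messageIndex?: number,
): Source | null {
  const message = pickMessage(messages, messageIndex);
  if (!message || message.role !== "assistant" || !message.annotations) return null;

  const sourcesAnnotations = message.annotations.filter((a) => a.type === "SOURCES");
  const latest = sourcesAnnotations[sourcesAnnotations.length - 1];
  if (!latest?.content) return null;

  // Walk connectors in order; the first matching ID wins, as in the flattened search.
  for (const connector of latest.content) {
    if (!Array.isArray(connector.sources)) continue;
    const hit = connector.sources.find((s) => s.id === citationId);
    if (hit) {
      return {
        id: hit.id,
        title: hit.title,
        description: hit.description,
        url: hit.url,
        connectorType: connector.type,
      };
    }
  }
  return null;
}
```

In the component, the `React.useCallback` wrapper would then reduce to a single call with `messages` as its only dependency.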
|
||||
|
||||
return (
|
||||
<>
|
||||
|
|
@ -685,7 +736,11 @@ const ChatPage = () => {
|
|||
<div className="flex-1">
|
||||
<Card className="border-gray-300 dark:border-gray-700">
|
||||
<CardContent className="p-3">
|
||||
<MarkdownViewer content={message.content} getCitationSource={getCitationSource} className="text-sm" />
|
||||
<MarkdownViewer
|
||||
content={message.content}
|
||||
getCitationSource={(id) => getCitationSource(id, index)}
|
||||
className="text-sm"
|
||||
/>
|
||||
</CardContent>
|
||||
</Card>
|
||||
</div>
|
||||
|
|
@ -856,7 +911,7 @@ const ChatPage = () => {
|
|||
))}
|
||||
|
||||
{connector.sources.length > INITIAL_SOURCES_DISPLAY && (
|
||||
<Dialog open={dialogOpen && activeTab === connector.type} onOpenChange={(open) => setDialogOpen(open)}>
|
||||
<Dialog open={dialogOpenId === connector.id} onOpenChange={(open) => setDialogOpenId(open ? connector.id : null)}>
|
||||
<DialogTrigger asChild>
|
||||
<Button variant="ghost" className="w-full text-sm text-gray-500 dark:text-gray-400">
|
||||
Show {connector.sources.length - INITIAL_SOURCES_DISPLAY} More Sources
|
||||
|
|
@ -901,13 +956,16 @@ const ChatPage = () => {
|
|||
return (
|
||||
<MarkdownViewer
|
||||
content={latestAnswer.content.join('\n')}
|
||||
getCitationSource={getCitationSource}
|
||||
getCitationSource={(id) => getCitationSource(id, index)}
|
||||
/>
|
||||
);
|
||||
}
|
||||
|
||||
// Fallback to the message content if no ANSWER annotation is available
|
||||
return <MarkdownViewer content={message.content} getCitationSource={getCitationSource} />;
|
||||
return <MarkdownViewer
|
||||
content={message.content}
|
||||
getCitationSource={(id) => getCitationSource(id, index)}
|
||||
/>;
|
||||
})()}
|
||||
</div>
|
||||
}
|
||||
|
|
|
|||
|
|
@@ -1,6 +1,6 @@
 "use client";
 import { cn } from "@/lib/utils";
-import { IconArrowRight, IconBrandGithub, IconBrandDiscord } from "@tabler/icons-react";
+import { IconFileTypeDoc, IconBrandGithub, IconBrandDiscord } from "@tabler/icons-react";
 import Link from "next/link";
 import React from "react";
 import { motion } from "framer-motion";
@@ -20,11 +20,11 @@ export function ModernHeroWithGradients() {
 
 <div className="relative z-20 flex flex-col items-center justify-center overflow-hidden rounded-3xl p-4 md:p-12 lg:p-16">
   <Link
-    href="https://github.com/MODSetter/SurfSense"
+    href="/docs"
     className="flex items-center gap-1 rounded-full border border-gray-200 bg-gradient-to-b from-gray-50 to-gray-100 px-4 py-1 text-center text-sm text-gray-800 shadow-sm dark:border-[#404040] dark:bg-gradient-to-b dark:from-[#5B5B5D] dark:to-[#262627] dark:text-white dark:shadow-inner dark:shadow-purple-500/10"
   >
-    <span>SurfSense v0.0.6 Released</span>
-    <IconArrowRight className="h-4 w-4 text-gray-800 dark:text-white" />
+    <IconFileTypeDoc className="h-4 w-4 text-gray-800 dark:text-white" />
+    <span>Documentation</span>
   </Link>
   {/* Import the Logo component or define it in this file */}
   <div className="flex items-center justify-center gap-4 mt-10 mb-2">
|
|
|
|||
|
|
@@ -24,8 +24,8 @@ interface NavbarProps {
 export const Navbar = () => {
   const navItems = [
     {
-      name: "",
-      link: "/",
+      name: "Docs",
+      link: "/docs",
     },
     // {
     //   name: "Product",
|
|
@ -118,53 +118,52 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {
|
|||
<Logo className="h-8 w-8 rounded-md" />
|
||||
<span className="dark:text-white/90 text-gray-800 text-lg font-bold">SurfSense</span>
|
||||
</div>
|
||||
<motion.div
|
||||
className="lg:flex flex-row flex-1 items-center justify-center space-x-1 text-sm"
|
||||
animate={{
|
||||
scale: visible ? 0.9 : 1,
|
||||
justifyContent: visible ? "flex-end" : "center",
|
||||
}}
|
||||
>
|
||||
{navItems.map((navItem, idx) => (
|
||||
<motion.div
|
||||
key={`nav-item-${idx}`}
|
||||
onHoverStart={() => setHoveredIndex(idx)}
|
||||
className="relative"
|
||||
>
|
||||
<Link
|
||||
className="dark:text-white/90 text-gray-800 relative px-3 py-1.5 transition-colors"
|
||||
href={navItem.link}
|
||||
<div className="flex items-center gap-4">
|
||||
<motion.div
|
||||
className="lg:flex flex-row items-center justify-end space-x-1 text-sm"
|
||||
animate={{
|
||||
scale: visible ? 0.9 : 1,
|
||||
}}
|
||||
>
|
||||
{navItems.map((navItem, idx) => (
|
||||
<motion.div
|
||||
key={`nav-item-${idx}`}
|
||||
onHoverStart={() => setHoveredIndex(idx)}
|
||||
className="relative"
|
||||
>
|
||||
<span className="relative z-10">{navItem.name}</span>
|
||||
{hoveredIndex === idx && (
|
||||
<motion.div
|
||||
layoutId="menu-hover"
|
||||
className="absolute inset-0 rounded-full dark:bg-gradient-to-r dark:from-white/10 dark:to-white/20 bg-gradient-to-r from-gray-200 to-gray-300"
|
||||
initial={{ opacity: 0, scale: 0.8 }}
|
||||
animate={{
|
||||
opacity: 1,
|
||||
scale: 1.1,
|
||||
background: "var(--tw-dark) ? radial-gradient(circle at center, rgba(255,255,255,0.2) 0%, rgba(255,255,255,0.1) 50%, transparent 100%) : radial-gradient(circle at center, rgba(0,0,0,0.05) 0%, rgba(0,0,0,0.03) 50%, transparent 100%)",
|
||||
}}
|
||||
exit={{
|
||||
opacity: 0,
|
||||
scale: 0.8,
|
||||
transition: {
|
||||
duration: 0.2,
|
||||
},
|
||||
}}
|
||||
transition={{
|
||||
type: "spring",
|
||||
bounce: 0.4,
|
||||
duration: 0.4,
|
||||
}}
|
||||
/>
|
||||
)}
|
||||
</Link>
|
||||
</motion.div>
|
||||
))}
|
||||
</motion.div>
|
||||
<div className="flex items-center gap-2">
|
||||
<Link
|
||||
className="dark:text-white/90 text-gray-800 relative px-3 py-1.5 transition-colors"
|
||||
href={navItem.link}
|
||||
>
|
||||
<span className="relative z-10">{navItem.name}</span>
|
||||
{hoveredIndex === idx && (
|
||||
<motion.div
|
||||
layoutId="menu-hover"
|
||||
className="absolute inset-0 rounded-full dark:bg-gradient-to-r dark:from-white/10 dark:to-white/20 bg-gradient-to-r from-gray-200 to-gray-300"
|
||||
initial={{ opacity: 0, scale: 0.8 }}
|
||||
animate={{
|
||||
opacity: 1,
|
||||
scale: 1.1,
|
||||
background: "var(--tw-dark) ? radial-gradient(circle at center, rgba(255,255,255,0.2) 0%, rgba(255,255,255,0.1) 50%, transparent 100%) : radial-gradient(circle at center, rgba(0,0,0,0.05) 0%, rgba(0,0,0,0.03) 50%, transparent 100%)",
|
||||
}}
|
||||
exit={{
|
||||
opacity: 0,
|
||||
scale: 0.8,
|
||||
transition: {
|
||||
duration: 0.2,
|
||||
},
|
||||
}}
|
||||
transition={{
|
||||
type: "spring",
|
||||
bounce: 0.4,
|
||||
duration: 0.4,
|
||||
}}
|
||||
/>
|
||||
)}
|
||||
</Link>
|
||||
</motion.div>
|
||||
))}
|
||||
</motion.div>
|
||||
<ThemeTogglerComponent />
|
||||
<AnimatePresence mode="popLayout" initial={false}>
|
||||
{!visible && (
|
||||
|
|
|
|||
|
|
@@ -20,7 +20,7 @@ type CitationProps = {
 /**
  * Citation component to handle individual citations
  */
-export const Citation = ({ citationId, citationText, position, source }: CitationProps) => {
+export const Citation = React.memo(({ citationId, citationText, position, source }: CitationProps) => {
   const [open, setOpen] = useState(false);
   const citationKey = `citation-${citationId}-${position}`;
 
|
|
@ -38,37 +38,41 @@ export const Citation = ({ citationId, citationText, position, source }: Citatio
|
|||
</span>
|
||||
</sup>
|
||||
</DropdownMenuTrigger>
|
||||
<DropdownMenuContent align="start" className="w-80 p-0">
|
||||
<Card className="border-0 shadow-none">
|
||||
<div className="p-3 flex items-start gap-3">
|
||||
<div className="flex-shrink-0 w-7 h-7 flex items-center justify-center bg-muted rounded-full">
|
||||
{getConnectorIcon(source.connectorType || '')}
|
||||
</div>
|
||||
<div className="flex-1">
|
||||
<div className="flex items-center gap-2 mb-1">
|
||||
<h3 className="font-medium text-sm text-card-foreground">{source.title}</h3>
|
||||
{open && (
|
||||
<DropdownMenuContent align="start" className="w-80 p-0" forceMount>
|
||||
<Card className="border-0 shadow-none">
|
||||
<div className="p-3 flex items-start gap-3">
|
||||
<div className="flex-shrink-0 w-7 h-7 flex items-center justify-center bg-muted rounded-full">
|
||||
{getConnectorIcon(source.connectorType || '')}
|
||||
</div>
|
||||
<p className="text-sm text-muted-foreground mt-0.5">{source.description}</p>
|
||||
<div className="mt-2 flex items-center text-xs text-muted-foreground">
|
||||
<span className="truncate max-w-[200px]">{source.url}</span>
|
||||
<div className="flex-1">
|
||||
<div className="flex items-center gap-2 mb-1">
|
||||
<h3 className="font-medium text-sm text-card-foreground">{source.title}</h3>
|
||||
</div>
|
||||
<p className="text-sm text-muted-foreground mt-0.5">{source.description}</p>
|
||||
<div className="mt-2 flex items-center text-xs text-muted-foreground">
|
||||
<span className="truncate max-w-[200px]">{source.url}</span>
|
||||
</div>
|
||||
</div>
|
||||
<Button
|
||||
variant="ghost"
|
||||
size="icon"
|
||||
className="h-7 w-7 rounded-full"
|
||||
onClick={() => window.open(source.url, '_blank', 'noopener,noreferrer')}
|
||||
title="Open in new tab"
|
||||
>
|
||||
<ExternalLink className="h-3.5 w-3.5" />
|
||||
</Button>
|
||||
</div>
|
||||
<Button
|
||||
variant="ghost"
|
||||
size="icon"
|
||||
className="h-7 w-7 rounded-full"
|
||||
onClick={() => window.open(source.url, '_blank')}
|
||||
title="Open in new tab"
|
||||
>
|
||||
<ExternalLink className="h-3.5 w-3.5" />
|
||||
</Button>
|
||||
</div>
|
||||
</Card>
|
||||
</DropdownMenuContent>
|
||||
</Card>
|
||||
</DropdownMenuContent>
|
||||
)}
|
||||
</DropdownMenu>
|
||||
</span>
|
||||
);
|
||||
};
|
||||
});
|
||||
|
||||
Citation.displayName = 'Citation';
|
||||
|
||||
/**
|
||||
* Function to render text with citations
|
||||
|
|
|
|||
|
|
@@ -1,4 +1,4 @@
-import React from "react";
+import React, { useMemo } from "react";
 import ReactMarkdown from "react-markdown";
 import rehypeRaw from "rehype-raw";
 import rehypeSanitize from "rehype-sanitize";
|
||||
|
|
@ -14,75 +14,87 @@ interface MarkdownViewerProps {
|
|||
}
|
||||
|
||||
export function MarkdownViewer({ content, className, getCitationSource }: MarkdownViewerProps) {
|
||||
// Memoize the markdown components to prevent unnecessary re-renders
|
||||
const components = useMemo(() => {
|
||||
return {
|
||||
// Define custom components for markdown elements
|
||||
p: ({node, children, ...props}: any) => {
|
||||
// If there's no getCitationSource function, just render normally
|
||||
if (!getCitationSource) {
|
||||
return <p className="my-2" {...props}>{children}</p>;
|
||||
}
|
||||
|
||||
// Process citations within paragraph content
|
||||
return <p className="my-2" {...props}>{processCitationsInReactChildren(children, getCitationSource)}</p>;
|
||||
},
|
||||
a: ({node, children, ...props}: any) => {
|
||||
// Process citations within link content if needed
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <a className="text-primary hover:underline" {...props}>{processedChildren}</a>;
|
||||
},
|
||||
li: ({node, children, ...props}: any) => {
|
||||
// Process citations within list item content
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <li {...props}>{processedChildren}</li>;
|
||||
},
|
||||
ul: ({node, ...props}: any) => <ul className="list-disc pl-5 my-2" {...props} />,
|
||||
ol: ({node, ...props}: any) => <ol className="list-decimal pl-5 my-2" {...props} />,
|
||||
h1: ({node, children, ...props}: any) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h1 className="text-2xl font-bold mt-6 mb-2" {...props}>{processedChildren}</h1>;
|
||||
},
|
||||
h2: ({node, children, ...props}: any) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h2 className="text-xl font-bold mt-5 mb-2" {...props}>{processedChildren}</h2>;
|
||||
},
|
||||
h3: ({node, children, ...props}: any) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h3 className="text-lg font-bold mt-4 mb-2" {...props}>{processedChildren}</h3>;
|
||||
},
|
||||
h4: ({node, children, ...props}: any) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h4 className="text-base font-bold mt-3 mb-1" {...props}>{processedChildren}</h4>;
|
||||
},
|
||||
blockquote: ({node, ...props}: any) => <blockquote className="border-l-4 border-muted pl-4 italic my-2" {...props} />,
|
||||
hr: ({node, ...props}: any) => <hr className="my-4 border-muted" {...props} />,
|
||||
img: ({node, ...props}: any) => <img className="max-w-full h-auto my-4 rounded" {...props} />,
|
||||
table: ({node, ...props}: any) => <div className="overflow-x-auto my-4"><table className="min-w-full divide-y divide-border" {...props} /></div>,
|
||||
th: ({node, ...props}: any) => <th className="px-3 py-2 text-left font-medium bg-muted" {...props} />,
|
||||
td: ({node, ...props}: any) => <td className="px-3 py-2 border-t border-border" {...props} />,
|
||||
code: ({node, className, children, ...props}: any) => {
|
||||
const match = /language-(\w+)/.exec(className || '');
|
||||
const isInline = !match;
|
||||
return isInline
|
||||
? <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>
|
||||
: (
|
||||
<div className="relative my-4">
|
||||
<pre className="bg-muted p-4 rounded-md overflow-x-auto">
|
||||
<code className="text-xs" {...props}>{children}</code>
|
||||
</pre>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
};
|
||||
}, [getCitationSource]);
|
||||
|
||||
return (
|
||||
<div className={cn("prose prose-sm dark:prose-invert max-w-none", className)}>
|
||||
<ReactMarkdown
|
||||
rehypePlugins={[rehypeRaw, rehypeSanitize]}
|
||||
remarkPlugins={[remarkGfm]}
|
||||
components={{
|
||||
// Define custom components for markdown elements
|
||||
p: ({node, children, ...props}) => {
|
||||
// If there's no getCitationSource function, just render normally
|
||||
if (!getCitationSource) {
|
||||
return <p className="my-2" {...props}>{children}</p>;
|
||||
}
|
||||
|
||||
// Process citations within paragraph content
|
||||
return <p className="my-2" {...props}>{processCitationsInReactChildren(children, getCitationSource)}</p>;
|
||||
},
|
||||
a: ({node, children, ...props}) => {
|
||||
// Process citations within link content if needed
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <a className="text-primary hover:underline" {...props}>{processedChildren}</a>;
|
||||
},
|
||||
ul: ({node, ...props}) => <ul className="list-disc pl-5 my-2" {...props} />,
|
||||
ol: ({node, ...props}) => <ol className="list-decimal pl-5 my-2" {...props} />,
|
||||
h1: ({node, children, ...props}) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h1 className="text-2xl font-bold mt-6 mb-2" {...props}>{processedChildren}</h1>;
|
||||
},
|
||||
h2: ({node, children, ...props}) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h2 className="text-xl font-bold mt-5 mb-2" {...props}>{processedChildren}</h2>;
|
||||
},
|
||||
h3: ({node, children, ...props}) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h3 className="text-lg font-bold mt-4 mb-2" {...props}>{processedChildren}</h3>;
|
||||
},
|
||||
h4: ({node, children, ...props}) => {
|
||||
const processedChildren = getCitationSource
|
||||
? processCitationsInReactChildren(children, getCitationSource)
|
||||
: children;
|
||||
return <h4 className="text-base font-bold mt-3 mb-1" {...props}>{processedChildren}</h4>;
|
||||
},
|
||||
blockquote: ({node, ...props}) => <blockquote className="border-l-4 border-muted pl-4 italic my-2" {...props} />,
|
||||
hr: ({node, ...props}) => <hr className="my-4 border-muted" {...props} />,
|
||||
img: ({node, ...props}) => <img className="max-w-full h-auto my-4 rounded" {...props} />,
|
||||
table: ({node, ...props}) => <div className="overflow-x-auto my-4"><table className="min-w-full divide-y divide-border" {...props} /></div>,
|
||||
th: ({node, ...props}) => <th className="px-3 py-2 text-left font-medium bg-muted" {...props} />,
|
||||
td: ({node, ...props}) => <td className="px-3 py-2 border-t border-border" {...props} />,
|
||||
code: ({node, className, children, ...props}: any) => {
|
||||
const match = /language-(\w+)/.exec(className || '');
|
||||
const isInline = !match;
|
||||
return isInline
|
||||
? <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>
|
||||
: (
|
||||
<div className="relative my-4">
|
||||
<pre className="bg-muted p-4 rounded-md overflow-x-auto">
|
||||
<code className="text-xs" {...props}>{children}</code>
|
||||
</pre>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
}}
|
||||
components={components}
|
||||
>
|
||||
{content}
|
||||
</ReactMarkdown>
|
||||
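The change above hoists the `components` map into `useMemo` keyed on `getCitationSource`, so `ReactMarkdown` receives the same object across re-renders unless the citation lookup itself changes. A minimal sketch of that dependency-keyed caching idea in plain TypeScript, outside React; the `createDepsMemo` helper is hypothetical and is not how React implements the hook:

```typescript
// Hypothetical helper: caches the last value produced for a dependency list.
// It only illustrates the idea behind useMemo; React's real hook differs.
function createDepsMemo<T>(): (factory: () => T, deps: readonly unknown[]) => T {
  let lastDeps: readonly unknown[] | undefined;
  let lastValue: T | undefined;

  return (factory, deps) => {
    const prev = lastDeps;
    const unchanged =
      prev !== undefined &&
      prev.length === deps.length &&
      deps.every((dep, i) => Object.is(dep, prev[i]));

    if (!unchanged) {
      lastValue = factory(); // recompute only when a dependency changed
      lastDeps = deps;
    }
    return lastValue as T;
  };
}

// Usage: the object identity stays stable while the dependency is unchanged.
const memoComponents = createDepsMemo<Record<string, string>>();
const lookupA = (_id: number) => null;
const first = memoComponents(() => ({ p: "paragraph" }), [lookupA]);
const second = memoComponents(() => ({ p: "paragraph" }), [lookupA]);
const third = memoComponents(() => ({ p: "paragraph" }), [(_id: number) => null]);
```

Here `first` and `second` are the same object, while `third` is rebuilt because a new function was passed as the dependency. The same reasoning is why the memoized map only helps if the caller keeps `getCitationSource` stable between renders.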
|
|
@ -91,7 +103,7 @@ export function MarkdownViewer({ content, className, getCitationSource }: Markdo
|
|||
}
|
||||
|
||||
// Helper function to process citations within React children
|
||||
function processCitationsInReactChildren(children: React.ReactNode, getCitationSource: (id: number) => Source | null): React.ReactNode {
|
||||
const processCitationsInReactChildren = (children: React.ReactNode, getCitationSource: (id: number) => Source | null): React.ReactNode => {
|
||||
// If children is not an array or string, just return it
|
||||
if (!children || (typeof children !== 'string' && !Array.isArray(children))) {
|
||||
return children;
|
||||
|
|
@ -113,10 +125,10 @@ function processCitationsInReactChildren(children: React.ReactNode, getCitationS
|
|||
}
|
||||
|
||||
return children;
|
||||
}
|
||||
};
|
||||
|
||||
// Process citation references in text content
|
||||
function processCitationsInText(text: string, getCitationSource: (id: number) => Source | null): React.ReactNode[] {
|
||||
const processCitationsInText = (text: string, getCitationSource: (id: number) => Source | null): React.ReactNode[] => {
|
||||
// Use improved regex to catch citation numbers more reliably
|
||||
// This will match patterns like [1], [42], etc. including when they appear at the end of a line or sentence
|
||||
const citationRegex = /\[(\d+)\]/g;
|
||||
|
|
@ -124,14 +136,8 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
|
|||
let lastIndex = 0;
|
||||
let match;
|
||||
let position = 0;
|
||||
|
||||
// Debug log for troubleshooting
|
||||
console.log("Processing citations in text:", text);
|
||||
|
||||
while ((match = citationRegex.exec(text)) !== null) {
|
||||
// Log each match for debugging
|
||||
console.log("Citation match found:", match[0], "at index", match.index);
|
||||
|
||||
// Add text before the citation
|
||||
if (match.index > lastIndex) {
|
||||
parts.push(text.substring(lastIndex, match.index));
|
||||
|
|
@ -141,9 +147,6 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
|
|||
const citationId = parseInt(match[1], 10);
|
||||
const source = getCitationSource(citationId);
|
||||
|
||||
// Log the citation details
|
||||
console.log("Citation ID:", citationId, "Source:", source ? "found" : "not found");
|
||||
|
||||
parts.push(
|
||||
<Citation
|
||||
key={`citation-${citationId}-${position}`}
|
||||
|
|
@ -164,4 +167,4 @@ function processCitationsInText(text: string, getCitationSource: (id: number) =>
|
|||
}
|
||||
|
||||
return parts;
|
||||
}
|
||||
};
|
||||
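The text pass in this file scans for `[n]` markers and interleaves plain text with `Citation` elements. A minimal sketch of the same splitting logic without JSX, assuming a hypothetical `Part` type in place of the rendered component:

```typescript
// Hypothetical stand-in for the rendered output: plain text or a citation id.
type Part = { kind: "text"; value: string } | { kind: "citation"; id: number };

function splitCitations(text: string): Part[] {
  const citationRegex = /\[(\d+)\]/g; // same pattern as in the diff
  const result: Part[] = [];
  let lastIndex = 0;
  let match: RegExpExecArray | null;

  while ((match = citationRegex.exec(text)) !== null) {
    // Keep the text that precedes this citation marker.
    if (match.index > lastIndex) {
      result.push({ kind: "text", value: text.substring(lastIndex, match.index) });
    }
    result.push({ kind: "citation", id: parseInt(match[1], 10) });
    lastIndex = match.index + match[0].length;
  }

  // Keep any trailing text after the last marker.
  if (lastIndex < text.length) {
    result.push({ kind: "text", value: text.substring(lastIndex) });
  }
  return result;
}

const parts = splitCitations("Vectors are stored in pgvector [1] and reranked [12].");
```

For this input, `parts` alternates text and citation entries: text, citation 1, text, citation 12, and the trailing period as text. In the component, each citation entry would then be resolved through `getCitationSource`.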
168
surfsense_web/content/docs/docker-installation.mdx
Normal file
|
|
@ -0,0 +1,168 @@
|
|||
---
|
||||
title: Docker Installation
|
||||
description: Setting up SurfSense using Docker
|
||||
full: true
|
||||
---
|
||||
## Known Limitations
|
||||
|
||||
⚠️ **Important Note:** Currently, the following features have limited functionality when running in Docker:
|
||||
|
||||
- **Ollama integration:** Local Ollama models do not work when running SurfSense in Docker. Please use other LLM providers like OpenAI or Gemini instead.
|
||||
- **Web crawler functionality:** The web crawler feature currently doesn't work properly within the Docker environment.
|
||||
|
||||
We're actively working to resolve these limitations in future releases.
|
||||
|
||||
|
||||
# Docker Installation
|
||||
|
||||
This guide explains how to run SurfSense using Docker Compose, the easiest way to get it up and running with all dependencies containerized.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before you begin, ensure you have:
|
||||
|
||||
- [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) installed on your machine
|
||||
- [Git](https://git-scm.com/downloads) (to clone the repository)
|
||||
- Completed all the [prerequisite setup steps](/docs) including:
|
||||
- PGVector setup
|
||||
- Google OAuth configuration
|
||||
- Unstructured.io API key
|
||||
- Other required API keys
|
||||
|
||||
## Installation Steps
|
||||
|
||||
1. **Configure Environment Variables**
|
||||
|
||||
Set up the necessary environment variables:
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
# Copy example environment files
|
||||
cp surfsense_backend/.env.example surfsense_backend/.env
|
||||
cp surfsense_web/.env.example surfsense_web/.env
|
||||
```
|
||||
|
||||
**Windows (Command Prompt):**
|
||||
```cmd
|
||||
copy surfsense_backend\.env.example surfsense_backend\.env
|
||||
copy surfsense_web\.env.example surfsense_web\.env
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env
|
||||
Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env
|
||||
```
|
||||
|
||||
Edit both `.env` files and fill in the required values:
|
||||
|
||||
**Backend Environment Variables:**
|
||||
|
||||
| ENV VARIABLE | DESCRIPTION |
|
||||
|--------------|-------------|
|
||||
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
|
||||
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
|
||||
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID obtained from Google Cloud Console |
|
||||
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret obtained from Google Cloud Console |
|
||||
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
|
||||
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
|
||||
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
|
||||
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
|
||||
| FAST_LLM | LiteLLM routed smaller, faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
|
||||
| STRATEGIC_LLM | LiteLLM routed advanced LLM for complex tasks (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
|
||||
| LONG_CONTEXT_LLM | LiteLLM routed LLM for longer context windows (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
|
||||
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing |
|
||||
| FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling |
|
||||
|
||||
Include API keys for the LLM providers you're using. For example:
|
||||
- `OPENAI_API_KEY`: If using OpenAI models
|
||||
- `GEMINI_API_KEY`: If using Google Gemini models
|
||||
|
||||
For other LLM providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
|
||||
|
||||
**Frontend Environment Variables:**
|
||||
|
||||
| ENV VARIABLE | DESCRIPTION |
|
||||
|--------------|-------------|
|
||||
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend service (e.g., `http://localhost:8000`) |
|
||||
|
||||
2. **Build and Start Containers**
|
||||
|
||||
Start the Docker containers:
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
docker-compose up --build
|
||||
```
|
||||
|
||||
To run in detached mode (in the background):
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
   **Note:** Recent Docker Desktop versions ship Compose V2, which is invoked as `docker compose` (with a space). If `docker-compose` is not found, use `docker compose` instead.
|
||||
|
||||
3. **Access the Applications**
|
||||
|
||||
Once the containers are running, you can access:
|
||||
- Frontend: [http://localhost:3000](http://localhost:3000)
|
||||
- Backend API: [http://localhost:8000](http://localhost:8000)
|
||||
- API Documentation: [http://localhost:8000/docs](http://localhost:8000/docs)
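Once the containers report as running, the three URLs from step 3 can be probed in one pass. A minimal sketch, assuming Node 18+ (for the built-in `fetch`) and the default ports from this guide; `checkEndpoints` and its injectable `fetcher` parameter are hypothetical names used for illustration:

```typescript
// Default URLs from this guide; adjust if you changed the port mappings.
const ENDPOINTS: Record<string, string> = {
  frontend: "http://localhost:3000",
  backend: "http://localhost:8000",
  apiDocs: "http://localhost:8000/docs",
};

// Minimal shape of the response we care about, so a fake can be injected.
type Fetcher = (url: string) => Promise<{ status: number }>;

async function checkEndpoints(
  endpoints: Record<string, string>,
  fetcher: Fetcher,
): Promise<Record<string, string>> {
  const results: Record<string, string> = {};
  for (const [name, url] of Object.entries(endpoints)) {
    try {
      const res = await fetcher(url);
      results[name] = `HTTP ${res.status}`;
    } catch {
      results[name] = "unreachable"; // connection refused, DNS failure, etc.
    }
  }
  return results;
}

// Usage (Node 18+):
// checkEndpoints(ENDPOINTS, (url) => fetch(url)).then(console.log);
```

Any HTTP status means the service answered; `unreachable` usually points at a container that has not finished starting or a port mapping that differs from the defaults.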
|
||||
|
||||
## Useful Docker Commands
|
||||
|
||||
### Container Management
|
||||
|
||||
- **Stop containers:**
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
docker-compose down
|
||||
```
|
||||
|
||||
- **View logs:**
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
# All services
|
||||
docker-compose logs -f
|
||||
|
||||
# Specific service
|
||||
docker-compose logs -f backend
|
||||
docker-compose logs -f frontend
|
||||
docker-compose logs -f db
|
||||
```
|
||||
|
||||
- **Restart a specific service:**
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
docker-compose restart backend
|
||||
```
|
||||
|
||||
- **Execute commands in a running container:**
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
# Backend
|
||||
docker-compose exec backend python -m pytest
|
||||
|
||||
# Frontend
|
||||
docker-compose exec frontend pnpm lint
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **Linux/macOS:** If you encounter permission errors, you may need to run the docker commands with `sudo`.
|
||||
- **Windows:** If you see access denied errors, make sure you're running Command Prompt or PowerShell as Administrator.
|
||||
- If ports are already in use, modify the port mappings in the `docker-compose.yml` file.
|
||||
- For backend dependency issues, check the `Dockerfile` in the backend directory.
|
||||
- For frontend dependency issues, check the `Dockerfile` in the frontend directory.
|
||||
- **Windows-specific:** If scripts fail inside the containers because of line endings (CRLF vs LF), keep LF on checkout by running `git config --global core.autocrlf input` before cloning the repository.
|
||||
|
||||
|
||||
## Next Steps
|
||||
|
||||
Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and log in using your Google account.
|
||||
|
|
@ -1,7 +1,99 @@
|
|||
---
|
||||
title: Welcome Docs
|
||||
title: Prerequisites
|
||||
description: Required setup before installing SurfSense
|
||||
full: true
|
||||
---
|
||||
|
||||
## Introduction
|
||||
|
||||
I love Docs.
|
||||
|
||||
## PGVector installation Guide
|
||||
|
||||
SurfSense requires the pgvector extension for PostgreSQL:
|
||||
|
||||
### Linux and Mac
|
||||
|
||||
Compile and install the extension (supports Postgres 13+)
|
||||
|
||||
```sh
|
||||
cd /tmp
|
||||
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git
|
||||
cd pgvector
|
||||
make
|
||||
make install # may need sudo
|
||||
```
|
||||
|
||||
See the [installation notes](https://github.com/pgvector/pgvector/tree/master#installation-notes---linux-and-mac) if you run into issues
|
||||
|
||||
### Windows
|
||||
|
||||
Ensure [C++ support in Visual Studio](https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170#download-and-install-the-tools) is installed, and run:
|
||||
|
||||
```cmd
|
||||
call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"
|
||||
```
|
||||
|
||||
Note: The exact path will vary depending on your Visual Studio version and edition
|
||||
|
||||
Then use `nmake` to build:
|
||||
|
||||
```cmd
|
||||
set "PGROOT=C:\Program Files\PostgreSQL\16"
|
||||
cd %TEMP%
|
||||
git clone --branch v0.8.0 https://github.com/pgvector/pgvector.git
|
||||
cd pgvector
|
||||
nmake /F Makefile.win
|
||||
nmake /F Makefile.win install
|
||||
```
|
||||
|
||||
See the [installation notes](https://github.com/pgvector/pgvector/tree/master#installation-notes---windows) if you run into issues
|
||||
|
||||
---
|
||||
|
||||
## Google OAuth Setup
|
||||
|
||||
SurfSense user management and authentication rely on Google OAuth. Let's set it up.
|
||||
|
||||
1. Log in to the [Google Cloud Console](https://console.cloud.google.com/)
|
||||
2. Enable People API.
|
||||

|
||||
3. Set up OAuth consent screen.
|
||||

|
||||
4. Create OAuth client ID and secret.
|
||||

|
||||
5. It should look like this.
|
||||

|
||||
|
||||
---
|
||||
|
||||
## File Uploads
|
||||
|
||||
Files are converted to LLM-friendly formats using [Unstructured](https://github.com/Unstructured-IO/unstructured).
|
||||
|
||||
1. Get an Unstructured.io API key from [Unstructured Platform](https://platform.unstructured.io/)
|
||||
2. You should be able to generate API keys once registered
|
||||

|
||||
|
||||
---
|
||||
|
||||
## LLM Observability (Optional)
|
||||
|
||||
This is not required for SurfSense to work, but it is always a good idea to monitor LLM interactions so we do not have those WTH moments.
|
||||
|
||||
1. Get a LangSmith API key from [smith.langchain.com](https://smith.langchain.com/)
|
||||
2. This helps in observing SurfSense Researcher Agent.
|
||||

|
||||
|
||||
---
|
||||
|
||||
## Crawler
|
||||
|
||||
SurfSense has two options for saving webpages:
|
||||
- [SurfSense Extension](https://github.com/MODSetter/SurfSense/tree/main/surfsense_browser_extension) (Overall better experience & ability to save private webpages, recommended)
|
||||
- Crawler (If you want to save public webpages)
|
||||
|
||||
**NOTE:** SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/) for web crawling. If you plan on using the crawler, you will need to create a Firecrawl account and get an API key.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
Once you have all prerequisites in place, proceed to the [installation guide](/docs/installation) to set up SurfSense.
|
||||
21
surfsense_web/content/docs/installation.mdx
Normal file
|
|
@ -0,0 +1,21 @@
|
|||
---
|
||||
title: Installation
|
||||
description: Current ways to use SurfSense
|
||||
full: true
|
||||
---
|
||||
|
||||
# Installing SurfSense
|
||||
|
||||
There are two ways to install SurfSense, but both require the repository to be cloned first. Clone [SurfSense](https://github.com/MODSetter/SurfSense) and then:
|
||||
|
||||
## Docker Installation
|
||||
|
||||
This method provides a containerized environment with all dependencies pre-configured, at the cost of less room for customization.
|
||||
|
||||
[Learn more about Docker installation](/docs/docker-installation)
|
||||
|
||||
## Manual Installation (Preferred)
|
||||
|
||||
For users who prefer more control over the installation process or need to customize their setup, we also provide manual installation instructions.
|
||||
|
||||
[Learn more about Manual installation](/docs/manual-installation)
|
||||
258
surfsense_web/content/docs/manual-installation.mdx
Normal file
|
|
@ -0,0 +1,258 @@
|
|||
---
|
||||
title: Manual Installation
|
||||
description: Setting up SurfSense manually for customized deployments (Preferred)
|
||||
full: true
|
||||
---
|
||||
|
||||
# Manual Installation (Preferred)
|
||||
|
||||
This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before beginning the manual installation, ensure you have completed all the [prerequisite setup steps](/docs), including:
|
||||
|
||||
- PGVector installation
|
||||
- Google OAuth setup
|
||||
- Unstructured.io API key
|
||||
- LLM observability (optional)
|
||||
- Crawler setup (if needed)
|
||||
|
||||
## Backend Setup
|
||||
|
||||
The backend is the core of SurfSense. Follow these steps to set it up:
|
||||
|
||||
### 1. Environment Configuration
|
||||
|
||||
First, create and configure your environment variables by copying the example file:
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
cd surfsense_backend
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
**Windows (Command Prompt):**
|
||||
```cmd
|
||||
cd surfsense_backend
|
||||
copy .env.example .env
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
cd surfsense_backend
|
||||
Copy-Item -Path .env.example -Destination .env
|
||||
```
|
||||
|
||||
Edit the `.env` file and set the following variables:
|
||||
|
||||
| ENV VARIABLE | DESCRIPTION |
|
||||
|--------------|-------------|
|
||||
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
|
||||
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
|
||||
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID |
|
||||
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret |
|
||||
| NEXT_FRONTEND_URL | Frontend application URL (e.g., `http://localhost:3000`) |
|
||||
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
|
||||
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
|
||||
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
|
||||
| FAST_LLM | LiteLLM routed faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
|
||||
| STRATEGIC_LLM | LiteLLM routed advanced LLM (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
|
||||
| LONG_CONTEXT_LLM | LiteLLM routed long-context LLM (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
|
||||
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service |
|
||||
| FIRECRAWL_API_KEY | API key for Firecrawl service (if using crawler) |
|
||||
|
||||
**Important**: Since LLM calls are routed through LiteLLM, include API keys for the LLM providers you're using:
|
||||
- For OpenAI models: `OPENAI_API_KEY`
|
||||
- For Google Gemini models: `GEMINI_API_KEY`
|
||||
- For other providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers)
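Before starting the backend, it can help to confirm that every variable from the table is present in `.env`. A minimal sketch, assuming simple `KEY=value` lines; the `parseEnv` and `missingKeys` helpers are hypothetical and are not part of SurfSense, and real dotenv parsers handle more syntax (such as multiline values):

```typescript
// Variable names taken from the backend table above.
// FIRECRAWL_API_KEY is left out because it is only needed for the crawler.
const REQUIRED_KEYS: readonly string[] = [
  "DATABASE_URL", "SECRET_KEY", "GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET",
  "NEXT_FRONTEND_URL", "EMBEDDING_MODEL", "RERANKERS_MODEL_NAME", "RERANKERS_MODEL_TYPE",
  "FAST_LLM", "STRATEGIC_LLM", "LONG_CONTEXT_LLM", "UNSTRUCTURED_API_KEY",
];

// Parse KEY=value lines, skipping blanks and # comments.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

// A key counts as missing when it is absent or set to an empty value.
function missingKeys(vars: Map<string, string>, required: readonly string[]): string[] {
  return required.filter((key) => !vars.get(key));
}

const sample = [
  "# partial example",
  'SECRET_KEY="change-me"',
  "FAST_LLM=openai/gpt-4o-mini",
  "STRATEGIC_LLM=",
].join("\n");
const missing = missingKeys(parseEnv(sample), REQUIRED_KEYS);
```

With the sample text, `missing` lists every required key except `SECRET_KEY` and `FAST_LLM`; `STRATEGIC_LLM` is reported because its value is empty. In practice the text would come from the real file, for example via Node's `fs.readFileSync(".env", "utf8")`.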
|
||||
|
||||
### 2. Install Dependencies
|
||||
|
||||
Install the backend dependencies using `uv`:
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
# Install uv if you don't have it
|
||||
curl -fsSL https://astral.sh/uv/install.sh | bash
|
||||
|
||||
# Install dependencies
|
||||
uv sync
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
# Install uv if you don't have it
|
||||
iwr -useb https://astral.sh/uv/install.ps1 | iex
|
||||
|
||||
# Install dependencies
|
||||
uv sync
|
||||
```
|
||||
|
||||
**Windows (Command Prompt):**
|
||||
```cmd
|
||||
REM Install dependencies with uv (after installing uv)
|
||||
uv sync
|
||||
```
|
||||
|
||||
### 3. Run the Backend
|
||||
|
||||
Start the backend server:
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
# Run without hot reloading
|
||||
uv run main.py
|
||||
|
||||
# Or with hot reloading for development
|
||||
uv run main.py --reload
|
||||
```
|
||||
|
||||
If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.
|
||||
|
||||
## Frontend Setup
|
||||
|
||||
### 1. Environment Configuration
|
||||
|
||||
Set up the frontend environment:
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
cd surfsense_web
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
**Windows (Command Prompt):**
|
||||
```cmd
|
||||
cd surfsense_web
|
||||
copy .env.example .env
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
cd surfsense_web
|
||||
Copy-Item -Path .env.example -Destination .env
|
||||
```
|
||||
|
||||
Edit the `.env` file and set:
|
||||
|
||||
| ENV VARIABLE | DESCRIPTION |
|
||||
|--------------|-------------|
|
||||
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
|
||||
|
||||
### 2. Install Dependencies
|
||||
|
||||
Install the frontend dependencies:
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
# Install pnpm if you don't have it
|
||||
npm install -g pnpm
|
||||
|
||||
# Install dependencies
|
||||
pnpm install
|
||||
```
|
||||
|
||||
**Windows:**
|
||||
```powershell
|
||||
# Install pnpm if you don't have it
|
||||
npm install -g pnpm
|
||||
|
||||
# Install dependencies
|
||||
pnpm install
|
||||
```
|
||||
|
||||
### 3. Run the Frontend
|
||||
|
||||
Start the Next.js development server:
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
pnpm run dev
|
||||
```
|
||||
|
||||
The frontend should now be running at `http://localhost:3000`.
|
||||
|
||||
## Browser Extension Setup (Optional)
|
||||
|
||||
The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.
|
||||
|
||||
### 1. Environment Configuration
|
||||
|
||||
**Linux/macOS:**
|
||||
```bash
|
||||
cd surfsense_browser_extension
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
**Windows (Command Prompt):**
|
||||
```cmd
|
||||
cd surfsense_browser_extension
|
||||
copy .env.example .env
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
cd surfsense_browser_extension
|
||||
Copy-Item -Path .env.example -Destination .env
|
||||
```
|
||||
|
||||
Edit the `.env` file:
|
||||
|
||||
| ENV VARIABLE | DESCRIPTION |
|
||||
|--------------|-------------|
|
||||
| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |
|
||||
|
||||
### 2. Build the Extension
|
||||
|
||||
Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).
|
||||
|
||||
**Linux/macOS/Windows:**
|
||||
```bash
|
||||
# Install dependencies
|
||||
pnpm install
|
||||
|
||||
# Build for Chrome (default)
|
||||
pnpm build
|
||||
|
||||
# Or for other browsers
|
||||
pnpm build --target=firefox
|
||||
pnpm build --target=edge
|
||||
```
|
||||
|
||||
### 3. Load the Extension
|
||||
|
||||
Load the extension in your browser's developer mode and configure it with your SurfSense API key.
|
||||
|
||||
## Verification
|
||||
|
||||
To verify your installation:
|
||||
|
||||
1. Open your browser and navigate to `http://localhost:3000`
|
||||
2. Sign in with your Google account
|
||||
3. Create a search space and try uploading a document
|
||||
4. Test the chat functionality with your uploaded content
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed
|
||||
- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly
|
||||
- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible
|
||||
- **File Upload Failures**: Validate your Unstructured.io API key
|
||||
- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`)
|
||||
- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands
|
||||
|
||||
## Next Steps
|
||||
|
||||
Now that you have SurfSense running locally, you can explore its features:
|
||||
|
||||
- Create search spaces for organizing your content
|
||||
- Upload documents or use the browser extension to save webpages
|
||||
- Ask questions about your saved content
|
||||
- Explore the advanced RAG capabilities
|
||||
|
||||
For production deployments, consider setting up:
|
||||
- A reverse proxy like Nginx
|
||||
- SSL certificates for secure connections
|
||||
- Proper database backups
|
||||
- User access controls
|
||||
12
surfsense_web/content/docs/meta.json
Normal file
|
|
@ -0,0 +1,12 @@
|
|||
{
|
||||
"title": "Setup",
|
||||
  "description": "The setup guide for SurfSense",
|
||||
"root": true,
|
||||
"pages": [
|
||||
"---Setup---",
|
||||
"index",
|
||||
"installation",
|
||||
"docker-installation",
|
||||
"manual-installation"
|
||||
]
|
||||
}
|
||||
BIN
surfsense_web/public/docs/google_oauth_client.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 98 KiB |
BIN
surfsense_web/public/docs/google_oauth_config.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 47 KiB |
BIN
surfsense_web/public/docs/google_oauth_people_api.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 68 KiB |
BIN
surfsense_web/public/docs/google_oauth_screen.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 71 KiB |
BIN
surfsense_web/public/docs/langsmith.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 301 KiB |
BIN
surfsense_web/public/docs/unstructured.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 108 KiB |