mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-05-21 18:55:16 +02:00
711 lines
36 KiB
Text
---
title: Manual Installation
description: Setting up SurfSense manually for customized deployments
icon: Wrench
---

# Manual Installation (Preferred)

This guide provides step-by-step instructions for setting up SurfSense natively instead of through the full Docker stack. This approach gives you more control over the installation and lets you customize the environment. Docker is still needed for one component, zero-cache (see [Zero-Cache Setup](#zero-cache-setup)).

## Prerequisites

Before beginning the manual installation, ensure you have the following installed and configured:

### Required Software

- **Python 3.12+** - Backend runtime environment
- **Node.js 20+** - Frontend runtime environment
- **PostgreSQL 14+** - Database server (must be configured with `wal_level = logical` for [Zero real-time sync](/docs/how-to/zero-sync))
- **PGVector** - PostgreSQL extension for vector similarity search
- **Redis** - Message broker for the Celery task queue
- **Zero-cache** - Rocicorp Zero real-time sync server (run via Docker; see [Zero-Cache Setup](#zero-cache-setup) below)
- **Docker** - Used to run zero-cache (the simplest option; PostgreSQL and Redis can still be installed natively)
- **Git** - Version control (to clone the repository)

### Required Services & API Keys

Complete all the [setup steps](/docs), including:

- **Authentication Setup** (choose one):
  - Google OAuth credentials (for `AUTH_TYPE=GOOGLE`)
  - Local authentication setup (for `AUTH_TYPE=LOCAL`)
- **File Processing ETL Service** (choose one):
  - Unstructured.io API key (supports 34+ formats)
  - LlamaCloud API key (enhanced parsing, supports 50+ formats)
  - Docling (local processing, no API key required, supports PDF, Office docs, images, HTML, CSV)
- **Other API keys** as needed for your use case

## Backend Setup

The backend is the core of SurfSense. Follow these steps to set it up:

### 1. Environment Configuration

First, create and configure your environment variables by copying the example file:

**Linux/macOS:**

```bash
cd surfsense_backend
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_backend
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_backend
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file and set the following variables:

| ENV VARIABLE | DESCRIPTION |
| --- | --- |
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
| BACKEND_URL | (Optional) Public URL of the backend for OAuth callbacks (e.g., `https://api.yourdomain.com`). Required when running behind a reverse proxy with HTTPS. Used to set correct OAuth redirect URLs and secure cookies. |
| AUTH_TYPE | Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication |
| GOOGLE_OAUTH_CLIENT_ID | (Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE) |
| GOOGLE_OAUTH_CLIENT_SECRET | (Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE) |
| EMBEDDING_MODEL | Name of the embedding model (e.g., `sentence-transformers/all-MiniLM-L6-v2`, `openai://text-embedding-ada-002`) |
| RERANKERS_ENABLED | (Optional) Enable or disable document reranking for improved search results (e.g., `TRUE` or `FALSE`, default: `FALSE`) |
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) (required if RERANKERS_ENABLED=TRUE) |
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) (required if RERANKERS_ENABLED=TRUE) |
| TTS_SERVICE | Text-to-Speech API provider for Podcasts (e.g., `local/kokoro`, `openai/tts-1`). See [supported providers](https://docs.litellm.ai/docs/text_to_speech#supported-providers) |
| TTS_SERVICE_API_KEY | (Optional if local) API key for the Text-to-Speech service |
| TTS_SERVICE_API_BASE | (Optional) Custom API base URL for the Text-to-Speech service |
| STT_SERVICE | Speech-to-Text API provider for Audio Files (e.g., `local/base`, `openai/whisper-1`). See [supported providers](https://docs.litellm.ai/docs/audio_transcription#supported-providers) |
| STT_SERVICE_API_KEY | (Optional if local) API key for the Speech-to-Text service |
| STT_SERVICE_API_BASE | (Optional) Custom API base URL for the Speech-to-Text service |
| FIRECRAWL_API_KEY | (Optional) API key for Firecrawl service for web crawling |
| ETL_SERVICE | Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV) |
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED) |
| LLAMA_CLOUD_API_KEY | API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD) |
| CELERY_BROKER_URL | Redis connection URL for Celery broker (e.g., `redis://localhost:6379/0`) |
| CELERY_RESULT_BACKEND | Redis connection URL for Celery result backend (e.g., `redis://localhost:6379/0`) |
| SCHEDULE_CHECKER_INTERVAL | (Optional) How often to check for scheduled connector tasks. Format: `<number><unit>` where unit is `m` (minutes) or `h` (hours). Examples: `1m`, `5m`, `1h`, `2h` (default: `1m`) |
| REGISTRATION_ENABLED | (Optional) Enable or disable new user registration (e.g., `TRUE` or `FALSE`, default: `TRUE`) |
| PAGES_LIMIT | (Optional) Maximum pages limit per user for ETL services (default: `999999999` for unlimited in OSS version) |

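
The table above asks for a "secure random string" as `SECRET_KEY` without saying how to produce one. One possible way, sketched here with Python's standard `secrets` module (the helper name `generate_secret_key` and the 32-byte length are illustrative choices, not something SurfSense prescribes):

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string that can be pasted into .env as SECRET_KEY."""
    # token_urlsafe draws from the OS CSPRNG and base64url-encodes the bytes.
    return secrets.token_urlsafe(num_bytes)


# Print a ready-to-paste line for the backend .env file.
print(f"SECRET_KEY={generate_secret_key()}")
```

Generate the key once and keep it stable: tokens signed with the old key generally stop validating when the key changes.
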
**Google Connector OAuth Configuration:**

| ENV VARIABLE | DESCRIPTION |
| --- | --- |
| GOOGLE_CALENDAR_REDIRECT_URI | (Optional) Redirect URI for Google Calendar connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/calendar/connector/callback`) |
| GOOGLE_GMAIL_REDIRECT_URI | (Optional) Redirect URI for Gmail connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/gmail/connector/callback`) |
| GOOGLE_DRIVE_REDIRECT_URI | (Optional) Redirect URI for Google Drive connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/drive/connector/callback`) |

**Connector OAuth Configurations (Optional):**

| ENV VARIABLE | DESCRIPTION |
| --- | --- |
| AIRTABLE_CLIENT_ID | (Optional) Airtable OAuth client ID from [Airtable Developer Hub](https://airtable.com/create/oauth) |
| AIRTABLE_CLIENT_SECRET | (Optional) Airtable OAuth client secret |
| AIRTABLE_REDIRECT_URI | (Optional) Redirect URI for Airtable connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/airtable/connector/callback`) |
| CLICKUP_CLIENT_ID | (Optional) ClickUp OAuth client ID |
| CLICKUP_CLIENT_SECRET | (Optional) ClickUp OAuth client secret |
| CLICKUP_REDIRECT_URI | (Optional) Redirect URI for ClickUp connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/clickup/connector/callback`) |
| DISCORD_CLIENT_ID | (Optional) Discord OAuth client ID |
| DISCORD_CLIENT_SECRET | (Optional) Discord OAuth client secret |
| DISCORD_REDIRECT_URI | (Optional) Redirect URI for Discord connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/discord/connector/callback`) |
| DISCORD_BOT_TOKEN | (Optional) Discord bot token from Developer Portal |
| ATLASSIAN_CLIENT_ID | (Optional) Atlassian OAuth client ID (for Jira and Confluence) |
| ATLASSIAN_CLIENT_SECRET | (Optional) Atlassian OAuth client secret |
| JIRA_REDIRECT_URI | (Optional) Redirect URI for Jira connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/jira/connector/callback`) |
| CONFLUENCE_REDIRECT_URI | (Optional) Redirect URI for Confluence connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/confluence/connector/callback`) |
| LINEAR_CLIENT_ID | (Optional) Linear OAuth client ID |
| LINEAR_CLIENT_SECRET | (Optional) Linear OAuth client secret |
| LINEAR_REDIRECT_URI | (Optional) Redirect URI for Linear connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/linear/connector/callback`) |
| NOTION_CLIENT_ID | (Optional) Notion OAuth client ID |
| NOTION_CLIENT_SECRET | (Optional) Notion OAuth client secret |
| NOTION_REDIRECT_URI | (Optional) Redirect URI for Notion connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/notion/connector/callback`) |
| SLACK_CLIENT_ID | (Optional) Slack OAuth client ID |
| SLACK_CLIENT_SECRET | (Optional) Slack OAuth client secret |
| SLACK_REDIRECT_URI | (Optional) Redirect URI for Slack connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/slack/connector/callback`) |
| MICROSOFT_CLIENT_ID | (Optional) Microsoft OAuth client ID (shared for Teams and OneDrive) |
| MICROSOFT_CLIENT_SECRET | (Optional) Microsoft OAuth client secret (shared for Teams and OneDrive) |
| TEAMS_REDIRECT_URI | (Optional) Redirect URI for Teams connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/teams/connector/callback`) |
| ONEDRIVE_REDIRECT_URI | (Optional) Redirect URI for OneDrive connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/onedrive/connector/callback`) |
| DROPBOX_APP_KEY | (Optional) Dropbox OAuth app key |
| DROPBOX_APP_SECRET | (Optional) Dropbox OAuth app secret |
| DROPBOX_REDIRECT_URI | (Optional) Redirect URI for Dropbox connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/dropbox/connector/callback`) |

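
All redirect URIs in the connector tables above follow the same shape, so they can be generated from the backend's public URL instead of typed by hand. A minimal sketch, assuming the path pattern shown in the examples holds for every connector; the function name `connector_redirect_uris` is hypothetical and not part of SurfSense:

```python
# Connector names as they appear in the example callback paths above.
GOOGLE_CONNECTORS = ("calendar", "gmail", "drive")
OTHER_CONNECTORS = (
    "airtable", "clickup", "discord", "jira", "confluence", "linear",
    "notion", "slack", "teams", "onedrive", "dropbox",
)


def connector_redirect_uris(backend_url: str = "http://localhost:8000") -> dict[str, str]:
    """Map each *_REDIRECT_URI variable to its callback URL for the given backend."""
    base = backend_url.rstrip("/")
    uris: dict[str, str] = {}
    for name in GOOGLE_CONNECTORS:
        # Google connectors sit under an extra /google/ path segment.
        uris[f"GOOGLE_{name.upper()}_REDIRECT_URI"] = (
            f"{base}/api/v1/auth/google/{name}/connector/callback"
        )
    for name in OTHER_CONNECTORS:
        uris[f"{name.upper()}_REDIRECT_URI"] = (
            f"{base}/api/v1/auth/{name}/connector/callback"
        )
    return uris


# Emit .env lines for a backend behind a public HTTPS domain.
for variable, uri in connector_redirect_uris("https://api.yourdomain.com").items():
    print(f"{variable}={uri}")
```

The same URLs must also be registered as allowed redirect URIs in each provider's developer console.
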
**(Optional) Backend LangSmith Observability:**

| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| LANGSMITH_TRACING | Enable LangSmith tracing (e.g., `true`) |
| LANGSMITH_ENDPOINT | LangSmith API endpoint (e.g., `https://api.smith.langchain.com`) |
| LANGSMITH_API_KEY | Your LangSmith API key |
| LANGSMITH_PROJECT | LangSmith project name (e.g., `surfsense`) |

**(Optional) Uvicorn Server Configuration:**

| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE |
|------------------------------|---------------------------------------------|---------------|
| UVICORN_HOST | Host address to bind the server | 0.0.0.0 |
| UVICORN_PORT | Port to run the backend API | 8000 |
| UVICORN_LOG_LEVEL | Logging level (e.g., info, debug, warning) | info |
| UVICORN_PROXY_HEADERS | Enable/disable proxy headers | false |
| UVICORN_FORWARDED_ALLOW_IPS | Comma-separated list of allowed IPs | 127.0.0.1 |
| UVICORN_WORKERS | Number of worker processes | 1 |
| UVICORN_ACCESS_LOG | Enable/disable access log (true/false) | true |
| UVICORN_LOOP | Event loop implementation | auto |
| UVICORN_HTTP | HTTP protocol implementation | auto |
| UVICORN_WS | WebSocket protocol implementation | auto |
| UVICORN_LIFESPAN | Lifespan implementation | auto |
| UVICORN_LOG_CONFIG | Path to logging config file or empty string | |
| UVICORN_SERVER_HEADER | Enable/disable Server header | true |
| UVICORN_DATE_HEADER | Enable/disable Date header | true |
| UVICORN_LIMIT_CONCURRENCY | Max concurrent connections | |
| UVICORN_LIMIT_MAX_REQUESTS | Max requests before worker restart | |
| UVICORN_TIMEOUT_KEEP_ALIVE | Keep-alive timeout (seconds) | 5 |
| UVICORN_TIMEOUT_NOTIFY | Worker shutdown notification timeout (sec) | 30 |
| UVICORN_SSL_KEYFILE | Path to SSL key file | |
| UVICORN_SSL_CERTFILE | Path to SSL certificate file | |
| UVICORN_SSL_KEYFILE_PASSWORD | Password for SSL key file | |
| UVICORN_SSL_VERSION | SSL version | |
| UVICORN_SSL_CERT_REQS | SSL certificate requirements | |
| UVICORN_SSL_CA_CERTS | Path to CA certificates file | |
| UVICORN_SSL_CIPHERS | SSL ciphers | |
| UVICORN_HEADERS | Comma-separated list of headers | |
| UVICORN_USE_COLORS | Enable/disable colored logs | true |
| UVICORN_UDS | Unix domain socket path | |
| UVICORN_FD | File descriptor to bind to | |
| UVICORN_ROOT_PATH | Root path for the application | |

Refer to the `.env.example` file for all available Uvicorn options and their usage. Uncomment the ones you need and set them in your `.env` file.

For more details, see the [Uvicorn documentation](https://www.uvicorn.org/#command-line-options).

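
Several variables in the tables above are only required when another variable has a particular value (for example, the Google OAuth credentials when `AUTH_TYPE=GOOGLE`). The sketch below checks those "required if" rules against the contents of a `.env` file. It is an illustrative helper rather than anything the backend ships: `parse_env` and `missing_variables` are hypothetical names, and the parser only understands plain `KEY=VALUE` lines.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# (variable, value) -> variables the tables mark as "required if ...".
CONDITIONAL_REQUIREMENTS = {
    ("AUTH_TYPE", "GOOGLE"): ("GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"),
    ("ETL_SERVICE", "UNSTRUCTURED"): ("UNSTRUCTURED_API_KEY",),
    ("ETL_SERVICE", "LLAMACLOUD"): ("LLAMA_CLOUD_API_KEY",),
    ("RERANKERS_ENABLED", "TRUE"): ("RERANKERS_MODEL_NAME", "RERANKERS_MODEL_TYPE"),
}


def missing_variables(env: dict[str, str]) -> list[str]:
    """Return conditionally required variables that are unset or empty."""
    missing: list[str] = []
    for (key, value), needed in CONDITIONAL_REQUIREMENTS.items():
        if env.get(key, "").upper() == value:
            missing.extend(name for name in needed if not env.get(name))
    return missing
```

A typical call would be `missing_variables(parse_env(pathlib.Path(".env").read_text()))` from inside `surfsense_backend`; an empty list means none of the conditional rules is violated.
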
### 2. Install Dependencies

Install the backend dependencies using `uv`:

**Linux/macOS:**

```bash
# Install uv if you don't have it
curl -fsSL https://astral.sh/uv/install.sh | bash

# Install dependencies
uv sync
```

**Windows (PowerShell):**

```powershell
# Install uv if you don't have it
iwr -useb https://astral.sh/uv/install.ps1 | iex

# Install dependencies
uv sync
```

**Windows (Command Prompt):**

```cmd
REM Install dependencies with uv (after installing uv)
uv sync
```

### 3. Configure PostgreSQL for Zero Sync

SurfSense uses [Rocicorp Zero](https://zero.rocicorp.dev/) for real-time data synchronization (notifications, document status, chat comments, indexing progress). Zero replicates data from PostgreSQL via **logical replication**, which requires a one-time PostgreSQL configuration change.

Edit your `postgresql.conf` (typical locations: `/etc/postgresql/<version>/main/postgresql.conf` on Linux, `/usr/local/var/postgres/postgresql.conf` on macOS via Homebrew, `C:\Program Files\PostgreSQL\<version>\data\postgresql.conf` on Windows; if unsure, run `SHOW config_file;` in psql) and set:

```ini
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
```

Then restart PostgreSQL:

**Linux:**

```bash
sudo systemctl restart postgresql
```

**macOS (Homebrew):**

```bash
brew services restart postgresql
```

**Windows (PowerShell, replace `17` with your major version):**

```powershell
Restart-Service postgresql-x64-17
```

Verify the change:

```bash
psql -U postgres -d surfsense -c "SHOW wal_level;"
# Should return: logical
```

**Managed databases (RDS, Supabase, Cloud SQL, etc.):** Enable logical replication via your provider's parameter group (e.g. `rds.logical_replication=1` on RDS) and grant your database user the `REPLICATION` privilege:

```sql
ALTER USER surfsense WITH REPLICATION;
GRANT CREATE ON DATABASE surfsense TO surfsense;
```

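
Before restarting PostgreSQL it can help to confirm that the edited file really contains the settings described above. The following is a rough sketch that inspects the text of `postgresql.conf`; the helper names are hypothetical, it only understands `name = value` lines, and it cannot see overrides from `postgresql.auto.conf` or command-line flags, so `SHOW wal_level;` remains the authoritative check.

```python
def read_settings(conf_text: str) -> dict[str, str]:
    """Collect active `name = value` settings, ignoring comments and blank lines."""
    settings: dict[str, str] = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition("=")
        if not sep:
            continue
        settings[key.strip().lower()] = value.strip().strip("'")
    return settings


def zero_sync_problems(conf_text: str) -> list[str]:
    """Describe settings in the file that would prevent logical replication."""
    settings = read_settings(conf_text)
    problems: list[str] = []
    # PostgreSQL defaults wal_level to `replica`, so it must be set explicitly.
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be set to logical")
    # These two default to non-zero values; only an explicit 0 is a problem.
    for name in ("max_replication_slots", "max_wal_senders"):
        if settings.get(name) == "0":
            problems.append(f"{name} must be greater than 0")
    return problems
```

Calling `zero_sync_problems(open(path).read())` with the path of your `postgresql.conf` returns an empty list when the file looks ready for Zero.
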
### 4. Run Database Migrations

Before starting the backend, run Alembic migrations. This creates the schema **and** the `zero_publication` that zero-cache needs to start. Skipping this step will cause zero-cache to crash-loop with `Unknown or invalid publications. Specified: [zero_publication]`.

**If using uv:**

```bash
# From surfsense_backend/
uv run alembic upgrade head
```

**If using pip/venv:**

```bash
# Activate virtual environment first
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

alembic upgrade head
```

Verify the publication was created:

```bash
psql -U postgres -d surfsense -c "SELECT pubname FROM pg_publication;"
# Should include: zero_publication
```

### 5. Start Redis Server

Redis is required for the Celery task queue. Start the Redis server:

**Linux:**

```bash
# Start Redis as a systemd service
sudo systemctl start redis

# Or run it directly in the foreground
redis-server
```

**macOS:**

```bash
# If installed via Homebrew
brew services start redis

# Or run directly
redis-server
```

**Windows:**

```powershell
# Option 1: If using Redis on Windows (via WSL or Windows port)
redis-server

# Option 2: If installed as a Windows service
net start Redis
```

**Alternative for Windows - Run Redis in Docker:**

If you have Docker Desktop installed, you can run Redis in a container:

```powershell
# Pull and run Redis container
docker run -d --name redis -p 6379:6379 redis:latest

# To stop Redis
docker stop redis

# To start Redis again
docker start redis

# To remove Redis container
docker rm -f redis
```

Verify Redis is running by connecting to it:

```bash
redis-cli ping
# Should return: PONG
```

### 6. Start Celery Worker

In a new terminal window, start the Celery worker to handle background tasks:

**If using uv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Start Celery worker (consume both default and connectors queues)
DEFAULT_Q="${CELERY_TASK_DEFAULT_QUEUE:-surfsense}"
uv run celery -A celery_worker.celery_app worker --loglevel=info --concurrency=1 --pool=solo --queues="${DEFAULT_Q},${DEFAULT_Q}.connectors"
```

**If using pip/venv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Start Celery worker (consume both default and connectors queues)
DEFAULT_Q="${CELERY_TASK_DEFAULT_QUEUE:-surfsense}"
celery -A celery_worker.celery_app worker --loglevel=info --concurrency=1 --pool=solo --queues="${DEFAULT_Q},${DEFAULT_Q}.connectors"
```

**Optional: Start Flower for monitoring Celery tasks:**

In another terminal window:

```bash
# If using uv
uv run celery -A celery_worker.celery_app flower --port=5555

# If using pip/venv (activate venv first)
celery -A celery_worker.celery_app flower --port=5555
```

Access Flower at [http://localhost:5555](http://localhost:5555) to monitor your Celery tasks.

### 7. Start Celery Beat (Scheduler)

In another new terminal window, start Celery Beat to enable periodic tasks (like scheduled connector indexing):

**If using uv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Start Celery Beat
uv run celery -A celery_worker.celery_app beat --loglevel=info
```

**If using pip/venv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Start Celery Beat
celery -A celery_worker.celery_app beat --loglevel=info
```

**Important**: Celery Beat is required for the periodic indexing functionality to work. Without it, scheduled connector tasks won't run automatically. The schedule interval can be configured using the `SCHEDULE_CHECKER_INTERVAL` environment variable.

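
`SCHEDULE_CHECKER_INTERVAL` uses the `<number><unit>` format described in the environment table, with `m` for minutes and `h` for hours. To make the accepted values concrete, here is a small sketch of how such a string could be validated. It is not SurfSense's actual parser, and `parse_interval` is a hypothetical name.

```python
import re
from datetime import timedelta

# One or more digits followed by a single unit letter: m (minutes) or h (hours).
_INTERVAL_PATTERN = re.compile(r"(\d+)([mh])")


def parse_interval(value: str = "1m") -> timedelta:
    """Convert strings such as '5m' or '2h' into a timedelta."""
    match = _INTERVAL_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"expected <number><unit> with unit m or h, got {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(minutes=amount) if unit == "m" else timedelta(hours=amount)


print(parse_interval("5m"))  # 0:05:00
```

Values in other units, such as `30s`, are rejected by this sketch because the documented format only lists minutes and hours.
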
### 8. Run the Backend

Start the backend server:

**If using uv:**

```bash
# Run without hot reloading
uv run main.py

# Or with hot reloading for development
uv run main.py --reload
```

**If using pip/venv:**

```bash
# Activate virtual environment if not already activated
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Run without hot reloading
python main.py

# Or with hot reloading for development
python main.py --reload
```

If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.

## Zero-Cache Setup

**zero-cache** is the Rocicorp Zero server that sits between PostgreSQL and the browser. It streams real-time updates (notifications, document indexing status, chat comments, collaboration indicators) to all connected clients via WebSocket. The frontend connects to it on startup; without zero-cache running, you will not see live updates and many parts of the UI will show stale data.

For an overview of how Zero works and the list of synced tables, see the [Real-Time Sync with Zero](/docs/how-to/zero-sync) guide.

### 1. Run Zero-Cache via Docker

The simplest way to run zero-cache is the official Docker image. Open a new terminal:

**Linux/macOS:**

```bash
docker run -d --name surfsense-zero-cache \
  -p 4848:4848 \
  --add-host=host.docker.internal:host-gateway \
  -e ZERO_UPSTREAM_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" \
  -e ZERO_CVR_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" \
  -e ZERO_CHANGE_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" \
  -e ZERO_REPLICA_FILE=/data/zero.db \
  -e ZERO_ADMIN_PASSWORD=surfsense-zero-admin \
  -e ZERO_APP_PUBLICATIONS=zero_publication \
  -e ZERO_NUM_SYNC_WORKERS=4 \
  -e ZERO_UPSTREAM_MAX_CONNS=20 \
  -e ZERO_CVR_MAX_CONNS=30 \
  -e ZERO_QUERY_URL="http://host.docker.internal:3000/api/zero/query" \
  -e ZERO_MUTATE_URL="http://host.docker.internal:3000/api/zero/mutate" \
  -v surfsense-zero-cache:/data \
  rocicorp/zero:1.4.0
```

**Windows (PowerShell):**

```powershell
docker run -d --name surfsense-zero-cache `
  -p 4848:4848 `
  --add-host=host.docker.internal:host-gateway `
  -e ZERO_UPSTREAM_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" `
  -e ZERO_CVR_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" `
  -e ZERO_CHANGE_DB="postgresql://postgres:postgres@host.docker.internal:5432/surfsense?sslmode=disable" `
  -e ZERO_REPLICA_FILE=/data/zero.db `
  -e ZERO_ADMIN_PASSWORD=surfsense-zero-admin `
  -e ZERO_APP_PUBLICATIONS=zero_publication `
  -e ZERO_NUM_SYNC_WORKERS=4 `
  -e ZERO_UPSTREAM_MAX_CONNS=20 `
  -e ZERO_CVR_MAX_CONNS=30 `
  -e ZERO_QUERY_URL="http://host.docker.internal:3000/api/zero/query" `
  -e ZERO_MUTATE_URL="http://host.docker.internal:3000/api/zero/mutate" `
  -v surfsense-zero-cache:/data `
  rocicorp/zero:1.4.0
```

**Adjustments to make for your setup:**

- Replace `postgres:postgres` in the connection URLs with your actual `DB_USER:DB_PASSWORD`.
- On Linux without Docker Desktop, `host.docker.internal` may not resolve. Either keep the `--add-host=host.docker.internal:host-gateway` flag (Docker 20.10+), replace `host.docker.internal` with your host's IP address, or run the container with `--network=host` and use `localhost` in the URLs.
- For production / custom domains, set `ZERO_QUERY_URL` and `ZERO_MUTATE_URL` to your public frontend URL (e.g. `https://app.yourdomain.com/api/zero/query`).

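
The three `ZERO_*_DB` URLs above differ from the backend's `DATABASE_URL` in small but easy-to-miss ways: no `+asyncpg` driver suffix, `host.docker.internal` instead of `localhost`, and `sslmode=disable`. The sketch below derives one from the other, assuming your setup matches the local examples in this guide. `zero_db_url` is a hypothetical helper, and it replaces any existing query string, which is only appropriate for a local, non-TLS database.

```python
from urllib.parse import urlsplit, urlunsplit


def zero_db_url(database_url: str, docker_host: str = "host.docker.internal") -> str:
    """Rewrite the backend DATABASE_URL into the form used for ZERO_*_DB."""
    parts = urlsplit(database_url)
    # postgresql+asyncpg -> postgresql (drop the SQLAlchemy driver suffix).
    scheme = parts.scheme.split("+", 1)[0]
    netloc = parts.netloc
    if parts.hostname in ("localhost", "127.0.0.1"):
        # Inside the container, localhost is the container itself, not the host.
        userinfo, _, _ = netloc.rpartition("@")
        port = f":{parts.port}" if parts.port else ""
        netloc = f"{userinfo}@{docker_host}{port}" if userinfo else f"{docker_host}{port}"
    return urlunsplit((scheme, netloc, parts.path, "sslmode=disable", ""))


print(zero_db_url("postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense"))
```

If the password contains characters such as `@`, `:` or `/`, they must already be percent-encoded in `DATABASE_URL` for either URL to parse correctly.
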
### 2. Verify Zero-Cache

Confirm zero-cache is healthy:

```bash
# -i prints the response status line and headers
curl -i http://localhost:4848/keepalive
# Should return HTTP 200
```

Tail the logs to confirm initial replication completed without errors:

```bash
docker logs -f surfsense-zero-cache
```

### Alternative: Use `docker-compose.deps-only.yml`

If you would rather have Docker manage Postgres, Redis, SearXNG, and zero-cache together (while still running the backend and frontend natively), the repository ships a deps-only compose file. **Run Alembic migrations on the host first** so `zero_publication` exists before zero-cache starts:

```bash
cd surfsense_backend
uv run alembic upgrade head
cd ../docker
docker compose -f docker-compose.deps-only.yml up -d
```

The deps-only stack exposes zero-cache on port `4848` (default), so keep `NEXT_PUBLIC_ZERO_CACHE_URL=http://localhost:4848` in your `surfsense_web/.env`.

## Frontend Setup

### 1. Environment Configuration

Set up the frontend environment:

**Linux/macOS:**

```bash
cd surfsense_web
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_web
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_web
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file and set:

| ENV VARIABLE | DESCRIPTION |
| ------------------------------- | ------------------------------------------- |
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
| NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE | Same value as the backend `AUTH_TYPE`, i.e., `GOOGLE` for OAuth with Google or `LOCAL` for email/password authentication |
| NEXT_PUBLIC_ETL_SERVICE | Document parsing service (should match the backend `ETL_SERVICE`): `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING`. Affects the file formats offered in the upload interface |
| NEXT_PUBLIC_ZERO_CACHE_URL | URL for Zero-cache real-time sync service (e.g., `http://localhost:4848`) |

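
Two of the frontend variables above have to mirror backend values, and a mismatch tends to surface only later as confusing login or upload behavior. A minimal consistency check could look like the sketch below. The helper name `find_mismatches` is hypothetical, and both arguments are assumed to be dictionaries already loaded from the two `.env` files.

```python
# Frontend variable -> backend variable it must mirror (from the table above).
SHARED_SETTINGS = {
    "NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE": "AUTH_TYPE",
    "NEXT_PUBLIC_ETL_SERVICE": "ETL_SERVICE",
}


def find_mismatches(backend_env: dict[str, str], frontend_env: dict[str, str]) -> list[str]:
    """Return a message for every frontend value that differs from the backend."""
    problems: list[str] = []
    for frontend_key, backend_key in SHARED_SETTINGS.items():
        front = frontend_env.get(frontend_key, "")
        back = backend_env.get(backend_key, "")
        if front != back:
            problems.append(
                f"{frontend_key}={front!r} does not match {backend_key}={back!r}"
            )
    return problems
```

An empty result means the two files agree on the authentication type and ETL service.
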
### 2. Install Dependencies

Install the frontend dependencies:

**Linux/macOS:**

```bash
# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install
```

**Windows:**

```powershell
# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install
```

### 3. Run the Frontend

Start the Next.js development server:

**Linux/macOS/Windows:**

```bash
pnpm run dev
```

The frontend should now be running at `http://localhost:3000`.

## Browser Extension Setup (Optional)

The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.

### 1. Environment Configuration

**Linux/macOS:**

```bash
cd surfsense_browser_extension
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_browser_extension
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_browser_extension
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file:

| ENV VARIABLE | DESCRIPTION |
| ------------------------- | ----------------------------------------------------- |
| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |

### 2. Build the Extension

Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).

**Linux/macOS/Windows:**

```bash
# Install dependencies
pnpm install

# Build for Chrome (default)
pnpm build

# Or for other browsers
pnpm build --target=firefox
pnpm build --target=edge
```

### 3. Load the Extension

Load the extension in your browser's developer mode and configure it with your SurfSense API key.

## Verification

To verify your installation:

1. Open your browser and navigate to `http://localhost:3000`
2. Sign in with your Google account (or local credentials if `AUTH_TYPE=LOCAL`)
3. Create a search space and try uploading a document
4. Watch the upload status update live without refreshing; this confirms zero-cache is wired up correctly
5. Test the chat functionality with your uploaded content

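
If the page does not load or the status never updates, a quick first step is to see which of the services from this guide are actually listening. The sketch below probes the default ports used in the examples (5432, 6379, 4848, 8000, 3000) with a plain TCP connect. The names are hypothetical, and an open port only shows that something is listening there, not that the service is healthy.

```python
import socket

# Default ports taken from the example values in this guide.
SERVICES = {
    "PostgreSQL": 5432,
    "Redis": 6379,
    "zero-cache": 4848,
    "backend": 8000,
    "frontend": 3000,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_status(host: str = "localhost") -> dict[str, bool]:
    """Probe every service port and report which ones accept connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Running `print(service_status())` after starting everything should show `True` for all five entries; adjust `SERVICES` if you changed any port.
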
## Troubleshooting

- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed
- **Redis Connection Issues**: Ensure Redis server is running (`redis-cli ping` should return `PONG`). Check that `CELERY_BROKER_URL` and `CELERY_RESULT_BACKEND` are correctly set in your `.env` file
- **Celery Worker Issues**: Make sure the Celery worker is running in a separate terminal. Check worker logs for any errors
- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly
- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible
- **File Upload Failures**: Validate your ETL service API key (Unstructured.io or LlamaCloud) or ensure Docling is properly configured
- **Real-time updates not working / stale UI**: Verify zero-cache is running (`curl http://localhost:4848/keepalive` returns 200). Open browser DevTools → Console and look for WebSocket errors. Confirm `NEXT_PUBLIC_ZERO_CACHE_URL` in `surfsense_web/.env` matches the running zero-cache address.
- **Zero-cache stuck on `Unknown or invalid publications. Specified: [zero_publication]`**: You skipped (or never ran) `uv run alembic upgrade head` from `surfsense_backend/`. Run it, then restart the zero-cache container with `docker restart surfsense-zero-cache`.
- **Zero-cache crashes with `_zero.tableMetadata` errors**: A previous run left a half-built SQLite replica behind. Stop the container, remove the volume, and start fresh: `docker rm -f surfsense-zero-cache && docker volume rm surfsense-zero-cache && docker run -d ...` (re-run the command from [Zero-Cache Setup](#zero-cache-setup)).
- **`wal_level` is not set to `logical`**: zero-cache requires logical replication. Set `wal_level = logical` in `postgresql.conf`, restart PostgreSQL, and verify with `SHOW wal_level;` in psql.
- **Backend `/ready` returns 503**: The readiness probe verifies `zero_publication` exists. Run `uv run alembic upgrade head` to create it.
- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`)
- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands

## Next Steps

Now that you have SurfSense running locally, you can explore its features:

- Create search spaces for organizing your content
- Upload documents or use the browser extension to save webpages
- Ask questions about your saved content
- Explore the advanced RAG capabilities

For production deployments, consider setting up:

- A reverse proxy like Nginx
- SSL certificates for secure connections
- Proper database backups
- User access controls