---
title: Manual Installation
description: Setting up SurfSense manually for customized deployments
icon: Wrench
---

# Manual Installation (Preferred)

This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment.

## Prerequisites

Before beginning the manual installation, ensure you have the following installed and configured:

### Required Software
- **Python 3.12+** - Backend runtime environment
- **Node.js 20+** - Frontend runtime environment
- **PostgreSQL 14+** - Database server
- **pgvector** - PostgreSQL extension for vector similarity search
- **Redis** - Message broker for Celery task queue
- **Git** - Version control (to clone the repository)
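
To check an existing machine against the minimum versions listed above, a small comparison helper can be handy. The sketch below is illustrative only: `version_ge` is a hypothetical helper name, and it compares just the major and minor components of a dotted version.

```bash
# Succeed when dotted version $1 is at least dotted version $2.
# Only major and minor are compared; a missing minor counts as 0.
version_ge() {
  have="$1"
  want="$2"
  have_major="${have%%.*}"
  want_major="${want%%.*}"
  case "$have" in
    *.*) have_minor="${have#*.}"; have_minor="${have_minor%%.*}" ;;
    *)   have_minor=0 ;;
  esac
  case "$want" in
    *.*) want_minor="${want#*.}"; want_minor="${want_minor%%.*}" ;;
    *)   want_minor=0 ;;
  esac
  if [ "$have_major" -gt "$want_major" ]; then
    return 0
  fi
  [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Example with a literal version; substitute the output of your own tools.
if version_ge "3.12.4" "3.12"; then
  echo "Python version is new enough"
fi
```

In practice you might pass it the output of `python3 --version | awk '{print $2}'` or `node --version | tr -d v` and compare against `3.12` and `20` respectively.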

### Required Services & API Keys

Complete all the [setup steps](/docs), including:

- **Authentication Setup** (choose one):
  - Google OAuth credentials (for `AUTH_TYPE=GOOGLE`)
  - Local authentication setup (for `AUTH_TYPE=LOCAL`)
- **File Processing ETL Service** (choose one):
  - Unstructured.io API key (supports 34+ formats)
  - LlamaCloud API key (enhanced parsing, supports 50+ formats)
  - Docling (local processing, no API key required, supports PDF, Office docs, images, HTML, CSV)
- **Other API keys** as needed for your use case

## Backend Setup

The backend is the core of SurfSense. Follow these steps to set it up:

### 1. Environment Configuration

First, create and configure your environment variables by copying the example file:

**Linux/macOS:**

```bash
cd surfsense_backend
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_backend
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_backend
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file and set the following variables:

| ENV VARIABLE               | DESCRIPTION                                                                                                                                                                               |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| DATABASE_URL               | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`)                                                                                    |
| SECRET_KEY                 | JWT Secret key for authentication (should be a secure random string)                                                                                                                      |
| NEXT_FRONTEND_URL          | URL where your frontend application is hosted (e.g., `http://localhost:3000`)                                                                                                             |
| BACKEND_URL                | (Optional) Public URL of the backend for OAuth callbacks (e.g., `https://api.yourdomain.com`). Required when running behind a reverse proxy with HTTPS. Used to set correct OAuth redirect URLs and secure cookies. |
| AUTH_TYPE                  | Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication                                                                                          |
| GOOGLE_OAUTH_CLIENT_ID     | (Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE)                                                                                                                        |
| GOOGLE_OAUTH_CLIENT_SECRET | (Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE)                                                                                                                    |
| EMBEDDING_MODEL            | Name of the embedding model (e.g., `sentence-transformers/all-MiniLM-L6-v2`, `openai://text-embedding-ada-002`)                                                                                                                 |
| RERANKERS_ENABLED          | (Optional) Enable or disable document reranking for improved search results (e.g., `TRUE` or `FALSE`, default: `FALSE`)                                                                  |
| RERANKERS_MODEL_NAME       | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) (required if RERANKERS_ENABLED=TRUE)                                                                                        |
| RERANKERS_MODEL_TYPE       | Type of reranker model (e.g., `flashrank`) (required if RERANKERS_ENABLED=TRUE)                                                                                                          |
| TTS_SERVICE                | Text-to-Speech API provider for Podcasts (e.g., `local/kokoro`, `openai/tts-1`). See [supported providers](https://docs.litellm.ai/docs/text_to_speech#supported-providers)                            |
| TTS_SERVICE_API_KEY        | (Optional if local) API key for the Text-to-Speech service                                                                                                                                                    |
| TTS_SERVICE_API_BASE       | (Optional) Custom API base URL for the Text-to-Speech service                                                                                                                           |
| STT_SERVICE                | Speech-to-Text API provider for Audio Files (e.g., `local/base`, `openai/whisper-1`). See [supported providers](https://docs.litellm.ai/docs/audio_transcription#supported-providers)                   |
| STT_SERVICE_API_KEY        | (Optional if local) API key for the Speech-to-Text service                                                                                                                                                    |
| STT_SERVICE_API_BASE       | (Optional) Custom API base URL for the Speech-to-Text service                                                                                                                      |
| FIRECRAWL_API_KEY          | (Optional) API key for Firecrawl service for web crawling                                                                                                                                 |
| ETL_SERVICE                | Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV)                                                  |
| UNSTRUCTURED_API_KEY       | API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED)                                                                                           |
| LLAMA_CLOUD_API_KEY        | API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD)                                                                                                  |
| CELERY_BROKER_URL          | Redis connection URL for Celery broker (e.g., `redis://localhost:6379/0`)                                                                                                                |
| CELERY_RESULT_BACKEND      | Redis connection URL for Celery result backend (e.g., `redis://localhost:6379/0`)                                                                                                        |
| SCHEDULE_CHECKER_INTERVAL  | (Optional) How often to check for scheduled connector tasks. Format: `<number><unit>` where unit is `m` (minutes) or `h` (hours). Examples: `1m`, `5m`, `1h`, `2h` (default: `1m`)                  |
| REGISTRATION_ENABLED       | (Optional) Enable or disable new user registration (e.g., `TRUE` or `FALSE`, default: `TRUE`)                                                                                                       |
| PAGES_LIMIT                | (Optional) Maximum pages limit per user for ETL services (default: `999999999` for unlimited in OSS version)                                                                                        |

**Google Connector OAuth Configuration:**

| ENV VARIABLE               | DESCRIPTION                                                                                                                                                                               |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GOOGLE_CALENDAR_REDIRECT_URI | (Optional) Redirect URI for Google Calendar connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/calendar/connector/callback`)                                     |
| GOOGLE_GMAIL_REDIRECT_URI  | (Optional) Redirect URI for Gmail connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/gmail/connector/callback`)                                                   |
| GOOGLE_DRIVE_REDIRECT_URI  | (Optional) Redirect URI for Google Drive connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/drive/connector/callback`)                                            |

**Connector OAuth Configurations (Optional):**

| ENV VARIABLE               | DESCRIPTION                                                                                                                                                                               |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| AIRTABLE_CLIENT_ID         | (Optional) Airtable OAuth client ID from [Airtable Developer Hub](https://airtable.com/create/oauth)                                                                                      |
| AIRTABLE_CLIENT_SECRET     | (Optional) Airtable OAuth client secret                                                                                                                                                   |
| AIRTABLE_REDIRECT_URI      | (Optional) Redirect URI for Airtable connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/airtable/connector/callback`)                                                    |
| CLICKUP_CLIENT_ID          | (Optional) ClickUp OAuth client ID                                                                                                                                                        |
| CLICKUP_CLIENT_SECRET      | (Optional) ClickUp OAuth client secret                                                                                                                                                    |
| CLICKUP_REDIRECT_URI       | (Optional) Redirect URI for ClickUp connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/clickup/connector/callback`)                                                      |
| DISCORD_CLIENT_ID          | (Optional) Discord OAuth client ID                                                                                                                                                        |
| DISCORD_CLIENT_SECRET      | (Optional) Discord OAuth client secret                                                                                                                                                    |
| DISCORD_REDIRECT_URI       | (Optional) Redirect URI for Discord connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/discord/connector/callback`)                                                      |
| DISCORD_BOT_TOKEN          | (Optional) Discord bot token from Developer Portal                                                                                                                                        |
| ATLASSIAN_CLIENT_ID        | (Optional) Atlassian OAuth client ID (for Jira and Confluence)                                                                                                                            |
| ATLASSIAN_CLIENT_SECRET    | (Optional) Atlassian OAuth client secret                                                                                                                                                  |
| JIRA_REDIRECT_URI          | (Optional) Redirect URI for Jira connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/jira/connector/callback`)                                                            |
| CONFLUENCE_REDIRECT_URI    | (Optional) Redirect URI for Confluence connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/confluence/connector/callback`)                                                |
| LINEAR_CLIENT_ID           | (Optional) Linear OAuth client ID                                                                                                                                                         |
| LINEAR_CLIENT_SECRET       | (Optional) Linear OAuth client secret                                                                                                                                                     |
| LINEAR_REDIRECT_URI        | (Optional) Redirect URI for Linear connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/linear/connector/callback`)                                                        |
| NOTION_CLIENT_ID           | (Optional) Notion OAuth client ID                                                                                                                                                         |
| NOTION_CLIENT_SECRET       | (Optional) Notion OAuth client secret                                                                                                                                                     |
| NOTION_REDIRECT_URI        | (Optional) Redirect URI for Notion connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/notion/connector/callback`)                                                        |
| SLACK_CLIENT_ID            | (Optional) Slack OAuth client ID                                                                                                                                                          |
| SLACK_CLIENT_SECRET        | (Optional) Slack OAuth client secret                                                                                                                                                      |
| SLACK_REDIRECT_URI         | (Optional) Redirect URI for Slack connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/slack/connector/callback`)                                                          |
| TEAMS_CLIENT_ID            | (Optional) Microsoft Teams OAuth client ID                                                                                                                                                |
| TEAMS_CLIENT_SECRET        | (Optional) Microsoft Teams OAuth client secret                                                                                                                                            |
| TEAMS_REDIRECT_URI         | (Optional) Redirect URI for Teams connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/teams/connector/callback`)                                                          |

**(Optional) Backend LangSmith Observability:**

| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| LANGSMITH_TRACING | Enable LangSmith tracing (e.g., `true`) |
| LANGSMITH_ENDPOINT | LangSmith API endpoint (e.g., `https://api.smith.langchain.com`) |
| LANGSMITH_API_KEY | Your LangSmith API key |
| LANGSMITH_PROJECT | LangSmith project name (e.g., `surfsense`) |

**(Optional) Uvicorn Server Configuration:**

| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE |
|------------------------------|---------------------------------------------|---------------|
| UVICORN_HOST                 | Host address to bind the server             | 0.0.0.0       |
| UVICORN_PORT                 | Port to run the backend API                 | 8000          |
| UVICORN_LOG_LEVEL            | Logging level (e.g., info, debug, warning)  | info          |
| UVICORN_PROXY_HEADERS        | Enable/disable proxy headers                | false         |
| UVICORN_FORWARDED_ALLOW_IPS  | Comma-separated list of allowed IPs         | 127.0.0.1     |
| UVICORN_WORKERS              | Number of worker processes                  | 1             |
| UVICORN_ACCESS_LOG           | Enable/disable access log (true/false)      | true          |
| UVICORN_LOOP                 | Event loop implementation                   | auto          |
| UVICORN_HTTP                 | HTTP protocol implementation                | auto          |
| UVICORN_WS                   | WebSocket protocol implementation           | auto          |
| UVICORN_LIFESPAN             | Lifespan implementation                     | auto          |
| UVICORN_LOG_CONFIG           | Path to logging config file or empty string |               |
| UVICORN_SERVER_HEADER        | Enable/disable Server header                | true          |
| UVICORN_DATE_HEADER          | Enable/disable Date header                  | true          |
| UVICORN_LIMIT_CONCURRENCY    | Max concurrent connections                  |               |
| UVICORN_LIMIT_MAX_REQUESTS   | Max requests before worker restart          |               |
| UVICORN_TIMEOUT_KEEP_ALIVE   | Keep-alive timeout (seconds)                | 5             |
| UVICORN_TIMEOUT_NOTIFY       | Worker shutdown notification timeout (sec)  | 30            |
| UVICORN_SSL_KEYFILE          | Path to SSL key file                        |               |
| UVICORN_SSL_CERTFILE         | Path to SSL certificate file                |               |
| UVICORN_SSL_KEYFILE_PASSWORD | Password for SSL key file                   |               |
| UVICORN_SSL_VERSION          | SSL version                                 |               |
| UVICORN_SSL_CERT_REQS        | SSL certificate requirements                |               |
| UVICORN_SSL_CA_CERTS         | Path to CA certificates file                |               |
| UVICORN_SSL_CIPHERS          | SSL ciphers                                 |               |
| UVICORN_HEADERS              | Comma-separated list of headers             |               |
| UVICORN_USE_COLORS           | Enable/disable colored logs                 | true          |
| UVICORN_UDS                  | Unix domain socket path                     |               |
| UVICORN_FD                   | File descriptor to bind to                  |               |
| UVICORN_ROOT_PATH            | Root path for the application               |               |

Refer to the `.env.example` file for all available Uvicorn options and their usage. Uncomment the ones you need and set them in your `.env` file.

For more details, see the [Uvicorn documentation](https://www.uvicorn.org/#command-line-options).
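
The `SECRET_KEY` entry in the main table asks for a secure random string. One way to produce one with standard Unix tools is sketched below. The 32-byte length and the `generate_secret_key` name are assumptions made for illustration, not project requirements.

```bash
# Read 32 bytes from the kernel's random source and print them as 64 hex characters.
generate_secret_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -dc '0-9a-f'
}

SECRET_KEY_VALUE="$(generate_secret_key)"
echo "SECRET_KEY=${SECRET_KEY_VALUE}"
```

The printed line can be pasted into `surfsense_backend/.env` in place of the placeholder value.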

### 2. Install Dependencies

Install the backend dependencies using `uv`:

**Linux/macOS:**

```bash
# Install uv if you don't have it
curl -fsSL https://astral.sh/uv/install.sh | bash

# Install dependencies
uv sync
```

**Windows (PowerShell):**

```powershell
# Install uv if you don't have it
iwr -useb https://astral.sh/uv/install.ps1 | iex

# Install dependencies
uv sync
```

**Windows (Command Prompt):**

```cmd
REM Install dependencies (install uv first, for example with the PowerShell command above)
uv sync
```

### 3. Start Redis Server

Redis is required for the Celery task queue. Start the Redis server:

**Linux:**

```bash
# Start Redis server
sudo systemctl start redis

# Or run the server directly in the foreground
redis-server
```

**macOS:**

```bash
# If installed via Homebrew
brew services start redis

# Or run directly
redis-server
```

**Windows:**

```powershell
# Option 1: If using Redis on Windows (via WSL or Windows port)
redis-server

# Option 2: If installed as a Windows service
net start Redis
```

**Alternative for Windows - Run Redis in Docker:**

If you have Docker Desktop installed, you can run Redis in a container:

```powershell
# Pull and run Redis container
docker run -d --name redis -p 6379:6379 redis:latest

# To stop Redis
docker stop redis

# To start Redis again
docker start redis

# To remove Redis container
docker rm -f redis
```

Verify Redis is running by connecting to it:

```bash
redis-cli ping
# Should return: PONG
```

### 4. Start Celery Worker

In a new terminal window, start the Celery worker to handle background tasks:

**If using uv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Start Celery worker (consume both default and connectors queues)
DEFAULT_Q="${CELERY_TASK_DEFAULT_QUEUE:-surfsense}"
uv run celery -A celery_worker.celery_app worker --loglevel=info --concurrency=1 --pool=solo --queues="${DEFAULT_Q},${DEFAULT_Q}.connectors"
```

**If using pip/venv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Start Celery worker (consume both default and connectors queues)
DEFAULT_Q="${CELERY_TASK_DEFAULT_QUEUE:-surfsense}"
celery -A celery_worker.celery_app worker --loglevel=info --concurrency=1 --pool=solo --queues="${DEFAULT_Q},${DEFAULT_Q}.connectors"
```
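
The `--queues` argument in the commands above expands to two queue names derived from `CELERY_TASK_DEFAULT_QUEUE`. The sketch below only prints that expansion so you can confirm which queues a worker will consume; `celery_queues` is a hypothetical helper name.

```bash
# Print the value that --queues receives in the worker commands above.
# Falls back to "surfsense" when CELERY_TASK_DEFAULT_QUEUE is unset or empty.
celery_queues() {
  default_q="${CELERY_TASK_DEFAULT_QUEUE:-surfsense}"
  printf '%s,%s.connectors\n' "$default_q" "$default_q"
}

celery_queues
```

With the variable unset this prints `surfsense,surfsense.connectors`, so a single worker consumes both the default queue and the connectors queue.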

**Optional: Start Flower for monitoring Celery tasks:**

In another terminal window:

```bash
# If using uv
uv run celery -A celery_worker.celery_app flower --port=5555

# If using pip/venv (activate venv first)
celery -A celery_worker.celery_app flower --port=5555
```

Access Flower at [http://localhost:5555](http://localhost:5555) to monitor your Celery tasks.

### 5. Start Celery Beat (Scheduler)

In another new terminal window, start Celery Beat to enable periodic tasks (like scheduled connector indexing):

**If using uv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Start Celery Beat
uv run celery -A celery_worker.celery_app beat --loglevel=info
```

**If using pip/venv:**

```bash
# Make sure you're in the surfsense_backend directory
cd surfsense_backend

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Start Celery Beat
celery -A celery_worker.celery_app beat --loglevel=info
```

**Important**: Celery Beat is required for the periodic indexing functionality to work. Without it, scheduled connector tasks won't run automatically. The schedule interval can be configured using the `SCHEDULE_CHECKER_INTERVAL` environment variable.
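
`SCHEDULE_CHECKER_INTERVAL` uses the `<number><unit>` format described in the environment table, where the unit is `m` or `h`. The sketch below shows how such a value maps to seconds, which can help when sanity-checking a setting before starting Beat. It is an illustration written for this guide, not the parser SurfSense itself uses, and `interval_to_seconds` is a hypothetical name.

```bash
# Convert an interval such as "5m" or "2h" to seconds.
# Rejects missing units, non-numeric counts, zero, and leading zeros.
interval_to_seconds() {
  value="$1"
  number="${value%[mh]}"        # strip a trailing m or h, if present
  unit="${value#"$number"}"     # whatever remains is the unit
  case "$number" in
    ''|0*|*[!0-9]*)
      echo "invalid interval: $value" >&2
      return 1
      ;;
  esac
  case "$unit" in
    m) echo $((number * 60)) ;;
    h) echo $((number * 3600)) ;;
    *)
      echo "invalid interval: $value" >&2
      return 1
      ;;
  esac
}

interval_to_seconds "5m"   # prints 300
interval_to_seconds "1h"   # prints 3600
```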

### 6. Run the Backend

Start the backend server:

**If using uv:**

```bash
# Run without hot reloading
uv run main.py

# Or with hot reloading for development
uv run main.py --reload
```

**If using pip/venv:**

```bash
# Activate virtual environment if not already activated
source .venv/bin/activate  # Linux/macOS
# OR
.venv\Scripts\activate     # Windows

# Run without hot reloading
python main.py

# Or with hot reloading for development
python main.py --reload
```

If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.
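
If the server does not start, one thing worth ruling out is a missing value in `.env`. The sketch below reports keys that are absent or empty. The key list is an assumption based on the table rows not marked "(Optional)", and `missing_env_keys` is a hypothetical helper name.

```bash
# Keys taken from the table rows not marked "(Optional)"; adjust to your setup.
REQUIRED_KEYS="DATABASE_URL SECRET_KEY NEXT_FRONTEND_URL AUTH_TYPE EMBEDDING_MODEL ETL_SERVICE CELERY_BROKER_URL CELERY_RESULT_BACKEND"

# Report every key from the argument list that is absent or empty in the env file.
# Returns 0 when all keys have a value, 1 otherwise.
missing_env_keys() {
  env_file="$1"
  shift
  missing=""
  for key in "$@"; do
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      missing="${missing}${missing:+ }${key}"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing or empty: $missing"
    return 1
  fi
  return 0
}
```

From the `surfsense_backend` directory, `missing_env_keys .env $REQUIRED_KEYS` (left unquoted on purpose so the list splits into separate arguments) prints nothing when every key is set.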

## Frontend Setup

### 1. Environment Configuration

Set up the frontend environment:

**Linux/macOS:**

```bash
cd surfsense_web
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_web
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_web
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file and set:

| ENV VARIABLE                    | DESCRIPTION                                 |
| ------------------------------- | ------------------------------------------- |
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
| NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE | Same value as the backend `AUTH_TYPE`: `GOOGLE` for OAuth with Google, or `LOCAL` for email/password authentication |
| NEXT_PUBLIC_ETL_SERVICE         | Document parsing service (should match backend ETL_SERVICE): `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING` - affects supported file formats in upload interface |

### 2. Install Dependencies

Install the frontend dependencies:

**Linux/macOS:**

```bash
# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install
```

**Windows:**

```powershell
# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install
```

### 3. Run the Frontend

Start the Next.js development server:

**Linux/macOS/Windows:**

```bash
pnpm run dev
```

The frontend should now be running at `http://localhost:3000`.
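
Because `NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE` and `NEXT_PUBLIC_ETL_SERVICE` are meant to mirror the backend's `AUTH_TYPE` and `ETL_SERVICE`, comparing the two `.env` files can catch a mismatch early. The following is a minimal sketch; `env_value` and `check_env_match` are hypothetical helper names, and the parsing assumes simple `KEY=value` lines without inline comments.

```bash
# Print the value of key $2 from env file $1 (last assignment wins; quotes and CRs removed).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1 | tr -d "\"'\r"
}

# Compare backend keys with their frontend counterparts.
# Returns 0 when both pairs match, 1 otherwise.
check_env_match() {
  backend_env="$1"
  frontend_env="$2"
  status=0
  for pair in \
    "AUTH_TYPE:NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE" \
    "ETL_SERVICE:NEXT_PUBLIC_ETL_SERVICE"; do
    backend_key="${pair%%:*}"
    frontend_key="${pair#*:}"
    backend_val="$(env_value "$backend_env" "$backend_key")"
    frontend_val="$(env_value "$frontend_env" "$frontend_key")"
    if [ "$backend_val" = "$frontend_val" ]; then
      echo "ok: $backend_key=$backend_val"
    else
      echo "mismatch: $backend_key=$backend_val but $frontend_key=$frontend_val"
      status=1
    fi
  done
  return "$status"
}
```

From the repository root, this could be run as `check_env_match surfsense_backend/.env surfsense_web/.env`.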

## Browser Extension Setup (Optional)

The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.

### 1. Environment Configuration

**Linux/macOS:**

```bash
cd surfsense_browser_extension
cp .env.example .env
```

**Windows (Command Prompt):**

```cmd
cd surfsense_browser_extension
copy .env.example .env
```

**Windows (PowerShell):**

```powershell
cd surfsense_browser_extension
Copy-Item -Path .env.example -Destination .env
```

Edit the `.env` file:

| ENV VARIABLE              | DESCRIPTION                                           |
| ------------------------- | ----------------------------------------------------- |
| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |

### 2. Build the Extension

Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).

**Linux/macOS/Windows:**

```bash
# Install dependencies
pnpm install

# Build for Chrome (default)
pnpm build

# Or for other browsers
pnpm build --target=firefox
pnpm build --target=edge
```

### 3. Load the Extension

Load the extension in your browser's developer mode and configure it with your SurfSense API key.

## Verification

To verify your installation:

1. Open your browser and navigate to `http://localhost:3000`
2. Sign in with your Google account (`AUTH_TYPE=GOOGLE`) or with your email and password (`AUTH_TYPE=LOCAL`)
3. Create a search space and try uploading a document
4. Test the chat functionality with your uploaded content

## Troubleshooting

- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed
- **Redis Connection Issues**: Ensure Redis server is running (`redis-cli ping` should return `PONG`). Check that `CELERY_BROKER_URL` and `CELERY_RESULT_BACKEND` are correctly set in your `.env` file
- **Celery Worker Issues**: Make sure the Celery worker is running in a separate terminal. Check worker logs for any errors
- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly
- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible
- **File Upload Failures**: Validate your ETL service API key (Unstructured.io or LlamaCloud) or ensure Docling is properly configured
- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`)
- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands
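
For the database item above, a common cause of trouble is that the pgvector extension is installed on the server but not enabled in the target database. The sketch below assumes `psql` is available and that your database role is allowed to create extensions. `to_psql_url` and `enable_pgvector` are hypothetical helper names, and the URL conversion is needed only because `psql` does not understand the `+asyncpg` driver suffix used in `DATABASE_URL`.

```bash
# Convert a SQLAlchemy-style async URL into a plain URI that psql accepts.
to_psql_url() {
  printf '%s\n' "$1" | sed 's/^postgresql+asyncpg:/postgresql:/'
}

# Enable pgvector in the target database (does nothing if it is already enabled).
enable_pgvector() {
  psql "$(to_psql_url "$1")" -c 'CREATE EXTENSION IF NOT EXISTS vector;'
}

# Show the converted form of the example URL from the environment table.
to_psql_url "postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense"
```

Calling `enable_pgvector` with your own `DATABASE_URL` value runs the statement against that database.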

## Next Steps

Now that you have SurfSense running locally, you can explore its features:

- Create search spaces for organizing your content
- Upload documents or use the browser extension to save webpages
- Ask questions about your saved content
- Explore the advanced RAG capabilities

For production deployments, consider setting up:

- A reverse proxy like Nginx
- SSL certificates for secure connections
- Proper database backups
- User access controls