diff --git a/README.md b/README.md index 9056c27f2..c839e9c99 100644 --- a/README.md +++ b/README.md @@ -81,21 +81,26 @@ https://github.com/user-attachments/assets/a0a16566-6967-4374-ac51-9b3e07fbecd7 Run SurfSense on your own infrastructure for full data control and privacy. -**Quick Start (Docker one-liner):** - ```bash -docker run -d -p 3000:3000 -p 8000:8000 -p 5133:5133 -v surfsense-data:/data --name surfsense --restart unless-stopped ghcr.io/modsetter/surfsense:latest +curl -fsSL https://raw.githubusercontent.com/MODSetter/SurfSense/main/docker/scripts/install.sh | bash ``` -After starting, open [http://localhost:3000](http://localhost:3000) in your browser. +For Docker Compose and other deployment options, see the [Docker Installation docs](https://www.surfsense.com/docs/docker-installation). -**Update (Automatic updates with Watchtower):** +**Update (recommended — Watchtower):** ```bash -docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nickfedor/watchtower --run-once surfsense +docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nickfedor/watchtower --run-once --label-filter "com.docker.compose.project=surfsense" ``` -For Docker Compose, manual installation, and other deployment options, check the [docs](https://www.surfsense.com/docs/). +**Update (manual):** + +```bash +cd surfsense # or SurfSense/docker if you used Option 2 +docker compose pull && docker compose up -d +``` + +For manual installation and other deployment options, check the [docs](https://www.surfsense.com/docs/). 
### How to Realtime Collaborate (Beta) diff --git a/surfsense_web/content/docs/docker-installation.mdx b/surfsense_web/content/docs/docker-installation.mdx index 767240206..4ca525d7c 100644 --- a/surfsense_web/content/docs/docker-installation.mdx +++ b/surfsense_web/content/docs/docker-installation.mdx @@ -3,511 +3,225 @@ title: Docker Installation description: Setting up SurfSense using Docker --- +This guide explains how to run SurfSense using Docker, with options ranging from a single-command install to a fully manual setup. -This guide explains how to run SurfSense using Docker, with options ranging from quick single-command deployment to full production setups. +## Quick Start -## Quick Start with Docker 🐳 +### Option 1 — Install Script (recommended) -Get SurfSense running in seconds with a single command: - - -The all-in-one Docker image bundles PostgreSQL (with pgvector), Redis, and all SurfSense services. Perfect for quick evaluation and development. - - - -Make sure to include the `-v surfsense-data:/data` in your Docker command. This ensures your database and files are properly persisted. - - -### One-Line Installation - -**Linux/macOS:** +Downloads the compose files, generates a `SECRET_KEY`, and starts all services automatically: ```bash -docker run -d -p 3000:3000 -p 8000:8000 -p 5133:5133 \ - -v surfsense-data:/data \ - --name surfsense \ - --restart unless-stopped \ - ghcr.io/modsetter/surfsense:latest +curl -fsSL https://raw.githubusercontent.com/MODSetter/SurfSense/main/docker/scripts/install.sh | bash ``` -**Windows (PowerShell):** +This creates a `./surfsense/` directory with `docker-compose.yml` and `.env`, then runs `docker compose up -d`. -```powershell -docker run -d -p 3000:3000 -p 8000:8000 -p 5133:5133 ` - -v surfsense-data:/data ` - --name surfsense ` - --restart unless-stopped ` - ghcr.io/modsetter/surfsense:latest -``` - -> **Note:** A secure `SECRET_KEY` is automatically generated and persisted in the data volume on first run. 
- -### With Custom Configuration - -You can pass any [environment variable](/docs/manual-installation#backend-environment-variables) using `-e` flags: +### Option 2 — Manual Docker Compose ```bash -docker run -d -p 3000:3000 -p 8000:8000 -p 5133:5133 \ - -v surfsense-data:/data \ - -e EMBEDDING_MODEL=openai://text-embedding-ada-002 \ - -e OPENAI_API_KEY=your_openai_api_key \ - -e AUTH_TYPE=GOOGLE \ - -e GOOGLE_OAUTH_CLIENT_ID=your_google_client_id \ - -e GOOGLE_OAUTH_CLIENT_SECRET=your_google_client_secret \ - -e ETL_SERVICE=LLAMACLOUD \ - -e LLAMA_CLOUD_API_KEY=your_llama_cloud_key \ - --name surfsense \ - --restart unless-stopped \ - ghcr.io/modsetter/surfsense:latest -``` - - -- For Google OAuth, create credentials in the [Google Cloud Console](https://console.cloud.google.com/apis/credentials) -- For Airtable connector, create an OAuth integration in the [Airtable Developer Hub](https://airtable.com/create/oauth) -- If deploying behind a reverse proxy with HTTPS, add `-e BACKEND_URL=https://api.yourdomain.com` - - -### Quick Start with Docker Compose - -For easier management with environment files: - -```bash -# Download the quick start compose file -curl -o docker-compose.yml https://raw.githubusercontent.com/MODSetter/SurfSense/main/docker-compose.quickstart.yml - -# Create .env file (optional - for custom configuration) -cat > .env << EOF -# EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 -# ETL_SERVICE=DOCLING -# SECRET_KEY=your_custom_secret_key # Auto-generated if not set -EOF - -# Start SurfSense +git clone https://github.com/MODSetter/SurfSense.git +cd SurfSense/docker +cp .env.example .env +# Edit .env — at minimum set SECRET_KEY docker compose up -d ``` After starting, access SurfSense at: + - **Frontend**: [http://localhost:3000](http://localhost:3000) - **Backend API**: [http://localhost:8000](http://localhost:8000) - **API Docs**: [http://localhost:8000/docs](http://localhost:8000/docs) -- **Electric-SQL**: 
[http://localhost:5133](http://localhost:5133) +- **Electric SQL**: [http://localhost:5133](http://localhost:5133) -### Quick Start Environment Variables +--- + +## Configuration + +All configuration lives in a single `docker/.env` file (or `surfsense/.env` if you used the install script). Copy `.env.example` to `.env` and edit the values you need. + +### Required + +| Variable | Description | +|----------|-------------| +| `SECRET_KEY` | JWT secret key. Generate with: `openssl rand -base64 32`. Auto-generated by the install script. | + +### Core Settings | Variable | Description | Default | |----------|-------------|---------| -| SECRET_KEY | JWT secret key (auto-generated if not set) | Auto-generated | -| AUTH_TYPE | Authentication: `LOCAL` or `GOOGLE` | LOCAL | -| EMBEDDING_MODEL | Model for embeddings | sentence-transformers/all-MiniLM-L6-v2 | -| ETL_SERVICE | Document parser: `DOCLING`, `UNSTRUCTURED`, `LLAMACLOUD` | DOCLING | -| TTS_SERVICE | Text-to-speech for podcasts | local/kokoro | -| STT_SERVICE | Speech-to-text for audio (model size: tiny, base, small, medium, large) | local/base | -| REGISTRATION_ENABLED | Allow new user registration | TRUE | +| `SURFSENSE_VERSION` | Image tag to deploy. Pin to a version (e.g. 
`0.0.13.1`) or use `latest` | `latest` | +| `AUTH_TYPE` | Authentication method: `LOCAL` (email/password) or `GOOGLE` (OAuth) | `LOCAL` | +| `ETL_SERVICE` | Document parsing: `DOCLING` (local), `UNSTRUCTURED`, or `LLAMACLOUD` | `DOCLING` | +| `EMBEDDING_MODEL` | Embedding model for vector search | `sentence-transformers/all-MiniLM-L6-v2` | +| `TTS_SERVICE` | Text-to-speech provider for podcasts | `local/kokoro` | +| `STT_SERVICE` | Speech-to-text provider for audio files | `local/base` | +| `REGISTRATION_ENABLED` | Allow new user registrations | `TRUE` | -### Useful Commands +### Ports + +| Variable | Description | Default | +|----------|-------------|---------| +| `FRONTEND_PORT` | Frontend service port | `3000` | +| `BACKEND_PORT` | Backend API service port | `8000` | +| `ELECTRIC_PORT` | Electric SQL service port | `5133` | + +### Custom Domain / Reverse Proxy + +Only set these if serving SurfSense on a real domain via a reverse proxy (Caddy, Nginx, Cloudflare Tunnel, etc.). Leave commented out for standard localhost deployments. + +| Variable | Description | +|----------|-------------| +| `NEXT_FRONTEND_URL` | Public frontend URL (e.g. `https://app.yourdomain.com`) | +| `BACKEND_URL` | Public backend URL for OAuth callbacks (e.g. `https://api.yourdomain.com`) | +| `NEXT_PUBLIC_FASTAPI_BACKEND_URL` | Backend URL used by the frontend (e.g. `https://api.yourdomain.com`) | +| `NEXT_PUBLIC_ELECTRIC_URL` | Electric SQL URL used by the frontend (e.g. `https://electric.yourdomain.com`) | + +### Database + +Defaults work out of the box. Change for security in production. 
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `DB_USER` | PostgreSQL username | `surfsense` |
+| `DB_PASSWORD` | PostgreSQL password | `surfsense` |
+| `DB_NAME` | PostgreSQL database name | `surfsense` |
+| `DB_HOST` | PostgreSQL host | `db` |
+| `DB_PORT` | PostgreSQL port | `5432` |
+| `DB_SSLMODE` | SSL mode: `disable`, `require`, `verify-ca`, `verify-full` | `disable` |
+| `DATABASE_URL` | Full connection URL override. Use for managed databases (RDS, Supabase, etc.) | *(built from above)* |
+
+### Electric SQL
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `ELECTRIC_DB_USER` | Replication user for Electric SQL | `electric` |
+| `ELECTRIC_DB_PASSWORD` | Replication password for Electric SQL | `electric_password` |
+| `ELECTRIC_DATABASE_URL` | Full connection URL override for Electric. Set to `host.docker.internal` when pointing at a local Postgres instance | *(built from above)* |
+
+### Authentication
+
+| Variable | Description |
+|----------|-------------|
+| `GOOGLE_OAUTH_CLIENT_ID` | Google OAuth client ID (required if `AUTH_TYPE=GOOGLE`) |
+| `GOOGLE_OAUTH_CLIENT_SECRET` | Google OAuth client secret (required if `AUTH_TYPE=GOOGLE`) |
+
+Create credentials at the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).
+
+### External API Keys
+
+| Variable | Description |
+|----------|-------------|
+| `FIRECRAWL_API_KEY` | Firecrawl API key for web crawling |
+| `UNSTRUCTURED_API_KEY` | Unstructured.io API key (required if `ETL_SERVICE=UNSTRUCTURED`) |
+| `LLAMA_CLOUD_API_KEY` | LlamaCloud API key (required if `ETL_SERVICE=LLAMACLOUD`) |
+
+### Connector OAuth Keys
+
+Uncomment the connectors you want to use. Redirect URIs follow the pattern `http://localhost:8000/api/v1/auth/<connector>/connector/callback`.
+ +| Connector | Variables | +|-----------|-----------| +| Google Drive / Gmail / Calendar | `GOOGLE_DRIVE_REDIRECT_URI`, `GOOGLE_GMAIL_REDIRECT_URI`, `GOOGLE_CALENDAR_REDIRECT_URI` | +| Notion | `NOTION_CLIENT_ID`, `NOTION_CLIENT_SECRET`, `NOTION_REDIRECT_URI` | +| Slack | `SLACK_CLIENT_ID`, `SLACK_CLIENT_SECRET`, `SLACK_REDIRECT_URI` | +| Discord | `DISCORD_CLIENT_ID`, `DISCORD_CLIENT_SECRET`, `DISCORD_BOT_TOKEN`, `DISCORD_REDIRECT_URI` | +| Jira & Confluence | `ATLASSIAN_CLIENT_ID`, `ATLASSIAN_CLIENT_SECRET`, `JIRA_REDIRECT_URI`, `CONFLUENCE_REDIRECT_URI` | +| Linear | `LINEAR_CLIENT_ID`, `LINEAR_CLIENT_SECRET`, `LINEAR_REDIRECT_URI` | +| ClickUp | `CLICKUP_CLIENT_ID`, `CLICKUP_CLIENT_SECRET`, `CLICKUP_REDIRECT_URI` | +| Airtable | `AIRTABLE_CLIENT_ID`, `AIRTABLE_CLIENT_SECRET`, `AIRTABLE_REDIRECT_URI` | +| Microsoft Teams | `TEAMS_CLIENT_ID`, `TEAMS_CLIENT_SECRET`, `TEAMS_REDIRECT_URI` | + +For Airtable, create an OAuth integration at the [Airtable Developer Hub](https://airtable.com/create/oauth). + +### Observability (optional) + +| Variable | Description | +|----------|-------------| +| `LANGSMITH_TRACING` | Enable LangSmith tracing (`true` / `false`) | +| `LANGSMITH_ENDPOINT` | LangSmith API endpoint | +| `LANGSMITH_API_KEY` | LangSmith API key | +| `LANGSMITH_PROJECT` | LangSmith project name | + +### Advanced (optional) + +| Variable | Description | Default | +|----------|-------------|---------| +| `SCHEDULE_CHECKER_INTERVAL` | How often to check for scheduled connector tasks (e.g. `5m`, `1h`) | `5m` | +| `RERANKERS_ENABLED` | Enable document reranking for improved search | `FALSE` | +| `RERANKERS_MODEL_NAME` | Reranker model name (e.g. `ms-marco-MiniLM-L-12-v2`) | | +| `RERANKERS_MODEL_TYPE` | Reranker model type (e.g. 
`flashrank`) | | +| `PAGES_LIMIT` | Max pages per user for ETL services | unlimited | + +--- + +## Docker Services + +| Service | Description | +|---------|-------------| +| `db` | PostgreSQL with pgvector extension | +| `redis` | Message broker for Celery | +| `backend` | FastAPI application server | +| `celery_worker` | Background task processing (document indexing, etc.) | +| `celery_beat` | Periodic task scheduler (connector sync) | +| `electric` | Electric SQL — real-time sync for the frontend | +| `frontend` | Next.js web application | + +All services start automatically with `docker compose up -d`. + +--- + +## Updating + +**Option 1 — Watchtower (recommended):** ```bash -# View logs -docker logs -f surfsense - -# Stop SurfSense -docker stop surfsense - -# Start SurfSense -docker start surfsense - -# Remove container (data preserved in volume) -docker rm surfsense - -# Remove container AND data -docker rm surfsense && docker volume rm surfsense-data -``` - -### Updating - -To update SurfSense to the latest version, you can use either of the following methods: - - -Your data is safe! The `surfsense-data` volume persists across updates, and database migrations are applied automatically on every startup. - - -**Option 1: Using Watchtower (one-time auto-update)** - -[Watchtower](https://github.com/nicholas-fedor/watchtower) can automatically pull the latest image, stop the old container, and restart it with the same options: - -```bash -docker run --rm \ - -v /var/run/docker.sock:/var/run/docker.sock \ - nickfedor/watchtower \ - --run-once surfsense +docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \ + nickfedor/watchtower --run-once \ + --label-filter "com.docker.compose.project=surfsense" ``` -Use the `nickfedor/watchtower` fork. The original `containrrr/watchtower` is no longer maintained and may fail with newer Docker versions. +Use `nickfedor/watchtower`. 
The original `containrrr/watchtower` is no longer maintained and may fail with newer Docker versions. -**Option 2: Manual Update** +**Option 2 — Manual:** ```bash -# Stop and remove the current container -docker rm -f surfsense - -# Pull the latest image -docker pull ghcr.io/modsetter/surfsense:latest - -# Start with the new image -docker run -d -p 3000:3000 -p 8000:8000 -p 5133:5133 \ - -v surfsense-data:/data \ - --name surfsense \ - --restart unless-stopped \ - ghcr.io/modsetter/surfsense:latest +cd surfsense # or SurfSense/docker if you cloned manually +docker compose pull && docker compose up -d ``` -If you used Docker Compose for the quick start, updating is simpler: +Database migrations are applied automatically on every startup. + +--- + +## Useful Commands ```bash -docker compose -f docker-compose.quickstart.yml pull -docker compose -f docker-compose.quickstart.yml up -d +# View logs (all services) +docker compose logs -f + +# View logs for a specific service +docker compose logs -f backend +docker compose logs -f electric + +# Stop all services +docker compose down + +# Restart a specific service +docker compose restart backend + +# Stop and remove all containers + volumes (destructive!) +docker compose down -v ``` --- -## Full Docker Compose Setup (Production) - -For production deployments with separate services and more control, use the full Docker Compose setup below. 
- -## Prerequisites - -Before you begin, ensure you have: - -- [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) installed on your machine -- [Git](https://git-scm.com/downloads) (to clone the repository) -- Completed all the [prerequisite setup steps](/docs) including: - - Auth setup - - **File Processing ETL Service** (choose one): - - Unstructured.io API key (Supports 34+ formats) - - LlamaIndex API key (enhanced parsing, supports 50+ formats) - - Docling (local processing, no API key required, supports PDF, Office docs, images, HTML, CSV) - - Other required API keys - -## Installation Steps - -1. **Configure Environment Variables** - Set up the necessary environment variables: - - **Linux/macOS:** - - ```bash - # Copy example environment files - cp surfsense_backend/.env.example surfsense_backend/.env - cp surfsense_web/.env.example surfsense_web/.env - cp .env.example .env # For Docker-specific settings - ``` - - **Windows (Command Prompt):** - - ```cmd - copy surfsense_backend\.env.example surfsense_backend\.env - copy surfsense_web\.env.example surfsense_web\.env - copy .env.example .env - ``` - - **Windows (PowerShell):** - - ```powershell - Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env - Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env - Copy-Item -Path .env.example -Destination .env - ``` - - Edit all `.env` files and fill in the required values: - -### Docker-Specific Environment Variables (Optional) - -| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE | -|----------------------------|-----------------------------------------------------------------------------|---------------------| -| FRONTEND_PORT | Port for the frontend service | 3000 | -| BACKEND_PORT | Port for the backend API service | 8000 | -| POSTGRES_PORT | Port for the PostgreSQL database | 5432 | -| PGADMIN_PORT | Port for pgAdmin web interface | 5050 | -| REDIS_PORT | Port 
for Redis (used by Celery) | 6379 | -| FLOWER_PORT | Port for Flower (Celery monitoring tool) | 5555 | -| POSTGRES_USER | PostgreSQL username | postgres | -| POSTGRES_PASSWORD | PostgreSQL password | postgres | -| POSTGRES_DB | PostgreSQL database name | surfsense | -| PGADMIN_DEFAULT_EMAIL | Email for pgAdmin login | admin@surfsense.com | -| PGADMIN_DEFAULT_PASSWORD | Password for pgAdmin login | surfsense | -| NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend API (used by frontend during build and runtime) | http://localhost:8000 | -| NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE | Authentication method for frontend: `LOCAL` or `GOOGLE` | LOCAL | -| NEXT_PUBLIC_ETL_SERVICE | Document parsing service for frontend UI: `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING` | DOCLING | -| ELECTRIC_PORT | Port for Electric-SQL service | 5133 | -| POSTGRES_HOST | PostgreSQL host for Electric connection (`db` for Docker PostgreSQL, `host.docker.internal` for local PostgreSQL) | db | -| ELECTRIC_DB_USER | PostgreSQL username for Electric connection | electric | -| ELECTRIC_DB_PASSWORD | PostgreSQL password for Electric connection | electric_password | -| NEXT_PUBLIC_ELECTRIC_URL | URL for Electric-SQL service (used by frontend) | http://localhost:5133 | - -**Note:** Frontend environment variables with the `NEXT_PUBLIC_` prefix are embedded into the Next.js production build at build time. Since the frontend now runs as a production build in Docker, these variables must be set in the root `.env` file (Docker-specific configuration) and will be passed as build arguments during the Docker build process. 
- -**Backend Environment Variables:** - -| ENV VARIABLE | DESCRIPTION | -| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) | -| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) | -| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) | -| BACKEND_URL | (Optional) Public URL of the backend for OAuth callbacks (e.g., `https://api.yourdomain.com`). Required when running behind a reverse proxy with HTTPS. Used to set correct OAuth redirect URLs and secure cookies. | -| AUTH_TYPE | Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication | -| GOOGLE_OAUTH_CLIENT_ID | (Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE) | -| GOOGLE_OAUTH_CLIENT_SECRET | (Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE) | -| ELECTRIC_DB_USER | (Optional) PostgreSQL username for Electric-SQL connection (default: `electric`) | -| ELECTRIC_DB_PASSWORD | (Optional) PostgreSQL password for Electric-SQL connection (default: `electric_password`) | -| EMBEDDING_MODEL | Name of the embedding model (e.g., `sentence-transformers/all-MiniLM-L6-v2`, `openai://text-embedding-ada-002`) | -| RERANKERS_ENABLED | (Optional) Enable or disable document reranking for improved search results (e.g., `TRUE` or `FALSE`, default: `FALSE`) | -| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) (required if RERANKERS_ENABLED=TRUE) | -| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) (required if RERANKERS_ENABLED=TRUE) | -| TTS_SERVICE | Text-to-Speech API provider for Podcasts (e.g., `local/kokoro`, 
`openai/tts-1`). See [supported providers](https://docs.litellm.ai/docs/text_to_speech#supported-providers) | -| TTS_SERVICE_API_KEY | (Optional if local) API key for the Text-to-Speech service | -| TTS_SERVICE_API_BASE | (Optional) Custom API base URL for the Text-to-Speech service | -| STT_SERVICE | Speech-to-Text API provider for Audio Files (e.g., `local/base`, `openai/whisper-1`). See [supported providers](https://docs.litellm.ai/docs/audio_transcription#supported-providers) | -| STT_SERVICE_API_KEY | (Optional if local) API key for the Speech-to-Text service | -| STT_SERVICE_API_BASE | (Optional) Custom API base URL for the Speech-to-Text service | -| FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling | -| ETL_SERVICE | Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV) | -| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED) | -| LLAMA_CLOUD_API_KEY | API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD) | -| CELERY_BROKER_URL | Redis connection URL for Celery broker (e.g., `redis://localhost:6379/0`) | -| CELERY_RESULT_BACKEND | Redis connection URL for Celery result backend (e.g., `redis://localhost:6379/0`) | -| SCHEDULE_CHECKER_INTERVAL | (Optional) How often to check for scheduled connector tasks. Format: `` where unit is `m` (minutes) or `h` (hours). 
Examples: `1m`, `5m`, `1h`, `2h` (default: `1m`) | -| REGISTRATION_ENABLED | (Optional) Enable or disable new user registration (e.g., `TRUE` or `FALSE`, default: `TRUE`) | -| PAGES_LIMIT | (Optional) Maximum pages limit per user for ETL services (default: `999999999` for unlimited in OSS version) | - -**Google Connector OAuth Configuration:** -| ENV VARIABLE | DESCRIPTION | -| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| GOOGLE_CALENDAR_REDIRECT_URI | (Optional) Redirect URI for Google Calendar connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/calendar/connector/callback`) | -| GOOGLE_GMAIL_REDIRECT_URI | (Optional) Redirect URI for Gmail connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/gmail/connector/callback`) | -| GOOGLE_DRIVE_REDIRECT_URI | (Optional) Redirect URI for Google Drive connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/google/drive/connector/callback`) | - -**Connector OAuth Configurations (Optional):** - -| ENV VARIABLE | DESCRIPTION | -| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| AIRTABLE_CLIENT_ID | (Optional) Airtable OAuth client ID from [Airtable Developer Hub](https://airtable.com/create/oauth) | -| AIRTABLE_CLIENT_SECRET | (Optional) Airtable OAuth client secret | -| AIRTABLE_REDIRECT_URI | (Optional) Redirect URI for Airtable connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/airtable/connector/callback`) | -| CLICKUP_CLIENT_ID | (Optional) ClickUp OAuth client ID | -| CLICKUP_CLIENT_SECRET | (Optional) ClickUp OAuth client secret | -| CLICKUP_REDIRECT_URI | (Optional) Redirect URI for ClickUp connector OAuth 
callback (e.g., `http://localhost:8000/api/v1/auth/clickup/connector/callback`) | -| DISCORD_CLIENT_ID | (Optional) Discord OAuth client ID | -| DISCORD_CLIENT_SECRET | (Optional) Discord OAuth client secret | -| DISCORD_REDIRECT_URI | (Optional) Redirect URI for Discord connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/discord/connector/callback`) | -| DISCORD_BOT_TOKEN | (Optional) Discord bot token from Developer Portal | -| ATLASSIAN_CLIENT_ID | (Optional) Atlassian OAuth client ID (for Jira and Confluence) | -| ATLASSIAN_CLIENT_SECRET | (Optional) Atlassian OAuth client secret | -| JIRA_REDIRECT_URI | (Optional) Redirect URI for Jira connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/jira/connector/callback`) | -| CONFLUENCE_REDIRECT_URI | (Optional) Redirect URI for Confluence connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/confluence/connector/callback`) | -| LINEAR_CLIENT_ID | (Optional) Linear OAuth client ID | -| LINEAR_CLIENT_SECRET | (Optional) Linear OAuth client secret | -| LINEAR_REDIRECT_URI | (Optional) Redirect URI for Linear connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/linear/connector/callback`) | -| NOTION_CLIENT_ID | (Optional) Notion OAuth client ID | -| NOTION_CLIENT_SECRET | (Optional) Notion OAuth client secret | -| NOTION_REDIRECT_URI | (Optional) Redirect URI for Notion connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/notion/connector/callback`) | -| SLACK_CLIENT_ID | (Optional) Slack OAuth client ID | -| SLACK_CLIENT_SECRET | (Optional) Slack OAuth client secret | -| SLACK_REDIRECT_URI | (Optional) Redirect URI for Slack connector OAuth callback (e.g., `http://localhost:8000/api/v1/auth/slack/connector/callback`) | -| TEAMS_CLIENT_ID | (Optional) Microsoft Teams OAuth client ID | -| TEAMS_CLIENT_SECRET | (Optional) Microsoft Teams OAuth client secret | -| TEAMS_REDIRECT_URI | (Optional) Redirect URI for Teams connector OAuth callback (e.g., 
`http://localhost:8000/api/v1/auth/teams/connector/callback`) | - - -**Optional Backend LangSmith Observability:** -| ENV VARIABLE | DESCRIPTION | -|--------------|-------------| -| LANGSMITH_TRACING | Enable LangSmith tracing (e.g., `true`) | -| LANGSMITH_ENDPOINT | LangSmith API endpoint (e.g., `https://api.smith.langchain.com`) | -| LANGSMITH_API_KEY | Your LangSmith API key | -| LANGSMITH_PROJECT | LangSmith project name (e.g., `surfsense`) | - -**Backend Uvicorn Server Configuration:** -| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE | -|------------------------------|---------------------------------------------|---------------| -| UVICORN_HOST | Host address to bind the server | 0.0.0.0 | -| UVICORN_PORT | Port to run the backend API | 8000 | -| UVICORN_LOG_LEVEL | Logging level (e.g., info, debug, warning) | info | -| UVICORN_PROXY_HEADERS | Enable/disable proxy headers | false | -| UVICORN_FORWARDED_ALLOW_IPS | Comma-separated list of allowed IPs | 127.0.0.1 | -| UVICORN_WORKERS | Number of worker processes | 1 | -| UVICORN_ACCESS_LOG | Enable/disable access log (true/false) | true | -| UVICORN_LOOP | Event loop implementation | auto | -| UVICORN_HTTP | HTTP protocol implementation | auto | -| UVICORN_WS | WebSocket protocol implementation | auto | -| UVICORN_LIFESPAN | Lifespan implementation | auto | -| UVICORN_LOG_CONFIG | Path to logging config file or empty string | | -| UVICORN_SERVER_HEADER | Enable/disable Server header | true | -| UVICORN_DATE_HEADER | Enable/disable Date header | true | -| UVICORN_LIMIT_CONCURRENCY | Max concurrent connections | | -| UVICORN_LIMIT_MAX_REQUESTS | Max requests before worker restart | | -| UVICORN_TIMEOUT_KEEP_ALIVE | Keep-alive timeout (seconds) | 5 | -| UVICORN_TIMEOUT_NOTIFY | Worker shutdown notification timeout (sec) | 30 | -| UVICORN_SSL_KEYFILE | Path to SSL key file | | -| UVICORN_SSL_CERTFILE | Path to SSL certificate file | | -| UVICORN_SSL_KEYFILE_PASSWORD | Password for SSL key file | | -| 
UVICORN_SSL_VERSION | SSL version | | -| UVICORN_SSL_CERT_REQS | SSL certificate requirements | | -| UVICORN_SSL_CA_CERTS | Path to CA certificates file | | -| UVICORN_SSL_CIPHERS | SSL ciphers | | -| UVICORN_HEADERS | Comma-separated list of headers | | -| UVICORN_USE_COLORS | Enable/disable colored logs | true | -| UVICORN_UDS | Unix domain socket path | | -| UVICORN_FD | File descriptor to bind to | | -| UVICORN_ROOT_PATH | Root path for the application | | - -For more details, see the [Uvicorn documentation](https://www.uvicorn.org/#command-line-options). - -### Frontend Environment Variables - -**Important:** Frontend environment variables are now configured in the **Docker-Specific Environment Variables** section above since the Next.js application runs as a production build in Docker. The following `NEXT_PUBLIC_*` variables should be set in your root `.env` file: - -- `NEXT_PUBLIC_FASTAPI_BACKEND_URL` - URL of the backend service -- `NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE` - Authentication method (`LOCAL` or `GOOGLE`) -- `NEXT_PUBLIC_ETL_SERVICE` - Document parsing service (should match backend `ETL_SERVICE`) -- `NEXT_PUBLIC_ELECTRIC_URL` - URL for Electric-SQL service (default: `http://localhost:5133`) -- `NEXT_PUBLIC_ELECTRIC_AUTH_MODE` - Electric-SQL authentication mode (default: `insecure`) - -These variables are embedded into the application during the Docker build process and affect the frontend's behavior and available features. - -2. **Build and Start Containers** - - Start the Docker containers: - - **Linux/macOS/Windows:** - - ```bash - docker compose up --build - ``` - - To run in detached mode (in the background): - - **Linux/macOS/Windows:** - - ```bash - docker compose up -d - ``` - - **Note for Windows users:** If you're using older Docker Desktop versions, you might need to use `docker compose` (with a space) instead of `docker compose`. - -3. 
**Access the Applications** - - Once the containers are running, you can access: - - - Frontend: [http://localhost:3000](http://localhost:3000) - - Backend API: [http://localhost:8000](http://localhost:8000) - - API Documentation: [http://localhost:8000/docs](http://localhost:8000/docs) - - Electric-SQL: [http://localhost:5133](http://localhost:5133) - - pgAdmin: [http://localhost:5050](http://localhost:5050) - -## Docker Services Overview - -The Docker setup includes several services that work together: - -- **Backend**: FastAPI application server -- **Frontend**: Next.js web application -- **PostgreSQL (db)**: Database with pgvector extension -- **Redis**: Message broker for Celery -- **Electric-SQL**: Real-time sync service for database operations -- **Celery Worker**: Handles background tasks (document processing, indexing, etc.) -- **Celery Beat**: Scheduler for periodic tasks (enables scheduled connector indexing) - - The schedule interval can be configured using the `SCHEDULE_CHECKER_INTERVAL` environment variable in your backend `.env` file - - Default: checks every minute for connectors that need indexing -- **pgAdmin**: Database management interface - -All services start automatically with `docker compose up`. The Celery Beat service ensures that periodic indexing functionality works out of the box. - -## Using pgAdmin - -pgAdmin is included in the Docker setup to help manage your PostgreSQL database. To connect: - -1. Open pgAdmin at [http://localhost:5050](http://localhost:5050) -2. Login with the credentials from your `.env` file (default: admin@surfsense.com / surfsense) -3. Right-click "Servers" > "Create" > "Server" -4. In the "General" tab, name your connection (e.g., "SurfSense DB") -5. In the "Connection" tab: - - Host: `db` - - Port: `5432` - - Maintenance database: `surfsense` - - Username: `postgres` (or your custom POSTGRES_USER) - - Password: `postgres` (or your custom POSTGRES_PASSWORD) -6. 
Click "Save" to connect - -## Updating (Full Docker Compose) - -To update the full Docker Compose production setup to the latest version: - -```bash -# Pull latest changes -git pull - -# Rebuild and restart containers -docker compose up --build -d -``` - -Database migrations are applied automatically on startup. - -## Useful Docker Commands - -### Container Management - -- **Stop containers:** - - **Linux/macOS/Windows:** - - ```bash - docker compose down - ``` - -- **View logs:** - - **Linux/macOS/Windows:** - - ```bash - # All services - docker compose logs -f - - # Specific service - docker compose logs -f backend - docker compose logs -f frontend - docker compose logs -f db - ``` - -- **Restart a specific service:** - - **Linux/macOS/Windows:** - - ```bash - docker compose restart backend - ``` - -- **Execute commands in a running container:** - - **Linux/macOS/Windows:** - - ```bash - # Backend - docker compose exec backend python -m pytest - - # Frontend - docker compose exec frontend pnpm lint - ``` - ## Troubleshooting -- **Linux/macOS:** If you encounter permission errors, you may need to run the docker commands with `sudo`. -- **Windows:** If you see access denied errors, make sure you're running Command Prompt or PowerShell as Administrator. -- If ports are already in use, modify the port mappings in the `docker-compose.yml` file. -- For backend dependency issues, check the `Dockerfile` in the backend directory. -- For frontend dependency issues, check the `Dockerfile` in the frontend directory. -- **Windows-specific:** If you encounter line ending issues (CRLF vs LF), configure Git to handle line endings properly with `git config --global core.autocrlf true` before cloning the repository. - -## Next Steps - -Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and log in using your Google account. +- **Ports already in use** — Change the relevant `*_PORT` variable in `.env` and restart. 
+- **Permission errors on Linux** — You may need to prefix `docker` commands with `sudo`.
+- **Electric SQL not connecting** — Check `docker compose logs electric`. If it shows `domain does not exist: db`, ensure `ELECTRIC_DATABASE_URL` is not set to a stale value in `.env`.
+- **Real-time updates not working in browser** — Open DevTools → Console and look for `[Electric]` errors. Check that `NEXT_PUBLIC_ELECTRIC_URL` matches the running Electric SQL address.
+- **Line ending issues on Windows** — Run `git config --global core.autocrlf true` before cloning.
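
The "Ports already in use" fix in the troubleshooting list comes down to changing one variable in `.env` and restarting. As a rough sketch only, the helper below rewrites a single `KEY=value` entry in an env file. `set_env_var` is a hypothetical name that is not part of the SurfSense repository or its install script, and the sketch assumes simple, unquoted `KEY=value` lines, one per line.

```bash
#!/bin/sh
# Hypothetical helper (not part of SurfSense): set KEY=VALUE in an env file.
# Assumes one KEY=value entry per line and a plain KEY such as FRONTEND_PORT.
set_env_var() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line except an existing assignment of KEY (active or commented out).
  grep -v -E "^#? *${key}=" "$file" > "$tmp" || true
  # Append the new assignment at the end of the file.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its original permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_env_var .env FRONTEND_PORT 3001` followed by `docker compose up -d` would move the frontend off port 3000; the value 3001 is only an illustration. Removing the old line and appending the new one keeps the helper short, at the cost of moving the variable to the end of the file.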