---
title: Docker Compose
description: Manual Docker Compose setup for SurfSense
---

## Setup

```bash
git clone https://github.com/MODSetter/SurfSense.git
cd SurfSense/docker
cp .env.example .env
# Edit .env, at minimum set SECRET_KEY
docker compose up -d
```

After starting, access SurfSense at:

- **SurfSense**: [http://localhost:3929](http://localhost:3929)
- **Backend API**: [http://localhost:3929/api/v1](http://localhost:3929/api/v1)
- **Zero sync**: `ws://localhost:3929/zero`

---

## Configuration

All configuration lives in a single `docker/.env` file (or `surfsense/.env` if you used the install script). Copy `.env.example` to `.env` and edit the values you need.

### Required

| Variable | Description |
|----------|-------------|
| `SECRET_KEY` | JWT secret key. Generate with: `openssl rand -base64 32`. Auto-generated by the install script. |

### Core Settings

| Variable | Description | Default |
|----------|-------------|---------|
| `SURFSENSE_VERSION` | Image tag to deploy. Use `latest`, a clean version (e.g. `0.0.14`), or a specific build (e.g. `0.0.14.1`) | `latest` |
| `SURFSENSE_VARIANT` | Backend image variant. Leave empty for CPU, set `cuda` for CUDA 12.8, or `cuda126` for CUDA 12.6. | *(empty)* |
| `AUTH_TYPE` | Authentication method: `LOCAL` (email/password) or `GOOGLE` (OAuth) | `LOCAL` |
| `ETL_SERVICE` | Document parsing: `DOCLING` (local), `UNSTRUCTURED`, or `LLAMACLOUD` | `DOCLING` |
| `EMBEDDING_MODEL` | Embedding model for vector search | `sentence-transformers/all-MiniLM-L6-v2` |
| `TTS_SERVICE` | Text-to-speech provider for podcasts | `local/kokoro` |
| `STT_SERVICE` | Speech-to-text provider for audio files | `local/base` |
| `REGISTRATION_ENABLED` | Allow new user registrations | `TRUE` |

### Image Variants

SurfSense publishes CPU and CUDA backend image variants. The frontend image is not variant-specific.

| Backend tag | Use case | `SURFSENSE_VARIANT` |
|-------------|----------|---------------------|
| `:latest` | CPU-only default | *(empty)* |
| `:latest-cuda` | NVIDIA CUDA 12.8 backend image | `cuda` |
| `:latest-cuda126` | NVIDIA CUDA 12.6 backend image for older driver stacks | `cuda126` |

All backend variants are published for `linux/amd64` and `linux/arm64`. CUDA on `linux/arm64` is best-effort.

GPU acceleration needs two settings: `SURFSENSE_VARIANT` selects the CUDA image, and `COMPOSE_FILE` enables the GPU device overlay. The host must have the NVIDIA Container Toolkit installed.

### NVIDIA GPU Acceleration

For most NVIDIA systems, add these values to `.env` to use the CUDA 12.8 image:

```dotenv
SURFSENSE_VARIANT=cuda
COMPOSE_FILE=docker-compose.yml:docker-compose.gpu.yml
SURFSENSE_GPU_COUNT=1
```

Use `SURFSENSE_VARIANT=cuda126` for older NVIDIA driver stacks or older GPUs that need the CUDA 12.6 fallback image.

On Windows, use `;` instead of `:` in `COMPOSE_FILE` inside `.env`:

```dotenv
COMPOSE_FILE=docker-compose.yml;docker-compose.gpu.yml
```

To switch variants later, edit `SURFSENSE_VARIANT` and `COMPOSE_FILE` in `.env`, then run:

```bash
docker compose pull
docker compose up -d --wait
```

### Automatic Updates

Manual Docker Compose installs do not start Watchtower automatically. To enable external automatic updates, run Watchtower separately:

```bash
docker run -d --name watchtower \
  --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock \
  nickfedor/watchtower \
  --label-enable \
  --interval 86400
```

SurfSense containers are labeled for Watchtower, so `--label-enable` limits updates to the SurfSense services.

### Public URL and Ports

| Variable | Description | Default |
|----------|-------------|---------|
| `SURFSENSE_PUBLIC_URL` | Public origin used by the frontend, backend OAuth callbacks, and Zero browser URL | `http://localhost:3929` |
| `SURFSENSE_SITE_ADDRESS` | Caddy site address. `:80` means local plain HTTP; a hostname enables automatic HTTPS | `:80` |
| `LISTEN_HTTP_PORT` | Host port mapped to Caddy's HTTP listener | `3929` |
| `LISTEN_HTTPS_PORT` | Host port mapped to Caddy's HTTPS listener for domain mode | `443` |

SurfSense includes Caddy by default. The `frontend`, `backend`, and `zero-cache` containers are internal-only in the production compose file; the browser reaches them through Caddy path routing.

### Custom Domain / Automatic HTTPS

For a real domain, point DNS at the Docker host and set:

```dotenv
SURFSENSE_SITE_ADDRESS=surf.example.com
LISTEN_HTTP_PORT=80
LISTEN_HTTPS_PORT=443
CERT_EMAIL=you@example.com
SURFSENSE_PUBLIC_URL=https://surf.example.com
```

Caddy will issue and renew Let's Encrypt certificates automatically. Ports 80 and 443 must be reachable from the internet for the default HTTP-01 challenge.

| Variable | Description |
|----------|-------------|
| `CERT_EMAIL` | Optional ACME contact email |
| `CERT_ACME_CA` | ACME directory URL; use Let's Encrypt staging when testing cert issuance |
| `CERT_ACME_DNS` | DNS-01 challenge config; requires the custom Caddy build |
| `TRUSTED_PROXIES` | CIDR ranges trusted for forwarded client IP headers |
| `SURFSENSE_MAX_BODY_SIZE` | Upload limit enforced at the proxy |

### Bring Your Own Proxy

If you already run nginx, Traefik, Cloudflare Tunnel, or another ingress, you can comment out the `proxy` service and route traffic to the internal services with the same path contract:

| Public path | Upstream |
|-------------|----------|
| `/auth/*` | `backend:8000` |
| `/api/v1/*` | `backend:8000` |
| `/zero/*` | `zero-cache:4848` |
| `/*` | `frontend:3000` |

Alternative proxies must preserve WebSocket upgrades for `/zero`, avoid buffering streaming responses, allow long-running requests, and support large uploads.

For DNS-01 or wildcard certificates with Caddy, build `docker/proxy/Dockerfile` and set `CERT_ACME_DNS` for your DNS provider.
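
The path contract in the table above can be written down as a small lookup, which may help when translating it into another proxy's routing rules or when writing a smoke test. This is only an illustrative sketch: `route_upstream` is a hypothetical helper, not part of SurfSense, and treating a bare `/zero` (no trailing slash) as a zero-cache path is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with SurfSense): print the upstream
# that a public request path should be routed to, following the path
# contract table. More specific prefixes are checked before the catch-all.
route_upstream() {
  case "$1" in
    /auth/*)        echo "backend:8000" ;;
    /api/v1/*)      echo "backend:8000" ;;
    # Assumption: a bare /zero is also sent to zero-cache.
    /zero|/zero/*)  echo "zero-cache:4848" ;;
    *)              echo "frontend:3000" ;;
  esac
}
```

For example, `route_upstream /api/v1/anything` prints `backend:8000`, while `route_upstream /dashboard` falls through to `frontend:3000`.
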

### Zero-cache (Real-Time Sync)

Defaults work out of the box. Change `ZERO_ADMIN_PASSWORD` for security in production.

| Variable | Description | Default |
|----------|-------------|---------|
| `ZERO_ADMIN_PASSWORD` | Password for the zero-cache admin UI and `/statz` endpoint | `surfsense-zero-admin` |
| `ZERO_UPSTREAM_DB` | PostgreSQL connection URL for replication (must be a direct connection, not via pgbouncer) | *(built from DB_* vars)* |
| `ZERO_CVR_DB` | PostgreSQL connection URL for client view records | *(built from DB_* vars)* |
| `ZERO_CHANGE_DB` | PostgreSQL connection URL for replication log entries | *(built from DB_* vars)* |
| `ZERO_APP_PUBLICATIONS` | PostgreSQL publication restricting which tables are replicated (created by migration 116, verified by the `migrations` service before `zero-cache` starts) | `zero_publication` |
| `ZERO_NUM_SYNC_WORKERS` | Number of view-sync worker processes. Must be ≤ connection pool sizes | `4` |
| `ZERO_UPSTREAM_MAX_CONNS` | Max connections to upstream PostgreSQL for mutations | `20` |
| `ZERO_CVR_MAX_CONNS` | Max connections to the CVR database | `30` |

### Database

Defaults work out of the box. Change the credentials for security in production.

| Variable | Description | Default |
|----------|-------------|---------|
| `DB_USER` | PostgreSQL username | `surfsense` |
| `DB_PASSWORD` | PostgreSQL password | `surfsense` |
| `DB_NAME` | PostgreSQL database name | `surfsense` |
| `DB_HOST` | PostgreSQL host | `db` |
| `DB_PORT` | PostgreSQL port | `5432` |
| `DB_SSLMODE` | SSL mode: `disable`, `require`, `verify-ca`, `verify-full` | `disable` |
| `DATABASE_URL` | Full connection URL override. Use for managed databases (RDS, Supabase, etc.) | *(built from above)* |

### Authentication

| Variable | Description |
|----------|-------------|
| `GOOGLE_OAUTH_CLIENT_ID` | Google OAuth client ID (required if `AUTH_TYPE=GOOGLE`) |
| `GOOGLE_OAUTH_CLIENT_SECRET` | Google OAuth client secret (required if `AUTH_TYPE=GOOGLE`) |

Create credentials at the [Google Cloud Console](https://console.cloud.google.com/apis/credentials).

### External API Keys

| Variable | Description |
|----------|-------------|
| `UNSTRUCTURED_API_KEY` | [Unstructured.io](https://unstructured.io/) API key (required if `ETL_SERVICE=UNSTRUCTURED`) |
| `LLAMA_CLOUD_API_KEY` | [LlamaCloud](https://cloud.llamaindex.ai/) API key (required if `ETL_SERVICE=LLAMACLOUD`) |

### Connector OAuth Keys

Uncomment the connectors you want to use. Redirect URIs follow the single-origin pattern `${SURFSENSE_PUBLIC_URL}/api/v1/auth//connector/callback`. For local Docker defaults, that means `http://localhost:3929/api/v1/auth//connector/callback`.

| Connector | Variables |
|-----------|-----------|
| Google Drive / Gmail / Calendar | `GOOGLE_DRIVE_REDIRECT_URI`, `GOOGLE_GMAIL_REDIRECT_URI`, `GOOGLE_CALENDAR_REDIRECT_URI` |
| Notion | `NOTION_CLIENT_ID`, `NOTION_CLIENT_SECRET`, `NOTION_REDIRECT_URI` |
| Slack | `SLACK_CLIENT_ID`, `SLACK_CLIENT_SECRET`, `SLACK_REDIRECT_URI` |
| Discord | `DISCORD_CLIENT_ID`, `DISCORD_CLIENT_SECRET`, `DISCORD_BOT_TOKEN`, `DISCORD_REDIRECT_URI` |
| Atlassian (Jira & Confluence) | `ATLASSIAN_CLIENT_ID`, `ATLASSIAN_CLIENT_SECRET`, `JIRA_REDIRECT_URI`, `CONFLUENCE_REDIRECT_URI` |
| Linear | `LINEAR_CLIENT_ID`, `LINEAR_CLIENT_SECRET`, `LINEAR_REDIRECT_URI` |
| ClickUp | `CLICKUP_CLIENT_ID`, `CLICKUP_CLIENT_SECRET`, `CLICKUP_REDIRECT_URI` |
| Airtable | `AIRTABLE_CLIENT_ID`, `AIRTABLE_CLIENT_SECRET`, `AIRTABLE_REDIRECT_URI` |
| Microsoft (Teams & OneDrive) | `MICROSOFT_CLIENT_ID`, `MICROSOFT_CLIENT_SECRET`, `TEAMS_REDIRECT_URI`, `ONEDRIVE_REDIRECT_URI` |
| Dropbox | `DROPBOX_APP_KEY`, `DROPBOX_APP_SECRET`, `DROPBOX_REDIRECT_URI` |

### Messaging Channels

Configure these in the same `docker/.env` file when you want users to chat with SurfSense from external apps. See [Messaging Channels](/docs/messaging-channels) for full setup.

| Channel | Variables |
|---------|-----------|
| Telegram | `TELEGRAM_SHARED_BOT_TOKEN`, `TELEGRAM_SHARED_BOT_USERNAME`, `TELEGRAM_WEBHOOK_SECRET`, `GATEWAY_BASE_URL`, `GATEWAY_TELEGRAM_INTAKE_MODE` |
| WhatsApp | `GATEWAY_WHATSAPP_INTAKE_MODE`, `WHATSAPP_SHARED_BUSINESS_TOKEN`, `WHATSAPP_SHARED_PHONE_NUMBER_ID`, `WHATSAPP_SHARED_DISPLAY_PHONE_NUMBER`, `WHATSAPP_SHARED_WABA_ID`, `WHATSAPP_WEBHOOK_VERIFY_TOKEN`, `WHATSAPP_WEBHOOK_APP_SECRET` |
| Slack | `SLACK_CLIENT_ID`, `SLACK_CLIENT_SECRET`, `GATEWAY_SLACK_ENABLED`, `GATEWAY_SLACK_SIGNING_SECRET`, `GATEWAY_SLACK_REDIRECT_URI` |
| Discord | `DISCORD_CLIENT_ID`, `DISCORD_CLIENT_SECRET`, `DISCORD_BOT_TOKEN`, `GATEWAY_DISCORD_ENABLED`, `GATEWAY_DISCORD_REDIRECT_URI` |

### Observability (optional)

| Variable | Description |
|----------|-------------|
| `LANGSMITH_TRACING` | Enable LangSmith tracing (`true` / `false`) |
| `LANGSMITH_ENDPOINT` | LangSmith API endpoint |
| `LANGSMITH_API_KEY` | LangSmith API key |
| `LANGSMITH_PROJECT` | LangSmith project name |

### Advanced (optional)

| Variable | Description | Default |
|----------|-------------|---------|
| `SCHEDULE_CHECKER_INTERVAL` | How often to check for scheduled connector tasks (e.g. `5m`, `1h`) | `5m` |
| `RERANKERS_ENABLED` | Enable document reranking for improved search | `FALSE` |
| `RERANKERS_MODEL_NAME` | Reranker model name (e.g. `ms-marco-MiniLM-L-12-v2`) | |
| `RERANKERS_MODEL_TYPE` | Reranker model type (e.g. `flashrank`) | |
| `PAGES_LIMIT` | Max pages per user for ETL services | unlimited |

---

## Docker Services

| Service | Description |
|---------|-------------|
| `proxy` | Caddy reverse proxy; the only public ingress in production Docker |
| `db` | PostgreSQL with pgvector extension |
| `migrations` | Short-lived: runs `alembic upgrade head` and verifies `zero_publication`, then exits |
| `redis` | Message broker for Celery |
| `searxng` | Local privacy-respecting search backend |
| `backend` | FastAPI application server |
| `celery_worker` | Background task processing (document indexing, etc.) |
| `celery_beat` | Periodic task scheduler (connector sync) |
| `zero-cache` | Rocicorp Zero real-time sync (replicates Postgres to clients) |
| `frontend` | Next.js web application, internal behind Caddy |

All services start automatically with `docker compose up -d`.

### How startup ordering works

Schema migrations run as a dedicated `migrations` service that exits 0 on success and non-zero on failure. Every other backend-image service gates on it via `condition: service_completed_successfully`:

```text
db (healthy) ──▶ migrations (alembic upgrade head + verify zero_publication)
                     │
                     ├── exit 0 ─▶ backend ──▶ frontend
                     │             celery_worker
                     │             celery_beat
                     │             zero-cache ──▶ frontend
                     │
                     └── exit ≠ 0 ─▶ compose halts the rest of the stack
```

This guarantees `zero-cache` only starts after `zero_publication` exists in Postgres. Before this design, a silent migration failure would leave `zero-cache` crash-looping with `Unknown or invalid publications. Specified: [zero_publication]. Found: []`.

### Readiness vs liveness

The backend exposes two endpoints:

- `GET /health`: lightweight liveness probe (always returns 200 if the process is up).
- `GET /ready`: readiness probe that confirms `zero_publication` exists. Returns 503 if not.

The compose `backend.healthcheck` uses `/ready` so the container only reports `healthy` once the schema is actually usable by zero-cache.
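
To block a script until a single service reaches that state, one option is to poll the container's health status. The following is a minimal sketch, not something SurfSense ships: `wait_healthy` is a hypothetical helper, the 300-second default timeout and 5-second poll interval are arbitrary choices for this example, and it assumes it is run from the directory containing the compose file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wait until a compose service reports "healthy".
# Usage: wait_healthy <service> [timeout_seconds] [poll_interval_seconds]
# Returns 0 once the service is healthy, 1 if the timeout elapses first.
wait_healthy() {
  local service="$1" timeout="${2:-300}" interval="${3:-5}"
  local elapsed=0 cid status
  while [ "$elapsed" -lt "$timeout" ]; do
    # Resolve the service name to a container ID (empty if not created yet).
    cid="$(docker compose ps -q "$service")"
    if [ -n "$cid" ]; then
      # Health status is "starting", "healthy", or "unhealthy".
      status="$(docker inspect --format '{{.State.Health.Status}}' "$cid" 2>/dev/null)"
      if [ "$status" = "healthy" ]; then
        return 0
      fi
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}
```

A call such as `wait_healthy backend 600 || docker compose logs backend` would wait up to ten minutes and print the backend logs if the service never becomes healthy.
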

You can also monitor startup progress with `docker compose ps` (look for `(health: starting)` → `(healthy)`). The install script polls these states automatically and times out after 5 minutes if the stack does not converge.

---

## Useful Commands

```bash
# View logs (all services)
docker compose logs -f

# View logs for a specific service
docker compose logs -f backend

# Stop all services
docker compose down

# Restart a specific service
docker compose restart backend

# Stop and remove all containers + volumes (destructive!)
docker compose down -v
```

---

## Troubleshooting

- **Port already in use**: Change `LISTEN_HTTP_PORT` in `.env` and restart. In domain mode, use ports `80` and `443` so Caddy can complete certificate issuance.
- **Permission errors on Linux**: You may need to prefix `docker` commands with `sudo`.
- **Real-time updates not working**: Open DevTools → Console and check for WebSocket errors. In production Docker the expected URL is `${SURFSENSE_PUBLIC_URL}/zero`.
- **Line ending issues on Windows**: Run `git config --global core.autocrlf true` before cloning.

### Migration service exited non-zero

The `migrations` service exits non-zero in two cases:

1. `alembic upgrade head` failed (timeout or SQL error).
2. `alembic` succeeded but `zero_publication` is still missing from `pg_publication`.

Inspect the logs and the alembic state:

```bash
docker compose logs migrations
docker compose exec db psql -U surfsense -d surfsense \
  -c 'SELECT * FROM alembic_version;'
docker compose exec db psql -U surfsense -d surfsense \
  -c 'SELECT pubname FROM pg_publication;'
```

The default migration timeout is 900 seconds. Slow disks (Windows / WSL2) may need more. Set `MIGRATION_TIMEOUT` in `.env` to increase it.

### Zero-cache stuck on `Unknown or invalid publications`

Symptom (in `docker compose logs zero-cache`):

```text
Error: Unknown or invalid publications. Specified: [zero_publication]. Found: []
```

This means `zero-cache` started before `zero_publication` was created, or the publication does not match SurfSense's canonical Zero shape. With the current compose files this should be impossible: the `migrations` service blocks `zero-cache` from starting and verifies the publication before exiting successfully. If you see it, either your stack predates the fix or you brought up `zero-cache` manually with `docker compose up zero-cache` before the `migrations` service ran.

Recovery:

```bash
docker compose down
docker volume rm surfsense-zero-cache   # wipe half-built SQLite replica
docker compose up -d                    # migrations runs first, then zero-cache
```

### Zero-cache crashes with `_zero.tableMetadata` errors

This indicates a half-initialized SQLite replica left behind by a previous crash. Zero's own event triggers and `ZERO_AUTO_RESET` handle schema and replication halts automatically. If the local SQLite replica is wedged, run the recovery commands above to wipe `surfsense-zero-cache`; zero-cache will re-sync from Postgres on the next start.

### Ensuring `wal_level = logical`

Logical replication is required by zero-cache. The bundled `docker/postgresql.conf` sets `wal_level = logical` automatically. If you swap in your own config or use a managed Postgres, confirm with:

```bash
docker compose exec db psql -U surfsense -d surfsense \
  -c "SHOW wal_level;"
```

### Using `docker-compose.deps-only.yml`

`docker-compose.deps-only.yml` runs only the dependencies (Postgres, Redis, SearXNG, zero-cache) on Docker while the backend and frontend run on the host. Because there is no backend container in this stack, there is no `migrations` service either, and you must run alembic on the host **before** bringing the stack up:

```bash
cd surfsense_backend
uv run alembic upgrade head
cd ../docker
docker compose -f docker-compose.deps-only.yml up -d
```

If you skip the alembic step, `zero-cache` will crash-loop with `Unknown or invalid publications. Specified: [zero_publication]`.
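
Since nothing verifies the publication for you in this mode, a quick manual check can help. The sketch below is illustrative only: `has_publication` is a hypothetical helper that reads a list of publication names on standard input (one per line, as produced by `psql -tA`) and succeeds only if the expected name appears as a whole line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given publication name appears as a
# complete line on stdin. Defaults to SurfSense's zero_publication.
# Usage: <list of pubnames> | has_publication [name]
has_publication() {
  local name="${1:-zero_publication}"
  # Strip carriage returns, then match the whole line as a fixed string
  # so that e.g. "zero_publication_old" does not count as a match.
  tr -d '\r' | grep -qxF -- "$name"
}
```

Assuming the Postgres service in the deps-only file is named `db` and uses the default credentials, it could be fed with `docker compose -f docker-compose.deps-only.yml exec -T db psql -U surfsense -d surfsense -tAc 'SELECT pubname FROM pg_publication;' | has_publication`, which exits non-zero when the publication is missing.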