Merge pull request #1377 from AnishSarkar22/feat/e2e-testing-ci

feat: add E2E CI and harden Docker build migrations
This commit is contained in:
Rohan Verma 2026-05-15 04:47:26 -07:00 committed by GitHub
commit 4db3cf7fd5
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
45 changed files with 1733 additions and 495 deletions

View file

@ -1,48 +1,48 @@
# Backend E2E Test Harness
# Backend E2E Harness
Strict fakes + alternative entrypoints used **only** by Playwright E2E.
Excluded from the production Docker image via `.dockerignore`.
This directory contains the test-only backend entrypoints and fakes used by
Playwright. They are not part of the production image: `.dockerignore` excludes
`tests/`, and the E2E Docker stage copies this directory through a separate
build context.
## Files
| Path | Role |
| -------------------------------- | ------------------------------------------------------------------------------- |
| `run_backend.py` | FastAPI entrypoint that hijacks `sys.modules` before importing `app.app:app` |
| `run_celery.py` | Celery worker entrypoint with the same hijack + patch logic |
| `middleware/scenario.py` | `X-E2E-Scenario` header → ContextVar (read by fakes) |
| `fakes/composio_module.py` | Strict drop-in for the `composio` package; raises on unknown surface |
| `fakes/llm.py` | `fake_get_user_long_context_llm` returning a `FakeListChatModel` |
| `fakes/embeddings.py` | Deterministic 0.1-vector `embed_text` / `embed_texts` |
| `fakes/fixtures/drive_files.json`| Canned Drive listings + file contents (incl. canary tokens) |
| Path | Purpose |
| --- | --- |
| `run_backend.py` | Starts FastAPI after installing the test fakes into `sys.modules`. |
| `run_celery.py` | Starts the Celery worker with the same fake setup. |
| `middleware/scenario.py` | Reads `X-E2E-Scenario` into a request-scoped context var. |
| `fakes/composio_module.py` | Fake `composio` package used by connector flows. |
| `fakes/llm.py` | Fake chat model factory. |
| `fakes/embeddings.py` | Deterministic embedding helpers. |
| `fakes/fixtures/drive_files.json` | Drive fixture data and canary file contents. |
## Why a sys.modules hijack?
## Why the import hook exists
Production code does `from composio import Composio` at module load
time. By the time the FastAPI app object exists, that binding has
already been resolved. The hijack runs **before** any `app.*` import,
so the binding resolves to our strict fake. No production source
changes; fakes are physically excluded from production images.
Some production modules import SDK clients at module load time, for example
`from composio import Composio`. By the time `app.app` has been imported, those
bindings are already fixed.
Belt + suspenders + no internet: the strict `__getattr__` in every
fake raises `NotImplementedError` if a future production code path
introduces a new SDK call. CI also sets `HTTPS_PROXY=http://127.0.0.1:1`
plus sentinel API keys so any leaked outbound HTTP fails immediately.
The E2E entrypoints install fake modules in `sys.modules` before importing any
`app.*` module. That lets the normal production code run while SDK calls resolve
to local fakes.
## Adding a new fake
The fakes should fail loudly. If production starts using a new SDK method that
the fake does not implement, add that method to the fake instead of letting the
test call the real service.
1. Create `fakes/<sdk>_module.py` modelled on `composio_module.py`.
2. In `run_backend.py` and `run_celery.py`, register
`sys.modules["<sdk>"] = _fake_<sdk>` before the `from app.app import app`
line.
3. If the new fake needs scenario branching, read from
## Adding a fake
1. Add `fakes/<sdk>_module.py`.
2. Register it in both `run_backend.py` and `run_celery.py` before importing
`app.app` or `app.celery_app`.
3. If the fake needs per-test behavior, read the current scenario from
`tests.e2e.middleware.scenario.current_scenario()`.
## Reused by backend integration tests
## Shared with backend integration tests
The strict fakes are not only for Playwright. Backend route integration
tests can import the same fake before importing `app.app`, so Composio
route tests exercise production route code without touching the real
SDK:
Backend integration tests can use the same fakes when they need production route
code without the real SDK:
```python
from tests.e2e.fakes import composio_module as _fake_composio
@ -50,20 +50,93 @@ sys.modules["composio"] = _fake_composio
from app.app import app
```
See `surfsense_backend/tests/integration/composio/conftest.py` for the
current pattern.
See `surfsense_backend/tests/integration/composio/conftest.py` for the current
pattern.
## Running locally
The recommended local flow runs only Postgres and Redis in Docker, and the
backend + Celery worker on the host. No `.env` file is required: both
entrypoints `setdefault` every variable they need (DB URL, Redis URL,
sentinel API keys, etc.) to values that match `docker-compose.deps-only.yml`.
### One-time setup
From `surfsense_web/`:
```bash
cd surfsense_backend
pnpm install
pnpm exec playwright install --with-deps chromium
```
### Each run
**1. Bring up Postgres + Redis** from the repo root (the other deps-only
services (SearXNG, Zero, pgAdmin) are not needed for E2E):
```bash
docker compose -f docker/docker-compose.deps-only.yml up -d db redis
```
**2. Start the backend** in `surfsense_backend/`, terminal A:
```bash
uv sync
uv run alembic upgrade head
uv run python tests/e2e/run_backend.py
# in a second shell:
```
**3. Start the Celery worker** in `surfsense_backend/`, terminal B:
```bash
uv run python tests/e2e/run_celery.py
```
Then in `surfsense_web`:
**4. Register the Playwright user**:
```bash
pnpm test:e2e
curl -X POST http://localhost:8000/auth/register \
-H "Content-Type: application/json" \
-d '{"email":"e2e-test@surfsense.net","password":"E2eTestPassword123!"}'
```
**5. Run Playwright** from `surfsense_web/`, terminal C:
```bash
pnpm test:e2e # dev server (fast iteration)
pnpm test:e2e:headed # show the browser
pnpm test:e2e:ui # Playwright UI mode
pnpm test:e2e:prod # build + start (matches CI exactly)
```
`playwright.config.ts` and the run scripts share defaults, so this works on a
fresh checkout. Set `PLAYWRIGHT_TEST_EMAIL`, `PLAYWRIGHT_TEST_PASSWORD`,
`NEXT_PUBLIC_FASTAPI_BACKEND_URL`, or any backend env (e.g. `DATABASE_URL`)
only when pointing tests at a different stack.
### Cleanup
```bash
docker compose -f docker/docker-compose.deps-only.yml down
```
Add `-v` to also wipe the Postgres volume.
### Hermetic alternative (matches CI)
To reproduce the CI environment exactly — backend and Celery in containers,
network egress denied at L3 — replace steps 13 with:
```bash
docker compose -f docker/docker-compose.e2e.yml up -d --build --wait
```
Then run steps 4 (curl register) and 5 (`pnpm test:e2e:prod`) as above. Tear
down with:
```bash
docker compose -f docker/docker-compose.e2e.yml down -v --remove-orphans
```
This builds the ~9 GB `surfsense-e2e-backend:local` image, so the deps-only
flow above is faster for day-to-day development.

View file

@ -0,0 +1,66 @@
"""Test-only token mint endpoint for the E2E backend entrypoint.
Mounted by ``tests/e2e/run_backend.py`` so Playwright can authenticate
the seeded e2e user without hitting ``/auth/jwt/login`` (rate-limited
to 5/min/IP in production). NEVER ships to production: this whole
``tests/`` tree is excluded from the production Docker image by
``surfsense_backend/.dockerignore``.
Authn: shared secret in ``X-E2E-Mint-Secret``. Same value is set on the
backend container env (``docker/docker-compose.e2e.yml``) and exported
to the Playwright runner (``.github/workflows/e2e-tests.yml``).
"""
from __future__ import annotations
import logging
import os
from fastapi import APIRouter, FastAPI, Header, HTTPException
from pydantic import BaseModel
from sqlalchemy import select
from app.db import User, async_session_maker
from app.users import get_jwt_strategy
_logger = logging.getLogger("surfsense.e2e.auth_mint")
class MintRequest(BaseModel):
email: str = "e2e-test@surfsense.net"
class MintResponse(BaseModel):
access_token: str
token_type: str = "bearer"
def _expected_secret() -> str:
return os.environ.get("E2E_MINT_SECRET", "local-e2e-mint-secret-not-for-production")
router = APIRouter(prefix="/__e2e__", tags=["__e2e__"])
@router.post("/auth/token", response_model=MintResponse)
async def mint_test_token(
body: MintRequest,
x_e2e_mint_secret: str = Header(..., alias="X-E2E-Mint-Secret"),
) -> MintResponse:
if x_e2e_mint_secret != _expected_secret():
raise HTTPException(status_code=403, detail="invalid e2e mint secret")
async with async_session_maker() as session:
result = await session.execute(select(User).where(User.email == body.email))
user = result.scalar_one_or_none()
if user is None:
raise HTTPException(
status_code=404, detail=f"e2e user {body.email!r} not seeded"
)
token = await get_jwt_strategy().write_token(user)
return MintResponse(access_token=token)
def install(app: FastAPI) -> None:
"""Mount the test-only mint router onto the given FastAPI app."""
app.include_router(router)
_logger.warning("[e2e] mounted POST /__e2e__/auth/token (test-only token mint)")

View file

@ -0,0 +1,141 @@
"""Stub DoclingService.process_document for E2E.
The real ``DoclingService.process_document`` calls
``DocumentConverter.convert(file_path)`` which lazily downloads the
``docling-project/docling-layout-heron`` model from Hugging Face Hub.
The hermetic E2E container sets ``HF_HUB_OFFLINE=1`` (see
``docker/docker-compose.e2e.yml``), so that download fails with
``LocalEntryNotFoundError`` and the indexing Celery task retries until
the Playwright test hits its ~4-minute step timeout. In CI that is the
difference between the suite finishing and the 30-minute job timeout
killing the run before any report can upload.
Stubbing ``process_document`` bypasses ``DocumentConverter.convert()``
entirely. ``DoclingService.__init__`` is intentionally left untouched
because constructing ``DocumentConverter(...)`` is cheap and offline
it is only ``.convert()`` that triggers the offline-model download.
Every canary PDF under ``tests/e2e/fakes/fixtures/binary/`` is produced
by ``generate_canary_pdfs.py`` and embeds its canary token as plain
``(text) Tj`` PDF text operators. Extracting those operators gives us
the canary string back, which is what the Playwright assertions look
for in the resulting Document row.
"""
from __future__ import annotations
import logging
import re
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
# Matches the `(escaped text) Tj` text-show operator emitted by
# generate_canary_pdfs.py. Inside the parens, the escape rules are:
# \\ -> backslash
# \( -> literal (
# \) -> literal )
# The character class [^\\()] consumes any non-escape byte; \\. consumes
# an escape sequence. Sufficient for our synthetic fixtures.
_TJ_PATTERN = re.compile(rb"\(((?:[^\\()]|\\.)*)\)\s*Tj")
def _extract_text_from_synthetic_pdf(file_path: str) -> str:
"""Pull every ``(text) Tj`` payload out of a fixture PDF in order.
Returns an empty string if the file cannot be read. We do not try to
handle arbitrary PDFs because the fake is only ever invoked against
fixtures we generate ourselves.
"""
try:
data = Path(file_path).read_bytes()
except OSError as exc:
logger.warning("[fake-docling] could not read %s: %s", file_path, exc)
return ""
lines: list[str] = []
for match in _TJ_PATTERN.finditer(data):
raw = match.group(1)
# Order-sensitive unescape via sentinel: protect `\\` first so
# the subsequent `\(` / `\)` passes do not corrupt it.
text = (
raw.replace(rb"\\", b"\x00")
.replace(rb"\(", b"(")
.replace(rb"\)", b")")
.replace(b"\x00", b"\\")
)
try:
lines.append(text.decode("utf-8"))
except UnicodeDecodeError:
lines.append(text.decode("latin-1"))
return "\n".join(lines)
async def fake_process_document(
self,
file_path: str,
filename: str | None = None,
) -> dict[str, Any]:
"""Drop-in replacement for ``DoclingService.process_document``.
Returns the same dict shape as the production method so callers
(``app/etl_pipeline/parsers/docling.py``) can keep reading
``result["content"]`` without changes.
"""
extracted = _extract_text_from_synthetic_pdf(file_path)
display_name = filename or Path(file_path).name
if extracted:
content = f"# {display_name}\n\n{extracted}\n"
else:
# Empty fallback so the indexing pipeline does not error out on
# an unexpected payload. A failing canary assertion is a much
# clearer failure mode than a hard parser exception.
content = (
f"# {display_name}\n\n(empty docling fake — no text-show operators found)\n"
)
logger.info(
"[fake-docling] returning %d chars for %s",
len(content),
display_name,
)
return {
"content": content,
"full_text": content,
"service_used": "docling-fake",
"status": "success",
"processing_notes": "e2e fake DoclingService — no real PDF parsing",
}
def install(patches: list[Any]) -> None:
"""Patch ``DoclingService.process_document`` at the class level.
Patching the class method (rather than each call site) is correct
here because every consumer goes through
``create_docling_service()`` ``DoclingService()`` instance method
dispatch, so the descriptor protocol picks up our replacement. There
is exactly one such consumer today
(``app/etl_pipeline/parsers/docling.py``), but patching the class is
future-proof.
Fails loud rather than warning, because a silent passthrough means
real Docling + ``HF_HUB_OFFLINE=1`` = 4 minutes of CI hang per test.
"""
from unittest.mock import patch as _patch
target = "app.services.docling_service.DoclingService.process_document"
try:
p = _patch(target, fake_process_document)
p.start()
patches.append(p)
logger.info("[fake-docling] patched %s", target)
except (ModuleNotFoundError, AttributeError) as exc:
raise RuntimeError(
f"Could not patch Docling binding {target!r}: {exc!s}. "
f"Update surfsense_backend/tests/e2e/fakes/docling_service.py "
f"to point at the new binding site."
) from exc

View file

@ -0,0 +1,71 @@
# Synthetic Global LLM configuration for E2E ONLY.
#
# Why this file exists:
# surfsense_backend/app/config/global_llm_config.yaml is gitignored
# (operators ship real API keys there). In CI that file does not exist,
# so app.config.load_global_llm_configs() returns [], every chat-stream
# test fails fast with "No usable global LLM configs are available for
# Auto mode" raised by auto_model_pin_service._global_candidates().
#
# What this file does:
# tests/e2e/run_backend.py and tests/e2e/run_celery.py copy this file
# to app/config/global_llm_config.yaml at startup, BEFORE app.config
# is imported. The copy lives only inside the E2E Docker container.
#
# Why a fake api_key is safe:
# tests.e2e.fakes.chat_llm patches
# app.tasks.chat.stream_new_chat.create_chat_litellm_from_agent_config
# app.tasks.chat.stream_new_chat.create_chat_litellm_from_config
# so the resolved auto-pin id is never sent to a real LLM provider.
# The values below only need to pass
# auto_model_pin_service._is_usable_global_config()
# which requires id / model_name / provider / api_key all truthy.
#
# Why TWO entries (premium + free):
# auto_model_pin_service.resolve_or_get_pinned_llm_config_id() splits
# candidates by billing_tier based on _is_premium_eligible(user):
# premium_eligible == True -> keeps only tier=="premium" configs
# premium_eligible == False -> keeps only tier!="premium" configs
# A single-tier fixture would fail one of the two branches with
# "Auto mode could not find an eligible LLM config for this user and
# quota state". Shipping one of each guarantees every quota state
# resolves to a viable pin in E2E.
router_settings:
routing_strategy: "simple-shuffle"
num_retries: 0
allowed_fails: 1
cooldown_time: 1
global_llm_configs:
- id: -9001
name: "E2E Fake Auto Model (premium)"
billing_tier: "premium"
anonymous_enabled: false
seo_enabled: false
quality_score: 1.0
provider: "OPENAI"
model_name: "fake-e2e-model-premium"
api_key: "fake-e2e-api-key-not-for-production"
supports_image_input: false
quota_reserve_tokens: 1024
rpm: 1000
tpm: 100000
litellm_params:
model: "openai/fake-e2e-model-premium"
- id: -9002
name: "E2E Fake Auto Model (free)"
billing_tier: "free"
anonymous_enabled: false
seo_enabled: false
quality_score: 1.0
provider: "OPENAI"
model_name: "fake-e2e-model-free"
api_key: "fake-e2e-api-key-not-for-production"
supports_image_input: false
quota_reserve_tokens: 1024
rpm: 1000
tpm: 100000
litellm_params:
model: "openai/fake-e2e-model-free"

View file

@ -23,15 +23,12 @@ Usage:
from __future__ import annotations
import asyncio
import logging
import os
import sys
# ---------------------------------------------------------------------------
# 1) Hijack sys.modules BEFORE any production import.
# Production: composio_service.py:11 does `from composio import Composio`.
# With this hijack in place, that import resolves to our strict fake.
# ---------------------------------------------------------------------------
import uvicorn
# Make the surfsense_backend root importable as a top-level package so
# `import tests.e2e.fakes...` works regardless of how the entrypoint is
@ -42,97 +39,175 @@ _BACKEND_ROOT = os.path.abspath(os.path.join(_THIS_DIR, "..", ".."))
if _BACKEND_ROOT not in sys.path:
sys.path.insert(0, _BACKEND_ROOT)
import tests.e2e.fakes.composio_module as _fake_composio # noqa: E402
import tests.e2e.fakes.notion_module as _fake_notion # noqa: E402
sys.modules["composio"] = _fake_composio
sys.modules["notion_client"] = _fake_notion
sys.modules["notion_client.errors"] = _fake_notion.errors
# ---------------------------------------------------------------------------
# 2) Standard logging + dotenv so the rest of the app behaves like main.py.
# ---------------------------------------------------------------------------
from dotenv import load_dotenv # noqa: E402
load_dotenv()
os.environ.setdefault("ATLASSIAN_CLIENT_ID", "fake-atlassian-client-id")
os.environ.setdefault("ATLASSIAN_CLIENT_SECRET", "fake-atlassian-client-secret")
os.environ.setdefault(
"CONFLUENCE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/confluence/connector/callback",
)
os.environ.setdefault("NOTION_CLIENT_ID", "fake-notion-client-id")
os.environ.setdefault("NOTION_CLIENT_SECRET", "fake-notion-client-secret")
os.environ.setdefault(
"NOTION_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/notion/connector/callback",
)
os.environ.setdefault("MICROSOFT_CLIENT_ID", "fake-microsoft-client-id")
os.environ.setdefault("MICROSOFT_CLIENT_SECRET", "fake-microsoft-client-secret")
os.environ.setdefault(
"ONEDRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/onedrive/connector/callback",
)
os.environ.setdefault("DROPBOX_APP_KEY", "fake-dropbox-app-key")
os.environ.setdefault("DROPBOX_APP_SECRET", "fake-dropbox-app-secret")
os.environ.setdefault(
"DROPBOX_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/dropbox/connector/callback",
)
os.environ["SLACK_CLIENT_ID"] = "fake-slack-mcp-client-id"
os.environ["SLACK_CLIENT_SECRET"] = "fake-slack-mcp-client-secret"
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger("surfsense.e2e.backend")
logger.warning(
"*** SURFSENSE E2E BACKEND ENTRYPOINT — fake Composio + LLM + embeddings ***"
)
# ---------------------------------------------------------------------------
# 3) Now import the production app. Every module in app.* loads here,
# creating their bindings (some of which we will patch in step 4).
# ---------------------------------------------------------------------------
# ---------------------------------------------------------------------------
# 4) Patch LLM + embedding bindings at every consumer site.
# Composio is already covered by the sys.modules hijack in step 1.
# ---------------------------------------------------------------------------
from unittest.mock import patch # noqa: E402
from app.app import app # noqa: E402
from tests.e2e.fakes import ( # noqa: E402
clickup_module as _fake_clickup_module,
confluence_indexer as _fake_confluence_indexer,
confluence_oauth as _fake_confluence_oauth,
dropbox_api as _fake_dropbox_api,
embeddings as _fake_embeddings,
jira_module as _fake_jira_module,
linear_module as _fake_linear_module,
mcp_oauth_runtime as _fake_mcp_oauth_runtime,
mcp_runtime as _fake_mcp_runtime,
native_google as _fake_native_google,
notion_module as _fake_notion_module,
onedrive_graph as _fake_onedrive_graph,
slack_module as _fake_slack_module,
)
from tests.e2e.fakes.chat_llm import ( # noqa: E402
fake_create_chat_litellm_from_agent_config,
fake_create_chat_litellm_from_config,
)
from tests.e2e.fakes.llm import fake_get_user_long_context_llm # noqa: E402
# Patches started during bootstrap are kept alive for the lifetime of the
# process. We never call .stop() on them.
_active_patches: list = []
def _hijack_external_sdks() -> None:
"""Replace composio + notion_client in sys.modules.
Production does ``from composio import Composio`` and
``import notion_client`` at import time. With this hijack in place,
those imports resolve to our strict fakes.
MUST run before _import_production_app().
"""
import tests.e2e.fakes.composio_module as _fake_composio
import tests.e2e.fakes.notion_module as _fake_notion
sys.modules["composio"] = _fake_composio
sys.modules["notion_client"] = _fake_notion
sys.modules["notion_client.errors"] = _fake_notion.errors
def _load_dotenv_and_set_env_defaults() -> None:
"""Load .env and set every env var the production config reads on import.
MUST run before _import_production_app(), since app.config consumes
these values at import time.
"""
from dotenv import load_dotenv
load_dotenv()
os.environ.setdefault(
"DATABASE_URL",
"postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense",
)
os.environ.setdefault("CELERY_BROKER_URL", "redis://localhost:6379/0")
os.environ.setdefault("CELERY_RESULT_BACKEND", "redis://localhost:6379/0")
os.environ.setdefault("REDIS_APP_URL", "redis://localhost:6379/0")
os.environ.setdefault("CELERY_TASK_DEFAULT_QUEUE", "surfsense")
os.environ.setdefault("SECRET_KEY", "local-e2e-secret-not-for-production")
os.environ.setdefault("AUTH_TYPE", "LOCAL")
os.environ.setdefault("REGISTRATION_ENABLED", "TRUE")
os.environ.setdefault("ETL_SERVICE", "DOCLING")
os.environ.setdefault("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
os.environ.setdefault("NEXT_FRONTEND_URL", "http://localhost:3000")
# Sentinel keys — fakes never read them; turns leaked real calls into 401s.
os.environ.setdefault("COMPOSIO_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("COMPOSIO_ENABLED", "TRUE")
os.environ.setdefault("OPENAI_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("ANTHROPIC_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("LITELLM_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("ATLASSIAN_CLIENT_ID", "fake-atlassian-client-id")
os.environ.setdefault("ATLASSIAN_CLIENT_SECRET", "fake-atlassian-client-secret")
os.environ.setdefault(
"CONFLUENCE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/confluence/connector/callback",
)
os.environ.setdefault("NOTION_CLIENT_ID", "fake-notion-client-id")
os.environ.setdefault("NOTION_CLIENT_SECRET", "fake-notion-client-secret")
os.environ.setdefault(
"NOTION_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/notion/connector/callback",
)
os.environ.setdefault("MICROSOFT_CLIENT_ID", "fake-microsoft-client-id")
os.environ.setdefault("MICROSOFT_CLIENT_SECRET", "fake-microsoft-client-secret")
os.environ.setdefault(
"ONEDRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/onedrive/connector/callback",
)
os.environ.setdefault("DROPBOX_APP_KEY", "fake-dropbox-app-key")
os.environ.setdefault("DROPBOX_APP_SECRET", "fake-dropbox-app-secret")
os.environ.setdefault(
"DROPBOX_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/dropbox/connector/callback",
)
# Native Google OAuth — fake Flow in tests.e2e.fakes.native_google
# raises "Fake Google Flow requires redirect_uri." if these are empty,
# so connector/add routes return 500 in CI where no .env supplies them.
os.environ.setdefault(
"GOOGLE_DRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/drive/connector/callback",
)
os.environ.setdefault(
"GOOGLE_GMAIL_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/gmail/connector/callback",
)
os.environ.setdefault(
"GOOGLE_CALENDAR_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/calendar/connector/callback",
)
os.environ["SLACK_CLIENT_ID"] = "fake-slack-mcp-client-id"
os.environ["SLACK_CLIENT_SECRET"] = "fake-slack-mcp-client-secret"
def _install_synthetic_global_llm_config() -> None:
"""Materialise a fake ``app/config/global_llm_config.yaml`` for E2E.
The real file is gitignored (production operators ship their own with
real API keys), so a fresh CI checkout has no YAML at the path
``app.config.load_global_llm_configs()`` reads. With an empty
``GLOBAL_LLM_CONFIGS`` list, ``auto_model_pin_service`` raises
``"No usable global LLM configs are available for Auto mode"`` on
every chat-stream request.
We copy the synthetic fixture from ``tests/e2e/fixtures/`` into the
production-expected location BEFORE ``_import_production_app()`` so
``app.config`` picks it up on import. Production code is untouched
this is purely a test-time scaffold.
Only installs when the destination is missing. A developer running
the E2E entrypoint locally keeps their real ``global_llm_config.yaml``
intact (the patched ``create_chat_litellm_from_*`` factories make the
actual model values irrelevant either way).
MUST run before _import_production_app().
"""
import shutil
src = os.path.join(_THIS_DIR, "fixtures", "global_llm_config.yaml")
dst = os.path.join(_BACKEND_ROOT, "app", "config", "global_llm_config.yaml")
if not os.path.exists(src):
raise RuntimeError(
f"E2E synthetic global LLM config fixture missing at {src!r}. "
f"This file is checked into tests/e2e/fixtures/ — if it has gone "
f"missing, restore it from VCS before running the E2E entrypoint."
)
if os.path.exists(dst):
logger.info(
"[e2e-global-llm-config] %s already exists; leaving it alone "
"(local dev config preserved)",
dst,
)
return
os.makedirs(os.path.dirname(dst), exist_ok=True)
shutil.copyfile(src, dst)
logger.info("[e2e-global-llm-config] installed %s -> %s", src, dst)
def _import_production_app():
"""Import and return the production FastAPI app.
Every module under ``app.*`` loads here, creating their bindings.
The LLM/embedding factories captured at this point will be replaced
by patches in _patch_llm_bindings() below.
"""
from app.app import app as production_app
return production_app
def _patch_llm_bindings() -> None:
"""Replace LLM factories at every known binding site."""
from unittest.mock import patch
from tests.e2e.fakes.chat_llm import (
fake_create_chat_litellm_from_agent_config,
fake_create_chat_litellm_from_config,
)
from tests.e2e.fakes.llm import fake_get_user_long_context_llm
targets = [
"app.services.llm_service.get_user_long_context_llm",
"app.tasks.connector_indexers.confluence_indexer.get_user_long_context_llm",
@ -190,38 +265,90 @@ def _patch_llm_bindings() -> None:
logger.warning("[fake-chat-llm] could not patch %s: %s.", target, exc)
_patch_llm_bindings()
_fake_embeddings.install(_active_patches)
_fake_confluence_oauth.install(_active_patches)
_fake_confluence_indexer.install(_active_patches)
_fake_native_google.install(_active_patches)
_fake_onedrive_graph.install(_active_patches)
_fake_dropbox_api.install(_active_patches)
_fake_notion_module.install(_active_patches)
_fake_linear_module.install(_active_patches)
_fake_jira_module.install(_active_patches)
_fake_clickup_module.install(_active_patches)
_fake_mcp_runtime.install(_active_patches)
_fake_mcp_oauth_runtime.install(_active_patches)
_fake_slack_module.install(_active_patches)
def _install_runtime_fakes() -> None:
"""Run each fake's install() against the active patch stack."""
from tests.e2e.fakes import (
clickup_module as _fake_clickup_module,
confluence_indexer as _fake_confluence_indexer,
confluence_oauth as _fake_confluence_oauth,
docling_service as _fake_docling_service,
dropbox_api as _fake_dropbox_api,
embeddings as _fake_embeddings,
jira_module as _fake_jira_module,
linear_module as _fake_linear_module,
mcp_oauth_runtime as _fake_mcp_oauth_runtime,
mcp_runtime as _fake_mcp_runtime,
native_google as _fake_native_google,
notion_module as _fake_notion_module,
onedrive_graph as _fake_onedrive_graph,
slack_module as _fake_slack_module,
)
_fake_embeddings.install(_active_patches)
_fake_docling_service.install(_active_patches)
_fake_confluence_oauth.install(_active_patches)
_fake_confluence_indexer.install(_active_patches)
_fake_native_google.install(_active_patches)
_fake_onedrive_graph.install(_active_patches)
_fake_dropbox_api.install(_active_patches)
_fake_notion_module.install(_active_patches)
_fake_linear_module.install(_active_patches)
_fake_jira_module.install(_active_patches)
_fake_clickup_module.install(_active_patches)
_fake_mcp_runtime.install(_active_patches)
_fake_mcp_oauth_runtime.install(_active_patches)
_fake_slack_module.install(_active_patches)
# ---------------------------------------------------------------------------
# 5) Mount test-only middleware. Production never reaches this code.
# ---------------------------------------------------------------------------
def _install_test_only_app_extensions(app) -> None:
"""Mount test-only middleware + the /__e2e__ token mint router.
from tests.e2e.middleware.scenario import ScenarioMiddleware # noqa: E402
POST /__e2e__/auth/token bypasses /auth/jwt/login's 5/min/IP rate
limit so Playwright workers can authenticate without thrashing the
production auth surface. See tests/e2e/auth_mint.py.
"""
from tests.e2e.auth_mint import install as install_e2e_mint
from tests.e2e.middleware.scenario import ScenarioMiddleware
app.add_middleware(ScenarioMiddleware)
app.add_middleware(ScenarioMiddleware)
install_e2e_mint(app)
# ---------------------------------------------------------------------------
# 6) Start uvicorn, mirroring main.py's behaviour.
# ---------------------------------------------------------------------------
def _bootstrap():
"""Run the full E2E bootstrap and return the production FastAPI app.
import asyncio # noqa: E402
Ordering is load-bearing:
1) Hijack composio + notion_client in sys.modules.
2) Load .env + set env defaults (app.config reads env on import).
3) Configure logging.
4) Materialise the synthetic global_llm_config.yaml so Auto-mode
pin resolution finds at least one usable candidate.
5) Import production app (which transitively imports the now-faked
external SDKs and reads the env defaults + YAML).
6) Patch LLM / embedding bindings at every consumer site.
7) Mount test-only middleware + /__e2e__ routes onto the app.
"""
_hijack_external_sdks()
_load_dotenv_and_set_env_defaults()
import uvicorn # noqa: E402
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger.warning(
"*** SURFSENSE E2E BACKEND ENTRYPOINT — fake Composio + LLM + embeddings ***"
)
_install_synthetic_global_llm_config()
production_app = _import_production_app()
_patch_llm_bindings()
_install_runtime_fakes()
_install_test_only_app_extensions(production_app)
return production_app
app = _bootstrap()
def _main() -> None:

View file

@ -25,96 +25,166 @@ if _BACKEND_ROOT not in sys.path:
sys.path.insert(0, _BACKEND_ROOT)
# ---------------------------------------------------------------------------
# 1) Hijack sys.modules BEFORE production celery imports anything.
# ---------------------------------------------------------------------------
import tests.e2e.fakes.composio_module as _fake_composio # noqa: E402
import tests.e2e.fakes.notion_module as _fake_notion # noqa: E402
sys.modules["composio"] = _fake_composio
sys.modules["notion_client"] = _fake_notion
sys.modules["notion_client.errors"] = _fake_notion.errors
# ---------------------------------------------------------------------------
# 2) Logging + dotenv.
# ---------------------------------------------------------------------------
from dotenv import load_dotenv # noqa: E402
load_dotenv()
os.environ.setdefault("ATLASSIAN_CLIENT_ID", "fake-atlassian-client-id")
os.environ.setdefault("ATLASSIAN_CLIENT_SECRET", "fake-atlassian-client-secret")
os.environ.setdefault(
"CONFLUENCE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/confluence/connector/callback",
)
os.environ.setdefault("NOTION_CLIENT_ID", "fake-notion-client-id")
os.environ.setdefault("NOTION_CLIENT_SECRET", "fake-notion-client-secret")
os.environ.setdefault(
"NOTION_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/notion/connector/callback",
)
os.environ.setdefault("MICROSOFT_CLIENT_ID", "fake-microsoft-client-id")
os.environ.setdefault("MICROSOFT_CLIENT_SECRET", "fake-microsoft-client-secret")
os.environ.setdefault(
"ONEDRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/onedrive/connector/callback",
)
os.environ.setdefault("DROPBOX_APP_KEY", "fake-dropbox-app-key")
os.environ.setdefault("DROPBOX_APP_SECRET", "fake-dropbox-app-secret")
os.environ.setdefault(
"DROPBOX_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/dropbox/connector/callback",
)
os.environ["SLACK_CLIENT_ID"] = "fake-slack-mcp-client-id"
os.environ["SLACK_CLIENT_SECRET"] = "fake-slack-mcp-client-secret"
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger("surfsense.e2e.celery")
logger.warning("*** SURFSENSE E2E CELERY WORKER — fake Composio + LLM + embeddings ***")
# ---------------------------------------------------------------------------
# 3) Import the production celery_app. All task modules load here.
# ---------------------------------------------------------------------------
# ---------------------------------------------------------------------------
# 4) Patch LLM + embedding bindings inside the worker process.
# ---------------------------------------------------------------------------
from unittest.mock import patch # noqa: E402
from app.celery_app import celery_app # noqa: E402
from tests.e2e.fakes import ( # noqa: E402
clickup_module as _fake_clickup_module,
confluence_indexer as _fake_confluence_indexer,
confluence_oauth as _fake_confluence_oauth,
dropbox_api as _fake_dropbox_api,
embeddings as _fake_embeddings,
jira_module as _fake_jira_module,
linear_module as _fake_linear_module,
mcp_oauth_runtime as _fake_mcp_oauth_runtime,
mcp_runtime as _fake_mcp_runtime,
native_google as _fake_native_google,
notion_module as _fake_notion_module,
onedrive_graph as _fake_onedrive_graph,
slack_module as _fake_slack_module,
)
from tests.e2e.fakes.chat_llm import ( # noqa: E402
fake_create_chat_litellm_from_agent_config,
fake_create_chat_litellm_from_config,
)
from tests.e2e.fakes.llm import fake_get_user_long_context_llm # noqa: E402
# Patches started during bootstrap are kept alive for the lifetime of the
# process. We never call .stop() on them.
_active_patches: list = []
def _hijack_external_sdks() -> None:
"""Replace composio + notion_client in sys.modules.
Production does ``from composio import Composio`` and
``import notion_client`` at import time. With this hijack in place,
those imports resolve to our strict fakes.
MUST run before _import_celery_app().
"""
import tests.e2e.fakes.composio_module as _fake_composio
import tests.e2e.fakes.notion_module as _fake_notion
sys.modules["composio"] = _fake_composio
sys.modules["notion_client"] = _fake_notion
sys.modules["notion_client.errors"] = _fake_notion.errors
def _load_dotenv_and_set_env_defaults() -> None:
"""Load .env and set every env var the production config reads on import.
MUST run before _import_celery_app(), since app.config consumes
these values at import time.
"""
from dotenv import load_dotenv
load_dotenv()
os.environ.setdefault(
"DATABASE_URL",
"postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense",
)
os.environ.setdefault("CELERY_BROKER_URL", "redis://localhost:6379/0")
os.environ.setdefault("CELERY_RESULT_BACKEND", "redis://localhost:6379/0")
os.environ.setdefault("REDIS_APP_URL", "redis://localhost:6379/0")
os.environ.setdefault("CELERY_TASK_DEFAULT_QUEUE", "surfsense")
os.environ.setdefault("SECRET_KEY", "local-e2e-secret-not-for-production")
os.environ.setdefault("AUTH_TYPE", "LOCAL")
os.environ.setdefault("REGISTRATION_ENABLED", "TRUE")
os.environ.setdefault("ETL_SERVICE", "DOCLING")
os.environ.setdefault("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
os.environ.setdefault("NEXT_FRONTEND_URL", "http://localhost:3000")
# Sentinel keys — fakes never read them; turns leaked real calls into 401s.
os.environ.setdefault("COMPOSIO_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("COMPOSIO_ENABLED", "TRUE")
os.environ.setdefault("OPENAI_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("ANTHROPIC_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("LITELLM_API_KEY", "local-deny-real-call-sentinel")
os.environ.setdefault("ATLASSIAN_CLIENT_ID", "fake-atlassian-client-id")
os.environ.setdefault("ATLASSIAN_CLIENT_SECRET", "fake-atlassian-client-secret")
os.environ.setdefault(
"CONFLUENCE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/confluence/connector/callback",
)
os.environ.setdefault("NOTION_CLIENT_ID", "fake-notion-client-id")
os.environ.setdefault("NOTION_CLIENT_SECRET", "fake-notion-client-secret")
os.environ.setdefault(
"NOTION_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/notion/connector/callback",
)
os.environ.setdefault("MICROSOFT_CLIENT_ID", "fake-microsoft-client-id")
os.environ.setdefault("MICROSOFT_CLIENT_SECRET", "fake-microsoft-client-secret")
os.environ.setdefault(
"ONEDRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/onedrive/connector/callback",
)
os.environ.setdefault("DROPBOX_APP_KEY", "fake-dropbox-app-key")
os.environ.setdefault("DROPBOX_APP_SECRET", "fake-dropbox-app-secret")
os.environ.setdefault(
"DROPBOX_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/dropbox/connector/callback",
)
# Native Google OAuth — fake Flow in tests.e2e.fakes.native_google raises
# "Fake Google Flow requires redirect_uri." when these are empty.
os.environ.setdefault(
"GOOGLE_DRIVE_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/drive/connector/callback",
)
os.environ.setdefault(
"GOOGLE_GMAIL_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/gmail/connector/callback",
)
os.environ.setdefault(
"GOOGLE_CALENDAR_REDIRECT_URI",
"http://localhost:8000/api/v1/auth/google/calendar/connector/callback",
)
os.environ["SLACK_CLIENT_ID"] = "fake-slack-mcp-client-id"
os.environ["SLACK_CLIENT_SECRET"] = "fake-slack-mcp-client-secret"
def _install_synthetic_global_llm_config() -> None:
"""Materialise a fake ``app/config/global_llm_config.yaml`` for E2E.
The real file is gitignored (production operators ship their own with
real API keys), so a fresh CI checkout has no YAML at the path
``app.config.load_global_llm_configs()`` reads. With an empty
``GLOBAL_LLM_CONFIGS`` list, the worker's view of the config diverges
from the API container.
We copy the synthetic fixture from ``tests/e2e/fixtures/`` into the
production-expected location BEFORE _import_celery_app() so
``app.config`` picks it up on import. Install-only-if-missing so a
developer's local config (with real API keys) is preserved.
MUST run before _import_celery_app().
"""
import shutil
src = os.path.join(_THIS_DIR, "fixtures", "global_llm_config.yaml")
dst = os.path.join(_BACKEND_ROOT, "app", "config", "global_llm_config.yaml")
if not os.path.exists(src):
raise RuntimeError(
f"E2E synthetic global LLM config fixture missing at {src!r}. "
f"Restore tests/e2e/fixtures/global_llm_config.yaml from VCS."
)
if os.path.exists(dst):
logger.info(
"[e2e-global-llm-config] %s already exists; leaving it alone "
"(local dev config preserved)",
dst,
)
return
os.makedirs(os.path.dirname(dst), exist_ok=True)
shutil.copyfile(src, dst)
logger.info("[e2e-global-llm-config] installed %s -> %s", src, dst)
def _import_celery_app():
"""Import and return the production Celery app.
Every module under ``app.*`` (including all task modules) loads here,
creating their bindings. The LLM/embedding factories captured at this
point will be replaced by patches in _patch_llm_bindings() below.
"""
from app.celery_app import celery_app
return celery_app
def _patch_llm_bindings() -> None:
"""Replace LLM factories at every known binding site in worker tasks."""
from unittest.mock import patch
from tests.e2e.fakes.chat_llm import (
fake_create_chat_litellm_from_agent_config,
fake_create_chat_litellm_from_config,
)
from tests.e2e.fakes.llm import fake_get_user_long_context_llm
targets = [
"app.services.llm_service.get_user_long_context_llm",
"app.tasks.connector_indexers.confluence_indexer.get_user_long_context_llm",
@ -172,38 +242,93 @@ def _patch_llm_bindings() -> None:
)
_patch_llm_bindings()
_fake_embeddings.install(_active_patches)
_fake_confluence_oauth.install(_active_patches)
_fake_confluence_indexer.install(_active_patches)
_fake_native_google.install(_active_patches)
_fake_onedrive_graph.install(_active_patches)
_fake_dropbox_api.install(_active_patches)
_fake_notion_module.install(_active_patches)
_fake_linear_module.install(_active_patches)
_fake_jira_module.install(_active_patches)
_fake_clickup_module.install(_active_patches)
_fake_mcp_runtime.install(_active_patches)
_fake_mcp_oauth_runtime.install(_active_patches)
_fake_slack_module.install(_active_patches)
def _install_runtime_fakes() -> None:
"""Run each fake's install() against the active patch stack."""
from tests.e2e.fakes import (
clickup_module as _fake_clickup_module,
confluence_indexer as _fake_confluence_indexer,
confluence_oauth as _fake_confluence_oauth,
docling_service as _fake_docling_service,
dropbox_api as _fake_dropbox_api,
embeddings as _fake_embeddings,
jira_module as _fake_jira_module,
linear_module as _fake_linear_module,
mcp_oauth_runtime as _fake_mcp_oauth_runtime,
mcp_runtime as _fake_mcp_runtime,
native_google as _fake_native_google,
notion_module as _fake_notion_module,
onedrive_graph as _fake_onedrive_graph,
slack_module as _fake_slack_module,
)
_fake_embeddings.install(_active_patches)
_fake_docling_service.install(_active_patches)
_fake_confluence_oauth.install(_active_patches)
_fake_confluence_indexer.install(_active_patches)
_fake_native_google.install(_active_patches)
_fake_onedrive_graph.install(_active_patches)
_fake_dropbox_api.install(_active_patches)
_fake_notion_module.install(_active_patches)
_fake_linear_module.install(_active_patches)
_fake_jira_module.install(_active_patches)
_fake_clickup_module.install(_active_patches)
_fake_mcp_runtime.install(_active_patches)
_fake_mcp_oauth_runtime.install(_active_patches)
_fake_slack_module.install(_active_patches)
# ---------------------------------------------------------------------------
# 5) Start the worker.
# ---------------------------------------------------------------------------
def _bootstrap():
"""Run the full E2E bootstrap and return the production Celery app.
Ordering is load-bearing:
1) Hijack composio + notion_client in sys.modules.
2) Load .env + set env defaults (app.config reads env on import).
3) Configure logging.
4) Materialise the synthetic global_llm_config.yaml so the worker's
view of GLOBAL_LLM_CONFIGS matches the API container.
5) Import production celery_app (which transitively imports the
now-faked external SDKs and reads the env defaults + YAML).
6) Patch LLM / embedding bindings at every consumer site.
7) Install runtime fakes for connectors and chat backends.
"""
_hijack_external_sdks()
_load_dotenv_and_set_env_defaults()
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger.warning(
"*** SURFSENSE E2E CELERY WORKER — fake Composio + LLM + embeddings ***"
)
_install_synthetic_global_llm_config()
celery_app = _import_celery_app()
_patch_llm_bindings()
_install_runtime_fakes()
return celery_app
celery_app = _bootstrap()
def _main() -> None:
# Default queues mirror production (default queue + connectors queue
# so Drive indexing tasks are picked up).
queue_name = os.getenv("CELERY_TASK_DEFAULT_QUEUE", "surfsense")
queues = f"{queue_name},{queue_name}.connectors"
# macOS forks-after-MPS-init crash prefork workers; threads avoid it.
default_pool = "threads" if sys.platform == "darwin" else "prefork"
pool = os.getenv("CELERY_POOL", default_pool)
concurrency = os.getenv("CELERY_CONCURRENCY", "2")
celery_app.worker_main(
argv=[
"worker",
"--loglevel=info",
f"--queues={queues}",
"--concurrency=2",
f"--pool={pool}",
f"--concurrency={concurrency}",
"--without-gossip",
"--without-mingle",
]