feat: add Helm chart for Kubernetes deployment (#365)

* feat: add Helm chart for Kubernetes deployment

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace bundled Bitnami subcharts with in-chart manifests on official images

The Bitnami catalog removed all versioned image tags from docker.io/bitnami in
Aug 2025 (old images frozen in bitnamilegacy, maintained catalog now behind a
Broadcom subscription), so the images referenced by the bundled
postgresql/redis/minio subcharts can no longer be pulled. Replace the subcharts
with plain in-chart manifests built on official upstream images, keeping the
internal/all-in-one path fully self-contained and free of third-party chart
packaging that can disappear:

- internal-postgres.yaml: pgvector/pgvector:pg17 — upstream Postgres plus the
  `vector` extension the migrations require. POSTGRES_USER=dograh is the initdb
  superuser, so CREATE EXTENSION vector succeeds.
- internal-redis.yaml: redis:7.4-alpine, password-protected, AOF persistence.
- internal-minio.yaml: minio/minio, root creds shared with the app via a single
  secret (can't drift); the app auto-creates its bucket.

Service/secret names are unchanged (<rel>-postgresql, <rel>-redisinternal-master,
<rel>-minio) so the app wiring is untouched. Dependency passwords are generated
once and persisted across upgrades via Helm's `lookup` function. Drop the
Chart.yaml dependencies, Chart.lock, and the `helm dependency` step; the
internal manifests are gated on the mode toggles (database.mode=internal, etc.).
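
A minimal sketch of the generate-once pattern described above, not the chart's
actual template. It assumes <rel> is the Helm release name and uses a
hypothetical `password` key; the real secret layout may differ.

```yaml
{{- /* Sketch only: the "password" key and secret layout are assumptions. */ -}}
{{- $name := printf "%s-postgresql" .Release.Name -}}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name -}}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
type: Opaque
data:
  {{- if $existing }}
  # Upgrade: reuse the value already stored in the cluster (already base64).
  password: {{ index $existing.data "password" }}
  {{- else }}
  # First install: no secret found, so generate a password once.
  password: {{ randAlphaNum 32 | b64enc | quote }}
  {{- end }}
```

Under `helm template` there is no cluster to query, so `lookup` returns an empty
map and the rendered output shows a freshly generated password every time.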

Also includes fixes for issues surfaced by smoke-testing on a live EKS cluster:
- Dockerfile: ship the per-service run_*.sh entrypoints the chart invokes.
- migrate-job: run as a post-install/pre-upgrade hook (the bundled Postgres does
  not exist during pre-install) with a wait-for-postgres init container.
- backend env: declare POSTGRES_PASSWORD/REDIS_PASSWORD before the DATABASE_URL/
  REDIS_URL that interpolate them (Kubernetes expands $(VAR) only when VAR is
  defined earlier in the same env list).
- worker liveness probes: pgrep isn't in the slim runtime image; check
  /proc/1/cmdline instead (each worker execs its process as PID 1).
- UI: set HOSTNAME=0.0.0.0 so Next.js standalone doesn't bind to the k8s-injected
  pod name (which maps to the pod IP only, breaking port-forward/loopback).
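
The env-ordering and liveness-probe fixes above can be sketched as a single
container fragment. Everything here is illustrative: the container name, the
secret name and key, the URL format, and the `arq` match pattern are assumptions
rather than the chart's real values.

```yaml
# Illustrative fragment only; names, keys, and the URL format are assumptions.
containers:
  - name: arq-worker
    env:
      # Declared first so the $(POSTGRES_PASSWORD) reference below resolves.
      - name: POSTGRES_PASSWORD
        valueFrom:
          secretKeyRef:
            name: example-postgresql
            key: password
      # $(VAR) is expanded only if VAR appears earlier in this list;
      # otherwise the literal text "$(VAR)" is passed through unchanged.
      - name: DATABASE_URL
        value: "postgresql://dograh:$(POSTGRES_PASSWORD)@example-postgresql:5432/dograh"
    livenessProbe:
      exec:
        command:
          - sh
          - -c
          # pgrep is unavailable, so inspect PID 1's command line directly.
          - grep -q arq /proc/1/cmdline
      periodSeconds: 30
```

Swapping the two env entries would leave the literal string
`$(POSTGRES_PASSWORD)` in DATABASE_URL, which is the failure the reordering fix
addresses.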

Verified end-to-end on EKS 1.36: all pods Ready, migrations applied (pgvector
extension + 27 tables), UI login page and web API served via port-forward.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Abhishek 2026-07-03 12:39:39 +05:30 committed by GitHub
parent fd0d144b08
commit 88f4477edb
42 changed files with 2845 additions and 1 deletions


@@ -141,7 +141,22 @@ ENV PYTHONUNBUFFERED=1
# Copy application code (chown at copy-time avoids a duplicate /app layer
# from a later `RUN chown -R`, which would double the on-disk size of /app).
COPY --chown=dograh:dograh ./api ./api
COPY --chown=dograh:dograh ./scripts/start_services_docker.sh ./scripts/start_services_docker.sh
# Entrypoint scripts.
# start_services_docker.sh — single-container (docker-compose) entrypoint
# that runs every service in one process tree.
# run_*.sh — per-service entrypoints used by the Helm chart,
# which runs each workload (web, arq-worker, ari-manager,
# campaign-orchestrator, migrate) as its own pod. Keep this list in sync
# with the command:[] entries in deploy/helm/dograh/templates/*.yaml.
COPY --chown=dograh:dograh \
./scripts/start_services_docker.sh \
./scripts/run_migrate.sh \
./scripts/run_web.sh \
./scripts/run_arq_worker.sh \
./scripts/run_ari_manager.sh \
./scripts/run_campaign_orchestrator.sh \
./scripts/
# ts_validator Node deps (built in ts-deps stage with full node:22-slim image).
# The validator runs as a short-lived subprocess from api/mcp_server/ts_bridge.py.