diff --git a/docs/adr/0002-phase-2-execution.md b/docs/adr/0002-phase-2-execution.md index 6b6949f..5d590b4 100644 --- a/docs/adr/0002-phase-2-execution.md +++ b/docs/adr/0002-phase-2-execution.md @@ -504,7 +504,7 @@ own migrations. - Validate local Postgres dev cluster before PR C work begins. Recipe at `docs/plans/local-dev-postgres-setup.md` is correct but needs to be applied on this machine (delandtj-home): cluster is not initdb'd, pgvector is not - installed. Containerized `pgvector/pgvector:pg16` is a viable alternative + installed. Containerized `pgvector/pgvector:pg18` is a viable alternative if pgvector packaging is friction. See open discussion thread. ### Phase 4 sketch: `sharing_rules` and the precedence chain diff --git a/docs/plans/0002c-migrations.md b/docs/plans/0002c-migrations.md index ef8e35c..78b6ac6 100644 --- a/docs/plans/0002c-migrations.md +++ b/docs/plans/0002c-migrations.md @@ -843,7 +843,7 @@ podman run --rm -d --name vestige-pg \ -e POSTGRES_USER=vestige \ -e POSTGRES_DB=vestige \ -p 5432:5432 \ - docker.io/pgvector/pgvector:pg16 + docker.io/pgvector/pgvector:pg18 export DATABASE_URL="postgresql://vestige:devpw@127.0.0.1:5432/vestige" ``` diff --git a/docs/plans/0002d-store-impl-bodies.md b/docs/plans/0002d-store-impl-bodies.md index ad1d9b7..adfd8aa 100644 --- a/docs/plans/0002d-store-impl-bodies.md +++ b/docs/plans/0002d-store-impl-bodies.md @@ -1612,7 +1612,7 @@ use vestige_core::storage::postgres::PgMemoryStore; #[tokio::test] async fn round_trip_crud_search_scheduling_edges() { let docker = clients::Cli::default(); - let image = GenericImage::new("pgvector/pgvector", "pg16") + let image = GenericImage::new("pgvector/pgvector", "pg18") .with_env_var("POSTGRES_PASSWORD", "test") .with_env_var("POSTGRES_DB", "vestige_test") .with_exposed_port(5432); @@ -1759,7 +1759,7 @@ This sub-plan is complete when ALL of the following hold: and the `Visibility` enum is exported alongside it. The SQLite backend reads and writes the same four fields. 
8. The `tests/postgres_round_trip.rs` integration test passes against - a `pgvector/pgvector:pg16` container (insert / get / update / delete + a `pgvector/pgvector:pg18` container (insert / get / update / delete / fts_search / vector_search / get_scheduling / update_scheduling / add_edge / get_edges / remove_edge / get_neighbors / cascade delete). diff --git a/docs/plans/0002h-testing-and-benches.md b/docs/plans/0002h-testing-and-benches.md index d6bcebc..3fc2e1e 100644 --- a/docs/plans/0002h-testing-and-benches.md +++ b/docs/plans/0002h-testing-and-benches.md @@ -166,12 +166,12 @@ use vestige_core::storage::postgres::PgMemoryStore; pub async fn fresh_pg_store( embedder: Arc, ) -> Result<(PgMemoryStore, ContainerAsync)> { - // pgvector/pgvector:pg16 is the official pgvector image built on the - // postgres:16 base. testcontainers-modules::postgres::Postgres targets + // pgvector/pgvector:pg18 is the official pgvector image built on the + // postgres:18 base. testcontainers-modules::postgres::Postgres targets // the upstream postgres image by default; we override name + tag. let container = Postgres::default() .with_name("pgvector/pgvector") - .with_tag("pg16") + .with_tag("pg18") .start() .await?; @@ -867,7 +867,7 @@ Requirements: the `docker_available()` check in `common/mod.rs`. The test output includes a `docker unavailable; skip` line per test so the developer knows the tests were not silently dropped. -- The pgvector image (`pgvector/pgvector:pg16`) is pulled on first run; +- The pgvector image (`pgvector/pgvector:pg18`) is pulled on first run; ~200 MB. A pre-pulled image keeps the per-run overhead at the cold-start container boot (~2-5 seconds). 
@@ -920,7 +920,7 @@ async fn build_bench(rows: usize) -> Bench { let embedder = TestEmbedder::new_768(); let container = Postgres::default() .with_name("pgvector/pgvector") - .with_tag("pg16") + .with_tag("pg18") .start() .await .unwrap(); @@ -1092,7 +1092,7 @@ Notes: - The Postgres feature tests should run in a separate CI matrix entry to isolate failures and skip them entirely on platforms (Windows runners if any) where the pgvector image is not available. -- Cache the `pgvector/pgvector:pg16` image between runs. The +- Cache the `pgvector/pgvector:pg18` image between runs. The `docker/setup-buildx-action` cache or a simple `docker pull` step before the test step keeps cold-start under the existing CI time budget. - Skip CI: contributors without Docker can still merge changes that do @@ -1113,7 +1113,7 @@ jobs: # no `postgres` service block needed; testcontainers manages its own steps: - uses: actions/checkout@v4 - - run: docker pull pgvector/pgvector:pg16 + - run: docker pull pgvector/pgvector:pg18 - uses: dtolnay/rust-toolchain@stable - run: cargo test -p vestige-core --features postgres-backend --test '*' ``` diff --git a/docs/plans/local-dev-postgres-setup.md b/docs/plans/local-dev-postgres-setup.md index 6250a55..f863d48 100644 --- a/docs/plans/local-dev-postgres-setup.md +++ b/docs/plans/local-dev-postgres-setup.md @@ -1,27 +1,55 @@ -# Local Dev Postgres Setup (Arch / CachyOS) +# Local Dev Postgres Setup (container, hybrid approach) -**Status**: Applied on this machine on 2026-04-21 -**Related**: docs/plans/0002-phase-2-postgres-backend.md, docs/adr/0001-pluggable-storage-and-network-access.md +**Status**: Applied on this machine on 2026-05-27 (rootless podman, Postgres 18.4 + pgvector 0.8.2). 
+**Related**: docs/plans/0002-phase-2-postgres-backend.md, docs/adr/0002-phase-2-execution.md, docs/adr/0001-pluggable-storage-and-network-access.md -Purpose: capture the minimum, repeatable steps to stand up a Postgres 18 instance on a local Arch/CachyOS box for Phase 2 (`PgMemoryStore`) development, `sqlx prepare`, and manual migration testing. This is a single-operator dev recipe, not a production runbook. +Purpose: capture the minimum, repeatable steps to stand up a long-lived +Postgres 18 + pgvector instance on a local Linux dev box for Phase 2 +(`PgMemoryStore`) development, `sqlx prepare`, and manual migration +testing. This is a single-operator dev recipe, not a production runbook. + +ADR 0002 picked the **hybrid container** approach over a native install: +the `pgvector/pgvector:pg18` image ships pgvector pre-installed, matches +the image testcontainers will use in the Phase 2 test harness, and avoids +the AUR/build-from-source friction of native pgvector packaging on Arch. --- ## Current state on this machine -- Package: `postgresql` 18.3-2 (pacman). Pulls `postgresql-libs`, `libxslt`. -- Service: `postgresql.service`, enabled + active. -- Listens on: `127.0.0.1:5432` and `[::1]:5432` only (default `listen_addresses = 'localhost'`). -- Data dir: `/var/lib/postgres/data`, owner `postgres:postgres`. -- Auth (`pg_hba.conf`, Arch defaults): `peer` for local socket, `scram-sha-256` for host 127.0.0.1/::1. +- Runtime: rootless `podman` 5.8.2 (Arch). `docker` 29.5.1 also installed but unused. +- Image: `docker.io/pgvector/pgvector:pg18` (PostgreSQL 18.4, pgvector 0.8.2). +- Container: `vestige-pg`, `--restart=always`, port `127.0.0.1:5432:5432`. +- Volume: named podman volume `vestige-pgdata`, mounted at + `/var/lib/postgresql/data` inside the container; `PGDATA` points at the + subdirectory `/var/lib/postgresql/data/pgdata` so that initdb always gets + an empty directory even when the mount point itself is not empty + (Postgres refuses to initdb into a non-empty directory).
+- Listens on: `127.0.0.1:5432` only (port mapping is bound to loopback). +- Auth: `scram-sha-256` for the `host` (TCP) entries added by the image; the + image leaves local-socket connections inside the container on `trust`, + which is why `podman exec ... psql -U postgres` needs no password. ### Database + role -- Database: `vestige`, UTF8, owner `vestige`. -- Role: `vestige` with `LOGIN CREATEDB` (no superuser, no replication, no cross-db). -- Schema `public` re-owned to `vestige`, plus default privileges so any future tables / sequences / functions in `public` are fully owned and granted to `vestige`. +- Database: `vestige`, UTF8, owner `vestige`, `LC_COLLATE=C.UTF-8`, `LC_CTYPE=C.UTF-8`. +- Role: `vestige` with `LOGIN CREATEDB` (no superuser, no replication). +- Schema `public` re-owned to `vestige` with full default privileges on + future tables / sequences / functions. +- Extension: `vector` (pgvector 0.8.2) installed in the `vestige` + database by the superuser at setup time. -Net effect: the `vestige` role can create, alter, drop, and grant freely inside the `vestige` database -- enough for `sqlx::migrate!`, ad-hoc schema work, and the full Phase 2 `MemoryStore` surface. It cannot create extensions (see Phase 2 followups below) and cannot touch other databases. +Net effect: the `vestige` role can create, alter, drop, and grant freely +inside the `vestige` database -- enough for `sqlx::migrate!`, ad-hoc +schema work, and the full Phase 2 `MemoryStore` surface. It cannot create +extensions; the superuser handled `CREATE EXTENSION vector` already. + +### Passwords + +Two passwords live in the dev user's home, mode 600: + +- `~/.vestige_pg_superpw` -- the `postgres` superuser password inside the + container. Used for one-shot admin tasks (creating roles, installing + extensions, password rotation). Day-to-day app traffic does NOT use it. +- `~/.vestige_pg_pw` -- the `vestige` role password. This is the one the + Phase 2 backend, `sqlx prepare`, and ad-hoc `psql` invocations use.
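As an illustration of the mode-600 convention for the two password files, here is a minimal Python sketch. It is not part of the Vestige repo, and the helper names `write_private` and `is_private` are hypothetical. It writes a secret with owner-only permissions and checks that an existing file has no group or other permission bits set:

```python
import os
import stat


def write_private(path: str, secret: str) -> None:
    """Write `secret` to `path`, leaving the file at mode 600."""
    # Hypothetical helper. os.open applies the umask to 0o600, which can only
    # clear bits, so a newly created file is never wider than owner read/write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret)
    # A pre-existing file keeps its old mode, so set it explicitly as well.
    os.chmod(path, 0o600)


def is_private(path: str) -> bool:
    """Return True when no group or other permission bits are set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

On a box set up with this recipe, `is_private(os.path.expanduser("~/.vestige_pg_pw"))` would be expected to return `True`.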
### Connection @@ -29,13 +57,8 @@ Net effect: the `vestige` role can create, alter, drop, and grant freely inside postgresql://vestige:@127.0.0.1:5432/vestige ``` -Password lives at `~/.vestige_pg_pw`, mode 600, owned by the dev user (no sudo needed to read it). Read with: - -```sh -cat ~/.vestige_pg_pw -``` - -Recommended dev shell export (keep this OUT of the repo; use `.env` + gitignore or a shell rc): +Recommended dev shell export (keep this OUT of the repo; use `.env` + +gitignore or a shell rc): ```sh export DATABASE_URL="postgresql://vestige:$(cat ~/.vestige_pg_pw)@127.0.0.1:5432/vestige" @@ -45,109 +68,212 @@ export DATABASE_URL="postgresql://vestige:$(cat ~/.vestige_pg_pw)@127.0.0.1:5432 ## Reproduce from scratch -On a fresh Arch / CachyOS box with passwordless sudo: +On a fresh Linux box with `podman` installed and `python3` available: ```sh -# 1. Install -sudo pacman -S --noconfirm postgresql +# 1. Pull the image +podman pull docker.io/pgvector/pgvector:pg18 -# 2. Initialize the cluster (UTF8, scram-sha-256 for host, peer for local) -sudo -iu postgres initdb \ - --locale=C.UTF-8 --encoding=UTF8 \ - -D /var/lib/postgres/data \ - --auth-host=scram-sha-256 --auth-local=peer +# 2. Create a persistent named volume +podman volume create vestige-pgdata -# 3. Start + enable -sudo systemctl enable --now postgresql +# 3. Generate the superuser password and stash it (mode 600) +SUPER_PW=$(python3 -c 'import secrets,string; a=string.ascii_letters+string.digits; print("".join(secrets.choice(a) for _ in range(32)))') +umask 077 +printf '%s' "$SUPER_PW" > ~/.vestige_pg_superpw +chmod 600 ~/.vestige_pg_superpw -# 4. Generate a password and stash it in the dev user's home (mode 600) +# 4. 
Start the container +podman run -d \ + --name vestige-pg \ + --restart=always \ + -p 127.0.0.1:5432:5432 \ + -e POSTGRES_PASSWORD="$SUPER_PW" \ + -e PGDATA=/var/lib/postgresql/data/pgdata \ + -v vestige-pgdata:/var/lib/postgresql/data \ + docker.io/pgvector/pgvector:pg18 + +unset SUPER_PW + +# 5. Wait for ready +until podman exec vestige-pg pg_isready -U postgres -h 127.0.0.1 >/dev/null 2>&1; do + sleep 1 +done + +# 6. Generate the vestige role password and stash it (mode 600) VESTIGE_PW=$(python3 -c 'import secrets,string; a=string.ascii_letters+string.digits; print("".join(secrets.choice(a) for _ in range(32)))') umask 077 printf '%s' "$VESTIGE_PW" > ~/.vestige_pg_pw chmod 600 ~/.vestige_pg_pw -# 5. Create role + database + grants -sudo -u postgres psql -v ON_ERROR_STOP=1 < '[3,2,1]'::vector AS l2_distance;" ``` --- -## Phase 2 followups (before PgMemoryStore works) +## Boot persistence (rootless podman) -The cluster above is bare Postgres. Phase 2 needs `pgvector`: +`--restart=always` restarts the container if it exits, but podman has no +daemon to bring it back after a reboot, so rootless podman containers do +NOT auto-start on system boot unless the dev user has lingering enabled: ```sh -# Install the extension package -sudo pacman -S --noconfirm pgvector - -# Enable it in the vestige database (must run as postgres; vestige is not superuser) -sudo -u postgres psql -d vestige -c 'CREATE EXTENSION IF NOT EXISTS vector;' +sudo loginctl enable-linger "$USER" ``` -Verify: +After that, the `podman-restart.service` user unit handles restart of +`--restart=always` containers when the user session starts at boot: ```sh -PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige -d vestige \ - -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';" +systemctl --user enable --now podman-restart.service ``` -Notes: +Skip both if you prefer to start the cluster manually each session with +`podman start vestige-pg`.
-- `pgvector` must be available on the server before `sqlx::migrate!` runs, or the Phase 2 migration that declares typed `Vector` columns will fail. -- Testcontainer-based Phase 2 integration tests use `pgvector/pgvector:pg16` and are independent of this local cluster. This local cluster is for `sqlx prepare`, `cargo run -- migrate --to postgres`, and manual poking. -- `sqlx prepare` needs `DATABASE_URL` pointed at this cluster with `vestige` migrations already applied. Run from `crates/vestige-core/`. +--- + +## Day-to-day operation + +```sh +# Status +podman ps --filter name=vestige-pg + +# Logs (follow) +podman logs -f vestige-pg + +# psql as the app role +PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige -d vestige + +# psql as the superuser (for grants, extensions, role admin) +podman exec -it vestige-pg psql -U postgres + +# Stop / start +podman stop vestige-pg +podman start vestige-pg + +# Restart in place +podman restart vestige-pg +``` --- ## Password rotation ```sh +# Rotate the vestige role password NEW_PW=$(python3 -c 'import secrets,string; a=string.ascii_letters+string.digits; print("".join(secrets.choice(a) for _ in range(32)))') umask 077 printf '%s' "$NEW_PW" > ~/.vestige_pg_pw chmod 600 ~/.vestige_pg_pw -sudo -u postgres psql -v ON_ERROR_STOP=1 \ +podman exec -i vestige-pg psql -U postgres -v ON_ERROR_STOP=1 \ -c "ALTER ROLE vestige WITH PASSWORD '${NEW_PW}';" unset NEW_PW + +# Rotate the superuser password (less common) +NEW_SUPER=$(python3 -c 'import secrets,string; a=string.ascii_letters+string.digits; print("".join(secrets.choice(a) for _ in range(32)))') +umask 077 +printf '%s' "$NEW_SUPER" > ~/.vestige_pg_superpw +chmod 600 ~/.vestige_pg_superpw +podman exec -i vestige-pg psql -U postgres -v ON_ERROR_STOP=1 \ + -c "ALTER ROLE postgres WITH PASSWORD '${NEW_SUPER}';" +unset NEW_SUPER ``` Then re-export `DATABASE_URL` in any live shells. 
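The `python3 -c` one-liner used for every password in this doc is dense. Expanded into a small script, it looks roughly like the sketch below. The function names and the `database_url` helper are illustrative only; the defaults are taken from the connection string shown in this doc.

```python
import secrets
import string

# Same alphabet as the one-liner: ASCII letters plus digits.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from ALPHABET using the secrets module."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def database_url(
    password: str,
    user: str = "vestige",
    host: str = "127.0.0.1",
    port: int = 5432,
    database: str = "vestige",
) -> str:
    """Assemble the connection URL in the form used for DATABASE_URL."""
    # No percent-encoding is applied: this is only safe because the
    # password is restricted to letters and digits.
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Restricting the alphabet to ASCII letters and digits means the password needs no percent-encoding in the URL and no quote escaping inside `ALTER ROLE ... PASSWORD '...'`.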
 --- +## Backup and restore (dev-grade) + +`pg_dump` writes a plain-text SQL dump to host disk. For dev data this is +enough; production runbook lives in `0002i-runbook.md`. + +```sh +# Dump +PGPASSWORD="$(cat ~/.vestige_pg_pw)" pg_dump -h 127.0.0.1 -U vestige -d vestige \ + --format=plain --no-owner > vestige-$(date +%Y%m%d-%H%M%S).sql + +# Restore (drops + recreates) +podman exec -i vestige-pg psql -U postgres -v ON_ERROR_STOP=1 \ + -c 'DROP DATABASE IF EXISTS vestige;' \ + -c 'CREATE DATABASE vestige OWNER vestige ENCODING UTF8 TEMPLATE template0;' +# The recreated database has no extensions; reinstall pgvector as the +# superuser, since the vestige role cannot create extensions. +podman exec -i vestige-pg psql -U postgres -d vestige -v ON_ERROR_STOP=1 \ + -c 'CREATE EXTENSION IF NOT EXISTS vector;' +PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige -d vestige < vestige-DUMP.sql +``` + +The named volume `vestige-pgdata` persists outside the container; the +container can be `podman rm`'d and recreated without losing data, as +long as the volume stays in place. + +--- + ## Teardown Destroys the cluster and all data in it: ```sh -sudo systemctl disable --now postgresql -sudo pacman -Rns postgresql postgresql-libs -sudo rm -rf /var/lib/postgres -rm -f ~/.vestige_pg_pw +podman stop vestige-pg +podman rm vestige-pg +podman volume rm vestige-pgdata +podman rmi docker.io/pgvector/pgvector:pg18 +rm -f ~/.vestige_pg_pw ~/.vestige_pg_superpw ``` +`enable-linger` and the user systemd unit can be undone with +`sudo loginctl disable-linger "$USER"` and +`systemctl --user disable podman-restart.service` if you turned them on. + +--- + +## Notes for Phase 2 + +- `pgvector` is preinstalled in the image; the `CREATE EXTENSION vector` + in step 7 above makes it available inside the `vestige` DB. The + extension must be loaded BEFORE `sqlx::migrate!` runs the Phase 2 + migration that declares typed `Vector` columns, otherwise the + migration fails. +- Testcontainer-based Phase 2 integration tests use the same + `pgvector/pgvector:pg18` image and spin up fresh containers per run; + they are independent of this long-lived cluster.
This cluster exists + for `sqlx prepare`, `cargo run -- migrate --to postgres`, and manual + poking. +- `sqlx prepare` needs `DATABASE_URL` pointed at this cluster with + `vestige` migrations already applied. Run from `crates/vestige-core/`. + --- ## Out of scope for this doc -- TLS, client-cert auth, non-localhost access. Phase 3 exposes the Vestige HTTP API over the network, not Postgres directly. -- Backups, PITR, WAL archiving. For dev data: `pg_dump -h 127.0.0.1 -U vestige vestige > vestige.sql`. -- Replication, PgBouncer, tuned `postgresql.conf`. Defaults are fine for Phase 2 development. -- Making this the canonical Vestige backend. By default Vestige still uses SQLite; this cluster exists so the `postgres-backend` feature can be built and tested locally. +- TLS, client-cert auth, non-localhost access. Phase 3 exposes the + Vestige HTTP API over the network, not Postgres directly. +- PITR, WAL archiving, replication, PgBouncer, tuned `postgresql.conf`. + Defaults are fine for Phase 2 development. +- Native (non-container) Postgres install. The prior version of this + doc covered native Arch packaging; superseded by ADR 0002's hybrid + decision. +- Making this the canonical Vestige backend. By default Vestige still + uses SQLite; this cluster exists so the `postgres-backend` feature + can be built and tested locally.