Commit graph

2830 commits

Author SHA1 Message Date
CREDO23
f997b6464e test(podcasts): update renderer test for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
cb70b64a70 test(podcasts): update unit fixtures for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
38991c7db8 test(podcasts): update integration fixtures for second-based duration 2026-06-16 23:38:28 +02:00
CREDO23
16d226e5ce refactor(podcasts): plan transcript length from midpoint seconds 2026-06-16 23:38:28 +02:00
CREDO23
116c38feac refactor(podcasts): build DurationTarget from brief seconds config 2026-06-16 23:38:28 +02:00
CREDO23
af08e2f033 refactor(podcasts): propose brief with min_seconds and max_seconds 2026-06-16 23:38:28 +02:00
CREDO23
d0ed5b94d9 refactor(podcasts): use shared second-based brief duration defaults 2026-06-16 23:38:28 +02:00
CREDO23
845653cbac feat(podcasts): pass min_seconds and max_seconds when proposing brief 2026-06-16 23:38:27 +02:00
CREDO23
085442ed9a feat(podcasts): use seconds defaults on create podcast request 2026-06-16 23:38:27 +02:00
CREDO23
32e0d21604 feat(podcasts): store brief duration in seconds with legacy load 2026-06-16 23:38:27 +02:00
CREDO23
9583e8f250 feat(podcasts): add shared duration limit constants 2026-06-16 23:38:27 +02:00
Anish Sarkar
9b7e278114 refactor(config): update GATEWAY_ENABLED variable to FALSE and adjust related configurations for improved messaging gateway handling 2026-06-16 23:49:26 +05:30
CREDO23
1048d0afc3 test(podcasts): cover public stream missing-object 404 2026-06-16 20:09:08 +02:00
CREDO23
810ded2dde test(podcasts): cover in-flight 409 and missing-object 404 2026-06-16 20:09:08 +02:00
CREDO23
86a8833fb4 test(podcasts): add exists to fake storage backend 2026-06-16 20:09:08 +02:00
CREDO23
1d70af4684 fix(podcasts): guard public stream against missing audio 2026-06-16 20:09:08 +02:00
CREDO23
0c2808640a fix(podcasts): guard stream against missing audio 2026-06-16 20:09:08 +02:00
CREDO23
d2558e546e feat(podcasts): add audio_exists storage helper 2026-06-16 20:09:08 +02:00
Anish Sarkar
2a840fcc10 refactor(backend): derive frontend and backend urls from SURFSENSE_PUBLIC_URL 2026-06-16 02:10:50 +05:30
Anish Sarkar
6b31997599 Merge remote-tracking branch 'upstream/dev' into experiment/lean-url-port-architecture 2026-06-15 20:52:15 +05:30
Rohan Verma
69bdcf5946
Merge pull request #1491 from AnishSarkar22/feat/unified-model-connections
feat: Fix model attribution for prefix-stripped token usage callbacks
2026-06-14 17:50:48 -07:00
Anish Sarkar
0c15a37618 chore: update dependencies in pyproject.toml and uv.lock, removing flower 2026-06-14 20:29:52 +05:30
CREDO23
32a6e54ce6 Merge remote-tracking branch 'upstream/dev' into features/documents-injestion-layered-cached 2026-06-14 11:30:33 +02:00
Anish Sarkar
d9a4f14f99 feat(token-tracking): enhance model metadata reconciliation by adding bare model name handling 2026-06-14 12:18:22 +05:30
Anish Sarkar
7926814070 refactor(model-connections): remove unused fields and update verification logic 2026-06-14 02:46:19 +05:30
Anish Sarkar
c7409c8995 chore: ran linting 2026-06-13 21:59:35 +05:30
Anish Sarkar
ceace003aa feat(local-models): add documentation for connecting local model providers 2026-06-13 21:52:45 +05:30
Anish Sarkar
ab5423d2d2 Merge remote-tracking branch 'upstream/dev' into feat/unified-model-connections 2026-06-13 19:04:49 +05:30
Anish Sarkar
76843f42f1 refactor(anonymous-models): remove description field from anonymous model responses and update related UI components 2026-06-13 16:30:26 +05:30
Anish Sarkar
576c56628a chore(config): update global LLM configuration example with improved setup instructions, parameter naming, and enhanced comments for clarity 2026-06-13 14:57:14 +05:30
Anish Sarkar
e104193ddf refactor(provider-configuration): standardize provider parameter naming across various modules and improve quota error handling in tests 2026-06-13 14:23:32 +05:30
Anish Sarkar
4a6a282a46 feat(runtime-cooldown): implement Redis-based shared cooldown management for model selection 2026-06-13 13:53:01 +05:30
Anish Sarkar
bd4a04f2e7 feat(database-migrations): add migration to remove legacy model config tables and remove stale model connection code 2026-06-13 12:45:43 +05:30
Anish Sarkar
8fe9c21e76 feat(token-tracking): add model metadata registration and enhance token usage tracking 2026-06-13 03:08:35 +05:30
Anish Sarkar
5e86885a03 feat(model-connections): integrate model provider connections panel and connection card components 2026-06-13 02:40:22 +05:30
Anish Sarkar
15d9983669 feat(model-connections): enhance model selection facts and auto pinning logic 2026-06-13 02:19:27 +05:30
Anish Sarkar
45d27ba879 feat(model-connections): enhance auto mode with auto pinning 2026-06-13 01:39:26 +05:30
Anish Sarkar
9f6210ad08 feat(model-connections): add test preview functionality for model connections 2026-06-13 00:12:04 +05:30
CREDO23
dcebfc4756 Merge remote-tracking branch 'upstream/dev' into features/documents-injestion-layered-cached 2026-06-12 19:35:34 +02:00
Anish Sarkar
55f004e1da feat(model-connections): improve model discovery error handling and enhance UI components 2026-06-12 22:50:50 +05:30
Anish Sarkar
407f2a9612 feat(model-connections): enhance model connection functionality with preview and selection features 2026-06-12 22:41:21 +05:30
CREDO23
311570b4f0 test(indexing): cover the edit path and make integration caches hermetic
Real-DB tests assert unchanged chunk rows survive edits, only new text is
embedded, removed rows are deleted with positions compacted, and the kill
switch restores full-replace. An autouse fixture disables the ETL/embedding
caches so a developer's .env can't leak cache hits into unrelated tests.
2026-06-12 18:53:21 +02:00
CREDO23
052e9ef4d1 refactor(chunks): order chunk reads by (document_id, position)
Presentation and citation ordering moves off Chunk.id/created_at to the
explicit position column (id kept as tiebreaker). Vector and ts_rank
ranking order_by clauses are untouched.
2026-06-12 18:53:21 +02:00
CREDO23
5a71769dba fix(chunks): set position on remaining chunk insert paths
document_converters, the github size-fallback chunker, revert_service
restores, and the kb-persistence middleware now write explicit positions
(the middleware read path also orders by position).
2026-06-12 18:53:08 +02:00
CREDO23
7d55aaf2c1 feat(indexing): reconcile chunks incrementally on re-index
index() now loads existing rows and applies a content diff instead of
delete-all/reinsert-all: unchanged chunks keep their rows and embeddings
(zero HNSW/GIN churn), moved chunks get a position-only UPDATE, and only
new texts are embedded, batched with the summary embedding. First index
keeps the cache-aware build_chunk_embeddings path.
2026-06-12 18:53:08 +02:00
CREDO23
fd495e1b2f feat(observability): add chunk reconcile metric and kill-switch flag
surfsense.indexing.reconcile.chunks counts reused/embedded/deleted chunks per
re-index. CHUNK_RECONCILE_ENABLED (default on) falls back to delete-all +
full re-embed if the diff path ever misbehaves.
2026-06-12 18:52:57 +02:00
CREDO23
8d413ea5c2 refactor(indexing): expose chunk_markdown and embed_batch helpers
Split _compute so the incremental edit path can reuse the exact same chunker
selection and embedding entry points (and their test patch targets) without
going through the doc-level cache.
2026-06-12 18:52:57 +02:00
CREDO23
f82dedf712 feat(indexing): add pure chunk reconciler for content-addressed diffs
Greedy multiset match on chunk text decides which rows keep their embeddings,
which texts need embedding, and which rows are deleted. No DB, no embeddings;
fully unit-tested (reuse, head insert, middle edit, deletion, duplicates,
reorder, full rewrite).
2026-06-12 18:52:46 +02:00
CREDO23
c6e71c851c feat(chunks): add explicit position column with backfill migration
Chunk ids stop reflecting document order once incremental re-indexing keeps
unchanged rows across edits. Backfill preserves the historical id ordering
so behavior is identical on day one.
2026-06-12 18:52:45 +02:00
CREDO23
412493ae08 test(embedding-cache): add integration tests for service, repository, and store
Covers the public cache surface against real Postgres and a real local file
backend (no mocks): recall miss, remember->recall vector/text/order round-trip,
the dimension-mismatch refusal, the repository SQL behind eviction and dedup
(size sum, coldest ordering, TTL cutoff, duplicate-key no-op, reuse counter),
and the blob store save/load round-trip and delete.
2026-06-12 17:33:21 +02:00