SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-06-12 20:45:20 +02:00

Author	SHA1	Message	Date
CREDO23	c6e71c851c	feat(chunks): add explicit position column with backfill migration. Chunk ids stop reflecting document order once incremental re-indexing keeps unchanged rows across edits. Backfill preserves the historical id ordering so behavior is identical on day one.	2026-06-12 18:52:45 +02:00
CREDO23	412493ae08	test(embedding-cache): add integration tests for service, repository, and store. Covers the public cache surface against real Postgres and a real local file backend (no mocks): recall miss, remember->recall vector/text/order round-trip, the dimension-mismatch refusal, the repository SQL behind eviction and dedup (size sum, coldest ordering, TTL cutoff, duplicate-key no-op, reuse counter), and the blob store save/load round-trip and delete.	2026-06-12 17:33:21 +02:00
CREDO23	91d947ff79	refactor(embedding-cache): rename index cache to embedding cache. The cached payload is the indexing pipeline's embeddings (markdown is chunked then embedded), so "embedding cache" names the expensive output directly and removes the "index" ambiguity (DB index vs vector index vs indexing phase). Renames the service, settings, eligibility, eviction task, metrics, config flags (INDEX_CACHE_* -> EMBEDDING_CACHE_*), object prefix, and the table (index_cache_embedding_sets -> embedding_cache_sets) with its constraint and indexes. Migration 161 renamed accordingly.	2026-06-12 17:00:01 +02:00
CREDO23	8cf578d965	test(index-cache): add unit tests and repoint embed/chunk patch targets	2026-06-12 16:48:18 +02:00
CREDO23	4e4f7f34fa	feat(index-cache): add TTL/size eviction task and daily schedule	2026-06-12 16:48:18 +02:00
CREDO23	019aa7bf76	feat(index-cache): serve chunk embeddings from cache during indexing	2026-06-12 16:48:18 +02:00
CREDO23	e8938c119b	feat(index-cache): add recall/remember service	2026-06-12 16:48:10 +02:00
CREDO23	4d6378e031	feat(observability): add index cache hit/miss and eviction metrics	2026-06-12 16:48:10 +02:00
CREDO23	daccd304ee	feat(index-cache): add settings, eligibility, and config flags	2026-06-12 16:48:10 +02:00
CREDO23	ad6da7c6af	feat(index-cache): add embedding blob store sharing the cache backend	2026-06-12 16:48:01 +02:00
CREDO23	f541114544	feat(index-cache): add cached embedding set table and repository	2026-06-12 16:48:01 +02:00
CREDO23	59fa4c38c3	feat(index-cache): add pickle-free blob serialization	2026-06-12 16:48:01 +02:00
CREDO23	cf208365b4	feat(index-cache): add embedding set value objects	2026-06-12 16:48:01 +02:00
CREDO23	0fb1d3d37b	feat(etl-cache): route all file-based sources through the parse cache. Every file ingestion path (Dropbox, Google Drive / Composio Drive, OneDrive, local folder, Obsidian, and the legacy upload handlers) now parses via the extract_with_cache facade instead of calling EtlPipelineService.extract directly, so identical bytes are deduplicated globally regardless of source. vision_llm is passed through, keeping the existing cacheability gate intact.	2026-06-12 14:47:25 +02:00
CREDO23	99cf212c31	test: fix auth-mode mismatch and stale QuotaInsufficientError kwargs. Pin AUTH_TYPE=LOCAL (and REGISTRATION_ENABLED=TRUE) in the test bootstrap so the email/password auth routers mount during integration tests regardless of a developer's .env=GOOGLE; without this the upload tests 404 on registration. Also update three tests to the current QuotaInsufficientError signature (balance_micros) after used_micros/limit_micros were removed.	2026-06-12 12:19:49 +02:00
CREDO23	0808fbcdee	feat(etl-cache): emit hit/miss and eviction metrics	2026-06-12 11:57:03 +02:00
CREDO23	9efe24879d	feat(observability): add etl cache lookup and eviction metrics	2026-06-12 11:57:03 +02:00
CREDO23	d5e0280097	test(etl-cache): cover two-phase eviction task on real infra	2026-06-12 11:54:36 +02:00
CREDO23	1460173dad	test(etl-cache): cover extract_with_cache end-to-end	2026-06-12 11:50:57 +02:00
CREDO23	c49a0f1233	test(etl-cache): cover store, service, and repository on real infra	2026-06-12 11:50:57 +02:00
CREDO23	3dec3231d0	test(etl-cache): cover over-budget eviction selection	2026-06-12 11:50:52 +02:00
CREDO23	a3e7047c35	test(etl-cache): cover cacheability gate rules	2026-06-12 11:50:52 +02:00
CREDO23	dddacbe762	test(etl-cache): cover content-addressing dedup and key shape	2026-06-12 11:50:52 +02:00
CREDO23	ce1e90386f	refactor(etl-cache): extract pure cacheability gate	2026-06-12 11:50:51 +02:00
CREDO23	5af594c405	docs(env): document ETL_CACHE_* settings	2026-06-12 11:23:50 +02:00
CREDO23	d898716cf4	feat(migration): add etl_cache_parses table	2026-06-12 11:23:50 +02:00
CREDO23	0dc2ccc003	feat(tasks): route extraction through etl cache	2026-06-12 11:23:50 +02:00
CREDO23	1c05980ffb	feat(celery): schedule etl cache eviction	2026-06-12 11:23:50 +02:00
CREDO23	9f29a885b1	feat(db): register CachedParse model	2026-06-12 11:23:50 +02:00
CREDO23	5c4eec26cc	feat(config): add ETL_CACHE_* settings	2026-06-12 11:23:50 +02:00
CREDO23	324ba141a6	feat(etl-cache): add eviction task and public API	2026-06-12 11:23:40 +02:00
CREDO23	7ad39fd995	feat(etl-cache): add eviction policy	2026-06-12 11:23:40 +02:00
CREDO23	758da06c4f	feat(etl-cache): add extract_with_cache	2026-06-12 11:23:40 +02:00
CREDO23	41dea96af4	feat(etl-cache): add EtlCacheService	2026-06-12 11:23:40 +02:00
CREDO23	87fdb37fa3	feat(etl-cache): expose storage layer	2026-06-12 11:23:40 +02:00
CREDO23	a6f2457c7c	feat(etl-cache): add MarkdownCacheStore for cache blobs	2026-06-12 11:22:57 +02:00
CREDO23	217d040e9e	feat(etl-cache): resolve cache blob storage backend	2026-06-12 11:22:57 +02:00
CREDO23	d9b1b491e9	feat(etl-cache): add cache blob object-key builder	2026-06-12 11:22:57 +02:00
CREDO23	8d3238bcd1	feat(etl-cache): expose cache persistence layer	2026-06-12 11:22:57 +02:00
CREDO23	ea10127979	feat(etl-cache): add CachedParseRepository data access	2026-06-12 11:22:57 +02:00
CREDO23	c624235780	feat(etl-cache): add CachedParse table model	2026-06-12 11:22:48 +02:00
CREDO23	205a63b9bc	feat(etl-cache): add EtlCacheSettings resolved from config	2026-06-12 11:22:48 +02:00
CREDO23	b84debd999	feat(etl-cache): expose cache schema value objects	2026-06-12 11:22:48 +02:00
CREDO23	3c9ea0011d	feat(etl-cache): add EvictionCandidate value object	2026-06-12 11:22:48 +02:00
CREDO23	24f824b597	feat(etl-cache): add ParseKey cache identity value object	2026-06-12 11:22:48 +02:00
DESKTOP-RTLN3BA\$punk	c855be8ccd	fix(auto_reload): update task to use a lambda for user_id in async call	2026-06-11 16:51:18 -07:00
Rohan Verma	cb7cb90732	Merge pull request #1485 from MODSetter/dev. feat(migration): evolve podcast lifecycle by detaching from zero_publication	2026-06-11 16:18:54 -07:00
DESKTOP-RTLN3BA\$punk	fed83269d0	Merge commit '6c8c559254' into dev	2026-06-11 16:18:17 -07:00
DESKTOP-RTLN3BA\$punk	cff721aa42	feat(migration): evolve podcast lifecycle by detaching from zero_publication and updating column handling	2026-06-11 16:17:14 -07:00
Rohan Verma	6c8c559254	Merge pull request #1484 from MODSetter/dev. feat(podcasts): rebuild podcast pipeline with lifecycle architecture, zero sync, and unified credit wallet	2026-06-11 16:07:15 -07:00

6644 commits