SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-06-12 20:45:20 +02:00

Author	SHA1	Message	Date
CREDO23	8d413ea5c2	refactor(indexing): expose chunk_markdown and embed_batch helpers Split _compute so the incremental edit path can reuse the exact same chunker selection and embedding entry points (and their test patch targets) without going through the doc-level cache.	2026-06-12 18:52:57 +02:00
CREDO23	f82dedf712	feat(indexing): add pure chunk reconciler for content-addressed diffs Greedy multiset match on chunk text decides which rows keep their embeddings, which texts need embedding, and which rows are deleted. No DB, no embeddings; fully unit-tested (reuse, head insert, middle edit, deletion, duplicates, reorder, full rewrite).	2026-06-12 18:52:46 +02:00
CREDO23	c6e71c851c	feat(chunks): add explicit position column with backfill migration Chunk ids stop reflecting document order once incremental re-indexing keeps unchanged rows across edits. Backfill preserves the historical id ordering so behavior is identical on day one.	2026-06-12 18:52:45 +02:00
CREDO23	91d947ff79	refactor(embedding-cache): rename index cache to embedding cache The cached payload is the indexing pipeline's embeddings (markdown is chunked then embedded), so "embedding cache" names the expensive output directly and removes the "index" ambiguity (DB index vs vector index vs indexing phase). Renames the service, settings, eligibility, eviction task, metrics, config flags (INDEX_CACHE_* -> EMBEDDING_CACHE_*), object prefix, and the table (index_cache_embedding_sets -> embedding_cache_sets) with its constraint and indexes. Migration 161 renamed accordingly.	2026-06-12 17:00:01 +02:00
CREDO23	4e4f7f34fa	feat(index-cache): add TTL/size eviction task and daily schedule	2026-06-12 16:48:18 +02:00
CREDO23	019aa7bf76	feat(index-cache): serve chunk embeddings from cache during indexing	2026-06-12 16:48:18 +02:00
CREDO23	e8938c119b	feat(index-cache): add recall/remember service	2026-06-12 16:48:10 +02:00
CREDO23	4d6378e031	feat(observability): add index cache hit/miss and eviction metrics	2026-06-12 16:48:10 +02:00
CREDO23	daccd304ee	feat(index-cache): add settings, eligibility, and config flags	2026-06-12 16:48:10 +02:00
CREDO23	ad6da7c6af	feat(index-cache): add embedding blob store sharing the cache backend	2026-06-12 16:48:01 +02:00
CREDO23	f541114544	feat(index-cache): add cached embedding set table and repository	2026-06-12 16:48:01 +02:00
CREDO23	59fa4c38c3	feat(index-cache): add pickle-free blob serialization	2026-06-12 16:48:01 +02:00
CREDO23	cf208365b4	feat(index-cache): add embedding set value objects	2026-06-12 16:48:01 +02:00
CREDO23	0fb1d3d37b	feat(etl-cache): route all file-based sources through the parse cache Every file ingestion path (Dropbox, Google Drive / Composio Drive, OneDrive, local folder, Obsidian, and the legacy upload handlers) now parses via the extract_with_cache facade instead of calling EtlPipelineService.extract directly, so identical bytes are deduplicated globally regardless of source. vision_llm is passed through, keeping the existing cacheability gate intact.	2026-06-12 14:47:25 +02:00
CREDO23	0808fbcdee	feat(etl-cache): emit hit/miss and eviction metrics	2026-06-12 11:57:03 +02:00
CREDO23	9efe24879d	feat(observability): add etl cache lookup and eviction metrics	2026-06-12 11:57:03 +02:00
CREDO23	ce1e90386f	refactor(etl-cache): extract pure cacheability gate	2026-06-12 11:50:51 +02:00
CREDO23	0dc2ccc003	feat(tasks): route extraction through etl cache	2026-06-12 11:23:50 +02:00
CREDO23	1c05980ffb	feat(celery): schedule etl cache eviction	2026-06-12 11:23:50 +02:00
CREDO23	9f29a885b1	feat(db): register CachedParse model	2026-06-12 11:23:50 +02:00
CREDO23	5c4eec26cc	feat(config): add ETL_CACHE_* settings	2026-06-12 11:23:50 +02:00
CREDO23	324ba141a6	feat(etl-cache): add eviction task and public API	2026-06-12 11:23:40 +02:00
CREDO23	7ad39fd995	feat(etl-cache): add eviction policy	2026-06-12 11:23:40 +02:00
CREDO23	758da06c4f	feat(etl-cache): add extract_with_cache	2026-06-12 11:23:40 +02:00
CREDO23	41dea96af4	feat(etl-cache): add EtlCacheService	2026-06-12 11:23:40 +02:00
CREDO23	87fdb37fa3	feat(etl-cache): expose storage layer	2026-06-12 11:23:40 +02:00
CREDO23	a6f2457c7c	feat(etl-cache): add MarkdownCacheStore for cache blobs	2026-06-12 11:22:57 +02:00
CREDO23	217d040e9e	feat(etl-cache): resolve cache blob storage backend	2026-06-12 11:22:57 +02:00
CREDO23	d9b1b491e9	feat(etl-cache): add cache blob object-key builder	2026-06-12 11:22:57 +02:00
CREDO23	8d3238bcd1	feat(etl-cache): expose cache persistence layer	2026-06-12 11:22:57 +02:00
CREDO23	ea10127979	feat(etl-cache): add CachedParseRepository data access	2026-06-12 11:22:57 +02:00
CREDO23	c624235780	feat(etl-cache): add CachedParse table model	2026-06-12 11:22:48 +02:00
CREDO23	205a63b9bc	feat(etl-cache): add EtlCacheSettings resolved from config	2026-06-12 11:22:48 +02:00
CREDO23	b84debd999	feat(etl-cache): expose cache schema value objects	2026-06-12 11:22:48 +02:00
CREDO23	3c9ea0011d	feat(etl-cache): add EvictionCandidate value object	2026-06-12 11:22:48 +02:00
CREDO23	24f824b597	feat(etl-cache): add ParseKey cache identity value object	2026-06-12 11:22:48 +02:00
DESKTOP-RTLN3BA\$punk	c855be8ccd	fix(auto_reload): update task to use a lambda for user_id in async call	2026-06-11 16:51:18 -07:00
DESKTOP-RTLN3BA\$punk	cff721aa42	feat(migration): evolve podcast lifecycle by detaching from zero_publication and updating column handling	2026-06-11 16:17:14 -07:00
DESKTOP-RTLN3BA\$punk	05190da0a9	chore: linting	2026-06-11 15:31:43 -07:00
CREDO23	7b30a76856	fix(gitignore): anchor data/ rule; track podcast voice catalogs	2026-06-12 00:06:37 +02:00
CREDO23	41f4a58663	Merge remote-tracking branch 'upstream/dev' into improvement-podcast-graph # Conflicts: # surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py	2026-06-11 23:14:49 +02:00
DESKTOP-RTLN3BA\$punk	c3695e7837	feat: update auto-reload settings and enhance payment session creation - Added currency parameter to the Stripe checkout session for auto-reload setup. - Integrated AutoReloadSettings component into the BuyMorePage for improved user experience. - Removed deprecated AutoReloadSettings component from user settings directory. - Updated import paths for AutoReloadSettings in purchases page to reflect new structure.	2026-06-11 13:29:40 -07:00
CREDO23	ca9b157676	fix(podcasts): keep legacy episodes readable and guard regenerate	2026-06-11 12:43:07 +02:00
CREDO23	aa7f14d94f	feat(podcasts): add revert-regeneration and surface cancel on the live card	2026-06-11 12:31:42 +02:00
CREDO23	f0fc660d70	feat(podcasts): constrain monologue briefs to a single speaker	2026-06-11 11:56:57 +02:00
CREDO23	eb56acc407	refactor(podcasts): regenerate via brief gate, render brief inline in chat	2026-06-11 11:45:17 +02:00
CREDO23	11a6b178a0	refactor(podcasts): drop transcript gate, add regenerate-from-ready and voice previews	2026-06-11 10:42:13 +02:00
DESKTOP-RTLN3BA\$punk	65e511f77b	feat: enhance credit management and user experience - Updated database queries to check for column existence with schema context. - Modified credit purchase quantity limits to allow up to 10,000 credits. - Improved user interface for credit purchases, enabling custom amounts and clamping input values. - Adjusted FAQ content to clarify credit purchasing process.	2026-06-10 22:52:27 -07:00
DESKTOP-RTLN3BA\$punk	a7407502d3	feat(refactor): refactor payment system to implement unified credit wallet. - Updated environment variables and configurations for credit purchases via Stripe, replacing legacy page pack system. - Introduced auto-reload feature for credit top-ups and modified database models to track credit transactions. - Updated notification system to handle insufficient credits and auto-reload failures. - Adjusted API routes and schemas to reflect changes in credit management.	2026-06-10 16:49:03 -07:00
CREDO23	97ab7a88fd	refactor(podcasts): remove legacy podcaster agent, task, and schema	2026-06-10 21:45:04 +02:00

2252 commits
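
Commit f82dedf712 describes a pure chunk reconciler: a greedy multiset match on chunk text that decides which rows keep their embeddings, which texts need embedding, and which rows are deleted. The following is a minimal sketch of that idea, not the repository's code. The names `Plan` and `reconcile`, the `(row_id, text)` input shape, and the first-come matching order for duplicate texts are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical result type; field names are illustrative only."""
    keep: list[tuple[int, int]] = field(default_factory=list)   # (row_id, new_position)
    embed: list[tuple[int, str]] = field(default_factory=list)  # (new_position, text)
    delete: list[int] = field(default_factory=list)             # row ids to drop


def reconcile(existing: list[tuple[int, str]], new_texts: list[str]) -> Plan:
    """Match new chunk texts against existing rows by exact text.

    Pure function: no database access and no embedding calls.
    """
    # Pool existing rows by text. Duplicate texts queue up in their
    # original order, so the pool behaves as a multiset.
    pool: dict[str, deque[int]] = defaultdict(deque)
    for row_id, text in existing:
        pool[text].append(row_id)

    plan = Plan()
    for position, text in enumerate(new_texts):
        bucket = pool.get(text)
        if bucket:
            # Greedy step: reuse the earliest unclaimed row with this text.
            plan.keep.append((bucket.popleft(), position))
        else:
            # No unclaimed row carries this text, so it must be embedded.
            plan.embed.append((position, text))

    # Rows that no new chunk claimed are deleted.
    plan.delete = sorted(rid for bucket in pool.values() for rid in bucket)
    return plan


if __name__ == "__main__":
    rows = [(1, "a"), (2, "b"), (3, "a")]
    print(reconcile(rows, ["x", "a", "a", "c"]))
```

In this sketch a reordered document reuses every row and only the positions change, which is consistent with commit c6e71c851c introducing an explicit position column once chunk ids stop reflecting document order.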