SurfSense/surfsense_backend/app
DhruvTilva e16e4e2c5c fix: guard missing text_as_html in Table element markdown conversion
When the Unstructured API returns a Table element without text_as_html
in its metadata (e.g. local install or free-tier API), the lambda was
raising KeyError: 'text_as_html', crashing the entire document
indexing pipeline for any file containing tables.

Guard the key access with .get() and fall back to the plain extracted
text content (x) so the pipeline continues and the table content is
still indexed, just without HTML formatting.
2026-06-25 23:52:15 +05:30
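
The guard described in the commit message above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names, the dispatch table, and the argument shapes (`text` as the element's plain text, `metadata` as a plain dict) are assumptions made for the example. Only the `text_as_html` key and the fallback to plain text come from the commit message.

```python
def table_to_markdown(text, metadata):
    """Hypothetical converter for a Table element.

    Uses the HTML rendering when the metadata carries one, otherwise
    falls back to the plain extracted text so indexing can continue.
    """
    # metadata["text_as_html"] would raise KeyError when the key is absent;
    # .get() returns None instead. The `or` also covers an empty or None
    # value, which is an assumption beyond what the commit message states.
    return metadata.get("text_as_html") or text


# Hypothetical dispatch table keyed by element category.
ELEMENT_CONVERTERS = {
    "Table": table_to_markdown,
    "NarrativeText": lambda text, metadata: text,
}


def element_to_markdown(category, text, metadata=None):
    """Convert one element, passing unknown categories through as plain text."""
    converter = ELEMENT_CONVERTERS.get(category, lambda text, metadata: text)
    return converter(text, metadata or {})


if __name__ == "__main__":
    # Table with an HTML rendering: the HTML is used.
    print(element_to_markdown("Table", "a b", {"text_as_html": "<table></table>"}))
    # Table without text_as_html: the plain text is used, no KeyError.
    print(element_to_markdown("Table", "a b", {}))
```

With this shape, a table that lacks `text_as_html` is still indexed, just without HTML formatting, which matches the behavior the commit describes.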
agents fix(image-gen): resolve relative URLs returned by Xinference and compatible backends 2026-06-17 10:57:39 +05:30
automations feat(database-migrations): add migration to remove legacy model config tables and remove stale model connection code 2026-06-13 12:45:43 +05:30
config feat: fix onboarding trigger 2026-06-17 23:30:56 -07:00
connectors feat(etl-cache): route all file-based sources through the parse cache 2026-06-12 14:47:25 +02:00
etl_pipeline chore: linting 2026-06-17 22:31:36 -07:00
event_bus refactor(event_bus): wire catalog and events into package, rename builtin to events 2026-05-29 22:15:18 +02:00
file_storage chore: linting 2026-06-09 00:42:26 -07:00
gateway refactor(config): update GATEWAY_ENABLED variable to FALSE and adjust related configurations for improved messaging gateway handling 2026-06-16 23:49:26 +05:30
indexing_pipeline chore: linting 2026-06-17 22:31:36 -07:00
notifications use started_title in document processing handler 2026-06-17 15:06:05 +02:00
observability feat(observability): add chunk reconcile metric and kill-switch flag 2026-06-12 18:52:57 +02:00
podcasts Merge commit '7ce409c580' into dev 2026-06-16 22:48:14 -07:00
prompts feat(database-migrations): add migration to remove legacy model config tables and remove stale model connection code 2026-06-13 12:45:43 +05:30
retriever refactor(chunks): order chunk reads by (document_id, position) 2026-06-12 18:53:21 +02:00
routes feat: fix onboarding trigger 2026-06-17 23:30:56 -07:00
schemas refactor(model-connections): remove unused fields and update verification logic 2026-06-14 02:46:19 +05:30
services Merge pull request #1491 from AnishSarkar22/feat/unified-model-connections 2026-06-14 17:50:48 -07:00
tasks feat: enable streaming in LLM bundle construction 2026-06-18 23:39:55 +05:30
templates feat: update report generation and export capabilities to support multiple formats (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text) across documentation and backend 2026-03-09 18:41:21 -07:00
utils fix: guard missing text_as_html in Table element markdown conversion 2026-06-25 23:52:15 +05:30
__init__.py feat: SurfSense v0.0.6 init 2025-03-14 18:53:14 -07:00
app.py refactor(model-connections): move backend model connections to provider capabilities 2026-06-12 02:17:22 +05:30
celery_app.py Merge remote-tracking branch 'upstream/dev' into features/documents-injestion-layered-cached 2026-06-14 11:30:33 +02:00
db.py feat: add position column to chunks for explicit document order 2026-06-18 08:55:47 -07:00
exceptions.py feat: add processing mode support for document uploads and ETL pipeline, improved error handling ux 2026-04-14 21:26:00 -07:00
rate_limiter.py try: ip fix for cloudflare 2026-04-16 02:13:52 -07:00
session_events.py refactor: anonymous/free chat experience 2026-05-31 15:58:21 -07:00
users.py Seed default prompts on registration and for existing users 2026-03-31 18:12:09 +02:00
zero_publication.py feat(migration): evolve podcast lifecycle by detaching from zero_publication and updating column handling 2026-06-11 16:17:14 -07:00