SurfSense/surfsense_backend/app/utils
DhruvTilva e16e4e2c5c fix: guard missing text_as_html in Table element markdown conversion
When the Unstructured API returns a Table element without text_as_html
in its metadata (e.g. local install or free-tier API), the lambda was
raising KeyError: 'text_as_html', crashing the entire document
indexing pipeline for any file containing tables.

Guard the key access with .get() and fall back to the plain extracted
text content (x) so the pipeline continues and the table content is
still indexed, just without HTML formatting.
2026-06-25 23:52:15 +05:30
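
The guard described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual code in document_converters.py: the function name `table_to_markdown` and its `text`/`metadata` parameters are hypothetical stand-ins for the lambda and element fields the commit refers to, and the sample table is made up.

```python
from typing import Any


def table_to_markdown(text: str, metadata: dict[str, Any]) -> str:
    """Prefer the HTML rendering of a table; fall back to its plain text.

    Hypothetical helper: `text` stands for the element's extracted text
    (the `x` mentioned in the commit message) and `metadata` for the
    element's metadata dict.
    """
    # dict.get() returns None when the key is absent, so a missing
    # 'text_as_html' no longer raises KeyError.
    html = metadata.get("text_as_html")
    # Fall back to plain text when the key is missing or empty.
    return html if html else text


# Element with an HTML rendering: the HTML is used.
with_html = table_to_markdown(
    "a b", {"text_as_html": "<table><tr><td>a</td><td>b</td></tr></table>"}
)

# Element without 'text_as_html': the plain text is indexed instead.
without_html = table_to_markdown("a b", {})
```

With this shape, a table that lacks `text_as_html` is still indexed from its plain text rather than aborting the whole document.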
proxy chore: linting 2026-06-09 00:42:26 -07:00
async_retry.py feat: updated agent harness 2026-04-28 09:22:19 -07:00
blocknote_to_markdown.py refactor: update _render_block function to use a parameter for numbered list counter, improving state management 2026-02-17 12:59:47 +05:30
chat_comments.py fix: use delimited format for mention highlighting 2026-01-16 20:10:09 +02:00
connector_naming.py chore: linting 2026-04-27 14:04:50 -07:00
content_utils.py fix(chat): normalize provider-safe message history 2026-06-12 02:17:37 +05:30
document_converters.py fix: guard missing text_as_html in Table element markdown conversion 2026-06-25 23:52:15 +05:30
document_versioning.py chore: ran linting 2026-04-03 13:14:40 +05:30
file_extensions.py Route uploaded images to vision LLM with document-parser fallback 2026-04-09 14:33:33 +02:00
google_credentials.py chore: ran linting 2026-03-21 13:21:19 +05:30
indexing_locks.py feat: fixed and improved search and background task management. 2026-02-09 14:03:56 -08:00
notion_utils.py Implement KB sync after Notion page updates with block ID verification 2026-02-17 20:30:12 +02:00
oauth_security.py refactor: move PKCE pair generation for airtable 2026-04-04 03:36:54 +05:30
perf.py feat(observability): add SurfSense metric helpers 2026-05-21 23:02:20 +05:30
periodic_scheduler.py Merge remote-tracking branch 'upstream/dev' into feat/obsidian-plugin 2026-04-24 21:34:55 +05:30
proxy_config.py feat(proxy): integrate Scrapling for enhanced web scraping capabilities 2026-06-09 00:15:10 -07:00
rbac.py feat: Implement Role-Based Access Control (RBAC) for search space resources. 2025-11-27 22:45:04 -08:00
refresh_tokens.py Add refresh token auth routes and utilities 2026-02-05 17:29:50 +02:00
signed_image_urls.py feat: added image gen support 2026-02-05 16:43:48 -08:00
user_message_multimodal.py chore: linting 2026-04-28 21:37:51 -07:00
validators.py refactor: Update GitHub connector to use gitingest CLI 2026-01-20 23:24:33 +05:30
webcrawler_utils.py style(backend): run ruff format on 10 files 2026-01-28 22:20:02 +02:00