SurfSense/surfsense_backend
DhruvTilva e16e4e2c5c fix: guard missing text_as_html in Table element markdown conversion
When the Unstructured API returns a Table element without text_as_html
in its metadata (e.g. local install or free-tier API), the lambda was
raising KeyError: 'text_as_html', crashing the entire document
indexing pipeline for any file containing tables.

Guard the key access with .get() and fall back to the plain extracted
text content (x) so the pipeline continues and the table content is
still indexed, just without HTML formatting.
2026-06-25 23:52:15 +05:30
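The guard described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name `table_to_text`, the plain-dict `metadata` argument, and the choice to treat an empty `text_as_html` value the same as a missing one are all assumptions made for the example. Only the key name `text_as_html` and the fallback to the extracted text (`x`) come from the commit message.

```python
from typing import Any


def table_to_text(metadata: dict[str, Any], x: str) -> str:
    """Hypothetical helper: pick the best available rendering of a Table element.

    metadata -- the element's metadata as a plain dict (assumed shape)
    x        -- the plain extracted text of the element
    """
    # dict.get() returns None when the key is absent instead of raising
    # KeyError, so a Table element without HTML no longer aborts indexing.
    html = metadata.get("text_as_html")

    # Assumption: an empty or None value is treated like a missing key,
    # and we fall back to the plain text so the table is still indexed.
    return html if html else x


# With HTML present, the HTML rendering is used.
with_html = table_to_text({"text_as_html": "<table><tr><td>1</td></tr></table>"}, "1")

# Without it (e.g. a local install), the plain text is used instead.
without_html = table_to_text({}, "1")
```

In the real pipeline the returned HTML would presumably still be passed through whatever HTML-to-markdown step the converter already uses; that step is left out here because the commit message does not describe it.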
alembic feat: add position column to chunks for explicit document order 2026-06-18 08:55:47 -07:00
app fix: guard missing text_as_html in Table element markdown conversion 2026-06-25 23:52:15 +05:30
scripts refactor(provider-configuration): standardize provider parameter naming across various modules and improve quota error handling in tests 2026-06-13 14:23:32 +05:30
tests feat: add unit tests for LLM bundle streaming functionality 2026-06-18 23:32:36 +05:30
.dockerignore chore(backend): exclude tests/ from production Docker image 2026-05-06 17:16:22 +05:30
.env.example Merge commit '7ce409c580' into dev 2026-06-16 22:48:14 -07:00
.gitignore fix(gitignore): anchor data/ rule; track podcast voice catalogs 2026-06-12 00:06:37 +02:00
.python-version feat: SurfSense v0.0.6 init 2025-03-14 18:53:14 -07:00
alembic.ini add github connector, add alembic for db migrations, fix bug updating connectors 2025-04-13 13:56:22 -07:00
celery_worker.py fix: celery_app path and gmail indexing 2025-10-21 21:11:41 -07:00
Dockerfile feat(proxy): integrate Scrapling for enhanced web scraping capabilities 2026-06-09 00:15:10 -07:00
main.py feat(observability): add OpenTelemetry process bootstrap 2026-05-21 23:01:54 +05:30
pyproject.toml feat: bumped version to 0.0.29 2026-06-17 22:29:50 -07:00
uv.lock feat: bumped version to 0.0.29 2026-06-17 22:29:50 -07:00