- Introduced a utility that parses .xlsx files into markdown, so Excel documents can be processed natively.
- Updated the Google Drive content extractor to use the new Excel parser when handling spreadsheet files.
- Extended file type detection and export logic to support more document formats.
- Added unit tests for the Excel parser and its integration with the existing content extraction workflows.
- Added authlib version 1.6.9 and tornado version 6.5.5 to the project dependencies.
- Updated authlib version in uv.lock to 1.6.9 with corresponding source and wheel URLs.
- Included tornado in the dependencies section of uv.lock for consistency.
- Added notion-markdown dependency to pyproject.toml.
- Refactored the _markdown_to_blocks method to use notion-markdown for converting markdown content to Notion blocks.
- Updated the create-notion-page component to use Spinner instead of Loader2Icon as the loading indicator.
- Removed outdated dependencies: unstructured-client and langchain-unstructured.
- Added new versions of unstructured-client (0.42.3), unstructured[all-docs] (0.18.31), and langchain-unstructured (1.0.1).
- Added gitingest as a dependency to streamline the ingestion of GitHub repositories.
- Refactored GitHubConnector to use gitingest for repository digest generation, reducing API calls.
- Updated the GitHub indexer to process entire repository digests, which simplifies indexing and improves performance.
- Modified the GitHub connect form to indicate that the Personal Access Token is optional for public repositories.
- Introduced a new notifications table with relevant fields and indexes.
- Configured Electric SQL replication for the notifications, search_source_connectors, and documents tables.
- Centralized the Electric SQL user credentials in the migration script so they are defined in one place.
- Made user and publication creation in the migration idempotent, so the migration can be re-run safely.
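
The idempotent user and publication creation in the last entries could be sketched roughly as below. This is a minimal illustration, not the actual migration: the role name, password, publication name, and helper functions are hypothetical placeholders, and only the three table names come from the entries above. Each statement checks the PostgreSQL catalog first, so running it a second time is a no-op.

```python
"""Sketch: build idempotent PostgreSQL statements for a replication user and a publication."""

# Hypothetical placeholders (assumptions): the real migration may use different values.
ELECTRIC_USER = "electric"
ELECTRIC_PASSWORD = "change-me"
PUBLICATION = "electric_publication_default"

# Table names taken from the changelog entries above.
REPLICATED_TABLES = ("notifications", "search_source_connectors", "documents")


def create_user_sql(user: str, password: str) -> str:
    """Return a DO block that creates the role only if pg_roles has no such role."""
    # Values are interpolated directly, so they must be trusted constants.
    return f"""
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '{user}') THEN
        CREATE USER {user} WITH REPLICATION PASSWORD '{password}';
    END IF;
END
$$;
""".strip()


def create_publication_sql(publication: str, tables: tuple[str, ...]) -> str:
    """Return a DO block that creates the publication only if it is missing."""
    table_list = ", ".join(tables)
    return f"""
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_publication WHERE pubname = '{publication}') THEN
        CREATE PUBLICATION {publication} FOR TABLE {table_list};
    END IF;
END
$$;
""".strip()


def migration_statements() -> list[str]:
    """Collect the statements in the order a migration would run them."""
    return [
        create_user_sql(ELECTRIC_USER, ELECTRIC_PASSWORD),
        create_publication_sql(PUBLICATION, REPLICATED_TABLES),
    ]


if __name__ == "__main__":
    for statement in migration_statements():
        print(statement, end="\n\n")
```

If the project's migrations are written with Alembic (an assumption), each returned string could be passed to `op.execute(...)` inside `upgrade()`. Keeping the credentials in module-level constants mirrors the "centralized credentials" entry: they are defined once and reused by every statement.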