SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-05-08 15:22:39 +02:00

Author	SHA1	Message	Date
CREDO23	40304c6795	feat(connectors): add Google Drive content extraction using existing ETL - Download files from Google Drive to temporary location - Export Google Workspace files as PDF - Delegate content extraction to existing process_file_in_background - Reuse Surfsense's ETL services (Unstructured, LlamaCloud, Docling)	2025-12-28 15:54:50 +02:00
CREDO23	701c3409b3	feat(connectors): add Google Drive file type detection and mapping - Detect Google Workspace files (Docs, Sheets, Slides) - Map to PDF export format to preserve rich content (images, formatting) - Identify files to skip (shortcuts, unsupported types)	2025-12-28 15:54:42 +02:00
CREDO23	74386affdc	feat(connectors): add Google Drive API client wrapper - Build and manage Google Drive service with credentials - List files with query support and pagination - Download binary files and export Google Workspace files as PDF - Handle HTTP errors gracefully	2025-12-28 15:54:32 +02:00
CREDO23	2c8717b14b	feat(connectors): add Google Drive credentials module for OAuth management - Handle Google OAuth credential initialization and validation - Automatic token refresh with database persistence - Reuse existing tokens when valid	2025-12-28 15:54:26 +02:00
DESKTOP-RTLN3BA\$punk	9d0721de43	feat: Replace AsyncChromiumLoader with Playwright for web crawling and content extraction in link preview and web crawler connector modules.	2025-12-27 13:58:00 -08:00
Anish Sarkar	ebc04f590e	refactor: improve write_todos tool and UI components - Refactored the write_todos tool to enhance argument and result schemas using Zod for better validation and type safety. - Updated the WriteTodosToolUI to streamline the rendering logic and improve loading states, ensuring a smoother user experience. - Enhanced the Plan and TodoItem components to better handle streaming states and display progress, providing clearer feedback during task management. - Cleaned up code formatting and structure for improved readability and maintainability.	2025-12-26 17:49:56 +05:30
Anish Sarkar	d9df63f57e	refactor: enhance web crawling functionality with Firecrawl integration - Updated WebCrawlerConnector to prioritize Firecrawl API for crawling if an API key is provided, falling back to Chromium if Firecrawl fails. - Improved error handling to log failures from both Firecrawl and Chromium. - Enhanced link preview tool to use a random User-Agent for better compatibility with web servers. - Passed Firecrawl API key to the stream_new_chat function for improved configuration management.	2025-12-26 02:37:20 +05:30
CREDO23	64cd65bc1f	use trafilatura to extract page content from the chromium result	2025-12-19 10:05:51 +02:00
CREDO23	1f60d1c22f	add user agent to AsyncChromiumLoader	2025-12-17 19:43:54 +02:00
CREDO23	4cfeffb38a	refactor: update the webcrawler connector formatter	2025-12-17 18:42:37 +02:00
Differ	e238fab638	Merge remote-tracking branch 'upstream/main' into feat/bookstack-connector	2025-12-06 09:15:02 +08:00
Differ	500bc60d02	fix: add input validation, retry limit, code formatting, and exclude i18n from secret detection	2025-12-05 09:58:49 +08:00
CREDO23	803f792a9d	clean up	2025-12-04 12:55:19 +02:00
CREDO23	521cea3ef0	update query params for get issues by date range method	2025-12-04 12:53:18 +02:00
Differ	6b1b8d0f2e	feat: add BookStack connector for wiki documentation indexing	2025-12-04 14:08:44 +08:00
CREDO23	107f013ff9	jira-connector: update get_issues_by_date_range method	2025-12-04 01:21:46 +02:00
CREDO23	abf017eabb	jira-connector: update get_issues_by_date_range method	2025-12-04 00:48:54 +02:00
CREDO23	4df6b09db9	jira-connector: update get all issues method	2025-12-04 00:42:10 +02:00
CREDO23	875924e5fd	jira-connector: update make_api_request to accept POST with payload	2025-12-04 00:38:13 +02:00
DESKTOP-RTLN3BA\$punk	0b1ca97acf	refactor(webcrawler): update scraping logic to use v2 API and improve error handling	2025-11-26 14:30:08 -08:00
DESKTOP-RTLN3BA\$punk	8f30cfd69a	chore(lint): ruff checks	2025-11-26 13:22:31 -08:00
samkul-swe	6d19e0fad8	Fixing search logic	2025-11-22 13:33:16 -08:00
samkul-swe	896e410e2a	Webcrawler connector draft	2025-11-21 23:27:21 -08:00
DESKTOP-RTLN3BA\$punk	a3a5b13f48	chore: linting	2025-11-03 16:00:58 -08:00
DESKTOP-RTLN3BA\$punk	e65d74f2e2	refactor: added batch commits and Increased task time limits in celery_app.py - Increased task time limits in celery_app.py for longer processing times. - Enhanced pagination logic in NotionHistoryConnector to handle large result sets. - Implemented batch commits every 10 documents across various indexers (Airtable, ClickUp, Confluence, Discord, GitHub, Google Calendar, Gmail, JIRA, Linear, Luma, Notion, Slack) to improve performance and reduce database load. - Updated final commit logging for clarity on total documents processed.	2025-11-03 15:57:19 -08:00
DESKTOP-RTLN3BA\$punk	0e6669ac4e	fix: celery_app path and gmail indexing	2025-10-21 21:11:41 -07:00
Anish Sarkar	bbb2abfc02	fix: ran formatter as per coderabbitai	2025-10-17 02:44:44 +05:30
Anish Sarkar	0ff1b586a2	feat: update Elasticsearch integration and logging - revised Elasticsearch connector enum revision IDs - added `TaskLoggingService` to elasticsearch_indexer - integrated Elasticsearch into prompts.py as requested	2025-10-17 02:21:56 +05:30
Anish Sarkar	929035f802	Merge remote-tracking branch 'upstream/main' into feature/elasticsearch-connector	2025-10-16 16:24:37 +05:30
DESKTOP-RTLN3BA\$punk	31982cea9a	chore: removed content trunking for better UI	2025-10-14 14:19:48 -07:00
Anish Sarkar	55d752e3c8	feat: added elasticsearch connector	2025-10-12 09:39:04 +05:30
DESKTOP-RTLN3BA\$punk	aea09a5dad	feat: Moved searchconnectors association from user to searchspace - Need to move llm configs to searchspace	2025-10-08 21:13:01 -07:00
DESKTOP-RTLN3BA\$punk	94367e4226	chore: linting and formatting	2025-09-28 22:26:26 -07:00
samkul-swe	9d2b808e66	Added Luma connector	2025-09-28 14:59:10 -07:00
Rohan Verma	662212d4e2	Merge pull request #295 from CREDO23/feature/airtable-connector [Feature] Add Airtable connector	2025-09-03 12:49:14 -07:00
Rohan Verma	c2030cec48	Merge pull request #275 from CREDO23/improvement/persist-refreshed-token-in-google-related-connector [Improvement] Google connectors | Update the connector config after refreshing the token	2025-08-26 18:47:36 -07:00
CREDO23	45d2c18c16	update airtable indexer	2025-08-26 19:17:46 +02:00
CREDO23	c4b7c45d6d	Add Airtable connector	2025-08-26 15:41:24 +02:00
CREDO23	ecbb1f27e0	clean up	2025-08-26 11:53:27 +02:00
CREDO23	85664f2ff8	update the connector config after refreshing google calendar access token	2025-08-26 11:49:31 +02:00
DESKTOP-RTLN3BA\$punk	3b87ecc3c5	fix: made notion indexing async	2025-08-21 14:43:04 -07:00
CREDO23	9711af2b72	refresh the token when expired	2025-08-21 01:09:13 +02:00
CREDO23	b0b6df0971	updated the connector config after refreshing the token	2025-08-20 20:32:08 +02:00
CREDO23	d840113bff	add relevant bot suggestions	2025-08-15 09:12:40 +02:00
CREDO23	b7e941bcb2	update google gmail connector	2025-08-15 09:11:14 +02:00
DESKTOP-RTLN3BA\$punk	5aa52375c3	refactor: refactored background_tasks & indexing_tasks	2025-08-12 15:28:13 -07:00
CREDO23	edf46e4de1	update search source connector schema	2025-08-03 12:16:40 +02:00
CREDO23	44d2338663	get all primary calendar event by default	2025-08-03 12:16:40 +02:00
CREDO23	4cb00735ac	add coderabbit suggestions	2025-07-30 22:25:47 +02:00
CREDO23	ede3dce9af	update ClickUp connector	2025-07-30 21:28:31 +02:00

232 commits