---
title: Prerequisites
description: Required setup before installing SurfSense
---

## Auth Setup

SurfSense supports both Google OAuth and local email/password authentication. Google OAuth is optional: if you prefer local authentication, you can skip this section.

**Note**: Google OAuth credentials are **required** in your `.env` files if you want to use the Gmail or Google Calendar connectors in SurfSense.

To set up Google OAuth:

1. Log in to the [Google Developer Console](https://console.cloud.google.com/).
2. Enable the required APIs:
   - **People API** (required for basic Google OAuth)
   - **Gmail API** (required if you want to use the Gmail connector)
   - **Google Calendar API** (required if you want to use the Google Calendar connector)
   
3. Set up the OAuth consent screen.
   
4. Create an OAuth client ID and secret.
   
5. The finished configuration should look like this.
   

---

## File Uploads

SurfSense supports three ETL (Extract, Transform, Load) services for converting files to LLM-friendly formats:

### Option 1: Unstructured

Files are converted using [Unstructured](https://github.com/Unstructured-IO/unstructured).

1. Get an Unstructured.io API key from the [Unstructured Platform](https://platform.unstructured.io/).
2. Once registered, you should be able to generate API keys.
   

### Option 2: LlamaIndex (LlamaCloud)

Files are converted using [LlamaIndex](https://www.llamaindex.ai/), which supports 50+ file formats. LlamaCloud provides enhanced parsing capabilities for complex documents.

1. Sign up for a [LlamaCloud](https://cloud.llamaindex.ai/) account to access its parsing services.
2. Get a LlamaIndex API key from LlamaCloud.

### Option 3: Docling (Recommended for Privacy)

Files are processed locally using [Docling](https://github.com/DS4SD/docling), IBM's open-source document parsing library.

1. **No API key required** - all processing happens locally
2. **Privacy-focused** - documents never leave your system
3. **Supported formats**: PDF, Office documents (Word, Excel, PowerPoint), images (PNG, JPEG, TIFF, BMP, WebP), HTML, CSV, AsciiDoc
4. **Enhanced features**: Advanced table detection, image extraction, and structured document parsing
5. **GPU acceleration** support for faster processing (when available)

**Note**: You only need to set up one of these services.
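
Because only one ETL service is needed, and only two of the three need an API key, the choice can be validated before installation. The sketch below is illustrative rather than SurfSense's actual logic: the `ETL_SERVICE` selector, its three values, and the key variable names are all assumed names.

```python
from typing import Dict, List, Mapping, Optional

# Hypothetical names: the service labels and key variables are illustrative only.
REQUIRED_KEY: Dict[str, Optional[str]] = {
    "UNSTRUCTURED": "UNSTRUCTURED_API_KEY",
    "LLAMACLOUD": "LLAMA_CLOUD_API_KEY",
    "DOCLING": None,  # runs locally, so no API key is needed
}


def check_etl_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the ETL settings (empty means OK)."""
    service = env.get("ETL_SERVICE", "").strip().upper()
    if service not in REQUIRED_KEY:
        options = ", ".join(REQUIRED_KEY)
        return [f"ETL_SERVICE must be one of: {options}"]
    key_name = REQUIRED_KEY[service]
    if key_name and not env.get(key_name, "").strip():
        return [f"{service} requires {key_name} to be set"]
    return []
```

With these assumed names, a Docling setup passes with no key at all, while choosing Unstructured or LlamaCloud without the matching key is reported as a problem.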

---

## LLM Observability (Optional)

This is not required for SurfSense to work, but it is always a good idea to monitor LLM interactions so you do not have those WTH moments.

1. Get a LangSmith API key from [smith.langchain.com](https://smith.langchain.com/).
2. This helps in observing the SurfSense Researcher Agent.
   

---

## Crawler

SurfSense has two options for saving webpages:

- [SurfSense Extension](https://github.com/MODSetter/SurfSense/tree/main/surfsense_browser_extension) (recommended: a better overall experience and the ability to save private webpages)
- Crawler (if you want to save public webpages)

**NOTE:** SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/) for web crawling. If you plan on using the crawler, you will need to create a Firecrawl account and get an API key.

---

## Next Steps

Once you have all prerequisites in place, proceed to the [installation guide](/docs/installation) to set up SurfSense.