From dd231a455cdc3481142f7829ace796d960fae612 Mon Sep 17 00:00:00 2001 From: Anish Sarkar <104695310+AnishSarkar22@users.noreply.github.com> Date: Tue, 10 Mar 2026 03:28:49 +0530 Subject: [PATCH] chore: restructure documentation to enhance user experience with a new index page and dedicated prerequisites section --- surfsense_web/content/docs/index.mdx | 131 +++++++------------ surfsense_web/content/docs/meta.json | 1 + surfsense_web/content/docs/prerequisites.mdx | 86 ++++++++++++ 3 files changed, 137 insertions(+), 81 deletions(-) create mode 100644 surfsense_web/content/docs/prerequisites.mdx diff --git a/surfsense_web/content/docs/index.mdx b/surfsense_web/content/docs/index.mdx index 6c0450297..42f25465d 100644 --- a/surfsense_web/content/docs/index.mdx +++ b/surfsense_web/content/docs/index.mdx @@ -1,86 +1,55 @@ --- -title: Prerequisites -description: Required setup's before setting up SurfSense -icon: ClipboardCheck +title: Documentation +description: Welcome to SurfSense's documentation +icon: BookOpen --- +import { Card, Cards } from 'fumadocs-ui/components/card'; +import { ClipboardCheck, Download, Container, Wrench, Cable, BookOpen, FlaskConical } from 'lucide-react'; -## Auth Setup +Welcome to **SurfSense's Documentation!** Here, you'll find everything you need to get the most out of SurfSense. Dive in to explore how SurfSense can be your AI-powered research companion. -SurfSense supports both Google OAuth and local email/password authentication. Google OAuth is optional - if you prefer local authentication, you can skip this section. - -**Note**: Google OAuth setup is **required** in your `.env` files if you want to use the Gmail and Google Calendar connectors in SurfSense. - -To set up Google OAuth: - -1. Login to your [Google Developer Console](https://console.cloud.google.com/) -2. 
Enable the required APIs: - - **People API** (required for basic Google OAuth) -![Google Developer Console People API](/docs/connectors/google/google_oauth_people_api.png) -3. Set up OAuth consent screen. -![Google Developer Console OAuth consent screen](/docs/connectors/google/google_oauth_screen.png) -4. Create OAuth client ID and secret. -![Google Developer Console OAuth client ID](/docs/connectors/google/google_oauth_client.png) -5. It should look like this. -![Google Developer Console Config](/docs/connectors/google/google_oauth_config.png) - ---- - -## File Upload's - -SurfSense supports three ETL (Extract, Transform, Load) services for converting files to LLM-friendly formats: - -### Option 1: Unstructured - -Files are converted using [Unstructured](https://github.com/Unstructured-IO/unstructured) - -1. Get an Unstructured.io API key from [Unstructured Platform](https://platform.unstructured.io/) -2. You should be able to generate API keys once registered -![Unstructured Dashboard](/docs/unstructured.png) - -### Option 2: LlamaIndex (LlamaCloud) - -Files are converted using [LlamaIndex](https://www.llamaindex.ai/) which offers 50+ file format support. - -1. Get a LlamaIndex API key from [LlamaCloud](https://cloud.llamaindex.ai/) -2. Sign up for a LlamaCloud account to access their parsing services -3. LlamaCloud provides enhanced parsing capabilities for complex documents - -### Option 3: Docling (Recommended for Privacy) - -Files are processed locally using [Docling](https://github.com/DS4SD/docling) - IBM's open-source document parsing library. - -1. **No API key required** - all processing happens locally -2. **Privacy-focused** - documents never leave your system -3. **Supported formats**: PDF, Office documents (Word, Excel, PowerPoint), images (PNG, JPEG, TIFF, BMP, WebP), HTML, CSV, AsciiDoc -4. **Enhanced features**: Advanced table detection, image extraction, and structured document parsing -5. 
**GPU acceleration** support for faster processing (when available) - -**Note**: You only need to set up one of these services. - ---- - -## LLM Observability (Optional) - -This is not required for SurfSense to work. But it is always a good idea to monitor LLM interactions. So we do not have those WTH moments. - -1. Get a LangSmith API key from [smith.langchain.com](https://smith.langchain.com/) -2. This helps in observing SurfSense Researcher Agent. -![LangSmith](/docs/langsmith.png) - ---- - -## Crawler - -SurfSense have 2 options for saving webpages: -- [SurfSense Extension](https://github.com/MODSetter/SurfSense/tree/main/surfsense_browser_extension) (Overall better experience & ability to save private webpages, recommended) -- Crawler (If you want to save public webpages) - -**NOTE:** SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/) for web crawling. If you plan on using the crawler, you will need to create a Firecrawl account and get an API key. - - ---- - -## Next Steps - -Once you have all prerequisites in place, proceed to the [installation guide](/docs/installation) to set up SurfSense. 
\ No newline at end of file
+<Cards>
+  <Card
+    icon={<ClipboardCheck />}
+    title="Prerequisites"
+    description="Required setup before installing SurfSense"
+    href="/docs/prerequisites"
+  />
+  <Card
+    icon={<Download />}
+    title="Installation"
+    description="Choose your installation method"
+    href="/docs/installation"
+  />
+  <Card
+    icon={<Container />}
+    title="Docker Installation"
+    description="Deploy SurfSense with Docker Compose"
+    href="/docs/docker-installation"
+  />
+  <Card
+    icon={<Wrench />}
+    title="Manual Installation"
+    description="Set up SurfSense manually from source"
+    href="/docs/manual-installation"
+  />
+  <Card
+    icon={<Cable />}
+    title="Connectors"
+    description="Integrate with third-party services"
+    href="/docs/connectors"
+  />
+  <Card
+    icon={<BookOpen />}
+    title="How-To Guides"
+    description="Step-by-step guides for common tasks"
+    href="/docs/how-to"
+  />
+  <Card
+    icon={<FlaskConical />}
+    title="Testing"
+    description="Running and writing tests for SurfSense"
+    href="/docs/testing"
+  />
+</Cards>
diff --git a/surfsense_web/content/docs/meta.json b/surfsense_web/content/docs/meta.json
index 8401417cf..dee0cf6cb 100644
--- a/surfsense_web/content/docs/meta.json
+++ b/surfsense_web/content/docs/meta.json
@@ -5,6 +5,7 @@
   "pages": [
     "---Guides---",
     "index",
+    "prerequisites",
     "installation",
     "docker-installation",
     "manual-installation",
diff --git a/surfsense_web/content/docs/prerequisites.mdx b/surfsense_web/content/docs/prerequisites.mdx
new file mode 100644
index 000000000..6c0450297
--- /dev/null
+++ b/surfsense_web/content/docs/prerequisites.mdx
@@ -0,0 +1,86 @@
+---
+title: Prerequisites
+description: Required setup before setting up SurfSense
+icon: ClipboardCheck
+---
+
+## Auth Setup
+
+SurfSense supports both Google OAuth and local email/password authentication. Google OAuth is optional; if you prefer local authentication, you can skip this section.
+
+**Note**: Google OAuth setup is **required** in your `.env` files if you want to use the Gmail and Google Calendar connectors in SurfSense.
+
+To set up Google OAuth:
+
+1. Log in to the [Google Developer Console](https://console.cloud.google.com/)
+2. Enable the required APIs:
+   - **People API** (required for basic Google OAuth)
+![Google Developer Console People API](/docs/connectors/google/google_oauth_people_api.png)
+3. Set up the OAuth consent screen.
+![Google Developer Console OAuth consent screen](/docs/connectors/google/google_oauth_screen.png)
+4. Create an OAuth client ID and secret.
+![Google Developer Console OAuth client ID](/docs/connectors/google/google_oauth_client.png)
+5. Your configuration should look like this.
+![Google Developer Console Config](/docs/connectors/google/google_oauth_config.png)
+
+---
+
+## File Uploads
+
+SurfSense supports three ETL (Extract, Transform, Load) services for converting files to LLM-friendly formats:
+
+### Option 1: Unstructured
+
+Files are converted using [Unstructured](https://github.com/Unstructured-IO/unstructured).
+
+1. Get an Unstructured.io API key from the [Unstructured Platform](https://platform.unstructured.io/)
+2. Once registered, you can generate API keys from the dashboard
+![Unstructured Dashboard](/docs/unstructured.png)
+
+### Option 2: LlamaIndex (LlamaCloud)
+
+Files are converted using [LlamaIndex](https://www.llamaindex.ai/), which supports 50+ file formats.
+
+1. Sign up for a [LlamaCloud](https://cloud.llamaindex.ai/) account to access their parsing services
+2. Get a LlamaIndex API key from the LlamaCloud dashboard
+3. LlamaCloud provides enhanced parsing capabilities for complex documents
+
+### Option 3: Docling (Recommended for Privacy)
+
+Files are processed locally using [Docling](https://github.com/DS4SD/docling), IBM's open-source document parsing library.
+
+1. **No API key required**: all processing happens locally
+2. **Privacy-focused**: documents never leave your system
+3. **Supported formats**: PDF, Office documents (Word, Excel, PowerPoint), images (PNG, JPEG, TIFF, BMP, WebP), HTML, CSV, AsciiDoc
+4. **Enhanced features**: advanced table detection, image extraction, and structured document parsing
+5. **GPU acceleration** support for faster processing (when available)
+
+**Note**: You only need to set up one of these services.
+
+---
+
+## LLM Observability (Optional)
+
+This is not required for SurfSense to work, but it is always a good idea to monitor LLM interactions so you can debug unexpected agent behavior.
+
+1. Get a LangSmith API key from [smith.langchain.com](https://smith.langchain.com/)
+2. LangSmith helps you observe the SurfSense Researcher Agent.
+![LangSmith](/docs/langsmith.png)
+
+---
+
+## Crawler
+
+SurfSense has two options for saving webpages:
+- [SurfSense Extension](https://github.com/MODSetter/SurfSense/tree/main/surfsense_browser_extension) (better overall experience and the ability to save private webpages; recommended)
+- Crawler (for saving public webpages)
+
+**NOTE:** SurfSense currently uses [Firecrawl](https://www.firecrawl.dev/) for web crawling. If you plan to use the crawler, you will need to create a Firecrawl account and get an API key.
+
+---
+
+## Next Steps
+
+Once you have all prerequisites in place, proceed to the [installation guide](/docs/installation) to set up SurfSense.
\ No newline at end of file
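
A small sanity check could accompany the prerequisites page above: before moving on to installation, verify that the Google OAuth credentials the connectors need are actually present in the environment. This is a minimal sketch; the variable names below are assumptions for illustration only — check SurfSense's `.env.example` for the exact keys the project expects.

```python
import os

# Hypothetical variable names for illustration -- consult SurfSense's
# .env.example for the actual keys it reads.
REQUIRED_OAUTH_KEYS = [
    "GOOGLE_OAUTH_CLIENT_ID",
    "GOOGLE_OAUTH_CLIENT_SECRET",
]

def missing_oauth_keys(env):
    """Return the OAuth-related keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_OAUTH_KEYS if not env.get(key)]

# Example: check the current process environment and report any gaps.
missing = missing_oauth_keys(dict(os.environ))
if missing:
    print("Missing Google OAuth settings:", ", ".join(missing))
else:
    print("Google OAuth settings look complete.")
```

Remember that per the prerequisites above, these keys are only required when the Gmail or Google Calendar connectors are in use; local email/password authentication needs none of them.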