diff --git a/README.md b/README.md index 4cb51fa42..7ac8299fb 100644 --- a/README.md +++ b/README.md @@ -72,97 +72,23 @@ Join the [SurfSense Discord](https://discord.gg/ejRNvftDp9) and help shape the f ## How to get started? -### PRE-START CHECKS +### Installation Options -#### PGVector -Make sure pgvector extension is installed on your machine. Setup Guide https://github.com/pgvector/pgvector?tab=readme-ov-file#installation +SurfSense provides two installation methods: -#### File Uploading Support -For File uploading you need Unstructured.io API key. You can get it at http://platform.unstructured.io/ +1. **[Docker Installation (Recommended)](https://www.surfsense.net/docs/docker-installation)** - The easiest way to get SurfSense up and running with all dependencies containerized. -#### Auth -SurfSense now only works with Google OAuth. Make sure to set your OAuth Client at https://developers.google.com/identity/protocols/oauth2 . We need client id and client secret for backend. Make sure to enable people api and add the required scopes under data access (openid, userinfo.email, userinfo.profile) +2. **[Manual Installation](https://www.surfsense.net/docs/manual-installation)** - For users who prefer more control over their setup or need to customize their deployment. -![gauth](https://github.com/user-attachments/assets/80d60fe5-889b-48a6-b947-200fdaf544c1) +Both installation guides include detailed OS-specific instructions for Windows, macOS, and Linux. -#### LLM Observability -One easy way to observe SurfSense Researcher Agent is to use LangSmith. 
Get its API KEY from https://smith.langchain.com/ +Before installation, make sure to complete the [prerequisite setup steps](https://www.surfsense.net/docs/) including: +- PGVector setup +- Google OAuth configuration +- Unstructured.io API key +- Other required API keys -**Open AI LLMS** -![openai_langraph](https://github.com/user-attachments/assets/b1f4c7a1-0a66-4d21-9053-2e09a5634f95) - - -**Ollama LLMS** -![ollama_langgraph](https://github.com/user-attachments/assets/5b6c870e-095c-4368-86e6-f7488e0fca28) - - -#### Crawler Support -SurfSense currently uses [Firecrawl.py](https://www.firecrawl.dev/) right now. Playwright crawler support will be added soon. - - -## Quick Start - -### Preferred Method: Docker Setup -The recommended way to run SurfSense is using Docker, which ensures consistent environment across different systems. - -1. Make sure you have Docker and Docker Compose installed -2. Follow the detailed instructions in our [Docker Setup Guide](DOCKER_SETUP.md) - -```bash -# Start all services with one command -docker-compose up --build -``` - ---- -### Alternative: Manual Setup - -### Backend (./surfsense_backend) -This is the core of SurfSense. Before we begin let's look at `.env` variables' that we need to successfully setup SurfSense. - -|ENV VARIABLE|DESCRIPTION| -|--|--| -| DATABASE_URL| Your PostgreSQL database connection string. Eg. `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`| -| SECRET_KEY| JWT Secret key used for authentication. Should be a secure random string. Eg. `SURFSENSE_SECRET_KEY_123456789`| -| GOOGLE_OAUTH_CLIENT_ID| Google OAuth client ID obtained from Google Cloud Console when setting up OAuth authentication| -| GOOGLE_OAUTH_CLIENT_SECRET| Google OAuth client secret obtained from Google Cloud Console when setting up OAuth authentication| -| NEXT_FRONTEND_URL| URL where your frontend application is hosted. Eg. `http://localhost:3000`| -| EMBEDDING_MODEL| Name of the embedding model to use for vector embeddings. 
Currently works with Sentence Transformers only. Expect other embeddings soon. Eg. `mixedbread-ai/mxbai-embed-large-v1`| -| RERANKERS_MODEL_NAME| Name of the reranker model for search result reranking. Eg. `ms-marco-MiniLM-L-12-v2`| -| RERANKERS_MODEL_TYPE| Type of reranker model being used. Eg. `flashrank`| -| FAST_LLM| LiteLLM routed Smaller, faster LLM for quick responses. Eg. `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`| -| STRATEGIC_LLM| LiteLLM routed Advanced LLM for complex reasoning tasks. Eg. `openai/gpt-4o`, `ollama/gemma3:12b`| -| LONG_CONTEXT_LLM| LiteLLM routed LLM capable of handling longer context windows. Eg. `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`| -| UNSTRUCTURED_API_KEY| API key for Unstructured.io service for document parsing| -| FIRECRAWL_API_KEY| API key for Firecrawl service for web crawling and data extraction| - -IMPORTANT: Since LLM calls are routed through LiteLLM make sure to include API keys of LLM models you are using. For example if you used `openai/gpt-4o` make sure to include OpenAI API Key `OPENAI_API_KEY` or if you use `gemini/gemini-2.0-flash` then you include `GEMINI_API_KEY`. - -You can also integrate any LLM just follow this https://docs.litellm.ai/docs/providers - -Now once you have everything let's proceed to run SurfSense. -1. Install `uv` : https://docs.astral.sh/uv/getting-started/installation/ -2. Now just run this command to install dependencies i.e `uv sync` -3. That's it. Now just run the `main.py` file using `uv run main.py`. You can also optionally pass `--reload` as an argument to enable hot reloading. -4. If everything worked fine you should see screen like this. - -![backend](https://i.ibb.co/542Vhqw/backendrunning.png) - ---- - -### FrontEnd (./surfsense_web) - -For local frontend setup just fill out the `.env` file of frontend. - -|ENV VARIABLE|DESCRIPTION| -|--|--| -| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Give hosted backend url here. Eg. `http://localhost:8000`| - -1. 
Now install dependencies using `pnpm install` -2. Run it using `pnpm run dev` - -You should see your Next.js frontend running at `localhost:3000` - -#### Some FrontEnd Screens +## Screenshots **Search Spaces** @@ -180,43 +106,13 @@ You should see your Next.js frontend running at `localhost:3000` ![chat](https://github.com/user-attachments/assets/bb352d52-1c6d-4020-926b-722d0b98b491) ---- - -### Extension (./surfsense_browser_extension) - -Extension is in plasmo framework which is a cross browser extension framework. Extension main usecase is to save any webpages protected beyond authentication. - -For building extension just fill out the `.env` file of frontend. - -|ENV VARIABLE|DESCRIPTION| -|--|--| -| PLASMO_PUBLIC_BACKEND_URL| SurfSense Backend URL eg. "http://127.0.0.1:8000" | - -Build the extension for your favorite browser using this guide: https://docs.plasmo.com/framework/workflows/build#with-a-specific-target - -When you load and start the extension you should see a Apu page like this +**Browser Extension** ![ext1](https://github.com/user-attachments/assets/1f042b7a-6349-422b-94fb-d40d0df16c40) - - -After filling in your SurfSense API key you should be able to use extension now. - - ![ext2](https://github.com/user-attachments/assets/a9b9f1aa-2677-404d-b0a0-c1b2dddf24a7) - -|Options|Explanations| -|--|--| -| Search Space | Search Space to save your dynamic bookmarks. | -| Clear Inactive History Sessions | It clears the saved content for Inactive Tab Sessions. 
| -| Save Current Webpage Snapshot | Stores the current webpage session info into SurfSense history store| -| Save to SurfSense | Processes the SurfSense History Store & Initiates a Save Job | - - - - -## Tech Stack +## Tech Stack ### **BackEnd** diff --git a/surfsense_web/components/Navbar.tsx b/surfsense_web/components/Navbar.tsx index 52d187931..eb3555c05 100644 --- a/surfsense_web/components/Navbar.tsx +++ b/surfsense_web/components/Navbar.tsx @@ -24,8 +24,8 @@ interface NavbarProps { export const Navbar = () => { const navItems = [ { - name: "", - link: "/", + name: "Docs", + link: "/docs", }, // { // name: "Product", @@ -118,53 +118,52 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => { SurfSense - - {navItems.map((navItem, idx) => ( - setHoveredIndex(idx)} - className="relative" - > - + + {navItems.map((navItem, idx) => ( + setHoveredIndex(idx)} + className="relative" > - {navItem.name} - {hoveredIndex === idx && ( - - )} - - - ))} - -
+ + {navItem.name} + {hoveredIndex === idx && ( + + )} + + + ))} + {!visible && ( diff --git a/surfsense_web/content/docs/docker-installation.mdx b/surfsense_web/content/docs/docker-installation.mdx new file mode 100644 index 000000000..66bc36d70 --- /dev/null +++ b/surfsense_web/content/docs/docker-installation.mdx @@ -0,0 +1,158 @@ +--- +title: Docker Installation +description: Setting up SurfSense using Docker (Recommended) +full: true +--- + +# Docker Installation (Recommended) + +This guide explains how to run SurfSense using Docker Compose, which is the preferred and recommended method for deployment. + +## Prerequisites + +Before you begin, ensure you have: + +- [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/) installed on your machine +- [Git](https://git-scm.com/downloads) (to clone the repository) +- Completed all the [prerequisite setup steps](/docs) including: + - PGVector setup + - Google OAuth configuration + - Unstructured.io API key + - Other required API keys + +## Installation Steps + +1. 
**Configure Environment Variables** + + Set up the necessary environment variables: + + **Linux/macOS:** + ```bash + # Copy example environment files + cp surfsense_backend/.env.example surfsense_backend/.env + cp surfsense_web/.env.example surfsense_web/.env + ``` + + **Windows (Command Prompt):** + ```cmd + copy surfsense_backend\.env.example surfsense_backend\.env + copy surfsense_web\.env.example surfsense_web\.env + ``` + + **Windows (PowerShell):** + ```powershell + Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env + Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env + ``` + + Edit both `.env` files and fill in the required values: + + **Backend Environment Variables:** + + | ENV VARIABLE | DESCRIPTION | + |--------------|-------------| + | DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) | + | SECRET_KEY | JWT Secret key for authentication (should be a secure random string) | + | GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID obtained from Google Cloud Console | + | GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret obtained from Google Cloud Console | + | NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) | + | EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) | + | RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) | + | RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) | + | FAST_LLM | LiteLLM routed smaller, faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) | + | STRATEGIC_LLM | LiteLLM routed advanced LLM for complex tasks (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) | + | LONG_CONTEXT_LLM | LiteLLM routed LLM for longer context windows (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) | + | UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document 
parsing |
+   | FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling |
+
+   Include API keys for the LLM providers you're using. For example:
+   - `OPENAI_API_KEY`: If using OpenAI models
+   - `GEMINI_API_KEY`: If using Google Gemini models
+
+   For other LLM providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
+
+   **Frontend Environment Variables:**
+
+   | ENV VARIABLE | DESCRIPTION |
+   |--------------|-------------|
+   | NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend service (e.g., `http://localhost:8000`) |
+
+2. **Build and Start Containers**
+
+   Start the Docker containers:
+
+   **Linux/macOS/Windows:**
+   ```bash
+   docker-compose up --build
+   ```
+
+   To run in detached mode (in the background):
+
+   **Linux/macOS/Windows:**
+   ```bash
+   docker-compose up -d
+   ```
+
+   **Note for Windows users:** If you're using a newer Docker Desktop version that ships Compose V2, you may need to use `docker compose` (with a space) instead of the legacy standalone `docker-compose`.
+
+3. **Access the Applications**
+
+   Once the containers are running, you can access:
+   - Frontend: [http://localhost:3000](http://localhost:3000)
+   - Backend API: [http://localhost:8000](http://localhost:8000)
+   - API Documentation: [http://localhost:8000/docs](http://localhost:8000/docs)
+
+## Useful Docker Commands
+
+### Container Management
+
+- **Stop containers:**
+
+  **Linux/macOS/Windows:**
+  ```bash
+  docker-compose down
+  ```
+
+- **View logs:**
+
+  **Linux/macOS/Windows:**
+  ```bash
+  # All services
+  docker-compose logs -f
+
+  # Specific service
+  docker-compose logs -f backend
+  docker-compose logs -f frontend
+  docker-compose logs -f db
+  ```
+
+- **Restart a specific service:**
+
+  **Linux/macOS/Windows:**
+  ```bash
+  docker-compose restart backend
+  ```
+
+- **Execute commands in a running container:**
+
+  **Linux/macOS/Windows:**
+  ```bash
+  # Backend
+  docker-compose exec backend python -m pytest
+
+  # Frontend
+  docker-compose exec frontend pnpm lint
+  ```
+
+## Troubleshooting
+
+- **Linux/macOS:** If you encounter permission errors, you may need to run the docker commands with `sudo`.
+- **Windows:** If you see access denied errors, make sure you're running Command Prompt or PowerShell as Administrator.
+- If ports are already in use, modify the port mappings in the `docker-compose.yml` file.
+- For backend dependency issues, check the `Dockerfile` in the backend directory.
+- For frontend dependency issues, check the `Dockerfile` in the frontend directory.
+- **Windows-specific:** If you encounter line ending issues (CRLF vs LF), configure Git to keep LF line endings with `git config --global core.autocrlf input` before cloning the repository, so that scripts checked out on Windows are not broken by CRLF endings when copied into the Linux containers.
+
+## Next Steps
+
+Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and log in using your Google account.
\ No newline at end of file
diff --git a/surfsense_web/content/docs/installation.mdx b/surfsense_web/content/docs/installation.mdx
new file mode 100644
index 000000000..684d6b7a0
--- /dev/null
+++ b/surfsense_web/content/docs/installation.mdx
@@ -0,0 +1,21 @@
+---
+title: Installation
+description: Current ways to use SurfSense
+full: true
+---
+
+# Installing SurfSense
+
+There are two ways to install and use SurfSense:
+
+## Docker Installation (Preferred)
+
+The recommended way to install SurfSense is using Docker. This method provides a containerized environment with all dependencies pre-configured.
+
+[Learn more about Docker installation](/docs/docker-installation)
+
+## Manual Installation
+
+For users who prefer more control over the installation process or need to customize their setup, we also provide manual installation instructions.
+ +[Learn more about Manual installation](/docs/manual-installation) \ No newline at end of file diff --git a/surfsense_web/content/docs/manual-installation.mdx b/surfsense_web/content/docs/manual-installation.mdx new file mode 100644 index 000000000..561e8d7b8 --- /dev/null +++ b/surfsense_web/content/docs/manual-installation.mdx @@ -0,0 +1,258 @@ +--- +title: Manual Installation +description: Setting up SurfSense manually for customized deployments +full: true +--- + +# Manual Installation + +This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment. + +## Prerequisites + +Before beginning the manual installation, ensure you have completed all the [prerequisite setup steps](/docs), including: + +- PGVector installation +- Google OAuth setup +- Unstructured.io API key +- LLM observability (optional) +- Crawler setup (if needed) + +## Backend Setup + +The backend is the core of SurfSense. Follow these steps to set it up: + +### 1. 
Environment Configuration + +First, create and configure your environment variables by copying the example file: + +**Linux/macOS:** +```bash +cd surfsense_backend +cp .env.example .env +``` + +**Windows (Command Prompt):** +```cmd +cd surfsense_backend +copy .env.example .env +``` + +**Windows (PowerShell):** +```powershell +cd surfsense_backend +Copy-Item -Path .env.example -Destination .env +``` + +Edit the `.env` file and set the following variables: + +| ENV VARIABLE | DESCRIPTION | +|--------------|-------------| +| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) | +| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) | +| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID | +| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret | +| NEXT_FRONTEND_URL | Frontend application URL (e.g., `http://localhost:3000`) | +| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) | +| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) | +| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) | +| FAST_LLM | LiteLLM routed faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) | +| STRATEGIC_LLM | LiteLLM routed advanced LLM (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) | +| LONG_CONTEXT_LLM | LiteLLM routed long-context LLM (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) | +| UNSTRUCTURED_API_KEY | API key for Unstructured.io service | +| FIRECRAWL_API_KEY | API key for Firecrawl service (if using crawler) | + +**Important**: Since LLM calls are routed through LiteLLM, include API keys for the LLM providers you're using: +- For OpenAI models: `OPENAI_API_KEY` +- For Google Gemini models: `GEMINI_API_KEY` +- For other providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers) + +### 2. 
Install Dependencies
+
+Install the backend dependencies using `uv`:
+
+**Linux/macOS:**
+```bash
+# Install uv if you don't have it
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Install dependencies
+uv sync
+```
+
+**Windows (PowerShell):**
+```powershell
+# Install uv if you don't have it
+iwr -useb https://astral.sh/uv/install.ps1 | iex
+
+# Install dependencies
+uv sync
+```
+
+**Windows (Command Prompt):**
+```cmd
+REM Install dependencies with uv (after installing uv)
+uv sync
+```
+
+### 3. Run the Backend
+
+Start the backend server:
+
+**Linux/macOS/Windows:**
+```bash
+# Run without hot reloading
+uv run main.py
+
+# Or with hot reloading for development
+uv run main.py --reload
+```
+
+If everything is set up correctly, you should see output indicating the server is running on `http://localhost:8000`.
+
+## Frontend Setup
+
+### 1. Environment Configuration
+
+Set up the frontend environment:
+
+**Linux/macOS:**
+```bash
+cd surfsense_web
+cp .env.example .env
+```
+
+**Windows (Command Prompt):**
+```cmd
+cd surfsense_web
+copy .env.example .env
+```
+
+**Windows (PowerShell):**
+```powershell
+cd surfsense_web
+Copy-Item -Path .env.example -Destination .env
+```
+
+Edit the `.env` file and set:
+
+| ENV VARIABLE | DESCRIPTION |
+|--------------|-------------|
+| NEXT_PUBLIC_FASTAPI_BACKEND_URL | Backend URL (e.g., `http://localhost:8000`) |
+
+### 2. Install Dependencies
+
+Install the frontend dependencies:
+
+**Linux/macOS:**
+```bash
+# Install pnpm if you don't have it
+npm install -g pnpm
+
+# Install dependencies
+pnpm install
+```
+
+**Windows:**
+```powershell
+# Install pnpm if you don't have it
+npm install -g pnpm
+
+# Install dependencies
+pnpm install
+```
+
+### 3. Run the Frontend
+
+Start the Next.js development server:
+
+**Linux/macOS/Windows:**
+```bash
+pnpm run dev
+```
+
+The frontend should now be running at `http://localhost:3000`.
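One loose end from the backend configuration above: `SECRET_KEY` must be a secure random string, and the docs don't say how to produce one. A minimal sketch using only Python's standard library (which the backend setup already requires); any other high-entropy generator works just as well:

```python
# Generate a high-entropy secret suitable for the SECRET_KEY variable in
# surfsense_backend/.env. token_urlsafe(32) reads 32 random bytes from the
# OS CSPRNG and encodes them as a URL-safe base64 string (~43 characters).
import secrets

print(secrets.token_urlsafe(32))
```

Paste the printed value into `SECRET_KEY`; `openssl rand -base64 32` is an equivalent shell one-liner if you prefer.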
+
+## Browser Extension Setup (Optional)
+
+The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.
+
+### 1. Environment Configuration
+
+**Linux/macOS:**
+```bash
+cd surfsense_browser_extension
+cp .env.example .env
+```
+
+**Windows (Command Prompt):**
+```cmd
+cd surfsense_browser_extension
+copy .env.example .env
+```
+
+**Windows (PowerShell):**
+```powershell
+cd surfsense_browser_extension
+Copy-Item -Path .env.example -Destination .env
+```
+
+Edit the `.env` file:
+
+| ENV VARIABLE | DESCRIPTION |
+|--------------|-------------|
+| PLASMO_PUBLIC_BACKEND_URL | SurfSense Backend URL (e.g., `http://127.0.0.1:8000`) |
+
+### 2. Build the Extension
+
+Build the extension for your browser using the [Plasmo framework](https://docs.plasmo.com/framework/workflows/build#with-a-specific-target).
+
+**Linux/macOS/Windows:**
+```bash
+# Install dependencies
+pnpm install
+
+# Build for Chrome (default)
+pnpm build
+
+# Or for other browsers (Plasmo targets follow the browser-mv<version> naming)
+pnpm build --target=firefox-mv2
+pnpm build --target=edge-mv3
+```
+
+### 3. Load the Extension
+
+Load the extension in your browser's developer mode and configure it with your SurfSense API key.
+
+## Verification
+
+To verify your installation:
+
+1. Open your browser and navigate to `http://localhost:3000`
+2. Sign in with your Google account
+3. Create a search space and try uploading a document
+4.
Test the chat functionality with your uploaded content + +## Troubleshooting + +- **Database Connection Issues**: Verify your PostgreSQL server is running and pgvector is properly installed +- **Authentication Problems**: Check your Google OAuth configuration and ensure redirect URIs are set correctly +- **LLM Errors**: Confirm your LLM API keys are valid and the selected models are accessible +- **File Upload Failures**: Validate your Unstructured.io API key +- **Windows-specific**: If you encounter path issues, ensure you're using the correct path separator (`\` instead of `/`) +- **macOS-specific**: If you encounter permission issues, you may need to use `sudo` for some installation commands + +## Next Steps + +Now that you have SurfSense running locally, you can explore its features: + +- Create search spaces for organizing your content +- Upload documents or use the browser extension to save webpages +- Ask questions about your saved content +- Explore the advanced RAG capabilities + +For production deployments, consider setting up: +- A reverse proxy like Nginx +- SSL certificates for secure connections +- Proper database backups +- User access controls \ No newline at end of file diff --git a/surfsense_web/content/docs/meta.json b/surfsense_web/content/docs/meta.json index e5bb280a4..27ce8b62f 100644 --- a/surfsense_web/content/docs/meta.json +++ b/surfsense_web/content/docs/meta.json @@ -4,6 +4,9 @@ "root": true, "pages": [ "---Setup---", - "index" + "index", + "installation", + "docker-installation", + "manual-installation" ] } \ No newline at end of file
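For reference, the `meta.json` hunk above leaves the docs navigation looking roughly like this (reconstructed from the diff context; any fields outside the hunk are omitted, so treat this as a sketch rather than the full file):

```json
{
  "root": true,
  "pages": [
    "---Setup---",
    "index",
    "installation",
    "docker-installation",
    "manual-installation"
  ]
}
```

The three new page slugs must match the `.mdx` filenames added in this diff for the sidebar entries to resolve.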