# Using RAG in Rowboat

Rowboat provides multiple ways to enhance your agents' context with Retrieval-Augmented Generation (RAG). This guide will help you set up and use each RAG feature.

## Quick Start

Text RAG and local file uploads are enabled by default - no configuration needed! Just start using them right away.

## RAG Features

### 1. Text RAG
✅ Enabled by default:

- Process and reason over text content directly
- No configuration required

### 2. Local File Uploads
✅ Enabled by default:

- Upload PDF files directly from your device
- Files are stored locally
- No configuration required
- Files are parsed using OpenAI by default
- For larger files, we recommend using Gemini models - see section below.

#### 2.1 Using Gemini for File Parsing
To use Google's Gemini model for parsing uploaded PDFs, set the following variables:

```bash
# Enable Gemini for file parsing
export USE_GEMINI_FILE_PARSING=true
export GOOGLE_API_KEY=your_google_api_key
```

### 3. URL Scraping
Rowboat uses Firecrawl for URL scraping. To enable URL scraping, set the following variables:

```bash
export USE_RAG_SCRAPING=true
export FIRECRAWL_API_KEY=your_firecrawl_api_key
```

## Advanced RAG Features

### 1. File Uploads Backed by S3
To enable S3 file uploads, set the following variables:

```bash
# Enable S3 uploads
export USE_RAG_S3_UPLOADS=true

# S3 Configuration
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export RAG_UPLOADS_S3_BUCKET=your_bucket_name
export RAG_UPLOADS_S3_REGION=your_region
```
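Because S3 uploads depend on several variables at once, it can help to confirm they are all set before starting Rowboat. The following is a minimal sketch of such a check, not something Rowboat ships: the helper name `missing_vars` is hypothetical, and only the variable names come from the configuration above.

```bash
# Hypothetical preflight helper; not part of Rowboat.
# Prints the name of every listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Only check the S3 settings when S3 uploads are switched on.
if [ "${USE_RAG_S3_UPLOADS:-false}" = "true" ]; then
  missing=$(missing_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
    RAG_UPLOADS_S3_BUCKET RAG_UPLOADS_S3_REGION)
  if [ -n "$missing" ]; then
    echo "S3 uploads are enabled but these variables are not set:" >&2
    echo "$missing" >&2
  fi
fi
```

The same helper can be pointed at any other group of variables in this guide, for example `missing_vars FIRECRAWL_API_KEY` when URL scraping is enabled.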

### 2. Changing Default Parsing Model

By default, uploaded PDF files are parsed using `gpt-4o`. You can customize this by setting the following:

```bash
# Override the default parsing model
export FILE_PARSING_MODEL=your-preferred-model
```

You can also change the model provider like so:
```bash
# Optional: Override the parsing provider settings
export FILE_PARSING_PROVIDER_BASE_URL=your-provider-base-url
export FILE_PARSING_PROVIDER_API_KEY=your-provider-api-key
```

### 3. Embedding Model Options

By default, Rowboat uses OpenAI's `text-embedding-3-small` model for generating embeddings. You can customize this by setting the following:

```bash
# Override the default embedding model
export EMBEDDING_MODEL=your-preferred-model
export EMBEDDING_VECTOR_SIZE=1536
```

**Important note**

The default size of the vector index is 1536. If you change this value, you must delete the index and set it up again:
```bash
docker-compose --profile delete_qdrant --profile qdrant up --build delete_qdrant qdrant
```
followed by:
```bash
./start # this will recreate the index
```
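Putting these steps together, a switch to a model with a different vector size might look like the sketch below. It assumes your provider serves OpenAI's `text-embedding-3-large`, which returns 3072-dimensional vectors by default; substitute the model and size that apply to your setup.

```bash
# Assumption: text-embedding-3-large is available from your provider.
# Its default output size is 3072, so the index size must change to match.
export EMBEDDING_MODEL=text-embedding-3-large
export EMBEDDING_VECTOR_SIZE=3072

# Delete the existing index, which was built for the old vector size
docker-compose --profile delete_qdrant --profile qdrant up --build delete_qdrant qdrant

# Restart Rowboat; this recreates the index with the new size
./start
```

Deleting the index removes the stored vectors, so previously indexed content will likely need to be added again after the restart.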

You can also change the model provider like so:
```bash
# Optional: Override the embedding provider settings
export EMBEDDING_PROVIDER_BASE_URL=your-provider-base-url
export EMBEDDING_PROVIDER_API_KEY=your-provider-api-key
```

If you don't specify the provider settings, Rowboat will use OpenAI as the default provider.