Mirror of https://github.com/MODSetter/SurfSense.git (synced 2026-04-27 17:56:25 +02:00)

Merge branch 'MODSetter:main' into main

Commit 2ae8d227bf: 106 changed files with 8506 additions and 2268 deletions
.env.example (new file, 17 lines)
@@ -0,0 +1,17 @@
```
# Frontend Configuration
FRONTEND_PORT=3000
NEXT_PUBLIC_API_URL=http://backend:8000

# Backend Configuration
BACKEND_PORT=8000

# Database Configuration
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=surfsense
POSTGRES_PORT=5432

# pgAdmin Configuration
PGADMIN_PORT=5050
PGADMIN_DEFAULT_EMAIL=admin@surfsense.com
PGADMIN_DEFAULT_PASSWORD=surfsense
```
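The compose file presumably consumes these variables through Compose's environment-variable interpolation with fallback defaults. A minimal sketch of that mapping (the service layout below is an illustrative assumption, not taken from the repository's actual compose file):

```yaml
# Illustrative fragment only: shows how a compose file can pick up the
# variables above with ${VAR:-default} fallbacks. The real docker-compose.yml
# may structure its services differently.
services:
  db:
    ports:
      - "${POSTGRES_PORT:-5432}:5432"
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-postgres}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-postgres}
      POSTGRES_DB: ${POSTGRES_DB:-surfsense}
```

With this pattern, an empty or missing `.env` still produces a working stack, and any variable can be overridden per-shell or per-file.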
.github/PULL_REQUEST_TEMPLATE.md (vendored, new file, 45 lines)
@@ -0,0 +1,45 @@
```markdown
<!--- Provide a general summary of your changes in the Title above -->

## Description
<!--- Describe your changes in detail -->

## Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If this PR relates to an open issue, please link to the issue here: FIX #123 -->
FIX #

## Changes Overview
<!-- List the primary changes/improvements made in this PR -->
-

## Screenshots
<!-- If applicable, add screenshots or images to demonstrate the changes visually -->

## API Changes
<!-- Document any API changes if applicable -->
- [ ] This PR includes API changes

## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Performance improvement (non-breaking change which enhances performance)
- [ ] Documentation update
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Testing
<!-- Describe the tests that have been run to verify your changes -->
- [ ] I have tested these changes locally
- [ ] I have added/updated unit tests
- [ ] I have added/updated integration tests

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] My code follows the code style of this project
- [ ] My change requires documentation updates
- [ ] I have updated the documentation accordingly
- [ ] My change requires dependency updates
- [ ] I have updated the dependencies accordingly
- [ ] My code builds clean without any errors or warnings
- [ ] All new and existing tests passed
```
.github/workflows/docker-publish.yml (vendored, new file, 76 lines)
@@ -0,0 +1,76 @@
```yaml
name: Docker Publish

on:
  push:
    branches: [ "main" ]

jobs:
  build_and_push_backend:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push backend image
        uses: docker/build-push-action@v5
        with:
          context: ./surfsense_backend
          file: ./surfsense_backend/Dockerfile
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/surfsense_backend:${{ github.sha }}
          platforms: linux/amd64,linux/arm64
          labels: |
            org.opencontainers.image.source=${{ github.repositoryUrl }}
            org.opencontainers.image.created=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.created'] }}
            org.opencontainers.image.revision=${{ github.sha }}

  build_and_push_frontend:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push frontend image
        uses: docker/build-push-action@v5
        with:
          context: ./surfsense_web
          file: ./surfsense_web/Dockerfile
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/surfsense_web:${{ github.sha }}
          platforms: linux/amd64,linux/arm64
          labels: |
            org.opencontainers.image.source=${{ github.repositoryUrl }}
            org.opencontainers.image.created=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.created'] }}
            org.opencontainers.image.revision=${{ github.sha }}
```
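One thing worth noting about this workflow: the `labels` blocks reference `steps.meta.outputs.json`, but no step with `id: meta` is defined in the file as committed, so the `org.opencontainers.image.created` expression would fail to resolve at run time. A sketch of the kind of `docker/metadata-action` step that expression assumes (this step is not part of the committed file):

```yaml
# Hypothetical step that would provide steps.meta.outputs.json; the committed
# workflow omits it, so the created-date label expression cannot resolve.
- name: Extract image metadata
  id: meta
  uses: docker/metadata-action@v5
  with:
    images: ghcr.io/${{ github.repository_owner }}/surfsense_backend
```

It would need to be inserted before the build-and-push step of each job (with the matching image name).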
.gitignore (vendored, 3 lines changed)
@@ -1 +1,2 @@
```
.flashrank_cache*
podcasts/
```
DEPLOYMENT_GUIDE.md (new file, 124 lines)
@@ -0,0 +1,124 @@

# SurfSense Deployment Guide

This guide explains the different deployment options available for SurfSense using Docker Compose.

## Deployment Options

SurfSense uses a flexible Docker Compose configuration that lets you switch between deployment modes without manually editing files. The approach relies on Docker's built-in override functionality with two configuration files:

1. **docker-compose.yml**: Contains the essential core services (database and pgAdmin)
2. **docker-compose.override.yml**: Contains the application services (frontend and backend)

This structure provides several advantages:

- No need to comment/uncomment services manually
- Clear separation between core infrastructure and application services
- Easy switching between development and production environments

## Deployment Modes

### Full Stack Mode (Development)

This mode runs everything: frontend, backend, database, and pgAdmin. It is ideal for development environments where you need the complete application stack.

```bash
# Both files are used automatically (docker-compose.yml + docker-compose.override.yml)
docker compose up -d
```

### Core Services Mode (Production)

This mode runs only the database and pgAdmin services. It suits production environments where you deploy the frontend and backend separately, or where you need to run database migrations.

```bash
# Explicitly use only the main file
docker compose -f docker-compose.yml up -d
```

## Custom Deployment Options

### Running Specific Services

You can start specific services by naming them:

```bash
# Start only the database
docker compose up -d db

# Start the database and pgAdmin
docker compose up -d db pgadmin

# Start only the backend (requires db to be running)
docker compose up -d backend
```

### Using Custom Override Files

You can create and use custom override files for different environments:

```bash
# Use a staging configuration
docker compose -f docker-compose.yml -f docker-compose.staging.yml up -d
```
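The repository does not ship a `docker-compose.staging.yml`; as a hedged sketch, such an override file typically redefines only the settings that differ from the base file, for example:

```yaml
# docker-compose.staging.yml (hypothetical example, not part of the repo):
# override only what differs in staging; everything else merges from
# docker-compose.yml.
services:
  db:
    restart: unless-stopped
  pgadmin:
    restart: unless-stopped
```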
## Environment Variables

The deployment can be customized using environment variables:

```bash
# Change the default ports
FRONTEND_PORT=4000 BACKEND_PORT=9000 docker compose up -d

# Or create/modify a .env file with your desired values, then simply run
docker compose up -d
```
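The inline form works because the shell passes those variables only to that one command; when they are unset, Compose presumably falls back to defaults via its `${VAR:-default}` interpolation. The same expansion rule can be checked directly in any POSIX shell:

```shell
# ${VAR:-default} yields the default when VAR is unset or empty,
# which is how docker compose substitutes missing variables.
unset FRONTEND_PORT
echo "port=${FRONTEND_PORT:-3000}"   # prints port=3000

FRONTEND_PORT=4000
echo "port=${FRONTEND_PORT:-3000}"   # prints port=4000
```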
## Common Deployment Workflows

### Initial Setup

```bash
# Clone the repository
git clone https://github.com/MODSetter/SurfSense.git
cd SurfSense

# Copy the example env files
cp .env.example .env
cp surfsense_backend/.env.example surfsense_backend/.env
cp surfsense_web/.env.example surfsense_web/.env

# Edit the .env files with your configuration

# Start the full stack for development
docker compose up -d
```

### Database-Only Mode (for migrations or maintenance)

```bash
# Start just the database
docker compose -f docker-compose.yml up -d db

# Run migrations or maintenance tasks
docker compose exec db psql -U postgres -d surfsense
```

### Scaling in Production

For production deployments, you might want to:

1. Run the core services with Docker Compose
2. Deploy the frontend/backend with specialized services such as Vercel, Netlify, or dedicated application servers

This separation allows for better scaling and resource utilization in production environments.

## Troubleshooting

If you encounter issues with the deployment:

- Check container logs: `docker compose logs -f [service_name]`
- Ensure all required environment variables are set
- Verify network connectivity between containers
- Check that required ports are available and not blocked by firewalls

For more detailed setup instructions, refer to [DOCKER_SETUP.md](DOCKER_SETUP.md).
DOCKER_SETUP.md (155 lines changed)
@@ -7,73 +7,186 @@ This document explains how to run the SurfSense project using Docker Compose.

- Docker and Docker Compose installed on your machine
- Git (to clone the repository)

## Environment Variables Configuration

The SurfSense Docker setup supports configuration through environment variables. You can set them in two ways:

1. Create a `.env` file in the project root directory (copy from `.env.example`)
2. Set environment variables directly in your shell before running Docker Compose

The following environment variables are available:

```
# Frontend Configuration
FRONTEND_PORT=3000
NEXT_PUBLIC_API_URL=http://backend:8000

# Backend Configuration
BACKEND_PORT=8000

# Database Configuration
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=surfsense
POSTGRES_PORT=5432

# pgAdmin Configuration
PGADMIN_PORT=5050
PGADMIN_DEFAULT_EMAIL=admin@surfsense.com
PGADMIN_DEFAULT_PASSWORD=surfsense
```
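If you want the same values in an ad-hoc shell script outside Docker, a plain `KEY=VALUE` file like the listing above can be loaded with `set -a` plus sourcing. A sketch (this simple approach assumes no quoting or interpolation in the file; the demo file path is illustrative):

```shell
# Write a minimal .env-style file, then load it: with `set -a`, every
# variable assigned while sourcing is exported to child processes.
cat > /tmp/surfsense_demo.env <<'EOF'
POSTGRES_USER=postgres
POSTGRES_PORT=5432
EOF

set -a
. /tmp/surfsense_demo.env
set +a

echo "db port: $POSTGRES_PORT"   # prints db port: 5432
```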
## Deployment Options

SurfSense uses a flexible Docker Compose setup that lets you choose between deployment modes:

### Option 1: Full-Stack Deployment (Development Mode)

Includes the frontend, backend, database, and pgAdmin. This is the default when running `docker compose up`.

### Option 2: Core Services Only (Production Mode)

Includes only the database and pgAdmin, suitable for production environments where you might deploy the frontend/backend separately.

The setup uses two files:

- `docker-compose.yml`: Contains core services (database and pgAdmin)
- `docker-compose.override.yml`: Contains application services (frontend and backend)

## Setup

1. Make sure you have all the necessary environment variables set up:
   - Copy `surfsense_backend/.env.example` to `surfsense_backend/.env` and fill in the required values
   - Copy `surfsense_web/.env.example` to `surfsense_web/.env` and fill in the required values
   - Optionally, copy `.env.example` to `.env` in the project root to customize Docker settings

2. Deploy based on your needs:

   **Full Stack (Development Mode)**:

   ```bash
   # Both files are used automatically
   docker compose up --build
   ```

   **Core Services Only (Production Mode)**:

   ```bash
   # Explicitly use only the main file
   docker compose -f docker-compose.yml up --build
   ```

3. To run in detached mode (in the background):

   ```bash
   # Full stack
   docker compose up -d

   # Core services only
   docker compose -f docker-compose.yml up -d
   ```

4. Access the applications:
   - Frontend: http://localhost:3000 (when using the full stack)
   - Backend API: http://localhost:8000 (when using the full stack)
   - API Documentation: http://localhost:8000/docs (when using the full stack)
   - pgAdmin: http://localhost:5050

## Customizing the Deployment

If you need to make temporary changes to either the full-stack or core-services deployment, you can:

1. **Temporarily disable the override file**:

   ```bash
   docker compose -f docker-compose.yml up -d
   ```

2. **Use a custom override file**:

   ```bash
   docker compose -f docker-compose.yml -f custom-override.yml up -d
   ```

3. **Temporarily change which services start**:

   ```bash
   docker compose up -d db pgadmin
   ```

## Useful Commands

- Stop the containers:

  ```bash
  docker compose down
  ```

- View logs:

  ```bash
  # All services
  docker compose logs -f

  # Specific service
  docker compose logs -f backend
  docker compose logs -f frontend
  docker compose logs -f db
  docker compose logs -f pgadmin
  ```

- Restart a specific service:

  ```bash
  docker compose restart backend
  ```

- Execute commands in a running container:

  ```bash
  # Backend
  docker compose exec backend python -m pytest

  # Frontend
  docker compose exec frontend pnpm lint
  ```

## Database

The PostgreSQL database with the pgvector extension is available at:

- Host: localhost
- Port: 5432 (configurable via POSTGRES_PORT)
- Username: postgres (configurable via POSTGRES_USER)
- Password: postgres (configurable via POSTGRES_PASSWORD)
- Database: surfsense (configurable via POSTGRES_DB)

You can connect to it using any PostgreSQL client or the included pgAdmin.
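For programmatic access from your own tooling, these settings combine into a standard PostgreSQL connection URL. A small sketch (the defaults mirror the values above; the resulting libpq-style DSN is accepted by common clients and drivers such as psql, psycopg, or SQLAlchemy):

```shell
# Build a PostgreSQL DSN from the documented variables, falling back to the
# defaults listed above when a variable is not set.
database_url() {
  host="${1:-localhost}"
  echo "postgresql://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-postgres}@${host}:${POSTGRES_PORT:-5432}/${POSTGRES_DB:-surfsense}"
}

unset POSTGRES_USER POSTGRES_PASSWORD POSTGRES_PORT POSTGRES_DB
database_url       # prints postgresql://postgres:postgres@localhost:5432/surfsense
database_url db    # from another container, use the service name as host
```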
## pgAdmin

pgAdmin is a web-based administration tool for PostgreSQL. It is included in the Docker setup for easier database management.

- URL: http://localhost:5050 (configurable via PGADMIN_PORT)
- Default Email: admin@surfsense.com (configurable via PGADMIN_DEFAULT_EMAIL)
- Default Password: surfsense (configurable via PGADMIN_DEFAULT_PASSWORD)

### Connecting to the Database in pgAdmin

1. Log in to pgAdmin using the credentials above
2. Right-click on "Servers" in the left sidebar and select "Create" > "Server"
3. In the "General" tab, give your connection a name (e.g., "SurfSense DB")
4. In the "Connection" tab, enter the following:
   - Host: db
   - Port: 5432
   - Maintenance database: surfsense
   - Username: postgres
   - Password: postgres
5. Click "Save" to establish the connection

## Troubleshooting

- If you encounter permission errors, you may need to run the docker commands with `sudo`.
- If ports are already in use, modify the port mappings in the `.env` file or directly in `docker-compose.yml`.
- For backend dependency issues, you may need to modify the `Dockerfile` in the backend directory.
- If you encounter frontend dependency errors, adjust the frontend's `Dockerfile` accordingly.
- If pgAdmin doesn't connect to the database, make sure you use `db` as the hostname rather than `localhost`; `db` is the service's name on the Docker network.
- If you need only specific services, name them explicitly: `docker compose up db pgadmin`

## Understanding the Docker Compose File Structure

The project uses Docker's default override mechanism:

1. **docker-compose.yml**: Contains essential services (database and pgAdmin)
2. **docker-compose.override.yml**: Contains development services (frontend and backend)

When you run `docker compose up` without additional flags, Docker automatically merges both files. When you run `docker compose -f docker-compose.yml up`, only the specified file is used.

This approach lets you maintain a cleaner setup without manually commenting/uncommenting services in your configuration files.
README.md (125 lines changed)
@@ -1,39 +1,53 @@



# SurfSense

While tools like NotebookLM and Perplexity are impressive and highly effective for conducting research on any topic or query, SurfSense elevates this capability by integrating with your personal knowledge base. It is a highly customizable AI research agent, connected to external sources such as search engines (Tavily, LinkUp), Slack, Linear, Notion, YouTube, GitHub, and more to come.

<div align="center">
<a href="https://trendshift.io/repositories/13606" target="_blank"><img src="https://trendshift.io/api/badge/repositories/13606" alt="MODSetter%2FSurfSense | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>

# Video

https://github.com/user-attachments/assets/48142909-6391-4084-b7e8-81da388bb1fc

# Podcasts

https://github.com/user-attachments/assets/d516982f-de00-4c41-9e4c-632a7d942f41

## Podcast Sample

https://github.com/user-attachments/assets/bf64a6ca-934b-47ac-9e1b-edac5fe972ec

## Key Features

### 💡 **Idea**
Have your own highly customizable private NotebookLM and Perplexity integrated with external sources.

### 📁 **Multiple File Format Uploading Support**
Save content from your own personal files *(documents, images, videos; supports **50+ file extensions**)* to your personal knowledge base.

### 🔍 **Powerful Search**
Quickly research or find anything in your saved content.

### 💬 **Chat with your Saved Content**
Interact in natural language and get cited answers.

### 📄 **Cited Answers**
Get cited answers just like Perplexity.

### 🔔 **Privacy & Local LLM Support**
Works flawlessly with Ollama local LLMs.

### 🏠 **Self-Hostable**
Open source and easy to deploy locally.

### 🎙️ Podcasts
- Blazingly fast podcast generation agent (creates a 3-minute podcast in under 20 seconds)
- Convert your chat conversations into engaging audio content
- Support for multiple TTS providers (OpenAI, Azure, Google Vertex AI)

### 📊 **Advanced RAG Techniques**
- Supports 150+ LLMs
- Supports 6000+ embedding models
- Supports all major rerankers (Pinecone, Cohere, Flashrank, etc.)

@@ -41,8 +55,8 @@ Open source and easy to deploy locally.
- Utilizes Hybrid Search (semantic + full-text search combined with Reciprocal Rank Fusion).
- RAG as a Service API Backend.

### ℹ️ **External Sources**
- Search engines (Tavily, LinkUp)
- Slack
- Linear
- Notion

@@ -50,17 +64,41 @@ Open source and easy to deploy locally.
- GitHub
- and more to come...

## 📄 **Supported File Extensions**

> **Note**: File format support depends on your ETL service configuration. LlamaCloud supports 50+ formats, while Unstructured supports 34+ core formats.

### Documents & Text
**LlamaCloud**: `.pdf`, `.doc`, `.docx`, `.docm`, `.dot`, `.dotm`, `.rtf`, `.txt`, `.xml`, `.epub`, `.odt`, `.wpd`, `.pages`, `.key`, `.numbers`, `.602`, `.abw`, `.cgm`, `.cwk`, `.hwp`, `.lwp`, `.mw`, `.mcw`, `.pbd`, `.sda`, `.sdd`, `.sdp`, `.sdw`, `.sgl`, `.sti`, `.sxi`, `.sxw`, `.stw`, `.sxg`, `.uof`, `.uop`, `.uot`, `.vor`, `.wps`, `.zabw`

**Unstructured**: `.doc`, `.docx`, `.odt`, `.rtf`, `.pdf`, `.xml`, `.txt`, `.md`, `.markdown`, `.rst`, `.html`, `.org`, `.epub`

### Presentations
**LlamaCloud**: `.ppt`, `.pptx`, `.pptm`, `.pot`, `.potm`, `.potx`, `.odp`, `.key`

**Unstructured**: `.ppt`, `.pptx`

### Spreadsheets & Data
**LlamaCloud**: `.xlsx`, `.xls`, `.xlsm`, `.xlsb`, `.xlw`, `.csv`, `.tsv`, `.ods`, `.fods`, `.numbers`, `.dbf`, `.123`, `.dif`, `.sylk`, `.slk`, `.prn`, `.et`, `.uos1`, `.uos2`, `.wk1`, `.wk2`, `.wk3`, `.wk4`, `.wks`, `.wq1`, `.wq2`, `.wb1`, `.wb2`, `.wb3`, `.qpw`, `.xlr`, `.eth`

**Unstructured**: `.xls`, `.xlsx`, `.csv`, `.tsv`

### Images
**LlamaCloud**: `.jpg`, `.jpeg`, `.png`, `.gif`, `.bmp`, `.svg`, `.tiff`, `.webp`, `.html`, `.htm`, `.web`

**Unstructured**: `.jpg`, `.jpeg`, `.png`, `.bmp`, `.tiff`, `.heic`

### Audio & Video *(Always Supported)*
`.mp3`, `.mpga`, `.m4a`, `.wav`, `.mp4`, `.mpeg`, `.webm`

### Email & Communication
**Unstructured**: `.eml`, `.msg`, `.p7s`

### 🔖 Cross-Browser Extension
- The SurfSense extension can be used to save any webpage you like.
- Its main use case is saving webpages protected behind authentication.
|
|
||||||
|
|
||||||
### 2. Temporarily Deprecated
|
|
||||||
|
|
||||||
#### Podcasts
|
|
||||||
- The SurfSense Podcast feature is currently being reworked for better UI and stability. Expect it soon.
|
|
||||||
|
|
||||||
|
|
||||||
## FEATURE REQUESTS AND FUTURE
|
## FEATURE REQUESTS AND FUTURE
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -76,7 +114,13 @@ Join the [SurfSense Discord](https://discord.gg/ejRNvftDp9) and help shape the f
|
||||||
|
|
||||||
SurfSense provides two installation methods:
|
SurfSense provides two installation methods:
|
||||||
|
|
||||||
1. **[Docker Installation](https://www.surfsense.net/docs/docker-installation)** - The easiest way to get SurfSense up and running with all dependencies containerized. Less Customization.
|
1. **[Docker Installation](https://www.surfsense.net/docs/docker-installation)** - The easiest way to get SurfSense up and running with all dependencies containerized.
|
||||||
|
- Includes pgAdmin for database management through a web UI
|
||||||
|
- Supports environment variable customization via `.env` file
|
||||||
|
- Flexible deployment options (full stack or core services only)
|
||||||
|
- No need to manually edit configuration files between environments
|
||||||
|
- See [Docker Setup Guide](DOCKER_SETUP.md) for detailed instructions
|
||||||
|
- For deployment scenarios and options, see [Deployment Guide](DEPLOYMENT_GUIDE.md)
|
||||||
|
|
||||||
2. **[Manual Installation (Recommended)](https://www.surfsense.net/docs/manual-installation)** - For users who prefer more control over their setup or need to customize their deployment.
|
2. **[Manual Installation (Recommended)](https://www.surfsense.net/docs/manual-installation)** - For users who prefer more control over their setup or need to customize their deployment.
|
||||||
|
|
||||||
|
|
@ -84,7 +128,6 @@ Both installation guides include detailed OS-specific instructions for Windows,
|
||||||
|
|
||||||
Before installation, make sure to complete the [prerequisite setup steps](https://www.surfsense.net/docs/) including:
|
Before installation, make sure to complete the [prerequisite setup steps](https://www.surfsense.net/docs/) including:
|
||||||
- PGVector setup
|
- PGVector setup
|
||||||
- Google OAuth configuration
|
|
||||||
- Unstructured.io API key
|
- Unstructured.io API key
|
||||||
- Other required API keys
|
- Other required API keys
|
||||||
|
|
||||||
|
|
@ -101,6 +144,9 @@ Before installation, make sure to complete the [prerequisite setup steps](https:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
|
**Podcast Agent**
|
||||||
|

|
||||||
|
|
||||||
|
|
||||||
**Agent Chat**
|
**Agent Chat**
|
||||||
|
|
||||||
|
|
@ -112,6 +158,7 @@ Before installation, make sure to complete the [prerequisite setup steps](https:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
|
|
||||||
## Tech Stack
|
## Tech Stack
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -178,6 +225,14 @@ Before installation, make sure to complete the [prerequisite setup steps](https:
|
||||||
- **@tanstack/react-table**: Headless UI for building powerful tables & datagrids.
|
- **@tanstack/react-table**: Headless UI for building powerful tables & datagrids.
|
||||||
|
|
||||||
|
|
||||||
|
### **DevOps**
|
||||||
|
|
||||||
|
- **Docker**: Container platform for consistent deployment across environments
|
||||||
|
|
||||||
|
- **Docker Compose**: Tool for defining and running multi-container Docker applications
|
||||||
|
|
||||||
|
- **pgAdmin**: Web-based PostgreSQL administration tool included in Docker setup
|
||||||
|
|
||||||
|
|
||||||
### **Extension**
|
### **Extension**
|
||||||
Manifest v3 on Plasmo
|
Manifest v3 on Plasmo
|
||||||
|
|
@@ -185,16 +240,8 @@ Before installation, make sure to complete the [prerequisite setup steps](https:
 ## Future Work
 - Add More Connectors.
 - Patch minor bugs.
-- Implement Canvas.
-- Complete Hybrid Search. **[Done]**
-- Add support for file uploads QA. **[Done]**
-- Shift to WebSockets for Streaming responses. **[Deprecated in favor of AI SDK Stream Protocol]**
-- Based on feedback, I will work on making it compatible with local models. **[Done]**
-- Cross Browser Extension **[Done]**
-- Critical Notifications **[Done | PAUSED]**
-- Saving Chats **[Done]**
-- Basic keyword search page for saved sessions **[Done]**
-- Multi & Single Document Chat **[Done]**
+- Document Chat **[REIMPLEMENT]**
+- Document Podcasts

@@ -203,3 +250,13 @@ Before installation, make sure to complete the [prerequisite setup steps](https:
 Contributions are very welcome! A contribution can be as small as a ⭐ or even finding and creating issues.
 Fine-tuning the Backend is always desired.

+## Star History
+
+<a href="https://www.star-history.com/#MODSetter/SurfSense&Date">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=MODSetter/SurfSense&type=Date&theme=dark" />
+    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=MODSetter/SurfSense&type=Date" />
+    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=MODSetter/SurfSense&type=Date" />
+  </picture>
+</a>

34 docker-compose.override.yml Normal file

@@ -0,0 +1,34 @@
version: '3.8'

services:
  frontend:
    build:
      context: ./surfsense_web
      dockerfile: Dockerfile
    ports:
      - "${FRONTEND_PORT:-3000}:3000"
    volumes:
      - ./surfsense_web:/app
      - /app/node_modules
    depends_on:
      - backend
    environment:
      - NEXT_PUBLIC_API_URL=${NEXT_PUBLIC_API_URL:-http://backend:8000}

  backend:
    build:
      context: ./surfsense_backend
      dockerfile: Dockerfile
    ports:
      - "${BACKEND_PORT:-8000}:8000"
    volumes:
      - ./surfsense_backend:/app
    depends_on:
      - db
    env_file:
      - ./surfsense_backend/.env
    environment:
      - DATABASE_URL=postgresql+asyncpg://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-postgres}@db:5432/${POSTGRES_DB:-surfsense}
      - PYTHONPATH=/app
      - UVICORN_LOOP=asyncio
      - UNSTRUCTURED_HAS_PATCHED_LOOP=1

@@ -1,48 +1,29 @@
 version: '3.8'

 services:
-  frontend:
-    build:
-      context: ./surfsense_web
-      dockerfile: Dockerfile
-    ports:
-      - "3000:3000"
-    volumes:
-      - ./surfsense_web:/app
-      - /app/node_modules
-    depends_on:
-      - backend
-    environment:
-      - NEXT_PUBLIC_API_URL=http://backend:8000
-
-  backend:
-    build:
-      context: ./surfsense_backend
-      dockerfile: Dockerfile
-    ports:
-      - "8000:8000"
-    volumes:
-      - ./surfsense_backend:/app
-    depends_on:
-      - db
-    env_file:
-      - ./surfsense_backend/.env
-    environment:
-      - DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/surfsense
-      - PYTHONPATH=/app
-      - UVICORN_LOOP=asyncio
-      - UNSTRUCTURED_HAS_PATCHED_LOOP=1
-
   db:
     image: ankane/pgvector:latest
     ports:
-      - "5432:5432"
+      - "${POSTGRES_PORT:-5432}:5432"
     volumes:
       - postgres_data:/var/lib/postgresql/data
     environment:
-      - POSTGRES_USER=postgres
-      - POSTGRES_PASSWORD=postgres
-      - POSTGRES_DB=surfsense
+      - POSTGRES_USER=${POSTGRES_USER:-postgres}
+      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-postgres}
+      - POSTGRES_DB=${POSTGRES_DB:-surfsense}
+
+  pgadmin:
+    image: dpage/pgadmin4
+    ports:
+      - "${PGADMIN_PORT:-5050}:80"
+    environment:
+      - PGADMIN_DEFAULT_EMAIL=${PGADMIN_DEFAULT_EMAIL:-admin@surfsense.com}
+      - PGADMIN_DEFAULT_PASSWORD=${PGADMIN_DEFAULT_PASSWORD:-surfsense}
+    volumes:
+      - pgadmin_data:/var/lib/pgadmin
+    depends_on:
+      - db

 volumes:
   postgres_data:
+  pgadmin_data:
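The `${VAR:-default}` references in both compose files are expanded by Docker Compose before containers start: a value from the shell or the `.env` file wins, and the literal default is used when the variable is unset or empty. A minimal Python sketch of that lookup rule (the `resolve` helper is illustrative, not part of the repo):

```python
import re

def resolve(expr: str, env: dict) -> str:
    """Expand ${NAME:-default} references the way Compose's :- operator does:
    use env[NAME] when set and non-empty, otherwise the default."""
    pattern = re.compile(r"\$\{(\w+):-([^}]*)\}")
    return pattern.sub(lambda m: env.get(m.group(1)) or m.group(2), expr)

# With no override, the documented default is used...
print(resolve("${POSTGRES_PORT:-5432}:5432", {}))  # 5432:5432
# ...and an entry from .env takes precedence.
print(resolve("${POSTGRES_PORT:-5432}:5432", {"POSTGRES_PORT": "5433"}))  # 5433:5432
```

This is why the override file can be committed with safe defaults while each developer customizes ports purely through `.env`.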
@@ -1,10 +1,15 @@
 DATABASE_URL="postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense"

 SECRET_KEY="SECRET"
-GOOGLE_OAUTH_CLIENT_ID="924507538m"
-GOOGLE_OAUTH_CLIENT_SECRET="GOCSV"
 NEXT_FRONTEND_URL="http://localhost:3000"

+#Auth
+AUTH_TYPE="GOOGLE" or "LOCAL"
+# For Google Auth Only
+GOOGLE_OAUTH_CLIENT_ID="924507538m"
+GOOGLE_OAUTH_CLIENT_SECRET="GOCSV"
+
+#Embedding Model
 EMBEDDING_MODEL="mixedbread-ai/mxbai-embed-large-v1"

 RERANKERS_MODEL_NAME="ms-marco-MiniLM-L-12-v2"

@@ -15,15 +20,32 @@ FAST_LLM="openai/gpt-4o-mini"
 STRATEGIC_LLM="openai/gpt-4o"
 LONG_CONTEXT_LLM="gemini/gemini-2.0-flash"

+#LiteLLM TTS Provider: https://docs.litellm.ai/docs/text_to_speech#supported-providers
+TTS_SERVICE="openai/tts-1"
+
+#LiteLLM STT Provider: https://docs.litellm.ai/docs/audio_transcription#supported-providers
+STT_SERVICE="openai/whisper-1"
+
 # Chosen LiteLLM Providers Keys
 OPENAI_API_KEY="sk-proj-iA"
 GEMINI_API_KEY="AIzaSyB6-1641124124124124124124124124124"

-UNSTRUCTURED_API_KEY="Tpu3P0U8iy"
 FIRECRAWL_API_KEY="fcr-01J0000000000000000000000"

+#File Parser Service
+ETL_SERVICE="UNSTRUCTURED" or "LLAMACLOUD"
+UNSTRUCTURED_API_KEY="Tpu3P0U8iy"
+LLAMA_CLOUD_API_KEY="llx-nnn"
+
 #OPTIONAL: Add these for LangSmith Observability
 LANGSMITH_TRACING=true
 LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
 LANGSMITH_API_KEY="lsv2_pt_....."
 LANGSMITH_PROJECT="surfsense"

+# OPTIONAL: LiteLLM API Base
+FAST_LLM_API_BASE=""
+STRATEGIC_LLM_API_BASE=""
+LONG_CONTEXT_LLM_API_BASE=""
+TTS_SERVICE_API_BASE=""
+STT_SERVICE_API_BASE=""

1 surfsense_backend/.gitignore vendored

@@ -5,3 +5,4 @@ data/
 __pycache__/
 .flashrank_cache
 surf_new_backend.egg-info/
+podcasts/

@@ -110,7 +110,6 @@ See pyproject.toml for detailed dependency information. Key dependencies include
 - fastapi and related packages
 - fastapi-users: Authentication and user management
 - firecrawl-py: Web crawling capabilities
-- gpt-researcher: Advanced research capabilities
 - langchain components for AI workflows
 - litellm: LLM model integration
 - pgvector: Vector similarity search in PostgreSQL

@@ -2,7 +2,6 @@

 Revision ID: 1
 Revises:
-Create Date: 2023-10-27 10:00:00.000000

 """
 from typing import Sequence, Union

@@ -2,7 +2,6 @@

 Revision ID: 2
 Revises: e55302644c51
-Create Date: 2025-04-16 10:00:00.000000

 """
 from typing import Sequence, Union

@@ -2,7 +2,6 @@

 Revision ID: 3
 Revises: 2
-Create Date: 2025-04-16 10:05:00.059921

 """
 from typing import Sequence, Union

44 surfsense_backend/alembic/versions/4_add_linkup_api_enum.py Normal file

@@ -0,0 +1,44 @@
"""Add LINKUP_API to SearchSourceConnectorType enum

Revision ID: 4
Revises: 3

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '4'
down_revision: Union[str, None] = '3'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###

    # Manually add the command to add the enum value
    op.execute("ALTER TYPE searchsourceconnectortype ADD VALUE 'LINKUP_API'")

    # Pass for the rest, as autogenerate didn't run to add other schema details
    pass
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###

    # Downgrading removal of an enum value requires recreating the type
    op.execute("ALTER TYPE searchsourceconnectortype RENAME TO searchsourceconnectortype_old")
    op.execute("CREATE TYPE searchsourceconnectortype AS ENUM('SERPER_API', 'TAVILY_API', 'SLACK_CONNECTOR', 'NOTION_CONNECTOR', 'GITHUB_CONNECTOR', 'LINEAR_CONNECTOR')")
    op.execute((
        "ALTER TABLE search_source_connectors ALTER COLUMN connector_type TYPE searchsourceconnectortype USING "
        "connector_type::text::searchsourceconnectortype"
    ))
    op.execute("DROP TYPE searchsourceconnectortype_old")

    pass
    # ### end Alembic commands ###

@@ -0,0 +1,57 @@
"""Remove char limit on title columns

Revision ID: 5
Revises: 4

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '5'
down_revision: Union[str, None] = '4'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Alter Chat table
    op.alter_column('chats', 'title',
                    existing_type=sa.String(200),
                    type_=sa.String(),
                    existing_nullable=False)

    # Alter Document table
    op.alter_column('documents', 'title',
                    existing_type=sa.String(200),
                    type_=sa.String(),
                    existing_nullable=False)

    # Alter Podcast table
    op.alter_column('podcasts', 'title',
                    existing_type=sa.String(200),
                    type_=sa.String(),
                    existing_nullable=False)


def downgrade() -> None:
    # Revert Chat table
    op.alter_column('chats', 'title',
                    existing_type=sa.String(),
                    type_=sa.String(200),
                    existing_nullable=False)

    # Revert Document table
    op.alter_column('documents', 'title',
                    existing_type=sa.String(),
                    type_=sa.String(200),
                    existing_nullable=False)

    # Revert Podcast table
    op.alter_column('podcasts', 'title',
                    existing_type=sa.String(),
                    type_=sa.String(200),
                    existing_nullable=False)

@@ -0,0 +1,43 @@
"""Change podcast_content to podcast_transcript with JSON type

Revision ID: 6
Revises: 5

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSON


# revision identifiers, used by Alembic.
revision: str = '6'
down_revision: Union[str, None] = '5'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Drop the old column and create a new one with the new name and type
    # We need to do this because PostgreSQL doesn't support direct column renames with type changes
    op.add_column('podcasts', sa.Column('podcast_transcript', JSON, nullable=False, server_default='{}'))

    # Copy data from old column to new column
    # Convert text to JSON by storing it as a JSON string value
    op.execute("UPDATE podcasts SET podcast_transcript = jsonb_build_object('text', podcast_content) WHERE podcast_content != ''")

    # Drop the old column
    op.drop_column('podcasts', 'podcast_content')


def downgrade() -> None:
    # Add back the original column
    op.add_column('podcasts', sa.Column('podcast_content', sa.Text(), nullable=False, server_default=''))

    # Copy data from JSON column back to text column
    # Extract the 'text' field if it exists, otherwise use empty string
    op.execute("UPDATE podcasts SET podcast_content = COALESCE((podcast_transcript->>'text'), '')")

    # Drop the new column
    op.drop_column('podcasts', 'podcast_transcript')

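This migration's data copy wraps the old plain-text column in a one-key JSON object via `jsonb_build_object`, and the downgrade unwraps it with `->>'text'` plus `COALESCE`. The round trip, sketched in plain Python for illustration (these helpers are not part of the repo):

```python
import json

def upgrade_value(podcast_content: str) -> str:
    """Mimics jsonb_build_object('text', podcast_content)."""
    return json.dumps({"text": podcast_content})

def downgrade_value(podcast_transcript: str) -> str:
    """Mimics COALESCE(podcast_transcript->>'text', '')."""
    parsed = json.loads(podcast_transcript)
    return parsed.get("text") or ""

original = "Episode one transcript"
# Upgrading then downgrading preserves the text exactly.
print(downgrade_value(upgrade_value(original)) == original)  # True
```

Going through a wrapper object (rather than storing raw JSON text) keeps the column valid JSON even for transcripts containing quotes or braces.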
@@ -0,0 +1,27 @@
"""Remove is_generated column from podcasts table

Revision ID: 7
Revises: 6

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '7'
down_revision: Union[str, None] = '6'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Drop the is_generated column
    op.drop_column('podcasts', 'is_generated')


def downgrade() -> None:
    # Add back the is_generated column with its original constraints
    op.add_column('podcasts', sa.Column('is_generated', sa.Boolean(), nullable=False, server_default='false'))

@@ -0,0 +1,56 @@
"""Add content_hash column to documents table

Revision ID: 8
Revises: 7
"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '8'
down_revision: Union[str, None] = '7'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Add content_hash column as nullable first to handle existing data
    op.add_column('documents', sa.Column('content_hash', sa.String(), nullable=True))

    # Update existing documents to generate content hashes
    # Using SHA-256 hash of the content column with proper UTF-8 encoding
    op.execute("""
        UPDATE documents
        SET content_hash = encode(sha256(convert_to(content, 'UTF8')), 'hex')
        WHERE content_hash IS NULL
    """)

    # Handle duplicate content hashes by keeping only the oldest document for each hash
    # Delete newer documents with duplicate content hashes
    op.execute("""
        DELETE FROM documents
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM documents
            GROUP BY content_hash
        )
    """)

    # Now alter the column to match the model: nullable=False, index=True, unique=True
    op.alter_column('documents', 'content_hash',
                    existing_type=sa.String(),
                    nullable=False)
    op.create_index(op.f('ix_documents_content_hash'), 'documents', ['content_hash'], unique=False)
    op.create_unique_constraint(op.f('uq_documents_content_hash'), 'documents', ['content_hash'])


def downgrade() -> None:
    # Remove constraints and index first
    op.drop_constraint(op.f('uq_documents_content_hash'), 'documents', type_='unique')
    op.drop_index(op.f('ix_documents_content_hash'), table_name='documents')

    # Remove content_hash column from documents table
    op.drop_column('documents', 'content_hash')

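The backfill SQL above computes `encode(sha256(convert_to(content, 'UTF8')), 'hex')`. Application code that needs to produce the same value for new documents can do so with Python's `hashlib`; a sketch of the equivalence (this helper is illustrative, not code from the repo):

```python
import hashlib

def content_hash(content: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded content, matching PostgreSQL's
    encode(sha256(convert_to(content, 'UTF8')), 'hex')."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

print(content_hash("hello"))
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

Because the Python and PostgreSQL expressions agree byte-for-byte, the unique constraint on `content_hash` can be used to deduplicate documents regardless of which side computed the hash.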
@@ -2,7 +2,6 @@

 Revision ID: e55302644c51
 Revises: 1
-Create Date: 2025-04-13 19:56:00.059921

 """
 from typing import Sequence, Union

@@ -1 +0,0 @@
-"""This is upcoming research agent. Work in progress."""
8 surfsense_backend/app/agents/podcaster/__init__.py Normal file

@@ -0,0 +1,8 @@
"""New LangGraph Agent.

This module defines a custom graph.
"""

from .graph import graph

__all__ = ["graph"]
28 surfsense_backend/app/agents/podcaster/configuration.py Normal file

@@ -0,0 +1,28 @@
"""Define the configurable parameters for the agent."""

from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional

from langchain_core.runnables import RunnableConfig


@dataclass(kw_only=True)
class Configuration:
    """The configuration for the agent."""

    # Changeme: Add configurable values here!
    # these values can be pre-set when you
    # create assistants (https://langchain-ai.github.io/langgraph/cloud/how-tos/configuration_cloud/)
    # and when you invoke the graph
    podcast_title: str

    @classmethod
    def from_runnable_config(
        cls, config: Optional[RunnableConfig] = None
    ) -> Configuration:
        """Create a Configuration instance from a RunnableConfig object."""
        configurable = (config.get("configurable") or {}) if config else {}
        _fields = {f.name for f in fields(cls) if f.init}
        return cls(**{k: v for k, v in configurable.items() if k in _fields})
31 surfsense_backend/app/agents/podcaster/graph.py Normal file

@@ -0,0 +1,31 @@
from langgraph.graph import StateGraph

from .configuration import Configuration
from .state import State

from .nodes import create_merged_podcast_audio, create_podcast_transcript


def build_graph():

    # Define a new graph
    workflow = StateGraph(State, config_schema=Configuration)

    # Add the node to the graph
    workflow.add_node("create_podcast_transcript", create_podcast_transcript)
    workflow.add_node("create_merged_podcast_audio", create_merged_podcast_audio)

    # Set the entrypoint as `call_model`
    workflow.add_edge("__start__", "create_podcast_transcript")
    workflow.add_edge("create_podcast_transcript", "create_merged_podcast_audio")
    workflow.add_edge("create_merged_podcast_audio", "__end__")

    # Compile the workflow into an executable graph
    graph = workflow.compile()
    graph.name = "Surfsense Podcaster"  # This defines the custom name in LangSmith

    return graph


# Compile the graph once when the module is loaded
graph = build_graph()
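The compiled graph is a strict two-step pipeline: each node returns a partial state dict that LangGraph merges into the running state before the next node fires. Stripped of LangGraph itself, the control flow reduces to something like the following (the `fake_*` nodes are illustrative stand-ins, not the real async nodes):

```python
# Minimal stand-in for the two-node linear StateGraph above.
def fake_transcript_node(state: dict) -> dict:
    # Real node: LLM turns state["source_content"] into transcript entries.
    return {"podcast_transcript": ["dialog 1", "dialog 2"]}

def fake_audio_node(state: dict) -> dict:
    # Real node: TTS per segment, then ffmpeg concat into one mp3.
    n = len(state["podcast_transcript"])
    return {"final_podcast_file_path": f"podcasts/{n}_segments.mp3"}

def run_pipeline(state: dict) -> dict:
    # __start__ -> create_podcast_transcript -> create_merged_podcast_audio -> __end__
    for node in (fake_transcript_node, fake_audio_node):
        state = {**state, **node(state)}  # merge each node's partial update
    return state

result = run_pipeline({"source_content": "..."})
print(result["final_podcast_file_path"])  # podcasts/2_segments.mp3
```

Keeping nodes as pure state-in/partial-state-out functions is what lets the same code run under LangGraph's checkpointing and tracing without changes.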
206 surfsense_backend/app/agents/podcaster/nodes.py Normal file

@@ -0,0 +1,206 @@
from typing import Any, Dict
import json
import os
import uuid
from pathlib import Path
import asyncio

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableConfig
from litellm import aspeech
from ffmpeg.asyncio import FFmpeg

from .configuration import Configuration
from .state import PodcastTranscriptEntry, State, PodcastTranscripts
from .prompts import get_podcast_generation_prompt
from app.config import config as app_config


async def create_podcast_transcript(state: State, config: RunnableConfig) -> Dict[str, Any]:
    """Each node does work."""

    # Initialize LLM
    llm = app_config.long_context_llm_instance

    # Get the prompt
    prompt = get_podcast_generation_prompt()

    # Create the messages
    messages = [
        SystemMessage(content=prompt),
        HumanMessage(content=f"<source_content>{state.source_content}</source_content>")
    ]

    # Generate the podcast transcript
    llm_response = await llm.ainvoke(messages)

    # First try the direct approach
    try:
        podcast_transcript = PodcastTranscripts.model_validate(json.loads(llm_response.content))
    except (json.JSONDecodeError, ValueError) as e:
        print(f"Direct JSON parsing failed, trying fallback approach: {str(e)}")

        # Fallback: Parse the JSON response manually
        try:
            # Extract JSON content from the response
            content = llm_response.content

            # Find the JSON in the content (handle case where LLM might add additional text)
            json_start = content.find('{')
            json_end = content.rfind('}') + 1
            if json_start >= 0 and json_end > json_start:
                json_str = content[json_start:json_end]

                # Parse the JSON string
                parsed_data = json.loads(json_str)

                # Convert to Pydantic model
                podcast_transcript = PodcastTranscripts.model_validate(parsed_data)

                print("Successfully parsed podcast transcript using fallback approach")
            else:
                # If JSON structure not found, raise a clear error
                error_message = f"Could not find valid JSON in LLM response. Raw response: {content}"
                print(error_message)
                raise ValueError(error_message)

        except (json.JSONDecodeError, ValueError) as e2:
            # Log the error and re-raise it
            error_message = f"Error parsing LLM response (fallback also failed): {str(e2)}"
            print(f"Error parsing LLM response: {str(e2)}")
            print(f"Raw response: {llm_response.content}")
            raise

    return {
        "podcast_transcript": podcast_transcript.podcast_transcripts
    }


async def create_merged_podcast_audio(state: State, config: RunnableConfig) -> Dict[str, Any]:
    """Generate audio for each transcript and merge them into a single podcast file."""

    configuration = Configuration.from_runnable_config(config)

    starting_transcript = PodcastTranscriptEntry(
        speaker_id=1,
        dialog=f"Welcome to {configuration.podcast_title} Podcast."
    )

    transcript = state.podcast_transcript

    # Merge the starting transcript with the podcast transcript
    # Check if transcript is a PodcastTranscripts object or already a list
    if hasattr(transcript, 'podcast_transcripts'):
        transcript_entries = transcript.podcast_transcripts
    else:
        transcript_entries = transcript

    merged_transcript = [starting_transcript] + transcript_entries

    # Create a temporary directory for audio files
    temp_dir = Path("temp_audio")
    temp_dir.mkdir(exist_ok=True)

    # Generate a unique session ID for this podcast
    session_id = str(uuid.uuid4())
    output_path = f"podcasts/{session_id}_podcast.mp3"
    os.makedirs("podcasts", exist_ok=True)

    # Map of speaker_id to voice
    voice_mapping = {
        0: "alloy",    # Default/intro voice
        1: "echo",     # First speaker
        # 2: "fable",  # Second speaker
        # 3: "onyx",   # Third speaker
        # 4: "nova",   # Fourth speaker
        # 5: "shimmer" # Fifth speaker
    }

    # Generate audio for each transcript segment
    audio_files = []

    async def generate_speech_for_segment(segment, index):
        # Handle both dictionary and PodcastTranscriptEntry objects
        if hasattr(segment, 'speaker_id'):
            speaker_id = segment.speaker_id
            dialog = segment.dialog
        else:
            speaker_id = segment.get("speaker_id", 0)
            dialog = segment.get("dialog", "")

        # Select voice based on speaker_id
        voice = voice_mapping.get(speaker_id, "alloy")

        # Generate a unique filename for this segment
        filename = f"{temp_dir}/{session_id}_{index}.mp3"

        try:
            if app_config.TTS_SERVICE_API_BASE:
                response = await aspeech(
                    model=app_config.TTS_SERVICE,
                    api_base=app_config.TTS_SERVICE_API_BASE,
                    voice=voice,
                    input=dialog,
                    max_retries=2,
                    timeout=600,
                )
            else:
                response = await aspeech(
                    model=app_config.TTS_SERVICE,
                    voice=voice,
                    input=dialog,
                    max_retries=2,
                    timeout=600,
                )

            # Save the audio to a file - use proper streaming method
            with open(filename, 'wb') as f:
                f.write(response.content)

            return filename
        except Exception as e:
            print(f"Error generating speech for segment {index}: {str(e)}")
            raise

    # Generate all audio files concurrently
    tasks = [generate_speech_for_segment(segment, i) for i, segment in enumerate(merged_transcript)]
    audio_files = await asyncio.gather(*tasks)

    # Merge audio files using ffmpeg
    try:
        # Create FFmpeg instance with the first input
        ffmpeg = FFmpeg().option("y")

        # Add each audio file as input
        for audio_file in audio_files:
            ffmpeg = ffmpeg.input(audio_file)

        # Configure the concatenation and output
        filter_complex = []
        for i in range(len(audio_files)):
            filter_complex.append(f"[{i}:0]")

        filter_complex_str = "".join(filter_complex) + f"concat=n={len(audio_files)}:v=0:a=1[outa]"
        ffmpeg = ffmpeg.option("filter_complex", filter_complex_str)
        ffmpeg = ffmpeg.output(output_path, map="[outa]")

        # Execute FFmpeg
        await ffmpeg.execute()

        print(f"Successfully created podcast audio: {output_path}")

    except Exception as e:
        print(f"Error merging audio files: {str(e)}")
        raise
    finally:
        # Clean up temporary files
        for audio_file in audio_files:
            try:
                os.remove(audio_file)
            except OSError:
                pass

    return {
        "podcast_transcript": merged_transcript,
        "final_podcast_file_path": output_path
    }
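The fallback branch in `create_podcast_transcript` slices from the first `{` to the last `}` before re-parsing, which tolerates models that wrap the JSON in conversational prose. That recovery step in isolation (an illustrative helper, not an exact copy of the node's code):

```python
import json

def extract_json(content: str) -> dict:
    """Recover a JSON object possibly embedded in surrounding text."""
    try:
        return json.loads(content)  # direct parse first
    except json.JSONDecodeError:
        # Fallback: slice from the first '{' to the last '}' and retry.
        start = content.find('{')
        end = content.rfind('}') + 1
        if start >= 0 and end > start:
            return json.loads(content[start:end])
        raise ValueError(f"Could not find valid JSON in: {content!r}")

reply = 'Sure! Here is the transcript:\n{"podcast_transcripts": []}\nHope that helps.'
print(extract_json(reply))  # {'podcast_transcripts': []}
```

The slice-and-retry heuristic fails if the wrapper prose itself contains braces, which is why the node still validates the result against the Pydantic schema afterwards.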
111 surfsense_backend/app/agents/podcaster/prompts.py Normal file

@@ -0,0 +1,111 @@
|
import datetime


def get_podcast_generation_prompt():
    return f"""
Today's date: {datetime.datetime.now().strftime("%Y-%m-%d")}
<podcast_generation_system>
You are a master podcast scriptwriter, adept at transforming diverse input content into a lively, engaging, and natural-sounding conversation between two distinct podcast hosts. Your primary objective is to craft authentic, flowing dialogue that captures the spontaneity and chemistry of a real podcast discussion, completely avoiding any hint of robotic scripting or stiff formality. Think dynamic interplay, not just information delivery.

<input>
- '<source_content>': A block of text containing the information to be discussed in the podcast. This could be research findings, an article summary, a detailed outline, user chat history related to the topic, or any other relevant raw information. The content might be unstructured but serves as the factual basis for the podcast dialogue.
</input>

<output_format>
A JSON object containing the podcast transcript with alternating speakers:
{{
    "podcast_transcripts": [
        {{
            "speaker_id": 0,
            "dialog": "Speaker 0 dialog here"
        }},
        {{
            "speaker_id": 1,
            "dialog": "Speaker 1 dialog here"
        }},
        {{
            "speaker_id": 0,
            "dialog": "Speaker 0 dialog here"
        }},
        {{
            "speaker_id": 1,
            "dialog": "Speaker 1 dialog here"
        }}
    ]
}}
</output_format>

<guidelines>
1. **Establish Distinct & Consistent Host Personas:**
    * **Speaker 0 (Lead Host):** Drives the conversation forward, introduces segments, poses key questions derived from the source content, and often summarizes takeaways. Maintain a guiding, clear, and engaging tone.
    * **Speaker 1 (Co-Host/Expert):** Offers deeper insights, provides alternative viewpoints or elaborations on the source content, asks clarifying or challenging questions, and shares relevant anecdotes or examples. Adopt a complementary tone (e.g., analytical, enthusiastic, reflective, slightly skeptical).
    * **Consistency is Key:** Ensure each speaker maintains their distinct voice, vocabulary choice, sentence structure, and perspective throughout the entire script. Avoid having them sound interchangeable. Their interaction should feel like a genuine partnership.

2. **Craft Natural & Dynamic Dialogue:**
    * **Emulate Real Conversation:** Use contractions (e.g., "don't", "it's"), interjections ("Oh!", "Wow!", "Hmm"), discourse markers ("you know", "right?", "well"), and occasional natural pauses or filler words. Avoid overly formal language or complex sentence structures typical of written text.
    * **Foster Interaction & Chemistry:** Write dialogue where speakers genuinely react *to each other*. They should build on points ("Exactly, and that reminds me..."), ask follow-up questions ("Could you expand on that?"), express agreement/disagreement respectfully ("That's a fair point, but have you considered...?"), and show active listening.
    * **Vary Rhythm & Pace:** Mix short, punchy lines with longer, more explanatory ones. Vary sentence beginnings. Use questions to break up exposition. The rhythm should feel spontaneous, not monotonous.
    * **Inject Personality & Relatability:** Allow for appropriate humor, moments of surprise or curiosity, brief personal reflections ("I actually experienced something similar..."), or relatable asides that fit the hosts' personas and the topic. Lightly reference past discussions if it enhances context ("Remember last week when we touched on...?").

3. **Structure for Flow and Listener Engagement:**
    * **Natural Beginning:** Start with dialogue that flows naturally after an introduction (which will be added manually). Avoid redundant greetings or podcast name mentions since these will be added separately.
    * **Logical Progression & Signposting:** Guide the listener through the information smoothly. Use clear transitions to link different ideas or segments ("So, now that we've covered X, let's dive into Y...", "That actually brings me to another key finding..."). Ensure topics flow logically from one to the next.
    * **Meaningful Conclusion:** Summarize the key takeaways or main points discussed, reinforcing the core message derived from the source content. End with a final thought, a lingering question for the audience, or a brief teaser for what's next, providing a sense of closure. Avoid abrupt endings.

4. **Integrate Source Content Seamlessly & Accurately:**
    * **Translate, Don't Recite:** Rephrase information from the `<source_content>` into conversational language suitable for each host's persona. Avoid directly copying dense sentences or technical jargon without explanation. The goal is discussion, not narration.
    * **Explain & Contextualize:** Use analogies, simple examples, storytelling, or have one host ask clarifying questions (acting as a listener surrogate) to break down complex ideas from the source.
    * **Weave Information Naturally:** Integrate facts, data, or key points from the source *within* the dialogue, not as standalone, undigested blocks. Attribute information conversationally where appropriate ("The research mentioned...", "Apparently, the key factor is...").
    * **Balance Depth & Accessibility:** Ensure the conversation is informative and factually accurate based on the source content, but prioritize clear communication and engaging delivery over exhaustive technical detail. Make it understandable and interesting for a general audience.

5. **Length & Pacing:**
    * **Six-Minute Duration:** Create a transcript that, when read at a natural speaking pace, would result in approximately 6 minutes of audio. Typically, this means around 1000 words total (based on an average speaking rate of 150 words per minute).
    * **Concise Speaking Turns:** Keep most speaking turns relatively brief and focused. Aim for a natural back-and-forth rhythm rather than extended monologues.
    * **Essential Content Only:** Prioritize the most important information from the source content. Focus on quality over quantity, ensuring every line contributes meaningfully to the topic.
</guidelines>

<examples>
Input: "Quantum computing uses quantum bits or qubits which can exist in multiple states simultaneously due to superposition."

Output:
{{
    "podcast_transcripts": [
        {{
            "speaker_id": 0,
            "dialog": "Today we're diving into the mind-bending world of quantum computing. You know, this is a topic I've been excited to cover for weeks."
        }},
        {{
            "speaker_id": 1,
            "dialog": "Same here! And I know our listeners have been asking for it. But I have to admit, the concept of quantum computing makes my head spin a little. Can we start with the basics?"
        }},
        {{
            "speaker_id": 0,
            "dialog": "Absolutely. So regular computers use bits, right? Little on-off switches that are either 1 or 0. But quantum computers use something called qubits, and this is where it gets fascinating."
        }},
        {{
            "speaker_id": 1,
            "dialog": "Wait, what makes qubits so special compared to regular bits?"
        }},
        {{
            "speaker_id": 0,
            "dialog": "The magic is in something called superposition. These qubits can exist in multiple states at the same time, not just 1 or 0."
        }},
        {{
            "speaker_id": 1,
            "dialog": "That sounds impossible! How would you even picture that?"
        }},
        {{
            "speaker_id": 0,
            "dialog": "Think of it like a coin spinning in the air. Before it lands, is it heads or tails?"
        }},
        {{
            "speaker_id": 1,
            "dialog": "Well, it's... neither? Or I guess both, until it lands? Oh, I think I see where you're going with this."
        }}
    ]
}}
</examples>

Transform the source material into a lively and engaging podcast conversation. Craft dialogue that showcases authentic host chemistry and natural interaction (including occasional disagreement, building on points, or asking follow-up questions). Use varied speech patterns reflecting real human conversation, ensuring the final script effectively educates *and* entertains the listener while keeping within a 6-minute audio duration.
</podcast_generation_system>
"""
surfsense_backend/app/agents/podcaster/state.py (new file, 38 lines)
@@ -0,0 +1,38 @@
"""Define the state structures for the agent."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional
|
||||||
|
from pydantic import BaseModel, Field
|
||||||
|
|
||||||
|
|
||||||
|
class PodcastTranscriptEntry(BaseModel):
|
||||||
|
"""
|
||||||
|
Represents a single entry in a podcast transcript.
|
||||||
|
"""
|
||||||
|
speaker_id: int = Field(..., description="The ID of the speaker (0 or 1)")
|
||||||
|
dialog: str = Field(..., description="The dialog text spoken by the speaker")
|
||||||
|
|
||||||
|
|
||||||
|
class PodcastTranscripts(BaseModel):
|
||||||
|
"""
|
||||||
|
Represents the full podcast transcript structure.
|
||||||
|
"""
|
||||||
|
podcast_transcripts: List[PodcastTranscriptEntry] = Field(
|
||||||
|
...,
|
||||||
|
description="List of transcript entries with alternating speakers"
|
||||||
|
)
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class State:
|
||||||
|
"""Defines the input state for the agent, representing a narrower interface to the outside world.
|
||||||
|
|
||||||
|
This class is used to define the initial state and structure of incoming data.
|
||||||
|
See: https://langchain-ai.github.io/langgraph/concepts/low_level/#state
|
||||||
|
for more information.
|
||||||
|
"""
|
||||||
|
|
||||||
|
source_content: str
|
||||||
|
podcast_transcript: Optional[List[PodcastTranscriptEntry]] = None
|
||||||
|
final_podcast_file_path: Optional[str] = None
|
||||||
|
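The `<output_format>` JSON that the generation prompt requests maps directly onto these transcript models. A stdlib-only sketch (not repo code) of the shape check the pydantic models perform, here done by hand on the raw JSON:

```python
# Parse a model response and verify the transcript shape: a
# "podcast_transcripts" list of {speaker_id, dialog} entries where the
# two hosts alternate 0, 1, 0, 1, ...
import json

raw = json.loads("""
{
  "podcast_transcripts": [
    {"speaker_id": 0, "dialog": "Welcome back!"},
    {"speaker_id": 1, "dialog": "Great to be here."}
  ]
}
""")

entries = raw["podcast_transcripts"]
assert all(e["speaker_id"] in (0, 1) for e in entries)
# Alternation check: speaker ids should follow 0, 1, 0, 1, ...
assert [e["speaker_id"] for e in entries] == [i % 2 for i in range(len(entries))]
```

In the repository this validation is what `PodcastTranscripts(**raw)` gives you for free, plus per-field type checking.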
@@ -3,10 +3,16 @@
 from __future__ import annotations

 from dataclasses import dataclass, fields
+from enum import Enum
 from typing import Optional, List, Any

 from langchain_core.runnables import RunnableConfig

+
+class SearchMode(Enum):
+    """Enum defining the type of search mode."""
+    CHUNKS = "CHUNKS"
+    DOCUMENTS = "DOCUMENTS"

 @dataclass(kw_only=True)
 class Configuration:
@@ -18,6 +24,7 @@ class Configuration:
     connectors_to_search: List[str]
     user_id: str
     search_space_id: int
+    search_mode: SearchMode

     @classmethod
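Because `Configuration` is populated from a runnable config dict, the `search_mode` value typically arrives as a plain string and must be restored to an enum member. A small sketch of that round-trip (names mirror the diff; the restoration step is an assumption about how the config loader handles it):

```python
# A string-valued Enum round-trips cleanly through JSON-ish config dicts:
# serialize with .value, restore with SearchMode(...).
from enum import Enum

class SearchMode(Enum):
    """Enum defining the type of search mode."""
    CHUNKS = "CHUNKS"
    DOCUMENTS = "DOCUMENTS"

mode = SearchMode("CHUNKS")  # value as it would arrive from a config dict
assert mode is SearchMode.CHUNKS
assert mode.value == "CHUNKS"
```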
@@ -1,6 +1,6 @@
 from langgraph.graph import StateGraph
 from .state import State
-from .nodes import write_answer_outline, process_sections
+from .nodes import reformulate_user_query, write_answer_outline, process_sections
 from .configuration import Configuration
 from typing import TypedDict, List, Dict, Any, Optional
@@ -25,11 +25,13 @@ def build_graph():
     workflow = StateGraph(State, config_schema=Configuration)

     # Add nodes to the graph
+    workflow.add_node("reformulate_user_query", reformulate_user_query)
     workflow.add_node("write_answer_outline", write_answer_outline)
     workflow.add_node("process_sections", process_sections)

     # Define the edges - create a linear flow
-    workflow.add_edge("__start__", "write_answer_outline")
+    workflow.add_edge("__start__", "reformulate_user_query")
+    workflow.add_edge("reformulate_user_query", "write_answer_outline")
     workflow.add_edge("write_answer_outline", "process_sections")
     workflow.add_edge("process_sections", "__end__")
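The edges above wire up a strictly linear flow: `__start__ → reformulate_user_query → write_answer_outline → process_sections → __end__`. Modeled here as plain functions over a state dict rather than LangGraph nodes (a sketch of the control flow, not the repo's node implementations):

```python
# Each "node" takes the state, enriches it, and passes it on -- the same
# shape as the linear LangGraph pipeline, minus the framework.

def reformulate_user_query(state):
    state["reformulated_query"] = state["user_query"].strip() + " (expanded)"
    return state

def write_answer_outline(state):
    state["outline"] = [f"Section on: {state['reformulated_query']}"]
    return state

def process_sections(state):
    state["report"] = "\n".join(state["outline"])
    return state

state = {"user_query": "quantum computing "}
for node in (reformulate_user_query, write_answer_outline, process_sections):
    state = node(state)

print(state["report"])
# → Section on: quantum computing (expanded)
```

The benefit of inserting `reformulate_user_query` first is that every downstream node sees the cleaned-up query instead of the raw user input.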
@@ -10,10 +10,14 @@ from langchain_core.runnables import RunnableConfig
 from pydantic import BaseModel, Field
 from sqlalchemy.ext.asyncio import AsyncSession

-from .configuration import Configuration
+from .configuration import Configuration, SearchMode
 from .prompts import get_answer_outline_system_prompt
 from .state import State
 from .sub_section_writer.graph import graph as sub_section_writer_graph
+from .sub_section_writer.configuration import SubSectionType
+
+from app.utils.query_service import QueryService

 from langgraph.types import StreamWriter
@@ -41,14 +45,15 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
     """
     streaming_service = state.streaming_service

-    streaming_service.only_update_terminal("Generating answer outline...")
+    streaming_service.only_update_terminal("🔍 Generating answer outline...")
     writer({"yeild_value": streaming_service._format_annotations()})
     # Get configuration from runnable config
     configuration = Configuration.from_runnable_config(config)
+    reformulated_query = state.reformulated_query
     user_query = configuration.user_query
     num_sections = configuration.num_sections

-    streaming_service.only_update_terminal(f"Planning research approach for query: {user_query[:100]}...")
+    streaming_service.only_update_terminal(f"🤔 Planning research approach for: \"{user_query[:100]}...\"")
     writer({"yeild_value": streaming_service._format_annotations()})

     # Initialize LLM
@@ -58,7 +63,7 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
     human_message_content = f"""
     Now Please create an answer outline for the following query:

-    User Query: {user_query}
+    User Query: {reformulated_query}
     Number of Sections: {num_sections}

     Remember to format your response as valid JSON exactly matching this structure:
@@ -78,7 +83,7 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
     Your output MUST be valid JSON in exactly this format. Do not include any other text or explanation.
     """

-    streaming_service.only_update_terminal("Designing structured outline with AI...")
+    streaming_service.only_update_terminal("📝 Designing structured outline with AI...")
     writer({"yeild_value": streaming_service._format_annotations()})

     # Create messages for the LLM
@@ -88,7 +93,7 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
     ]

     # Call the LLM directly without using structured output
-    streaming_service.only_update_terminal("Processing answer structure...")
+    streaming_service.only_update_terminal("⚙️ Processing answer structure...")
     writer({"yeild_value": streaming_service._format_annotations()})

     response = await llm.ainvoke(messages)
@@ -111,7 +116,7 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
         answer_outline = AnswerOutline(**parsed_data)

         total_questions = sum(len(section.questions) for section in answer_outline.answer_outline)
-        streaming_service.only_update_terminal(f"Successfully generated outline with {len(answer_outline.answer_outline)} sections and {total_questions} research questions")
+        streaming_service.only_update_terminal(f"✅ Successfully generated outline with {len(answer_outline.answer_outline)} sections and {total_questions} research questions!")
         writer({"yeild_value": streaming_service._format_annotations()})

         print(f"Successfully generated answer outline with {len(answer_outline.answer_outline)} sections")
@@ -121,14 +126,14 @@ async def write_answer_outline(state: State, config: RunnableConfig, writer: Str
     else:
         # If JSON structure not found, raise a clear error
         error_message = f"Could not find valid JSON in LLM response. Raw response: {content}"
-        streaming_service.only_update_terminal(error_message, "error")
+        streaming_service.only_update_terminal(f"❌ {error_message}", "error")
         writer({"yeild_value": streaming_service._format_annotations()})
         raise ValueError(error_message)

 except (json.JSONDecodeError, ValueError) as e:
     # Log the error and re-raise it
     error_message = f"Error parsing LLM response: {str(e)}"
-    streaming_service.only_update_terminal(error_message, "error")
+    streaming_service.only_update_terminal(f"❌ {error_message}", "error")
     writer({"yeild_value": streaming_service._format_annotations()})

     print(f"Error parsing LLM response: {str(e)}")
@@ -143,11 +148,18 @@ async def fetch_relevant_documents(
     connectors_to_search: List[str],
     writer: StreamWriter = None,
     state: State = None,
-    top_k: int = 20
+    top_k: int = 10,
+    connector_service: ConnectorService = None,
+    search_mode: SearchMode = SearchMode.CHUNKS
 ) -> List[Dict[str, Any]]:
     """
     Fetch relevant documents for research questions using the provided connectors.

+    This function searches across multiple data sources for information related to the
+    research questions. It provides user-friendly feedback during the search process by
+    displaying connector names (like "Web Search" instead of "TAVILY_API") and adding
+    relevant emojis to indicate the type of source being searched.
+
     Args:
         research_questions: List of research questions to find documents for
         user_id: The user ID
@@ -157,19 +169,22 @@ async def fetch_relevant_documents(
         writer: StreamWriter for sending progress updates
         state: The current state containing the streaming service
         top_k: Number of top results to retrieve per connector per question
+        connector_service: An initialized connector service to use for searching

     Returns:
         List of relevant documents
     """
     # Initialize services
-    connector_service = ConnectorService(db_session)
+    # connector_service = ConnectorService(db_session)

     # Only use streaming if both writer and state are provided
     streaming_service = state.streaming_service if state is not None else None

     # Stream initial status update
     if streaming_service and writer:
-        streaming_service.only_update_terminal(f"Starting research on {len(research_questions)} questions using {len(connectors_to_search)} connectors...")
+        connector_names = [get_connector_friendly_name(connector) for connector in connectors_to_search]
+        connector_names_str = ", ".join(connector_names)
+        streaming_service.only_update_terminal(f"🔎 Starting research on {len(research_questions)} questions using {connector_names_str} data sources")
         writer({"yeild_value": streaming_service._format_annotations()})

     all_raw_documents = []  # Store all raw documents
@@ -178,7 +193,7 @@ async def fetch_relevant_documents(
     for i, user_query in enumerate(research_questions):
         # Stream question being researched
         if streaming_service and writer:
-            streaming_service.only_update_terminal(f"Researching question {i+1}/{len(research_questions)}: {user_query[:100]}...")
+            streaming_service.only_update_terminal(f"🧠 Researching question {i+1}/{len(research_questions)}: \"{user_query[:100]}...\"")
             writer({"yeild_value": streaming_service._format_annotations()})

         # Use original research question as the query
@@ -188,7 +203,9 @@ async def fetch_relevant_documents(
         for connector in connectors_to_search:
             # Stream connector being searched
             if streaming_service and writer:
-                streaming_service.only_update_terminal(f"Searching {connector} for relevant information...")
+                connector_emoji = get_connector_emoji(connector)
+                friendly_name = get_connector_friendly_name(connector)
+                streaming_service.only_update_terminal(f"{connector_emoji} Searching {friendly_name} for relevant information...")
                 writer({"yeild_value": streaming_service._format_annotations()})

             try:
@@ -197,7 +214,8 @@ async def fetch_relevant_documents(
                     user_query=reformulated_query,
                     user_id=user_id,
                     search_space_id=search_space_id,
-                    top_k=top_k
+                    top_k=top_k,
+                    search_mode=search_mode
                 )

                 # Add to sources and raw documents
@@ -207,7 +225,7 @@ async def fetch_relevant_documents(

                 # Stream found document count
                 if streaming_service and writer:
-                    streaming_service.only_update_terminal(f"Found {len(youtube_chunks)} YouTube chunks relevant to the query")
+                    streaming_service.only_update_terminal(f"📹 Found {len(youtube_chunks)} YouTube chunks related to your query")
                     writer({"yeild_value": streaming_service._format_annotations()})

             elif connector == "EXTENSION":
@@ -215,7 +233,8 @@ async def fetch_relevant_documents(
                     user_query=reformulated_query,
                     user_id=user_id,
                     search_space_id=search_space_id,
-                    top_k=top_k
+                    top_k=top_k,
+                    search_mode=search_mode
                 )

                 # Add to sources and raw documents
@@ -225,7 +244,7 @@ async def fetch_relevant_documents(

                 # Stream found document count
                 if streaming_service and writer:
-                    streaming_service.only_update_terminal(f"Found {len(extension_chunks)} extension chunks relevant to the query")
+                    streaming_service.only_update_terminal(f"🧩 Found {len(extension_chunks)} Browser Extension chunks related to your query")
                     writer({"yeild_value": streaming_service._format_annotations()})

             elif connector == "CRAWLED_URL":
@@ -233,7 +252,8 @@ async def fetch_relevant_documents(
                     user_query=reformulated_query,
                     user_id=user_id,
                     search_space_id=search_space_id,
-                    top_k=top_k
+                    top_k=top_k,
+                    search_mode=search_mode
                 )

                 # Add to sources and raw documents
@@ -243,7 +263,7 @@ async def fetch_relevant_documents(

                 # Stream found document count
                 if streaming_service and writer:
-                    streaming_service.only_update_terminal(f"Found {len(crawled_urls_chunks)} crawled URL chunks relevant to the query")
+                    streaming_service.only_update_terminal(f"🌐 Found {len(crawled_urls_chunks)} Web Pages chunks related to your query")
                     writer({"yeild_value": streaming_service._format_annotations()})

             elif connector == "FILE":
@@ -251,7 +271,8 @@ async def fetch_relevant_documents(
                     user_query=reformulated_query,
                     user_id=user_id,
                     search_space_id=search_space_id,
-                    top_k=top_k
+                    top_k=top_k,
+                    search_mode=search_mode
                 )

                 # Add to sources and raw documents
@@ -261,9 +282,86 @@ async def fetch_relevant_documents(

                 # Stream found document count
                 if streaming_service and writer:
-                    streaming_service.only_update_terminal(f"Found {len(files_chunks)} file chunks relevant to the query")
+                    streaming_service.only_update_terminal(f"📄 Found {len(files_chunks)} Files chunks related to your query")
                     writer({"yeild_value": streaming_service._format_annotations()})

+            elif connector == "SLACK_CONNECTOR":
+                source_object, slack_chunks = await connector_service.search_slack(
+                    user_query=reformulated_query,
+                    user_id=user_id,
+                    search_space_id=search_space_id,
+                    top_k=top_k,
+                    search_mode=search_mode
+                )
+
+                # Add to sources and raw documents
+                if source_object:
+                    all_sources.append(source_object)
+                all_raw_documents.extend(slack_chunks)
+
+                # Stream found document count
+                if streaming_service and writer:
+                    streaming_service.only_update_terminal(f"💬 Found {len(slack_chunks)} Slack messages related to your query")
+                    writer({"yeild_value": streaming_service._format_annotations()})
+
+            elif connector == "NOTION_CONNECTOR":
+                source_object, notion_chunks = await connector_service.search_notion(
+                    user_query=reformulated_query,
+                    user_id=user_id,
+                    search_space_id=search_space_id,
+                    top_k=top_k,
+                    search_mode=search_mode
+                )
+
+                # Add to sources and raw documents
+                if source_object:
+                    all_sources.append(source_object)
+                all_raw_documents.extend(notion_chunks)
+
+                # Stream found document count
+                if streaming_service and writer:
+                    streaming_service.only_update_terminal(f"📘 Found {len(notion_chunks)} Notion pages/blocks related to your query")
+                    writer({"yeild_value": streaming_service._format_annotations()})
+
+            elif connector == "GITHUB_CONNECTOR":
+                source_object, github_chunks = await connector_service.search_github(
+                    user_query=reformulated_query,
+                    user_id=user_id,
+                    search_space_id=search_space_id,
+                    top_k=top_k,
+                    search_mode=search_mode
+                )
+
+                # Add to sources and raw documents
+                if source_object:
+                    all_sources.append(source_object)
+                all_raw_documents.extend(github_chunks)
+
+                # Stream found document count
+                if streaming_service and writer:
+                    streaming_service.only_update_terminal(f"🐙 Found {len(github_chunks)} GitHub files/issues related to your query")
+                    writer({"yeild_value": streaming_service._format_annotations()})
+
+            elif connector == "LINEAR_CONNECTOR":
+                source_object, linear_chunks = await connector_service.search_linear(
+                    user_query=reformulated_query,
+                    user_id=user_id,
+                    search_space_id=search_space_id,
+                    top_k=top_k,
+                    search_mode=search_mode
+                )
+
+                # Add to sources and raw documents
+                if source_object:
+                    all_sources.append(source_object)
+                all_raw_documents.extend(linear_chunks)
+
+                # Stream found document count
+                if streaming_service and writer:
+                    streaming_service.only_update_terminal(f"📊 Found {len(linear_chunks)} Linear issues related to your query")
+                    writer({"yeild_value": streaming_service._format_annotations()})
+
             elif connector == "TAVILY_API":
                 source_object, tavily_chunks = await connector_service.search_tavily(
                     user_query=reformulated_query,
@ -278,87 +376,40 @@ async def fetch_relevant_documents(
|
||||||
|
|
||||||
# Stream found document count
|
# Stream found document count
|
||||||
if streaming_service and writer:
|
if streaming_service and writer:
|
||||||
streaming_service.only_update_terminal(f"Found {len(tavily_chunks)} web search results relevant to the query")
|
streaming_service.only_update_terminal(f"🔍 Found {len(tavily_chunks)} Web Search results related to your query")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
elif connector == "SLACK_CONNECTOR":
|
elif connector == "LINKUP_API":
|
||||||
source_object, slack_chunks = await connector_service.search_slack(
|
if top_k > 10:
|
||||||
|
linkup_mode = "deep"
|
||||||
|
else:
|
||||||
|
linkup_mode = "standard"
|
||||||
|
|
||||||
|
source_object, linkup_chunks = await connector_service.search_linkup(
|
||||||
user_query=reformulated_query,
|
user_query=reformulated_query,
|
||||||
user_id=user_id,
|
user_id=user_id,
|
||||||
search_space_id=search_space_id,
|
mode=linkup_mode
|
||||||
top_k=top_k
|
)
|
||||||
)
|
|
||||||
|
|
||||||
# Add to sources and raw documents
|
# Add to sources and raw documents
|
||||||
if source_object:
|
if source_object:
|
||||||
all_sources.append(source_object)
|
all_sources.append(source_object)
|
||||||
all_raw_documents.extend(slack_chunks)
|
all_raw_documents.extend(linkup_chunks)
|
||||||
|
|
||||||
# Stream found document count
|
# Stream found document count
|
||||||
if streaming_service and writer:
|
if streaming_service and writer:
|
||||||
streaming_service.only_update_terminal(f"Found {len(slack_chunks)} Slack messages relevant to the query")
|
streaming_service.only_update_terminal(f"🔗 Found {len(linkup_chunks)} Linkup results related to your query")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
elif connector == "NOTION_CONNECTOR":
|
|
||||||
source_object, notion_chunks = await connector_service.search_notion(
|
|
||||||
user_query=reformulated_query,
|
|
||||||
user_id=user_id,
|
|
||||||
search_space_id=search_space_id,
|
|
||||||
top_k=top_k
|
|
||||||
)
|
|
||||||
|
|
||||||
# Add to sources and raw documents
|
|
||||||
if source_object:
|
|
||||||
all_sources.append(source_object)
|
|
||||||
all_raw_documents.extend(notion_chunks)
|
|
||||||
|
|
||||||
# Stream found document count
|
|
||||||
if streaming_service and writer:
|
|
||||||
streaming_service.only_update_terminal(f"Found {len(notion_chunks)} Notion pages/blocks relevant to the query")
|
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
|
||||||
|
|
||||||
elif connector == "GITHUB_CONNECTOR":
|
|
||||||
source_object, github_chunks = await connector_service.search_github(
|
|
||||||
user_query=reformulated_query,
|
|
||||||
user_id=user_id,
|
|
||||||
search_space_id=search_space_id,
|
|
||||||
top_k=top_k
|
|
||||||
)
|
|
||||||
|
|
||||||
# Add to sources and raw documents
|
|
||||||
if source_object:
|
|
||||||
all_sources.append(source_object)
|
|
||||||
all_raw_documents.extend(github_chunks)
|
|
||||||
|
|
||||||
# Stream found document count
|
|
||||||
if streaming_service and writer:
|
|
||||||
streaming_service.only_update_terminal(f"Found {len(github_chunks)} GitHub files/issues relevant to the query")
|
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
|
||||||
|
|
||||||
elif connector == "LINEAR_CONNECTOR":
|
|
||||||
source_object, linear_chunks = await connector_service.search_linear(
|
|
||||||
user_query=reformulated_query,
|
|
||||||
user_id=user_id,
|
|
||||||
search_space_id=search_space_id,
|
|
||||||
top_k=top_k
|
|
||||||
)
|
|
||||||
|
|
||||||
# Add to sources and raw documents
|
|
||||||
if source_object:
|
|
||||||
all_sources.append(source_object)
|
|
||||||
all_raw_documents.extend(linear_chunks)
|
|
||||||
|
|
||||||
# Stream found document count
|
|
||||||
if streaming_service and writer:
|
|
||||||
streaming_service.only_update_terminal(f"Found {len(linear_chunks)} Linear issues relevant to the query")
|
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
error_message = f"Error searching connector {connector}: {str(e)}"
|
error_message = f"Error searching connector {connector}: {str(e)}"
|
||||||
print(error_message)
|
print(error_message)
|
||||||
|
|
||||||
# Stream error message
|
# Stream error message
|
||||||
if streaming_service and writer:
|
if streaming_service and writer:
|
||||||
streaming_service.only_update_terminal(error_message, "error")
|
friendly_name = get_connector_friendly_name(connector)
|
||||||
|
streaming_service.only_update_terminal(f"⚠️ Error searching {friendly_name}: {str(e)}", "error")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
# Continue with other connectors on error
|
# Continue with other connectors on error
|
||||||
|
|
@ -385,7 +436,7 @@ async def fetch_relevant_documents(
|
||||||
|
|
||||||
# Stream info about deduplicated sources
|
# Stream info about deduplicated sources
|
||||||
if streaming_service and writer:
|
if streaming_service and writer:
|
||||||
streaming_service.only_update_terminal(f"Collected {len(deduplicated_sources)} unique sources across all connectors")
|
streaming_service.only_update_terminal(f"📚 Collected {len(deduplicated_sources)} unique sources across all connectors")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
# After all sources are collected and deduplicated, stream them
|
# After all sources are collected and deduplicated, stream them
|
||||||
|
|
@ -415,12 +466,44 @@ async def fetch_relevant_documents(
|
||||||
|
|
||||||
# Stream info about deduplicated documents
|
# Stream info about deduplicated documents
|
||||||
if streaming_service and writer:
|
if streaming_service and writer:
|
||||||
streaming_service.only_update_terminal(f"Found {len(deduplicated_docs)} unique document chunks after deduplication")
|
streaming_service.only_update_terminal(f"🧹 Found {len(deduplicated_docs)} unique document chunks after removing duplicates")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
# Return deduplicated documents
|
# Return deduplicated documents
|
||||||
return deduplicated_docs
|
return deduplicated_docs
|
||||||
|
|
||||||
|
def get_connector_emoji(connector_name: str) -> str:
|
||||||
|
"""Get an appropriate emoji for a connector type."""
|
||||||
|
connector_emojis = {
|
||||||
|
"YOUTUBE_VIDEO": "📹",
|
||||||
|
"EXTENSION": "🧩",
|
||||||
|
"CRAWLED_URL": "🌐",
|
||||||
|
"FILE": "📄",
|
||||||
|
"SLACK_CONNECTOR": "💬",
|
||||||
|
"NOTION_CONNECTOR": "📘",
|
||||||
|
"GITHUB_CONNECTOR": "🐙",
|
||||||
|
"LINEAR_CONNECTOR": "📊",
|
||||||
|
"TAVILY_API": "🔍",
|
||||||
|
"LINKUP_API": "🔗"
|
||||||
|
}
|
||||||
|
return connector_emojis.get(connector_name, "🔎")
|
||||||
|
|
||||||
|
def get_connector_friendly_name(connector_name: str) -> str:
|
||||||
|
"""Convert technical connector IDs to user-friendly names."""
|
||||||
|
connector_friendly_names = {
|
||||||
|
"YOUTUBE_VIDEO": "YouTube",
|
||||||
|
"EXTENSION": "Browser Extension",
|
||||||
|
"CRAWLED_URL": "Web Pages",
|
||||||
|
"FILE": "Files",
|
||||||
|
"SLACK_CONNECTOR": "Slack",
|
||||||
|
"NOTION_CONNECTOR": "Notion",
|
||||||
|
"GITHUB_CONNECTOR": "GitHub",
|
||||||
|
"LINEAR_CONNECTOR": "Linear",
|
||||||
|
"TAVILY_API": "Tavily Search",
|
||||||
|
"LINKUP_API": "Linkup Search"
|
||||||
|
}
|
||||||
|
return connector_friendly_names.get(connector_name, connector_name)
|
||||||
|
|
||||||
async def process_sections(state: State, config: RunnableConfig, writer: StreamWriter) -> Dict[str, Any]:
|
async def process_sections(state: State, config: RunnableConfig, writer: StreamWriter) -> Dict[str, Any]:
|
||||||
"""
|
"""
|
||||||
Process all sections in parallel and combine the results.
|
Process all sections in parallel and combine the results.
|
||||||
|
|
@ -437,13 +520,17 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
answer_outline = state.answer_outline
|
answer_outline = state.answer_outline
|
||||||
streaming_service = state.streaming_service
|
streaming_service = state.streaming_service
|
||||||
|
|
||||||
streaming_service.only_update_terminal(f"Starting to process research sections...")
|
# Initialize a dictionary to track content for all sections
|
||||||
|
# This is used to maintain section content while streaming multiple sections
|
||||||
|
section_contents = {}
|
||||||
|
|
||||||
|
streaming_service.only_update_terminal(f"🚀 Starting to process research sections...")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
print(f"Processing sections from outline: {answer_outline is not None}")
|
print(f"Processing sections from outline: {answer_outline is not None}")
|
||||||
|
|
||||||
if not answer_outline:
|
if not answer_outline:
|
||||||
streaming_service.only_update_terminal("Error: No answer outline was provided. Cannot generate report.", "error")
|
streaming_service.only_update_terminal("❌ Error: No answer outline was provided. Cannot generate report.", "error")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
return {
|
return {
|
||||||
"final_written_report": "No answer outline was provided. Cannot generate final report."
|
"final_written_report": "No answer outline was provided. Cannot generate final report."
|
||||||
|
|
@ -455,16 +542,26 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
all_questions.extend(section.questions)
|
all_questions.extend(section.questions)
|
||||||
|
|
||||||
print(f"Collected {len(all_questions)} questions from all sections")
|
print(f"Collected {len(all_questions)} questions from all sections")
|
||||||
streaming_service.only_update_terminal(f"Found {len(all_questions)} research questions across {len(answer_outline.answer_outline)} sections")
|
streaming_service.only_update_terminal(f"🧩 Found {len(all_questions)} research questions across {len(answer_outline.answer_outline)} sections")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
# Fetch relevant documents once for all questions
|
# Fetch relevant documents once for all questions
|
||||||
streaming_service.only_update_terminal("Searching for relevant information across all connectors...")
|
streaming_service.only_update_terminal("🔍 Searching for relevant information across all connectors...")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
|
if configuration.num_sections == 1:
|
||||||
|
TOP_K = 10
|
||||||
|
elif configuration.num_sections == 3:
|
||||||
|
TOP_K = 20
|
||||||
|
elif configuration.num_sections == 6:
|
||||||
|
TOP_K = 30
|
||||||
|
|
||||||
relevant_documents = []
|
relevant_documents = []
|
||||||
async with async_session_maker() as db_session:
|
async with async_session_maker() as db_session:
|
||||||
try:
|
try:
|
||||||
|
# Create connector service inside the db_session scope
|
||||||
|
connector_service = ConnectorService(db_session)
|
||||||
|
|
||||||
relevant_documents = await fetch_relevant_documents(
|
relevant_documents = await fetch_relevant_documents(
|
||||||
research_questions=all_questions,
|
research_questions=all_questions,
|
||||||
user_id=configuration.user_id,
|
user_id=configuration.user_id,
|
||||||
|
|
@ -472,30 +569,47 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
db_session=db_session,
|
db_session=db_session,
|
||||||
connectors_to_search=configuration.connectors_to_search,
|
connectors_to_search=configuration.connectors_to_search,
|
||||||
writer=writer,
|
writer=writer,
|
||||||
state=state
|
state=state,
|
||||||
|
top_k=TOP_K,
|
||||||
|
connector_service=connector_service,
|
||||||
|
search_mode=configuration.search_mode
|
||||||
)
|
)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
error_message = f"Error fetching relevant documents: {str(e)}"
|
error_message = f"Error fetching relevant documents: {str(e)}"
|
||||||
print(error_message)
|
print(error_message)
|
||||||
streaming_service.only_update_terminal(error_message, "error")
|
streaming_service.only_update_terminal(f"❌ {error_message}", "error")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
# Log the error and continue with an empty list of documents
|
# Log the error and continue with an empty list of documents
|
||||||
# This allows the process to continue, but the report might lack information
|
# This allows the process to continue, but the report might lack information
|
||||||
relevant_documents = []
|
relevant_documents = []
|
||||||
# Consider adding more robust error handling or reporting if needed
|
|
||||||
|
|
||||||
print(f"Fetched {len(relevant_documents)} relevant documents for all sections")
|
print(f"Fetched {len(relevant_documents)} relevant documents for all sections")
|
||||||
streaming_service.only_update_terminal(f"Starting to draft {len(answer_outline.answer_outline)} sections using {len(relevant_documents)} relevant document chunks")
|
streaming_service.only_update_terminal(f"✨ Starting to draft {len(answer_outline.answer_outline)} sections using {len(relevant_documents)} relevant document chunks")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
# Create tasks to process each section in parallel with the same document set
|
# Create tasks to process each section in parallel with the same document set
|
||||||
section_tasks = []
|
section_tasks = []
|
||||||
streaming_service.only_update_terminal("Creating processing tasks for each section...")
|
streaming_service.only_update_terminal("⚙️ Creating processing tasks for each section...")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
for section in answer_outline.answer_outline:
|
for i, section in enumerate(answer_outline.answer_outline):
|
||||||
|
if i == 0:
|
||||||
|
sub_section_type = SubSectionType.START
|
||||||
|
elif i == len(answer_outline.answer_outline) - 1:
|
||||||
|
sub_section_type = SubSectionType.END
|
||||||
|
else:
|
||||||
|
sub_section_type = SubSectionType.MIDDLE
|
||||||
|
|
||||||
|
# Initialize the section_contents entry for this section
|
||||||
|
section_contents[i] = {
|
||||||
|
"title": section.section_title,
|
||||||
|
"content": "",
|
||||||
|
"index": i
|
||||||
|
}
|
||||||
|
|
||||||
section_tasks.append(
|
section_tasks.append(
|
||||||
process_section_with_documents(
|
process_section_with_documents(
|
||||||
|
section_id=i,
|
||||||
section_title=section.section_title,
|
section_title=section.section_title,
|
||||||
section_questions=section.questions,
|
section_questions=section.questions,
|
||||||
user_query=configuration.user_query,
|
user_query=configuration.user_query,
|
||||||
|
|
@ -503,19 +617,21 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
search_space_id=configuration.search_space_id,
|
search_space_id=configuration.search_space_id,
|
||||||
relevant_documents=relevant_documents,
|
relevant_documents=relevant_documents,
|
||||||
state=state,
|
state=state,
|
||||||
writer=writer
|
writer=writer,
|
||||||
|
sub_section_type=sub_section_type,
|
||||||
|
section_contents=section_contents
|
||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
|
||||||
# Run all section processing tasks in parallel
|
# Run all section processing tasks in parallel
|
||||||
print(f"Running {len(section_tasks)} section processing tasks in parallel")
|
print(f"Running {len(section_tasks)} section processing tasks in parallel")
|
||||||
streaming_service.only_update_terminal(f"Processing {len(section_tasks)} sections simultaneously...")
|
streaming_service.only_update_terminal(f"⏳ Processing {len(section_tasks)} sections simultaneously...")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
section_results = await asyncio.gather(*section_tasks, return_exceptions=True)
|
section_results = await asyncio.gather(*section_tasks, return_exceptions=True)
|
||||||
|
|
||||||
# Handle any exceptions in the results
|
# Handle any exceptions in the results
|
||||||
streaming_service.only_update_terminal("Combining section results into final report...")
|
streaming_service.only_update_terminal("🧵 Combining section results into final report...")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
processed_results = []
|
processed_results = []
|
||||||
|
|
@ -524,7 +640,7 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
section_title = answer_outline.answer_outline[i].section_title
|
section_title = answer_outline.answer_outline[i].section_title
|
||||||
error_message = f"Error processing section '{section_title}': {str(result)}"
|
error_message = f"Error processing section '{section_title}': {str(result)}"
|
||||||
print(error_message)
|
print(error_message)
|
||||||
streaming_service.only_update_terminal(error_message, "error")
|
streaming_service.only_update_terminal(f"⚠️ {error_message}", "error")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
processed_results.append(error_message)
|
processed_results.append(error_message)
|
||||||
else:
|
else:
|
||||||
|
|
@ -542,31 +658,18 @@ async def process_sections(state: State, config: RunnableConfig, writer: StreamW
|
||||||
final_written_report = "\n".join(final_report)
|
final_written_report = "\n".join(final_report)
|
||||||
print(f"Generated final report with {len(final_report)} parts")
|
print(f"Generated final report with {len(final_report)} parts")
|
||||||
|
|
||||||
streaming_service.only_update_terminal("Final research report generated successfully!")
|
streaming_service.only_update_terminal("🎉 Final research report generated successfully!")
|
||||||
writer({"yeild_value": streaming_service._format_annotations()})
|
writer({"yeild_value": streaming_service._format_annotations()})
|
||||||
|
|
||||||
if hasattr(state, 'streaming_service') and state.streaming_service:
|
# Skip the final update since we've been streaming incremental updates
|
||||||
# Convert the final report to the expected format for UI:
|
# The final answer from each section is already shown in the UI
|
||||||
# A list of strings where empty strings represent line breaks
|
|
||||||
formatted_report = []
|
|
||||||
for section in final_report:
|
|
||||||
if section == "\n":
|
|
||||||
# Add an empty string for line breaks
|
|
||||||
formatted_report.append("")
|
|
||||||
else:
|
|
||||||
# Split any multiline content by newlines and add each line
|
|
||||||
section_lines = section.split("\n")
|
|
||||||
formatted_report.extend(section_lines)
|
|
||||||
|
|
||||||
state.streaming_service.only_update_answer(formatted_report)
|
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
|
||||||
|
|
||||||
|
|
||||||
return {
|
return {
|
||||||
"final_written_report": final_written_report
|
"final_written_report": final_written_report
|
||||||
}
|
}
|
||||||
|
|
||||||
async def process_section_with_documents(
|
async def process_section_with_documents(
|
||||||
|
section_id: int,
|
||||||
section_title: str,
|
section_title: str,
|
||||||
section_questions: List[str],
|
section_questions: List[str],
|
||||||
user_id: str,
|
user_id: str,
|
||||||
|
|
@ -574,12 +677,15 @@ async def process_section_with_documents(
|
||||||
relevant_documents: List[Dict[str, Any]],
|
relevant_documents: List[Dict[str, Any]],
|
||||||
user_query: str,
|
user_query: str,
|
||||||
state: State = None,
|
state: State = None,
|
||||||
writer: StreamWriter = None
|
writer: StreamWriter = None,
|
||||||
|
sub_section_type: SubSectionType = SubSectionType.MIDDLE,
|
||||||
|
section_contents: Dict[int, Dict[str, Any]] = None
|
||||||
) -> str:
|
) -> str:
|
||||||
"""
|
"""
|
||||||
Process a single section using pre-fetched documents.
|
Process a single section using pre-fetched documents.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
|
section_id: The ID of the section
|
||||||
section_title: The title of the section
|
section_title: The title of the section
|
||||||
section_questions: List of research questions for this section
|
section_questions: List of research questions for this section
|
||||||
user_id: The user ID
|
user_id: The user ID
|
||||||
|
|
@ -587,6 +693,8 @@ async def process_section_with_documents(
|
||||||
relevant_documents: Pre-fetched documents to use for this section
|
relevant_documents: Pre-fetched documents to use for this section
|
||||||
state: The current state
|
state: The current state
|
||||||
writer: StreamWriter for sending progress updates
|
writer: StreamWriter for sending progress updates
|
||||||
|
sub_section_type: The type of section (start, middle, end)
|
||||||
|
section_contents: Dictionary to track content across multiple sections
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
The written section content
|
The written section content
|
||||||
|
|
@ -597,14 +705,14 @@ async def process_section_with_documents(
|
||||||
|
|
||||||
# Send status update via streaming if available
|
# Send status update via streaming if available
|
||||||
if state and state.streaming_service and writer:
|
if state and state.streaming_service and writer:
|
||||||
state.streaming_service.only_update_terminal(f"Writing section: {section_title} with {len(section_questions)} research questions")
|
state.streaming_service.only_update_terminal(f"📝 Writing section: \"{section_title}\" with {len(section_questions)} research questions")
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
# Fallback if no documents found
|
# Fallback if no documents found
|
||||||
if not documents_to_use:
|
if not documents_to_use:
|
||||||
print(f"No relevant documents found for section: {section_title}")
|
print(f"No relevant documents found for section: {section_title}")
|
||||||
if state and state.streaming_service and writer:
|
if state and state.streaming_service and writer:
|
||||||
state.streaming_service.only_update_terminal(f"Warning: No relevant documents found for section: {section_title}", "warning")
|
state.streaming_service.only_update_terminal(f"⚠️ Warning: No relevant documents found for section: \"{section_title}\"", "warning")
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
documents_to_use = [
|
documents_to_use = [
|
||||||
|
|
@ -619,6 +727,7 @@ async def process_section_with_documents(
|
||||||
"configurable": {
|
"configurable": {
|
||||||
"sub_section_title": section_title,
|
"sub_section_title": section_title,
|
||||||
"sub_section_questions": section_questions,
|
"sub_section_questions": section_questions,
|
||||||
|
"sub_section_type": sub_section_type,
|
||||||
"user_query": user_query,
|
"user_query": user_query,
|
||||||
"relevant_documents": documents_to_use,
|
"relevant_documents": documents_to_use,
|
||||||
"user_id": user_id,
|
"user_id": user_id,
|
||||||
|
|
@ -626,33 +735,94 @@ async def process_section_with_documents(
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
# Create the initial state with db_session
|
# Create the initial state with db_session and chat_history
|
||||||
sub_state = {"db_session": db_session}
|
sub_state = {
|
||||||
|
"db_session": db_session,
|
||||||
|
"chat_history": state.chat_history
|
||||||
|
}
|
||||||
|
|
||||||
# Invoke the sub-section writer graph
|
# Invoke the sub-section writer graph with streaming
|
||||||
print(f"Invoking sub_section_writer for: {section_title}")
|
print(f"Invoking sub_section_writer for: {section_title}")
|
||||||
if state and state.streaming_service and writer:
|
if state and state.streaming_service and writer:
|
||||||
state.streaming_service.only_update_terminal(f"Analyzing information and drafting content for section: {section_title}")
|
state.streaming_service.only_update_terminal(f"🧠 Analyzing information and drafting content for section: \"{section_title}\"")
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
result = await sub_section_writer_graph.ainvoke(sub_state, config)
|
|
||||||
|
|
||||||
# Return the final answer from the sub_section_writer
|
# Variables to track streaming state
|
||||||
final_answer = result.get("final_answer", "No content was generated for this section.")
|
complete_content = "" # Tracks the complete content received so far
|
||||||
|
|
||||||
# Send section content update via streaming if available
|
async for chunk_type, chunk in sub_section_writer_graph.astream(sub_state, config, stream_mode=["values"]):
|
||||||
|
if "final_answer" in chunk:
|
||||||
|
new_content = chunk["final_answer"]
|
||||||
|
if new_content and new_content != complete_content:
|
||||||
|
# Extract only the new content (delta)
|
||||||
|
delta = new_content[len(complete_content):]
|
||||||
|
|
||||||
|
# Update what we've processed so far
|
||||||
|
complete_content = new_content
|
||||||
|
|
||||||
|
# Only stream if there's actual new content
|
||||||
|
if delta and state and state.streaming_service and writer:
|
||||||
|
# Update terminal with real-time progress indicator
|
||||||
|
state.streaming_service.only_update_terminal(f"✍️ Writing section {section_id+1}... ({len(complete_content.split())} words)")
|
||||||
|
|
||||||
|
# Update section_contents with just the new delta
|
||||||
|
section_contents[section_id]["content"] += delta
|
||||||
|
|
||||||
|
# Build UI-friendly content for all sections
|
||||||
|
complete_answer = []
|
||||||
|
for i in range(len(section_contents)):
|
||||||
|
if i in section_contents and section_contents[i]["content"]:
|
||||||
|
# Add section header
|
||||||
|
complete_answer.append(f"# {section_contents[i]['title']}")
|
||||||
|
complete_answer.append("") # Empty line after title
|
||||||
|
|
||||||
|
# Add section content
|
||||||
|
content_lines = section_contents[i]["content"].split("\n")
|
||||||
|
complete_answer.extend(content_lines)
|
||||||
|
complete_answer.append("") # Empty line after content
|
||||||
|
|
||||||
|
# Update answer in UI in real-time
|
||||||
|
state.streaming_service.only_update_answer(complete_answer)
|
||||||
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
|
# Set default if no content was received
|
||||||
|
if not complete_content:
|
||||||
|
complete_content = "No content was generated for this section."
|
||||||
|
section_contents[section_id]["content"] = complete_content
|
||||||
|
|
||||||
|
# Final terminal update
|
||||||
if state and state.streaming_service and writer:
|
if state and state.streaming_service and writer:
|
||||||
state.streaming_service.only_update_terminal(f"Completed writing section: {section_title}")
|
state.streaming_service.only_update_terminal(f"✅ Completed section: \"{section_title}\"")
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
return final_answer
|
return complete_content
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(f"Error processing section '{section_title}': {str(e)}")
|
print(f"Error processing section '{section_title}': {str(e)}")
|
||||||
|
|
||||||
# Send error update via streaming if available
|
# Send error update via streaming if available
|
||||||
if state and state.streaming_service and writer:
|
if state and state.streaming_service and writer:
|
||||||
state.streaming_service.only_update_terminal(f"Error processing section '{section_title}': {str(e)}", "error")
|
state.streaming_service.only_update_terminal(f"❌ Error processing section \"{section_title}\": {str(e)}", "error")
|
||||||
writer({"yeild_value": state.streaming_service._format_annotations()})
|
writer({"yeild_value": state.streaming_service._format_annotations()})
|
||||||
|
|
||||||
return f"Error processing section: {section_title}. Details: {str(e)}"
|
return f"Error processing section: {section_title}. Details: {str(e)}"
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
async def reformulate_user_query(state: State, config: RunnableConfig, writer: StreamWriter) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Reforms the user query based on the chat history.
|
||||||
|
"""
|
||||||
|
|
||||||
|
configuration = Configuration.from_runnable_config(config)
|
||||||
|
user_query = configuration.user_query
|
||||||
|
chat_history_str = await QueryService.langchain_chat_history_to_str(state.chat_history)
|
||||||
|
if len(state.chat_history) == 0:
|
||||||
|
reformulated_query = user_query
|
||||||
|
else:
|
||||||
|
reformulated_query = await QueryService.reformulate_query_with_chat_history(user_query, chat_history_str)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"reformulated_query": reformulated_query
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -3,7 +3,7 @@
 from __future__ import annotations
 
 from dataclasses import dataclass, field
-from typing import Optional, Any
+from typing import List, Optional, Any
 from sqlalchemy.ext.asyncio import AsyncSession
 from app.utils.streaming_service import StreamingService
 
@@ -21,7 +21,9 @@ class State:
     # Streaming service
     streaming_service: StreamingService
 
-    # Intermediate state - populated during workflow
+    chat_history: Optional[List[Any]] = field(default_factory=list)
 
+    reformulated_query: Optional[str] = field(default=None)
     # Using field to explicitly mark as part of state
     answer_outline: Optional[Any] = field(default=None)
 
@@ -3,11 +3,19 @@
 from __future__ import annotations

 from dataclasses import dataclass, fields
+from enum import Enum
 from typing import Optional, List, Any

 from langchain_core.runnables import RunnableConfig


+class SubSectionType(Enum):
+    """Enum defining the type of sub-section."""
+    START = "START"
+    MIDDLE = "MIDDLE"
+    END = "END"
+
+
 @dataclass(kw_only=True)
 class Configuration:
     """The configuration for the agent."""
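The new `SubSectionType` enum lets downstream nodes branch on where a sub-section sits in the document. A standalone sketch of that branching (the enum is redefined locally here so the snippet is self-contained):

```python
from enum import Enum

class SubSectionType(Enum):
    """Enum defining the type of sub-section."""
    START = "START"
    MIDDLE = "MIDDLE"
    END = "END"

def position_hint(sub_section_type: SubSectionType) -> str:
    # Same shape as the branching write_sub_section uses to build
    # its section_position_context string.
    if sub_section_type == SubSectionType.START:
        return "introduction"
    if sub_section_type == SubSectionType.END:
        return "conclusion"
    return "middle"

print(position_hint(SubSectionType.END))  # conclusion
```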
@@ -15,6 +23,7 @@ class Configuration:
     # Input parameters provided at invocation
     sub_section_title: str
    sub_section_questions: List[str]
+    sub_section_type: SubSectionType
     user_query: str
     relevant_documents: List[Any]  # Documents provided directly to the agent
     user_id: str

@@ -5,6 +5,7 @@ from typing import Any, Dict
 from app.config import config as app_config
 from .prompts import get_citation_system_prompt
 from langchain_core.messages import HumanMessage, SystemMessage
+from .configuration import SubSectionType


 async def rerank_documents(state: State, config: RunnableConfig) -> Dict[str, Any]:
     """
@@ -38,7 +39,9 @@ async def rerank_documents(state: State, config: RunnableConfig) -> Dict[str, Any]:
     try:
         # Use the sub-section questions for reranking context
         # rerank_query = "\n".join(sub_section_questions)
-        rerank_query = configuration.user_query
+        # rerank_query = configuration.user_query
+
+        rerank_query = configuration.user_query + "\n" + "\n".join(sub_section_questions)

         # Convert documents to format expected by reranker if needed
         reranker_input_docs = [
@@ -102,13 +105,14 @@ async def write_sub_section(state: State, config: RunnableConfig) -> Dict[str, Any]:
         # Extract content and metadata
         content = doc.get("content", "")
         doc_info = doc.get("document", {})
-        document_id = doc_info.get("id", f"{i+1}")  # Use document ID or index+1 as source_id
+        document_id = doc_info.get("id")  # Use document ID

         # Format document according to the citation system prompt's expected format
         formatted_doc = f"""
         <document>
             <metadata>
                 <source_id>{document_id}</source_id>
+                <source_type>{doc_info.get("document_type", "CRAWLED_URL")}</source_type>
             </metadata>
             <content>
             {content}
@@ -122,12 +126,27 @@ async def write_sub_section(state: State, config: RunnableConfig) -> Dict[str, Any]:
     sub_section_questions = configuration.sub_section_questions
     user_query = configuration.user_query  # Get the original user query
     documents_text = "\n".join(formatted_documents)
+    sub_section_type = configuration.sub_section_type

     # Format the questions as bullet points for clarity
     questions_text = "\n".join([f"- {question}" for question in sub_section_questions])

+    # Provide more context based on the subsection type
+    section_position_context = ""
+    if sub_section_type == SubSectionType.START:
+        section_position_context = "This is the INTRODUCTION section. "
+    elif sub_section_type == SubSectionType.MIDDLE:
+        section_position_context = "This is a MIDDLE section. Ensure this content flows naturally from previous sections and into subsequent ones. This could be any middle section in the document, so maintain coherence with the overall structure while addressing the specific topic of this section. Do not provide any conclusions in this section, as conclusions should only appear in the final section."
+    elif sub_section_type == SubSectionType.END:
+        section_position_context = "This is the CONCLUSION section. Focus on summarizing key points, providing closure."
+
     # Construct a clear, structured query for the LLM
     human_message_content = f"""
+    Source material:
+    <documents>
+    {documents_text}
+    </documents>

     Now user's query is:
     <user_query>
     {user_query}
@@ -137,21 +156,24 @@ async def write_sub_section(state: State, config: RunnableConfig) -> Dict[str, Any]:
     <sub_section_title>
     {section_title}
     </sub_section_title>

-    Use the provided documents as your source material and cite them properly using the IEEE citation format [X] where X is the source_id.
-    <documents>
-    {documents_text}
-    </documents>
+    <section_position>
+    {section_position_context}
+    </section_position>
+
+    <guiding_questions>
+    {questions_text}
+    </guiding_questions>
     """

     # Create messages for the LLM
-    messages = [
+    messages_with_chat_history = state.chat_history + [
         SystemMessage(content=get_citation_system_prompt()),
         HumanMessage(content=human_message_content)
     ]

     # Call the LLM and get the response
-    response = await llm.ainvoke(messages)
+    response = await llm.ainvoke(messages_with_chat_history)
     final_answer = response.content

     return {
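The switch from `messages` to `messages_with_chat_history` prepends the prior conversation turns so the LLM sees conversational context before the system prompt and new query. A minimal sketch of the idea using plain role/content dicts (the real code uses `langchain_core` message objects):

```python
def build_messages(chat_history, system_prompt, human_content):
    """Prepend prior chat turns before the system prompt and new query,
    mirroring `state.chat_history + [SystemMessage(...), HumanMessage(...)]`."""
    return list(chat_history) + [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": human_content},
    ]

history = [
    {"role": "user", "content": "Tell me about the reef."},
    {"role": "assistant", "content": "It spans 2,300 km [1]."},
]
msgs = build_messages(history, "Cite with [X].", "Summarize the threats.")
print(len(msgs))  # 4
```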
@@ -4,16 +4,28 @@ import datetime
 def get_citation_system_prompt():
     return f"""
 Today's date: {datetime.datetime.now().strftime("%Y-%m-%d")}
-You are a research assistant tasked with analyzing documents and providing comprehensive answers with proper citations in IEEE format.
+You are SurfSense, an advanced AI research assistant that synthesizes information from multiple knowledge sources to provide comprehensive, well-cited answers to user queries.
+
+<knowledge_sources>
+- EXTENSION: "Web content saved via SurfSense browser extension" (personal browsing history)
+- CRAWLED_URL: "Webpages indexed by SurfSense web crawler" (personally selected websites)
+- FILE: "User-uploaded documents (PDFs, Word, etc.)" (personal files)
+- SLACK_CONNECTOR: "Slack conversations and shared content" (personal workspace communications)
+- NOTION_CONNECTOR: "Notion workspace pages and databases" (personal knowledge management)
+- YOUTUBE_VIDEO: "YouTube video transcripts and metadata" (personally saved videos)
+- GITHUB_CONNECTOR: "GitHub repository content and issues" (personal repositories and interactions)
+- LINEAR_CONNECTOR: "Linear project issues and discussions" (personal project management)
+- TAVILY_API: "Tavily search API results" (personalized search results)
+- LINKUP_API: "Linkup search API results" (personalized search results)
+</knowledge_sources>
 <instructions>
 1. Carefully analyze all provided documents in the <document> section's.
 2. Extract relevant information that addresses the user's query.
-3. Synthesize a comprehensive, well-structured answer using information from these documents.
+3. Synthesize a comprehensive, personalized answer using information from the user's personal knowledge sources.
 4. For EVERY piece of information you include from the documents, add an IEEE-style citation in square brackets [X] where X is the source_id from the document's metadata.
 5. Make sure ALL factual statements from the documents have proper citations.
 6. If multiple documents support the same point, include all relevant citations [X], [Y].
-7. Present information in a logical, coherent flow.
+7. Present information in a logical, coherent flow that reflects the user's personal context.
 8. Use your own words to connect ideas, but cite ALL information from the documents.
 9. If documents contain conflicting information, acknowledge this and present both perspectives with appropriate citations.
 10. Do not make up or include information not found in the provided documents.
@@ -25,10 +37,14 @@ You are a research assistant tasked with analyzing documents and providing compr
 16. CRITICAL: Citations must ONLY appear as [X] or [X], [Y], [Z] format - never with parentheses, hyperlinks, or other formatting.
 17. CRITICAL: Never make up citation numbers. Only use source_id values that are explicitly provided in the document metadata.
 18. CRITICAL: If you are unsure about a source_id, do not include a citation rather than guessing or making one up.
+19. CRITICAL: Focus only on answering the user's query. Any guiding questions provided are for your thinking process only and should not be mentioned in your response.
+20. CRITICAL: Ensure your response aligns with the provided sub-section title and section position.
+21. CRITICAL: Remember that all knowledge sources contain personal information - provide answers that reflect this personal context.
 </instructions>

 <format>
 - Write in clear, professional language suitable for academic or technical audiences
+- Tailor your response to the user's personal context based on their knowledge sources
 - Organize your response with appropriate paragraphs, headings, and structure
 - Every fact from the documents must have an IEEE-style citation in square brackets [X] where X is the EXACT source_id from the document's metadata
 - Citations should appear at the end of the sentence containing the information they support
@@ -37,12 +53,17 @@ You are a research assistant tasked with analyzing documents and providing compr
 - NEVER create your own citation numbering system - use the exact source_id values from the documents.
 - NEVER format citations as clickable links or as markdown links like "([1](https://example.com))". Always use plain square brackets only.
 - NEVER make up citation numbers if you are unsure about the source_id. It is better to omit the citation than to guess.
+- NEVER include or mention the guiding questions in your response. They are only to help guide your thinking.
+- ALWAYS focus on answering the user's query directly from the information in the documents.
+- ALWAYS provide personalized answers that reflect the user's own knowledge and context.
 </format>

 <input_example>
+<documents>
 <document>
     <metadata>
         <source_id>1</source_id>
+        <source_type>EXTENSION</source_type>
     </metadata>
     <content>
         The Great Barrier Reef is the world's largest coral reef system, stretching over 2,300 kilometers along the coast of Queensland, Australia. It comprises over 2,900 individual reefs and 900 islands.
@@ -52,6 +73,7 @@ You are a research assistant tasked with analyzing documents and providing compr
 <document>
     <metadata>
         <source_id>13</source_id>
+        <source_type>YOUTUBE_VIDEO</source_type>
     </metadata>
     <content>
         Climate change poses a significant threat to coral reefs worldwide. Rising ocean temperatures have led to mass coral bleaching events in the Great Barrier Reef in 2016, 2017, and 2020.
@@ -61,15 +83,17 @@ You are a research assistant tasked with analyzing documents and providing compr
 <document>
     <metadata>
         <source_id>21</source_id>
+        <source_type>CRAWLED_URL</source_type>
     </metadata>
     <content>
         The Great Barrier Reef was designated a UNESCO World Heritage Site in 1981 due to its outstanding universal value and biological diversity. It is home to over 1,500 species of fish and 400 types of coral.
     </content>
 </document>
+</documents>
 </input_example>

 <output_example>
-The Great Barrier Reef is the world's largest coral reef system, stretching over 2,300 kilometers along the coast of Queensland, Australia [1]. It was designated a UNESCO World Heritage Site in 1981 due to its outstanding universal value and biological diversity [21]. The reef is home to over 1,500 species of fish and 400 types of coral [21]. Unfortunately, climate change poses a significant threat to coral reefs worldwide, with rising ocean temperatures leading to mass coral bleaching events in the Great Barrier Reef in 2016, 2017, and 2020 [13]. The reef system comprises over 2,900 individual reefs and 900 islands [1], making it an ecological treasure that requires protection from multiple threats [1], [13].
+Based on your saved browser content and videos, the Great Barrier Reef is the world's largest coral reef system, stretching over 2,300 kilometers along the coast of Queensland, Australia [1]. From your browsing history, you've looked into its designation as a UNESCO World Heritage Site in 1981 due to its outstanding universal value and biological diversity [21]. The reef is home to over 1,500 species of fish and 400 types of coral [21]. According to a YouTube video you've watched, climate change poses a significant threat to coral reefs worldwide, with rising ocean temperatures leading to mass coral bleaching events in the Great Barrier Reef in 2016, 2017, and 2020 [13]. The reef system comprises over 2,900 individual reefs and 900 islands [1], making it an ecological treasure that requires protection from multiple threats [1], [13].
 </output_example>

 <incorrect_citation_formats>
@@ -84,4 +108,22 @@ ONLY use plain square brackets [1] or multiple citations [1], [2], [3]
 </incorrect_citation_formats>

 Note that the citation numbers match exactly with the source_id values (1, 13, and 21) and are not renumbered sequentially. Citations follow IEEE style with square brackets and appear at the end of sentences.
+
+<user_query_instructions>
+When you see a user query like:
+<user_query>
+Give all linear issues.
+</user_query>
+
+Focus exclusively on answering this query using information from the provided documents, which contain the user's personal knowledge and data.
+
+If guiding questions are provided in a <guiding_questions> section, use them only to guide your thinking process. Do not mention or list these questions in your response.
+
+Make sure your response:
+1. Directly answers the user's query with personalized information from their own knowledge sources
+2. Fits the provided sub-section title and section position
+3. Uses proper citations for all information from documents
+4. Is well-structured and professional in tone
+5. Acknowledges the personal nature of the information being provided
+</user_query_instructions>
 """
@@ -2,7 +2,7 @@

 from __future__ import annotations

-from dataclasses import dataclass
+from dataclasses import dataclass, field
 from typing import List, Optional, Any
 from sqlalchemy.ext.asyncio import AsyncSession

@@ -17,6 +17,7 @@ class State:
     # Runtime context
     db_session: AsyncSession

+    chat_history: Optional[List[Any]] = field(default_factory=list)
     # OUTPUT: Populated by agent nodes
     reranked_documents: Optional[List[Any]] = None
     final_answer: Optional[str] = None

@@ -6,16 +6,18 @@ from fastapi.middleware.cors import CORSMiddleware
 from sqlalchemy.ext.asyncio import AsyncSession

 from app.db import User, create_db_and_tables, get_async_session
-from app.retriver.chunks_hybrid_search import ChucksHybridSearchRetriever
 from app.schemas import UserCreate, UserRead, UserUpdate
+from app.routes import router as crud_router
+from app.config import config
 from app.users import (
     SECRET,
     auth_backend,
     fastapi_users,
-    google_oauth_client,
-    current_active_user,
+    current_active_user
 )
-from app.routes import router as crud_router


 @asynccontextmanager
@@ -59,11 +61,20 @@ app.include_router(
     prefix="/users",
     tags=["users"],
 )
-app.include_router(
-    fastapi_users.get_oauth_router(google_oauth_client, auth_backend, SECRET, is_verified_by_default=True),
-    prefix="/auth/google",
-    tags=["auth"],
-)
+if config.AUTH_TYPE == "GOOGLE":
+    from app.users import google_oauth_client
+    app.include_router(
+        fastapi_users.get_oauth_router(
+            google_oauth_client,
+            auth_backend,
+            SECRET,
+            is_verified_by_default=True
+        ),
+        prefix="/auth/google",
+        tags=["auth"],
+    )

 app.include_router(crud_router, prefix="/api/v1", tags=["crud"])

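The change above registers the Google OAuth router only when `AUTH_TYPE == "GOOGLE"`, which keeps the OAuth client import optional for local-auth deployments. The pattern reduces to the sketch below, with FastAPI replaced by a hypothetical stand-in so the snippet is self-contained:

```python
import os

class StubApp:
    """Stand-in for FastAPI's router registration, for illustration only."""
    def __init__(self):
        self.routes = []
    def include_router(self, name, prefix=""):
        self.routes.append(prefix or name)

def register_auth(app, auth_type):
    # Mirrors: mount /auth/google (and import google_oauth_client)
    # only when Google auth is configured; CRUD routes mount always.
    if auth_type == "GOOGLE":
        app.include_router("google_oauth", prefix="/auth/google")
    app.include_router("crud", prefix="/api/v1")

app = StubApp()
register_auth(app, os.getenv("AUTH_TYPE", "LOCAL"))
print(app.routes)
```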
@@ -1,12 +1,12 @@
 import os
 from pathlib import Path
+import shutil

-from chonkie import AutoEmbeddings, LateChunker
-from rerankers import Reranker
-from langchain_community.chat_models import ChatLiteLLM
+from chonkie import AutoEmbeddings, CodeChunker, RecursiveChunker

 from dotenv import load_dotenv
+from langchain_community.chat_models import ChatLiteLLM
+from rerankers import Reranker


 # Get the base directory of the project
 BASE_DIR = Path(__file__).resolve().parent.parent.parent
@@ -15,33 +15,74 @@ env_file = BASE_DIR / ".env"
 load_dotenv(env_file)


+def is_ffmpeg_installed():
+    """
+    Check if ffmpeg is installed on the current system.
+
+    Returns:
+        bool: True if ffmpeg is installed, False otherwise.
+    """
+    return shutil.which("ffmpeg") is not None
+
+
 class Config:
+    # Check if ffmpeg is installed
+    if not is_ffmpeg_installed():
+        import static_ffmpeg
+        # ffmpeg installed on first call to add_paths(), threadsafe.
+        static_ffmpeg.add_paths()
+        # check if ffmpeg is installed again
+        if not is_ffmpeg_installed():
+            raise ValueError("FFmpeg is not installed on the system. Please install it to use the Surfsense Podcaster.")
+
     # Database
     DATABASE_URL = os.getenv("DATABASE_URL")

-    # Google OAuth
-    GOOGLE_OAUTH_CLIENT_ID = os.getenv("GOOGLE_OAUTH_CLIENT_ID")
-    GOOGLE_OAUTH_CLIENT_SECRET = os.getenv("GOOGLE_OAUTH_CLIENT_SECRET")
     NEXT_FRONTEND_URL = os.getenv("NEXT_FRONTEND_URL")

+    # AUTH: Google OAuth
+    AUTH_TYPE = os.getenv("AUTH_TYPE")
+    if AUTH_TYPE == "GOOGLE":
+        GOOGLE_OAUTH_CLIENT_ID = os.getenv("GOOGLE_OAUTH_CLIENT_ID")
+        GOOGLE_OAUTH_CLIENT_SECRET = os.getenv("GOOGLE_OAUTH_CLIENT_SECRET")
+
     # LONG-CONTEXT LLMS
     LONG_CONTEXT_LLM = os.getenv("LONG_CONTEXT_LLM")
-    long_context_llm_instance = ChatLiteLLM(model=LONG_CONTEXT_LLM)
+    LONG_CONTEXT_LLM_API_BASE = os.getenv("LONG_CONTEXT_LLM_API_BASE")
+    if LONG_CONTEXT_LLM_API_BASE:
+        long_context_llm_instance = ChatLiteLLM(model=LONG_CONTEXT_LLM, api_base=LONG_CONTEXT_LLM_API_BASE)
+    else:
+        long_context_llm_instance = ChatLiteLLM(model=LONG_CONTEXT_LLM)

-    # GPT Researcher
+    # FAST LLM
     FAST_LLM = os.getenv("FAST_LLM")
-    STRATEGIC_LLM = os.getenv("STRATEGIC_LLM")
-    fast_llm_instance = ChatLiteLLM(model=FAST_LLM)
-    strategic_llm_instance = ChatLiteLLM(model=STRATEGIC_LLM)
+    FAST_LLM_API_BASE = os.getenv("FAST_LLM_API_BASE")
+    if FAST_LLM_API_BASE:
+        fast_llm_instance = ChatLiteLLM(model=FAST_LLM, api_base=FAST_LLM_API_BASE)
+    else:
+        fast_llm_instance = ChatLiteLLM(model=FAST_LLM)
+
+    # STRATEGIC LLM
+    STRATEGIC_LLM = os.getenv("STRATEGIC_LLM")
+    STRATEGIC_LLM_API_BASE = os.getenv("STRATEGIC_LLM_API_BASE")
+    if STRATEGIC_LLM_API_BASE:
+        strategic_llm_instance = ChatLiteLLM(model=STRATEGIC_LLM, api_base=STRATEGIC_LLM_API_BASE)
+    else:
+        strategic_llm_instance = ChatLiteLLM(model=STRATEGIC_LLM)

     # Chonkie Configuration | Edit this to your needs
     EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL")
     embedding_model_instance = AutoEmbeddings.get_embeddings(EMBEDDING_MODEL)
-    chunker_instance = LateChunker(
-        embedding_model=EMBEDDING_MODEL,
-        chunk_size=embedding_model_instance.max_seq_length,
-    )
+    chunker_instance = RecursiveChunker(
+        chunk_size=getattr(embedding_model_instance, 'max_seq_length', 512)
+    )
+    code_chunker_instance = CodeChunker(
+        chunk_size=getattr(embedding_model_instance, 'max_seq_length', 512)
+    )

     # Reranker's Configuration | Pinecode, Cohere etc. Read more at https://github.com/AnswerDotAI/rerankers?tab=readme-ov-file#usage
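Each LLM tier now honors an optional `*_API_BASE` override, which is what lets LiteLLM point at self-hosted endpoints. The selection logic in the hunk above reduces to "add `api_base` only when the environment variable is set", sketched here with a generic kwargs builder (env var names follow the repo's `FAST_LLM` / `FAST_LLM_API_BASE` convention; no real LLM client is constructed):

```python
import os

def llm_kwargs(model_env: str, api_base_env: str) -> dict:
    """Build ChatLiteLLM-style kwargs, adding api_base only when set."""
    kwargs = {"model": os.getenv(model_env, "")}
    api_base = os.getenv(api_base_env)
    if api_base:
        kwargs["api_base"] = api_base
    return kwargs

os.environ["FAST_LLM"] = "openai/gpt-4o-mini"  # illustrative value
print(llm_kwargs("FAST_LLM", "FAST_LLM_API_BASE"))
```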
@@ -55,12 +96,30 @@ class Config:
     # OAuth JWT
     SECRET_KEY = os.getenv("SECRET_KEY")

-    # Unstructured API Key
-    UNSTRUCTURED_API_KEY = os.getenv("UNSTRUCTURED_API_KEY")
+    # ETL Service
+    ETL_SERVICE = os.getenv("ETL_SERVICE")
+
+    if ETL_SERVICE == "UNSTRUCTURED":
+        # Unstructured API Key
+        UNSTRUCTURED_API_KEY = os.getenv("UNSTRUCTURED_API_KEY")
+
+    elif ETL_SERVICE == "LLAMACLOUD":
+        # LlamaCloud API Key
+        LLAMA_CLOUD_API_KEY = os.getenv("LLAMA_CLOUD_API_KEY")

     # Firecrawl API Key
     FIRECRAWL_API_KEY = os.getenv("FIRECRAWL_API_KEY", None)

+    # Litellm TTS Configuration
+    TTS_SERVICE = os.getenv("TTS_SERVICE")
+    TTS_SERVICE_API_BASE = os.getenv("TTS_SERVICE_API_BASE")
+
+    # Litellm STT Configuration
+    STT_SERVICE = os.getenv("STT_SERVICE")
+    STT_SERVICE_API_BASE = os.getenv("STT_SERVICE_API_BASE")
+
     # Validation Checks
     # Check embedding dimension
     if hasattr(embedding_model_instance, 'dimension') and embedding_model_instance.dimension > 2000:

@@ -80,7 +80,7 @@ class GitHubConnector:
         # type='owner' fetches repos owned by the user
         # type='member' fetches repos the user is a collaborator on (including orgs)
         # type='all' fetches both
-        for repo in self.gh.repositories(type='owner', sort='updated'):
+        for repo in self.gh.repositories(type='all', sort='updated'):
             repos_data.append({
                 "id": repo.id,
                 "name": repo.name,

@@ -6,7 +6,7 @@ Allows fetching issue lists and their comments with date range filtering.
 """

 import requests
-from datetime import datetime, timedelta
+from datetime import datetime
 from typing import Dict, List, Optional, Tuple, Any, Union

@@ -6,11 +6,15 @@ Allows fetching channel lists and message history with date range filtering.
 """

 import os
+import time  # Added import
+import logging  # Added import
 from slack_sdk import WebClient
 from slack_sdk.errors import SlackApiError
-from datetime import datetime, timedelta
+from datetime import datetime
 from typing import Dict, List, Optional, Tuple, Any, Union

+logger = logging.getLogger(__name__)  # Added logger
+

 class SlackHistory:
     """Class for retrieving conversation history from Slack channels."""
@@ -33,56 +37,88 @@ class SlackHistory:
         """
         self.client = WebClient(token=token)

-    def get_all_channels(self, include_private: bool = True) -> Dict[str, str]:
+    def get_all_channels(self, include_private: bool = True) -> List[Dict[str, Any]]:
         """
-        Fetch all channels that the bot has access to.
+        Fetch all channels that the bot has access to, with rate limit handling.

         Args:
             include_private: Whether to include private channels

         Returns:
-            Dictionary mapping channel names to channel IDs
+            List of dictionaries, each representing a channel with id, name, is_private, is_member.

         Raises:
             ValueError: If no Slack client has been initialized
-            SlackApiError: If there's an error calling the Slack API
+            SlackApiError: If there's an unrecoverable error calling the Slack API
+            RuntimeError: For unexpected errors during channel fetching.
         """
         if not self.client:
             raise ValueError("Slack client not initialized. Call set_token() first.")

-        channels_dict = {}
+        channels_list = []  # Changed from dict to list
         types = "public_channel"
         if include_private:
             types += ",private_channel"

-        try:
-            # Call the conversations.list method
-            result = self.client.conversations_list(
-                types=types,
-                limit=1000  # Maximum allowed by API
-            )
-            channels = result["channels"]
-
-            # Handle pagination for workspaces with many channels
-            while result.get("response_metadata", {}).get("next_cursor"):
-                next_cursor = result["response_metadata"]["next_cursor"]
-
-                # Get the next batch of channels
-                result = self.client.conversations_list(
-                    types=types,
-                    cursor=next_cursor,
-                    limit=1000
-                )
-                channels.extend(result["channels"])
-
-            # Create a dictionary mapping channel names to IDs
-            for channel in channels:
-                channels_dict[channel["name"]] = channel["id"]
-
-            return channels_dict
-
-        except SlackApiError as e:
-            raise SlackApiError(f"Error retrieving channels: {e}", e.response)
+        next_cursor = None
+        is_first_request = True
+
+        while is_first_request or next_cursor:
+            try:
+                if not is_first_request:  # Add delay only for paginated requests
+                    logger.info(f"Paginating for channels, waiting 3 seconds before next call. Cursor: {next_cursor}")
+                    time.sleep(3)
+
+                current_limit = 1000  # Max limit
+                api_result = self.client.conversations_list(
+                    types=types,
+                    cursor=next_cursor,
+                    limit=current_limit
+                )
+
+                channels_on_page = api_result["channels"]
+                for channel in channels_on_page:
+                    if "name" in channel and "id" in channel:
+                        channel_data = {
+                            "id": channel.get("id"),
+                            "name": channel.get("name"),
+                            "is_private": channel.get("is_private", False),
+                            # is_member is often part of the channel object from conversations.list
+                            # It indicates if the authenticated user (bot) is a member.
+                            # For public channels, this might be true or the API might not focus on it
+                            # if the bot can read it anyway. For private, it's crucial.
+                            "is_member": channel.get("is_member", False)
+                        }
+                        channels_list.append(channel_data)
+                    else:
+                        logger.warning(f"Channel found with missing name or id. Data: {channel}")
+
+                next_cursor = api_result.get("response_metadata", {}).get("next_cursor")
+                is_first_request = False  # Subsequent requests are not the first
+
+                if not next_cursor:  # All pages processed
+                    break
+
+            except SlackApiError as e:
+                if e.response is not None and e.response.status_code == 429:
+                    retry_after_header = e.response.headers.get('Retry-After')
+                    wait_duration = 60  # Default wait time
+                    if retry_after_header and retry_after_header.isdigit():
+                        wait_duration = int(retry_after_header)
+
+                    logger.warning(f"Slack API rate limit hit while fetching channels. Waiting for {wait_duration} seconds. Cursor: {next_cursor}")
+                    time.sleep(wait_duration)
+                    # The loop will continue, retrying with the same cursor
+                else:
+                    # Not a 429 error, or no response object, re-raise
+                    raise SlackApiError(f"Error retrieving channels: {e}", e.response)
+            except Exception as general_error:
+                # Handle other potential errors like network issues if necessary, or re-raise
+                logger.error(f"An unexpected error occurred during channel fetching: {general_error}")
+                raise RuntimeError(f"An unexpected error occurred during channel fetching: {general_error}")
+
+        return channels_list

     def get_conversation_history(
         self,
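The cursor-pagination-with-backoff loop added in get_all_channels can be sketched standalone, with a stub client in place of slack_sdk's WebClient (`RateLimited`, `StubClient`, and `fetch_all_channels` are illustrative names, not SurfSense code):

```python
import time

class RateLimited(Exception):
    """Stand-in for a SlackApiError carrying a 429 response."""
    def __init__(self, retry_after):
        self.retry_after = retry_after

class StubClient:
    """Serves two pages of channels; rate-limits the second call once."""
    def __init__(self):
        self.calls = 0
        self.pages = {
            None: {"channels": [{"id": "C1", "name": "general"}],
                   "response_metadata": {"next_cursor": "cur1"}},
            "cur1": {"channels": [{"id": "C2", "name": "random"}],
                     "response_metadata": {"next_cursor": ""}},
        }
        self.fail_once_on = "cur1"

    def conversations_list(self, cursor=None, limit=1000):
        self.calls += 1
        if cursor == self.fail_once_on:
            self.fail_once_on = None          # only fail the first attempt
            raise RateLimited(retry_after=0)  # 0 keeps the sketch instant

    # falls through on success
        return self.pages[cursor]

def fetch_all_channels(client):
    channels, cursor, first = [], None, True
    while first or cursor:
        try:
            result = client.conversations_list(cursor=cursor, limit=1000)
        except RateLimited as e:
            time.sleep(e.retry_after)  # honor Retry-After, retry same cursor
            continue
        channels.extend(result["channels"])
        cursor = result.get("response_metadata", {}).get("next_cursor")
        first = False
        if not cursor:
            break
    return channels

client = StubClient()
print([c["id"] for c in fetch_all_channels(client)])  # → ['C1', 'C2'] in 3 calls
```

The key design choice mirrored here: a 429 does not advance the cursor, so the same page is re-requested after the wait.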
@@ -110,17 +146,18 @@ class SlackHistory:
         if not self.client:
             raise ValueError("Slack client not initialized. Call set_token() first.")

-        try:
-            # Call the conversations.history method
-            messages = []
-            next_cursor = None
+        messages = []
+        next_cursor = None

-            while True:
+        while True:
+            try:
+                # Proactive delay for conversations.history (Tier 3)
+                time.sleep(1.2)  # Wait 1.2 seconds before each history call.
+
                 kwargs = {
                     "channel": channel_id,
                     "limit": min(limit, 1000),  # API max is 1000
                 }

                 if oldest:
                     kwargs["oldest"] = oldest
                 if latest:
@@ -128,22 +165,57 @@ class SlackHistory:
                 if next_cursor:
                     kwargs["cursor"] = next_cursor

-                result = self.client.conversations_history(**kwargs)
+                current_api_call_successful = False
+                result = None  # Ensure result is defined
+                try:
+                    result = self.client.conversations_history(**kwargs)
+                    current_api_call_successful = True
+                except SlackApiError as e_history:
+                    if e_history.response is not None and e_history.response.status_code == 429:
+                        retry_after_str = e_history.response.headers.get('Retry-After')
+                        wait_time = 60  # Default
+                        if retry_after_str and retry_after_str.isdigit():
+                            wait_time = int(retry_after_str)
+                        logger.warning(
+                            f"Rate limited by Slack on conversations.history for channel {channel_id}. "
+                            f"Retrying after {wait_time} seconds. Cursor: {next_cursor}"
+                        )
+                        time.sleep(wait_time)
+                        # current_api_call_successful remains False, loop will retry this page
+                    else:
+                        raise  # Re-raise to outer handler for not_in_channel or other SlackApiErrors
+
+                if not current_api_call_successful:
+                    continue  # Retry the current page fetch due to handled rate limit

+                # Process result if successful
                 batch = result["messages"]
                 messages.extend(batch)

-                # Check if we need to paginate
                 if result.get("has_more", False) and len(messages) < limit:
                     next_cursor = result["response_metadata"]["next_cursor"]
                 else:
-                    break
+                    break  # Exit pagination loop

-            # Respect the overall limit parameter
-            return messages[:limit]
+            except SlackApiError as e:  # Outer catch for not_in_channel or unhandled SlackApiErrors from inner try
+                if (e.response is not None and
+                        hasattr(e.response, 'data') and
+                        isinstance(e.response.data, dict) and
+                        e.response.data.get('error') == 'not_in_channel'):
+                    logger.warning(
+                        f"Bot is not in channel '{channel_id}'. Cannot fetch history. "
+                        "Please add the bot to this channel."
+                    )
+                    return []
+                # For other SlackApiErrors from inner block or this level
+                raise SlackApiError(f"Error retrieving history for channel {channel_id}: {e}", e.response)
+            except Exception as general_error:  # Catch any other unexpected errors
+                logger.error(f"Unexpected error in get_conversation_history for channel {channel_id}: {general_error}")
+                # Re-raise the general error to allow higher-level handling or visibility
+                raise

-        except SlackApiError as e:
-            raise SlackApiError(f"Error retrieving history for channel {channel_id}: {e}", e.response)
+        return messages[:limit]

     @staticmethod
     def convert_date_to_timestamp(date_str: str) -> Optional[int]:
         """
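Stripped of the retry machinery, the `has_more`/limit pagination that get_conversation_history keeps can be sketched with a stub client (`fetch_history` and `StubHistoryClient` are illustrative names, not the SurfSense implementation):

```python
def fetch_history(client, channel_id, limit=100):
    """Cursor pagination: stop when has_more is False or `limit` messages are
    collected, then trim any overshoot from the final page."""
    messages, cursor = [], None
    while True:
        kwargs = {"channel": channel_id, "limit": min(limit, 1000)}
        if cursor:
            kwargs["cursor"] = cursor
        result = client.conversations_history(**kwargs)
        messages.extend(result["messages"])
        if result.get("has_more", False) and len(messages) < limit:
            cursor = result["response_metadata"]["next_cursor"]
        else:
            break
    return messages[:limit]

class StubHistoryClient:
    """Two fixed pages of three messages each."""
    def __init__(self):
        self.pages = [
            {"messages": [{"text": f"m{i}"} for i in range(3)], "has_more": True,
             "response_metadata": {"next_cursor": "c1"}},
            {"messages": [{"text": f"m{i}"} for i in range(3, 6)], "has_more": False},
        ]
    def conversations_history(self, **kwargs):
        return self.pages.pop(0)

print(len(fetch_history(StubHistoryClient(), "C123", limit=4)))  # → 4
```

Note that `messages[:limit]` is what enforces the caller's limit: the API page size is capped at 1000, but the last page can still overshoot.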
@@ -220,12 +292,31 @@ class SlackHistory:
         """
         if not self.client:
             raise ValueError("Slack client not initialized. Call set_token() first.")

-        try:
-            result = self.client.users_info(user=user_id)
-            return result["user"]
-        except SlackApiError as e:
-            raise SlackApiError(f"Error retrieving user info for {user_id}: {e}", e.response)
+        while True:
+            try:
+                # Proactive delay for users.info (Tier 4) - generally not needed unless called extremely rapidly.
+                # For now, we are only adding Retry-After as per plan.
+                # time.sleep(0.6)  # Optional: ~100 req/min if ever needed.
+
+                result = self.client.users_info(user=user_id)
+                return result["user"]  # Success, return and exit loop implicitly
+
+            except SlackApiError as e_user_info:
+                if e_user_info.response is not None and e_user_info.response.status_code == 429:
+                    retry_after_str = e_user_info.response.headers.get('Retry-After')
+                    wait_time = 30  # Default for Tier 4, can be adjusted
+                    if retry_after_str and retry_after_str.isdigit():
+                        wait_time = int(retry_after_str)
+                    logger.warning(f"Rate limited by Slack on users.info for user {user_id}. Retrying after {wait_time} seconds.")
+                    time.sleep(wait_time)
+                    continue  # Retry the API call
+                else:
+                    # Not a 429 error, or no response object, re-raise
+                    raise SlackApiError(f"Error retrieving user info for {user_id}: {e_user_info}", e_user_info.response)
+            except Exception as general_error:  # Catch any other unexpected errors
+                logger.error(f"Unexpected error in get_user_info for user {user_id}: {general_error}")
+                raise  # Re-raise unexpected errors

     def format_message(self, msg: Dict[str, Any], include_user_info: bool = False) -> Dict[str, Any]:
         """
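The same retry-on-429 shape appears in all three Slack methods; a generic helper along these lines could factor it out. This is a hypothetical sketch, not part of the commit (`RateLimitedError` stands in for a SlackApiError with a 429 response):

```python
import time

class RateLimitedError(Exception):
    """Hypothetical stand-in for SlackApiError with status_code == 429."""
    def __init__(self, retry_after="1"):
        self.retry_after = retry_after

def call_with_retry(fn, default_wait=30, max_attempts=5):
    """Retry a callable on rate-limit errors, honoring Retry-After when it
    parses as an integer, falling back to `default_wait` otherwise."""
    for _ in range(max_attempts):
        try:
            return fn()
        except RateLimitedError as e:
            wait = int(e.retry_after) if str(e.retry_after).isdigit() else default_wait
            time.sleep(wait)
    raise RuntimeError("rate limit retries exhausted")

attempts = {"n": 0}
def flaky():
    # Fails twice with a rate limit, then succeeds.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitedError(retry_after="0")  # 0 keeps the sketch fast
    return "ok"

print(call_with_retry(flaky))  # → ok
```

A bounded `max_attempts` (unlike the unbounded `while True` in the commit) is one way to avoid spinning forever if Slack keeps returning 429.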
154  surfsense_backend/app/connectors/test_github_connector.py  Normal file

@@ -0,0 +1,154 @@
import unittest
from unittest.mock import patch, Mock, call
from datetime import datetime

# Adjust the import path based on the actual location if test_github_connector.py
# is not in the same directory as github_connector.py or if paths are set up differently.
# Assuming surfsense_backend/app/connectors/test_github_connector.py
from surfsense_backend.app.connectors.github_connector import GitHubConnector
from github3.exceptions import ForbiddenError  # Import the specific exception

class TestGitHubConnector(unittest.TestCase):

    @patch('surfsense_backend.app.connectors.github_connector.github_login')
    def test_get_user_repositories_uses_type_all(self, mock_github_login):
        # Mock the GitHub client object and its methods
        mock_gh_instance = Mock()
        mock_github_login.return_value = mock_gh_instance

        # Mock the self.gh.me() call in __init__ to prevent an actual API call
        mock_gh_instance.me.return_value = Mock()  # Simple mock to pass initialization

        # Prepare mock repository data
        mock_repo1_data = Mock()
        mock_repo1_data.id = 1
        mock_repo1_data.name = "repo1"
        mock_repo1_data.full_name = "user/repo1"
        mock_repo1_data.private = False
        mock_repo1_data.html_url = "http://example.com/user/repo1"
        mock_repo1_data.description = "Test repo 1"
        mock_repo1_data.updated_at = datetime(2023, 1, 1, 10, 30, 0)  # Added time component

        mock_repo2_data = Mock()
        mock_repo2_data.id = 2
        mock_repo2_data.name = "org-repo"
        mock_repo2_data.full_name = "org/org-repo"
        mock_repo2_data.private = True
        mock_repo2_data.html_url = "http://example.com/org/org-repo"
        mock_repo2_data.description = "Org repo"
        mock_repo2_data.updated_at = datetime(2023, 1, 2, 12, 0, 0)  # Added time component

        # Configure the mock for gh.repositories() call
        # This method is an iterator, so it should return an iterable (e.g., a list)
        mock_gh_instance.repositories.return_value = [mock_repo1_data, mock_repo2_data]

        connector = GitHubConnector(token="fake_token")
        repositories = connector.get_user_repositories()

        # Assert that gh.repositories was called correctly
        mock_gh_instance.repositories.assert_called_once_with(type='all', sort='updated')

        # Assert the structure and content of the returned data
        expected_repositories = [
            {
                "id": 1, "name": "repo1", "full_name": "user/repo1", "private": False,
                "url": "http://example.com/user/repo1", "description": "Test repo 1",
                "last_updated": datetime(2023, 1, 1, 10, 30, 0)
            },
            {
                "id": 2, "name": "org-repo", "full_name": "org/org-repo", "private": True,
                "url": "http://example.com/org/org-repo", "description": "Org repo",
                "last_updated": datetime(2023, 1, 2, 12, 0, 0)
            }
        ]
        self.assertEqual(repositories, expected_repositories)
        self.assertEqual(len(repositories), 2)

    @patch('surfsense_backend.app.connectors.github_connector.github_login')
    def test_get_user_repositories_handles_empty_description_and_none_updated_at(self, mock_github_login):
        # Mock the GitHub client object and its methods
        mock_gh_instance = Mock()
        mock_github_login.return_value = mock_gh_instance
        mock_gh_instance.me.return_value = Mock()

        mock_repo_data = Mock()
        mock_repo_data.id = 1
        mock_repo_data.name = "repo_no_desc"
        mock_repo_data.full_name = "user/repo_no_desc"
        mock_repo_data.private = False
        mock_repo_data.html_url = "http://example.com/user/repo_no_desc"
        mock_repo_data.description = None  # Test None description
        mock_repo_data.updated_at = None  # Test None updated_at

        mock_gh_instance.repositories.return_value = [mock_repo_data]
        connector = GitHubConnector(token="fake_token")
        repositories = connector.get_user_repositories()

        mock_gh_instance.repositories.assert_called_once_with(type='all', sort='updated')
        expected_repositories = [
            {
                "id": 1, "name": "repo_no_desc", "full_name": "user/repo_no_desc", "private": False,
                "url": "http://example.com/user/repo_no_desc", "description": "",  # Expect empty string
                "last_updated": None  # Expect None
            }
        ]
        self.assertEqual(repositories, expected_repositories)

    @patch('surfsense_backend.app.connectors.github_connector.github_login')
    def test_github_connector_initialization_failure_forbidden(self, mock_github_login):
        # Test that __init__ raises ValueError on auth failure (ForbiddenError)
        mock_gh_instance = Mock()
        mock_github_login.return_value = mock_gh_instance

        # Create a mock response object for the ForbiddenError
        # The actual response structure might vary, but github3.py's ForbiddenError
        # can be instantiated with just a response object that has a status_code.
        mock_response = Mock()
        mock_response.status_code = 403  # Typically Forbidden

        # Setup the side_effect for self.gh.me()
        mock_gh_instance.me.side_effect = ForbiddenError(mock_response)

        with self.assertRaises(ValueError) as context:
            GitHubConnector(token="invalid_token_forbidden")
        self.assertIn("Invalid GitHub token or insufficient permissions.", str(context.exception))

    @patch('surfsense_backend.app.connectors.github_connector.github_login')
    def test_github_connector_initialization_failure_authentication_failed(self, mock_github_login):
        # Test that __init__ raises ValueError on auth failure (AuthenticationFailed, which is a subclass of ForbiddenError)
        # For github3.py, AuthenticationFailed is more specific for token issues.
        from github3.exceptions import AuthenticationFailed

        mock_gh_instance = Mock()
        mock_github_login.return_value = mock_gh_instance

        mock_response = Mock()
        mock_response.status_code = 401  # Typically Unauthorized

        mock_gh_instance.me.side_effect = AuthenticationFailed(mock_response)

        with self.assertRaises(ValueError) as context:
            GitHubConnector(token="invalid_token_authfailed")
        self.assertIn("Invalid GitHub token or insufficient permissions.", str(context.exception))

    @patch('surfsense_backend.app.connectors.github_connector.github_login')
    def test_get_user_repositories_handles_api_exception(self, mock_github_login):
        mock_gh_instance = Mock()
        mock_github_login.return_value = mock_gh_instance
        mock_gh_instance.me.return_value = Mock()

        # Simulate an exception when calling repositories
        mock_gh_instance.repositories.side_effect = Exception("API Error")

        connector = GitHubConnector(token="fake_token")
        # We expect it to log an error and return an empty list
        with patch('surfsense_backend.app.connectors.github_connector.logger') as mock_logger:
            repositories = connector.get_user_repositories()

        self.assertEqual(repositories, [])
        mock_logger.error.assert_called_once()
        self.assertIn("Failed to fetch GitHub repositories: API Error", mock_logger.error.call_args[0][0])


if __name__ == '__main__':
    unittest.main()
420
surfsense_backend/app/connectors/test_slack_history.py
Normal file
420
surfsense_backend/app/connectors/test_slack_history.py
Normal file
|
|
@ -0,0 +1,420 @@
|
||||||
|
import unittest
|
||||||
|
import time # Imported to be available for patching target module
|
||||||
|
from unittest.mock import patch, Mock, call
|
||||||
|
from slack_sdk.errors import SlackApiError
|
||||||
|
|
||||||
|
# Since test_slack_history.py is in the same directory as slack_history.py
|
||||||
|
from .slack_history import SlackHistory
|
||||||
|
|
||||||
|
class TestSlackHistoryGetAllChannels(unittest.TestCase):
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_pagination_with_delay(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
# Mock API responses now include is_private and is_member
|
||||||
|
page1_response = {
|
||||||
|
"channels": [
|
||||||
|
{"name": "general", "id": "C1", "is_private": False, "is_member": True},
|
||||||
|
{"name": "dev", "id": "C0", "is_private": False, "is_member": True}
|
||||||
|
],
|
||||||
|
"response_metadata": {"next_cursor": "cursor123"}
|
||||||
|
}
|
||||||
|
page2_response = {
|
||||||
|
"channels": [{"name": "random", "id": "C2", "is_private": True, "is_member": True}],
|
||||||
|
"response_metadata": {"next_cursor": ""}
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_client_instance.conversations_list.side_effect = [
|
||||||
|
page1_response,
|
||||||
|
page2_response
|
||||||
|
]
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
channels_list = slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
expected_channels_list = [
|
||||||
|
{"id": "C1", "name": "general", "is_private": False, "is_member": True},
|
||||||
|
{"id": "C0", "name": "dev", "is_private": False, "is_member": True},
|
||||||
|
{"id": "C2", "name": "random", "is_private": True, "is_member": True}
|
||||||
|
]
|
||||||
|
|
||||||
|
self.assertEqual(len(channels_list), 3)
|
||||||
|
self.assertListEqual(channels_list, expected_channels_list) # Assert list equality
|
||||||
|
|
||||||
|
expected_calls = [
|
||||||
|
call(types="public_channel,private_channel", cursor=None, limit=1000),
|
||||||
|
call(types="public_channel,private_channel", cursor="cursor123", limit=1000)
|
||||||
|
]
|
||||||
|
mock_client_instance.conversations_list.assert_has_calls(expected_calls)
|
||||||
|
self.assertEqual(mock_client_instance.conversations_list.call_count, 2)
|
||||||
|
|
||||||
|
mock_sleep.assert_called_once_with(3)
|
||||||
|
mock_logger.info.assert_called_once_with("Paginating for channels, waiting 3 seconds before next call. Cursor: cursor123")
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_rate_limit_with_retry_after(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
mock_error_response = Mock()
|
||||||
|
mock_error_response.status_code = 429
|
||||||
|
mock_error_response.headers = {'Retry-After': '5'}
|
||||||
|
|
||||||
|
successful_response = {
|
||||||
|
"channels": [{"name": "general", "id": "C1", "is_private": False, "is_member": True}],
|
||||||
|
"response_metadata": {"next_cursor": ""}
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_client_instance.conversations_list.side_effect = [
|
||||||
|
SlackApiError(message="ratelimited", response=mock_error_response),
|
||||||
|
successful_response
|
||||||
|
]
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
channels_list = slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
expected_channels_list = [{"id": "C1", "name": "general", "is_private": False, "is_member": True}]
|
||||||
|
self.assertEqual(len(channels_list), 1)
|
||||||
|
self.assertListEqual(channels_list, expected_channels_list)
|
||||||
|
|
||||||
|
mock_sleep.assert_called_once_with(5)
|
||||||
|
mock_logger.warning.assert_called_once_with("Slack API rate limit hit while fetching channels. Waiting for 5 seconds. Cursor: None")
|
||||||
|
|
||||||
|
expected_calls = [
|
||||||
|
call(types="public_channel,private_channel", cursor=None, limit=1000),
|
||||||
|
call(types="public_channel,private_channel", cursor=None, limit=1000)
|
||||||
|
]
|
||||||
|
mock_client_instance.conversations_list.assert_has_calls(expected_calls)
|
||||||
|
self.assertEqual(mock_client_instance.conversations_list.call_count, 2)
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_rate_limit_no_retry_after_valid_header(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
mock_error_response = Mock()
|
||||||
|
mock_error_response.status_code = 429
|
||||||
|
mock_error_response.headers = {'Retry-After': 'invalid_value'}
|
||||||
|
|
||||||
|
successful_response = {
|
||||||
|
"channels": [{"name": "general", "id": "C1", "is_private": False, "is_member": True}],
|
||||||
|
"response_metadata": {"next_cursor": ""}
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_client_instance.conversations_list.side_effect = [
|
||||||
|
SlackApiError(message="ratelimited", response=mock_error_response),
|
||||||
|
successful_response
|
||||||
|
]
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
channels_list = slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
expected_channels_list = [{"id": "C1", "name": "general", "is_private": False, "is_member": True}]
|
||||||
|
self.assertListEqual(channels_list, expected_channels_list)
|
||||||
|
mock_sleep.assert_called_once_with(60) # Default fallback
|
||||||
|
mock_logger.warning.assert_called_once_with("Slack API rate limit hit while fetching channels. Waiting for 60 seconds. Cursor: None")
|
||||||
|
self.assertEqual(mock_client_instance.conversations_list.call_count, 2)
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_rate_limit_no_retry_after_header(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
mock_error_response = Mock()
|
||||||
|
mock_error_response.status_code = 429
|
||||||
|
mock_error_response.headers = {}
|
||||||
|
|
||||||
|
successful_response = {
|
||||||
|
"channels": [{"name": "general", "id": "C1", "is_private": False, "is_member": True}],
|
||||||
|
"response_metadata": {"next_cursor": ""}
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_client_instance.conversations_list.side_effect = [
|
||||||
|
SlackApiError(message="ratelimited", response=mock_error_response),
|
||||||
|
successful_response
|
||||||
|
]
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
channels_list = slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
expected_channels_list = [{"id": "C1", "name": "general", "is_private": False, "is_member": True}]
|
||||||
|
self.assertListEqual(channels_list, expected_channels_list)
|
||||||
|
mock_sleep.assert_called_once_with(60) # Default fallback
|
||||||
|
mock_logger.warning.assert_called_once_with("Slack API rate limit hit while fetching channels. Waiting for 60 seconds. Cursor: None")
|
||||||
|
self.assertEqual(mock_client_instance.conversations_list.call_count, 2)
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_other_slack_api_error(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
mock_error_response = Mock()
|
||||||
|
mock_error_response.status_code = 500
|
||||||
|
mock_error_response.headers = {}
|
||||||
|
mock_error_response.data = {"ok": False, "error": "internal_error"}
|
||||||
|
|
||||||
|
original_error = SlackApiError(message="server error", response=mock_error_response)
|
||||||
|
mock_client_instance.conversations_list.side_effect = original_error
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
|
||||||
|
with self.assertRaises(SlackApiError) as context:
|
||||||
|
slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
self.assertEqual(context.exception.response.status_code, 500)
|
||||||
|
self.assertIn("server error", str(context.exception))
|
||||||
|
mock_sleep.assert_not_called()
|
||||||
|
mock_logger.warning.assert_not_called() # Ensure no rate limit log
|
||||||
|
mock_client_instance.conversations_list.assert_called_once_with(
|
||||||
|
types="public_channel,private_channel", cursor=None, limit=1000
|
||||||
|
)
|
||||||
|
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.logger')
|
||||||
|
@patch('surfsense_backend.app.connectors.slack_history.time.sleep')
|
||||||
|
@patch('slack_sdk.WebClient')
|
||||||
|
def test_get_all_channels_handles_missing_name_id_gracefully(self, MockWebClient, mock_sleep, mock_logger):
|
||||||
|
mock_client_instance = MockWebClient.return_value
|
||||||
|
|
||||||
|
response_with_malformed_data = {
|
||||||
|
"channels": [
|
||||||
|
{"id": "C1_missing_name", "is_private": False, "is_member": True},
|
||||||
|
{"name": "channel_missing_id", "is_private": False, "is_member": True},
|
||||||
|
{"name": "general", "id": "C2_valid", "is_private": False, "is_member": True}
|
||||||
|
],
|
||||||
|
"response_metadata": {"next_cursor": ""}
|
||||||
|
}
|
||||||
|
|
||||||
|
mock_client_instance.conversations_list.return_value = response_with_malformed_data
|
||||||
|
|
||||||
|
slack_history = SlackHistory(token="fake_token")
|
||||||
|
channels_list = slack_history.get_all_channels(include_private=True)
|
||||||
|
|
||||||
|
expected_channels_list = [
|
||||||
|
{"id": "C2_valid", "name": "general", "is_private": False, "is_member": True}
|
||||||
|
]
|
||||||
|
self.assertEqual(len(channels_list), 1)
|
||||||
|
self.assertListEqual(channels_list, expected_channels_list)
|
||||||
|
|
||||||
|
self.assertEqual(mock_logger.warning.call_count, 2)
|
||||||
|
mock_logger.warning.assert_any_call("Channel found with missing name or id. Data: {'id': 'C1_missing_name', 'is_private': False, 'is_member': True}")
|
||||||
|
mock_logger.warning.assert_any_call("Channel found with missing name or id. Data: {'name': 'channel_missing_id', 'is_private': False, 'is_member': True}")
|
||||||
|
|
||||||
|
mock_sleep.assert_not_called()
|
||||||
|
mock_client_instance.conversations_list.assert_called_once_with(
|
||||||
|
types="public_channel,private_channel", cursor=None, limit=1000
|
||||||
|
)
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
unittest.main()
|
||||||
|
|
||||||
|
class TestSlackHistoryGetConversationHistory(unittest.TestCase):

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_proactive_delay_single_page(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value
        mock_client_instance.conversations_history.return_value = {
            "messages": [{"text": "msg1"}],
            "has_more": False
        }

        slack_history = SlackHistory(token="fake_token")
        slack_history.get_conversation_history(channel_id="C123")

        mock_time_sleep.assert_called_once_with(1.2)  # Proactive delay

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_proactive_delay_multiple_pages(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value
        mock_client_instance.conversations_history.side_effect = [
            {
                "messages": [{"text": "msg1"}],
                "has_more": True,
                "response_metadata": {"next_cursor": "cursor1"}
            },
            {
                "messages": [{"text": "msg2"}],
                "has_more": False
            }
        ]

        slack_history = SlackHistory(token="fake_token")
        slack_history.get_conversation_history(channel_id="C123")

        # Expected calls: 1.2 (page1), 1.2 (page2)
        self.assertEqual(mock_time_sleep.call_count, 2)
        mock_time_sleep.assert_has_calls([call(1.2), call(1.2)])

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_retry_after_logic(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value

        mock_error_response = Mock()
        mock_error_response.status_code = 429
        mock_error_response.headers = {'Retry-After': '5'}

        mock_client_instance.conversations_history.side_effect = [
            SlackApiError(message="ratelimited", response=mock_error_response),
            {"messages": [{"text": "msg1"}], "has_more": False}
        ]

        slack_history = SlackHistory(token="fake_token")
        messages = slack_history.get_conversation_history(channel_id="C123")

        self.assertEqual(len(messages), 1)
        self.assertEqual(messages[0]["text"], "msg1")

        # Expected sleep calls: 1.2 (proactive for 1st attempt), 5 (rate limit), 1.2 (proactive for 2nd attempt)
        mock_time_sleep.assert_has_calls([call(1.2), call(5), call(1.2)], any_order=False)
        mock_logger.warning.assert_called_once()  # Check that a warning was logged for rate limiting

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_not_in_channel_error(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value

        mock_error_response = Mock()
        mock_error_response.status_code = 403  # Typical for not_in_channel, but data matters more
        mock_error_response.data = {'ok': False, 'error': 'not_in_channel'}

        # This error is now raised by the inner try-except, then caught by the outer one
        mock_client_instance.conversations_history.side_effect = SlackApiError(
            message="not_in_channel error",
            response=mock_error_response
        )

        slack_history = SlackHistory(token="fake_token")
        messages = slack_history.get_conversation_history(channel_id="C123")

        self.assertEqual(messages, [])
        mock_logger.warning.assert_called_with(
            "Bot is not in channel 'C123'. Cannot fetch history. Please add the bot to this channel."
        )
        mock_time_sleep.assert_called_once_with(1.2)  # Proactive delay before the API call

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_other_slack_api_error_propagates(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value

        mock_error_response = Mock()
        mock_error_response.status_code = 500
        mock_error_response.data = {'ok': False, 'error': 'internal_error'}
        original_error = SlackApiError(message="server error", response=mock_error_response)

        mock_client_instance.conversations_history.side_effect = original_error

        slack_history = SlackHistory(token="fake_token")

        with self.assertRaises(SlackApiError) as context:
            slack_history.get_conversation_history(channel_id="C123")

        self.assertIn("Error retrieving history for channel C123", str(context.exception))
        self.assertIs(context.exception.response, mock_error_response)
        mock_time_sleep.assert_called_once_with(1.2)  # Proactive delay

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_general_exception_propagates(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value
        original_error = Exception("Something broke")
        mock_client_instance.conversations_history.side_effect = original_error

        slack_history = SlackHistory(token="fake_token")

        with self.assertRaises(Exception) as context:  # Check for generic Exception
            slack_history.get_conversation_history(channel_id="C123")

        self.assertIs(context.exception, original_error)  # Should re-raise the original error
        mock_logger.error.assert_called_once_with("Unexpected error in get_conversation_history for channel C123: Something broke")
        mock_time_sleep.assert_called_once_with(1.2)  # Proactive delay

class TestSlackHistoryGetUserInfo(unittest.TestCase):

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_retry_after_logic(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value

        mock_error_response = Mock()
        mock_error_response.status_code = 429
        mock_error_response.headers = {'Retry-After': '3'}  # Using 3 seconds for test

        successful_user_data = {"id": "U123", "name": "testuser"}

        mock_client_instance.users_info.side_effect = [
            SlackApiError(message="ratelimited_userinfo", response=mock_error_response),
            {"user": successful_user_data}
        ]

        slack_history = SlackHistory(token="fake_token")
        user_info = slack_history.get_user_info(user_id="U123")

        self.assertEqual(user_info, successful_user_data)

        # Assert that time.sleep was called for the rate limit
        mock_time_sleep.assert_called_once_with(3)
        mock_logger.warning.assert_called_once_with(
            "Rate limited by Slack on users.info for user U123. Retrying after 3 seconds."
        )
        # Assert users_info was called twice (original + retry)
        self.assertEqual(mock_client_instance.users_info.call_count, 2)
        mock_client_instance.users_info.assert_has_calls([call(user="U123"), call(user="U123")])

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')  # time.sleep might be called by other logic, but not expected here
    @patch('slack_sdk.WebClient')
    def test_other_slack_api_error_propagates(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value

        mock_error_response = Mock()
        mock_error_response.status_code = 500  # Some other error
        mock_error_response.data = {'ok': False, 'error': 'internal_server_error'}
        original_error = SlackApiError(message="internal server error", response=mock_error_response)

        mock_client_instance.users_info.side_effect = original_error

        slack_history = SlackHistory(token="fake_token")

        with self.assertRaises(SlackApiError) as context:
            slack_history.get_user_info(user_id="U123")

        # Check that the raised error is the one we expect
        self.assertIn("Error retrieving user info for U123", str(context.exception))
        self.assertIs(context.exception.response, mock_error_response)
        mock_time_sleep.assert_not_called()  # No rate limit sleep

    @patch('surfsense_backend.app.connectors.slack_history.logger')
    @patch('surfsense_backend.app.connectors.slack_history.time.sleep')
    @patch('slack_sdk.WebClient')
    def test_general_exception_propagates(self, MockWebClient, mock_time_sleep, mock_logger):
        mock_client_instance = MockWebClient.return_value
        original_error = Exception("A very generic problem")
        mock_client_instance.users_info.side_effect = original_error

        slack_history = SlackHistory(token="fake_token")

        with self.assertRaises(Exception) as context:
            slack_history.get_user_info(user_id="U123")

        self.assertIs(context.exception, original_error)  # Check it's the exact same exception
        mock_logger.error.assert_called_once_with(
            "Unexpected error in get_user_info for user U123: A very generic problem"
        )
        mock_time_sleep.assert_not_called()  # No rate limit sleep

@@ -3,11 +3,7 @@ from datetime import datetime, timezone
 from enum import Enum
 
 from fastapi import Depends
-from fastapi_users.db import (
-    SQLAlchemyBaseOAuthAccountTableUUID,
-    SQLAlchemyBaseUserTableUUID,
-    SQLAlchemyUserDatabase,
-)
 from pgvector.sqlalchemy import Vector
 from sqlalchemy import (
     ARRAY,
@@ -30,6 +26,18 @@ from app.config import config
 from app.retriver.chunks_hybrid_search import ChucksHybridSearchRetriever
 from app.retriver.documents_hybrid_search import DocumentHybridSearchRetriever
 
+if config.AUTH_TYPE == "GOOGLE":
+    from fastapi_users.db import (
+        SQLAlchemyBaseOAuthAccountTableUUID,
+        SQLAlchemyBaseUserTableUUID,
+        SQLAlchemyUserDatabase,
+    )
+else:
+    from fastapi_users.db import (
+        SQLAlchemyBaseUserTableUUID,
+        SQLAlchemyUserDatabase,
+    )
+
 DATABASE_URL = config.DATABASE_URL
 
@@ -44,8 +52,9 @@ class DocumentType(str, Enum):
     LINEAR_CONNECTOR = "LINEAR_CONNECTOR"
 
 class SearchSourceConnectorType(str, Enum):
-    SERPER_API = "SERPER_API"
+    SERPER_API = "SERPER_API" # NOT IMPLEMENTED YET : DON'T REMEMBER WHY : MOST PROBABLY BECAUSE WE NEED TO CRAWL THE RESULTS RETURNED BY IT
     TAVILY_API = "TAVILY_API"
+    LINKUP_API = "LINKUP_API"
     SLACK_CONNECTOR = "SLACK_CONNECTOR"
     NOTION_CONNECTOR = "NOTION_CONNECTOR"
     GITHUB_CONNECTOR = "GITHUB_CONNECTOR"
@@ -75,7 +84,7 @@ class Chat(BaseModel, TimestampMixin):
     __tablename__ = "chats"
 
     type = Column(SQLAlchemyEnum(ChatType), nullable=False)
-    title = Column(String(200), nullable=False, index=True)
+    title = Column(String, nullable=False, index=True)
     initial_connectors = Column(ARRAY(String), nullable=True)
     messages = Column(JSON, nullable=False)
 
@@ -85,11 +94,12 @@ class Chat(BaseModel, TimestampMixin):
 class Document(BaseModel, TimestampMixin):
     __tablename__ = "documents"
 
-    title = Column(String(200), nullable=False, index=True)
+    title = Column(String, nullable=False, index=True)
     document_type = Column(SQLAlchemyEnum(DocumentType), nullable=False)
     document_metadata = Column(JSON, nullable=True)
 
     content = Column(Text, nullable=False)
+    content_hash = Column(String, nullable=False, index=True, unique=True)
     embedding = Column(Vector(config.embedding_model_instance.dimension))
 
     search_space_id = Column(Integer, ForeignKey("searchspaces.id", ondelete='CASCADE'), nullable=False)
@@ -108,9 +118,8 @@ class Chunk(BaseModel, TimestampMixin):
 class Podcast(BaseModel, TimestampMixin):
     __tablename__ = "podcasts"
 
-    title = Column(String(200), nullable=False, index=True)
-    is_generated = Column(Boolean, nullable=False, default=False)
-    podcast_content = Column(Text, nullable=False, default="")
+    title = Column(String, nullable=False, index=True)
+    podcast_transcript = Column(JSON, nullable=False, default={})
     file_location = Column(String(500), nullable=False, default="")
 
     search_space_id = Column(Integer, ForeignKey("searchspaces.id", ondelete='CASCADE'), nullable=False)
@@ -141,17 +150,22 @@ class SearchSourceConnector(BaseModel, TimestampMixin):
     user_id = Column(UUID(as_uuid=True), ForeignKey("user.id", ondelete='CASCADE'), nullable=False)
     user = relationship("User", back_populates="search_source_connectors")
 
+if config.AUTH_TYPE == "GOOGLE":
     class OAuthAccount(SQLAlchemyBaseOAuthAccountTableUUID, Base):
         pass
 
     class User(SQLAlchemyBaseUserTableUUID, Base):
         oauth_accounts: Mapped[list[OAuthAccount]] = relationship(
             "OAuthAccount", lazy="joined"
         )
         search_spaces = relationship("SearchSpace", back_populates="user")
         search_source_connectors = relationship("SearchSourceConnector", back_populates="user")
+else:
+    class User(SQLAlchemyBaseUserTableUUID, Base):
+        search_spaces = relationship("SearchSpace", back_populates="user")
+        search_source_connectors = relationship("SearchSourceConnector", back_populates="user")
 
 engine = create_async_engine(DATABASE_URL)
@@ -180,8 +194,12 @@ async def get_async_session() -> AsyncGenerator[AsyncSession, None]:
         yield session
 
 
-async def get_user_db(session: AsyncSession = Depends(get_async_session)):
-    yield SQLAlchemyUserDatabase(session, User, OAuthAccount)
+if config.AUTH_TYPE == "GOOGLE":
+    async def get_user_db(session: AsyncSession = Depends(get_async_session)):
+        yield SQLAlchemyUserDatabase(session, User, OAuthAccount)
+else:
+    async def get_user_db(session: AsyncSession = Depends(get_async_session)):
+        yield SQLAlchemyUserDatabase(session, User)
 
 async def get_chucks_hybrid_search_retriever(session: AsyncSession = Depends(get_async_session)):
     return ChucksHybridSearchRetriever(session)
@@ -113,8 +113,6 @@ class DocumentHybridSearchRetriever:
             search_space_id: Optional search space ID to filter results
             document_type: Optional document type to filter results (e.g., "FILE", "CRAWLED_URL")
 
-        Returns:
-            List of dictionaries containing document data and relevance scores
         """
         from sqlalchemy import select, func, text
         from sqlalchemy.orm import joinedload
@@ -224,10 +222,22 @@ class DocumentHybridSearchRetriever:
         # Convert to serializable dictionaries
         serialized_results = []
         for document, score in documents_with_scores:
+            # Fetch associated chunks for this document
+            from sqlalchemy import select
+            from app.db import Chunk
+
+            chunks_query = select(Chunk).where(Chunk.document_id == document.id).order_by(Chunk.id)
+            chunks_result = await self.db_session.execute(chunks_query)
+            chunks = chunks_result.scalars().all()
+
+            # Concatenate chunks content
+            concatenated_chunks_content = " ".join([chunk.content for chunk in chunks]) if chunks else document.content
+
             serialized_results.append({
                 "document_id": document.id,
                 "title": document.title,
                 "content": document.content,
+                "chunks_content": concatenated_chunks_content,
                 "document_type": document.document_type.value if hasattr(document, 'document_type') else None,
                 "metadata": document.document_metadata,
                 "score": float(score), # Ensure score is a Python float
@@ -10,6 +10,8 @@ from fastapi.responses import StreamingResponse
 from sqlalchemy.exc import IntegrityError, OperationalError
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.future import select
+from langchain.schema import HumanMessage, AIMessage
+
 
 router = APIRouter()
 
@@ -20,14 +22,16 @@ async def handle_chat_data(
     user: User = Depends(current_active_user)
 ):
     messages = request.messages
-    if messages[-1].role != "user":
+    if messages[-1]['role'] != "user":
         raise HTTPException(
             status_code=400, detail="Last message must be a user message")
 
-    user_query = messages[-1].content
+    user_query = messages[-1]['content']
     search_space_id = request.data.get('search_space_id')
     research_mode: str = request.data.get('research_mode')
     selected_connectors: List[str] = request.data.get('selected_connectors')
+
+    search_mode_str = request.data.get('search_mode', "CHUNKS")
 
     # Convert search_space_id to integer if it's a string
     if search_space_id and isinstance(search_space_id, str):
@@ -43,6 +47,21 @@ async def handle_chat_data(
     except HTTPException:
         raise HTTPException(
             status_code=403, detail="You don't have access to this search space")
 
+    langchain_chat_history = []
+    for message in messages[:-1]:
+        if message['role'] == "user":
+            langchain_chat_history.append(HumanMessage(content=message['content']))
+        elif message['role'] == "assistant":
+            # Last annotation type will always be "ANSWER" here
+            answer_annotation = message['annotations'][-1]
+            answer_text = ""
+            if answer_annotation['type'] == "ANSWER":
+                answer_text = answer_annotation['content']
+                # If content is a list, join it into a single string
+                if isinstance(answer_text, list):
+                    answer_text = "\n".join(answer_text)
+            langchain_chat_history.append(AIMessage(content=answer_text))
+
     response = StreamingResponse(stream_connector_search_results(
         user_query,
@@ -50,7 +69,9 @@ async def handle_chat_data(
         search_space_id, # Already converted to int in lines 32-37
         session,
         research_mode,
-        selected_connectors
+        selected_connectors,
+        langchain_chat_history,
+        search_mode_str
     ))
     response.headers['x-vercel-ai-data-stream'] = 'v1'
     return response
@@ -1,3 +1,4 @@
+from litellm import atranscription
 from fastapi import APIRouter, Depends, BackgroundTasks, UploadFile, Form, HTTPException
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.future import select
@@ -6,7 +7,8 @@ from app.db import get_async_session, User, SearchSpace, Document, DocumentType
 from app.schemas import DocumentsCreate, DocumentUpdate, DocumentRead
 from app.users import current_active_user
 from app.utils.check_ownership import check_ownership
-from app.tasks.background_tasks import add_extension_received_document, add_received_file_document, add_crawled_url_document, add_youtube_video_document
+from app.tasks.background_tasks import add_received_markdown_file_document, add_extension_received_document, add_received_file_document_using_unstructured, add_crawled_url_document, add_youtube_video_document, add_received_file_document_using_llamacloud
+from app.config import config as app_config
 # Force asyncio to use standard event loop before unstructured imports
 import asyncio
 try:
@@ -15,12 +17,11 @@ except RuntimeError:
     pass
 import os
 os.environ["UNSTRUCTURED_HAS_PATCHED_LOOP"] = "1"
-from langchain_unstructured import UnstructuredLoader
-from app.config import config
-import json
 
 router = APIRouter()
 
 
 @router.post("/documents/")
 async def create_documents(
     request: DocumentsCreate,
@@ -31,19 +32,19 @@ async def create_documents(
     try:
         # Check if the user owns the search space
         await check_ownership(session, SearchSpace, request.search_space_id, user)
 
         if request.document_type == DocumentType.EXTENSION:
             for individual_document in request.content:
                 fastapi_background_tasks.add_task(
                     process_extension_document_with_new_session,
                     individual_document,
                     request.search_space_id
                 )
         elif request.document_type == DocumentType.CRAWLED_URL:
             for url in request.content:
                 fastapi_background_tasks.add_task(
                     process_crawled_url_with_new_session,
                     url,
                     request.search_space_id
                 )
         elif request.document_type == DocumentType.YOUTUBE_VIDEO:
@@ -58,7 +59,7 @@ async def create_documents(
                 status_code=400,
                 detail="Invalid document type"
             )
 
         await session.commit()
         return {"message": "Documents processed successfully"}
     except HTTPException:
@@ -70,6 +71,7 @@ async def create_documents(
             detail=f"Failed to process documents: {str(e)}"
         )
 
+
 @router.post("/documents/fileupload")
 async def create_documents(
     files: list[UploadFile],
@@ -80,27 +82,26 @@ async def create_documents(
 ):
     try:
         await check_ownership(session, SearchSpace, search_space_id, user)
 
         if not files:
             raise HTTPException(status_code=400, detail="No files provided")
 
         for file in files:
             try:
                 # Save file to a temporary location to avoid stream issues
                 import tempfile
                 import aiofiles
                 import os
 
                 # Create temp file
                 with tempfile.NamedTemporaryFile(delete=False, suffix=os.path.splitext(file.filename)[1]) as temp_file:
                     temp_path = temp_file.name
 
                 # Write uploaded file to temp file
                 content = await file.read()
                 with open(temp_path, "wb") as f:
                     f.write(content)
 
-                # Process in background to avoid uvloop conflicts
                 fastapi_background_tasks.add_task(
                     process_file_in_background_with_new_session,
                     temp_path,
@@ -112,7 +113,7 @@ async def create_documents(
                     status_code=422,
                     detail=f"Failed to process file {file.filename}: {str(e)}"
                 )
 
         await session.commit()
         return {"message": "Files uploaded for processing"}
     except HTTPException:
@@ -132,40 +133,136 @@ async def process_file_in_background(
     session: AsyncSession
 ):
     try:
-        # Use synchronous unstructured API to avoid event loop issues
-        from langchain_community.document_loaders import UnstructuredFileLoader
-
-        # Process the file
-        loader = UnstructuredFileLoader(
-            file_path,
-            mode="elements",
-            post_processors=[],
-            languages=["eng"],
-            include_orig_elements=False,
-            include_metadata=False,
-            strategy="auto",
-        )
-
-        docs = loader.load()
-
-        # Clean up the temp file
-        import os
-        try:
-            os.unlink(file_path)
-        except:
-            pass
-
-        # Pass the documents to the existing background task
-        await add_received_file_document(
-            session,
-            filename,
-            docs,
-            search_space_id
-        )
+        # Check if the file is a markdown file
+        if filename.lower().endswith(('.md', '.markdown')):
+            # For markdown files, read the content directly
+            with open(file_path, 'r', encoding='utf-8') as f:
+                markdown_content = f.read()
+
+            # Clean up the temp file
+            import os
+            try:
+                os.unlink(file_path)
+            except:
+                pass
+
+            # Process markdown directly through specialized function
+            await add_received_markdown_file_document(
+                session,
+                filename,
+                markdown_content,
+                search_space_id
+            )
+        # Check if the file is an audio file
+        elif filename.lower().endswith(('.mp3', '.mp4', '.mpeg', '.mpga', '.m4a', '.wav', '.webm')):
+            # Open the audio file for transcription
+            with open(file_path, "rb") as audio_file:
+                # Use LiteLLM for audio transcription
+                if app_config.STT_SERVICE_API_BASE:
+                    transcription_response = await atranscription(
+                        model=app_config.STT_SERVICE,
+                        file=audio_file,
+                        api_base=app_config.STT_SERVICE_API_BASE
+                    )
+                else:
+                    transcription_response = await atranscription(
+                        model=app_config.STT_SERVICE,
+                        file=audio_file
+                    )
+
+            # Extract the transcribed text
+            transcribed_text = transcription_response.get("text", "")
+
+            # Add metadata about the transcription
+            transcribed_text = f"# Transcription of (unknown)\n\n{transcribed_text}"
+
+            # Clean up the temp file
+            try:
+                os.unlink(file_path)
+            except:
+                pass
+
+            # Process transcription as markdown document
+            await add_received_markdown_file_document(
+                session,
+                filename,
+                transcribed_text,
+                search_space_id
+            )
+        else:
+            if app_config.ETL_SERVICE == "UNSTRUCTURED":
+                from langchain_unstructured import UnstructuredLoader
+
+                # Process the file
+                loader = UnstructuredLoader(
+                    file_path,
+                    mode="elements",
+                    post_processors=[],
|
||||||
|
languages=["eng"],
|
||||||
|
include_orig_elements=False,
|
||||||
|
include_metadata=False,
|
||||||
|
strategy="auto",
|
||||||
|
)
|
||||||
|
|
||||||
|
docs = await loader.aload()
|
||||||
|
|
||||||
|
# Clean up the temp file
|
||||||
|
import os
|
||||||
|
try:
|
||||||
|
os.unlink(file_path)
|
||||||
|
except:
|
||||||
|
pass
|
||||||
|
|
||||||
|
# Pass the documents to the existing background task
|
||||||
|
await add_received_file_document_using_unstructured(
|
||||||
|
session,
|
||||||
|
filename,
|
||||||
|
docs,
|
||||||
|
search_space_id
|
||||||
|
)
|
||||||
|
elif app_config.ETL_SERVICE == "LLAMACLOUD":
|
||||||
|
from llama_cloud_services import LlamaParse
|
||||||
|
from llama_cloud_services.parse.utils import ResultType
|
||||||
|
|
||||||
|
|
||||||
|
# Create LlamaParse parser instance
|
||||||
|
parser = LlamaParse(
|
||||||
|
api_key=app_config.LLAMA_CLOUD_API_KEY,
|
||||||
|
num_workers=1, # Use single worker for file processing
|
||||||
|
verbose=True,
|
||||||
|
language="en",
|
||||||
|
result_type=ResultType.MD
|
||||||
|
)
|
||||||
|
|
||||||
|
# Parse the file asynchronously
|
||||||
|
result = await parser.aparse(file_path)
|
||||||
|
|
||||||
|
# Clean up the temp file
|
||||||
|
import os
|
||||||
|
try:
|
||||||
|
os.unlink(file_path)
|
||||||
|
except:
|
||||||
|
pass
|
||||||
|
|
||||||
|
# Get markdown documents from the result
|
||||||
|
markdown_documents = await result.aget_markdown_documents(split_by_page=False)
|
||||||
|
|
||||||
|
for doc in markdown_documents:
|
||||||
|
# Extract text content from the markdown documents
|
||||||
|
markdown_content = doc.text
|
||||||
|
|
||||||
|
# Process the documents using our LlamaCloud background task
|
||||||
|
await add_received_file_document_using_llamacloud(
|
||||||
|
session,
|
||||||
|
filename,
|
||||||
|
llamacloud_markdown_document=markdown_content,
|
||||||
|
search_space_id=search_space_id
|
||||||
|
)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
import logging
|
import logging
|
||||||
logging.error(f"Error processing file in background: {str(e)}")
|
logging.error(f"Error processing file in background: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
@router.get("/documents/", response_model=List[DocumentRead])
|
@router.get("/documents/", response_model=List[DocumentRead])
|
||||||
async def read_documents(
|
async def read_documents(
|
||||||
skip: int = 0,
|
skip: int = 0,
|
||||||
|
|
@ -175,17 +272,18 @@ async def read_documents(
|
||||||
user: User = Depends(current_active_user)
|
user: User = Depends(current_active_user)
|
||||||
):
|
):
|
||||||
try:
|
try:
|
||||||
query = select(Document).join(SearchSpace).filter(SearchSpace.user_id == user.id)
|
query = select(Document).join(SearchSpace).filter(
|
||||||
|
SearchSpace.user_id == user.id)
|
||||||
|
|
||||||
# Filter by search_space_id if provided
|
# Filter by search_space_id if provided
|
||||||
if search_space_id is not None:
|
if search_space_id is not None:
|
||||||
query = query.filter(Document.search_space_id == search_space_id)
|
query = query.filter(Document.search_space_id == search_space_id)
|
||||||
|
|
||||||
result = await session.execute(
|
result = await session.execute(
|
||||||
query.offset(skip).limit(limit)
|
query.offset(skip).limit(limit)
|
||||||
)
|
)
|
||||||
db_documents = result.scalars().all()
|
db_documents = result.scalars().all()
|
||||||
|
|
||||||
# Convert database objects to API-friendly format
|
# Convert database objects to API-friendly format
|
||||||
api_documents = []
|
api_documents = []
|
||||||
for doc in db_documents:
|
for doc in db_documents:
|
||||||
|
|
@ -198,7 +296,7 @@ async def read_documents(
|
||||||
created_at=doc.created_at,
|
created_at=doc.created_at,
|
||||||
search_space_id=doc.search_space_id
|
search_space_id=doc.search_space_id
|
||||||
))
|
))
|
||||||
|
|
||||||
return api_documents
|
return api_documents
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
|
|
@ -206,6 +304,7 @@ async def read_documents(
|
||||||
detail=f"Failed to fetch documents: {str(e)}"
|
detail=f"Failed to fetch documents: {str(e)}"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@router.get("/documents/{document_id}", response_model=DocumentRead)
|
@router.get("/documents/{document_id}", response_model=DocumentRead)
|
||||||
async def read_document(
|
async def read_document(
|
||||||
document_id: int,
|
document_id: int,
|
||||||
|
|
@ -219,13 +318,13 @@ async def read_document(
|
||||||
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
||||||
)
|
)
|
||||||
document = result.scalars().first()
|
document = result.scalars().first()
|
||||||
|
|
||||||
if not document:
|
if not document:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=404,
|
status_code=404,
|
||||||
detail=f"Document with id {document_id} not found"
|
detail=f"Document with id {document_id} not found"
|
||||||
)
|
)
|
||||||
|
|
||||||
# Convert database object to API-friendly format
|
# Convert database object to API-friendly format
|
||||||
return DocumentRead(
|
return DocumentRead(
|
||||||
id=document.id,
|
id=document.id,
|
||||||
|
|
@ -242,6 +341,7 @@ async def read_document(
|
||||||
detail=f"Failed to fetch document: {str(e)}"
|
detail=f"Failed to fetch document: {str(e)}"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@router.put("/documents/{document_id}", response_model=DocumentRead)
|
@router.put("/documents/{document_id}", response_model=DocumentRead)
|
||||||
async def update_document(
|
async def update_document(
|
||||||
document_id: int,
|
document_id: int,
|
||||||
|
|
@ -257,19 +357,19 @@ async def update_document(
|
||||||
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
||||||
)
|
)
|
||||||
db_document = result.scalars().first()
|
db_document = result.scalars().first()
|
||||||
|
|
||||||
if not db_document:
|
if not db_document:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=404,
|
status_code=404,
|
||||||
detail=f"Document with id {document_id} not found"
|
detail=f"Document with id {document_id} not found"
|
||||||
)
|
)
|
||||||
|
|
||||||
update_data = document_update.model_dump(exclude_unset=True)
|
update_data = document_update.model_dump(exclude_unset=True)
|
||||||
for key, value in update_data.items():
|
for key, value in update_data.items():
|
||||||
setattr(db_document, key, value)
|
setattr(db_document, key, value)
|
||||||
await session.commit()
|
await session.commit()
|
||||||
await session.refresh(db_document)
|
await session.refresh(db_document)
|
||||||
|
|
||||||
# Convert to DocumentRead for response
|
# Convert to DocumentRead for response
|
||||||
return DocumentRead(
|
return DocumentRead(
|
||||||
id=db_document.id,
|
id=db_document.id,
|
||||||
|
|
@ -289,6 +389,7 @@ async def update_document(
|
||||||
detail=f"Failed to update document: {str(e)}"
|
detail=f"Failed to update document: {str(e)}"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@router.delete("/documents/{document_id}", response_model=dict)
|
@router.delete("/documents/{document_id}", response_model=dict)
|
||||||
async def delete_document(
|
async def delete_document(
|
||||||
document_id: int,
|
document_id: int,
|
||||||
|
|
@ -303,13 +404,13 @@ async def delete_document(
|
||||||
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
.filter(Document.id == document_id, SearchSpace.user_id == user.id)
|
||||||
)
|
)
|
||||||
document = result.scalars().first()
|
document = result.scalars().first()
|
||||||
|
|
||||||
if not document:
|
if not document:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=404,
|
status_code=404,
|
||||||
detail=f"Document with id {document_id} not found"
|
detail=f"Document with id {document_id} not found"
|
||||||
)
|
)
|
||||||
|
|
||||||
await session.delete(document)
|
await session.delete(document)
|
||||||
await session.commit()
|
await session.commit()
|
||||||
return {"message": "Document deleted successfully"}
|
return {"message": "Document deleted successfully"}
|
||||||
|
|
@ -320,16 +421,16 @@ async def delete_document(
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=500,
|
status_code=500,
|
||||||
detail=f"Failed to delete document: {str(e)}"
|
detail=f"Failed to delete document: {str(e)}"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
async def process_extension_document_with_new_session(
|
async def process_extension_document_with_new_session(
|
||||||
individual_document,
|
individual_document,
|
||||||
search_space_id: int
|
search_space_id: int
|
||||||
):
|
):
|
||||||
"""Create a new session and process extension document."""
|
"""Create a new session and process extension document."""
|
||||||
from app.db import async_session_maker
|
from app.db import async_session_maker
|
||||||
|
|
||||||
async with async_session_maker() as session:
|
async with async_session_maker() as session:
|
||||||
try:
|
try:
|
||||||
await add_extension_received_document(session, individual_document, search_space_id)
|
await add_extension_received_document(session, individual_document, search_space_id)
|
||||||
|
|
@ -337,13 +438,14 @@ async def process_extension_document_with_new_session(
|
||||||
import logging
|
import logging
|
||||||
logging.error(f"Error processing extension document: {str(e)}")
|
logging.error(f"Error processing extension document: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
async def process_crawled_url_with_new_session(
|
async def process_crawled_url_with_new_session(
|
||||||
url: str,
|
url: str,
|
||||||
search_space_id: int
|
search_space_id: int
|
||||||
):
|
):
|
||||||
"""Create a new session and process crawled URL."""
|
"""Create a new session and process crawled URL."""
|
||||||
from app.db import async_session_maker
|
from app.db import async_session_maker
|
||||||
|
|
||||||
async with async_session_maker() as session:
|
async with async_session_maker() as session:
|
||||||
try:
|
try:
|
||||||
await add_crawled_url_document(session, url, search_space_id)
|
await add_crawled_url_document(session, url, search_space_id)
|
||||||
|
|
@ -351,6 +453,7 @@ async def process_crawled_url_with_new_session(
|
||||||
import logging
|
import logging
|
||||||
logging.error(f"Error processing crawled URL: {str(e)}")
|
logging.error(f"Error processing crawled URL: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
async def process_file_in_background_with_new_session(
|
async def process_file_in_background_with_new_session(
|
||||||
file_path: str,
|
file_path: str,
|
||||||
filename: str,
|
filename: str,
|
||||||
|
|
@ -358,17 +461,18 @@ async def process_file_in_background_with_new_session(
|
||||||
):
|
):
|
||||||
"""Create a new session and process file."""
|
"""Create a new session and process file."""
|
||||||
from app.db import async_session_maker
|
from app.db import async_session_maker
|
||||||
|
|
||||||
async with async_session_maker() as session:
|
async with async_session_maker() as session:
|
||||||
await process_file_in_background(file_path, filename, search_space_id, session)
|
await process_file_in_background(file_path, filename, search_space_id, session)
|
||||||
|
|
||||||
|
|
||||||
async def process_youtube_video_with_new_session(
|
async def process_youtube_video_with_new_session(
|
||||||
url: str,
|
url: str,
|
||||||
search_space_id: int
|
search_space_id: int
|
||||||
):
|
):
|
||||||
"""Create a new session and process YouTube video."""
|
"""Create a new session and process YouTube video."""
|
||||||
from app.db import async_session_maker
|
from app.db import async_session_maker
|
||||||
|
|
||||||
async with async_session_maker() as session:
|
async with async_session_maker() as session:
|
||||||
try:
|
try:
|
||||||
await add_youtube_video_document(session, url, search_space_id)
|
await add_youtube_video_document(session, url, search_space_id)
|
||||||
|
|
@ -376,3 +480,4 @@ async def process_youtube_video_with_new_session(
|
||||||
import logging
|
import logging
|
||||||
logging.error(f"Error processing YouTube video: {str(e)}")
|
logging.error(f"Error processing YouTube video: {str(e)}")
|
||||||
|
|
||||||
|
|
||||||
|
|
|
||||||
|
|
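The `process_file_in_background` rewrite above dispatches on file extension: markdown files are read directly, audio files go through speech-to-text, and everything else falls through to the configured ETL service. A minimal, self-contained sketch of that dispatch order (the helper name `classify_upload` is hypothetical, not part of the codebase):

```python
# Extension tuples mirror the endswith() checks in the diff above.
MARKDOWN_EXTENSIONS = ('.md', '.markdown')
AUDIO_EXTENSIONS = ('.mp3', '.mp4', '.mpeg', '.mpga', '.m4a', '.wav', '.webm')


def classify_upload(filename: str) -> str:
    """Return which processing branch a filename would take: markdown, audio, or etl."""
    lowered = filename.lower()
    if lowered.endswith(MARKDOWN_EXTENSIONS):
        return "markdown"
    if lowered.endswith(AUDIO_EXTENSIONS):
        return "audio"
    # Everything else goes to the ETL service (UNSTRUCTURED or LLAMACLOUD).
    return "etl"
```

Note that the check is order-sensitive: `.md` is tested before the audio tuple, so a hypothetical `notes.md` never reaches the transcription branch.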
@@ -1,12 +1,16 @@
-from fastapi import APIRouter, Depends, HTTPException
+from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.future import select
 from sqlalchemy.exc import IntegrityError, SQLAlchemyError
 from typing import List
-from app.db import get_async_session, User, SearchSpace, Podcast
-from app.schemas import PodcastCreate, PodcastUpdate, PodcastRead
+from app.db import get_async_session, User, SearchSpace, Podcast, Chat
+from app.schemas import PodcastCreate, PodcastUpdate, PodcastRead, PodcastGenerateRequest
 from app.users import current_active_user
 from app.utils.check_ownership import check_ownership
+from app.tasks.podcast_tasks import generate_chat_podcast
+from fastapi.responses import StreamingResponse
+import os
+from pathlib import Path

 router = APIRouter()
@@ -119,4 +123,121 @@ async def delete_podcast(
         raise he
     except SQLAlchemyError:
         await session.rollback()
         raise HTTPException(status_code=500, detail="Database error occurred while deleting podcast")
+
+
+async def generate_chat_podcast_with_new_session(
+    chat_id: int,
+    search_space_id: int,
+    podcast_title: str = "SurfSense Podcast"
+):
+    """Create a new session and process chat podcast generation."""
+    from app.db import async_session_maker
+
+    async with async_session_maker() as session:
+        try:
+            await generate_chat_podcast(session, chat_id, search_space_id, podcast_title)
+        except Exception as e:
+            import logging
+            logging.error(f"Error generating podcast from chat: {str(e)}")
+
+
+@router.post("/podcasts/generate/")
+async def generate_podcast(
+    request: PodcastGenerateRequest,
+    session: AsyncSession = Depends(get_async_session),
+    user: User = Depends(current_active_user),
+    fastapi_background_tasks: BackgroundTasks = BackgroundTasks()
+):
+    try:
+        # Check if the user owns the search space
+        await check_ownership(session, SearchSpace, request.search_space_id, user)
+
+        if request.type == "CHAT":
+            # Verify that all chat IDs belong to this user and search space
+            query = select(Chat).filter(
+                Chat.id.in_(request.ids),
+                Chat.search_space_id == request.search_space_id
+            ).join(SearchSpace).filter(SearchSpace.user_id == user.id)
+
+            result = await session.execute(query)
+            valid_chats = result.scalars().all()
+            valid_chat_ids = [chat.id for chat in valid_chats]
+
+            # If any requested ID is not in valid IDs, raise error immediately
+            if len(valid_chat_ids) != len(request.ids):
+                raise HTTPException(
+                    status_code=403,
+                    detail="One or more chat IDs do not belong to this user or search space"
+                )
+
+            # Queue one generation task per valid chat ID
+            for chat_id in valid_chat_ids:
+                fastapi_background_tasks.add_task(
+                    generate_chat_podcast_with_new_session,
+                    chat_id,
+                    request.search_space_id,
+                    request.podcast_title
+                )
+
+        return {
+            "message": "Podcast generation started",
+        }
+    except HTTPException as he:
+        raise he
+    except IntegrityError as e:
+        await session.rollback()
+        raise HTTPException(status_code=400, detail="Podcast generation failed due to constraint violation")
+    except SQLAlchemyError as e:
+        await session.rollback()
+        raise HTTPException(status_code=500, detail="Database error occurred while generating podcast")
+    except Exception as e:
+        await session.rollback()
+        raise HTTPException(status_code=500, detail=f"An unexpected error occurred: {str(e)}")
+
+
+@router.get("/podcasts/{podcast_id}/stream")
+async def stream_podcast(
+    podcast_id: int,
+    session: AsyncSession = Depends(get_async_session),
+    user: User = Depends(current_active_user)
+):
+    """Stream a podcast audio file."""
+    try:
+        # Get the podcast and check if user has access
+        result = await session.execute(
+            select(Podcast)
+            .join(SearchSpace)
+            .filter(Podcast.id == podcast_id, SearchSpace.user_id == user.id)
+        )
+        podcast = result.scalars().first()
+
+        if not podcast:
+            raise HTTPException(
+                status_code=404,
+                detail="Podcast not found or you don't have permission to access it"
+            )
+
+        # Get the file path
+        file_path = podcast.file_location
+
+        # Check if the file exists
+        if not os.path.isfile(file_path):
+            raise HTTPException(status_code=404, detail="Podcast audio file not found")
+
+        # Define a generator function to stream the file
+        def iterfile():
+            with open(file_path, mode="rb") as file_like:
+                yield from file_like
+
+        # Return a streaming response with appropriate headers
+        return StreamingResponse(
+            iterfile(),
+            media_type="audio/mpeg",
+            headers={
+                "Accept-Ranges": "bytes",
+                "Content-Disposition": f"inline; filename={Path(file_path).name}"
+            }
+        )
+
+    except HTTPException as he:
+        raise he
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=f"Error streaming podcast: {str(e)}")
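The new `/podcasts/{podcast_id}/stream` endpoint above returns a `StreamingResponse` whose body is a generator over the audio file. A standalone sketch of the same idea using fixed-size chunks rather than `yield from file_like` (the helper name `iter_file_chunks` is hypothetical; fixed-size reads are a common choice for binary streaming since line-based iteration over audio bytes yields chunks of arbitrary length):

```python
import os
import tempfile


def iter_file_chunks(path: str, chunk_size: int = 64 * 1024):
    """Yield a file's bytes in fixed-size chunks, as a streaming response body would."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Usage: write a small file and confirm the chunks reassemble to the original bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc" * 1000)
    path = tmp.name

data = b"".join(iter_file_chunks(path, chunk_size=256))
os.unlink(path)
```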
@@ -21,7 +21,7 @@ from app.utils.check_ownership import check_ownership
 from pydantic import BaseModel, Field, ValidationError
 from app.tasks.connectors_indexing_tasks import index_slack_messages, index_notion_pages, index_github_repos, index_linear_issues
 from app.connectors.github_connector import GitHubConnector
-from datetime import datetime, timezone, timedelta
+from datetime import datetime, timedelta
 import logging

 # Set up logging
@@ -10,7 +10,7 @@ from .documents import (
     DocumentRead,
 )
 from .chunks import ChunkBase, ChunkCreate, ChunkUpdate, ChunkRead
-from .podcasts import PodcastBase, PodcastCreate, PodcastUpdate, PodcastRead
+from .podcasts import PodcastBase, PodcastCreate, PodcastUpdate, PodcastRead, PodcastGenerateRequest
 from .chats import ChatBase, ChatCreate, ChatUpdate, ChatRead, AISDKChatRequest
 from .search_source_connector import SearchSourceConnectorBase, SearchSourceConnectorCreate, SearchSourceConnectorUpdate, SearchSourceConnectorRead
@@ -39,6 +39,7 @@ __all__ = [
     "PodcastCreate",
     "PodcastUpdate",
     "PodcastRead",
+    "PodcastGenerateRequest",
     "ChatBase",
     "ChatCreate",
     "ChatUpdate",
@@ -1,8 +1,10 @@
 from datetime import datetime
-from pydantic import BaseModel
+from pydantic import BaseModel, ConfigDict

 class TimestampModel(BaseModel):
     created_at: datetime
+    model_config = ConfigDict(from_attributes=True)

 class IDModel(BaseModel):
     id: int
+    model_config = ConfigDict(from_attributes=True)
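The schema diffs in this commit migrate from Pydantic v1's inner `class Config` to v2's `model_config = ConfigDict(...)`. A minimal sketch of what `from_attributes=True` enables, assuming Pydantic v2 is installed (`ItemRead` and `_Row` are illustrative names, not part of the codebase):

```python
from pydantic import BaseModel, ConfigDict


class ItemRead(BaseModel):
    # Pydantic v2 style: replaces the v1 `class Config: from_attributes = True`
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str


class _Row:
    """Stand-in for a SQLAlchemy ORM row: a plain object exposing attributes."""

    def __init__(self):
        self.id = 1
        self.name = "demo"


# from_attributes lets model_validate read plain attributes instead of dict keys.
item = ItemRead.model_validate(_Row())
```

Without `from_attributes=True`, `model_validate` would reject the non-mapping `_Row` instance.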
@@ -1,8 +1,10 @@
 from typing import Any, Dict, List, Optional
-from pydantic import BaseModel
-from sqlalchemy import JSON
-from .base import IDModel, TimestampModel
 from app.db import ChatType
+from pydantic import BaseModel, ConfigDict
+
+from .base import IDModel, TimestampModel
+

 class ChatBase(BaseModel):
     type: ChatType
@@ -25,14 +27,14 @@ class ToolInvocation(BaseModel):
     result: dict


-class ClientMessage(BaseModel):
-    role: str
-    content: str
-    experimental_attachments: Optional[List[ClientAttachment]] = None
-    toolInvocations: Optional[List[ToolInvocation]] = None
+# class ClientMessage(BaseModel):
+#     role: str
+#     content: str
+#     experimental_attachments: Optional[List[ClientAttachment]] = None
+#     toolInvocations: Optional[List[ToolInvocation]] = None

 class AISDKChatRequest(BaseModel):
-    messages: List[ClientMessage]
+    messages: List[Any]
     data: Optional[Dict[str, Any]] = None

 class ChatCreate(ChatBase):
@@ -42,5 +44,4 @@ class ChatUpdate(ChatBase):
     pass

 class ChatRead(ChatBase, IDModel, TimestampModel):
-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
@@ -1,4 +1,4 @@
-from pydantic import BaseModel
+from pydantic import BaseModel, ConfigDict
 from .base import IDModel, TimestampModel

 class ChunkBase(BaseModel):
@@ -12,5 +12,4 @@ class ChunkUpdate(ChunkBase):
     pass

 class ChunkRead(ChunkBase, IDModel, TimestampModel):
-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
@@ -1,7 +1,5 @@
 from typing import List, Any
-from pydantic import BaseModel
-from sqlalchemy import JSON
-from .base import IDModel, TimestampModel
+from pydantic import BaseModel, ConfigDict
 from app.db import DocumentType
 from datetime import datetime
@@ -37,6 +35,5 @@ class DocumentRead(BaseModel):
     created_at: datetime
     search_space_id: int

-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
@@ -1,10 +1,10 @@
-from pydantic import BaseModel
+from pydantic import BaseModel, ConfigDict
+from typing import Any, List, Literal
 from .base import IDModel, TimestampModel

 class PodcastBase(BaseModel):
     title: str
-    is_generated: bool = False
-    podcast_content: str = ""
+    podcast_transcript: List[Any]
     file_location: str = ""
     search_space_id: int
@@ -15,5 +15,10 @@ class PodcastUpdate(PodcastBase):
     pass

 class PodcastRead(PodcastBase, IDModel, TimestampModel):
-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
+
+class PodcastGenerateRequest(BaseModel):
+    type: Literal["DOCUMENT", "CHAT"]
+    ids: List[int]
+    search_space_id: int
+    podcast_title: str = "SurfSense Podcast"
@@ -1,7 +1,7 @@
 from datetime import datetime
 import uuid
 from typing import Dict, Any, Optional
-from pydantic import BaseModel, field_validator
+from pydantic import BaseModel, field_validator, ConfigDict
 from .base import IDModel, TimestampModel
 from app.db import SearchSourceConnectorType
@@ -36,6 +36,16 @@ class SearchSourceConnectorBase(BaseModel):
             # Ensure the API key is not empty
             if not config.get("TAVILY_API_KEY"):
                 raise ValueError("TAVILY_API_KEY cannot be empty")

+        elif connector_type == SearchSourceConnectorType.LINKUP_API:
+            # For LINKUP_API, only allow LINKUP_API_KEY
+            allowed_keys = ["LINKUP_API_KEY"]
+            if set(config.keys()) != set(allowed_keys):
+                raise ValueError(f"For LINKUP_API connector type, config must only contain these keys: {allowed_keys}")
+
+            # Ensure the API key is not empty
+            if not config.get("LINKUP_API_KEY"):
+                raise ValueError("LINKUP_API_KEY cannot be empty")
+
         elif connector_type == SearchSourceConnectorType.SLACK_CONNECTOR:
             # For SLACK_CONNECTOR, only allow SLACK_BOT_TOKEN
@@ -96,5 +106,4 @@ class SearchSourceConnectorUpdate(BaseModel):
 class SearchSourceConnectorRead(SearchSourceConnectorBase, IDModel, TimestampModel):
     user_id: uuid.UUID

-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
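The new LINKUP_API branch above follows the same validation pattern as the other connector types: the config must contain exactly the allowed keys, and none of them may be empty. The pattern in isolation (the helper name `validate_connector_config` is hypothetical; the real code inlines this logic per connector type):

```python
def validate_connector_config(config: dict, allowed_keys: list) -> None:
    """Reject configs whose key set differs from the allowed set, or whose values are empty."""
    # Exact-set comparison forbids both missing and extra keys.
    if set(config.keys()) != set(allowed_keys):
        raise ValueError(f"config must only contain these keys: {allowed_keys}")
    for key in allowed_keys:
        # Falsy values ("" or None) are treated as missing credentials.
        if not config.get(key):
            raise ValueError(f"{key} cannot be empty")


# Usage: a well-formed config passes silently.
validate_connector_config({"LINKUP_API_KEY": "sk-123"}, ["LINKUP_API_KEY"])
```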
@@ -1,7 +1,7 @@
 from datetime import datetime
 import uuid
 from typing import Optional
-from pydantic import BaseModel
+from pydantic import BaseModel, ConfigDict
 from .base import IDModel, TimestampModel

 class SearchSpaceBase(BaseModel):
@@ -19,5 +19,4 @@ class SearchSpaceRead(SearchSpaceBase, IDModel, TimestampModel):
     created_at: datetime
     user_id: uuid.UUID

-    class Config:
-        from_attributes = True
+    model_config = ConfigDict(from_attributes=True)
@@ -1,27 +1,29 @@
 from typing import Optional, List
 from sqlalchemy.ext.asyncio import AsyncSession
 from sqlalchemy.exc import SQLAlchemyError
+from sqlalchemy.future import select
 from app.db import Document, DocumentType, Chunk
 from app.schemas import ExtensionDocumentContent
 from app.config import config
 from app.prompts import SUMMARY_PROMPT_TEMPLATE
 from datetime import datetime
-from app.utils.document_converters import convert_document_to_markdown
+from app.utils.document_converters import convert_document_to_markdown, generate_content_hash
 from langchain_core.documents import Document as LangChainDocument
 from langchain_community.document_loaders import FireCrawlLoader, AsyncChromiumLoader
 from langchain_community.document_transformers import MarkdownifyTransformer
 import validators
+from youtube_transcript_api import YouTubeTranscriptApi
+from urllib.parse import urlparse, parse_qs
+import aiohttp
+import logging
 
 md = MarkdownifyTransformer()
 
 
 async def add_crawled_url_document(
-    session: AsyncSession,
-    url: str,
-    search_space_id: int
+    session: AsyncSession, url: str, search_space_id: int
 ) -> Optional[Document]:
     try:
 
         if not validators.url(url):
             raise ValueError(f"Url {url} is not a valid URL address")
 
 
@@ -33,7 +35,7 @@ async def add_crawled_url_document(
             params={
                 "formats": ["markdown"],
                 "excludeTags": ["a"],
-            }
+            },
         )
     else:
         crawl_loader = AsyncChromiumLoader(urls=[url], headless=True)
 
@@ -43,20 +45,21 @@ async def add_crawled_url_document(
         if type(crawl_loader) == FireCrawlLoader:
             content_in_markdown = url_crawled[0].page_content
         elif type(crawl_loader) == AsyncChromiumLoader:
-            content_in_markdown = md.transform_documents(url_crawled)[
-                0].page_content
+            content_in_markdown = md.transform_documents(url_crawled)[0].page_content
 
         # Format document metadata in a more maintainable way
         metadata_sections = [
-            ("METADATA", [
-                f"{key.upper()}: {value}" for key, value in url_crawled[0].metadata.items()
-            ]),
-            ("CONTENT", [
-                "FORMAT: markdown",
-                "TEXT_START",
-                content_in_markdown,
-                "TEXT_END"
-            ])
+            (
+                "METADATA",
+                [
+                    f"{key.upper()}: {value}"
+                    for key, value in url_crawled[0].metadata.items()
+                ],
+            ),
+            (
+                "CONTENT",
+                ["FORMAT: markdown", "TEXT_START", content_in_markdown, "TEXT_END"],
+            ),
         ]
 
         # Build the document string more efficiently
 
@@ -69,31 +72,48 @@ async def add_crawled_url_document(
             document_parts.append(f"</{section_title}>")
 
         document_parts.append("</DOCUMENT>")
-        combined_document_string = '\n'.join(document_parts)
+        combined_document_string = "\n".join(document_parts)
+        content_hash = generate_content_hash(combined_document_string)
+
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
 
         # Generate summary
         summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
-        summary_result = await summary_chain.ainvoke({"document": combined_document_string})
+        summary_result = await summary_chain.ainvoke(
+            {"document": combined_document_string}
+        )
         summary_content = summary_result.content
-        summary_embedding = config.embedding_model_instance.embed(
-            summary_content)
+        summary_embedding = config.embedding_model_instance.embed(summary_content)
 
         # Process chunks
         chunks = [
-            Chunk(content=chunk.text, embedding=chunk.embedding)
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
             for chunk in config.chunker_instance.chunk(content_in_markdown)
         ]
 
         # Create and store document
         document = Document(
             search_space_id=search_space_id,
-            title=url_crawled[0].metadata['title'] if type(
-                crawl_loader) == FireCrawlLoader else url_crawled[0].metadata['source'],
+            title=url_crawled[0].metadata["title"]
+            if type(crawl_loader) == FireCrawlLoader
+            else url_crawled[0].metadata["source"],
             document_type=DocumentType.CRAWLED_URL,
             document_metadata=url_crawled[0].metadata,
             content=summary_content,
             embedding=summary_embedding,
-            chunks=chunks
+            chunks=chunks,
+            content_hash=content_hash,
         )
 
         session.add(document)
 
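The deduplication introduced above keys on `generate_content_hash`, which lives in `app.utils.document_converters` and is not shown in this diff. A minimal sketch of what such a helper could look like, assuming a plain SHA-256 digest over the combined document string:

```python
import hashlib


def generate_content_hash(content: str) -> str:
    # Identical document strings always map to the same digest, so an
    # equality check on content_hash is enough to detect duplicates.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()
```

With a hash like this, a re-submitted URL that crawls to the same combined string hits the `existing_document` early return instead of triggering a second summary/embedding pass.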
@@ -111,9 +131,7 @@ async def add_crawled_url_document(
 
 
 async def add_extension_received_document(
-    session: AsyncSession,
-    content: ExtensionDocumentContent,
-    search_space_id: int
+    session: AsyncSession, content: ExtensionDocumentContent, search_space_id: int
 ) -> Optional[Document]:
     """
     Process and store document content received from the SurfSense Extension.
 
@@ -129,20 +147,21 @@ async def add_extension_received_document(
     try:
         # Format document metadata in a more maintainable way
         metadata_sections = [
-            ("METADATA", [
-                f"SESSION_ID: {content.metadata.BrowsingSessionId}",
-                f"URL: {content.metadata.VisitedWebPageURL}",
-                f"TITLE: {content.metadata.VisitedWebPageTitle}",
-                f"REFERRER: {content.metadata.VisitedWebPageReffererURL}",
-                f"TIMESTAMP: {content.metadata.VisitedWebPageDateWithTimeInISOString}",
-                f"DURATION_MS: {content.metadata.VisitedWebPageVisitDurationInMilliseconds}"
-            ]),
-            ("CONTENT", [
-                "FORMAT: markdown",
-                "TEXT_START",
-                content.pageContent,
-                "TEXT_END"
-            ])
+            (
+                "METADATA",
+                [
+                    f"SESSION_ID: {content.metadata.BrowsingSessionId}",
+                    f"URL: {content.metadata.VisitedWebPageURL}",
+                    f"TITLE: {content.metadata.VisitedWebPageTitle}",
+                    f"REFERRER: {content.metadata.VisitedWebPageReffererURL}",
+                    f"TIMESTAMP: {content.metadata.VisitedWebPageDateWithTimeInISOString}",
+                    f"DURATION_MS: {content.metadata.VisitedWebPageVisitDurationInMilliseconds}",
+                ],
+            ),
+            (
+                "CONTENT",
+                ["FORMAT: markdown", "TEXT_START", content.pageContent, "TEXT_END"],
+            ),
         ]
 
         # Build the document string more efficiently
 
@@ -155,18 +174,33 @@ async def add_extension_received_document(
             document_parts.append(f"</{section_title}>")
 
         document_parts.append("</DOCUMENT>")
-        combined_document_string = '\n'.join(document_parts)
+        combined_document_string = "\n".join(document_parts)
+        content_hash = generate_content_hash(combined_document_string)
+
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
 
         # Generate summary
         summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
-        summary_result = await summary_chain.ainvoke({"document": combined_document_string})
+        summary_result = await summary_chain.ainvoke(
+            {"document": combined_document_string}
+        )
         summary_content = summary_result.content
-        summary_embedding = config.embedding_model_instance.embed(
-            summary_content)
+        summary_embedding = config.embedding_model_instance.embed(summary_content)
 
         # Process chunks
         chunks = [
-            Chunk(content=chunk.text, embedding=chunk.embedding)
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
             for chunk in config.chunker_instance.chunk(content.pageContent)
        ]
 
 
@@ -178,7 +212,8 @@ async def add_extension_received_document(
             document_metadata=content.metadata.model_dump(),
             content=summary_content,
             embedding=summary_embedding,
-            chunks=chunks
+            chunks=chunks,
+            content_hash=content_hash,
         )
 
         session.add(document)
 
@@ -195,27 +230,34 @@ async def add_extension_received_document(
         raise RuntimeError(f"Failed to process extension document: {str(e)}")
 
 
-async def add_received_file_document(
-    session: AsyncSession,
-    file_name: str,
-    unstructured_processed_elements: List[LangChainDocument],
-    search_space_id: int
+async def add_received_markdown_file_document(
+    session: AsyncSession, file_name: str, file_in_markdown: str, search_space_id: int
 ) -> Optional[Document]:
     try:
-        file_in_markdown = await convert_document_to_markdown(unstructured_processed_elements)
+        content_hash = generate_content_hash(file_in_markdown)
 
-        # TODO: Check if file_markdown exceeds token limit of embedding model
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
 
         # Generate summary
         summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
         summary_result = await summary_chain.ainvoke({"document": file_in_markdown})
         summary_content = summary_result.content
-        summary_embedding = config.embedding_model_instance.embed(
-            summary_content)
+        summary_embedding = config.embedding_model_instance.embed(summary_content)
 
         # Process chunks
         chunks = [
-            Chunk(content=chunk.text, embedding=chunk.embedding)
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
             for chunk in config.chunker_instance.chunk(file_in_markdown)
         ]
 
 
@@ -226,11 +268,11 @@ async def add_received_file_document(
             document_type=DocumentType.FILE,
             document_metadata={
                 "FILE_NAME": file_name,
-                "SAVED_AT": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
             },
             content=summary_content,
             embedding=summary_embedding,
-            chunks=chunks
+            chunks=chunks,
+            content_hash=content_hash,
         )
 
         session.add(document)
 
@@ -246,24 +288,176 @@ async def add_received_file_document(
         raise RuntimeError(f"Failed to process file document: {str(e)}")
 
 
-async def add_youtube_video_document(
+async def add_received_file_document_using_unstructured(
     session: AsyncSession,
-    url: str,
-    search_space_id: int
-):
+    file_name: str,
+    unstructured_processed_elements: List[LangChainDocument],
+    search_space_id: int,
+) -> Optional[Document]:
+    try:
+        file_in_markdown = await convert_document_to_markdown(
+            unstructured_processed_elements
+        )
+
+        content_hash = generate_content_hash(file_in_markdown)
+
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
+
+        # TODO: Check if file_markdown exceeds token limit of embedding model
+
+        # Generate summary
+        summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
+        summary_result = await summary_chain.ainvoke({"document": file_in_markdown})
+        summary_content = summary_result.content
+        summary_embedding = config.embedding_model_instance.embed(summary_content)
+
+        # Process chunks
+        chunks = [
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
+            for chunk in config.chunker_instance.chunk(file_in_markdown)
+        ]
+
+        # Create and store document
+        document = Document(
+            search_space_id=search_space_id,
+            title=file_name,
+            document_type=DocumentType.FILE,
+            document_metadata={
+                "FILE_NAME": file_name,
+                "ETL_SERVICE": "UNSTRUCTURED",
+            },
+            content=summary_content,
+            embedding=summary_embedding,
+            chunks=chunks,
+            content_hash=content_hash,
+        )
+
+        session.add(document)
+        await session.commit()
+        await session.refresh(document)
+
+        return document
+    except SQLAlchemyError as db_error:
+        await session.rollback()
+        raise db_error
+    except Exception as e:
+        await session.rollback()
+        raise RuntimeError(f"Failed to process file document: {str(e)}")
+
+
+async def add_received_file_document_using_llamacloud(
+    session: AsyncSession,
+    file_name: str,
+    llamacloud_markdown_document: str,
+    search_space_id: int,
+) -> Optional[Document]:
     """
-    Process a YouTube video URL, extract transcripts, and add as document.
+    Process and store document content parsed by LlamaCloud.
+
+    Args:
+        session: Database session
+        file_name: Name of the processed file
+        llamacloud_markdown_documents: List of markdown content from LlamaCloud parsing
+        search_space_id: ID of the search space
+
+    Returns:
+        Document object if successful, None if failed
     """
     try:
-        from youtube_transcript_api import YouTubeTranscriptApi
+        # Combine all markdown documents into one
+        file_in_markdown = llamacloud_markdown_document
+
+        content_hash = generate_content_hash(file_in_markdown)
+
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
+
+        # Generate summary
+        summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
+        summary_result = await summary_chain.ainvoke({"document": file_in_markdown})
+        summary_content = summary_result.content
+        summary_embedding = config.embedding_model_instance.embed(summary_content)
+
+        # Process chunks
+        chunks = [
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
+            for chunk in config.chunker_instance.chunk(file_in_markdown)
+        ]
+
+        # Create and store document
+        document = Document(
+            search_space_id=search_space_id,
+            title=file_name,
+            document_type=DocumentType.FILE,
+            document_metadata={
+                "FILE_NAME": file_name,
+                "ETL_SERVICE": "LLAMACLOUD",
+            },
+            content=summary_content,
+            embedding=summary_embedding,
+            chunks=chunks,
+            content_hash=content_hash,
+        )
+
+        session.add(document)
+        await session.commit()
+        await session.refresh(document)
+
+        return document
+    except SQLAlchemyError as db_error:
+        await session.rollback()
+        raise db_error
+    except Exception as e:
+        await session.rollback()
+        raise RuntimeError(f"Failed to process file document using LlamaCloud: {str(e)}")
+
+
+async def add_youtube_video_document(
+    session: AsyncSession, url: str, search_space_id: int
+):
+    """
+    Process a YouTube video URL, extract transcripts, and store as a document.
+
+    Args:
+        session: Database session for storing the document
+        url: YouTube video URL (supports standard, shortened, and embed formats)
+        search_space_id: ID of the search space to add the document to
+
+    Returns:
+        Document: The created document object
+
+    Raises:
+        ValueError: If the YouTube video ID cannot be extracted from the URL
+        SQLAlchemyError: If there's a database error
+        RuntimeError: If the video processing fails
+    """
+    try:
         # Extract video ID from URL
         def get_youtube_video_id(url: str):
-            from urllib.parse import urlparse, parse_qs
 
             parsed_url = urlparse(url)
             hostname = parsed_url.hostname
 
             if hostname == "youtu.be":
                 return parsed_url.path[1:]
             if hostname in ("www.youtube.com", "youtube.com"):
 
@@ -275,26 +469,23 @@ async def add_youtube_video_document(
             if parsed_url.path.startswith("/v/"):
                 return parsed_url.path.split("/")[2]
             return None
 
         # Get video ID
         video_id = get_youtube_video_id(url)
         if not video_id:
             raise ValueError(f"Could not extract video ID from URL: {url}")
 
-        # Get video metadata
-        import json
-        from urllib.parse import urlencode
-        from urllib.request import urlopen
-        params = {"format": "json", "url": f"https://www.youtube.com/watch?v={video_id}"}
+        # Get video metadata using async HTTP client
+        params = {
+            "format": "json",
+            "url": f"https://www.youtube.com/watch?v={video_id}",
+        }
         oembed_url = "https://www.youtube.com/oembed"
-        query_string = urlencode(params)
-        full_url = oembed_url + "?" + query_string
-        with urlopen(full_url) as response:
-            response_text = response.read()
-            video_data = json.loads(response_text.decode())
+        async with aiohttp.ClientSession() as http_session:
+            async with http_session.get(oembed_url, params=params) as response:
+                video_data = await response.json()
 
         # Get video transcript
         try:
             captions = YouTubeTranscriptApi.get_transcript(video_id)
 
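For reference, the ID-extraction helper these hunks touch can be sketched standalone. The `youtu.be` and `/v/` branches are visible in the diff; the `/watch` and `/embed/` branches fall outside this chunk, so their handling below is an assumption based on the `parse_qs` import the PR adds at the top of the file:

```python
from urllib.parse import urlparse, parse_qs


def get_youtube_video_id(url: str):
    parsed_url = urlparse(url)
    hostname = parsed_url.hostname

    # Shortened share links carry the ID as the whole path.
    if hostname == "youtu.be":
        return parsed_url.path[1:]
    if hostname in ("www.youtube.com", "youtube.com"):
        # Assumed branch: standard watch URLs keep the ID in the query string.
        if parsed_url.path == "/watch":
            return parse_qs(parsed_url.query).get("v", [None])[0]
        # Assumed branch: embed URLs, same shape as the /v/ case shown below.
        if parsed_url.path.startswith("/embed/"):
            return parsed_url.path.split("/")[2]
        if parsed_url.path.startswith("/v/"):
            return parsed_url.path.split("/")[2]
    return None
```

Any URL that falls through every branch returns `None`, which the caller turns into the `ValueError` raised above.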
@@ -309,22 +500,23 @@ async def add_youtube_video_document(
             transcript_text = "\n".join(transcript_segments)
         except Exception as e:
             transcript_text = f"No captions available for this video. Error: {str(e)}"
 
         # Format document metadata in a more maintainable way
         metadata_sections = [
-            ("METADATA", [
-                f"TITLE: {video_data.get('title', 'YouTube Video')}",
-                f"URL: {url}",
-                f"VIDEO_ID: {video_id}",
-                f"AUTHOR: {video_data.get('author_name', 'Unknown')}",
-                f"THUMBNAIL: {video_data.get('thumbnail_url', '')}"
-            ]),
-            ("CONTENT", [
-                "FORMAT: transcript",
-                "TEXT_START",
-                transcript_text,
-                "TEXT_END"
-            ])
+            (
+                "METADATA",
+                [
+                    f"TITLE: {video_data.get('title', 'YouTube Video')}",
+                    f"URL: {url}",
+                    f"VIDEO_ID: {video_id}",
+                    f"AUTHOR: {video_data.get('author_name', 'Unknown')}",
+                    f"THUMBNAIL: {video_data.get('thumbnail_url', '')}",
+                ],
+            ),
+            (
+                "CONTENT",
+                ["FORMAT: transcript", "TEXT_START", transcript_text, "TEXT_END"],
+            ),
         ]
 
         # Build the document string more efficiently
 
@@ -337,23 +529,38 @@ async def add_youtube_video_document(
             document_parts.append(f"</{section_title}>")
 
         document_parts.append("</DOCUMENT>")
-        combined_document_string = '\n'.join(document_parts)
+        combined_document_string = "\n".join(document_parts)
+        content_hash = generate_content_hash(combined_document_string)
+
+        # Check if document with this content hash already exists
+        existing_doc_result = await session.execute(
+            select(Document).where(Document.content_hash == content_hash)
+        )
+        existing_document = existing_doc_result.scalars().first()
+
+        if existing_document:
+            logging.info(f"Document with content hash {content_hash} already exists. Skipping processing.")
+            return existing_document
 
         # Generate summary
         summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
-        summary_result = await summary_chain.ainvoke({"document": combined_document_string})
+        summary_result = await summary_chain.ainvoke(
+            {"document": combined_document_string}
+        )
         summary_content = summary_result.content
         summary_embedding = config.embedding_model_instance.embed(summary_content)
 
         # Process chunks
         chunks = [
-            Chunk(content=chunk.text, embedding=chunk.embedding)
-            for chunk in config.chunker_instance.chunk(transcript_text)
+            Chunk(
+                content=chunk.text,
+                embedding=config.embedding_model_instance.embed(chunk.text),
+            )
+            for chunk in config.chunker_instance.chunk(combined_document_string)
         ]
 
         # Create document
-        from app.db import Document, DocumentType
 
         document = Document(
             title=video_data.get("title", "YouTube Video"),
             document_type=DocumentType.YOUTUBE_VIDEO,
 
@@ -362,24 +569,24 @@ async def add_youtube_video_document(
                 "video_id": video_id,
                 "video_title": video_data.get("title", "YouTube Video"),
                 "author": video_data.get("author_name", "Unknown"),
-                "thumbnail": video_data.get("thumbnail_url", "")
+                "thumbnail": video_data.get("thumbnail_url", ""),
             },
             content=summary_content,
             embedding=summary_embedding,
             chunks=chunks,
-            search_space_id=search_space_id
+            search_space_id=search_space_id,
+            content_hash=content_hash,
         )
 
         session.add(document)
         await session.commit()
         await session.refresh(document)
 
         return document
     except SQLAlchemyError as db_error:
         await session.rollback()
         raise db_error
     except Exception as e:
         await session.rollback()
-        import logging
         logging.error(f"Failed to process YouTube video: {str(e)}")
         raise
 
@@ -14,6 +14,8 @@ from app.connectors.linear_connector import LinearConnector
 from slack_sdk.errors import SlackApiError
 import logging
 
+from app.utils.document_converters import generate_content_hash
+
 # Set up logging
 logger = logging.getLogger(__name__)
 
 
@@ -67,13 +69,13 @@ async def index_slack_messages(
 
         # Check if last_indexed_at is in the future or after end_date
         if last_indexed_naive > end_date:
-            logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 30 days ago instead.")
-            start_date = end_date - timedelta(days=30)
+            logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 365 days ago instead.")
+            start_date = end_date - timedelta(days=365)
         else:
             start_date = last_indexed_naive
             logger.info(f"Using last_indexed_at ({start_date.strftime('%Y-%m-%d')}) as start date")
     else:
-        start_date = end_date - timedelta(days=30)  # Use 30 days instead of 365 to catch recent issues
+        start_date = end_date - timedelta(days=365)  # Use 365 days as default
         logger.info(f"No last_indexed_at found, using {start_date.strftime('%Y-%m-%d')} (30 days ago) as start date")
 
     # Format dates for Slack API
 
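The indexing-window change above can be restated in isolation. This is a simplified, synchronous sketch of the fallback rules (a missing or future `last_indexed_at` falls back to a 365-day look-back), not the connector's actual function:

```python
from datetime import datetime, timedelta


def resolve_start_date(last_indexed_at, end_date: datetime) -> datetime:
    if last_indexed_at is not None:
        if last_indexed_at > end_date:
            # A future timestamp is invalid; use the default window instead.
            return end_date - timedelta(days=365)
        return last_indexed_at
    # No previous index run recorded: default to a 365-day window.
    return end_date - timedelta(days=365)
```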
@@ -89,58 +91,31 @@ async def index_slack_messages(
         if not channels:
             return 0, "No Slack channels found"
 
-        # Get existing documents for this search space and connector type to prevent duplicates
-        existing_docs_result = await session.execute(
-            select(Document)
-            .filter(
-                Document.search_space_id == search_space_id,
-                Document.document_type == DocumentType.SLACK_CONNECTOR
-            )
-        )
-        existing_docs = existing_docs_result.scalars().all()
-
-        # Create a lookup dictionary of existing documents by channel_id
-        existing_docs_by_channel_id = {}
-        for doc in existing_docs:
-            if "channel_id" in doc.document_metadata:
-                existing_docs_by_channel_id[doc.document_metadata["channel_id"]] = doc
-
-        logger.info(f"Found {len(existing_docs_by_channel_id)} existing Slack documents in database")
-
         # Track the number of documents indexed
         documents_indexed = 0
-        documents_updated = 0
         documents_skipped = 0
         skipped_channels = []
 
         # Process each channel
-        for channel_name, channel_id in channels.items():
+        for channel_obj in channels:  # Modified loop to iterate over list of channel objects
+            channel_id = channel_obj["id"]
+            channel_name = channel_obj["name"]
+            is_private = channel_obj["is_private"]
+            is_member = channel_obj["is_member"]  # This might be False for public channels too
+
             try:
-                # Check if the bot is a member of the channel
-                try:
-                    # First try to get channel info to check if bot is a member
-                    channel_info = slack_client.client.conversations_info(channel=channel_id)
-
-                    # For private channels, the bot needs to be a member
-                    if channel_info.get("channel", {}).get("is_private", False):
-                        # Check if bot is a member
-                        is_member = channel_info.get("channel", {}).get("is_member", False)
-                        if not is_member:
-                            logger.warning(f"Bot is not a member of private channel {channel_name} ({channel_id}). Skipping.")
-                            skipped_channels.append(f"{channel_name} (private, bot not a member)")
-                            documents_skipped += 1
-                            continue
-                except SlackApiError as e:
-                    if "not_in_channel" in str(e) or "channel_not_found" in str(e):
-                        logger.warning(f"Bot cannot access channel {channel_name} ({channel_id}). Skipping.")
-                        skipped_channels.append(f"{channel_name} (access error)")
-                        documents_skipped += 1
-                        continue
-                    else:
-                        # Re-raise if it's a different error
-                        raise
+                # If it's a private channel and the bot is not a member, skip.
+                # For public channels, if they are listed by conversations.list, the bot can typically read history.
+                # The `not_in_channel` error in get_conversation_history will be the ultimate gatekeeper if history is inaccessible.
+                if is_private and not is_member:
+                    logger.warning(f"Bot is not a member of private channel {channel_name} ({channel_id}). Skipping.")
+                    skipped_channels.append(f"{channel_name} (private, bot not a member)")
+                    documents_skipped += 1
+                    continue
 
                 # Get messages for this channel
+                # The get_history_by_date_range now uses get_conversation_history,
+                # which handles 'not_in_channel' by returning [] and logging.
                 messages, error = slack_client.get_history_by_date_range(
                     channel_id=channel_id,
                     start_date=start_date_str,
 
@ -189,10 +164,9 @@ async def index_slack_messages(
|
||||||
("METADATA", [
|
("METADATA", [
|
||||||
f"CHANNEL_NAME: {channel_name}",
|
f"CHANNEL_NAME: {channel_name}",
|
||||||
f"CHANNEL_ID: {channel_id}",
|
f"CHANNEL_ID: {channel_id}",
|
||||||
f"START_DATE: {start_date_str}",
|
# f"START_DATE: {start_date_str}",
|
||||||
f"END_DATE: {end_date_str}",
|
# f"END_DATE: {end_date_str}",
|
||||||
f"MESSAGE_COUNT: {len(formatted_messages)}",
|
f"MESSAGE_COUNT: {len(formatted_messages)}"
|
||||||
f"INDEXED_AT: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
|
|
||||||
]),
|
]),
|
||||||
("CONTENT", [
|
("CONTENT", [
|
||||||
"FORMAT: markdown",
|
"FORMAT: markdown",
|
||||||
|
|
@ -213,6 +187,18 @@ async def index_slack_messages(
|
||||||
|
|
||||||
document_parts.append("</DOCUMENT>")
|
document_parts.append("</DOCUMENT>")
|
||||||
combined_document_string = '\n'.join(document_parts)
|
combined_document_string = '\n'.join(document_parts)
|
||||||
|
content_hash = generate_content_hash(combined_document_string)
|
||||||
|
|
||||||
|
# Check if document with this content hash already exists
|
||||||
|
existing_doc_by_hash_result = await session.execute(
|
||||||
|
select(Document).where(Document.content_hash == content_hash)
|
||||||
|
)
|
||||||
|
existing_document_by_hash = existing_doc_by_hash_result.scalars().first()
|
||||||
|
|
||||||
|
if existing_document_by_hash:
|
||||||
|
logger.info(f"Document with content hash {content_hash} already exists for channel {channel_name}. Skipping processing.")
|
||||||
|
documents_skipped += 1
|
||||||
|
continue
|
||||||
|
|
||||||
# Generate summary
|
# Generate summary
|
||||||
summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
|
summary_chain = SUMMARY_PROMPT_TEMPLATE | config.long_context_llm_instance
|
||||||
|
|
@ -222,65 +208,32 @@ async def index_slack_messages(
|
||||||
|
|
||||||
# Process chunks
|
# Process chunks
|
||||||
chunks = [
|
chunks = [
|
||||||
Chunk(content=chunk.text, embedding=chunk.embedding)
|
Chunk(content=chunk.text, embedding=config.embedding_model_instance.embed(chunk.text))
|
||||||
for chunk in config.chunker_instance.chunk(channel_content)
|
for chunk in config.chunker_instance.chunk(channel_content)
|
||||||
]
|
]
|
||||||
|
|
||||||
# Check if this channel already exists in our database
|
# Create and store new document
|
||||||
existing_document = existing_docs_by_channel_id.get(channel_id)
|
document = Document(
|
||||||
|
search_space_id=search_space_id,
|
||||||
if existing_document:
|
title=f"Slack - {channel_name}",
|
||||||
# Update existing document instead of creating a new one
|
document_type=DocumentType.SLACK_CONNECTOR,
|
||||||
logger.info(f"Updating existing document for channel {channel_name}")
|
document_metadata={
|
||||||
|
|
||||||
# Update document fields
|
|
||||||
existing_document.title = f"Slack - {channel_name}"
|
|
||||||
existing_document.document_metadata = {
|
|
||||||
"channel_name": channel_name,
|
"channel_name": channel_name,
|
||||||
"channel_id": channel_id,
|
"channel_id": channel_id,
|
||||||
"start_date": start_date_str,
|
"start_date": start_date_str,
|
||||||
"end_date": end_date_str,
|
"end_date": end_date_str,
|
||||||
"message_count": len(formatted_messages),
|
"message_count": len(formatted_messages),
|
||||||
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
|
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||||
"last_updated": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
},
|
||||||
}
|
content=summary_content,
|
||||||
existing_document.content = summary_content
|
embedding=summary_embedding,
|
||||||
existing_document.embedding = summary_embedding
|
chunks=chunks,
|
||||||
|
content_hash=content_hash,
|
||||||
# Delete existing chunks and add new ones
|
)
|
||||||
await session.execute(
|
|
||||||
delete(Chunk)
|
session.add(document)
|
||||||
.where(Chunk.document_id == existing_document.id)
|
documents_indexed += 1
|
||||||
)
|
logger.info(f"Successfully indexed new channel {channel_name} with {len(formatted_messages)} messages")
|
||||||
|
|
||||||
# Assign new chunks to existing document
|
|
||||||
for chunk in chunks:
|
|
||||||
chunk.document_id = existing_document.id
|
|
||||||
session.add(chunk)
|
|
||||||
|
|
||||||
documents_updated += 1
|
|
||||||
else:
|
|
||||||
# Create and store new document
|
|
||||||
document = Document(
|
|
||||||
search_space_id=search_space_id,
|
|
||||||
title=f"Slack - {channel_name}",
|
|
||||||
document_type=DocumentType.SLACK_CONNECTOR,
|
|
||||||
document_metadata={
|
|
||||||
"channel_name": channel_name,
|
|
||||||
"channel_id": channel_id,
|
|
||||||
"start_date": start_date_str,
|
|
||||||
"end_date": end_date_str,
|
|
||||||
"message_count": len(formatted_messages),
|
|
||||||
"indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
|
||||||
},
|
|
||||||
content=summary_content,
|
|
||||||
embedding=summary_embedding,
|
|
||||||
chunks=chunks
|
|
||||||
)
|
|
||||||
|
|
||||||
session.add(document)
|
|
||||||
documents_indexed += 1
|
|
||||||
logger.info(f"Successfully indexed new channel {channel_name} with {len(formatted_messages)} messages")
|
|
||||||
|
|
||||||
except SlackApiError as slack_error:
|
except SlackApiError as slack_error:
|
||||||
logger.error(f"Slack API error for channel {channel_name}: {str(slack_error)}")
|
logger.error(f"Slack API error for channel {channel_name}: {str(slack_error)}")
|
||||||
|
|
@ -295,7 +248,7 @@ async def index_slack_messages(
|
||||||
|
|
||||||
# Update the last_indexed_at timestamp for the connector only if requested
|
# Update the last_indexed_at timestamp for the connector only if requested
|
||||||
# and if we successfully indexed at least one channel
|
# and if we successfully indexed at least one channel
|
||||||
total_processed = documents_indexed + documents_updated
|
total_processed = documents_indexed
|
||||||
if update_last_indexed and total_processed > 0:
|
if update_last_indexed and total_processed > 0:
|
||||||
connector.last_indexed_at = datetime.now()
|
connector.last_indexed_at = datetime.now()
|
||||||
|
|
||||||
|
|
@ -305,11 +258,11 @@ async def index_slack_messages(
|
||||||
# Prepare result message
|
# Prepare result message
|
||||||
result_message = None
|
result_message = None
|
||||||
if skipped_channels:
|
if skipped_channels:
|
||||||
result_message = f"Processed {total_processed} channels ({documents_indexed} new, {documents_updated} updated). Skipped {len(skipped_channels)} channels: {', '.join(skipped_channels)}"
|
result_message = f"Processed {total_processed} channels. Skipped {len(skipped_channels)} channels: {', '.join(skipped_channels)}"
|
||||||
else:
|
else:
|
||||||
result_message = f"Processed {total_processed} channels ({documents_indexed} new, {documents_updated} updated)."
|
result_message = f"Processed {total_processed} channels."
|
||||||
|
|
||||||
logger.info(f"Slack indexing completed: {documents_indexed} new channels, {documents_updated} updated, {documents_skipped} skipped")
|
logger.info(f"Slack indexing completed: {documents_indexed} new channels, {documents_skipped} skipped")
|
||||||
return total_processed, result_message
|
return total_processed, result_message
|
||||||
|
|
||||||
except SQLAlchemyError as db_error:
|
except SQLAlchemyError as db_error:
|
||||||
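Every connector touched by this diff now routes duplicate detection through `generate_content_hash`. That helper is defined outside these hunks; a minimal sketch, assuming it is simply a deterministic SHA-256 digest of the combined document string (the assumption about the algorithm is mine, not stated in this diff):

```python
import hashlib

def generate_content_hash(content: str) -> str:
    # Hypothetical stand-in for the helper imported by these connectors:
    # a deterministic digest, so identical content always maps to the same
    # hash and can be found with a single equality lookup on the column.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

h1 = generate_content_hash("hello")
h2 = generate_content_hash("hello")
h3 = generate_content_hash("hello!")
print(h1 == h2, h1 != h3, len(h1))  # → True True 64
```

Because the digest depends only on the rendered document text, re-running an indexing job over unchanged channels produces identical hashes and every document is skipped.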
@@ -386,27 +339,8 @@ async def index_notion_pages(
             logger.info("No Notion pages found to index")
             return 0, "No Notion pages found"

-        # Get existing documents for this search space and connector type to prevent duplicates
-        existing_docs_result = await session.execute(
-            select(Document)
-            .filter(
-                Document.search_space_id == search_space_id,
-                Document.document_type == DocumentType.NOTION_CONNECTOR
-            )
-        )
-        existing_docs = existing_docs_result.scalars().all()
-
-        # Create a lookup dictionary of existing documents by page_id
-        existing_docs_by_page_id = {}
-        for doc in existing_docs:
-            if "page_id" in doc.document_metadata:
-                existing_docs_by_page_id[doc.document_metadata["page_id"]] = doc
-
-        logger.info(f"Found {len(existing_docs_by_page_id)} existing Notion documents in database")
-
         # Track the number of documents indexed
         documents_indexed = 0
-        documents_updated = 0
         documents_skipped = 0
         skipped_pages = []

@@ -482,8 +416,7 @@ async def index_notion_pages(
                 metadata_sections = [
                     ("METADATA", [
                         f"PAGE_TITLE: {page_title}",
-                        f"PAGE_ID: {page_id}",
-                        f"INDEXED_AT: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
+                        f"PAGE_ID: {page_id}"
                     ]),
                     ("CONTENT", [
                         "FORMAT: markdown",
@@ -504,6 +437,18 @@ async def index_notion_pages(

                 document_parts.append("</DOCUMENT>")
                 combined_document_string = '\n'.join(document_parts)
+                content_hash = generate_content_hash(combined_document_string)
+
+                # Check if document with this content hash already exists
+                existing_doc_by_hash_result = await session.execute(
+                    select(Document).where(Document.content_hash == content_hash)
+                )
+                existing_document_by_hash = existing_doc_by_hash_result.scalars().first()
+
+                if existing_document_by_hash:
+                    logger.info(f"Document with content hash {content_hash} already exists for page {page_title}. Skipping processing.")
+                    documents_skipped += 1
+                    continue

                 # Generate summary
                 logger.debug(f"Generating summary for page {page_title}")
@@ -515,59 +460,29 @@ async def index_notion_pages(
                 # Process chunks
                 logger.debug(f"Chunking content for page {page_title}")
                 chunks = [
-                    Chunk(content=chunk.text, embedding=chunk.embedding)
+                    Chunk(content=chunk.text, embedding=config.embedding_model_instance.embed(chunk.text))
                     for chunk in config.chunker_instance.chunk(markdown_content)
                 ]

-                # Check if this page already exists in our database
-                existing_document = existing_docs_by_page_id.get(page_id)
-
-                if existing_document:
-                    # Update existing document instead of creating a new one
-                    logger.info(f"Updating existing document for page {page_title}")
-
-                    # Update document fields
-                    existing_document.title = f"Notion - {page_title}"
-                    existing_document.document_metadata = {
-                        "page_title": page_title,
-                        "page_id": page_id,
-                        "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
-                        "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-                    }
-                    existing_document.content = summary_content
-                    existing_document.embedding = summary_embedding
-
-                    # Delete existing chunks and add new ones
-                    await session.execute(
-                        delete(Chunk)
-                        .where(Chunk.document_id == existing_document.id)
-                    )
-
-                    # Assign new chunks to existing document
-                    for chunk in chunks:
-                        chunk.document_id = existing_document.id
-                        session.add(chunk)
-
-                    documents_updated += 1
-                else:
-                    # Create and store new document
-                    document = Document(
-                        search_space_id=search_space_id,
-                        title=f"Notion - {page_title}",
-                        document_type=DocumentType.NOTION_CONNECTOR,
-                        document_metadata={
-                            "page_title": page_title,
-                            "page_id": page_id,
-                            "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-                        },
-                        content=summary_content,
-                        embedding=summary_embedding,
-                        chunks=chunks
-                    )
-
-                    session.add(document)
-                    documents_indexed += 1
-                    logger.info(f"Successfully indexed new Notion page: {page_title}")
+                # Create and store new document
+                document = Document(
+                    search_space_id=search_space_id,
+                    title=f"Notion - {page_title}",
+                    document_type=DocumentType.NOTION_CONNECTOR,
+                    document_metadata={
+                        "page_title": page_title,
+                        "page_id": page_id,
+                        "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+                    },
+                    content=summary_content,
+                    content_hash=content_hash,
+                    embedding=summary_embedding,
+                    chunks=chunks
+                )
+
+                session.add(document)
+                documents_indexed += 1
+                logger.info(f"Successfully indexed new Notion page: {page_title}")

             except Exception as e:
                 logger.error(f"Error processing Notion page {page.get('title', 'Unknown')}: {str(e)}", exc_info=True)
@@ -577,7 +492,7 @@ async def index_notion_pages(

         # Update the last_indexed_at timestamp for the connector only if requested
         # and if we successfully indexed at least one page
-        total_processed = documents_indexed + documents_updated
+        total_processed = documents_indexed
         if update_last_indexed and total_processed > 0:
             connector.last_indexed_at = datetime.now()
             logger.info(f"Updated last_indexed_at for connector {connector_id}")
@@ -588,11 +503,11 @@ async def index_notion_pages(
         # Prepare result message
         result_message = None
         if skipped_pages:
-            result_message = f"Processed {total_processed} pages ({documents_indexed} new, {documents_updated} updated). Skipped {len(skipped_pages)} pages: {', '.join(skipped_pages)}"
+            result_message = f"Processed {total_processed} pages. Skipped {len(skipped_pages)} pages: {', '.join(skipped_pages)}"
         else:
-            result_message = f"Processed {total_processed} pages ({documents_indexed} new, {documents_updated} updated)."
+            result_message = f"Processed {total_processed} pages."

-        logger.info(f"Notion indexing completed: {documents_indexed} new pages, {documents_updated} updated, {documents_skipped} skipped")
+        logger.info(f"Notion indexing completed: {documents_indexed} new pages, {documents_skipped} skipped")
         return total_processed, result_message

     except SQLAlchemyError as db_error:
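The same check-hash-then-skip pattern repeats in each connector above. Stripped of the async database layer, the control flow reduces to the following self-contained sketch, where an in-memory set stands in for the `Document.content_hash` column:

```python
import hashlib

def dedupe_by_content_hash(pages, stored_hashes):
    # stored_hashes stands in for the content_hash column lookup;
    # a page whose rendered content hashes to a known value is skipped,
    # otherwise it is "stored" and counted as newly indexed.
    documents_indexed, documents_skipped = 0, 0
    for content in pages:
        content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if content_hash in stored_hashes:
            documents_skipped += 1
            continue
        stored_hashes.add(content_hash)
        documents_indexed += 1
    return documents_indexed, documents_skipped

print(dedupe_by_content_hash(["page a", "page b", "page a"], set()))  # → (2, 1)
```

Note this replaces the old per-ID update path entirely: a changed page hashes to a new value and is indexed as a fresh document rather than updating the existing row, which is why the `documents_updated` counter is removed throughout.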
@@ -660,19 +575,6 @@ async def index_github_repos(
         # If a repo is inaccessible, get_repository_files will likely fail gracefully later.
         logger.info(f"Starting indexing for {len(repo_full_names_to_index)} selected repositories.")

-        # 5. Get existing documents for this search space and connector type to prevent duplicates
-        existing_docs_result = await session.execute(
-            select(Document)
-            .filter(
-                Document.search_space_id == search_space_id,
-                Document.document_type == DocumentType.GITHUB_CONNECTOR
-            )
-        )
-        existing_docs = existing_docs_result.scalars().all()
-        # Create a lookup dict: key=repo_fullname/file_path, value=Document object
-        existing_docs_lookup = {doc.document_metadata.get("full_path"): doc for doc in existing_docs if doc.document_metadata.get("full_path")}
-        logger.info(f"Found {len(existing_docs_lookup)} existing GitHub documents in database for search space {search_space_id}")

         # 6. Iterate through selected repositories and index files
         for repo_full_name in repo_full_names_to_index:
             if not repo_full_name or not isinstance(repo_full_name, str):
@@ -699,12 +601,6 @@ async def index_github_repos(
                         logger.warning(f"Skipping file with missing info in {repo_full_name}: {file_info}")
                         continue

-                    # Check if document already exists and if content hash matches
-                    existing_doc = existing_docs_lookup.get(full_path_key)
-                    if existing_doc and existing_doc.document_metadata.get("sha") == file_sha:
-                        logger.debug(f"Skipping unchanged file: {full_path_key}")
-                        continue # Skip if SHA matches (content hasn't changed)
-
                     # Get file content
                     file_content = github_client.get_file_content(repo_full_name, file_path)

@@ -712,6 +608,18 @@ async def index_github_repos(
                         logger.warning(f"Could not retrieve content for {full_path_key}. Skipping.")
                         continue # Skip if content fetch failed

+                    content_hash = generate_content_hash(file_content)
+
+                    # Check if document with this content hash already exists
+                    existing_doc_by_hash_result = await session.execute(
+                        select(Document).where(Document.content_hash == content_hash)
+                    )
+                    existing_document_by_hash = existing_doc_by_hash_result.scalars().first()
+
+                    if existing_document_by_hash:
+                        logger.info(f"Document with content hash {content_hash} already exists for file {full_path_key}. Skipping processing.")
+                        continue
+
                     # Use file_content directly for chunking, maybe summary for main content?
                     # For now, let's use the full content for both, might need refinement
                     summary_content = f"GitHub file: {full_path_key}\n\n{file_content[:1000]}..." # Simple summary
@@ -720,8 +628,8 @@ async def index_github_repos(
                     # Chunk the content
                     try:
                         chunks_data = [
-                            Chunk(content=chunk.text, embedding=chunk.embedding)
-                            for chunk in config.chunker_instance.chunk(file_content)
+                            Chunk(content=chunk.text, embedding=config.embedding_model_instance.embed(chunk.text))
+                            for chunk in config.code_chunker_instance.chunk(file_content)
                         ]
                     except Exception as chunk_err:
                         logger.error(f"Failed to chunk file {full_path_key}: {chunk_err}")
@@ -738,42 +646,20 @@ async def index_github_repos(
                         "indexed_at": datetime.now(timezone.utc).isoformat()
                     }

-                    if existing_doc:
-                        # Update existing document
-                        logger.info(f"Updating document for file: {full_path_key}")
-                        existing_doc.title = f"GitHub - {file_path}"
-                        existing_doc.document_metadata = doc_metadata
-                        existing_doc.content = summary_content # Update summary
-                        existing_doc.embedding = summary_embedding # Update embedding
-
-                        # Delete old chunks
-                        await session.execute(
-                            delete(Chunk)
-                            .where(Chunk.document_id == existing_doc.id)
-                        )
-                        # Add new chunks
-                        for chunk_obj in chunks_data:
-                            chunk_obj.document_id = existing_doc.id
-                            session.add(chunk_obj)
-
-                        documents_processed += 1
-                    else:
-                        # Create new document
-                        logger.info(f"Creating new document for file: {full_path_key}")
-                        document = Document(
-                            title=f"GitHub - {file_path}",
-                            document_type=DocumentType.GITHUB_CONNECTOR,
-                            document_metadata=doc_metadata,
-                            content=summary_content, # Store summary
-                            embedding=summary_embedding,
-                            search_space_id=search_space_id,
-                            chunks=chunks_data # Associate chunks directly
-                        )
-                        session.add(document)
-                        documents_processed += 1
-
-                    # Commit periodically or at the end? For now, commit per repo
-                    # await session.commit()
+                    # Create new document
+                    logger.info(f"Creating new document for file: {full_path_key}")
+                    document = Document(
+                        title=f"GitHub - {file_path}",
+                        document_type=DocumentType.GITHUB_CONNECTOR,
+                        document_metadata=doc_metadata,
+                        content=summary_content, # Store summary
+                        content_hash=content_hash,
+                        embedding=summary_embedding,
+                        search_space_id=search_space_id,
+                        chunks=chunks_data # Associate chunks directly
+                    )
+                    session.add(document)
+                    documents_processed += 1

             except Exception as repo_err:
                 logger.error(f"Failed to process repository {repo_full_name}: {repo_err}")
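A second recurring change in these hunks is that chunks no longer carry pre-computed embeddings (`chunk.embedding`); each chunk's text is now embedded explicitly via the embedding model. A toy sketch of that shape, with stand-ins for `config.chunker_instance` and `config.embedding_model_instance` (both names and behaviors here are illustrative, not the real implementations):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    content: str
    embedding: list  # a vector column in the real model

def embed(text: str) -> list:
    # Stand-in for config.embedding_model_instance.embed(...)
    return [float(len(text))]

def chunk_text(text: str, size: int = 20) -> list:
    # Stand-in for config.chunker_instance.chunk(...): fixed-size slices
    return [text[i:i + size] for i in range(0, len(text), size)]

file_content = "x" * 45
chunks_data = [Chunk(content=c, embedding=embed(c)) for c in chunk_text(file_content)]
print(len(chunks_data), chunks_data[-1].embedding)  # → 3 [5.0]
```

Decoupling chunking from embedding means any chunker (including the `code_chunker_instance` the GitHub path now uses) can be paired with the configured embedding model.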
@@ -847,14 +733,14 @@ async def index_linear_issues(

             # Check if last_indexed_at is in the future or after end_date
             if last_indexed_naive > end_date:
-                logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 30 days ago instead.")
-                start_date = end_date - timedelta(days=30)
+                logger.warning(f"Last indexed date ({last_indexed_naive.strftime('%Y-%m-%d')}) is in the future. Using 365 days ago instead.")
+                start_date = end_date - timedelta(days=365)
             else:
                 start_date = last_indexed_naive
                 logger.info(f"Using last_indexed_at ({start_date.strftime('%Y-%m-%d')}) as start date")
         else:
-            start_date = end_date - timedelta(days=30) # Use 30 days instead of 365 to catch recent issues
-            logger.info(f"No last_indexed_at found, using {start_date.strftime('%Y-%m-%d')} (30 days ago) as start date")
+            start_date = end_date - timedelta(days=365) # Use 365 days as default
+            logger.info(f"No last_indexed_at found, using {start_date.strftime('%Y-%m-%d')} (365 days ago) as start date")

         # Format dates for Linear API
         start_date_str = start_date.strftime("%Y-%m-%d")
@@ -905,35 +791,8 @@ async def index_linear_issues(
         if len(issues) > 10:
             logger.info(f"  ...and {len(issues) - 10} more issues")

-        # Get existing documents for this search space and connector type to prevent duplicates
-        existing_docs_result = await session.execute(
-            select(Document)
-            .filter(
-                Document.search_space_id == search_space_id,
-                Document.document_type == DocumentType.LINEAR_CONNECTOR
-            )
-        )
-        existing_docs = existing_docs_result.scalars().all()
-
-        # Create a lookup dictionary of existing documents by issue_id
-        existing_docs_by_issue_id = {}
-        for doc in existing_docs:
-            if "issue_id" in doc.document_metadata:
-                existing_docs_by_issue_id[doc.document_metadata["issue_id"]] = doc
-
-        logger.info(f"Found {len(existing_docs_by_issue_id)} existing Linear documents in database")
-
-        # Log existing document IDs for debugging
-        if existing_docs_by_issue_id:
-            logger.info("Existing Linear document issue IDs in database:")
-            for idx, (issue_id, doc) in enumerate(list(existing_docs_by_issue_id.items())[:10]): # Log first 10
-                logger.info(f"  {idx+1}. {issue_id} - {doc.document_metadata.get('issue_identifier', 'Unknown')} - {doc.document_metadata.get('issue_title', 'Unknown')}")
-            if len(existing_docs_by_issue_id) > 10:
-                logger.info(f"  ...and {len(existing_docs_by_issue_id) - 10} more existing documents")
-
         # Track the number of documents indexed
         documents_indexed = 0
-        documents_updated = 0
         documents_skipped = 0
         skipped_issues = []

@@ -979,71 +838,51 @@ async def index_linear_issues(
                 comment_count = len(formatted_issue.get("comments", []))
                 summary_content += f"Comments: {comment_count}"

+                content_hash = generate_content_hash(issue_content)
+
+                # Check if document with this content hash already exists
+                existing_doc_by_hash_result = await session.execute(
+                    select(Document).where(Document.content_hash == content_hash)
+                )
+                existing_document_by_hash = existing_doc_by_hash_result.scalars().first()
+
+                if existing_document_by_hash:
+                    logger.info(f"Document with content hash {content_hash} already exists for issue {issue_identifier}. Skipping processing.")
+                    documents_skipped += 1
+                    continue
+
                 # Generate embedding for the summary
                 summary_embedding = config.embedding_model_instance.embed(summary_content)

                 # Process chunks - using the full issue content with comments
                 chunks = [
-                    Chunk(content=chunk.text, embedding=chunk.embedding)
+                    Chunk(content=chunk.text, embedding=config.embedding_model_instance.embed(chunk.text))
                    for chunk in config.chunker_instance.chunk(issue_content)
                 ]

-                # Check if this issue already exists in our database
-                existing_document = existing_docs_by_issue_id.get(issue_id)
-
-                if existing_document:
-                    # Update existing document instead of creating a new one
-                    logger.info(f"Updating existing document for issue {issue_identifier} - {issue_title}")
-
-                    # Update document fields
-                    existing_document.title = f"Linear - {issue_identifier}: {issue_title}"
-                    existing_document.document_metadata = {
-                        "issue_id": issue_id,
-                        "issue_identifier": issue_identifier,
-                        "issue_title": issue_title,
-                        "state": state,
-                        "comment_count": comment_count,
-                        "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
-                        "last_updated": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-                    }
-                    existing_document.content = summary_content
-                    existing_document.embedding = summary_embedding
-
-                    # Delete existing chunks and add new ones
-                    await session.execute(
-                        delete(Chunk)
-                        .where(Chunk.document_id == existing_document.id)
-                    )
-
-                    # Assign new chunks to existing document
-                    for chunk in chunks:
-                        chunk.document_id = existing_document.id
-                        session.add(chunk)
-
-                    documents_updated += 1
-                else:
-                    # Create and store new document
-                    logger.info(f"Creating new document for issue {issue_identifier} - {issue_title}")
-                    document = Document(
-                        search_space_id=search_space_id,
-                        title=f"Linear - {issue_identifier}: {issue_title}",
-                        document_type=DocumentType.LINEAR_CONNECTOR,
-                        document_metadata={
-                            "issue_id": issue_id,
-                            "issue_identifier": issue_identifier,
-                            "issue_title": issue_title,
-                            "state": state,
-                            "comment_count": comment_count,
-                            "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-                        },
-                        content=summary_content,
-                        embedding=summary_embedding,
-                        chunks=chunks
-                    )
-
-                    session.add(document)
-                    documents_indexed += 1
-                    logger.info(f"Successfully indexed new issue {issue_identifier} - {issue_title}")
+                # Create and store new document
+                logger.info(f"Creating new document for issue {issue_identifier} - {issue_title}")
+                document = Document(
+                    search_space_id=search_space_id,
+                    title=f"Linear - {issue_identifier}: {issue_title}",
+                    document_type=DocumentType.LINEAR_CONNECTOR,
+                    document_metadata={
+                        "issue_id": issue_id,
+                        "issue_identifier": issue_identifier,
+                        "issue_title": issue_title,
+                        "state": state,
+                        "comment_count": comment_count,
+                        "indexed_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+                    },
+                    content=summary_content,
+                    content_hash=content_hash,
+                    embedding=summary_embedding,
+                    chunks=chunks
+                )
+
+                session.add(document)
+                documents_indexed += 1
+                logger.info(f"Successfully indexed new issue {issue_identifier} - {issue_title}")

             except Exception as e:
                 logger.error(f"Error processing issue {issue.get('identifier', 'Unknown')}: {str(e)}", exc_info=True)
@@ -1052,7 +891,7 @@ async def index_linear_issues(
                 continue # Skip this issue and continue with others

         # Update the last_indexed_at timestamp for the connector only if requested
-        total_processed = documents_indexed + documents_updated
+        total_processed = documents_indexed
         if update_last_indexed:
             connector.last_indexed_at = datetime.now()
             logger.info(f"Updated last_indexed_at to {connector.last_indexed_at}")
@@ -1062,7 +901,7 @@ async def index_linear_issues(
         logger.info(f"Successfully committed all Linear document changes to database")

-        logger.info(f"Linear indexing completed: {documents_indexed} new issues, {documents_updated} updated, {documents_skipped} skipped")
+        logger.info(f"Linear indexing completed: {documents_indexed} new issues, {documents_skipped} skipped")
|
logger.info(f"Linear indexing completed: {documents_indexed} new issues, {documents_skipped} skipped")
|
||||||
return total_processed, None # Return None as the error message to indicate success
|
return total_processed, None # Return None as the error message to indicate success
|
||||||
|
|
||||||
except SQLAlchemyError as db_error:
|
except SQLAlchemyError as db_error:
|
||||||
|
|
|
||||||
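The Linear connector above now stores a `content_hash` on each new `Document` and counts hash matches as skips instead of re-writing existing rows. A minimal sketch of that dedup-by-hash idea (the `summary` field and function names here are illustrative, not the repo's actual code):

```python
import hashlib


def generate_content_hash(content: str) -> str:
    # SHA-256 hex digest of the UTF-8 encoded content
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_indexing(issues, known_hashes):
    """Return (issue, hash) pairs that still need indexing, skipping duplicates by content hash."""
    to_index, skipped = [], 0
    seen = set(known_hashes)
    for issue in issues:
        h = generate_content_hash(issue["summary"])
        if h in seen:
            skipped += 1  # same content already indexed; nothing to write
            continue
        seen.add(h)
        to_index.append((issue, h))
    return to_index, skipped
```

Because the hash is computed from the rendered content, an unchanged issue re-fetched on the next indexing run produces the same digest and is skipped without a database write.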
surfsense_backend/app/tasks/podcast_tasks.py (new file, 93 lines)

@@ -0,0 +1,93 @@
from app.agents.podcaster.graph import graph as podcaster_graph
from app.agents.podcaster.state import State
from app.db import Chat, Podcast
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession


async def generate_document_podcast(
    session: AsyncSession,
    document_id: int,
    search_space_id: int,
    user_id: int
):
    # TODO: Need to fetch the document chunks, then concatenate them and pass them to the podcast generation model
    pass


async def generate_chat_podcast(
    session: AsyncSession,
    chat_id: int,
    search_space_id: int,
    podcast_title: str
):
    # Fetch the chat with the specified ID
    query = select(Chat).filter(
        Chat.id == chat_id,
        Chat.search_space_id == search_space_id
    )

    result = await session.execute(query)
    chat = result.scalars().first()

    if not chat:
        raise ValueError(f"Chat with id {chat_id} not found in search space {search_space_id}")

    # Create chat history structure
    chat_history_str = "<chat_history>"

    for message in chat.messages:
        if message["role"] == "user":
            chat_history_str += f"<user_message>{message['content']}</user_message>"
        elif message["role"] == "assistant":
            # Last annotation type will always be "ANSWER" here
            answer_annotation = message["annotations"][-1]
            answer_text = ""
            if answer_annotation["type"] == "ANSWER":
                answer_text = answer_annotation["content"]
                # If content is a list, join it into a single string
                if isinstance(answer_text, list):
                    answer_text = "\n".join(answer_text)
            chat_history_str += f"<assistant_message>{answer_text}</assistant_message>"

    chat_history_str += "</chat_history>"

    # Pass it to the SurfSense Podcaster
    config = {
        "configurable": {
            "podcast_title": "Surfsense",
        }
    }
    # Initialize state with database session and streaming service
    initial_state = State(
        source_content=chat_history_str,
    )

    # Run the graph directly
    result = await podcaster_graph.ainvoke(initial_state, config=config)

    # Convert podcast transcript entries to serializable format
    serializable_transcript = []
    for entry in result["podcast_transcript"]:
        serializable_transcript.append({
            "speaker_id": entry.speaker_id,
            "dialog": entry.dialog
        })

    # Create a new podcast entry
    podcast = Podcast(
        title=f"{podcast_title}",
        podcast_transcript=serializable_transcript,
        file_location=result["final_podcast_file_path"],
        search_space_id=search_space_id
    )

    # Add to session and commit
    session.add(podcast)
    await session.commit()
    await session.refresh(podcast)

    return podcast
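`generate_chat_podcast` serializes the stored chat into an XML-ish `<chat_history>` string before handing it to the podcaster graph. A standalone sketch of just that serialization step (same message shape as the diff; the helper name is illustrative):

```python
def chat_messages_to_history_str(messages):
    """Serialize chat messages into the <chat_history> string fed to the podcaster graph."""
    parts = ["<chat_history>"]
    for message in messages:
        if message["role"] == "user":
            parts.append(f"<user_message>{message['content']}</user_message>")
        elif message["role"] == "assistant":
            # The last annotation carries the final answer
            answer = message["annotations"][-1]
            text = answer["content"] if answer["type"] == "ANSWER" else ""
            if isinstance(text, list):
                # A list of answer fragments is joined into one string
                text = "\n".join(text)
            parts.append(f"<assistant_message>{text}</assistant_message>")
    parts.append("</chat_history>")
    return "".join(parts)
```

Flattening the structured messages into one tagged string lets the downstream LLM see the whole conversation as a single `source_content` input.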
@@ -1,4 +1,4 @@
-from typing import AsyncGenerator, List, Union
+from typing import Any, AsyncGenerator, List, Union
 from uuid import UUID

 from app.agents.researcher.graph import graph as researcher_graph

@@ -6,6 +6,8 @@ from app.agents.researcher.state import State
 from app.utils.streaming_service import StreamingService
 from sqlalchemy.ext.asyncio import AsyncSession

+from app.agents.researcher.configuration import SearchMode
+

 async def stream_connector_search_results(
     user_query: str,

@@ -13,7 +15,9 @@ async def stream_connector_search_results(
     search_space_id: int,
     session: AsyncSession,
     research_mode: str,
-    selected_connectors: List[str]
+    selected_connectors: List[str],
+    langchain_chat_history: List[Any],
+    search_mode_str: str
 ) -> AsyncGenerator[str, None]:
     """
     Stream connector search results to the client

@@ -40,6 +44,11 @@ async def stream_connector_search_results(
     # Convert UUID to string if needed
     user_id_str = str(user_id) if isinstance(user_id, UUID) else user_id

+    if search_mode_str == "CHUNKS":
+        search_mode = SearchMode.CHUNKS
+    elif search_mode_str == "DOCUMENTS":
+        search_mode = SearchMode.DOCUMENTS
+
     # Sample configuration
     config = {
         "configurable": {

@@ -47,13 +56,15 @@ async def stream_connector_search_results(
             "num_sections": NUM_SECTIONS,
             "connectors_to_search": selected_connectors,
             "user_id": user_id_str,
-            "search_space_id": search_space_id
+            "search_space_id": search_space_id,
+            "search_mode": search_mode
         }
     }
     # Initialize state with database session and streaming service
     initial_state = State(
         db_session=session,
-        streaming_service=streaming_service
+        streaming_service=streaming_service,
+        chat_history=langchain_chat_history
     )

     # Run the graph directly
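The added `if`/`elif` maps the request string onto the `SearchMode` enum, but leaves `search_mode` unbound for any other value. One way to harden that mapping is a value-based lookup with an explicit default; the enum below is a local stand-in (the real one lives in `app.agents.researcher.configuration`, whose member values I am assuming here):

```python
from enum import Enum


class SearchMode(Enum):  # stand-in for app.agents.researcher.configuration.SearchMode
    CHUNKS = "CHUNKS"
    DOCUMENTS = "DOCUMENTS"


def resolve_search_mode(search_mode_str: str) -> SearchMode:
    """Map the request string to a SearchMode, defaulting to CHUNKS for unknown values."""
    try:
        # Enum lookup by value raises ValueError for unrecognized strings
        return SearchMode(search_mode_str)
    except ValueError:
        return SearchMode.CHUNKS
```

With this shape, an unexpected `search_mode_str` from the client degrades to a sane default instead of an `UnboundLocalError` further down.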
@@ -10,8 +10,8 @@ from fastapi_users.authentication import (
     JWTStrategy,
 )
 from fastapi_users.db import SQLAlchemyUserDatabase
-from httpx_oauth.clients.google import GoogleOAuth2
+from fastapi.responses import JSONResponse
+from fastapi_users.schemas import model_dump
 from app.config import config
 from app.db import User, get_user_db
 from pydantic import BaseModel

@@ -22,10 +22,13 @@ class BearerResponse(BaseModel):

 SECRET = config.SECRET_KEY

-google_oauth_client = GoogleOAuth2(
-    config.GOOGLE_OAUTH_CLIENT_ID,
-    config.GOOGLE_OAUTH_CLIENT_SECRET,
-)
+if config.AUTH_TYPE == "GOOGLE":
+    from httpx_oauth.clients.google import GoogleOAuth2
+
+    google_oauth_client = GoogleOAuth2(
+        config.GOOGLE_OAUTH_CLIENT_ID,
+        config.GOOGLE_OAUTH_CLIENT_SECRET,
+    )


 class UserManager(UUIDIDMixin, BaseUserManager[User, uuid.UUID]):

@@ -79,7 +82,10 @@ class CustomBearerTransport(BearerTransport):
     async def get_login_response(self, token: str) -> Response:
         bearer_response = BearerResponse(access_token=token, token_type="bearer")
         redirect_url = f"{config.NEXT_FRONTEND_URL}/auth/callback?token={bearer_response.access_token}"
-        return RedirectResponse(redirect_url, status_code=302)
+        if config.AUTH_TYPE == "GOOGLE":
+            return RedirectResponse(redirect_url, status_code=302)
+        else:
+            return JSONResponse(model_dump(bearer_response))

 bearer_transport = CustomBearerTransport(tokenUrl="auth/jwt/login")
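The auth change above branches the login response on `config.AUTH_TYPE`: Google OAuth logins get a 302 redirect back to the frontend callback, anything else gets the bearer token as plain JSON. A framework-free sketch of that branching (names and the tuple return shape are illustrative only):

```python
def login_response(auth_type: str, token: str, frontend_url: str):
    """Pick the login response shape by auth type: OAuth redirect vs. plain JSON bearer payload."""
    payload = {"access_token": token, "token_type": "bearer"}
    if auth_type == "GOOGLE":
        # OAuth flow: hand the token back to the frontend via its callback URL
        return ("redirect", f"{frontend_url}/auth/callback?token={token}")
    # Local/credential flow: the caller already speaks JSON, so return the payload directly
    return ("json", payload)
```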
File diff suppressed because it is too large.
@@ -1,19 +1,22 @@
+import hashlib
+
+
 async def convert_element_to_markdown(element) -> str:
     """
     Convert an Unstructured element to markdown format based on its category.

     Args:
         element: The Unstructured API element object

     Returns:
         str: Markdown formatted string
     """
     element_category = element.metadata["category"]
     content = element.page_content

     if not content:
         return ""

     markdown_mapping = {
         "Formula": lambda x: f"```math\n{x}\n```",
         "FigureCaption": lambda x: f"*Figure: {x}*",

@@ -31,7 +34,7 @@ async def convert_element_to_markdown(element) -> str:
         "PageNumber": lambda x: f"*Page {x}*\n\n",
         "UncategorizedText": lambda x: f"{x}\n\n"
     }

     converter = markdown_mapping.get(element_category, lambda x: x)
     return converter(content)

@@ -39,29 +42,30 @@ async def convert_element_to_markdown(element) -> str:
 async def convert_document_to_markdown(elements):
     """
     Convert all document elements to markdown.

     Args:
         elements: List of Unstructured API elements

     Returns:
         str: Complete markdown document
     """
     markdown_parts = []

     for element in elements:
         markdown_text = await convert_element_to_markdown(element)
         if markdown_text:
             markdown_parts.append(markdown_text)

     return "".join(markdown_parts)


 def convert_chunks_to_langchain_documents(chunks):
     """
     Convert chunks from hybrid search results to LangChain Document objects.

     Args:
         chunks: List of chunk dictionaries from hybrid search results

     Returns:
         List of LangChain Document objects
     """

@@ -71,20 +75,20 @@ def convert_chunks_to_langchain_documents(chunks):
         raise ImportError(
             "LangChain is not installed. Please install it with `pip install langchain langchain-core`"
         )

     langchain_docs = []

     for chunk in chunks:
         # Extract content from the chunk
         content = chunk.get("content", "")

         # Create metadata dictionary
         metadata = {
             "chunk_id": chunk.get("chunk_id"),
             "score": chunk.get("score"),
             "rank": chunk.get("rank") if "rank" in chunk else None,
         }

         # Add document information to metadata
         if "document" in chunk:
             doc = chunk["document"]

@@ -93,24 +97,25 @@ def convert_chunks_to_langchain_documents(chunks):
                 "document_title": doc.get("title"),
                 "document_type": doc.get("document_type"),
             })

             # Add document metadata if available
             if "metadata" in doc:
                 # Prefix document metadata keys to avoid conflicts
-                doc_metadata = {f"doc_meta_{k}": v for k, v in doc.get("metadata", {}).items()}
+                doc_metadata = {f"doc_meta_{k}": v for k,
+                                v in doc.get("metadata", {}).items()}
                 metadata.update(doc_metadata)

             # Add source URL if available in metadata
             if "url" in doc.get("metadata", {}):
                 metadata["source"] = doc["metadata"]["url"]
             elif "sourceURL" in doc.get("metadata", {}):
                 metadata["source"] = doc["metadata"]["sourceURL"]

         # Ensure source_id is set for citation purposes
         # Use document_id as the source_id if available
         if "document_id" in metadata:
             metadata["source_id"] = metadata["document_id"]

         # Update content for citation mode - format as XML with explicit source_id
         new_content = f"""
 <document>

@@ -124,13 +129,18 @@ def convert_chunks_to_langchain_documents(chunks):
 </content>
 </document>
 """

         # Create LangChain Document
         langchain_doc = LangChainDocument(
             page_content=new_content,
             metadata=metadata
         )

         langchain_docs.append(langchain_doc)

     return langchain_docs
+
+
+def generate_content_hash(content: str) -> str:
+    """Generate SHA-256 hash for the given content."""
+    return hashlib.sha256(content.encode('utf-8')).hexdigest()
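The `doc_meta_` prefixing above keeps document-level metadata keys from colliding with chunk-level keys like `score` or `chunk_id` when they are merged into one dict. Isolated as a helper (the function name is illustrative; the dict comprehension matches the diff):

```python
def prefix_doc_metadata(doc: dict) -> dict:
    """Prefix document metadata keys with doc_meta_ so they cannot shadow chunk-level keys."""
    # doc.get("metadata", {}) tolerates documents that carry no metadata at all
    return {f"doc_meta_{k}": v for k, v in doc.get("metadata", {}).items()}
```

Namespacing one side of a merge this way is cheaper than detecting collisions after the fact, at the cost of consumers needing to know the prefix.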
@@ -1,8 +1,8 @@
-"""
-NOTE: This is not used anymore. Might be removed in the future.
-"""
-from langchain.schema import HumanMessage, SystemMessage
+import datetime
+from langchain.schema import HumanMessage, SystemMessage, AIMessage
 from app.config import config
+from typing import Any, List, Optional


 class QueryService:
     """

@@ -10,72 +10,91 @@ class QueryService:
     """

     @staticmethod
-    async def reformulate_query(user_query: str) -> str:
+    async def reformulate_query_with_chat_history(user_query: str, chat_history_str: Optional[str] = None) -> str:
         """
         Reformulate the user query using the STRATEGIC_LLM to make it more
         effective for information retrieval and research purposes.

         Args:
             user_query: The original user query
+            chat_history: Optional list of previous chat messages

         Returns:
             str: The reformulated query
         """
         if not user_query or not user_query.strip():
             return user_query

         try:
             # Get the strategic LLM instance from config
             llm = config.strategic_llm_instance

             # Create system message with instructions
             system_message = SystemMessage(
-                content="""
-You are an expert at reformulating user queries to optimize information retrieval.
-Your job is to take a user query and reformulate it to:
-
-1. Make it more specific and detailed
-2. Expand ambiguous terms
-3. Include relevant synonyms and alternative phrasings
-4. Break down complex questions into their core components
-5. Ensure it's comprehensive for research purposes
-
-The query will be used with the following data sources/connectors:
-- SERPER_API: Web search for retrieving current information from the internet
-- TAVILY_API: Research-focused search API for comprehensive information
-- SLACK_CONNECTOR: Retrieves information from indexed Slack workspace conversations
-- NOTION_CONNECTOR: Retrieves information from indexed Notion documents and databases
-- FILE: Searches through user's uploaded files
-- CRAWLED_URL: Searches through previously crawled web pages
-
-IMPORTANT: Keep the reformulated query as concise as possible while still being effective.
-Avoid unnecessary verbosity and limit the query to only essential terms and concepts.
-
-Please optimize the query to work effectively across these different data sources.
-
-Return ONLY the reformulated query without explanations, prefixes, or commentary.
-Do not include phrases like "Reformulated query:" or any other text except the query itself.
-"""
+                content=f"""
+Today's date: {datetime.datetime.now().strftime("%Y-%m-%d")}
+You are a highly skilled AI assistant specializing in query optimization for advanced research.
+Your primary objective is to transform a user's initial query into a highly effective search query.
+This reformulated query will be used to retrieve information from diverse data sources.
+
+**Chat History Context:**
+{chat_history_str if chat_history_str else "No prior conversation history is available."}
+If chat history is provided, analyze it to understand the user's evolving information needs and the broader context of their request. Use this understanding to refine the current query, ensuring it builds upon or clarifies previous interactions.
+
+**Query Reformulation Guidelines:**
+Your reformulated query should:
+1. **Enhance Specificity and Detail:** Add precision to narrow the search focus effectively, making the query less ambiguous and more targeted.
+2. **Resolve Ambiguities:** Identify and clarify vague terms or phrases. If a term has multiple meanings, orient the query towards the most likely one given the context.
+3. **Expand Key Concepts:** Incorporate relevant synonyms, related terms, and alternative phrasings for core concepts. This helps capture a wider range of relevant documents.
+4. **Deconstruct Complex Questions:** If the original query is multifaceted, break it down into its core searchable components or rephrase it to address each aspect clearly. The final output must still be a single, coherent query string.
+5. **Optimize for Comprehensiveness:** Ensure the query is structured to uncover all essential facets of the original request, aiming for thorough information retrieval suitable for research.
+6. **Maintain User Intent:** The reformulated query must stay true to the original intent of the user's query. Do not introduce new topics or shift the focus significantly.
+
+**Crucial Constraints:**
+* **Conciseness and Effectiveness:** While aiming for comprehensiveness, the reformulated query MUST be as concise as possible. Eliminate all unnecessary verbosity. Focus on essential keywords, entities, and concepts that directly contribute to effective retrieval.
+* **Single, Direct Output:** Return ONLY the reformulated query itself. Do NOT include any explanations, introductory phrases (e.g., "Reformulated query:", "Here is the optimized query:"), or any other surrounding text or markdown formatting.
+
+Your output should be a single, optimized query string, ready for immediate use in a search system.
+"""
             )

             # Create human message with the user query
             human_message = HumanMessage(
                 content=f"Reformulate this query for better research results: {user_query}"
             )

             # Get the response from the LLM
             response = await llm.agenerate(messages=[[system_message, human_message]])

             # Extract the reformulated query from the response
             reformulated_query = response.generations[0][0].text.strip()

             # Return the original query if the reformulation is empty
             if not reformulated_query:
                 return user_query

             return reformulated_query

         except Exception as e:
             # Log the error and return the original query
             print(f"Error reformulating query: {e}")
             return user_query
+
+    @staticmethod
+    async def langchain_chat_history_to_str(chat_history: List[Any]) -> str:
+        """
+        Convert a list of chat history messages to a string.
+        """
+        chat_history_str = "<chat_history>\n"
+
+        for chat_message in chat_history:
+            if isinstance(chat_message, HumanMessage):
+                chat_history_str += f"<user>{chat_message.content}</user>\n"
+            elif isinstance(chat_message, AIMessage):
+                chat_history_str += f"<assistant>{chat_message.content}</assistant>\n"
+            elif isinstance(chat_message, SystemMessage):
+                chat_history_str += f"<system>{chat_message.content}</system>\n"
+
+        chat_history_str += "</chat_history>"
+        return chat_history_str
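The new `langchain_chat_history_to_str` helper dispatches on LangChain message classes to tag each turn by role. A synchronous sketch using dataclass stand-ins for the LangChain types (only `content` matters for this serialization):

```python
from dataclasses import dataclass


@dataclass
class HumanMessage:  # stand-in for langchain.schema.HumanMessage
    content: str


@dataclass
class AIMessage:  # stand-in for langchain.schema.AIMessage
    content: str


def chat_history_to_str(history) -> str:
    """Tag each message by role and wrap the lot in <chat_history>, as the new helper does."""
    out = "<chat_history>\n"
    for m in history:
        if isinstance(m, HumanMessage):
            out += f"<user>{m.content}</user>\n"
        elif isinstance(m, AIMessage):
            out += f"<assistant>{m.content}</assistant>\n"
    out += "</chat_history>"
    return out
```

The resulting string is what `reformulate_query_with_chat_history` interpolates into its system prompt's Chat History Context block.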
@@ -1,5 +0,0 @@
-from app.agents.researcher.graph import graph as researcher_graph
-from app.agents.researcher.sub_section_writer.graph import graph as sub_section_writer_graph
-
-print(researcher_graph.get_graph().draw_mermaid())
-print(sub_section_writer_graph.get_graph().draw_mermaid())
@@ -1,13 +1,13 @@
 [project]
 name = "surf-new-backend"
-version = "0.0.6"
+version = "0.0.7"
 description = "SurfSense Backend"
 readme = "README.md"
 requires-python = ">=3.12"
 dependencies = [
     "alembic>=1.13.0",
     "asyncpg>=0.30.0",
-    "chonkie[all]>=0.4.1",
+    "chonkie[all]>=1.0.6",
     "fastapi>=0.115.8",
     "fastapi-users[oauth,sqlalchemy]>=14.0.1",
     "firecrawl-py>=1.12.0",

@@ -15,14 +15,18 @@ dependencies = [
     "langchain-community>=0.3.17",
     "langchain-unstructured>=0.1.6",
     "langgraph>=0.3.29",
+    "linkup-sdk>=0.2.4",
     "litellm>=1.61.4",
+    "llama-cloud-services>=0.6.25",
     "markdownify>=0.14.1",
     "notion-client>=2.3.0",
     "pgvector>=0.3.6",
     "playwright>=1.50.0",
+    "python-ffmpeg>=2.0.12",
     "rerankers[flashrank]>=0.7.1",
     "sentence-transformers>=3.4.1",
     "slack-sdk>=3.34.0",
+    "static-ffmpeg>=2.13",
     "tavily-python>=0.3.2",
     "unstructured-client>=0.30.0",
     "unstructured[all-docs]>=0.16.25",
653
surfsense_backend/uv.lock
generated
653
surfsense_backend/uv.lock
generated
|
|
@ -13,6 +13,24 @@ resolution-markers = [
|
||||||
"(python_full_version < '3.12.4' and platform_machine != 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.12.4' and sys_platform != 'darwin' and sys_platform != 'linux')",
|
"(python_full_version < '3.12.4' and platform_machine != 'aarch64' and sys_platform == 'linux') or (python_full_version < '3.12.4' and sys_platform != 'darwin' and sys_platform != 'linux')",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "accelerate"
|
||||||
|
version = "1.6.0"
|
||||||
|
source = { registry = "https://pypi.org/simple" }
|
||||||
|
dependencies = [
|
||||||
|
{ name = "huggingface-hub" },
|
||||||
|
{ name = "numpy" },
|
||||||
|
{ name = "packaging" },
|
||||||
|
{ name = "psutil" },
|
||||||
|
{ name = "pyyaml" },
|
||||||
|
{ name = "safetensors" },
|
||||||
|
{ name = "torch" },
|
||||||
|
]
|
||||||
|
sdist = { url = "https://files.pythonhosted.org/packages/8a/6e/c29a1dcde7db07f47870ed63e5124086b11874ad52ccd533dc1ca2c799da/accelerate-1.6.0.tar.gz", hash = "sha256:28c1ef1846e690944f98b68dc7b8bb6c51d032d45e85dcbb3adb0c8b99dffb32", size = 363804 }
|
||||||
|
wheels = [
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/63/b1/8198e3cdd11a426b1df2912e3381018c4a4a55368f6d0857ba3ca418ef93/accelerate-1.6.0-py3-none-any.whl", hash = "sha256:1aee717d3d3735ad6d09710a7c26990ee4652b79b4e93df46551551b5227c2aa", size = 354748 },
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "aiofiles"
|
name = "aiofiles"
|
||||||
version = "24.1.0"
|
version = "24.1.0"
|
||||||
|
|
@ -92,6 +110,18 @@ wheels = [
|
||||||
{ url = "https://files.pythonhosted.org/packages/ec/6a/bc7e17a3e87a2985d3e8f4da4cd0f481060eb78fb08596c42be62c90a4d9/aiosignal-1.3.2-py2.py3-none-any.whl", hash = "sha256:45cde58e409a301715980c2b01d0c28bdde3770d8290b5eb2173759d9acb31a5", size = 7597 },
|
{ url = "https://files.pythonhosted.org/packages/ec/6a/bc7e17a3e87a2985d3e8f4da4cd0f481060eb78fb08596c42be62c90a4d9/aiosignal-1.3.2-py2.py3-none-any.whl", hash = "sha256:45cde58e409a301715980c2b01d0c28bdde3770d8290b5eb2173759d9acb31a5", size = 7597 },
|
||||||
]
|
]
|
||||||
|
|
||||||
+[[package]]
+name = "aiosqlite"
+version = "0.21.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/13/7d/8bca2bf9a247c2c5dfeec1d7a5f40db6518f88d314b8bca9da29670d2671/aiosqlite-0.21.0.tar.gz", hash = "sha256:131bb8056daa3bc875608c631c678cda73922a2d4ba8aec373b19f18c17e7aa3", size = 13454 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/f5/10/6c25ed6de94c49f88a91fa5018cb4c0f3625f31d5be9f771ebe5cc7cd506/aiosqlite-0.21.0-py3-none-any.whl", hash = "sha256:2549cf4057f95f53dcba16f2b64e8e2791d7e1adedb13197dd8ed77bb226d7d0", size = 15792 },
+]
+
 [[package]]
 name = "alembic"
 version = "1.15.2"
@@ -201,19 +231,6 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/fc/30/d4986a882011f9df997a55e6becd864812ccfcd821d64aac8570ee39f719/attrs-25.1.0-py3-none-any.whl", hash = "sha256:c75a69e28a550a7e93789579c22aa26b0f5b83b75dc4e08fe092980051e1090a", size = 63152 },
 ]
 
-[[package]]
-name = "autotiktokenizer"
-version = "0.2.2"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-	{ name = "huggingface-hub" },
-	{ name = "tiktoken" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/a6/1a/c6f494750dc67c2e5b06b91ae9565d46adb384f25f61a7136ff79dd02413/autotiktokenizer-0.2.2.tar.gz", hash = "sha256:f0954f14cedfe538b96ba0eed2e39996378c0bdf649fd977d6a047e419e05fdb", size = 15401 }
-wheels = [
-	{ url = "https://files.pythonhosted.org/packages/d8/7b/c34469a1495d755bac1c80fbf3c0c2c29eb03ffe61172d889426025173bd/autotiktokenizer-0.2.2-py3-none-any.whl", hash = "sha256:ebbf15d9d5516fcb3287a8153bd8efbcc932f9c99089b2357255413cf37815d9", size = 8957 },
-]
-
 [[package]]
 name = "backoff"
 version = "2.2.1"
@@ -223,6 +240,22 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/df/73/b6e24bd22e6720ca8ee9a85a0c4a2971af8497d8f3193fa05390cbd46e09/backoff-2.2.1-py3-none-any.whl", hash = "sha256:63579f9a0628e06278f7e47b7d7d5b6ce20dc65c5e96a6f3ca99a6adca0396e8", size = 15148 },
 ]
 
+[[package]]
+name = "banks"
+version = "2.1.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "deprecated" },
+	{ name = "griffe" },
+	{ name = "jinja2" },
+	{ name = "platformdirs" },
+	{ name = "pydantic" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/77/34/2b6697f02ffb68bee50e5fd37d6c64432244d3245603fd62950169dfed7e/banks-2.1.2.tar.gz", hash = "sha256:a0651db9d14b57fa2e115e78f68dbb1b36fe226ad6eef96192542908b1d20c1f", size = 173332 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/04/4a/7fdca29d1db62f5f5c3446bf8f668beacdb0b5a8aff4247574ddfddc6bcd/banks-2.1.2-py3-none-any.whl", hash = "sha256:7fba451069f6bea376483b8136a0f29cb1e6883133626d00e077e20a3d102c0e", size = 28064 },
+]
+
 [[package]]
 name = "bcrypt"
 version = "4.2.1"
@@ -363,23 +396,36 @@ wheels = [
 
 [[package]]
 name = "chonkie"
-version = "0.4.1"
+version = "1.0.6"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-	{ name = "autotiktokenizer" },
+	{ name = "tokenizers" },
 	{ name = "tqdm" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/2e/94/4a1bc8bdf06e7327bb256abb85767647125286c9bbc7cbcd77a550b96d63/chonkie-0.4.1.tar.gz", hash = "sha256:164216efa01af02e750e7cb218cea87918a18f83ebbd8f020b25557f1ed36aa9", size = 43284 }
+sdist = { url = "https://files.pythonhosted.org/packages/5a/db/16d5d23a216db734bcb68e61c466ff48a55dc0d2cdc7ecdd73aaea1f6f7d/chonkie-1.0.6.tar.gz", hash = "sha256:feefad3cbbb62b4a55f4c6409bd8d8f0ee180d8319c4d32e31539a768955b3b0", size = 70056 }
 wheels = [
-	{ url = "https://files.pythonhosted.org/packages/c0/b5/c0d77500a413794773edb630bdc7061121c237a4eaf6ce222226c200d603/chonkie-0.4.1-py3-none-any.whl", hash = "sha256:af7d95d17f4ed60a26e32f0bad60f807287e3301189114755d727657ed2ef964", size = 51193 },
+	{ url = "https://files.pythonhosted.org/packages/bc/46/d6d9789eb6e61bfa073a13fd2b5cbbcf022a7781adbb060a25d82f16437e/chonkie-1.0.6-py3-none-any.whl", hash = "sha256:d8cfcf665cb6a64ac6ca87da61207372a88b9e5a7bb697faade78069c853e4b1", size = 89526 },
 ]
 
 [package.optional-dependencies]
 all = [
+	{ name = "accelerate" },
+	{ name = "cohere" },
+	{ name = "google-genai" },
+	{ name = "huggingface-hub" },
+	{ name = "jsonschema" },
+	{ name = "magika" },
 	{ name = "model2vec" },
 	{ name = "numpy" },
 	{ name = "openai" },
+	{ name = "pydantic" },
+	{ name = "rich" },
 	{ name = "sentence-transformers" },
+	{ name = "tiktoken" },
+	{ name = "torch" },
+	{ name = "transformers" },
+	{ name = "tree-sitter" },
+	{ name = "tree-sitter-language-pack" },
 ]
 
 [[package]]
@@ -394,6 +440,26 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/7e/d4/7ebdbd03970677812aac39c869717059dbb71a4cfc033ca6e5221787892c/click-8.1.8-py3-none-any.whl", hash = "sha256:63c132bbbed01578a06712a2d1f497bb62d9c1c0d329b7903a866228027263b2", size = 98188 },
 ]
 
+[[package]]
+name = "cohere"
+version = "5.15.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "fastavro" },
+	{ name = "httpx" },
+	{ name = "httpx-sse" },
+	{ name = "pydantic" },
+	{ name = "pydantic-core" },
+	{ name = "requests" },
+	{ name = "tokenizers" },
+	{ name = "types-requests" },
+	{ name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a1/33/69c7d1b25a20eafef4197a1444c7f87d5241e936194e54876ea8996157e6/cohere-5.15.0.tar.gz", hash = "sha256:e802d4718ddb0bb655654382ebbce002756a3800faac30296cde7f1bdc6ff2cc", size = 135021 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/c7/87/94694db7fe6df979fbc03286eaabdfa98f1c8fa532960e5afdf965e10960/cohere-5.15.0-py3-none-any.whl", hash = "sha256:22ff867c2a6f2fc2b585360c6072f584f11f275ef6d9242bac24e0fa2df1dfb5", size = 259522 },
+]
+
 [[package]]
 name = "colorama"
 version = "0.4.6"
@@ -534,6 +600,15 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/6e/c6/ac0b6c1e2d138f1002bcf799d330bd6d85084fece321e662a14223794041/Deprecated-1.2.18-py2.py3-none-any.whl", hash = "sha256:bd5011788200372a32418f888e326a09ff80d0214bd961147cfed01b5c018eec", size = 9998 },
 ]
 
+[[package]]
+name = "dirtyjson"
+version = "1.0.8"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/db/04/d24f6e645ad82ba0ef092fa17d9ef7a21953781663648a01c9371d9e8e98/dirtyjson-1.0.8.tar.gz", hash = "sha256:90ca4a18f3ff30ce849d100dcf4a003953c79d3a2348ef056f1d9c22231a25fd", size = 30782 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/68/69/1bcf70f81de1b4a9f21b3a62ec0c83bdff991c88d6cc2267d02408457e88/dirtyjson-1.0.8-py3-none-any.whl", hash = "sha256:125e27248435a58acace26d5c2c4c11a1c0de0a9c5124c5a94ba78e517d74f53", size = 25197 },
+]
+
 [[package]]
 name = "distro"
 version = "1.9.0"
@@ -552,6 +627,15 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/68/1b/e0a87d256e40e8c888847551b20a017a6b98139178505dc7ffb96f04e954/dnspython-2.7.0-py3-none-any.whl", hash = "sha256:b4c34b7d10b51bcc3a5071e7b8dee77939f1e878477eeecc965e9835f63c6c86", size = 313632 },
 ]
 
+[[package]]
+name = "docutils"
+version = "0.21.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/ae/ed/aefcc8cd0ba62a0560c3c18c33925362d46c6075480bfa4df87b28e169a9/docutils-0.21.2.tar.gz", hash = "sha256:3a6b18732edf182daa3cd12775bbb338cf5691468f91eeeb109deff6ebfa986f", size = 2204444 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/8f/d7/9322c609343d929e75e7e5e6255e614fcc67572cfd083959cdef3b7aad79/docutils-0.21.2-py3-none-any.whl", hash = "sha256:dafca5b9e384f0e419294eb4d2ff9fa826435bf15f15b7bd45723e8ad76811b2", size = 587408 },
+]
+
 [[package]]
 name = "effdet"
 version = "0.4.1"
@@ -660,6 +744,26 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/a6/08/9968963c1fb8c34627b7f1fbcdfe9438540f87dc7c9bfb59bb4fd19a4ecf/fastapi_users_db_sqlalchemy-7.0.0-py3-none-any.whl", hash = "sha256:5fceac018e7cfa69efc70834dd3035b3de7988eb4274154a0dbe8b14f5aa001e", size = 6891 },
 ]
 
+[[package]]
+name = "fastavro"
+version = "1.10.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f3/67/7121d2221e998706cac00fa779ec44c1c943cb65e8a7ed1bd57d78d93f2c/fastavro-1.10.0.tar.gz", hash = "sha256:47bf41ac6d52cdfe4a3da88c75a802321321b37b663a900d12765101a5d6886f", size = 987970 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/9c/a4/8e69c0a5cd121e5d476237de1bde5a7947f791ae45768ae52ed0d3ea8d18/fastavro-1.10.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:cfe57cb0d72f304bd0dcc5a3208ca6a7363a9ae76f3073307d095c9d053b29d4", size = 1036343 },
+	{ url = "https://files.pythonhosted.org/packages/1e/01/aa219e2b33e5873d27b867ec0fad9f35f23d461114e1135a7e46c06786d2/fastavro-1.10.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:74e517440c824cb65fb29d3e3903a9406f4d7c75490cef47e55c4c82cdc66270", size = 3263368 },
+	{ url = "https://files.pythonhosted.org/packages/a7/ba/1766e2d7d95df2e95e9e9a089dc7a537c0616720b053a111a918fa7ee6b6/fastavro-1.10.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:203c17d44cadde76e8eecb30f2d1b4f33eb478877552d71f049265dc6f2ecd10", size = 3328933 },
+	{ url = "https://files.pythonhosted.org/packages/2e/40/26e56696b9696ab4fbba25a96b8037ca3f9fd8a8cc55b4b36400ef023e49/fastavro-1.10.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:6575be7f2b5f94023b5a4e766b0251924945ad55e9a96672dc523656d17fe251", size = 3258045 },
+	{ url = "https://files.pythonhosted.org/packages/4e/bc/2f6c92c06c5363372abe828bccdd95762f2c1983b261509f94189c38c8a1/fastavro-1.10.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fe471deb675ed2f01ee2aac958fbf8ebb13ea00fa4ce7f87e57710a0bc592208", size = 3418001 },
+	{ url = "https://files.pythonhosted.org/packages/0c/ce/cfd16546c04ebbca1be80873b533c788cec76f7bfac231bfac6786047572/fastavro-1.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:567ff515f2a5d26d9674b31c95477f3e6022ec206124c62169bc2ffaf0889089", size = 487855 },
+	{ url = "https://files.pythonhosted.org/packages/c9/c4/163cf154cc694c2dccc70cd6796db6214ac668a1260bf0310401dad188dc/fastavro-1.10.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:82263af0adfddb39c85f9517d736e1e940fe506dfcc35bc9ab9f85e0fa9236d8", size = 1022741 },
+	{ url = "https://files.pythonhosted.org/packages/38/01/a24598f5f31b8582a92fe9c41bf91caeed50d5b5eaa7576e6f8b23cb488d/fastavro-1.10.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:566c193109ff0ff84f1072a165b7106c4f96050078a4e6ac7391f81ca1ef3efa", size = 3237421 },
+	{ url = "https://files.pythonhosted.org/packages/a7/bf/08bcf65cfb7feb0e5b1329fafeb4a9b95b7b5ec723ba58c7dbd0d04ded34/fastavro-1.10.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e400d2e55d068404d9fea7c5021f8b999c6f9d9afa1d1f3652ec92c105ffcbdd", size = 3300222 },
+	{ url = "https://files.pythonhosted.org/packages/53/4d/a6c25f3166328f8306ec2e6be1123ed78a55b8ab774a43a661124508881f/fastavro-1.10.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9b8227497f71565270f9249fc9af32a93644ca683a0167cfe66d203845c3a038", size = 3233276 },
+	{ url = "https://files.pythonhosted.org/packages/47/1c/b2b2ce2bf866a248ae23e96a87b3b8369427ff79be9112073039bee1d245/fastavro-1.10.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:8e62d04c65461b30ac6d314e4197ad666371e97ae8cb2c16f971d802f6c7f514", size = 3388936 },
+	{ url = "https://files.pythonhosted.org/packages/1f/2c/43927e22a2d57587b3aa09765098a6d833246b672d34c10c5f135414745a/fastavro-1.10.0-cp313-cp313-win_amd64.whl", hash = "sha256:86baf8c9740ab570d0d4d18517da71626fe9be4d1142bea684db52bd5adb078f", size = 483967 },
+]
+
 [[package]]
 name = "filelock"
 version = "3.17.0"
@@ -858,6 +962,24 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/94/b6/60f2910485d32f7bba92cc33e5053b3f29d61fccaa57e5e58c600bb7e0d2/google_cloud_vision-3.10.1-py3-none-any.whl", hash = "sha256:91959ea12b0d6a8442e30c0a5062cd305f349a4840f9184b5061b3153bbd8476", size = 526076 },
 ]
 
+[[package]]
+name = "google-genai"
+version = "1.12.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "anyio" },
+	{ name = "google-auth" },
+	{ name = "httpx" },
+	{ name = "pydantic" },
+	{ name = "requests" },
+	{ name = "typing-extensions" },
+	{ name = "websockets" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/38/9c/c907dbea921663bb7c41f415337bedd08259d17da8d156396c7237611744/google_genai-1.12.1.tar.gz", hash = "sha256:5c7eda422360643ce602a3f6b23152470ec1039310ef40080cbe4e71237f6391", size = 167752 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/29/2c/5b454dec837328eb167e78f45a14da502af223f8b94a4824e2fd0df74f19/google_genai-1.12.1-py3-none-any.whl", hash = "sha256:7cbc1bc029712946ce41bcf80c0eaa89eb8c09c308efbbfe30fd491f402c258a", size = 165940 },
+]
+
 [[package]]
 name = "googleapis-common-protos"
 version = "1.69.2"
@@ -903,6 +1025,18 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/ac/38/08cc303ddddc4b3d7c628c3039a61a3aae36c241ed01393d00c2fd663473/greenlet-3.1.1-cp313-cp313t-musllinux_1_1_x86_64.whl", hash = "sha256:411f015496fec93c1c8cd4e5238da364e1da7a124bcb293f085bf2860c32c6f6", size = 1142112 },
 ]
 
+[[package]]
+name = "griffe"
+version = "1.7.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "colorama" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a9/3e/5aa9a61f7c3c47b0b52a1d930302992229d191bf4bc76447b324b731510a/griffe-1.7.3.tar.gz", hash = "sha256:52ee893c6a3a968b639ace8015bec9d36594961e156e23315c8e8e51401fa50b", size = 395137 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/58/c6/5c20af38c2a57c15d87f7f38bee77d63c1d2a3689f74fefaf35915dd12b2/griffe-1.7.3-py3-none-any.whl", hash = "sha256:c6b3ee30c2f0f17f30bcdef5068d6ab7a2a4f1b8bf1a3e74b56fffd21e1c5f75", size = 129303 },
+]
+
 [[package]]
 name = "grpcio"
 version = "1.71.0"
@@ -1068,6 +1202,18 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/f0/0f/310fb31e39e2d734ccaa2c0fb981ee41f7bd5056ce9bc29b2248bd569169/humanfriendly-10.0-py2.py3-none-any.whl", hash = "sha256:1697e1a8a8f550fd43c2865cd84542fc175a61dcb779b6fee18cf6b6ccba1477", size = 86794 },
 ]
 
+[[package]]
+name = "id"
+version = "1.5.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "requests" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/22/11/102da08f88412d875fa2f1a9a469ff7ad4c874b0ca6fed0048fe385bdb3d/id-1.5.0.tar.gz", hash = "sha256:292cb8a49eacbbdbce97244f47a97b4c62540169c976552e497fd57df0734c1d", size = 15237 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/9f/cb/18326d2d89ad3b0dd143da971e77afd1e6ca6674f1b1c3df4b6bec6279fc/id-1.5.0-py3-none-any.whl", hash = "sha256:f1434e1cef91f2cbb8a4ec64663d5a23b9ed43ef44c4c957d02583d61714c658", size = 13611 },
+]
+
 [[package]]
 name = "idna"
 version = "3.10"
@@ -1089,6 +1235,48 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/79/9d/0fb148dc4d6fa4a7dd1d8378168d9b4cd8d4560a6fbf6f0121c5fc34eb68/importlib_metadata-8.6.1-py3-none-any.whl", hash = "sha256:02a89390c1e15fdfdc0d7c6b25cb3e62650d0494005c97d6f148bf5b9787525e", size = 26971 },
 ]
 
+[[package]]
+name = "jaraco-classes"
+version = "3.4.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "more-itertools" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/06/c0/ed4a27bc5571b99e3cff68f8a9fa5b56ff7df1c2251cc715a652ddd26402/jaraco.classes-3.4.0.tar.gz", hash = "sha256:47a024b51d0239c0dd8c8540c6c7f484be3b8fcf0b2d85c13825780d3b3f3acd", size = 11780 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/7f/66/b15ce62552d84bbfcec9a4873ab79d993a1dd4edb922cbfccae192bd5b5f/jaraco.classes-3.4.0-py3-none-any.whl", hash = "sha256:f662826b6bed8cace05e7ff873ce0f9283b5c924470fe664fff1c2f00f581790", size = 6777 },
+]
+
+[[package]]
+name = "jaraco-context"
+version = "6.0.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/df/ad/f3777b81bf0b6e7bc7514a1656d3e637b2e8e15fab2ce3235730b3e7a4e6/jaraco_context-6.0.1.tar.gz", hash = "sha256:9bae4ea555cf0b14938dc0aee7c9f32ed303aa20a3b73e7dc80111628792d1b3", size = 13912 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/ff/db/0c52c4cf5e4bd9f5d7135ec7669a3a767af21b3a308e1ed3674881e52b62/jaraco.context-6.0.1-py3-none-any.whl", hash = "sha256:f797fc481b490edb305122c9181830a3a5b76d84ef6d1aef2fb9b47ab956f9e4", size = 6825 },
+]
+
+[[package]]
+name = "jaraco-functools"
+version = "4.1.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "more-itertools" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/ab/23/9894b3df5d0a6eb44611c36aec777823fc2e07740dabbd0b810e19594013/jaraco_functools-4.1.0.tar.gz", hash = "sha256:70f7e0e2ae076498e212562325e805204fc092d7b4c17e0e86c959e249701a9d", size = 19159 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/9f/4f/24b319316142c44283d7540e76c7b5a6dbd5db623abd86bb7b3491c21018/jaraco.functools-4.1.0-py3-none-any.whl", hash = "sha256:ad159f13428bc4acbf5541ad6dec511f91573b90fba04df61dafa2a1231cf649", size = 10187 },
+]
+
+[[package]]
+name = "jeepney"
+version = "0.9.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/7b/6f/357efd7602486741aa73ffc0617fb310a29b588ed0fd69c2399acbb85b0c/jeepney-0.9.0.tar.gz", hash = "sha256:cf0e9e845622b81e4a28df94c40345400256ec608d0e55bb8a3feaa9163f5732", size = 106758 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/b2/a3/e137168c9c44d18eff0376253da9f1e9234d0239e0ee230d2fee6cea8e55/jeepney-0.9.0-py3-none-any.whl", hash = "sha256:97e5714520c16fc0a45695e5365a2e11b81ea79bba796e26f9f1d178cb182683", size = 49010 },
+]
+
 [[package]]
 name = "jinja2"
 version = "3.1.5"
@@ -1193,6 +1381,23 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/d1/0f/8910b19ac0670a0f80ce1008e5e751c4a57e14d2c4c13a482aa6079fa9d6/jsonschema_specifications-2024.10.1-py3-none-any.whl", hash = "sha256:a09a0680616357d9a0ecf05c12ad234479f549239d0f5b55f3deea67475da9bf", size = 18459 },
 ]
 
+[[package]]
+name = "keyring"
+version = "25.6.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "jaraco-classes" },
+	{ name = "jaraco-context" },
+	{ name = "jaraco-functools" },
+	{ name = "jeepney", marker = "sys_platform == 'linux'" },
+	{ name = "pywin32-ctypes", marker = "sys_platform == 'win32'" },
+	{ name = "secretstorage", marker = "sys_platform == 'linux'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/70/09/d904a6e96f76ff214be59e7aa6ef7190008f52a0ab6689760a98de0bf37d/keyring-25.6.0.tar.gz", hash = "sha256:0b39998aa941431eb3d9b0d4b2460bc773b9df6fed7621c2dfb291a7e0187a66", size = 62750 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/d3/32/da7f44bcb1105d3e88a0b74ebdca50c59121d2ddf71c9e34ba47df7f3a56/keyring-25.6.0-py3-none-any.whl", hash = "sha256:552a3f7af126ece7ed5c89753650eec89c7eaae8617d0aa4d9ad2b75111266bd", size = 39085 },
+]
+
 [[package]]
 name = "kiwisolver"
 version = "1.4.8"
@@ -1413,6 +1618,19 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/8b/e4/5380e8229c442e406404977d2ec71a9db6a3e6a89fce7791c6ad7cd2bdbe/langsmith-0.3.8-py3-none-any.whl", hash = "sha256:fbb9dd97b0f090219447fca9362698d07abaeda1da85aa7cc6ec6517b36581b1", size = 332800 },
 ]
 
+[[package]]
+name = "linkup-sdk"
+version = "0.2.4"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "httpx" },
+	{ name = "pydantic" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/c2/c7/d9a85331bf2611ecac67f1ad92a6ced641b2e2e93eea26b17a9af701b3d1/linkup_sdk-0.2.4.tar.gz", hash = "sha256:2b8fd1894b9b4715bc14aabcbf53df6def9024f2cc426f234cc59e1807ec4c12", size = 9392 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/18/d8/bb9e01328fe5ad979e3e459c0f76321d295663906deef56eeaa5ce0cf269/linkup_sdk-0.2.4-py3-none-any.whl", hash = "sha256:8bc4c4f34de93529136a14e42441d803868d681c2bf3fd59be51923e44f1f1d4", size = 8325 },
+]
+
 [[package]]
 name = "litellm"
 version = "1.61.4"
@@ -1435,6 +1653,72 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/f9/c2/1b6c502909b7af9054736af61e27558a3341e8c1ba28e7f82473e6dd936f/litellm-1.61.4-py3-none-any.whl", hash = "sha256:e87e0d397a191795b4217f9299fc9b21eaacaab91409695f0a4780cceccda6e1", size = 6814517 },
 ]
 
+[[package]]
+name = "llama-cloud"
+version = "0.1.23"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "certifi" },
+	{ name = "httpx" },
+	{ name = "pydantic" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/5b/e4/d1a30167ed6690a408382be1cf7de220a506085f4371baaf067d65bad8fd/llama_cloud-0.1.23.tar.gz", hash = "sha256:3d84a24a860f046d39a106c06742ec0ea39a574ac42bbf91706fe025f44e233e", size = 101292 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/8a/15/3b56acef877dbc5d01d7e1a782c2cc50ef8a08d5773121c3bc20546de582/llama_cloud-0.1.23-py3-none-any.whl", hash = "sha256:ce95b0705d85c99b3b27b0af0d16a17d9a81b14c96bf13c1063a1bd13d8d0446", size = 267343 },
+]
+
+[[package]]
+name = "llama-cloud-services"
+version = "0.6.25"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "click" },
+	{ name = "llama-cloud" },
+	{ name = "llama-index-core" },
+	{ name = "platformdirs" },
+	{ name = "pydantic" },
+	{ name = "python-dotenv" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/79/c0/89f89dfc2c2b6c2d5c1c5fde9f445696eb12f9c2a4e17637ab0aaf7cc373/llama_cloud_services-0.6.25.tar.gz", hash = "sha256:3608004b0cf984640a3a36657b8b40394d7ce2c48e3eb9dd24fc654df7643595", size = 32303 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/e6/f1/99b8ef4a636dafd5f1ae1e1b19eb9f793f51573d782919bf01d9b9f797f4/llama_cloud_services-0.6.25-py3-none-any.whl", hash = "sha256:aef0afbbf0d6dc485e6566af2daeeefa8caa7bc7f6511d860036bc0aac15361b", size = 37231 },
+]
+
+[[package]]
+name = "llama-index-core"
+version = "0.12.39"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+	{ name = "aiohttp" },
+	{ name = "aiosqlite" },
+	{ name = "banks" },
+	{ name = "dataclasses-json" },
+	{ name = "deprecated" },
+	{ name = "dirtyjson" },
+	{ name = "filetype" },
+	{ name = "fsspec" },
+	{ name = "httpx" },
+	{ name = "nest-asyncio" },
+	{ name = "networkx" },
+	{ name = "nltk" },
+	{ name = "numpy" },
+	{ name = "pillow" },
+	{ name = "pydantic" },
+	{ name = "pyyaml" },
+	{ name = "requests" },
+	{ name = "sqlalchemy", extra = ["asyncio"] },
+	{ name = "tenacity" },
+	{ name = "tiktoken" },
+	{ name = "tqdm" },
+	{ name = "typing-extensions" },
+	{ name = "typing-inspect" },
+	{ name = "wrapt" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/f7/45/163806502804ff75ace474f868cc33158774c4eb31d565133f32932e930e/llama_index_core-0.12.39.tar.gz", hash = "sha256:0cca9de59953542a3c2f1db61327c5204e0b1e997f31f1200e49392b2879593a", size = 7292040 }
+wheels = [
+	{ url = "https://files.pythonhosted.org/packages/dd/a3/583d80764df75aefc9885f28dcc06a0e5aefc993fa5318186e70f2340d73/llama_index_core-0.12.39-py3-none-any.whl", hash = "sha256:c255ed87aa85e43893f2bb05870b61ce7701d7a6a931d174ba925def5856b4c2", size = 7664906 },
+]
+
 [[package]]
 name = "lxml"
 version = "5.3.1"
@@ -1477,6 +1761,24 @@ wheels = [
 	{ url = "https://files.pythonhosted.org/packages/80/83/8c54533b3576f4391eebea88454738978669a6cad0d8e23266224007939d/lxml-5.3.1-cp313-cp313-win_amd64.whl", hash = "sha256:91fb6a43d72b4f8863d21f347a9163eecbf36e76e2f51068d59cd004c506f332", size = 3814484 },
 ]
 
|
[[package]]
|
||||||
|
name = "magika"
|
||||||
|
version = "0.6.1"
|
||||||
|
source = { registry = "https://pypi.org/simple" }
|
||||||
|
dependencies = [
|
||||||
|
{ name = "click" },
|
||||||
|
{ name = "numpy" },
|
||||||
|
{ name = "onnxruntime" },
|
||||||
|
{ name = "python-dotenv" },
|
||||||
|
]
|
||||||
|
sdist = { url = "https://files.pythonhosted.org/packages/6d/18/ea70f6abd36f455037340f12c8125918c726d08cd6e01f0b76b6884e0c38/magika-0.6.1.tar.gz", hash = "sha256:e3dd22c73936630b1cd79d0f412d6d9a53dc99ba5e3709b1ac53f56bc998e635", size = 3030234 }
|
||||||
|
wheels = [
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/1f/be/c9f7bb9ee94abe8d344b660672001313e459c67b867b24abe32d5c80a9ce/magika-0.6.1-py3-none-any.whl", hash = "sha256:15838d2469f1394d8e9598bc7fceea1ede7f35aebe9675c6b45c6b5c48315931", size = 2968516 },
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/3c/b9/016b174520e81faef5edb31b6c7a73966dc84ee33acd23a2e7b775df7ba4/magika-0.6.1-py3-none-macosx_11_0_arm64.whl", hash = "sha256:dadd036296a2e4840fd48fa0712848fe122da438e8f607dc8f19ca4663c359dc", size = 12408519 },
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/02/b7/e7dfeb235823a82d676c68a748541c24db0249b854f945f6e3cec11c1b7e/magika-0.6.1-py3-none-manylinux_2_28_x86_64.whl", hash = "sha256:133c0e1a844361de86ca2dd7c530e38b324e86177d30c52e36fd82101c190b5c", size = 15089294 },
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/64/f0/bec5bff0125d08c1bc3baef88beeb910121085249f67b5994ea961615b55/magika-0.6.1-py3-none-win_amd64.whl", hash = "sha256:0342b6230ea9aea7ab4b8fa92e1b46f1cc62e724d452ee8d6821a37f56738d22", size = 12378455 },
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "makefun"
|
name = "makefun"
|
||||||
version = "1.15.6"
|
version = "1.15.6"
|
||||||
|
|
@@ -1630,7 +1932,7 @@ wheels = [
 
 [[package]]
 name = "model2vec"
-version = "0.4.0"
+version = "0.4.1"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "jinja2" },
@@ -1642,9 +1944,18 @@ dependencies = [
     { name = "tokenizers" },
     { name = "tqdm" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/83/e2/3fb7bd8c612f71ad3abded92e7401f97f1e71427d3a68a3fb85f39394b17/model2vec-0.4.0.tar.gz", hash = "sha256:48d4a3da040499b0090f736eb8f22ea0fdd35b67462d81d789c70004423adbae", size = 2486998 }
+sdist = { url = "https://files.pythonhosted.org/packages/b8/c1/3cd6cab10e8b7da8c32acebf85672d38a26f5f03165bfeaa617a5ec0bb61/model2vec-0.4.1.tar.gz", hash = "sha256:fc6038416679eebe448951708f2d0bebdee8510f47970af1c81a8f054a3c3f9f", size = 2660626 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/93/7d/39ff093c4e45303a06e3c5825c6144cbd21f18a1393a154bbf93232b0f1a/model2vec-0.4.0-py3-none-any.whl", hash = "sha256:df30685a55841c61c6638e4f329648e76b148507bd778801d7bfcd6b970a4f2f", size = 38593 },
+    { url = "https://files.pythonhosted.org/packages/cd/76/c8575f90f521017597c5e57e3bfef61e3f27d9cb6c741a82a24d72b10a60/model2vec-0.4.1-py3-none-any.whl", hash = "sha256:04a397a17da9b967082b6baa4c494f0be48c89ec4e1a3975b4f290f045238a38", size = 41972 },
+]
+
+[[package]]
+name = "more-itertools"
+version = "10.7.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/ce/a0/834b0cebabbfc7e311f30b46c8188790a37f89fc8d756660346fe5abfd09/more_itertools-10.7.0.tar.gz", hash = "sha256:9fddd5403be01a94b204faadcff459ec3568cf110265d3c54323e1e866ad29d3", size = 127671 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2b/9f/7ba6f94fc1e9ac3d2b853fdff3035fb2fa5afbed898c4a72b8a020610594/more_itertools-10.7.0-py3-none-any.whl", hash = "sha256:d43980384673cb07d2f7d2d918c616b30c659c089ee23953f601d6609c67510e", size = 65278 },
 ]
 
 [[package]]
@@ -1722,6 +2033,37 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/b9/54/dd730b32ea14ea797530a4479b2ed46a6fb250f682a9cfb997e968bf0261/networkx-3.4.2-py3-none-any.whl", hash = "sha256:df5d4365b724cf81b8c6a7312509d0c22386097011ad1abe274afd5e9d3bbc5f", size = 1723263 },
 ]
 
+[[package]]
+name = "nh3"
+version = "0.2.21"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/37/30/2f81466f250eb7f591d4d193930df661c8c23e9056bdc78e365b646054d8/nh3-0.2.21.tar.gz", hash = "sha256:4990e7ee6a55490dbf00d61a6f476c9a3258e31e711e13713b2ea7d6616f670e", size = 16581 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7f/81/b83775687fcf00e08ade6d4605f0be9c4584cb44c4973d9f27b7456a31c9/nh3-0.2.21-cp313-cp313t-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:fcff321bd60c6c5c9cb4ddf2554e22772bb41ebd93ad88171bbbb6f271255286", size = 1297678 },
+    { url = "https://files.pythonhosted.org/packages/22/ee/d0ad8fb4b5769f073b2df6807f69a5e57ca9cea504b78809921aef460d20/nh3-0.2.21-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:31eedcd7d08b0eae28ba47f43fd33a653b4cdb271d64f1aeda47001618348fde", size = 733774 },
+    { url = "https://files.pythonhosted.org/packages/ea/76/b450141e2d384ede43fe53953552f1c6741a499a8c20955ad049555cabc8/nh3-0.2.21-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d426d7be1a2f3d896950fe263332ed1662f6c78525b4520c8e9861f8d7f0d243", size = 760012 },
+    { url = "https://files.pythonhosted.org/packages/97/90/1182275db76cd8fbb1f6bf84c770107fafee0cb7da3e66e416bcb9633da2/nh3-0.2.21-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:9d67709bc0d7d1f5797b21db26e7a8b3d15d21c9c5f58ccfe48b5328483b685b", size = 923619 },
+    { url = "https://files.pythonhosted.org/packages/29/c7/269a7cfbec9693fad8d767c34a755c25ccb8d048fc1dfc7a7d86bc99375c/nh3-0.2.21-cp313-cp313t-musllinux_1_2_armv7l.whl", hash = "sha256:55823c5ea1f6b267a4fad5de39bc0524d49a47783e1fe094bcf9c537a37df251", size = 1000384 },
+    { url = "https://files.pythonhosted.org/packages/68/a9/48479dbf5f49ad93f0badd73fbb48b3d769189f04c6c69b0df261978b009/nh3-0.2.21-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:818f2b6df3763e058efa9e69677b5a92f9bc0acff3295af5ed013da544250d5b", size = 918908 },
+    { url = "https://files.pythonhosted.org/packages/d7/da/0279c118f8be2dc306e56819880b19a1cf2379472e3b79fc8eab44e267e3/nh3-0.2.21-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:b3b5c58161e08549904ac4abd450dacd94ff648916f7c376ae4b2c0652b98ff9", size = 909180 },
+    { url = "https://files.pythonhosted.org/packages/26/16/93309693f8abcb1088ae143a9c8dbcece9c8f7fb297d492d3918340c41f1/nh3-0.2.21-cp313-cp313t-win32.whl", hash = "sha256:637d4a10c834e1b7d9548592c7aad760611415fcd5bd346f77fd8a064309ae6d", size = 532747 },
+    { url = "https://files.pythonhosted.org/packages/a2/3a/96eb26c56cbb733c0b4a6a907fab8408ddf3ead5d1b065830a8f6a9c3557/nh3-0.2.21-cp313-cp313t-win_amd64.whl", hash = "sha256:713d16686596e556b65e7f8c58328c2df63f1a7abe1277d87625dcbbc012ef82", size = 528908 },
+    { url = "https://files.pythonhosted.org/packages/ba/1d/b1ef74121fe325a69601270f276021908392081f4953d50b03cbb38b395f/nh3-0.2.21-cp38-abi3-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl", hash = "sha256:a772dec5b7b7325780922dd904709f0f5f3a79fbf756de5291c01370f6df0967", size = 1316133 },
+    { url = "https://files.pythonhosted.org/packages/b8/f2/2c7f79ce6de55b41e7715f7f59b159fd59f6cdb66223c05b42adaee2b645/nh3-0.2.21-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d002b648592bf3033adfd875a48f09b8ecc000abd7f6a8769ed86b6ccc70c759", size = 758328 },
+    { url = "https://files.pythonhosted.org/packages/6d/ad/07bd706fcf2b7979c51b83d8b8def28f413b090cf0cb0035ee6b425e9de5/nh3-0.2.21-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2a5174551f95f2836f2ad6a8074560f261cf9740a48437d6151fd2d4d7d617ab", size = 747020 },
+    { url = "https://files.pythonhosted.org/packages/75/99/06a6ba0b8a0d79c3d35496f19accc58199a1fb2dce5e711a31be7e2c1426/nh3-0.2.21-cp38-abi3-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:b8d55ea1fc7ae3633d758a92aafa3505cd3cc5a6e40470c9164d54dff6f96d42", size = 944878 },
+    { url = "https://files.pythonhosted.org/packages/79/d4/dc76f5dc50018cdaf161d436449181557373869aacf38a826885192fc587/nh3-0.2.21-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6ae319f17cd8960d0612f0f0ddff5a90700fa71926ca800e9028e7851ce44a6f", size = 903460 },
+    { url = "https://files.pythonhosted.org/packages/cd/c3/d4f8037b2ab02ebf5a2e8637bd54736ed3d0e6a2869e10341f8d9085f00e/nh3-0.2.21-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:63ca02ac6f27fc80f9894409eb61de2cb20ef0a23740c7e29f9ec827139fa578", size = 839369 },
+    { url = "https://files.pythonhosted.org/packages/11/a9/1cd3c6964ec51daed7b01ca4686a5c793581bf4492cbd7274b3f544c9abe/nh3-0.2.21-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a5f77e62aed5c4acad635239ac1290404c7e940c81abe561fd2af011ff59f585", size = 739036 },
+    { url = "https://files.pythonhosted.org/packages/fd/04/bfb3ff08d17a8a96325010ae6c53ba41de6248e63cdb1b88ef6369a6cdfc/nh3-0.2.21-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:087ffadfdcd497658c3adc797258ce0f06be8a537786a7217649fc1c0c60c293", size = 768712 },
+    { url = "https://files.pythonhosted.org/packages/9e/aa/cfc0bf545d668b97d9adea4f8b4598667d2b21b725d83396c343ad12bba7/nh3-0.2.21-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:ac7006c3abd097790e611fe4646ecb19a8d7f2184b882f6093293b8d9b887431", size = 930559 },
+    { url = "https://files.pythonhosted.org/packages/78/9d/6f5369a801d3a1b02e6a9a097d56bcc2f6ef98cffebf03c4bb3850d8e0f0/nh3-0.2.21-cp38-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:6141caabe00bbddc869665b35fc56a478eb774a8c1dfd6fba9fe1dfdf29e6efa", size = 1008591 },
+    { url = "https://files.pythonhosted.org/packages/a6/df/01b05299f68c69e480edff608248313cbb5dbd7595c5e048abe8972a57f9/nh3-0.2.21-cp38-abi3-musllinux_1_2_i686.whl", hash = "sha256:20979783526641c81d2f5bfa6ca5ccca3d1e4472474b162c6256745fbfe31cd1", size = 925670 },
+    { url = "https://files.pythonhosted.org/packages/3d/79/bdba276f58d15386a3387fe8d54e980fb47557c915f5448d8c6ac6f7ea9b/nh3-0.2.21-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:a7ea28cd49293749d67e4fcf326c554c83ec912cd09cd94aa7ec3ab1921c8283", size = 917093 },
+    { url = "https://files.pythonhosted.org/packages/e7/d8/c6f977a5cd4011c914fb58f5ae573b071d736187ccab31bfb1d539f4af9f/nh3-0.2.21-cp38-abi3-win32.whl", hash = "sha256:6c9c30b8b0d291a7c5ab0967ab200598ba33208f754f2f4920e9343bdd88f79a", size = 537623 },
+    { url = "https://files.pythonhosted.org/packages/23/fc/8ce756c032c70ae3dd1d48a3552577a325475af2a2f629604b44f571165c/nh3-0.2.21-cp38-abi3-win_amd64.whl", hash = "sha256:bb0014948f04d7976aabae43fcd4cb7f551f9f8ce785a4c9ef66e6c2590f8629", size = 535283 },
+]
+
 [[package]]
 name = "nltk"
 version = "3.9.1"
@@ -1751,18 +2093,40 @@ wheels = [
 
 [[package]]
 name = "numpy"
-version = "1.26.4"
+version = "2.2.5"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/65/6e/09db70a523a96d25e115e71cc56a6f9031e7b8cd166c1ac8438307c14058/numpy-1.26.4.tar.gz", hash = "sha256:2a02aba9ed12e4ac4eb3ea9421c420301a0c6460d9830d74a9df87efa4912010", size = 15786129 }
+sdist = { url = "https://files.pythonhosted.org/packages/dc/b2/ce4b867d8cd9c0ee84938ae1e6a6f7926ebf928c9090d036fc3c6a04f946/numpy-2.2.5.tar.gz", hash = "sha256:a9c0d994680cd991b1cb772e8b297340085466a6fe964bc9d4e80f5e2f43c291", size = 20273920 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/95/12/8f2020a8e8b8383ac0177dc9570aad031a3beb12e38847f7129bacd96228/numpy-1.26.4-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:b3ce300f3644fb06443ee2222c2201dd3a89ea6040541412b8fa189341847218", size = 20335901 },
-    { url = "https://files.pythonhosted.org/packages/75/5b/ca6c8bd14007e5ca171c7c03102d17b4f4e0ceb53957e8c44343a9546dcc/numpy-1.26.4-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:03a8c78d01d9781b28a6989f6fa1bb2c4f2d51201cf99d3dd875df6fbd96b23b", size = 13685868 },
-    { url = "https://files.pythonhosted.org/packages/79/f8/97f10e6755e2a7d027ca783f63044d5b1bc1ae7acb12afe6a9b4286eac17/numpy-1.26.4-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9fad7dcb1aac3c7f0584a5a8133e3a43eeb2fe127f47e3632d43d677c66c102b", size = 13925109 },
-    { url = "https://files.pythonhosted.org/packages/0f/50/de23fde84e45f5c4fda2488c759b69990fd4512387a8632860f3ac9cd225/numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:675d61ffbfa78604709862923189bad94014bef562cc35cf61d3a07bba02a7ed", size = 17950613 },
-    { url = "https://files.pythonhosted.org/packages/4c/0c/9c603826b6465e82591e05ca230dfc13376da512b25ccd0894709b054ed0/numpy-1.26.4-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:ab47dbe5cc8210f55aa58e4805fe224dac469cde56b9f731a4c098b91917159a", size = 13572172 },
-    { url = "https://files.pythonhosted.org/packages/76/8c/2ba3902e1a0fc1c74962ea9bb33a534bb05984ad7ff9515bf8d07527cadd/numpy-1.26.4-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:1dda2e7b4ec9dd512f84935c5f126c8bd8b9f2fc001e9f54af255e8c5f16b0e0", size = 17786643 },
-    { url = "https://files.pythonhosted.org/packages/28/4a/46d9e65106879492374999e76eb85f87b15328e06bd1550668f79f7b18c6/numpy-1.26.4-cp312-cp312-win32.whl", hash = "sha256:50193e430acfc1346175fcbdaa28ffec49947a06918b7b92130744e81e640110", size = 5677803 },
-    { url = "https://files.pythonhosted.org/packages/16/2e/86f24451c2d530c88daf997cb8d6ac622c1d40d19f5a031ed68a4b73a374/numpy-1.26.4-cp312-cp312-win_amd64.whl", hash = "sha256:08beddf13648eb95f8d867350f6a018a4be2e5ad54c8d8caed89ebca558b2818", size = 15517754 },
+    { url = "https://files.pythonhosted.org/packages/e2/f7/1fd4ff108cd9d7ef929b8882692e23665dc9c23feecafbb9c6b80f4ec583/numpy-2.2.5-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ee461a4eaab4f165b68780a6a1af95fb23a29932be7569b9fab666c407969051", size = 20948633 },
+    { url = "https://files.pythonhosted.org/packages/12/03/d443c278348371b20d830af155ff2079acad6a9e60279fac2b41dbbb73d8/numpy-2.2.5-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:ec31367fd6a255dc8de4772bd1658c3e926d8e860a0b6e922b615e532d320ddc", size = 14176123 },
+    { url = "https://files.pythonhosted.org/packages/2b/0b/5ca264641d0e7b14393313304da48b225d15d471250376f3fbdb1a2be603/numpy-2.2.5-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:47834cde750d3c9f4e52c6ca28a7361859fcaf52695c7dc3cc1a720b8922683e", size = 5163817 },
+    { url = "https://files.pythonhosted.org/packages/04/b3/d522672b9e3d28e26e1613de7675b441bbd1eaca75db95680635dd158c67/numpy-2.2.5-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:2c1a1c6ccce4022383583a6ded7bbcda22fc635eb4eb1e0a053336425ed36dfa", size = 6698066 },
+    { url = "https://files.pythonhosted.org/packages/a0/93/0f7a75c1ff02d4b76df35079676b3b2719fcdfb39abdf44c8b33f43ef37d/numpy-2.2.5-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9d75f338f5f79ee23548b03d801d28a505198297534f62416391857ea0479571", size = 14087277 },
+    { url = "https://files.pythonhosted.org/packages/b0/d9/7c338b923c53d431bc837b5b787052fef9ae68a56fe91e325aac0d48226e/numpy-2.2.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3a801fef99668f309b88640e28d261991bfad9617c27beda4a3aec4f217ea073", size = 16135742 },
+    { url = "https://files.pythonhosted.org/packages/2d/10/4dec9184a5d74ba9867c6f7d1e9f2e0fb5fe96ff2bf50bb6f342d64f2003/numpy-2.2.5-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:abe38cd8381245a7f49967a6010e77dbf3680bd3627c0fe4362dd693b404c7f8", size = 15581825 },
+    { url = "https://files.pythonhosted.org/packages/80/1f/2b6fcd636e848053f5b57712a7d1880b1565eec35a637fdfd0a30d5e738d/numpy-2.2.5-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:5a0ac90e46fdb5649ab6369d1ab6104bfe5854ab19b645bf5cda0127a13034ae", size = 17899600 },
+    { url = "https://files.pythonhosted.org/packages/ec/87/36801f4dc2623d76a0a3835975524a84bd2b18fe0f8835d45c8eae2f9ff2/numpy-2.2.5-cp312-cp312-win32.whl", hash = "sha256:0cd48122a6b7eab8f06404805b1bd5856200e3ed6f8a1b9a194f9d9054631beb", size = 6312626 },
+    { url = "https://files.pythonhosted.org/packages/8b/09/4ffb4d6cfe7ca6707336187951992bd8a8b9142cf345d87ab858d2d7636a/numpy-2.2.5-cp312-cp312-win_amd64.whl", hash = "sha256:ced69262a8278547e63409b2653b372bf4baff0870c57efa76c5703fd6543282", size = 12645715 },
+    { url = "https://files.pythonhosted.org/packages/e2/a0/0aa7f0f4509a2e07bd7a509042967c2fab635690d4f48c6c7b3afd4f448c/numpy-2.2.5-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:059b51b658f4414fff78c6d7b1b4e18283ab5fa56d270ff212d5ba0c561846f4", size = 20935102 },
+    { url = "https://files.pythonhosted.org/packages/7e/e4/a6a9f4537542912ec513185396fce52cdd45bdcf3e9d921ab02a93ca5aa9/numpy-2.2.5-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:47f9ed103af0bc63182609044b0490747e03bd20a67e391192dde119bf43d52f", size = 14191709 },
+    { url = "https://files.pythonhosted.org/packages/be/65/72f3186b6050bbfe9c43cb81f9df59ae63603491d36179cf7a7c8d216758/numpy-2.2.5-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:261a1ef047751bb02f29dfe337230b5882b54521ca121fc7f62668133cb119c9", size = 5149173 },
+    { url = "https://files.pythonhosted.org/packages/e5/e9/83e7a9432378dde5802651307ae5e9ea07bb72b416728202218cd4da2801/numpy-2.2.5-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:4520caa3807c1ceb005d125a75e715567806fed67e315cea619d5ec6e75a4191", size = 6684502 },
+    { url = "https://files.pythonhosted.org/packages/ea/27/b80da6c762394c8ee516b74c1f686fcd16c8f23b14de57ba0cad7349d1d2/numpy-2.2.5-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3d14b17b9be5f9c9301f43d2e2a4886a33b53f4e6fdf9ca2f4cc60aeeee76372", size = 14084417 },
+    { url = "https://files.pythonhosted.org/packages/aa/fc/ebfd32c3e124e6a1043e19c0ab0769818aa69050ce5589b63d05ff185526/numpy-2.2.5-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2ba321813a00e508d5421104464510cc962a6f791aa2fca1c97b1e65027da80d", size = 16133807 },
+    { url = "https://files.pythonhosted.org/packages/bf/9b/4cc171a0acbe4666f7775cfd21d4eb6bb1d36d3a0431f48a73e9212d2278/numpy-2.2.5-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:a4cbdef3ddf777423060c6f81b5694bad2dc9675f110c4b2a60dc0181543fac7", size = 15575611 },
+    { url = "https://files.pythonhosted.org/packages/a3/45/40f4135341850df48f8edcf949cf47b523c404b712774f8855a64c96ef29/numpy-2.2.5-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:54088a5a147ab71a8e7fdfd8c3601972751ded0739c6b696ad9cb0343e21ab73", size = 17895747 },
+    { url = "https://files.pythonhosted.org/packages/f8/4c/b32a17a46f0ffbde8cc82df6d3daeaf4f552e346df143e1b188a701a8f09/numpy-2.2.5-cp313-cp313-win32.whl", hash = "sha256:c8b82a55ef86a2d8e81b63da85e55f5537d2157165be1cb2ce7cfa57b6aef38b", size = 6309594 },
+    { url = "https://files.pythonhosted.org/packages/13/ae/72e6276feb9ef06787365b05915bfdb057d01fceb4a43cb80978e518d79b/numpy-2.2.5-cp313-cp313-win_amd64.whl", hash = "sha256:d8882a829fd779f0f43998e931c466802a77ca1ee0fe25a3abe50278616b1471", size = 12638356 },
+    { url = "https://files.pythonhosted.org/packages/79/56/be8b85a9f2adb688e7ded6324e20149a03541d2b3297c3ffc1a73f46dedb/numpy-2.2.5-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:e8b025c351b9f0e8b5436cf28a07fa4ac0204d67b38f01433ac7f9b870fa38c6", size = 20963778 },
+    { url = "https://files.pythonhosted.org/packages/ff/77/19c5e62d55bff507a18c3cdff82e94fe174957bad25860a991cac719d3ab/numpy-2.2.5-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:8dfa94b6a4374e7851bbb6f35e6ded2120b752b063e6acdd3157e4d2bb922eba", size = 14207279 },
+    { url = "https://files.pythonhosted.org/packages/75/22/aa11f22dc11ff4ffe4e849d9b63bbe8d4ac6d5fae85ddaa67dfe43be3e76/numpy-2.2.5-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:97c8425d4e26437e65e1d189d22dff4a079b747ff9c2788057bfb8114ce1e133", size = 5199247 },
+    { url = "https://files.pythonhosted.org/packages/4f/6c/12d5e760fc62c08eded0394f62039f5a9857f758312bf01632a81d841459/numpy-2.2.5-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:352d330048c055ea6db701130abc48a21bec690a8d38f8284e00fab256dc1376", size = 6711087 },
+    { url = "https://files.pythonhosted.org/packages/ef/94/ece8280cf4218b2bee5cec9567629e61e51b4be501e5c6840ceb593db945/numpy-2.2.5-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8b4c0773b6ada798f51f0f8e30c054d32304ccc6e9c5d93d46cb26f3d385ab19", size = 14059964 },
+    { url = "https://files.pythonhosted.org/packages/39/41/c5377dac0514aaeec69115830a39d905b1882819c8e65d97fc60e177e19e/numpy-2.2.5-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:55f09e00d4dccd76b179c0f18a44f041e5332fd0e022886ba1c0bbf3ea4a18d0", size = 16121214 },
+    { url = "https://files.pythonhosted.org/packages/db/54/3b9f89a943257bc8e187145c6bc0eb8e3d615655f7b14e9b490b053e8149/numpy-2.2.5-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:02f226baeefa68f7d579e213d0f3493496397d8f1cff5e2b222af274c86a552a", size = 15575788 },
+    { url = "https://files.pythonhosted.org/packages/b1/c4/2e407e85df35b29f79945751b8f8e671057a13a376497d7fb2151ba0d290/numpy-2.2.5-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:c26843fd58f65da9491165072da2cccc372530681de481ef670dcc8e27cfb066", size = 17893672 },
+    { url = "https://files.pythonhosted.org/packages/29/7e/d0b44e129d038dba453f00d0e29ebd6eaf2f06055d72b95b9947998aca14/numpy-2.2.5-cp313-cp313t-win32.whl", hash = "sha256:1a161c2c79ab30fe4501d5a2bbfe8b162490757cf90b7f05be8b80bc02f7bb8e", size = 6377102 },
+    { url = "https://files.pythonhosted.org/packages/63/be/b85e4aa4bf42c6502851b971f1c326d583fcc68227385f92089cf50a7b45/numpy-2.2.5-cp313-cp313t-win_amd64.whl", hash = "sha256:d403c84991b5ad291d3809bace5e85f4bbf44a04bdc9a88ed2bb1807b3360bb8", size = 12750096 },
 ]
 
 [[package]]
@@ -2219,6 +2583,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/cf/6c/41c21c6c8af92b9fea313aa47c75de49e2f9a467964ee33eb0135d47eb64/pillow-11.1.0-cp313-cp313t-win_arm64.whl", hash = "sha256:67cd427c68926108778a9005f2a04adbd5e67c442ed21d95389fe1d595458756", size = 2377651 },
 ]
 
+[[package]]
+name = "platformdirs"
+version = "4.3.8"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/fe/8b/3c73abc9c759ecd3f1f7ceff6685840859e8070c4d947c93fae71f6a0bf2/platformdirs-4.3.8.tar.gz", hash = "sha256:3d512d96e16bcb959a814c9f348431070822a6496326a4be0911c40b5a74c2bc", size = 21362 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/fe/39/979e8e21520d4e47a0bbe349e2713c0aac6f3d853d0e5b34d76206c439aa/platformdirs-4.3.8-py3-none-any.whl", hash = "sha256:ff7059bb7eb1179e2685604f4aaf157cfd9535242bd23742eadc3c13542139b4", size = 18567 },
+]
+
 [[package]]
 name = "playwright"
 version = "1.50.0"
@@ -2237,6 +2610,12 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/bc/2b/e944e10c9b18e77e43d3bb4d6faa323f6cc27597db37b75bc3fd796adfd5/playwright-1.50.0-py3-none-win_amd64.whl", hash = "sha256:1859423da82de631704d5e3d88602d755462b0906824c1debe140979397d2e8d", size = 34784546 },
 ]
 
+[[package]]
+name = "progress"
+version = "1.6"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/2a/68/d8412d1e0d70edf9791cbac5426dc859f4649afc22f2abbeb0d947cf70fd/progress-1.6.tar.gz", hash = "sha256:c9c86e98b5c03fa1fe11e3b67c1feda4788b8d0fe7336c2ff7d5644ccfba34cd", size = 7842 }
+
 [[package]]
 name = "propcache"
 version = "0.2.1"
@@ -2576,6 +2955,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/6a/3e/b68c118422ec867fa7ab88444e1274aa40681c606d59ac27de5a5588f082/python_dotenv-1.0.1-py3-none-any.whl", hash = "sha256:f7b63ef50f1b690dddf550d03497b66d609393b40b564ed0d674909a68ebf16a", size = 19863 },
 ]
 
+[[package]]
+name = "python-ffmpeg"
+version = "2.0.12"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "pyee" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/dd/4d/7ecffb341d646e016be76e36f5a42cb32f409c9ca21a57b68f067fad3fc7/python_ffmpeg-2.0.12.tar.gz", hash = "sha256:19ac80af5a064a2f53c245af1a909b2d7648ea045500d96d3bcd507b88d43dc7", size = 14126292 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7f/6d/02e817aec661defe148cb9eb0c4eca2444846305f625c2243fb9f92a9045/python_ffmpeg-2.0.12-py3-none-any.whl", hash = "sha256:d86697da8dfb39335183e336d31baf42fb217468adf5ac97fd743898240faae3", size = 14411 },
+]
+
 [[package]]
 name = "python-iso639"
 version = "2025.2.18"
@@ -2641,6 +3033,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/eb/38/ac33370d784287baa1c3d538978b5e2ea064d4c1b93ffbd12826c190dd10/pytz-2025.1-py2.py3-none-any.whl", hash = "sha256:89dd22dca55b46eac6eda23b2d72721bf1bdfef212645d81513ef5d03038de57", size = 507930 },
 ]
 
+[[package]]
+name = "pywin32-ctypes"
+version = "0.2.3"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/85/9f/01a1a99704853cb63f253eea009390c88e7131c67e66a0a02099a8c917cb/pywin32-ctypes-0.2.3.tar.gz", hash = "sha256:d162dc04946d704503b2edc4d55f3dba5c1d539ead017afa00142c38b9885755", size = 29471 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/de/3d/8161f7711c017e01ac9f008dfddd9410dff3674334c233bde66e7ba65bbf/pywin32_ctypes-0.2.3-py3-none-any.whl", hash = "sha256:8a1513379d709975552d202d942d9837758905c8d01eb82b8bcc30918929e7b8", size = 30756 },
+]
+
 [[package]]
 name = "pyyaml"
 version = "6.0.2"
|
@ -2705,6 +3106,20 @@ wheels = [
|
||||||
{ url = "https://files.pythonhosted.org/packages/4b/43/ca3d1018b392f49131843648e10b08ace23afe8dad3bee5f136e4346b7cd/rapidfuzz-3.12.2-cp313-cp313-win_arm64.whl", hash = "sha256:69f6ecdf1452139f2b947d0c169a605de578efdb72cbb2373cb0a94edca1fd34", size = 863535 },
|
{ url = "https://files.pythonhosted.org/packages/4b/43/ca3d1018b392f49131843648e10b08ace23afe8dad3bee5f136e4346b7cd/rapidfuzz-3.12.2-cp313-cp313-win_arm64.whl", hash = "sha256:69f6ecdf1452139f2b947d0c169a605de578efdb72cbb2373cb0a94edca1fd34", size = 863535 },
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "readme-renderer"
|
||||||
|
version = "44.0"
|
||||||
|
source = { registry = "https://pypi.org/simple" }
|
||||||
|
dependencies = [
|
||||||
|
{ name = "docutils" },
|
||||||
|
{ name = "nh3" },
|
||||||
|
{ name = "pygments" },
|
||||||
|
]
|
||||||
|
sdist = { url = "https://files.pythonhosted.org/packages/5a/a9/104ec9234c8448c4379768221ea6df01260cd6c2ce13182d4eac531c8342/readme_renderer-44.0.tar.gz", hash = "sha256:8712034eabbfa6805cacf1402b4eeb2a73028f72d1166d6f5cb7f9c047c5d1e1", size = 32056 }
|
||||||
|
wheels = [
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/e1/67/921ec3024056483db83953ae8e48079ad62b92db7880013ca77632921dd0/readme_renderer-44.0-py3-none-any.whl", hash = "sha256:2fbca89b81a08526aadf1357a8c2ae889ec05fb03f5da67f9769c9a592166151", size = 13310 },
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "referencing"
|
name = "referencing"
|
||||||
version = "0.36.2"
|
version = "0.36.2"
|
||||||
|
|
@ -2798,17 +3213,26 @@ flashrank = [
|
||||||
{ name = "flashrank" },
|
{ name = "flashrank" },
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "rfc3986"
|
||||||
|
version = "2.0.0"
|
||||||
|
source = { registry = "https://pypi.org/simple" }
|
||||||
|
sdist = { url = "https://files.pythonhosted.org/packages/85/40/1520d68bfa07ab5a6f065a186815fb6610c86fe957bc065754e47f7b0840/rfc3986-2.0.0.tar.gz", hash = "sha256:97aacf9dbd4bfd829baad6e6309fa6573aaf1be3f6fa735c8ab05e46cecb261c", size = 49026 }
|
||||||
|
wheels = [
|
||||||
|
{ url = "https://files.pythonhosted.org/packages/ff/9a/9afaade874b2fa6c752c36f1548f718b5b83af81ed9b76628329dab81c1b/rfc3986-2.0.0-py2.py3-none-any.whl", hash = "sha256:50b1502b60e289cb37883f3dfd34532b8873c7de9f49bb546641ce9cbd256ebd", size = 31326 },
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "rich"
|
name = "rich"
|
||||||
version = "13.9.4"
|
version = "14.0.0"
|
||||||
source = { registry = "https://pypi.org/simple" }
|
source = { registry = "https://pypi.org/simple" }
|
||||||
dependencies = [
|
dependencies = [
|
||||||
{ name = "markdown-it-py" },
|
{ name = "markdown-it-py" },
|
||||||
{ name = "pygments" },
|
{ name = "pygments" },
|
||||||
]
|
]
|
||||||
sdist = { url = "https://files.pythonhosted.org/packages/ab/3a/0316b28d0761c6734d6bc14e770d85506c986c85ffb239e688eeaab2c2bc/rich-13.9.4.tar.gz", hash = "sha256:439594978a49a09530cff7ebc4b5c7103ef57baf48d5ea3184f21d9a2befa098", size = 223149 }
|
sdist = { url = "https://files.pythonhosted.org/packages/a1/53/830aa4c3066a8ab0ae9a9955976fb770fe9c6102117c8ec4ab3ea62d89e8/rich-14.0.0.tar.gz", hash = "sha256:82f1bc23a6a21ebca4ae0c45af9bdbc492ed20231dcb63f297d6d1021a9d5725", size = 224078 }
|
||||||
wheels = [
|
wheels = [
|
||||||
{ url = "https://files.pythonhosted.org/packages/19/71/39c7c0d87f8d4e6c020a393182060eaefeeae6c01dab6a84ec346f2567df/rich-13.9.4-py3-none-any.whl", hash = "sha256:6049d5e6ec054bf2779ab3358186963bac2ea89175919d699e378b99738c2a90", size = 242424 },
|
{ url = "https://files.pythonhosted.org/packages/0d/9b/63f4c7ebc259242c89b3acafdb37b41d1185c07ff0011164674e9076b491/rich-14.0.0-py3-none-any.whl", hash = "sha256:1c9491e1951aac09caffd42f448ee3d04e58923ffe14993f6e83068dc395d7e0", size = 243229 },
|
||||||
]
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
|
|
@@ -2954,6 +3378,19 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e4/1f/5d46a8d94e9f6d2c913cbb109e57e7eed914de38ea99e2c4d69a9fc93140/scipy-1.15.1-cp313-cp313t-win_amd64.whl", hash = "sha256:bc7136626261ac1ed988dca56cfc4ab5180f75e0ee52e58f1e6aa74b5f3eacd5", size = 43181730 },
 ]
 
+[[package]]
+name = "secretstorage"
+version = "3.3.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "cryptography", marker = "sys_platform != 'darwin'" },
+    { name = "jeepney", marker = "sys_platform != 'darwin'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/53/a4/f48c9d79cb507ed1373477dbceaba7401fd8a23af63b837fa61f1dcd3691/SecretStorage-3.3.3.tar.gz", hash = "sha256:2403533ef369eca6d2ba81718576c5e0f564d5cca1b58f73a8b23e7d4eeebd77", size = 19739 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/54/24/b4293291fa1dd830f353d2cb163295742fa87f179fcc8a20a306a81978b7/SecretStorage-3.3.3-py3-none-any.whl", hash = "sha256:f356e6628222568e3af06f2eba8df495efa13b3b63081dafd4f7d9a7b7bc9f99", size = 15221 },
+]
+
 [[package]]
 name = "sentence-transformers"
 version = "3.4.1"
@@ -3063,9 +3500,23 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/d9/61/f2b52e107b1fc8944b33ef56bf6ac4ebbe16d91b94d2b87ce013bf63fb84/starlette-0.45.3-py3-none-any.whl", hash = "sha256:dfb6d332576f136ec740296c7e8bb8c8a7125044e7c6da30744718880cdd059d", size = 71507 },
 ]
 
+[[package]]
+name = "static-ffmpeg"
+version = "2.13"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "filelock" },
+    { name = "progress" },
+    { name = "requests" },
+    { name = "twine" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/09/39/1a5d0603280dd681ec52a2a6717c05dab530190dff7887b7603740a1741b/static_ffmpeg-2.13-py3-none-any.whl", hash = "sha256:3bed55a7979f9de9d1eec1126b98774a1d41c2e323811f59973d54b9c94d6dac", size = 7586 },
+]
+
 [[package]]
 name = "surf-new-backend"
-version = "0.0.6"
+version = "0.0.7"
 source = { virtual = "." }
 dependencies = [
     { name = "alembic" },
@@ -3078,14 +3529,18 @@ dependencies = [
     { name = "langchain-community" },
     { name = "langchain-unstructured" },
     { name = "langgraph" },
+    { name = "linkup-sdk" },
     { name = "litellm" },
+    { name = "llama-cloud-services" },
     { name = "markdownify" },
     { name = "notion-client" },
     { name = "pgvector" },
     { name = "playwright" },
+    { name = "python-ffmpeg" },
     { name = "rerankers", extra = ["flashrank"] },
     { name = "sentence-transformers" },
     { name = "slack-sdk" },
+    { name = "static-ffmpeg" },
     { name = "tavily-python" },
     { name = "unstructured", extra = ["all-docs"] },
     { name = "unstructured-client" },
@@ -3098,7 +3553,7 @@ dependencies = [
 requires-dist = [
     { name = "alembic", specifier = ">=1.13.0" },
     { name = "asyncpg", specifier = ">=0.30.0" },
-    { name = "chonkie", extras = ["all"], specifier = ">=0.4.1" },
+    { name = "chonkie", extras = ["all"], specifier = ">=1.0.6" },
     { name = "fastapi", specifier = ">=0.115.8" },
     { name = "fastapi-users", extras = ["oauth", "sqlalchemy"], specifier = ">=14.0.1" },
     { name = "firecrawl-py", specifier = ">=1.12.0" },
@@ -3106,14 +3561,18 @@ requires-dist = [
     { name = "langchain-community", specifier = ">=0.3.17" },
     { name = "langchain-unstructured", specifier = ">=0.1.6" },
     { name = "langgraph", specifier = ">=0.3.29" },
+    { name = "linkup-sdk", specifier = ">=0.2.4" },
     { name = "litellm", specifier = ">=1.61.4" },
+    { name = "llama-cloud-services", specifier = ">=0.6.25" },
     { name = "markdownify", specifier = ">=0.14.1" },
     { name = "notion-client", specifier = ">=2.3.0" },
     { name = "pgvector", specifier = ">=0.3.6" },
     { name = "playwright", specifier = ">=1.50.0" },
+    { name = "python-ffmpeg", specifier = ">=2.0.12" },
     { name = "rerankers", extras = ["flashrank"], specifier = ">=0.7.1" },
     { name = "sentence-transformers", specifier = ">=3.4.1" },
     { name = "slack-sdk", specifier = ">=3.34.0" },
+    { name = "static-ffmpeg", specifier = ">=2.13" },
     { name = "tavily-python", specifier = ">=0.3.2" },
     { name = "unstructured", extras = ["all-docs"], specifier = ">=0.16.25" },
     { name = "unstructured-client", specifier = ">=0.30.0" },
@@ -3324,6 +3783,91 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/b6/1a/efeecb8d83705f2f4beac98d46f2148c95ecd7babfb31b5c0f1e7017e83d/transformers-4.48.3-py3-none-any.whl", hash = "sha256:78697f990f5ef350c23b46bf86d5081ce96b49479ab180b2de7687267de8fd36", size = 9669412 },
 ]
 
+[[package]]
+name = "tree-sitter"
+version = "0.24.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/a7/a2/698b9d31d08ad5558f8bfbfe3a0781bd4b1f284e89bde3ad18e05101a892/tree-sitter-0.24.0.tar.gz", hash = "sha256:abd95af65ca2f4f7eca356343391ed669e764f37748b5352946f00f7fc78e734", size = 168304 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e9/57/3a590f287b5aa60c07d5545953912be3d252481bf5e178f750db75572bff/tree_sitter-0.24.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:14beeff5f11e223c37be7d5d119819880601a80d0399abe8c738ae2288804afc", size = 140788 },
+    { url = "https://files.pythonhosted.org/packages/61/0b/fc289e0cba7dbe77c6655a4dd949cd23c663fd62a8b4d8f02f97e28d7fe5/tree_sitter-0.24.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:26a5b130f70d5925d67b47db314da209063664585a2fd36fa69e0717738efaf4", size = 133945 },
+    { url = "https://files.pythonhosted.org/packages/86/d7/80767238308a137e0b5b5c947aa243e3c1e3e430e6d0d5ae94b9a9ffd1a2/tree_sitter-0.24.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5fc5c3c26d83c9d0ecb4fc4304fba35f034b7761d35286b936c1db1217558b4e", size = 564819 },
+    { url = "https://files.pythonhosted.org/packages/bf/b3/6c5574f4b937b836601f5fb556b24804b0a6341f2eb42f40c0e6464339f4/tree_sitter-0.24.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:772e1bd8c0931c866b848d0369b32218ac97c24b04790ec4b0e409901945dd8e", size = 579303 },
+    { url = "https://files.pythonhosted.org/packages/0a/f4/bd0ddf9abe242ea67cca18a64810f8af230fc1ea74b28bb702e838ccd874/tree_sitter-0.24.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:24a8dd03b0d6b8812425f3b84d2f4763322684e38baf74e5bb766128b5633dc7", size = 581054 },
+    { url = "https://files.pythonhosted.org/packages/8c/1c/ff23fa4931b6ef1bbeac461b904ca7e49eaec7e7e5398584e3eef836ec96/tree_sitter-0.24.0-cp312-cp312-win_amd64.whl", hash = "sha256:f9e8b1605ab60ed43803100f067eed71b0b0e6c1fb9860a262727dbfbbb74751", size = 120221 },
+    { url = "https://files.pythonhosted.org/packages/b2/2a/9979c626f303177b7612a802237d0533155bf1e425ff6f73cc40f25453e2/tree_sitter-0.24.0-cp312-cp312-win_arm64.whl", hash = "sha256:f733a83d8355fc95561582b66bbea92ffd365c5d7a665bc9ebd25e049c2b2abb", size = 108234 },
+    { url = "https://files.pythonhosted.org/packages/61/cd/2348339c85803330ce38cee1c6cbbfa78a656b34ff58606ebaf5c9e83bd0/tree_sitter-0.24.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:0d4a6416ed421c4210f0ca405a4834d5ccfbb8ad6692d4d74f7773ef68f92071", size = 140781 },
+    { url = "https://files.pythonhosted.org/packages/8b/a3/1ea9d8b64e8dcfcc0051028a9c84a630301290995cd6e947bf88267ef7b1/tree_sitter-0.24.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:e0992d483677e71d5c5d37f30dfb2e3afec2f932a9c53eec4fca13869b788c6c", size = 133928 },
+    { url = "https://files.pythonhosted.org/packages/fe/ae/55c1055609c9428a4aedf4b164400ab9adb0b1bf1538b51f4b3748a6c983/tree_sitter-0.24.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:57277a12fbcefb1c8b206186068d456c600dbfbc3fd6c76968ee22614c5cd5ad", size = 564497 },
+    { url = "https://files.pythonhosted.org/packages/ce/d0/f2ffcd04882c5aa28d205a787353130cbf84b2b8a977fd211bdc3b399ae3/tree_sitter-0.24.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d25fa22766d63f73716c6fec1a31ee5cf904aa429484256bd5fdf5259051ed74", size = 578917 },
+    { url = "https://files.pythonhosted.org/packages/af/82/aebe78ea23a2b3a79324993d4915f3093ad1af43d7c2208ee90be9273273/tree_sitter-0.24.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7d5d9537507e1c8c5fa9935b34f320bfec4114d675e028f3ad94f11cf9db37b9", size = 581148 },
+    { url = "https://files.pythonhosted.org/packages/a1/b4/6b0291a590c2b0417cfdb64ccb8ea242f270a46ed429c641fbc2bfab77e0/tree_sitter-0.24.0-cp313-cp313-win_amd64.whl", hash = "sha256:f58bb4956917715ec4d5a28681829a8dad5c342cafd4aea269f9132a83ca9b34", size = 120207 },
+    { url = "https://files.pythonhosted.org/packages/a8/18/542fd844b75272630229c9939b03f7db232c71a9d82aadc59c596319ea6a/tree_sitter-0.24.0-cp313-cp313-win_arm64.whl", hash = "sha256:23641bd25dcd4bb0b6fa91b8fb3f46cc9f1c9f475efe4d536d3f1f688d1b84c8", size = 108232 },
+]
+
+[[package]]
+name = "tree-sitter-c-sharp"
+version = "0.23.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/22/85/a61c782afbb706a47d990eaee6977e7c2bd013771c5bf5c81c617684f286/tree_sitter_c_sharp-0.23.1.tar.gz", hash = "sha256:322e2cfd3a547a840375276b2aea3335fa6458aeac082f6c60fec3f745c967eb", size = 1317728 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/58/04/f6c2df4c53a588ccd88d50851155945cff8cd887bd70c175e00aaade7edf/tree_sitter_c_sharp-0.23.1-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:2b612a6e5bd17bb7fa2aab4bb6fc1fba45c94f09cb034ab332e45603b86e32fd", size = 372235 },
+    { url = "https://files.pythonhosted.org/packages/99/10/1aa9486f1e28fc22810fa92cbdc54e1051e7f5536a5e5b5e9695f609b31e/tree_sitter_c_sharp-0.23.1-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:1a8b98f62bc53efcd4d971151950c9b9cd5cbe3bacdb0cd69fdccac63350d83e", size = 419046 },
+    { url = "https://files.pythonhosted.org/packages/0f/21/13df29f8fcb9ba9f209b7b413a4764b673dfd58989a0dd67e9c7e19e9c2e/tree_sitter_c_sharp-0.23.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:986e93d845a438ec3c4416401aa98e6a6f6631d644bbbc2e43fcb915c51d255d", size = 415999 },
+    { url = "https://files.pythonhosted.org/packages/ca/72/fc6846795bcdae2f8aa94cc8b1d1af33d634e08be63e294ff0d6794b1efc/tree_sitter_c_sharp-0.23.1-cp39-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a8024e466b2f5611c6dc90321f232d8584893c7fb88b75e4a831992f877616d2", size = 402830 },
+    { url = "https://files.pythonhosted.org/packages/fe/3a/b6028c5890ce6653807d5fa88c72232c027c6ceb480dbeb3b186d60e5971/tree_sitter_c_sharp-0.23.1-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:7f9bf876866835492281d336b9e1f9626ab668737f74e914c31d285261507da7", size = 397880 },
+    { url = "https://files.pythonhosted.org/packages/47/d2/4facaa34b40f8104d8751746d0e1cd2ddf0beb9f1404b736b97f372bd1f3/tree_sitter_c_sharp-0.23.1-cp39-abi3-win_amd64.whl", hash = "sha256:ae9a9e859e8f44e2b07578d44f9a220d3fa25b688966708af6aa55d42abeebb3", size = 377562 },
+    { url = "https://files.pythonhosted.org/packages/d8/88/3cf6bd9959d94d1fec1e6a9c530c5f08ff4115a474f62aedb5fedb0f7241/tree_sitter_c_sharp-0.23.1-cp39-abi3-win_arm64.whl", hash = "sha256:c81548347a93347be4f48cb63ec7d60ef4b0efa91313330e69641e49aa5a08c5", size = 375157 },
+]
+
+[[package]]
+name = "tree-sitter-embedded-template"
+version = "0.23.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/28/d6/5a58ea2f0480f5ed188b733114a8c275532a2fd1568b3898793b13d28af5/tree_sitter_embedded_template-0.23.2.tar.gz", hash = "sha256:7b24dcf2e92497f54323e617564d36866230a8bfb719dbb7b45b461510dcddaa", size = 8471 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/ef/c1/be0c48ed9609b720e74ade86f24ea086e353fe9c7405ee9630c3d52d09a2/tree_sitter_embedded_template-0.23.2-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:a505c2d2494464029d79db541cab52f6da5fb326bf3d355e69bf98b84eb89ae0", size = 9554 },
+    { url = "https://files.pythonhosted.org/packages/6d/a5/7c12f5d302525ee36d1eafc28a68e4454da5bad208436d547326bee4ed76/tree_sitter_embedded_template-0.23.2-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:28028b93b42cc3753261ae7ce066675d407f59de512417524f9c3ab7792b1d37", size = 10051 },
+    { url = "https://files.pythonhosted.org/packages/cd/87/95aaba8b64b849200bd7d4ae510cc394ecaef46a031499cbff301766970d/tree_sitter_embedded_template-0.23.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ec399d59ce93ffb60759a2d96053eed529f3c3f6a27128f261710d0d0de60e10", size = 17532 },
+    { url = "https://files.pythonhosted.org/packages/13/f8/8c837b898f00b35f9f3f76a4abc525e80866a69343083c9ff329e17ecb03/tree_sitter_embedded_template-0.23.2-cp39-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcfa01f62b88d50dbcb736cc23baec8ddbfe08daacfdc613eee8c04ab65efd09", size = 17394 },
+    { url = "https://files.pythonhosted.org/packages/89/9b/893adf9e465d2d7f14870871bf2f3b30045e5ac417cb596f667a72eda493/tree_sitter_embedded_template-0.23.2-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:6debd24791466f887109a433c31aa4a5deeba2b217817521c745a4e748a944ed", size = 16439 },
+    { url = "https://files.pythonhosted.org/packages/40/96/e79934572723673db9f867000500c6eea61a37705e02c7aee9ee031bbb6f/tree_sitter_embedded_template-0.23.2-cp39-abi3-win_amd64.whl", hash = "sha256:158fecb38be5b15db0190ef7238e5248f24bf32ae3cab93bc1197e293a5641eb", size = 12572 },
+    { url = "https://files.pythonhosted.org/packages/63/06/27f678b9874e4e2e39ddc6f5cce3374c8c60e6046ea8588a491ab6fc9fcb/tree_sitter_embedded_template-0.23.2-cp39-abi3-win_arm64.whl", hash = "sha256:9f1f3b79fe273f3d15a5b64c85fc6ebfb48decfbe8542accd05f5b7694860df0", size = 11232 },
+]
+
+[[package]]
+name = "tree-sitter-language-pack"
+version = "0.7.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "tree-sitter" },
+    { name = "tree-sitter-c-sharp" },
+    { name = "tree-sitter-embedded-template" },
+    { name = "tree-sitter-yaml" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/9b/1e/2d63d93025fd5b527327c3fd348955cebaec02a3f1bcec88ab4d88ddfc39/tree_sitter_language_pack-0.7.2.tar.gz", hash = "sha256:46fc96cc3bddfee7091fdedec2ae7e34218679e58241e8319bf82026f6d02eae", size = 59264078 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/da/9d/2c6272bf4fd18a22d8c07d3c983940dbece4f0e9e21f5c78f15a2740f435/tree_sitter_language_pack-0.7.2-cp39-abi3-macosx_10_13_universal2.whl", hash = "sha256:4036603020bd32060d9931a64f8c3d8637de575f350f11534971012e51a27a95", size = 28132977 },
+    { url = "https://files.pythonhosted.org/packages/2b/e2/0f2511019c27b870061f9ad719074095ef84cd7857a730765bfa066384be/tree_sitter_language_pack-0.7.2-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:801926dbc81eeca4ce97b846cc899dcf3fecfdc3b2514a68eeeb118f70ac686d", size = 17576769 },
+    { url = "https://files.pythonhosted.org/packages/3a/88/7b38233def5c359503ad4d36533f96f9fe2943a8eeeced66b36312c49e1b/tree_sitter_language_pack-0.7.2-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:77be80335fb585f48eb268b0e07ca54f3da8f30c2eab7be749113f116c3ef316", size = 17433872 },
+    { url = "https://files.pythonhosted.org/packages/f8/27/fc5dce240b68a1ed876bc80b2238fbaaa0f695dbaf88660728a0239a2b20/tree_sitter_language_pack-0.7.2-cp39-abi3-win_amd64.whl", hash = "sha256:d71c6b4c14b3370ca783319ede7a581a10e6dd1bdfe5d31d316d9216981a6406", size = 14316050 },
+]
+
+[[package]]
+name = "tree-sitter-yaml"
+version = "0.7.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/93/04/6de8be8112c50450cab753fcd6b74d8368c60f6099bf551cee0bec69563a/tree_sitter_yaml-0.7.0.tar.gz", hash = "sha256:9c8bb17d9755c3b0e757260917240c0d19883cd3b59a5d74f205baa8bf8435a4", size = 85085 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/69/1d/243dbdf59fae8a4109e19f0994e2627ddedb2e16b7cf99bd42be64367742/tree_sitter_yaml-0.7.0-cp39-abi3-macosx_10_9_x86_64.whl", hash = "sha256:e21553ac190ae05bf82796df8beb4d9158ba195b5846018cb36fbc3a35bd0679", size = 43335 },
+    { url = "https://files.pythonhosted.org/packages/e2/63/e5d5868a1498e20fd07e7db62933766fd64950279862e3e7f150b88ec69d/tree_sitter_yaml-0.7.0-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:c022054f1f9b54201082ea83073a6c24c42d0436ad8ee99ff2574cba8f928c28", size = 44574 },
+    { url = "https://files.pythonhosted.org/packages/f5/ba/9cff9a3fddb1b6b38bc71ce1dfdb8892ab15a4042c104f4582e30318b412/tree_sitter_yaml-0.7.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1cd1725142f19e41c51d27c99cfc60780f596e069eb181cfa6433d993a19aa3d", size = 93088 },
+    { url = "https://files.pythonhosted.org/packages/19/09/39d29d9a22cee0b3c3e4f3fdbd23e4534b9c2a84b5f962f369eafcfbf88c/tree_sitter_yaml-0.7.0-cp39-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9d1b268378254f75bb27396d83c96d886ccbfcda6bd8c2778e94e3e1d2459085", size = 91367 },
+    { url = "https://files.pythonhosted.org/packages/b0/b7/285653b894b351436917b5fe5e738eecaeb2128b4e4bf72bfe0c6043f62e/tree_sitter_yaml-0.7.0-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:27c2e7f4f49ddf410003abbb82a7b00ec77ea263d8ef08dbce1a15d293eed2fd", size = 87405 },
+    { url = "https://files.pythonhosted.org/packages/bb/73/0cdc82ea653c190475a4f63dd4a1f4efd5d1c7d09d2668b8d84008a4c4f8/tree_sitter_yaml-0.7.0-cp39-abi3-win_amd64.whl", hash = "sha256:98dce0d6bc376f842cfb1d3c32512eea95b37e61cd2c87074bb4b05c999917c8", size = 45360 },
+    { url = "https://files.pythonhosted.org/packages/2e/32/af2d676b0176a958f22a75b04be836e09476a10844baab78c018a5030297/tree_sitter_yaml-0.7.0-cp39-abi3-win_arm64.whl", hash = "sha256:f0f8d8e05fa8e70f08d0f18a209d6026e171844f4ea7090e7c779b9c375b3a31", size = 43650 },
+]
+
 [[package]]
 name = "triton"
 version = "3.2.0"
@@ -3333,6 +3877,38 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/c7/30/37a3384d1e2e9320331baca41e835e90a3767303642c7a80d4510152cbcf/triton-3.2.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e5dfa23ba84541d7c0a531dfce76d8bcd19159d50a4a8b14ad01e91734a5c1b0", size = 253154278 },
 ]
 
+[[package]]
+name = "twine"
+version = "6.1.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "id" },
+    { name = "keyring", marker = "platform_machine != 'ppc64le' and platform_machine != 's390x'" },
+    { name = "packaging" },
+    { name = "readme-renderer" },
+    { name = "requests" },
+    { name = "requests-toolbelt" },
+    { name = "rfc3986" },
+    { name = "rich" },
+    { name = "urllib3" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/c8/a2/6df94fc5c8e2170d21d7134a565c3a8fb84f9797c1dd65a5976aaf714418/twine-6.1.0.tar.gz", hash = "sha256:be324f6272eff91d07ee93f251edf232fc647935dd585ac003539b42404a8dbd", size = 168404 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7c/b6/74e927715a285743351233f33ea3c684528a0d374d2e43ff9ce9585b73fe/twine-6.1.0-py3-none-any.whl", hash = "sha256:a47f973caf122930bf0fbbf17f80b83bc1602c9ce393c7845f289a3001dc5384", size = 40791 },
+]
+
+[[package]]
+name = "types-requests"
+version = "2.32.0.20250328"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "urllib3" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/00/7d/eb174f74e3f5634eaacb38031bbe467dfe2e545bc255e5c90096ec46bc46/types_requests-2.32.0.20250328.tar.gz", hash = "sha256:c9e67228ea103bd811c96984fac36ed2ae8da87a36a633964a21f199d60baf32", size = 22995 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/cc/15/3700282a9d4ea3b37044264d3e4d1b1f0095a4ebf860a99914fd544e3be3/types_requests-2.32.0.20250328-py3-none-any.whl", hash = "sha256:72ff80f84b15eb3aa7a8e2625fffb6a93f2ad5a0c20215fc1dcfa61117bcb2a2", size = 20663 },
+]
+
 [[package]]
 name = "typing-extensions"
 version = "4.12.2"
@@ -3366,7 +3942,7 @@ wheels = [
 
 [[package]]
 name = "unstructured"
-version = "0.16.25"
+version = "0.17.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "backoff" },
@@ -3391,9 +3967,9 @@ dependencies = [
     { name = "unstructured-client" },
     { name = "wrapt" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/64/31/98c4c78e305d1294888adf87fd5ee30577a4c393951341ca32b43f167f1e/unstructured-0.16.25.tar.gz", hash = "sha256:73b9b0f51dbb687af572ecdb849a6811710b9cac797ddeab8ee80fa07d8aa5e6", size = 1683097 }
+sdist = { url = "https://files.pythonhosted.org/packages/b4/49/b95ff4b609d7328cd0394ac9d8ad69839e11a1f879462496afcf4887154a/unstructured-0.17.2.tar.gz", hash = "sha256:af18c3caef0a6c562cf77e34ee8b6ff522b605031d2336ffe565df66f126aa46", size = 1684745 }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/12/4f/ad08585b5c8a33c82ea119494c4d3023f4796958c56e668b15cc282ec0a0/unstructured-0.16.25-py3-none-any.whl", hash = "sha256:14719ccef2830216cf1c5bf654f75e2bf07b17ca5dcee9da5ac74618130fd337", size = 1769286 },
+    { url = "https://files.pythonhosted.org/packages/cb/88/061a9dedd4e8cc0c31097c3275a9ef1fd7307e26afac5cd582487386e1b8/unstructured-0.17.2-py3-none-any.whl", hash = "sha256:527dd26a4b273aebef2f9119c9d4f0d0ce17640038d92296d23abe89be123840", size = 1771563 },
 ]
 
 [package.optional-dependencies]
@@ -3403,6 +3979,7 @@ all-docs = [
     { name = "markdown" },
     { name = "networkx" },
     { name = "onnx" },
+    { name = "onnxruntime" },
     { name = "openpyxl" },
     { name = "pandas" },
     { name = "pdf2image" },
@@ -1,7 +1,7 @@
 {
   "name": "surfsense_browser_extension",
   "displayName": "Surfsense Browser Extension",
-  "version": "0.0.6",
+  "version": "0.0.7",
   "description": "Extension to collect Browsing History for SurfSense.",
   "author": "https://github.com/MODSetter",
   "scripts": {
@@ -1 +1,3 @@
 NEXT_PUBLIC_FASTAPI_BACKEND_URL=http://localhost:8000
+NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE=LOCAL or GOOGLE
+NEXT_PUBLIC_ETL_SERVICE=UNSTRUCTURED or LLAMACLOUD
@@ -3,7 +3,7 @@
 import { useState, useEffect } from 'react';
 import { motion, AnimatePresence } from 'framer-motion';
 import { useSearchParams } from 'next/navigation';
-import { MessageCircleMore, Search, Calendar, Tag, Trash2, ExternalLink, MoreHorizontal } from 'lucide-react';
+import { MessageCircleMore, Search, Calendar, Tag, Trash2, ExternalLink, MoreHorizontal, Radio, CheckCircle, Circle, Podcast } from 'lucide-react';
 import { format } from 'date-fns';
 
 // UI Components
@@ -42,6 +42,9 @@ import {
   SelectTrigger,
   SelectValue,
 } from "@/components/ui/select";
+import { Checkbox } from "@/components/ui/checkbox";
+import { Label } from "@/components/ui/label";
+import { toast } from "sonner";
 
 interface Chat {
   created_at: string;
@@ -92,6 +95,18 @@ export default function ChatsPageClient({ searchSpaceId }: ChatsPageClientProps)
   const [chatToDelete, setChatToDelete] = useState<{ id: number, title: string } | null>(null);
   const [isDeleting, setIsDeleting] = useState(false);
 
+  // New state for podcast generation
+  const [selectedChats, setSelectedChats] = useState<number[]>([]);
+  const [selectionMode, setSelectionMode] = useState(false);
+  const [podcastDialogOpen, setPodcastDialogOpen] = useState(false);
+  const [podcastTitle, setPodcastTitle] = useState("");
+  const [isGeneratingPodcast, setIsGeneratingPodcast] = useState(false);
+
+  // New state for individual podcast generation
+  const [currentChatIndex, setCurrentChatIndex] = useState(0);
+  const [podcastTitles, setPodcastTitles] = useState<{[key: number]: string}>({});
+  const [processingChat, setProcessingChat] = useState<Chat | null>(null);
+
   const chatsPerPage = 9;
   const searchParams = useSearchParams();
@@ -234,6 +249,177 @@ export default function ChatsPageClient({ searchSpaceId }: ChatsPageClientProps)
  // Get unique chat types for filter dropdown
  const chatTypes = ['all', ...Array.from(new Set(chats.map(chat => chat.type)))];

  // Generate individual podcasts from selected chats
  const handleGeneratePodcast = async () => {
    if (selectedChats.length === 0) {
      toast.error("Please select at least one chat");
      return;
    }

    const currentChatId = selectedChats[currentChatIndex];
    const currentTitle = podcastTitles[currentChatId] || podcastTitle;

    if (!currentTitle.trim()) {
      toast.error("Please enter a podcast title");
      return;
    }

    setIsGeneratingPodcast(true);
    try {
      const token = localStorage.getItem('surfsense_bearer_token');
      if (!token) {
        toast.error("Authentication error. Please log in again.");
        setIsGeneratingPodcast(false);
        return;
      }

      // Create payload for single chat
      const payload = {
        type: "CHAT",
        ids: [currentChatId], // Single chat ID
        search_space_id: parseInt(searchSpaceId),
        podcast_title: currentTitle
      };

      const response = await fetch(`${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/api/v1/podcasts/generate/`, {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${token}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify(payload)
      });

      if (!response.ok) {
        const errorData = await response.json().catch(() => ({}));
        throw new Error(errorData.detail || "Failed to generate podcast");
      }

      const data = await response.json();
      toast.success(`Podcast "${currentTitle}" generation started!`);

      // Move to the next chat or finish
      if (currentChatIndex < selectedChats.length - 1) {
        // Set up for next chat
        setCurrentChatIndex(currentChatIndex + 1);

        // Find the next chat from the chats array
        const nextChatId = selectedChats[currentChatIndex + 1];
        const nextChat = chats.find(chat => chat.id === nextChatId) || null;
        setProcessingChat(nextChat);

        // Default title for the next chat
        if (!podcastTitles[nextChatId]) {
          setPodcastTitle(nextChat?.title || `Podcast from Chat ${nextChatId}`);
        } else {
          setPodcastTitle(podcastTitles[nextChatId]);
        }

        setIsGeneratingPodcast(false);
      } else {
        // All done
        finishPodcastGeneration();
      }
    } catch (error) {
      console.error('Error generating podcast:', error);
      toast.error(error instanceof Error ? error.message : 'Failed to generate podcast');
      setIsGeneratingPodcast(false);
    }
  };

  // Helper to finish the podcast generation process
  const finishPodcastGeneration = () => {
    toast.success("All podcasts are being generated! Check the podcasts tab to see them when ready.");
    setPodcastDialogOpen(false);
    setSelectedChats([]);
    setSelectionMode(false);
    setCurrentChatIndex(0);
    setPodcastTitles({});
    setProcessingChat(null);
    setPodcastTitle("");
    setIsGeneratingPodcast(false);
  };

  // Start podcast generation flow
  const startPodcastGeneration = () => {
    if (selectedChats.length === 0) {
      toast.error("Please select at least one chat");
      return;
    }

    // Reset the state for podcast generation
    setCurrentChatIndex(0);
    setPodcastTitles({});

    // Set up for the first chat
    const firstChatId = selectedChats[0];
    const firstChat = chats.find(chat => chat.id === firstChatId) || null;
    setProcessingChat(firstChat);

    // Set default title for the first chat
    setPodcastTitle(firstChat?.title || `Podcast from Chat ${firstChatId}`);
    setPodcastDialogOpen(true);
  };

  // Update the title for the current chat
  const updateCurrentChatTitle = (title: string) => {
    const currentChatId = selectedChats[currentChatIndex];
    setPodcastTitle(title);
    setPodcastTitles(prev => ({
      ...prev,
      [currentChatId]: title
    }));
  };

  // Skip generating a podcast for the current chat
  const skipCurrentChat = () => {
    if (currentChatIndex < selectedChats.length - 1) {
      // Move to the next chat
      setCurrentChatIndex(currentChatIndex + 1);

      // Find the next chat
      const nextChatId = selectedChats[currentChatIndex + 1];
      const nextChat = chats.find(chat => chat.id === nextChatId) || null;
      setProcessingChat(nextChat);

      // Set default title for the next chat
      if (!podcastTitles[nextChatId]) {
        setPodcastTitle(nextChat?.title || `Podcast from Chat ${nextChatId}`);
      } else {
        setPodcastTitle(podcastTitles[nextChatId]);
      }
    } else {
      // All done (all skipped)
      finishPodcastGeneration();
    }
  };

  // Toggle chat selection
  const toggleChatSelection = (chatId: number) => {
    setSelectedChats(prev =>
      prev.includes(chatId)
        ? prev.filter(id => id !== chatId)
        : [...prev, chatId]
    );
  };

  // Select all visible chats
  const selectAllVisibleChats = () => {
    const visibleChatIds = currentChats.map(chat => chat.id);
    setSelectedChats(prev => {
      const allSelected = visibleChatIds.every(id => prev.includes(id));
      return allSelected
        ? prev.filter(id => !visibleChatIds.includes(id)) // Deselect all visible if all are selected
        : [...new Set([...prev, ...visibleChatIds])]; // Add all visible, ensuring no duplicates
    });
  };

  // Cancel selection mode
  const cancelSelectionMode = () => {
    setSelectionMode(false);
    setSelectedChats([]);
  };

  return (
    <motion.div
      className="container p-6 mx-auto"
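The selection handlers in the hunk above are thin wrappers around two array transformations. A minimal sketch of that logic as pure functions (the names `toggleSelection` and `togglePageSelection` are illustrative, not part of the component):

```typescript
// Toggle one chat id in the selection (mirrors toggleChatSelection).
function toggleSelection(selected: number[], chatId: number): number[] {
  return selected.includes(chatId)
    ? selected.filter((id) => id !== chatId) // already selected: remove it
    : [...selected, chatId]; // not selected: append it
}

// Select-all / deselect-all for the ids visible on the current page
// (mirrors selectAllVisibleChats): if every visible id is already selected,
// drop them; otherwise add them, deduplicating via a Set.
function togglePageSelection(selected: number[], visible: number[]): number[] {
  const allSelected = visible.every((id) => selected.includes(id));
  return allSelected
    ? selected.filter((id) => !visible.includes(id))
    : [...new Set([...selected, ...visible])];
}
```

Keeping the transformation pure and passing it to the state setter as `setSelectedChats(prev => ...)` avoids stale-closure bugs when clicks arrive in quick succession.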
@@ -278,18 +464,63 @@ export default function ChatsPageClient({ searchSpaceId }: ChatsPageClientProps)
          </Select>
        </div>

        <div className="flex items-center gap-2">
          {selectionMode ? (
            <>
              <Button
                variant="outline"
                size="sm"
                onClick={selectAllVisibleChats}
                className="gap-1"
                title="Select or deselect all chats on the current page"
              >
                <CheckCircle className="h-4 w-4" />
                {currentChats.every(chat => selectedChats.includes(chat.id))
                  ? "Deselect Page"
                  : "Select Page"}
              </Button>
              <Button
                variant="default"
                size="sm"
                onClick={startPodcastGeneration}
                className="gap-1"
                disabled={selectedChats.length === 0}
              >
                <Podcast className="h-4 w-4" />
                Generate Podcast ({selectedChats.length})
              </Button>
              <Button
                variant="ghost"
                size="sm"
                onClick={cancelSelectionMode}
              >
                Cancel
              </Button>
            </>
          ) : (
            <>
              <Button
                variant="outline"
                size="sm"
                onClick={() => setSelectionMode(true)}
                className="gap-1"
              >
                <Podcast className="h-4 w-4" />
                Podcaster
              </Button>
              <Select value={sortOrder} onValueChange={setSortOrder}>
                <SelectTrigger className="w-40">
                  <SelectValue placeholder="Sort order" />
                </SelectTrigger>
                <SelectContent>
                  <SelectGroup>
                    <SelectItem value="newest">Newest First</SelectItem>
                    <SelectItem value="oldest">Oldest First</SelectItem>
                  </SelectGroup>
                </SelectContent>
              </Select>
            </>
          )}
        </div>
      </div>
@@ -334,44 +565,79 @@ export default function ChatsPageClient({ searchSpaceId }: ChatsPageClientProps)
            animate="animate"
            exit="exit"
            transition={{ duration: 0.2, delay: index * 0.05 }}
            className={`overflow-hidden hover:shadow-md transition-shadow
              ${selectionMode && selectedChats.includes(chat.id)
                ? 'ring-2 ring-primary ring-offset-2' : ''}`}
            onClick={(e) => {
              if (!selectionMode) return;
              // Ignore clicks coming from interactive elements
              if ((e.target as HTMLElement).closest('button, a, [data-stop-selection]')) return;
              toggleChatSelection(chat.id);
            }}
          >
            <CardHeader className="pb-3">
              <div className="flex justify-between items-start">
                <div className="space-y-1 flex items-start gap-2">
                  {selectionMode && (
                    <div className="mt-1">
                      {selectedChats.includes(chat.id)
                        ? <CheckCircle className="h-4 w-4 text-primary" />
                        : <Circle className="h-4 w-4 text-muted-foreground" />}
                    </div>
                  )}
                  <div>
                    <CardTitle className="line-clamp-1">{chat.title || `Chat ${chat.id}`}</CardTitle>
                    <CardDescription>
                      <span className="flex items-center gap-1">
                        <Calendar className="h-3.5 w-3.5" />
                        <span>{format(new Date(chat.created_at), 'MMM d, yyyy')}</span>
                      </span>
                    </CardDescription>
                  </div>
                </div>
                {!selectionMode && (
                  <DropdownMenu>
                    <DropdownMenuTrigger asChild>
                      <Button
                        variant="ghost"
                        size="icon"
                        className="h-8 w-8"
                        data-stop-selection
                      >
                        <MoreHorizontal className="h-4 w-4" />
                        <span className="sr-only">Open menu</span>
                      </Button>
                    </DropdownMenuTrigger>
                    <DropdownMenuContent align="end">
                      <DropdownMenuItem onClick={() => window.location.href = `/dashboard/${chat.search_space_id}/researcher/${chat.id}`}>
                        <ExternalLink className="mr-2 h-4 w-4" />
                        <span>View Chat</span>
                      </DropdownMenuItem>
                      <DropdownMenuItem
                        onClick={() => {
                          setSelectedChats([chat.id]);
                          setPodcastTitle(chat.title || `Chat ${chat.id}`);
                          setPodcastDialogOpen(true);
                        }}
                      >
                        <Podcast className="mr-2 h-4 w-4" />
                        <span>Generate Podcast</span>
                      </DropdownMenuItem>
                      <DropdownMenuSeparator />
                      <DropdownMenuItem
                        className="text-destructive focus:text-destructive"
                        onClick={(e) => {
                          e.stopPropagation();
                          setChatToDelete({ id: chat.id, title: chat.title || `Chat ${chat.id}` });
                          setDeleteDialogOpen(true);
                        }}
                      >
                        <Trash2 className="mr-2 h-4 w-4" />
                        <span>Delete Chat</span>
                      </DropdownMenuItem>
                    </DropdownMenuContent>
                  </DropdownMenu>
                )}
              </div>
            </CardHeader>
            <CardContent>
@@ -505,6 +771,104 @@ export default function ChatsPageClient({ searchSpaceId }: ChatsPageClientProps)
          </DialogFooter>
        </DialogContent>
      </Dialog>

      {/* Podcast Generation Dialog */}
      <Dialog
        open={podcastDialogOpen}
        onOpenChange={(isOpen: boolean) => {
          if (!isOpen) {
            // Cancel the process if dialog is closed
            setPodcastDialogOpen(false);
            setSelectedChats([]);
            setSelectionMode(false);
            setCurrentChatIndex(0);
            setPodcastTitles({});
            setProcessingChat(null);
            setPodcastTitle("");
          } else {
            setPodcastDialogOpen(true);
          }
        }}
      >
        <DialogContent className="sm:max-w-md">
          <DialogHeader>
            <DialogTitle className="flex items-center gap-2">
              <Podcast className="h-5 w-5 text-primary" />
              <span>Generate Podcast {currentChatIndex + 1} of {selectedChats.length}</span>
            </DialogTitle>
            <DialogDescription>
              {selectedChats.length > 1 ? (
                <>Creating individual podcasts for each selected chat. Currently processing: <span className="font-medium">{processingChat?.title || `Chat ${selectedChats[currentChatIndex]}`}</span></>
              ) : (
                <>Create a podcast from this chat. The podcast will be available in the podcasts section once generated.</>
              )}
            </DialogDescription>
          </DialogHeader>

          <div className="space-y-4 py-2">
            <div className="space-y-2">
              <Label htmlFor="podcast-title">Podcast Title</Label>
              <Input
                id="podcast-title"
                placeholder="Enter podcast title"
                value={podcastTitle}
                onChange={(e) => updateCurrentChatTitle(e.target.value)}
              />
            </div>

            {selectedChats.length > 1 && (
              <div className="w-full bg-muted rounded-full h-2.5 mt-4">
                <div
                  className="bg-primary h-2.5 rounded-full transition-all duration-300"
                  style={{ width: `${((currentChatIndex) / selectedChats.length) * 100}%` }}
                ></div>
              </div>
            )}
          </div>

          <DialogFooter className="flex gap-2 sm:justify-end">
            {selectedChats.length > 1 && !isGeneratingPodcast && (
              <Button
                variant="outline"
                onClick={skipCurrentChat}
                className="gap-1"
              >
                Skip
              </Button>
            )}
            <Button
              variant="outline"
              onClick={() => {
                setPodcastDialogOpen(false);
                setCurrentChatIndex(0);
                setPodcastTitles({});
                setProcessingChat(null);
              }}
              disabled={isGeneratingPodcast}
            >
              Cancel
            </Button>
            <Button
              variant="default"
              onClick={handleGeneratePodcast}
              disabled={isGeneratingPodcast}
              className="gap-2"
            >
              {isGeneratingPodcast ? (
                <>
                  <span className="h-4 w-4 animate-spin rounded-full border-2 border-current border-t-transparent" />
                  Generating...
                </>
              ) : (
                <>
                  <Podcast className="h-4 w-4" />
                  Generate Podcast
                </>
              )}
            </Button>
          </DialogFooter>
        </DialogContent>
      </Dialog>
    </motion.div>
  );
}
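The generate handler posts one request per selected chat to `/api/v1/podcasts/generate/`. A minimal sketch of the request body it builds, with field names taken from the diff (`buildPodcastPayload` itself is a hypothetical helper, not part of the component):

```typescript
// Shape of the body sent to the podcasts endpoint, per the diff above.
interface PodcastPayload {
  type: "CHAT";
  ids: number[];
  search_space_id: number;
  podcast_title: string;
}

// Hypothetical helper: assembles the per-chat payload the way
// handleGeneratePodcast does inline.
function buildPodcastPayload(
  chatId: number,
  searchSpaceId: string, // route param arrives as a string
  title: string,
): PodcastPayload {
  return {
    type: "CHAT",
    ids: [chatId], // one podcast per chat, so a single-element array
    search_space_id: parseInt(searchSpaceId, 10),
    podcast_title: title,
  };
}
```

Because each chat becomes its own request, a failure for one chat surfaces as a toast without aborting the titles already queued for the remaining chats.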
@@ -8,8 +8,8 @@ interface PageProps {
 }

 export default async function ChatsPage({ params }: PageProps) {
-  // Await params to properly access dynamic route parameters
-  const searchSpaceId = params.search_space_id;
+  // Get search space ID from the route parameter
+  const { search_space_id: searchSpaceId } = await Promise.resolve(params);

   return (
     <Suspense fallback={<div className="flex items-center justify-center h-[60vh]">
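The `await Promise.resolve(params)` change above appears aimed at newer Next.js versions, where the `params` prop of an App Router page may be a Promise; wrapping it in `Promise.resolve` accepts both the plain-object and Promise forms. A minimal sketch of that idea (the `PageProps` and `resolveSearchSpaceId` names here are local stand-ins, not the page's actual code):

```typescript
// Stand-in for the page's props: params may be a plain object or a Promise.
interface PageProps {
  params: { search_space_id: string } | Promise<{ search_space_id: string }>;
}

// Promise.resolve(x) returns x's settled value whether or not x is a Promise,
// so one await handles both shapes of the params prop.
async function resolveSearchSpaceId(props: PageProps): Promise<string> {
  const { search_space_id } = await Promise.resolve(props.params);
  return search_space_id;
}
```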
@@ -4,255 +4,308 @@ import { useState, useEffect } from "react";
import { useRouter, useParams } from "next/navigation";
import { motion } from "framer-motion";
import { toast } from "sonner";
import {
  Edit,
  Plus,
  Search,
  Trash2,
  ExternalLink,
  RefreshCw,
} from "lucide-react";

import { useSearchSourceConnectors } from "@/hooks/useSearchSourceConnectors";
import { Button } from "@/components/ui/button";
import {
  Card,
  CardContent,
  CardDescription,
  CardFooter,
  CardHeader,
  CardTitle,
} from "@/components/ui/card";
import {
  Table,
  TableBody,
  TableCell,
  TableHead,
  TableHeader,
  TableRow,
} from "@/components/ui/table";
import {
  AlertDialog,
  AlertDialogAction,
  AlertDialogCancel,
  AlertDialogContent,
  AlertDialogDescription,
  AlertDialogFooter,
  AlertDialogHeader,
  AlertDialogTitle,
  AlertDialogTrigger,
} from "@/components/ui/alert-dialog";
import {
  Tooltip,
  TooltipContent,
  TooltipProvider,
  TooltipTrigger,
} from "@/components/ui/tooltip";
import { getConnectorIcon } from "@/components/chat";

// Helper function to get connector type display name
const getConnectorTypeDisplay = (type: string): string => {
  const typeMap: Record<string, string> = {
    SERPER_API: "Serper API",
    TAVILY_API: "Tavily API",
    SLACK_CONNECTOR: "Slack",
    NOTION_CONNECTOR: "Notion",
    GITHUB_CONNECTOR: "GitHub",
    LINEAR_CONNECTOR: "Linear",
    LINKUP_API: "Linkup",
    // Add other connector types here as needed
  };
  return typeMap[type] || type;
};

// Helper function to format date with time
const formatDateTime = (dateString: string | null): string => {
  if (!dateString) return "Never";

  const date = new Date(dateString);
  return new Intl.DateTimeFormat("en-US", {
    year: "numeric",
    month: "short",
    day: "numeric",
    hour: "2-digit",
    minute: "2-digit",
  }).format(date);
};

export default function ConnectorsPage() {
  const router = useRouter();
  const params = useParams();
  const searchSpaceId = params.search_space_id as string;

  const { connectors, isLoading, error, deleteConnector, indexConnector } =
    useSearchSourceConnectors();
  const [connectorToDelete, setConnectorToDelete] = useState<number | null>(
    null,
  );
  const [indexingConnectorId, setIndexingConnectorId] = useState<number | null>(
    null,
  );

  useEffect(() => {
    if (error) {
      toast.error("Failed to load connectors");
      console.error("Error fetching connectors:", error);
    }
  }, [error]);

  // Handle connector deletion
  const handleDeleteConnector = async () => {
    if (connectorToDelete === null) return;

    try {
      await deleteConnector(connectorToDelete);
      toast.success("Connector deleted successfully");
    } catch (error) {
      console.error("Error deleting connector:", error);
      toast.error("Failed to delete connector");
    } finally {
      setConnectorToDelete(null);
    }
  };

  // Handle connector indexing
  const handleIndexConnector = async (connectorId: number) => {
    setIndexingConnectorId(connectorId);
    try {
      await indexConnector(connectorId, searchSpaceId);
      toast.success("Connector content indexed successfully");
    } catch (error) {
      console.error("Error indexing connector content:", error);
      toast.error(
        error instanceof Error
          ? error.message
          : "Failed to index connector content",
      );
    } finally {
      setIndexingConnectorId(null);
    }
  };

  return (
    <div className="container mx-auto py-8 max-w-6xl">
      <motion.div
        initial={{ opacity: 0, y: 20 }}
        animate={{ opacity: 1, y: 0 }}
        transition={{ duration: 0.5 }}
        className="mb-8 flex items-center justify-between"
      >
        <div>
          <h1 className="text-3xl font-bold tracking-tight">Connectors</h1>
          <p className="text-muted-foreground mt-2">
            Manage your connected services and data sources.
          </p>
        </div>
        <Button
          onClick={() =>
            router.push(`/dashboard/${searchSpaceId}/connectors/add`)
          }
        >
          <Plus className="mr-2 h-4 w-4" />
          Add Connector
        </Button>
      </motion.div>

      <Card>
        <CardHeader className="pb-3">
          <CardTitle>Your Connectors</CardTitle>
          <CardDescription>
            View and manage all your connected services.
          </CardDescription>
        </CardHeader>
        <CardContent>
          {isLoading ? (
            <div className="flex justify-center py-8">
              <div className="animate-pulse text-center">
                <div className="h-6 w-32 bg-muted rounded mx-auto mb-2"></div>
                <div className="h-4 w-48 bg-muted rounded mx-auto"></div>
              </div>
            </div>
          ) : connectors.length === 0 ? (
            <div className="text-center py-12">
              <h3 className="text-lg font-medium mb-2">No connectors found</h3>
              <p className="text-muted-foreground mb-6">
                You haven't added any connectors yet. Add one to enhance your
                search capabilities.
              </p>
              <Button
                onClick={() =>
                  router.push(`/dashboard/${searchSpaceId}/connectors/add`)
                }
              >
                <Plus className="mr-2 h-4 w-4" />
                Add Your First Connector
              </Button>
            </div>
          ) : (
            <div className="rounded-md border">
              <Table>
                <TableHeader>
                  <TableRow>
                    <TableHead>Name</TableHead>
                    <TableHead>Type</TableHead>
                    <TableHead>Last Indexed</TableHead>
                    <TableHead className="text-right">Actions</TableHead>
                  </TableRow>
                </TableHeader>
                <TableBody>
                  {connectors.map((connector) => (
                    <TableRow key={connector.id}>
                      <TableCell className="font-medium">
                        {connector.name}
                      </TableCell>
                      <TableCell>
                        {getConnectorIcon(connector.connector_type)}
                      </TableCell>
                      <TableCell>
                        {connector.is_indexable
                          ? formatDateTime(connector.last_indexed_at)
                          : "Not indexable"}
                      </TableCell>
                      <TableCell className="text-right">
                        <div className="flex justify-end gap-2">
                          {connector.is_indexable && (
                            <TooltipProvider>
                              <Tooltip>
                                <TooltipTrigger asChild>
                                  <Button
                                    variant="outline"
                                    size="sm"
                                    onClick={() =>
                                      handleIndexConnector(connector.id)
                                    }
                                    disabled={
                                      indexingConnectorId === connector.id
                                    }
                                  >
                                    {indexingConnectorId === connector.id ? (
                                      <RefreshCw className="h-4 w-4 animate-spin" />
                                    ) : (
                                      <RefreshCw className="h-4 w-4" />
                                    )}
                                    <span className="sr-only">
                                      Index Content
                                    </span>
                                  </Button>
                                </TooltipTrigger>
                                <TooltipContent>
                                  <p>Index Content</p>
                                </TooltipContent>
                              </Tooltip>
                            </TooltipProvider>
                          )}
                          <Button
                            variant="outline"
                            size="sm"
                            onClick={() =>
                              router.push(
                                `/dashboard/${searchSpaceId}/connectors/${connector.id}/edit`,
                              )
                            }
                          >
                            <Edit className="h-4 w-4" />
                            <span className="sr-only">Edit</span>
                          </Button>
                          <AlertDialog>
                            <AlertDialogTrigger asChild>
                              <Button
                                variant="outline"
                                size="sm"
                                className="text-destructive-foreground hover:bg-destructive/10"
                                onClick={() =>
                                  setConnectorToDelete(connector.id)
                                }
                              >
                                <Trash2 className="h-4 w-4" />
                                <span className="sr-only">Delete</span>
                              </Button>
                            </AlertDialogTrigger>
                            <AlertDialogContent>
                              <AlertDialogHeader>
                                <AlertDialogTitle>
                                  Delete Connector
</AlertDialogTitle>
|
||||||
|
<AlertDialogDescription>
|
||||||
|
Are you sure you want to delete this
|
||||||
|
connector? This action cannot be undone.
|
||||||
|
</AlertDialogDescription>
|
||||||
|
</AlertDialogHeader>
|
||||||
|
<AlertDialogFooter>
|
||||||
|
<AlertDialogCancel
|
||||||
|
onClick={() => setConnectorToDelete(null)}
|
||||||
|
>
|
||||||
|
Cancel
|
||||||
|
</AlertDialogCancel>
|
||||||
|
<AlertDialogAction
|
||||||
|
className="bg-destructive text-destructive-foreground hover:bg-destructive/90"
|
||||||
|
onClick={handleDeleteConnector}
|
||||||
|
>
|
||||||
|
Delete
|
||||||
|
</AlertDialogAction>
|
||||||
|
</AlertDialogFooter>
|
||||||
|
</AlertDialogContent>
|
||||||
|
</AlertDialog>
|
||||||
|
</div>
|
||||||
|
</TableCell>
|
||||||
|
</TableRow>
|
||||||
|
))}
|
||||||
|
</TableBody>
|
||||||
|
</Table>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
|
||||||
|
|
@@ -1,6 +1,6 @@
 "use client";
 
-import React, { useEffect } from 'react';
+import React, { useEffect } from "react";
 import { useRouter, useParams } from "next/navigation";
 import { motion } from "framer-motion";
 import { toast } from "sonner";

@@ -8,169 +8,208 @@ import { ArrowLeft, Check, Loader2, Github } from "lucide-react";
import { Form } from "@/components/ui/form";
import { Button } from "@/components/ui/button";
import {
  Card,
  CardContent,
  CardDescription,
  CardFooter,
  CardHeader,
  CardTitle,
} from "@/components/ui/card";

// Import Utils, Types, Hook, and Components
import { getConnectorTypeDisplay } from "@/lib/connectors/utils";
import { useConnectorEditPage } from "@/hooks/useConnectorEditPage";
import { EditConnectorLoadingSkeleton } from "@/components/editConnector/EditConnectorLoadingSkeleton";
import { EditConnectorNameForm } from "@/components/editConnector/EditConnectorNameForm";
import { EditGitHubConnectorConfig } from "@/components/editConnector/EditGitHubConnectorConfig";
import { EditSimpleTokenForm } from "@/components/editConnector/EditSimpleTokenForm";
import { getConnectorIcon } from "@/components/chat";

export default function EditConnectorPage() {
  const router = useRouter();
  const params = useParams();
  const searchSpaceId = params.search_space_id as string;
  // Ensure connectorId is parsed safely
  const connectorIdParam = params.connector_id as string;
  const connectorId = connectorIdParam ? parseInt(connectorIdParam, 10) : NaN;

  // Use the custom hook to manage state and logic
  const {
    connectorsLoading,
    connector,
    isSaving,
    editForm,
    patForm, // Needed for GitHub child component
    handleSaveChanges,
    // GitHub specific props for the child component
    editMode,
    setEditMode, // Pass down if needed by GitHub component
    originalPat,
    currentSelectedRepos,
    fetchedRepos,
    setFetchedRepos,
    newSelectedRepos,
    setNewSelectedRepos,
    isFetchingRepos,
    handleFetchRepositories,
    handleRepoSelectionChange,
  } = useConnectorEditPage(connectorId, searchSpaceId);

  // Redirect if connectorId is not a valid number after parsing
  useEffect(() => {
    if (isNaN(connectorId)) {
      toast.error("Invalid Connector ID.");
      router.push(`/dashboard/${searchSpaceId}/connectors`);
    }
  }, [connectorId, router, searchSpaceId]);

  // Loading State
  if (connectorsLoading || !connector) {
    // Handle NaN case before showing skeleton
    if (isNaN(connectorId)) return null;
    return <EditConnectorLoadingSkeleton />;
  }

  // Main Render using data/handlers from the hook
  return (
    <div className="container mx-auto py-8 max-w-3xl">
      <Button
        variant="ghost"
        className="mb-6"
        onClick={() => router.push(`/dashboard/${searchSpaceId}/connectors`)}
      >
        <ArrowLeft className="mr-2 h-4 w-4" /> Back to Connectors
      </Button>

      <motion.div
        initial={{ opacity: 0, y: 20 }}
        animate={{ opacity: 1, y: 0 }}
        transition={{ duration: 0.5 }}
      >
        <Card className="border-2 border-border">
          <CardHeader>
            <CardTitle className="text-2xl font-bold flex items-center gap-2">
              {getConnectorIcon(connector.connector_type)}
              Edit {getConnectorTypeDisplay(connector.connector_type)} Connector
            </CardTitle>
            <CardDescription>
              Modify connector name and configuration.
            </CardDescription>
          </CardHeader>

          <Form {...editForm}>
            {/* Pass hook's handleSaveChanges */}
            <form
              onSubmit={editForm.handleSubmit(handleSaveChanges)}
              className="space-y-6"
            >
              <CardContent className="space-y-6">
                {/* Pass form control from hook */}
                <EditConnectorNameForm control={editForm.control} />

                <hr />

                <h3 className="text-lg font-semibold">Configuration</h3>

                {/* == GitHub == */}
                {connector.connector_type === "GITHUB_CONNECTOR" && (
                  <EditGitHubConnectorConfig
                    // Pass relevant state and handlers from hook
                    editMode={editMode}
                    setEditMode={setEditMode} // Pass setter if child manages mode
                    originalPat={originalPat}
                    currentSelectedRepos={currentSelectedRepos}
                    fetchedRepos={fetchedRepos}
                    newSelectedRepos={newSelectedRepos}
                    isFetchingRepos={isFetchingRepos}
                    patForm={patForm}
                    handleFetchRepositories={handleFetchRepositories}
                    handleRepoSelectionChange={handleRepoSelectionChange}
                    setNewSelectedRepos={setNewSelectedRepos}
                    setFetchedRepos={setFetchedRepos}
                  />
                )}

                {/* == Slack == */}
                {connector.connector_type === "SLACK_CONNECTOR" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="SLACK_BOT_TOKEN"
                    fieldLabel="Slack Bot Token"
                    fieldDescription="Update the Slack Bot Token if needed."
                    placeholder="Begins with xoxb-..."
                  />
                )}
                {/* == Notion == */}
                {connector.connector_type === "NOTION_CONNECTOR" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="NOTION_INTEGRATION_TOKEN"
                    fieldLabel="Notion Integration Token"
                    fieldDescription="Update the Notion Integration Token if needed."
                    placeholder="Begins with secret_..."
                  />
                )}
                {/* == Serper == */}
                {connector.connector_type === "SERPER_API" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="SERPER_API_KEY"
                    fieldLabel="Serper API Key"
                    fieldDescription="Update the Serper API Key if needed."
                  />
                )}
                {/* == Tavily == */}
                {connector.connector_type === "TAVILY_API" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="TAVILY_API_KEY"
                    fieldLabel="Tavily API Key"
                    fieldDescription="Update the Tavily API Key if needed."
                  />
                )}

                {/* == Linear == */}
                {connector.connector_type === "LINEAR_CONNECTOR" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="LINEAR_API_KEY"
                    fieldLabel="Linear API Key"
                    fieldDescription="Update your Linear API Key if needed."
                    placeholder="Begins with lin_api_..."
                  />
                )}

                {/* == Linkup == */}
                {connector.connector_type === "LINKUP_API" && (
                  <EditSimpleTokenForm
                    control={editForm.control}
                    fieldName="LINKUP_API_KEY"
                    fieldLabel="Linkup API Key"
                    fieldDescription="Update your Linkup API Key if needed."
                    placeholder="Begins with linkup_..."
                  />
                )}
              </CardContent>
              <CardFooter className="border-t pt-6">
                <Button
                  type="submit"
                  disabled={isSaving}
                  className="w-full sm:w-auto"
                >
                  {isSaving ? (
                    <Loader2 className="mr-2 h-4 w-4 animate-spin" />
                  ) : (
                    <Check className="mr-2 h-4 w-4" />
                  )}
                  Save Changes
                </Button>
              </CardFooter>
            </form>
          </Form>
        </Card>
      </motion.div>
    </div>
  );
}
@@ -52,6 +52,7 @@ const getConnectorTypeDisplay = (type: string): string => {
     "SLACK_CONNECTOR": "Slack Connector",
     "NOTION_CONNECTOR": "Notion Connector",
     "GITHUB_CONNECTOR": "GitHub Connector",
+    "LINKUP_API": "Linkup",
     // Add other connector types here as needed
   };
   return typeMap[type] || type;
@@ -87,7 +88,8 @@ export default function EditConnectorPage() {
     "TAVILY_API": "TAVILY_API_KEY",
     "SLACK_CONNECTOR": "SLACK_BOT_TOKEN",
     "NOTION_CONNECTOR": "NOTION_INTEGRATION_TOKEN",
-    "GITHUB_CONNECTOR": "GITHUB_PAT"
+    "GITHUB_CONNECTOR": "GITHUB_PAT",
+    "LINKUP_API": "LINKUP_API_KEY"
   };
   return fieldMap[connectorType] || "";
 };
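The hunk above extends a connector-type-to-config-field map that is read with a lookup-and-fallback expression (`fieldMap[connectorType] || ""`). A minimal standalone sketch of that pattern follows; the enclosing function's name is not visible in the hunk, so `getConnectorConfigField` is an assumed name here.

```typescript
// Sketch of the lookup-with-fallback pattern from the hunk above.
// Map contents mirror the diff; the function name is hypothetical.
const fieldMap: Record<string, string> = {
  TAVILY_API: "TAVILY_API_KEY",
  SLACK_CONNECTOR: "SLACK_BOT_TOKEN",
  NOTION_CONNECTOR: "NOTION_INTEGRATION_TOKEN",
  GITHUB_CONNECTOR: "GITHUB_PAT",
  LINKUP_API: "LINKUP_API_KEY",
};

function getConnectorConfigField(connectorType: string): string {
  // Unknown connector types fall back to an empty string.
  return fieldMap[connectorType] || "";
}

console.log(getConnectorConfigField("LINKUP_API")); // "LINKUP_API_KEY"
console.log(getConnectorConfigField("UNKNOWN")); // ""
```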
@@ -229,7 +231,9 @@ export default function EditConnectorPage() {
               ? "Notion Integration Token"
               : connector?.connector_type === "GITHUB_CONNECTOR"
                 ? "GitHub Personal Access Token (PAT)"
-                : "API Key"}
+                : connector?.connector_type === "LINKUP_API"
+                  ? "Linkup API Key"
+                  : "API Key"}
             </FormLabel>
             <FormControl>
               <Input
@@ -241,7 +245,9 @@ export default function EditConnectorPage() {
                 ? "Enter new Notion Token (optional)"
                 : connector?.connector_type === "GITHUB_CONNECTOR"
                   ? "Enter new GitHub PAT (optional)"
-                  : "Enter new API key (optional)"
+                  : connector?.connector_type === "LINKUP_API"
+                    ? "Enter new Linkup API Key (optional)"
+                    : "Enter new API key (optional)"
               }
               {...field}
             />
@@ -253,7 +259,9 @@ export default function EditConnectorPage() {
               ? "Enter a new Notion Integration Token or leave blank to keep your existing token."
               : connector?.connector_type === "GITHUB_CONNECTOR"
                 ? "Enter a new GitHub PAT or leave blank to keep your existing token."
-                : "Enter a new API key or leave blank to keep your existing key."}
+                : connector?.connector_type === "LINKUP_API"
+                  ? "Enter a new Linkup API Key or leave blank to keep your existing key."
+                  : "Enter a new API key or leave blank to keep your existing key."}
             </FormDescription>
             <FormMessage />
           </FormItem>
@@ -0,0 +1,207 @@
"use client";

import { useState } from "react";
import { useRouter, useParams } from "next/navigation";
import { motion } from "framer-motion";
import { zodResolver } from "@hookform/resolvers/zod";
import { useForm } from "react-hook-form";
import * as z from "zod";
import { toast } from "sonner";
import { ArrowLeft, Check, Info, Loader2 } from "lucide-react";

import { useSearchSourceConnectors } from "@/hooks/useSearchSourceConnectors";
import {
  Form,
  FormControl,
  FormDescription,
  FormField,
  FormItem,
  FormLabel,
  FormMessage,
} from "@/components/ui/form";
import { Input } from "@/components/ui/input";
import { Button } from "@/components/ui/button";
import {
  Card,
  CardContent,
  CardDescription,
  CardFooter,
  CardHeader,
  CardTitle,
} from "@/components/ui/card";
import {
  Alert,
  AlertDescription,
  AlertTitle,
} from "@/components/ui/alert";

// Define the form schema with Zod
const linkupApiFormSchema = z.object({
  name: z.string().min(3, {
    message: "Connector name must be at least 3 characters.",
  }),
  api_key: z.string().min(10, {
    message: "API key is required and must be valid.",
  }),
});

// Define the type for the form values
type LinkupApiFormValues = z.infer<typeof linkupApiFormSchema>;

export default function LinkupApiPage() {
  const router = useRouter();
  const params = useParams();
  const searchSpaceId = params.search_space_id as string;
  const [isSubmitting, setIsSubmitting] = useState(false);
  const { createConnector } = useSearchSourceConnectors();

  // Initialize the form
  const form = useForm<LinkupApiFormValues>({
    resolver: zodResolver(linkupApiFormSchema),
    defaultValues: {
      name: "Linkup API Connector",
      api_key: "",
    },
  });

  // Handle form submission
  const onSubmit = async (values: LinkupApiFormValues) => {
    setIsSubmitting(true);
    try {
      await createConnector({
        name: values.name,
        connector_type: "LINKUP_API",
        config: {
          LINKUP_API_KEY: values.api_key,
        },
        is_indexable: false,
        last_indexed_at: null,
      });

      toast.success("Linkup API connector created successfully!");

      // Navigate back to connectors page
      router.push(`/dashboard/${searchSpaceId}/connectors`);
    } catch (error) {
      console.error("Error creating connector:", error);
      toast.error(error instanceof Error ? error.message : "Failed to create connector");
    } finally {
      setIsSubmitting(false);
    }
  };

  return (
    <div className="container mx-auto py-8 max-w-3xl">
      <Button
        variant="ghost"
        className="mb-6"
        onClick={() => router.push(`/dashboard/${searchSpaceId}/connectors/add`)}
      >
        <ArrowLeft className="mr-2 h-4 w-4" />
        Back to Connectors
      </Button>

      <motion.div
        initial={{ opacity: 0, y: 20 }}
        animate={{ opacity: 1, y: 0 }}
        transition={{ duration: 0.5 }}
      >
        <Card className="border-2 border-border">
          <CardHeader>
            <CardTitle className="text-2xl font-bold">Connect Linkup API</CardTitle>
            <CardDescription>
              Integrate with Linkup API to enhance your search capabilities with AI-powered search results.
            </CardDescription>
          </CardHeader>
          <CardContent>
            <Alert className="mb-6 bg-muted">
              <Info className="h-4 w-4" />
              <AlertTitle>API Key Required</AlertTitle>
              <AlertDescription>
                You'll need a Linkup API key to use this connector. You can get one by signing up at{" "}
                <a
                  href="https://linkup.so"
                  target="_blank"
                  rel="noopener noreferrer"
                  className="font-medium underline underline-offset-4"
                >
                  linkup.so
                </a>
              </AlertDescription>
            </Alert>

            <Form {...form}>
              <form onSubmit={form.handleSubmit(onSubmit)} className="space-y-6">
                <FormField
                  control={form.control}
                  name="name"
                  render={({ field }) => (
                    <FormItem>
                      <FormLabel>Connector Name</FormLabel>
                      <FormControl>
                        <Input placeholder="My Linkup API Connector" {...field} />
                      </FormControl>
                      <FormDescription>
                        A friendly name to identify this connector.
                      </FormDescription>
                      <FormMessage />
                    </FormItem>
                  )}
                />

                <FormField
                  control={form.control}
                  name="api_key"
                  render={({ field }) => (
                    <FormItem>
                      <FormLabel>Linkup API Key</FormLabel>
                      <FormControl>
                        <Input
                          type="password"
                          placeholder="Enter your Linkup API key"
                          {...field}
                        />
                      </FormControl>
                      <FormDescription>
                        Your API key will be encrypted and stored securely.
                      </FormDescription>
                      <FormMessage />
                    </FormItem>
                  )}
                />

                <div className="flex justify-end">
                  <Button
                    type="submit"
                    disabled={isSubmitting}
                    className="w-full sm:w-auto"
                  >
                    {isSubmitting ? (
                      <>
                        <Loader2 className="mr-2 h-4 w-4 animate-spin" />
                        Connecting...
                      </>
                    ) : (
                      <>
                        <Check className="mr-2 h-4 w-4" />
                        Connect Linkup API
                      </>
                    )}
                  </Button>
                </div>
              </form>
            </Form>
          </CardContent>
          <CardFooter className="flex flex-col items-start border-t bg-muted/50 px-6 py-4">
            <h4 className="text-sm font-medium">What you get with Linkup API:</h4>
            <ul className="mt-2 list-disc pl-5 text-sm text-muted-foreground">
              <li>AI-powered search results tailored to your queries</li>
              <li>Real-time information from the web</li>
              <li>Enhanced search capabilities for your projects</li>
            </ul>
          </CardFooter>
        </Card>
      </motion.div>
    </div>
  );
}
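The new page above validates its two fields with a Zod schema (`name` at least 3 characters, `api_key` at least 10). The same rules can be sketched without the zod dependency; `validateLinkupForm` below is a hypothetical helper that mirrors what `linkupApiFormSchema.safeParse` would enforce, not code from the commit.

```typescript
// Dependency-free sketch of the validation rules in linkupApiFormSchema:
// name must be >= 3 characters, api_key must be >= 10 characters.
type LinkupApiFormValues = { name: string; api_key: string };

function validateLinkupForm(values: LinkupApiFormValues): string[] {
  const errors: string[] = [];
  if (values.name.length < 3) {
    errors.push("Connector name must be at least 3 characters.");
  }
  if (values.api_key.length < 10) {
    errors.push("API key is required and must be valid.");
  }
  return errors;
}

// The page's default name passes; an empty api_key (the default) does not,
// so the form cannot submit until a plausible key is entered.
console.log(validateLinkupForm({ name: "Linkup API Connector", api_key: "" }).length); // 1
console.log(validateLinkupForm({ name: "Linkup API Connector", api_key: "linkup_0123456789" }).length); // 0
```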
@@ -16,6 +16,7 @@ import {
   IconWorldWww,
   IconTicket,
   IconLayoutKanban,
+  IconLinkPlus,
 } from "@tabler/icons-react";
 import { AnimatePresence, motion } from "framer-motion";
 import Link from "next/link";
@@ -50,7 +51,13 @@ const connectorCategories: ConnectorCategory[] = [
       icon: <IconWorldWww className="h-6 w-6" />,
       status: "available",
     },
-    // Add other search engine connectors like Tavily, Serper if they have UI config
+    {
+      id: "linkup-api",
+      title: "Linkup API",
+      description: "Search the web using the Linkup API",
+      icon: <IconLinkPlus className="h-6 w-6" />,
+      status: "available",
+    },
   ],
 },
 {
@@ -42,34 +42,95 @@ export default function FileUploader() {
   const router = useRouter();
   const fileInputRef = useRef<HTMLInputElement>(null);
 
-  const acceptedFileTypes = {
-    'image/bmp': ['.bmp'],
-    'text/csv': ['.csv'],
-    'application/msword': ['.doc'],
-    'application/vnd.openxmlformats-officedocument.wordprocessingml.document': ['.docx'],
-    'message/rfc822': ['.eml'],
-    'application/epub+zip': ['.epub'],
-    'image/heic': ['.heic'],
-    'text/html': ['.html'],
-    'image/jpeg': ['.jpeg', '.jpg'],
-    'image/png': ['.png'],
-    'text/markdown': ['.md'],
-    'application/vnd.ms-outlook': ['.msg'],
-    'application/vnd.oasis.opendocument.text': ['.odt'],
-    'text/x-org': ['.org'],
-    'application/pkcs7-signature': ['.p7s'],
-    'application/pdf': ['.pdf'],
-    'application/vnd.ms-powerpoint': ['.ppt'],
-    'application/vnd.openxmlformats-officedocument.presentationml.presentation': ['.pptx'],
-    'text/x-rst': ['.rst'],
-    'application/rtf': ['.rtf'],
-    'image/tiff': ['.tiff'],
-    'text/plain': ['.txt'],
-    'text/tab-separated-values': ['.tsv'],
-    'application/vnd.ms-excel': ['.xls'],
-    'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': ['.xlsx'],
-    'application/xml': ['.xml'],
-  }
+  // Audio files are always supported (using whisper)
+  const audioFileTypes = {
+    'audio/mpeg': ['.mp3', '.mpeg', '.mpga'],
+    'audio/mp4': ['.mp4', '.m4a'],
+    'audio/wav': ['.wav'],
+    'audio/webm': ['.webm'],
+  };
+
+  // Conditionally set accepted file types based on ETL service
+  const acceptedFileTypes = process.env.NEXT_PUBLIC_ETL_SERVICE === 'LLAMACLOUD'
+    ? {
+      // LlamaCloud supported file types
+      'application/pdf': ['.pdf'],
+      'application/msword': ['.doc'],
+      'application/vnd.openxmlformats-officedocument.wordprocessingml.document': ['.docx'],
+      'application/vnd.ms-word.document.macroEnabled.12': ['.docm'],
+      'application/msword-template': ['.dot'],
+      'application/vnd.ms-word.template.macroEnabled.12': ['.dotm'],
+      'application/vnd.ms-powerpoint': ['.ppt'],
+      'application/vnd.ms-powerpoint.template.macroEnabled.12': ['.pptm'],
+      'application/vnd.openxmlformats-officedocument.presentationml.presentation': ['.pptx'],
+      'application/vnd.ms-powerpoint.template': ['.pot'],
+      'application/vnd.openxmlformats-officedocument.presentationml.template': ['.potx'],
+      'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': ['.xlsx'],
+      'application/vnd.ms-excel': ['.xls'],
+      'application/vnd.ms-excel.sheet.macroEnabled.12': ['.xlsm'],
+      'application/vnd.ms-excel.sheet.binary.macroEnabled.12': ['.xlsb'],
+      'application/vnd.ms-excel.workspace': ['.xlw'],
+      'application/rtf': ['.rtf'],
+      'application/xml': ['.xml'],
+      'application/epub+zip': ['.epub'],
+      'application/vnd.apple.keynote': ['.key'],
+      'application/vnd.apple.pages': ['.pages'],
+      'application/vnd.apple.numbers': ['.numbers'],
+      'application/vnd.wordperfect': ['.wpd'],
+      'application/vnd.oasis.opendocument.text': ['.odt'],
+      'application/vnd.oasis.opendocument.presentation': ['.odp'],
+      'application/vnd.oasis.opendocument.graphics': ['.odg'],
+      'application/vnd.oasis.opendocument.spreadsheet': ['.ods'],
+      'application/vnd.oasis.opendocument.formula': ['.fods'],
+      'text/plain': ['.txt'],
+      'text/csv': ['.csv'],
+      'text/tab-separated-values': ['.tsv'],
+      'text/html': ['.html', '.htm', '.web'],
+      'image/jpeg': ['.jpg', '.jpeg'],
+      'image/png': ['.png'],
+      'image/gif': ['.gif'],
+      'image/bmp': ['.bmp'],
+      'image/svg+xml': ['.svg'],
+      'image/tiff': ['.tiff'],
+      'image/webp': ['.webp'],
+      'application/dbase': ['.dbf'],
+      'application/vnd.lotus-1-2-3': ['.123'],
+      'text/x-web-markdown': ['.602', '.abw', '.cgm', '.cwk', '.hwp', '.lwp', '.mw', '.mcw', '.pbd', '.sda', '.sdd', '.sdp', '.sdw', '.sgl', '.sti', '.sxi', '.sxw', '.stw', '.sxg', '.uof', '.uop', '.uot', '.vor', '.wps', '.zabw'],
+      'text/x-spreadsheet': ['.dif', '.sylk', '.slk', '.prn', '.et', '.uos1', '.uos2', '.wk1', '.wk2', '.wk3', '.wk4', '.wks', '.wq1', '.wq2', '.wb1', '.wb2', '.wb3', '.qpw', '.xlr', '.eth'],
+      // Audio files (always supported)
+      ...audioFileTypes,
+    }
+    : {
+      // Unstructured supported file types
+      'image/bmp': ['.bmp'],
+      'text/csv': ['.csv'],
+      'application/msword': ['.doc'],
+      'application/vnd.openxmlformats-officedocument.wordprocessingml.document': ['.docx'],
+      'message/rfc822': ['.eml'],
+      'application/epub+zip': ['.epub'],
+      'image/heic': ['.heic'],
+      'text/html': ['.html'],
+      'image/jpeg': ['.jpeg', '.jpg'],
+      'image/png': ['.png'],
+      'text/markdown': ['.md', '.markdown'],
+      'application/vnd.ms-outlook': ['.msg'],
+      'application/vnd.oasis.opendocument.text': ['.odt'],
+      'text/x-org': ['.org'],
+      'application/pkcs7-signature': ['.p7s'],
+      'application/pdf': ['.pdf'],
+      'application/vnd.ms-powerpoint': ['.ppt'],
+      'application/vnd.openxmlformats-officedocument.presentationml.presentation': ['.pptx'],
+      'text/x-rst': ['.rst'],
+      'application/rtf': ['.rtf'],
|
'image/tiff': ['.tiff'],
|
||||||
|
'text/plain': ['.txt'],
|
||||||
|
'text/tab-separated-values': ['.tsv'],
|
||||||
|
'application/vnd.ms-excel': ['.xls'],
|
||||||
|
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': ['.xlsx'],
|
||||||
|
'application/xml': ['.xml'],
|
||||||
|
// Audio files (always supported)
|
||||||
|
...audioFileTypes,
|
||||||
|
};
|
||||||
|
|
||||||
const supportedExtensions = Array.from(new Set(Object.values(acceptedFileTypes).flat())).sort()
|
const supportedExtensions = Array.from(new Set(Object.values(acceptedFileTypes).flat())).sort()
|
||||||
|
|
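The `supportedExtensions` line above flattens the MIME-to-extension map into a deduplicated, sorted list of extensions. A minimal standalone sketch of that derivation, using a hypothetical three-entry map (the real map is much larger and also spreads in `audioFileTypes`):

```typescript
// Hypothetical subset of the MIME -> extensions map used above
const acceptedFileTypes: Record<string, string[]> = {
  'application/pdf': ['.pdf'],
  'image/jpeg': ['.jpg', '.jpeg'],
  'text/plain': ['.txt'],
};

// Flatten all extension arrays, dedupe via Set, then sort lexicographically
const supportedExtensions: string[] = Array.from(
  new Set(Object.values(acceptedFileTypes).flat())
).sort();

console.log(supportedExtensions); // → ['.jpeg', '.jpg', '.pdf', '.txt']
```

The `Set` pass matters because several MIME types can map to overlapping extension lists; without it the UI would show duplicates.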
||||||
@@ -73,6 +73,13 @@ export default function DashboardLayout({
          },
        ],
      },
      {
        title: "Podcasts",
        url: `/dashboard/${search_space_id}/podcasts`,
        icon: "Podcast",
        items: [
        ],
      }
      // TODO: Add research synthesizer's
      // {
      //   title: "Research Synthesizer's",
|
|
@@ -0,0 +1,20 @@
import { Suspense } from 'react';
import PodcastsPageClient from './podcasts-client';

interface PageProps {
  params: {
    search_space_id: string;
  };
}

export default async function PodcastsPage({ params }: PageProps) {
  const { search_space_id: searchSpaceId } = await Promise.resolve(params);

  return (
    <Suspense fallback={<div className="flex items-center justify-center h-[60vh]">
      <div className="h-8 w-8 animate-spin rounded-full border-4 border-primary border-t-transparent"></div>
    </div>}>
      <PodcastsPageClient searchSpaceId={searchSpaceId} />
    </Suspense>
  );
}
|
@@ -0,0 +1,968 @@
'use client';

import { format } from 'date-fns';
import { AnimatePresence, motion } from 'framer-motion';
import {
  Calendar,
  MoreHorizontal,
  Pause,
  Play,
  Podcast,
  Search,
  SkipBack,
  SkipForward,
  Trash2,
  Volume2, VolumeX
} from 'lucide-react';
import { useEffect, useRef, useState } from 'react';

// UI Components
import { Button } from '@/components/ui/button';
import { Card } from '@/components/ui/card';
import {
  Dialog,
  DialogContent,
  DialogDescription,
  DialogFooter,
  DialogHeader,
  DialogTitle,
} from "@/components/ui/dialog";
import {
  DropdownMenu,
  DropdownMenuContent,
  DropdownMenuItem,
  DropdownMenuTrigger
} from '@/components/ui/dropdown-menu';
import { Input } from '@/components/ui/input';
import {
  Select,
  SelectContent,
  SelectGroup,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from "@/components/ui/select";
import { Slider } from '@/components/ui/slider';
import { toast } from "sonner";

interface PodcastItem {
  id: number;
  title: string;
  created_at: string;
  file_location: string;
  podcast_transcript: any[];
  search_space_id: number;
}

interface PodcastsPageClientProps {
  searchSpaceId: string;
}

const pageVariants = {
  initial: { opacity: 0 },
  enter: { opacity: 1, transition: { duration: 0.4, ease: 'easeInOut', staggerChildren: 0.1 } },
  exit: { opacity: 0, transition: { duration: 0.3, ease: 'easeInOut' } }
};

const podcastCardVariants = {
  initial: { scale: 0.95, y: 20, opacity: 0 },
  animate: { scale: 1, y: 0, opacity: 1, transition: { type: "spring", stiffness: 300, damping: 25 } },
  exit: { scale: 0.95, y: -20, opacity: 0 },
  hover: { y: -5, scale: 1.02, transition: { duration: 0.2 } }
};

const MotionCard = motion(Card);

export default function PodcastsPageClient({ searchSpaceId }: PodcastsPageClientProps) {
  const [podcasts, setPodcasts] = useState<PodcastItem[]>([]);
  const [filteredPodcasts, setFilteredPodcasts] = useState<PodcastItem[]>([]);
  const [isLoading, setIsLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);
  const [searchQuery, setSearchQuery] = useState('');
  const [sortOrder, setSortOrder] = useState<string>('newest');
  const [deleteDialogOpen, setDeleteDialogOpen] = useState(false);
  const [podcastToDelete, setPodcastToDelete] = useState<{ id: number, title: string } | null>(null);
  const [isDeleting, setIsDeleting] = useState(false);

  // Audio player state
  const [currentPodcast, setCurrentPodcast] = useState<PodcastItem | null>(null);
  const [audioSrc, setAudioSrc] = useState<string | undefined>(undefined);
  const [isAudioLoading, setIsAudioLoading] = useState(false);
  const [isPlaying, setIsPlaying] = useState(false);
  const [currentTime, setCurrentTime] = useState(0);
  const [duration, setDuration] = useState(0);
  const [volume, setVolume] = useState(0.7);
  const [isMuted, setIsMuted] = useState(false);
  const audioRef = useRef<HTMLAudioElement | null>(null);
  const currentObjectUrlRef = useRef<string | null>(null);

  // Add podcast image URL constant
  const PODCAST_IMAGE_URL = "https://static.vecteezy.com/system/resources/thumbnails/002/157/611/small_2x/illustrations-concept-design-podcast-channel-free-vector.jpg";

  // Fetch podcasts from API
  useEffect(() => {
    const fetchPodcasts = async () => {
      try {
        setIsLoading(true);

        // Get token from localStorage
        const token = localStorage.getItem('surfsense_bearer_token');

        if (!token) {
          setError('Authentication token not found. Please log in again.');
          setIsLoading(false);
          return;
        }

        // Fetch all podcasts for this search space
        const response = await fetch(
          `${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/api/v1/podcasts/`,
          {
            headers: {
              'Authorization': `Bearer ${token}`,
              'Content-Type': 'application/json',
            },
            cache: 'no-store',
          }
        );

        if (!response.ok) {
          const errorData = await response.json().catch(() => null);
          throw new Error(`Failed to fetch podcasts: ${response.status} ${errorData?.detail || ''}`);
        }

        const data: PodcastItem[] = await response.json();
        setPodcasts(data);
        setFilteredPodcasts(data);
        setError(null);
      } catch (error) {
        console.error('Error fetching podcasts:', error);
        setError(error instanceof Error ? error.message : 'Unknown error occurred');
        setPodcasts([]);
        setFilteredPodcasts([]);
      } finally {
        setIsLoading(false);
      }
    };

    fetchPodcasts();
  }, [searchSpaceId]);

  // Filter and sort podcasts based on search query and sort order
  useEffect(() => {
    let result = [...podcasts];

    // Filter by search term
    if (searchQuery) {
      const query = searchQuery.toLowerCase();
      result = result.filter(podcast =>
        podcast.title.toLowerCase().includes(query)
      );
    }

    // Filter by search space
    result = result.filter(podcast =>
      podcast.search_space_id === parseInt(searchSpaceId)
    );

    // Sort podcasts
    result.sort((a, b) => {
      const dateA = new Date(a.created_at).getTime();
      const dateB = new Date(b.created_at).getTime();

      return sortOrder === 'newest' ? dateB - dateA : dateA - dateB;
    });

    setFilteredPodcasts(result);
  }, [podcasts, searchQuery, sortOrder, searchSpaceId]);
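The sort step in the effect above is a plain timestamp comparison. Pulled out as a pure helper (the `sortByDate` name is illustrative, not part of the component), its behavior is easy to verify:

```typescript
interface Dated { created_at: string }

// 'newest' puts larger timestamps first; 'oldest' reverses the comparison
function sortByDate<T extends Dated>(items: T[], order: 'newest' | 'oldest'): T[] {
  return [...items].sort((a, b) => {
    const dateA = new Date(a.created_at).getTime();
    const dateB = new Date(b.created_at).getTime();
    return order === 'newest' ? dateB - dateA : dateA - dateB;
  });
}

const sample = [
  { created_at: '2024-01-01T00:00:00Z' },
  { created_at: '2024-06-01T00:00:00Z' },
];
console.log(sortByDate(sample, 'newest')[0].created_at); // → '2024-06-01T00:00:00Z'
```

Copying the array before sorting (`[...items]`) keeps the original state value untouched, which matters because React state must not be mutated in place.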
  // Cleanup object URL on unmount or when currentPodcast changes
  useEffect(() => {
    return () => {
      if (currentObjectUrlRef.current) {
        URL.revokeObjectURL(currentObjectUrlRef.current);
        currentObjectUrlRef.current = null;
      }
    };
  }, []);

  // Audio player time update handler
  const handleTimeUpdate = () => {
    if (audioRef.current) {
      setCurrentTime(audioRef.current.currentTime);
    }
  };

  // Audio player metadata loaded handler
  const handleMetadataLoaded = () => {
    if (audioRef.current) {
      setDuration(audioRef.current.duration);
    }
  };

  // Play/pause toggle
  const togglePlayPause = () => {
    if (audioRef.current) {
      if (isPlaying) {
        audioRef.current.pause();
      } else {
        audioRef.current.play();
      }
      setIsPlaying(!isPlaying);
    }
  };

  // Seek to position
  const handleSeek = (value: number[]) => {
    if (audioRef.current) {
      audioRef.current.currentTime = value[0];
      setCurrentTime(value[0]);
    }
  };

  // Volume change
  const handleVolumeChange = (value: number[]) => {
    if (audioRef.current) {
      const newVolume = value[0];

      // Set volume
      audioRef.current.volume = newVolume;
      setVolume(newVolume);

      // Handle mute state based on volume
      if (newVolume === 0) {
        audioRef.current.muted = true;
        setIsMuted(true);
      } else {
        audioRef.current.muted = false;
        setIsMuted(false);
      }
    }
  };

  // Toggle mute
  const toggleMute = () => {
    if (audioRef.current) {
      const newMutedState = !isMuted;
      audioRef.current.muted = newMutedState;
      setIsMuted(newMutedState);

      // If unmuting, restore previous volume if it was 0
      if (!newMutedState && volume === 0) {
        const restoredVolume = 0.5;
        audioRef.current.volume = restoredVolume;
        setVolume(restoredVolume);
      }
    }
  };

  // Skip forward 10 seconds
  const skipForward = () => {
    if (audioRef.current) {
      audioRef.current.currentTime = Math.min(audioRef.current.duration, audioRef.current.currentTime + 10);
    }
  };

  // Skip backward 10 seconds
  const skipBackward = () => {
    if (audioRef.current) {
      audioRef.current.currentTime = Math.max(0, audioRef.current.currentTime - 10);
    }
  };
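`skipForward` and `skipBackward` above both clamp the new position into the playable range `[0, duration]`. The shared arithmetic, as a standalone sketch (the `clampSeek` helper name is illustrative, not from the component):

```typescript
// Clamp currentTime + delta into the playable range [0, duration]
const clampSeek = (currentTime: number, duration: number, delta: number): number =>
  Math.max(0, Math.min(duration, currentTime + delta));

console.log(clampSeek(5, 300, -10));  // → 0   (cannot rewind past the start)
console.log(clampSeek(295, 300, 10)); // → 300 (cannot skip past the end)
console.log(clampSeek(100, 300, 10)); // → 110 (normal case)
```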
  // Format time in MM:SS
  const formatTime = (time: number) => {
    const minutes = Math.floor(time / 60);
    const seconds = Math.floor(time % 60);
    return `${minutes}:${seconds < 10 ? '0' : ''}${seconds}`;
  };
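`formatTime` is a pure helper, so its zero-padding behavior can be checked in isolation (copied verbatim from the component above):

```typescript
const formatTime = (time: number): string => {
  const minutes = Math.floor(time / 60);
  const seconds = Math.floor(time % 60);
  return `${minutes}:${seconds < 10 ? '0' : ''}${seconds}`;
};

console.log(formatTime(65));  // → '1:05'
console.log(formatTime(600)); // → '10:00'
```

Note that minutes are not zero-padded, so a 9-minute mark renders as `9:00`, matching the compact style used in the player UI.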
|
||||||
|
|
||||||
|
// Play podcast - Fetch blob and set object URL
|
||||||
|
const playPodcast = async (podcast: PodcastItem) => {
|
||||||
|
// If the same podcast is selected, just toggle play/pause
|
||||||
|
if (currentPodcast && currentPodcast.id === podcast.id) {
|
||||||
|
togglePlayPause();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Prevent multiple simultaneous loading requests
|
||||||
|
if (isAudioLoading) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
try {
|
||||||
|
// Reset player state and show loading
|
||||||
|
setCurrentPodcast(podcast);
|
||||||
|
setAudioSrc(undefined);
|
||||||
|
setCurrentTime(0);
|
||||||
|
setDuration(0);
|
||||||
|
setIsPlaying(false);
|
||||||
|
setIsAudioLoading(true);
|
||||||
|
|
||||||
|
const token = localStorage.getItem('surfsense_bearer_token');
|
||||||
|
if (!token) {
|
||||||
|
throw new Error('Authentication token not found.');
|
||||||
|
}
|
||||||
|
|
||||||
|
// Revoke previous object URL if exists (only after we've started the new request)
|
||||||
|
if (currentObjectUrlRef.current) {
|
||||||
|
URL.revokeObjectURL(currentObjectUrlRef.current);
|
||||||
|
currentObjectUrlRef.current = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Use AbortController to handle timeout or cancellation
|
||||||
|
const controller = new AbortController();
|
||||||
|
const timeoutId = setTimeout(() => controller.abort(), 30000); // 30 second timeout
|
||||||
|
|
||||||
|
try {
|
||||||
|
const response = await fetch(
|
||||||
|
`${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/api/v1/podcasts/${podcast.id}/stream`,
|
||||||
|
{
|
||||||
|
headers: {
|
||||||
|
'Authorization': `Bearer ${token}`,
|
||||||
|
},
|
||||||
|
signal: controller.signal
|
||||||
|
}
|
||||||
|
);
|
||||||
|
|
||||||
|
if (!response.ok) {
|
||||||
|
throw new Error(`Failed to fetch audio stream: ${response.statusText}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
const blob = await response.blob();
|
||||||
|
const objectUrl = URL.createObjectURL(blob);
|
||||||
|
currentObjectUrlRef.current = objectUrl;
|
||||||
|
|
||||||
|
// Set audio source
|
||||||
|
setAudioSrc(objectUrl);
|
||||||
|
|
||||||
|
// Wait for the audio to be ready before playing
|
||||||
|
// We'll handle actual playback in the onLoadedData event instead of here
|
||||||
|
} catch (error) {
|
||||||
|
if (error instanceof DOMException && error.name === 'AbortError') {
|
||||||
|
throw new Error('Request timed out. Please try again.');
|
||||||
|
}
|
||||||
|
throw error;
|
||||||
|
} finally {
|
||||||
|
clearTimeout(timeoutId);
|
||||||
|
}
|
||||||
|
} catch (error) {
|
||||||
|
console.error('Error fetching or playing podcast:', error);
|
||||||
|
toast.error(error instanceof Error ? error.message : 'Failed to load podcast audio.');
|
||||||
|
// Reset state on error
|
||||||
|
setCurrentPodcast(null);
|
||||||
|
setAudioSrc(undefined);
|
||||||
|
} finally {
|
||||||
|
setIsAudioLoading(false);
|
||||||
|
}
|
||||||
|
};
  // Function to handle podcast deletion
  const handleDeletePodcast = async () => {
    if (!podcastToDelete) return;

    setIsDeleting(true);
    try {
      const token = localStorage.getItem('surfsense_bearer_token');
      if (!token) {
        setIsDeleting(false);
        return;
      }

      const response = await fetch(`${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/api/v1/podcasts/${podcastToDelete.id}`, {
        method: 'DELETE',
        headers: {
          'Authorization': `Bearer ${token}`,
          'Content-Type': 'application/json',
        }
      });

      if (!response.ok) {
        throw new Error(`Failed to delete podcast: ${response.statusText}`);
      }

      // Close dialog and refresh podcasts
      setDeleteDialogOpen(false);
      setPodcastToDelete(null);

      // Update local state by removing the deleted podcast
      setPodcasts(prevPodcasts => prevPodcasts.filter(podcast => podcast.id !== podcastToDelete.id));

      // If the current playing podcast is deleted, stop playback
      if (currentPodcast && currentPodcast.id === podcastToDelete.id) {
        if (audioRef.current) {
          audioRef.current.pause();
        }
        setCurrentPodcast(null);
        setIsPlaying(false);
      }

      toast.success('Podcast deleted successfully');
    } catch (error) {
      console.error('Error deleting podcast:', error);
      toast.error(error instanceof Error ? error.message : 'Failed to delete podcast');
    } finally {
      setIsDeleting(false);
    }
  };
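The optimistic state update inside `handleDeletePodcast` is a filter on `id`. As a generic sketch (the `removeById` helper is hypothetical, not in the source):

```typescript
// Drop the item whose id matches; all other items are kept in order
function removeById<T extends { id: number }>(items: T[], id: number): T[] {
  return items.filter(item => item.id !== id);
}

const list = [{ id: 1 }, { id: 2 }, { id: 3 }];
console.log(removeById(list, 2)); // → [{ id: 1 }, { id: 3 }]
```

Because `filter` returns a new array, the functional updater form `setPodcasts(prev => ...)` stays referentially safe even if several updates race.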
|
||||||
|
|
||||||
|
return (
|
||||||
|
<motion.div
|
||||||
|
className="container p-6 mx-auto"
|
||||||
|
initial="initial"
|
||||||
|
animate="enter"
|
||||||
|
exit="exit"
|
||||||
|
variants={pageVariants}
|
||||||
|
>
|
||||||
|
<div className="flex flex-col space-y-4 md:space-y-6">
|
||||||
|
<div className="flex flex-col space-y-2">
|
||||||
|
<h1 className="text-3xl font-bold tracking-tight">Podcasts</h1>
|
||||||
|
<p className="text-muted-foreground">Listen to generated podcasts.</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* Filter and Search Bar */}
|
||||||
|
<div className="flex flex-col space-y-4 md:flex-row md:items-center md:justify-between md:space-y-0">
|
||||||
|
<div className="flex flex-1 items-center gap-2">
|
||||||
|
<div className="relative w-full md:w-80">
|
||||||
|
<Search className="absolute left-2.5 top-2.5 h-4 w-4 text-muted-foreground" />
|
||||||
|
<Input
|
||||||
|
type="text"
|
||||||
|
placeholder="Search podcasts..."
|
||||||
|
className="pl-8"
|
||||||
|
value={searchQuery}
|
||||||
|
onChange={(e) => setSearchQuery(e.target.value)}
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div>
|
||||||
|
<Select value={sortOrder} onValueChange={setSortOrder}>
|
||||||
|
<SelectTrigger className="w-40">
|
||||||
|
<SelectValue placeholder="Sort order" />
|
||||||
|
</SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
<SelectGroup>
|
||||||
|
<SelectItem value="newest">Newest First</SelectItem>
|
||||||
|
<SelectItem value="oldest">Oldest First</SelectItem>
|
||||||
|
</SelectGroup>
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* Status Messages */}
|
||||||
|
{isLoading && (
|
||||||
|
<div className="flex items-center justify-center h-40">
|
||||||
|
<div className="flex flex-col items-center gap-2">
|
||||||
|
<div className="h-8 w-8 animate-spin rounded-full border-4 border-primary border-t-transparent"></div>
|
||||||
|
<p className="text-sm text-muted-foreground">Loading podcasts...</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{error && !isLoading && (
|
||||||
|
<div className="border border-destructive/50 text-destructive p-4 rounded-md">
|
||||||
|
<h3 className="font-medium">Error loading podcasts</h3>
|
||||||
|
<p className="text-sm">{error}</p>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{!isLoading && !error && filteredPodcasts.length === 0 && (
|
||||||
|
<div className="flex flex-col items-center justify-center h-40 gap-2 text-center">
|
||||||
|
<Podcast className="h-8 w-8 text-muted-foreground" />
|
||||||
|
<h3 className="font-medium">No podcasts found</h3>
|
||||||
|
<p className="text-sm text-muted-foreground">
|
||||||
|
{searchQuery
|
||||||
|
? 'Try adjusting your search filters'
|
||||||
|
: 'Generate podcasts from your chats to get started'}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Podcast Grid */}
|
||||||
|
{!isLoading && !error && filteredPodcasts.length > 0 && (
|
||||||
|
<AnimatePresence mode="wait">
|
||||||
|
<motion.div
|
||||||
|
className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6"
|
||||||
|
variants={pageVariants}
|
||||||
|
initial="initial"
|
||||||
|
animate="enter"
|
||||||
|
exit="exit"
|
||||||
|
>
|
||||||
|
{filteredPodcasts.map((podcast, index) => (
|
||||||
|
<MotionCard
|
||||||
|
key={podcast.id}
|
||||||
|
variants={podcastCardVariants}
|
||||||
|
initial="initial"
|
||||||
|
animate="animate"
|
||||||
|
exit="exit"
|
||||||
|
whileHover="hover"
|
||||||
|
transition={{ duration: 0.2, delay: index * 0.05 }}
|
||||||
|
className={`
|
||||||
|
bg-card/60 dark:bg-card/40 backdrop-blur-lg rounded-xl p-4
|
||||||
|
shadow-md hover:shadow-xl transition-all duration-300
|
||||||
|
border-border overflow-hidden cursor-pointer
|
||||||
|
${currentPodcast?.id === podcast.id ? 'ring-2 ring-primary ring-offset-2 ring-offset-background' : ''}
|
||||||
|
`}
|
||||||
|
layout
|
||||||
|
onClick={() => playPodcast(podcast)}
|
||||||
|
>
|
||||||
|
<div
|
||||||
|
className="relative w-full aspect-[16/10] mb-4 rounded-lg overflow-hidden"
|
||||||
|
>
|
||||||
|
{/* Podcast image with gradient overlay */}
|
||||||
|
<img
|
||||||
|
src={PODCAST_IMAGE_URL}
|
||||||
|
alt="Podcast illustration"
|
||||||
|
className="w-full h-full object-cover transition-transform duration-500 group-hover:scale-105 brightness-[0.85] contrast-[1.1]"
|
||||||
|
loading="lazy"
|
||||||
|
/>
|
||||||
|
|
||||||
|
{/* Better overlay with gradient for improved text legibility */}
|
||||||
|
<div className="absolute inset-0 bg-gradient-to-t from-black/60 to-black/10 transition-opacity duration-300"></div>
|
||||||
|
|
||||||
|
{/* Loading indicator with improved animation */}
|
||||||
|
{currentPodcast?.id === podcast.id && isAudioLoading && (
|
||||||
|
<motion.div
|
||||||
|
className="absolute inset-0 flex items-center justify-center bg-background/60 backdrop-blur-md z-10"
|
||||||
|
initial={{ opacity: 0 }}
|
||||||
|
animate={{ opacity: 1 }}
|
||||||
|
exit={{ opacity: 0 }}
|
||||||
|
transition={{ duration: 0.2 }}
|
||||||
|
>
|
||||||
|
<motion.div
|
||||||
|
className="flex flex-col items-center gap-3"
|
||||||
|
initial={{ scale: 0.9 }}
|
||||||
|
animate={{ scale: 1 }}
|
||||||
|
transition={{ type: "spring", damping: 20 }}
|
||||||
|
>
|
||||||
|
<div className="h-14 w-14 rounded-full border-4 border-primary/30 border-t-primary animate-spin"></div>
|
||||||
|
<p className="text-sm text-foreground font-medium">Loading podcast...</p>
|
||||||
|
</motion.div>
|
||||||
|
</motion.div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Play button with animations */}
|
||||||
|
{!(currentPodcast?.id === podcast.id && (isPlaying || isAudioLoading)) && (
|
||||||
|
<motion.div
|
||||||
|
className="absolute top-1/2 left-1/2 -translate-x-1/2 -translate-y-1/2 z-10"
|
||||||
|
whileHover={{ scale: 1.1 }}
|
||||||
|
whileTap={{ scale: 0.9 }}
|
||||||
|
>
|
||||||
|
<Button
|
||||||
|
variant="secondary"
|
||||||
|
size="icon"
|
||||||
|
className="h-16 w-16 rounded-full
|
||||||
|
bg-background/80 hover:bg-background/95 backdrop-blur-md
|
||||||
|
transition-all duration-200 shadow-xl border-0
|
||||||
|
flex items-center justify-center"
|
||||||
|
onClick={(e) => {
|
||||||
|
e.stopPropagation();
|
||||||
|
playPodcast(podcast);
|
||||||
|
}}
|
||||||
|
disabled={isAudioLoading}
|
||||||
|
>
|
||||||
|
<motion.div
|
||||||
|
initial={{ scale: 0.8 }}
|
||||||
|
animate={{ scale: 1 }}
|
||||||
|
transition={{ type: "spring", stiffness: 400, damping: 10 }}
|
||||||
|
className="text-primary w-10 h-10 flex items-center justify-center"
|
||||||
|
>
|
||||||
|
<Play className="h-8 w-8 ml-1" />
|
||||||
|
</motion.div>
|
||||||
|
</Button>
|
||||||
|
</motion.div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Pause button with animations */}
|
||||||
|
{currentPodcast?.id === podcast.id && isPlaying && !isAudioLoading && (
|
||||||
|
<motion.div
|
||||||
|
className="absolute top-1/2 left-1/2 -translate-x-1/2 -translate-y-1/2 z-10"
|
||||||
|
whileHover={{ scale: 1.1 }}
|
||||||
|
whileTap={{ scale: 0.9 }}
|
||||||
|
>
|
||||||
|
<Button
|
||||||
|
variant="secondary"
|
||||||
|
size="icon"
|
||||||
|
className="h-16 w-16 rounded-full
|
||||||
|
bg-background/80 hover:bg-background/95 backdrop-blur-md
|
||||||
|
transition-all duration-200 shadow-xl border-0
|
||||||
|
flex items-center justify-center"
|
||||||
|
onClick={(e) => {
|
||||||
|
e.stopPropagation();
|
||||||
|
togglePlayPause();
|
||||||
|
}}
|
||||||
|
disabled={isAudioLoading}
|
||||||
|
>
|
||||||
|
<motion.div
|
||||||
|
initial={{ scale: 0.8 }}
|
||||||
|
animate={{ scale: 1 }}
|
||||||
|
transition={{ type: "spring", stiffness: 400, damping: 10 }}
|
||||||
|
className="text-primary w-10 h-10 flex items-center justify-center"
|
||||||
|
>
|
||||||
|
<Pause className="h-8 w-8" />
|
||||||
|
</motion.div>
|
||||||
|
</Button>
|
||||||
|
</motion.div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Now playing indicator */}
|
||||||
|
{currentPodcast?.id === podcast.id && !isAudioLoading && (
|
||||||
|
<div className="absolute top-2 left-2 bg-primary text-primary-foreground text-xs px-2 py-1 rounded-full z-10 flex items-center gap-1.5">
|
||||||
|
<span className="relative flex h-2 w-2">
|
||||||
|
<span className="animate-ping absolute inline-flex h-full w-full rounded-full bg-primary-foreground opacity-75"></span>
|
||||||
|
<span className="relative inline-flex rounded-full h-2 w-2 bg-primary-foreground"></span>
|
||||||
|
</span>
|
||||||
|
Now Playing
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="mb-3 px-1">
|
||||||
|
<h3 className="text-base font-semibold text-foreground truncate" title={podcast.title}>
|
||||||
|
{podcast.title || 'Untitled Podcast'}
|
||||||
|
</h3>
|
||||||
|
<p className="text-xs text-muted-foreground mt-0.5 flex items-center gap-1.5">
|
||||||
|
<Calendar className="h-3 w-3" />
|
||||||
|
{format(new Date(podcast.created_at), 'MMM d, yyyy')}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{currentPodcast?.id === podcast.id && !isAudioLoading && (
|
||||||
|
<motion.div
|
||||||
|
className="mb-3 px-1"
|
||||||
|
initial={{ opacity: 0, y: 5 }}
|
||||||
|
animate={{ opacity: 1, y: 0 }}
|
||||||
|
transition={{ delay: 0.1 }}
|
||||||
|
>
|
||||||
|
<div
|
||||||
|
className="h-1.5 bg-muted rounded-full cursor-pointer group relative overflow-hidden"
|
||||||
|
onClick={(e) => {
|
||||||
|
e.stopPropagation();
|
||||||
|
if (!audioRef.current || !duration) return;
|
||||||
|
const container = e.currentTarget;
|
||||||
|
const rect = container.getBoundingClientRect();
|
||||||
|
const x = e.clientX - rect.left;
|
||||||
|
const percentage = Math.max(0, Math.min(1, x / rect.width));
|
||||||
|
const newTime = percentage * duration;
|
||||||
|
handleSeek([newTime]);
|
||||||
|
}}
|
||||||
|
>
|
||||||
|
<motion.div
|
||||||
|
className="h-full bg-primary rounded-full relative"
|
||||||
|
style={{ width: `${(currentTime / duration) * 100}%` }}
|
||||||
|
transition={{ ease: "linear" }}
|
||||||
|
>
|
||||||
|
<motion.div
|
||||||
|
className="absolute right-0 top-1/2 -translate-y-1/2 w-3 h-3
|
||||||
|
bg-primary rounded-full shadow-md transform scale-0
|
||||||
|
group-hover:scale-100 transition-transform"
|
||||||
|
whileHover={{ scale: 1.5 }}
|
||||||
|
/>
|
||||||
|
</motion.div>
|
||||||
|
</div>
|
||||||
|
<div className="flex justify-between mt-1.5 text-xs text-muted-foreground">
|
||||||
|
<span>{formatTime(currentTime)}</span>
|
||||||
|
<span>{formatTime(duration)}</span>
|
||||||
|
</div>
|
||||||
|
</motion.div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{currentPodcast?.id === podcast.id && !isAudioLoading && (
|
||||||
|
<motion.div
|
||||||
|
className="flex items-center justify-between px-2 mt-1"
|
||||||
|
initial={{ opacity: 0, y: 5 }}
|
||||||
|
animate={{ opacity: 1, y: 0 }}
|
||||||
|
transition={{ delay: 0.2 }}
|
||||||
|
>
|
||||||
|
<motion.div whileHover={{ scale: 1.2 }} whileTap={{ scale: 0.95 }}>
|
||||||
|
<Button
|
||||||
|
variant="ghost"
|
||||||
|
size="icon"
|
||||||
|
onClick={(e) => {
|
||||||
|
                          e.stopPropagation();
                          skipBackward();
                        }}
                        className="w-9 h-9 text-muted-foreground hover:text-primary transition-colors"
                        title="Rewind 10 seconds"
                        disabled={!duration}
                      >
                        <SkipBack className="w-5 h-5" />
                      </Button>
                    </motion.div>
                    <motion.div whileHover={{ scale: 1.2 }} whileTap={{ scale: 0.95 }}>
                      <Button
                        variant="ghost"
                        size="icon"
                        onClick={(e) => {
                          e.stopPropagation();
                          togglePlayPause();
                        }}
                        className="w-10 h-10 text-primary hover:bg-primary/10 rounded-full transition-colors"
                        disabled={!duration}
                      >
                        {isPlaying ?
                          <Pause className="w-6 h-6" /> :
                          <Play className="w-6 h-6 ml-0.5" />
                        }
                      </Button>
                    </motion.div>
                    <motion.div whileHover={{ scale: 1.2 }} whileTap={{ scale: 0.95 }}>
                      <Button
                        variant="ghost"
                        size="icon"
                        onClick={(e) => {
                          e.stopPropagation();
                          skipForward();
                        }}
                        className="w-9 h-9 text-muted-foreground hover:text-primary transition-colors"
                        title="Forward 10 seconds"
                        disabled={!duration}
                      >
                        <SkipForward className="w-5 h-5" />
                      </Button>
                    </motion.div>
                  </motion.div>
                )}

                <div className="absolute top-2 right-2 z-20">
                  <DropdownMenu>
                    <DropdownMenuTrigger asChild>
                      <Button
                        variant="ghost"
                        size="icon"
                        className="h-7 w-7 bg-background/50 hover:bg-background/80 rounded-full backdrop-blur-sm"
                        onClick={(e) => e.stopPropagation()}
                      >
                        <MoreHorizontal className="h-4 w-4" />
                        <span className="sr-only">Open menu</span>
                      </Button>
                    </DropdownMenuTrigger>
                    <DropdownMenuContent align="end">
                      <DropdownMenuItem
                        className="text-destructive focus:text-destructive"
                        onClick={(e) => {
                          e.stopPropagation();
                          setPodcastToDelete({ id: podcast.id, title: podcast.title });
                          setDeleteDialogOpen(true);
                        }}
                      >
                        <Trash2 className="mr-2 h-4 w-4" />
                        <span>Delete Podcast</span>
                      </DropdownMenuItem>
                    </DropdownMenuContent>
                  </DropdownMenu>
                </div>
              </MotionCard>
            ))}
          </motion.div>
        </AnimatePresence>
      )}

      {/* Current Podcast Player (Fixed at bottom) */}
      {currentPodcast && !isAudioLoading && audioSrc && (
        <motion.div
          initial={{ y: 100, opacity: 0 }}
          animate={{ y: 0, opacity: 1 }}
          exit={{ y: 100, opacity: 0 }}
          transition={{ type: "spring", stiffness: 300, damping: 30 }}
          className="fixed bottom-0 left-0 right-0 bg-background/95 backdrop-blur-sm border-t p-4 shadow-lg z-50"
        >
          <div className="container mx-auto">
            <div className="flex flex-col md:flex-row items-center gap-4">
              <div className="flex-shrink-0">
                <motion.div
                  className="w-12 h-12 bg-primary/20 rounded-md flex items-center justify-center"
                  animate={{ scale: isPlaying ? [1, 1.05, 1] : 1 }}
                  transition={{ repeat: isPlaying ? Infinity : 0, duration: 2 }}
                >
                  <Podcast className="h-6 w-6 text-primary" />
                </motion.div>
              </div>

              <div className="flex-grow min-w-0">
                <h4 className="font-medium text-sm line-clamp-1">{currentPodcast.title}</h4>

                <div className="flex items-center gap-2 mt-2">
                  <div className="flex-grow relative">
                    <Slider
                      value={[currentTime]}
                      min={0}
                      max={duration || 100}
                      step={0.1}
                      onValueChange={handleSeek}
                      className="relative z-10"
                    />
                    <motion.div
                      className="absolute left-0 top-1/2 h-2 bg-primary/25 rounded-full -translate-y-1/2"
                      style={{ width: `${(currentTime / (duration || 100)) * 100}%` }}
                      transition={{ ease: "linear" }}
                    />
                  </div>
                  <div className="flex-shrink-0 text-xs text-muted-foreground whitespace-nowrap">
                    {formatTime(currentTime)} / {formatTime(duration)}
                  </div>
                </div>
              </div>

              <div className="flex items-center gap-2">
                <motion.div whileHover={{ scale: 1.1 }} whileTap={{ scale: 0.95 }}>
                  <Button
                    variant="ghost"
                    size="icon"
                    onClick={skipBackward}
                    className="h-8 w-8"
                  >
                    <SkipBack className="h-4 w-4" />
                  </Button>
                </motion.div>

                <motion.div whileHover={{ scale: 1.1 }} whileTap={{ scale: 0.95 }}>
                  <Button
                    variant="default"
                    size="icon"
                    onClick={togglePlayPause}
                    className="h-10 w-10 rounded-full"
                  >
                    {isPlaying ? <Pause className="h-5 w-5" /> : <Play className="h-5 w-5 ml-0.5" />}
                  </Button>
                </motion.div>

                <motion.div whileHover={{ scale: 1.1 }} whileTap={{ scale: 0.95 }}>
                  <Button
                    variant="ghost"
                    size="icon"
                    onClick={skipForward}
                    className="h-8 w-8"
                  >
                    <SkipForward className="h-4 w-4" />
                  </Button>
                </motion.div>

                <div className="hidden md:flex items-center gap-2 ml-4 w-32">
                  <motion.div whileHover={{ scale: 1.1 }} whileTap={{ scale: 0.95 }}>
                    <Button
                      variant="ghost"
                      size="icon"
                      onClick={toggleMute}
                      className={`h-8 w-8 ${isMuted ? "text-muted-foreground" : "text-primary"}`}
                    >
                      {isMuted ? <VolumeX className="h-4 w-4" /> : <Volume2 className="h-4 w-4" />}
                    </Button>
                  </motion.div>

                  <div className="relative w-24">
                    <Slider
                      value={[isMuted ? 0 : volume]}
                      min={0}
                      max={1}
                      step={0.01}
                      onValueChange={handleVolumeChange}
                      className="w-24"
                      disabled={isMuted}
                    />
                    <motion.div
                      className={`absolute left-0 bottom-0 h-1 bg-primary/30 rounded-full ${isMuted ? "opacity-50" : ""}`}
                      initial={false}
                      animate={{ width: `${(isMuted ? 0 : volume) * 100}%` }}
                    />
                  </div>
                </div>
              </div>
            </div>
          </div>
        </motion.div>
      )}
    </div>
    {/* Delete Confirmation Dialog */}
    <Dialog open={deleteDialogOpen} onOpenChange={setDeleteDialogOpen}>
      <DialogContent className="sm:max-w-md">
        <DialogHeader>
          <DialogTitle className="flex items-center gap-2">
            <Trash2 className="h-5 w-5 text-destructive" />
            <span>Delete Podcast</span>
          </DialogTitle>
          <DialogDescription>
            Are you sure you want to delete <span className="font-medium">{podcastToDelete?.title}</span>? This action cannot be undone.
          </DialogDescription>
        </DialogHeader>
        <DialogFooter className="flex gap-2 sm:justify-end">
          <Button
            variant="outline"
            onClick={() => setDeleteDialogOpen(false)}
            disabled={isDeleting}
          >
            Cancel
          </Button>
          <Button
            variant="destructive"
            onClick={handleDeletePodcast}
            disabled={isDeleting}
            className="gap-2"
          >
            {isDeleting ? (
              <>
                <span className="h-4 w-4 animate-spin rounded-full border-2 border-current border-t-transparent" />
                Deleting...
              </>
            ) : (
              <>
                <Trash2 className="h-4 w-4" />
                Delete
              </>
            )}
          </Button>
        </DialogFooter>
      </DialogContent>
    </Dialog>

    {/* Hidden audio element for playback */}
    <audio
      ref={audioRef}
      src={audioSrc}
      preload="auto"
      onTimeUpdate={handleTimeUpdate}
      onLoadedMetadata={handleMetadataLoaded}
      onLoadedData={() => {
        // Only auto-play when audio is fully loaded
        if (audioRef.current && currentPodcast && audioSrc) {
          // Small delay to ensure browser is ready to play
          setTimeout(() => {
            if (audioRef.current) {
              audioRef.current.play()
                .then(() => {
                  setIsPlaying(true);
                })
                .catch(error => {
                  console.error('Error playing audio:', error);
                  // Don't show error if it's just the user navigating away
                  if (error.name !== 'AbortError') {
                    toast.error('Failed to play audio.');
                  }
                  setIsPlaying(false);
                });
            }
          }, 100);
        }
      }}
      onEnded={() => setIsPlaying(false)}
      onError={(e) => {
        console.error('Audio error:', e);
        if (audioRef.current?.error) {
          // Log the specific error code for debugging
          console.error('Audio error code:', audioRef.current.error.code);

          // Don't show error message for aborted loads
          if (audioRef.current.error.code !== audioRef.current.error.MEDIA_ERR_ABORTED) {
            toast.error('Error playing audio. Please try again.');
          }
        }
        // Reset playing state on error
        setIsPlaying(false);
      }}
    />
  </motion.div>
  );
}
@@ -13,7 +13,9 @@ import {
   ArrowDown,
   CircleUser,
   Database,
-  SendHorizontal
+  SendHorizontal,
+  FileText,
+  Grid3x3
 } from 'lucide-react';
 import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
 import { Button } from '@/components/ui/button';
@@ -46,7 +48,6 @@ import {
   researcherOptions
 } from '@/components/chat';
 import { MarkdownViewer } from '@/components/markdown-viewer';
-import { connectorSourcesMenu as defaultConnectorSourcesMenu } from '@/components/chat/connector-sources';
 import { Logo } from '@/components/Logo';
 import { useSearchSourceConnectors } from '@/hooks';
 
@@ -239,7 +240,6 @@ const SourcesDialogContent = ({
 
 const ChatPage = () => {
   const [token, setToken] = React.useState<string | null>(null);
-  const [activeTab, setActiveTab] = useState("");
   const [dialogOpenId, setDialogOpenId] = useState<number | null>(null);
   const [sourcesPage, setSourcesPage] = useState(1);
   const [expandedSources, setExpandedSources] = useState(false);
@@ -249,10 +249,10 @@ const ChatPage = () => {
   const tabsListRef = useRef<HTMLDivElement>(null);
   const [terminalExpanded, setTerminalExpanded] = useState(false);
   const [selectedConnectors, setSelectedConnectors] = useState<string[]>(["CRAWLED_URL"]);
+  const [searchMode, setSearchMode] = useState<'DOCUMENTS' | 'CHUNKS'>('DOCUMENTS');
   const [researchMode, setResearchMode] = useState<ResearchMode>("GENERAL");
   const [currentTime, setCurrentTime] = useState<string>('');
   const [currentDate, setCurrentDate] = useState<string>('');
-  const [connectorSources, setConnectorSources] = useState<any[]>([]);
   const terminalMessagesRef = useRef<HTMLDivElement>(null);
   const { connectorSourceItems, isLoading: isLoadingConnectors } = useSearchSourceConnectors();
 
@@ -364,7 +364,8 @@ const ChatPage = () => {
       data: {
         search_space_id: search_space_id,
         selected_connectors: selectedConnectors,
-        research_mode: researchMode
+        research_mode: researchMode,
+        search_mode: searchMode
       }
     },
     onError: (error) => {
@@ -476,43 +477,10 @@ const ChatPage = () => {
     updateChat();
   }, [messages, status, chat_id, researchMode, selectedConnectors, search_space_id]);
 
-  // Memoize connector sources to prevent excessive re-renders
-  const processedConnectorSources = React.useMemo(() => {
-    if (messages.length === 0) return connectorSources;
-
-    // Only process when we have a complete message (not streaming)
-    if (status !== 'ready') return connectorSources;
-
-    // Find the latest assistant message
-    const assistantMessages = messages.filter(msg => msg.role === 'assistant');
-    if (assistantMessages.length === 0) return connectorSources;
-
-    const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
-    if (!latestAssistantMessage?.annotations) return connectorSources;
-
-    // Find the latest SOURCES annotation
-    const annotations = latestAssistantMessage.annotations as any[];
-    const sourcesAnnotations = annotations.filter(a => a.type === 'SOURCES');
-
-    if (sourcesAnnotations.length === 0) return connectorSources;
-
-    const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
-    if (!latestSourcesAnnotation.content) return connectorSources;
-
-    // Use this content if it differs from current
-    return latestSourcesAnnotation.content;
-  }, [messages, status, connectorSources]);
-
-  // Update connector sources when processed value changes
-  useEffect(() => {
-    if (processedConnectorSources !== connectorSources) {
-      setConnectorSources(processedConnectorSources);
-    }
-  }, [processedConnectorSources, connectorSources]);
-
   // Check and scroll terminal when terminal info is available
   useEffect(() => {
-    if (messages.length === 0 || status !== 'ready') return;
+    // Modified to trigger during streaming as well (removed status check)
+    if (messages.length === 0) return;
 
     // Find the latest assistant message
     const assistantMessages = messages.filter(msg => msg.role === 'assistant');
@@ -526,10 +494,27 @@ const ChatPage = () => {
     const terminalInfoAnnotations = annotations.filter(a => a.type === 'TERMINAL_INFO');
 
     if (terminalInfoAnnotations.length > 0) {
-      // Schedule scrolling after the DOM has been updated
-      setTimeout(scrollTerminalToBottom, 100);
+      // Always scroll to bottom when terminal info is updated, even during streaming
+      scrollTerminalToBottom();
     }
-  }, [messages, status]);
+  }, [messages]); // Removed status from dependencies to ensure it triggers during streaming
 
+  // Pure function to get connector sources for a specific message
+  const getMessageConnectorSources = (message: any): any[] => {
+    if (!message || message.role !== 'assistant' || !message.annotations) return [];
+
+    // Find all SOURCES annotations
+    const annotations = message.annotations as any[];
+    const sourcesAnnotations = annotations.filter(a => a.type === 'SOURCES');
+
+    // Get the latest SOURCES annotation
+    if (sourcesAnnotations.length === 0) return [];
+    const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
+
+    if (!latestSourcesAnnotation.content) return [];
+
+    return latestSourcesAnnotation.content;
+  };
+
   // Custom handleSubmit function to include selected connectors and answer type
   const handleSubmit = (e: React.FormEvent) => {
@@ -561,17 +546,12 @@ const ChatPage = () => {
     scrollToBottom();
   }, [messages]);
 
-  // Set activeTab when connectorSources change using a memoized value
-  const activeTabValue = React.useMemo(() => {
-    return connectorSources.length > 0 ? connectorSources[0].type : "";
-  }, [connectorSources]);
-
-  // Update activeTab when the memoized value changes
+  // Reset sources page when new messages arrive
   useEffect(() => {
-    if (activeTabValue && activeTabValue !== activeTab) {
-      setActiveTab(activeTabValue);
-    }
-  }, [activeTabValue, activeTab]);
+    // Reset pagination when we get new messages
+    setSourcesPage(1);
+    setExpandedSources(false);
+  }, [messages]);
 
   // Scroll terminal to bottom when expanded
   useEffect(() => {
@@ -580,11 +560,6 @@ const ChatPage = () => {
     }
   }, [terminalExpanded]);
 
-  // Get total sources count for a connector type
-  const getSourcesCount = (connectorType: string) => {
-    return getSourcesCountUtil(connectorSources, connectorType);
-  };
-
   // Function to check scroll position and update indicators
   const updateScrollIndicators = () => {
     updateScrollIndicatorsUtil(tabsListRef as React.RefObject<HTMLDivElement>, setCanScrollLeft, setCanScrollRight);
@@ -610,23 +585,6 @@ const ChatPage = () => {
   // Use the scroll to bottom hook
   useScrollToBottom(messagesEndRef as React.RefObject<HTMLDivElement>, [messages]);
 
-  // Function to get sources for the main view
-  const getMainViewSources = (connector: any) => {
-    return getMainViewSourcesUtil(connector, INITIAL_SOURCES_DISPLAY);
-  };
-
-  // Function to get filtered sources for the dialog with null check
-  const getFilteredSourcesWithCheck = (connector: any, sourceFilter: string) => {
-    if (!connector?.sources) return [];
-    return getFilteredSourcesUtil(connector, sourceFilter);
-  };
-
-  // Function to get paginated dialog sources with null check
-  const getPaginatedDialogSourcesWithCheck = (connector: any, sourceFilter: string, expandedSources: boolean, sourcesPage: number, sourcesPerPage: number) => {
-    if (!connector?.sources) return [];
-    return getPaginatedDialogSourcesUtil(connector, sourceFilter, expandedSources, sourcesPage, sourcesPerPage);
-  };
-
   // Function to get a citation source by ID
   const getCitationSource = React.useCallback((citationId: number, messageIndex?: number): Source | null => {
     if (!messages || messages.length === 0) return null;
@@ -638,23 +596,14 @@ const ChatPage = () => {
     if (assistantMessages.length === 0) return null;
 
     const latestAssistantMessage = assistantMessages[assistantMessages.length - 1];
-    if (!latestAssistantMessage?.annotations) return null;
-
-    // Find all SOURCES annotations
-    const annotations = latestAssistantMessage.annotations as any[];
-    const sourcesAnnotations = annotations.filter(
-      (annotation) => annotation.type === 'SOURCES'
-    );
-
-    // Get the latest SOURCES annotation
-    if (sourcesAnnotations.length === 0) return null;
-    const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
-
-    if (!latestSourcesAnnotation.content) return null;
+    // Use our helper function to get sources
+    const sources = getMessageConnectorSources(latestAssistantMessage);
+    if (sources.length === 0) return null;
 
     // Flatten all sources from all connectors
     const allSources: Source[] = [];
-    latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
+    sources.forEach((connector: ConnectorSource) => {
       if (connector.sources && Array.isArray(connector.sources)) {
         connector.sources.forEach((source: SourceItem) => {
           allSources.push({
@@ -675,23 +624,14 @@ const ChatPage = () => {
     } else {
       // Use the specific message by index
       const message = messages[messageIndex];
-      if (!message || message.role !== 'assistant' || !message.annotations) return null;
-
-      // Find all SOURCES annotations
-      const annotations = message.annotations as any[];
-      const sourcesAnnotations = annotations.filter(
-        (annotation) => annotation.type === 'SOURCES'
-      );
-
-      // Get the latest SOURCES annotation
-      if (sourcesAnnotations.length === 0) return null;
-      const latestSourcesAnnotation = sourcesAnnotations[sourcesAnnotations.length - 1];
-
-      if (!latestSourcesAnnotation.content) return null;
+      // Use our helper function to get sources
+      const sources = getMessageConnectorSources(message);
+      if (sources.length === 0) return null;
 
       // Flatten all sources from all connectors
       const allSources: Source[] = [];
-      latestSourcesAnnotation.content.forEach((connector: ConnectorSource) => {
+      sources.forEach((connector: ConnectorSource) => {
        if (connector.sources && Array.isArray(connector.sources)) {
          connector.sources.forEach((source: SourceItem) => {
            allSources.push({
@@ -712,6 +652,34 @@ const ChatPage = () => {
     }
   }, [messages]);
 
+  // Pure function for rendering terminal content - no hooks allowed here
+  const renderTerminalContent = (message: any) => {
+    if (!message.annotations) return null;
+
+    // Get all TERMINAL_INFO annotations
+    const terminalInfoAnnotations = (message.annotations as any[])
+      .filter(a => a.type === 'TERMINAL_INFO');
+
+    // Get the latest TERMINAL_INFO annotation
+    const latestTerminalInfo = terminalInfoAnnotations.length > 0
+      ? terminalInfoAnnotations[terminalInfoAnnotations.length - 1]
+      : null;
+
+    // Render the content of the latest TERMINAL_INFO annotation
+    return latestTerminalInfo?.content.map((item: any, idx: number) => (
+      <div key={idx} className="py-0.5 flex items-start text-gray-300">
+        <span className="text-gray-500 text-xs mr-2 w-10 flex-shrink-0">[{String(idx).padStart(2, '0')}:{String(Math.floor(idx * 2)).padStart(2, '0')}]</span>
+        <span className="mr-2 opacity-70">{'>'}</span>
+        <span className={`
+          ${item.type === 'info' ? 'text-blue-300' : ''}
+          ${item.type === 'success' ? 'text-green-300' : ''}
+          ${item.type === 'error' ? 'text-red-300' : ''}
+          ${item.type === 'warning' ? 'text-yellow-300' : ''}
+        `}>{item.text}</span>
+      </div>
+    ));
+  };
+
   return (
     <>
       <div className="flex flex-col min-h-[calc(100vh-4rem)] min-w-4xl max-w-4xl mx-auto px-4 py-8 overflow-x-hidden justify-center gap-4">
@@ -781,30 +749,9 @@ const ChatPage = () => {
                       <span className="mr-1">$</span>
                       <span>surfsense-researcher</span>
                     </div>
-                    {message.annotations && (() => {
-                      // Get all TERMINAL_INFO annotations
-                      const terminalInfoAnnotations = (message.annotations as any[])
-                        .filter(a => a.type === 'TERMINAL_INFO');
-
-                      // Get the latest TERMINAL_INFO annotation
-                      const latestTerminalInfo = terminalInfoAnnotations.length > 0
-                        ? terminalInfoAnnotations[terminalInfoAnnotations.length - 1]
-                        : null;
-
-                      // Render the content of the latest TERMINAL_INFO annotation
-                      return latestTerminalInfo?.content.map((item: any, idx: number) => (
-                        <div key={idx} className="py-0.5 flex items-start text-gray-300">
-                          <span className="text-gray-500 text-xs mr-2 w-10 flex-shrink-0">[{String(idx).padStart(2, '0')}:{String(Math.floor(idx * 2)).padStart(2, '0')}]</span>
-                          <span className="mr-2 opacity-70">{'>'}</span>
-                          <span className={`
-                            ${item.type === 'info' ? 'text-blue-300' : ''}
-                            ${item.type === 'success' ? 'text-green-300' : ''}
-                            ${item.type === 'error' ? 'text-red-300' : ''}
-                            ${item.type === 'warning' ? 'text-yellow-300' : ''}
-                          `}>{item.text}</span>
-                        </div>
-                      ));
-                    })()}
+                    {renderTerminalContent(message)}
                     <div className="mt-2 flex items-center">
                       <span className="text-gray-500 text-xs mr-2 w-10 flex-shrink-0">[00:13]</span>
                       <span className="text-green-400 mr-1">researcher@surfsense</span>
@@ -836,105 +783,120 @@ const ChatPage = () => {
                     <span className="font-medium">Sources</span>
                   </div>
 
-                  <Tabs
-                    defaultValue={connectorSources.length > 0 ? connectorSources[0].type : "CRAWLED_URL"}
-                    className="w-full"
-                    onValueChange={setActiveTab}
-                  >
+                  {(() => {
+                    // Get sources for this specific message
+                    const messageConnectorSources = getMessageConnectorSources(message);
+
+                    if (messageConnectorSources.length === 0) {
+                      return (
+                        <div className="text-center py-8 text-gray-500 dark:text-gray-400 border border-dashed rounded-md">
+                          <Database className="h-8 w-8 mx-auto mb-2 opacity-50" />
+                        </div>
+                      );
+                    }
+
+                    // Use these message-specific sources for the Tabs component
+                    return (
+                      <Tabs
+                        defaultValue={messageConnectorSources.length > 0 ? messageConnectorSources[0].type : "CRAWLED_URL"}
+                        className="w-full"
+                      >
                     <div className="mb-4">
                       <div className="flex items-center">
                         <Button
                           variant="ghost"
                           size="icon"
                           onClick={scrollTabsLeft}
                           className="flex-shrink-0 mr-2 z-10"
                           disabled={!canScrollLeft}
                         >
                           <ChevronLeft className="h-4 w-4" />
                         </Button>
 
                         <div className="flex-1 overflow-hidden">
                           <div className="flex overflow-x-auto hide-scrollbar" ref={tabsListRef} onScroll={updateScrollIndicators}>
                             <TabsList className="flex-1 bg-transparent border-0 p-0 custom-tabs-list">
-                              {connectorSources.map((connector) => (
+                              {messageConnectorSources.map((connector) => (
                                 <TabsTrigger
                                   key={connector.id}
                                   value={connector.type}
                                   className="flex items-center gap-1 mx-1 data-[state=active]:bg-gray-100 dark:data-[state=active]:bg-gray-800 rounded-md"
                                 >
                                   {getConnectorIcon(connector.type)}
                                   <span className="hidden sm:inline ml-1">{connector.name.split(' ')[0]}</span>
                                   <span className="bg-gray-200 dark:bg-gray-700 px-1.5 py-0.5 rounded text-xs">
-                                    {getSourcesCount(connector.type)}
+                                    {connector.sources?.length || 0}
                                   </span>
                                 </TabsTrigger>
                               ))}
                             </TabsList>
+                          </div>
+                        </div>
+
+                        <Button
+                          variant="ghost"
+                          size="icon"
+                          onClick={scrollTabsRight}
+                          className="flex-shrink-0 ml-2 z-10"
+                          disabled={!canScrollRight}
+                        >
+                          <ChevronRight className="h-4 w-4" />
+                        </Button>
                       </div>
                     </div>
 
-                    <Button
-                      variant="ghost"
-                      size="icon"
-                      onClick={scrollTabsRight}
-                      className="flex-shrink-0 ml-2 z-10"
-                      disabled={!canScrollRight}
-                    >
-                      <ChevronRight className="h-4 w-4" />
-                    </Button>
-                  </div>
-                </div>
-
-                {connectorSources.map(connector => (
-                  <TabsContent key={connector.id} value={connector.type} className="mt-0">
-                    <div className="space-y-3">
-                      {getMainViewSources(connector)?.map((source: any) => (
-                        <Card key={source.id} className="p-3 hover:bg-gray-50 dark:hover:bg-gray-800 cursor-pointer">
-                          <div className="flex items-start gap-3">
-                            <div className="flex-shrink-0 w-6 h-6 flex items-center justify-center">
-                              {getConnectorIcon(connector.type)}
-                            </div>
-                            <div className="flex-1">
-                              <h3 className="font-medium text-sm">{source.title}</h3>
-                              <p className="text-sm text-gray-500 dark:text-gray-400">{source.description}</p>
-                            </div>
-                            <Button
-                              variant="ghost"
-                              size="icon"
-                              className="h-6 w-6"
-                              onClick={() => window.open(source.url, '_blank')}
-                            >
-                              <ExternalLink className="h-4 w-4" />
-                            </Button>
-                          </div>
-                        </Card>
-                      ))}
+                    {messageConnectorSources.map(connector => (
+                      <TabsContent key={connector.id} value={connector.type} className="mt-0">
+                        <div className="space-y-3">
+                          {connector.sources?.slice(0, INITIAL_SOURCES_DISPLAY)?.map((source: any) => (
+                            <Card key={source.id} className="p-3 hover:bg-gray-50 dark:hover:bg-gray-800 cursor-pointer">
+                              <div className="flex items-start gap-3">
+                                <div className="flex-shrink-0 w-6 h-6 flex items-center justify-center">
+                                  {getConnectorIcon(connector.type)}
+                                </div>
+                                <div className="flex-1">
+                                  <h3 className="font-medium text-sm">{source.title}</h3>
+                                  <p className="text-sm text-gray-500 dark:text-gray-400">{source.description}</p>
+                                </div>
+                                <Button
+                                  variant="ghost"
+                                  size="icon"
+                                  className="h-6 w-6"
+                                  onClick={() => window.open(source.url, '_blank')}
+                                >
+                                  <ExternalLink className="h-4 w-4" />
+                                </Button>
+                              </div>
+                            </Card>
+                          ))}
+
+                          {connector.sources?.length > INITIAL_SOURCES_DISPLAY && (
+                            <Dialog open={dialogOpenId === connector.id} onOpenChange={(open) => setDialogOpenId(open ? connector.id : null)}>
+                              <DialogTrigger asChild>
+                                <Button variant="ghost" className="w-full text-sm text-gray-500 dark:text-gray-400">
+                                  Show {connector.sources.length - INITIAL_SOURCES_DISPLAY} More Sources
+                                </Button>
+                              </DialogTrigger>
+                              <DialogContent className="sm:max-w-[600px] max-h-[80vh] overflow-y-auto dark:border-gray-700">
+                                <SourcesDialogContent
+                                  connector={connector}
+                                  sourceFilter={sourceFilter}
+                                  expandedSources={expandedSources}
+                                  sourcesPage={sourcesPage}
+                                  setSourcesPage={setSourcesPage}
+                                  setSourceFilter={setSourceFilter}
+                                  setExpandedSources={setExpandedSources}
+                                  isLoadingMore={false}
+                                />
+                              </DialogContent>
+                            </Dialog>
+                          )}
+                        </div>
+                      </TabsContent>
+                    ))}
+                      </Tabs>
+                    );
+                  })()}
 
-                      {connector.sources.length > INITIAL_SOURCES_DISPLAY && (
-                        <Dialog open={dialogOpenId === connector.id} onOpenChange={(open) => setDialogOpenId(open ? connector.id : null)}>
-                          <DialogTrigger asChild>
<Button variant="ghost" className="w-full text-sm text-gray-500 dark:text-gray-400">
|
|
||||||
Show {connector.sources.length - INITIAL_SOURCES_DISPLAY} More Sources
|
|
||||||
</Button>
|
|
||||||
</DialogTrigger>
|
|
||||||
<DialogContent className="sm:max-w-[600px] max-h-[80vh] overflow-y-auto dark:border-gray-700">
|
|
||||||
<SourcesDialogContent
|
|
||||||
connector={connector}
|
|
||||||
sourceFilter={sourceFilter}
|
|
||||||
expandedSources={expandedSources}
|
|
||||||
sourcesPage={sourcesPage}
|
|
||||||
setSourcesPage={setSourcesPage}
|
|
||||||
setSourceFilter={setSourceFilter}
|
|
||||||
setExpandedSources={setExpandedSources}
|
|
||||||
isLoadingMore={false}
|
|
||||||
/>
|
|
||||||
</DialogContent>
|
|
||||||
</Dialog>
|
|
||||||
)}
|
|
||||||
</div>
|
|
||||||
</TabsContent>
|
|
||||||
))}
|
|
||||||
</Tabs>
|
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
{/* Answer Section */}
|
{/* Answer Section */}
|
||||||
|
|
@@ -1014,15 +976,17 @@ const ChatPage = () => {
               <span className="sr-only">Send</span>
             </Button>
           </form>
-          <div className="flex items-center justify-between px-2 py-1 mt-8">
-            <div className="flex items-center gap-4">
+          <div className="flex items-center justify-between px-2 py-2 mt-3">
+            <div className="flex items-center space-x-3">
               {/* Connector Selection Dialog */}
               <Dialog>
                 <DialogTrigger asChild>
-                  <ConnectorButton
-                    selectedConnectors={selectedConnectors}
-                    onClick={() => { }}
-                  />
+                  <div className="h-8">
+                    <ConnectorButton
+                      selectedConnectors={selectedConnectors}
+                      onClick={() => { }}
+                    />
+                  </div>
                 </DialogTrigger>
                 <DialogContent className="sm:max-w-md">
                   <DialogHeader>
@@ -1089,12 +1053,40 @@ const ChatPage = () => {
                 </DialogContent>
               </Dialog>

+              {/* Search Mode Control */}
+              <div className="flex items-center p-0.5 rounded-md border border-border bg-muted/20 h-8">
+                <button
+                  onClick={() => setSearchMode('DOCUMENTS')}
+                  className={`flex h-full items-center justify-center gap-1 px-2 rounded text-xs font-medium transition-colors flex-1 whitespace-nowrap overflow-hidden ${
+                    searchMode === 'DOCUMENTS'
+                      ? 'bg-primary text-primary-foreground shadow-sm'
+                      : 'text-muted-foreground hover:text-foreground hover:bg-muted/50'
+                  }`}
+                >
+                  <FileText className="h-3 w-3 flex-shrink-0 mr-1" />
+                  <span>Full Document</span>
+                </button>
+                <button
+                  onClick={() => setSearchMode('CHUNKS')}
+                  className={`flex h-full items-center justify-center gap-1 px-2 rounded text-xs font-medium transition-colors flex-1 whitespace-nowrap overflow-hidden ${
+                    searchMode === 'CHUNKS'
+                      ? 'bg-primary text-primary-foreground shadow-sm'
+                      : 'text-muted-foreground hover:text-foreground hover:bg-muted/50'
+                  }`}
+                >
+                  <Grid3x3 className="h-3 w-3 flex-shrink-0 mr-1" />
+                  <span>Document Chunks</span>
+                </button>
+              </div>

               {/* Research Mode Segmented Control */}
-              <SegmentedControl<ResearchMode>
-                value={researchMode}
-                onChange={setResearchMode}
-                options={researcherOptions}
-              />
+              <div className="h-8">
+                <SegmentedControl<ResearchMode>
+                  value={researchMode}
+                  onChange={setResearchMode}
+                  options={researcherOptions}
+                />
+              </div>
             </div>
           </div>
         </div>
@@ -4,7 +4,7 @@ import React from 'react'
 import Link from 'next/link'
 import { motion } from 'framer-motion'
 import { Button } from '@/components/ui/button'
-import { Plus, Search, Trash2, AlertCircle, Loader2 } from 'lucide-react'
+import { Plus, Search, Trash2, AlertCircle, Loader2, LogOut } from 'lucide-react'
 import { Tilt } from '@/components/ui/tilt'
 import { Spotlight } from '@/components/ui/spotlight'
 import { Logo } from '@/components/Logo';
@@ -145,11 +145,19 @@ const DashboardPage = () => {
     },
   };

+  const router = useRouter();
   const { searchSpaces, loading, error, refreshSearchSpaces } = useSearchSpaces();

   if (loading) return <LoadingScreen />;
   if (error) return <ErrorScreen message={error} />;

+  const handleLogout = () => {
+    if (typeof window !== 'undefined') {
+      localStorage.removeItem('surfsense_bearer_token');
+      router.push('/');
+    }
+  };
+
   const handleDeleteSearchSpace = async (id: number) => {
     // Send DELETE request to the API
     try {
@@ -193,7 +201,18 @@ const DashboardPage = () => {
           </p>
         </div>
       </div>
-      <ThemeTogglerComponent />
+      <div className="flex items-center space-x-3">
+        <Button
+          variant="ghost"
+          size="icon"
+          onClick={handleLogout}
+          className="h-9 w-9 rounded-full"
+          aria-label="Logout"
+        >
+          <LogOut className="h-5 w-5" />
+        </Button>
+        <ThemeTogglerComponent />
+      </div>
     </div>

     <div className="flex flex-col space-y-6 mt-6">
@@ -45,6 +45,7 @@
   --sidebar-accent-foreground: oklch(0.205 0 0);
   --sidebar-border: oklch(0.922 0 0);
   --sidebar-ring: oklch(0.708 0 0);
+  --syntax-bg: #f5f5f5;
 }

 .dark {
@@ -80,6 +81,7 @@
   --sidebar-accent-foreground: oklch(0.985 0 0);
   --sidebar-border: oklch(0.269 0 0);
   --sidebar-ring: oklch(0.439 0 0);
+  --syntax-bg: #1e1e1e;
 }

 @theme inline {
@@ -15,35 +15,67 @@ const roboto = Roboto({
 });

 export const metadata: Metadata = {
-  title: "SurfSense - A Personal NotebookLM and Perplexity-like AI Assistant for Everyone.",
+  title: "SurfSense – Customizable AI Research & Knowledge Management Assistant",
   description:
-    "Have your own private NotebookLM and Perplexity with better integrations.",
-  openGraph: {
-    images: [
-      {
-        url: "https://surfsense.net/og-image.png",
-        width: 1200,
-        height: 630,
-        alt: "SurfSense - A Personal NotebookLM and Perplexity-like AI Assistant for Everyone.",
-      },
-    ],
-  },
-  twitter: {
-    card: "summary_large_image",
-    site: "https://surfsense.net",
-    creator: "https://surfsense.net",
-    title: "SurfSense - A Personal NotebookLM and Perplexity-like AI Assistant for Everyone.",
-    description:
-      "Have your own private NotebookLM and Perplexity with better integrations.",
-    images: [
-      {
-        url: "https://surfsense.net/og-image.png",
-        width: 1200,
-        height: 630,
-        alt: "SurfSense - A Personal NotebookLM and Perplexity-like AI Assistant for Everyone.",
-      },
-    ],
-  },
+    "SurfSense is an AI-powered research assistant that integrates with tools like Notion, GitHub, Slack, and more to help you efficiently manage, search, and chat with your documents. Generate podcasts, perform hybrid search, and unlock insights from your knowledge base.",
+  keywords: [
+    "SurfSense",
+    "AI research assistant",
+    "AI knowledge management",
+    "AI document assistant",
+    "customizable AI assistant",
+    "notion integration",
+    "slack integration",
+    "github integration",
+    "hybrid search",
+    "vector search",
+    "RAG",
+    "LangChain",
+    "FastAPI",
+    "LLM apps",
+    "AI document chat",
+    "knowledge management AI",
+    "AI-powered document search",
+    "personal AI assistant",
+    "AI research tools",
+    "AI podcast generator",
+    "AI knowledge base",
+    "AI document assistant tools",
+    "AI-powered search assistant",
+  ],
+  openGraph: {
+    title: "SurfSense – AI Research & Knowledge Management Assistant",
+    description:
+      "Connect your documents and tools like Notion, Slack, GitHub, and more to your private AI assistant. SurfSense offers powerful search, document chat, podcast generation, and RAG APIs to enhance your workflow.",
+    url: "https://surfsense.net",
+    siteName: "SurfSense",
+    type: "website",
+    images: [
+      {
+        url: "https://surfsense.net/og-image.png",
+        width: 1200,
+        height: 630,
+        alt: "SurfSense AI Research Assistant",
+      },
+    ],
+    locale: "en_US",
+  },
+  twitter: {
+    card: "summary_large_image",
+    title: "SurfSense – AI Assistant for Research & Knowledge Management",
+    description:
+      "Have your own NotebookLM or Perplexity, but better. SurfSense connects external tools, allows chat with your documents, and generates fast, high-quality podcasts.",
+    creator: "https://surfsense.net",
+    site: "https://surfsense.net",
+    images: [
+      {
+        url: "https://surfsense.net/og-image-twitter.png",
+        width: 1200,
+        height: 630,
+        alt: "SurfSense AI Assistant Preview",
+      },
+    ],
+  },
 };

 export default async function RootLayout({
43 surfsense_web/app/login/AmbientBackground.tsx Normal file
@@ -0,0 +1,43 @@
+"use client";
+import React from "react";
+
+export const AmbientBackground = () => {
+  return (
+    <div className="pointer-events-none absolute left-0 top-0 z-0 h-screen w-screen">
+      <div
+        style={{
+          transform: "translateY(-350px) rotate(-45deg)",
+          width: "560px",
+          height: "1380px",
+          background:
+            "radial-gradient(68.54% 68.72% at 55.02% 31.46%, rgba(59, 130, 246, 0.08) 0%, rgba(59, 130, 246, 0.02) 50%, rgba(59, 130, 246, 0) 100%)",
+        }}
+        className="absolute left-0 top-0"
+      />
+      <div
+        style={{
+          transform: "rotate(-45deg) translate(5%, -50%)",
+          transformOrigin: "top left",
+          width: "240px",
+          height: "1380px",
+          background:
+            "radial-gradient(50% 50% at 50% 50%, rgba(59, 130, 246, 0.06) 0%, rgba(59, 130, 246, 0.02) 80%, transparent 100%)",
+        }}
+        className="absolute left-0 top-0"
+      />
+      <div
+        style={{
+          position: "absolute",
+          borderRadius: "20px",
+          transform: "rotate(-45deg) translate(-180%, -70%)",
+          transformOrigin: "top left",
+          width: "240px",
+          height: "1380px",
+          background:
+            "radial-gradient(50% 50% at 50% 50%, rgba(59, 130, 246, 0.04) 0%, rgba(59, 130, 246, 0.02) 80%, transparent 100%)",
+        }}
+        className="absolute left-0 top-0"
+      />
+    </div>
+  );
+};
@@ -3,6 +3,7 @@ import React from "react";
 import { IconBrandGoogleFilled } from "@tabler/icons-react";
 import { motion } from "framer-motion";
 import { Logo } from "@/components/Logo";
+import { AmbientBackground } from "./AmbientBackground";

 export function GoogleLoginButton() {
   const handleGoogleLogin = () => {
@@ -34,6 +35,42 @@ export function GoogleLoginButton() {
         Welcome Back
       </h1>

+      <motion.div
+        initial={{ opacity: 0, y: -5 }}
+        animate={{ opacity: 1, y: 0 }}
+        transition={{ duration: 0.3 }}
+        className="mb-4 w-full overflow-hidden rounded-lg border border-yellow-200 bg-yellow-50 text-yellow-900 shadow-sm dark:border-yellow-900/30 dark:bg-yellow-900/20 dark:text-yellow-200"
+      >
+        <motion.div
+          className="flex items-center gap-2 p-4"
+          initial={{ x: -5 }}
+          animate={{ x: 0 }}
+          transition={{ delay: 0.1, duration: 0.2 }}
+        >
+          <svg
+            xmlns="http://www.w3.org/2000/svg"
+            width="16"
+            height="16"
+            viewBox="0 0 24 24"
+            fill="none"
+            stroke="currentColor"
+            strokeWidth="2"
+            strokeLinecap="round"
+            strokeLinejoin="round"
+            className="flex-shrink-0"
+          >
+            <path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"/>
+            <line x1="12" y1="9" x2="12" y2="13"/>
+            <line x1="12" y1="17" x2="12.01" y2="17"/>
+          </svg>
+          <div className="ml-1">
+            <p className="text-sm font-medium">
+              SurfSense Cloud is currently in development. Check <a href="/docs" className="text-blue-600 underline dark:text-blue-400 hover:text-blue-800 dark:hover:text-blue-300">Docs</a> for more information on Self-Hosted version.
+            </p>
+          </div>
+        </motion.div>
+      </motion.div>
+
       <motion.button
         whileHover={{ scale: 1.02 }}
         whileTap={{ scale: 0.98 }}
@@ -52,47 +89,4 @@ export function GoogleLoginButton() {
       </div>
     </div>
   );
 }
-
-const AmbientBackground = () => {
-  return (
-    <div className="pointer-events-none absolute left-0 top-0 z-0 h-screen w-screen">
-      <div
-        style={{
-          transform: "translateY(-350px) rotate(-45deg)",
-          width: "560px",
-          height: "1380px",
-          background:
-            "radial-gradient(68.54% 68.72% at 55.02% 31.46%, rgba(59, 130, 246, 0.08) 0%, rgba(59, 130, 246, 0.02) 50%, rgba(59, 130, 246, 0) 100%)",
-        }}
-        className="absolute left-0 top-0"
-      />
-      <div
-        style={{
-          transform: "rotate(-45deg) translate(5%, -50%)",
-          transformOrigin: "top left",
-          width: "240px",
-          height: "1380px",
-          background:
-            "radial-gradient(50% 50% at 50% 50%, rgba(59, 130, 246, 0.06) 0%, rgba(59, 130, 246, 0.02) 80%, transparent 100%)",
-        }}
-        className="absolute left-0 top-0"
-      />
-      <div
-        style={{
-          position: "absolute",
-          borderRadius: "20px",
-          transform: "rotate(-45deg) translate(-180%, -70%)",
-          transformOrigin: "top left",
-          width: "240px",
-          height: "1380px",
-          background:
-            "radial-gradient(50% 50% at 50% 50%, rgba(59, 130, 246, 0.04) 0%, rgba(59, 130, 246, 0.02) 80%, transparent 100%)",
-        }}
-        className="absolute left-0 top-0"
-      />
-    </div>
-  );
-};
114 surfsense_web/app/login/LocalLoginForm.tsx Normal file
@@ -0,0 +1,114 @@
+"use client";
+import React, { useState, useEffect } from "react";
+import { useRouter } from "next/navigation";
+import Link from "next/link";
+
+export function LocalLoginForm() {
+  const [username, setUsername] = useState("");
+  const [password, setPassword] = useState("");
+  const [error, setError] = useState("");
+  const [isLoading, setIsLoading] = useState(false);
+  const [authType, setAuthType] = useState<string | null>(null);
+  const router = useRouter();
+
+  useEffect(() => {
+    // Get the auth type from environment variables
+    setAuthType(process.env.NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE || "GOOGLE");
+  }, []);
+
+  const handleSubmit = async (e: React.FormEvent) => {
+    e.preventDefault();
+    setIsLoading(true);
+    setError("");
+
+    try {
+      // Create form data for the API request
+      const formData = new URLSearchParams();
+      formData.append("username", username);
+      formData.append("password", password);
+      formData.append("grant_type", "password");
+
+      const response = await fetch(
+        `${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/auth/jwt/login`,
+        {
+          method: "POST",
+          headers: {
+            "Content-Type": "application/x-www-form-urlencoded",
+          },
+          body: formData.toString(),
+        }
+      );
+
+      const data = await response.json();
+
+      if (!response.ok) {
+        throw new Error(data.detail || "Failed to login");
+      }
+
+      router.push("/auth/callback?token=" + data.access_token);
+    } catch (err: any) {
+      setError(err.message || "An error occurred during login");
+    } finally {
+      setIsLoading(false);
+    }
+  };
+
+  return (
+    <div className="w-full max-w-md">
+      <form onSubmit={handleSubmit} className="space-y-4">
+        {error && (
+          <div className="rounded-md bg-red-50 p-4 text-sm text-red-500 dark:bg-red-900/20 dark:text-red-200">
+            {error}
+          </div>
+        )}
+
+        <div>
+          <label htmlFor="email" className="block text-sm font-medium text-gray-700 dark:text-gray-300">
+            Email
+          </label>
+          <input
+            id="email"
+            type="email"
+            required
+            value={username}
+            onChange={(e) => setUsername(e.target.value)}
+            className="mt-1 block w-full rounded-md border border-gray-300 bg-white px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-blue-500 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
+          />
+        </div>
+
+        <div>
+          <label htmlFor="password" className="block text-sm font-medium text-gray-700 dark:text-gray-300">
+            Password
+          </label>
+          <input
+            id="password"
+            type="password"
+            required
+            value={password}
+            onChange={(e) => setPassword(e.target.value)}
+            className="mt-1 block w-full rounded-md border border-gray-300 bg-white px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-blue-500 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
+          />
+        </div>
+
+        <button
+          type="submit"
+          disabled={isLoading}
+          className="w-full rounded-md bg-blue-600 px-4 py-2 text-white shadow-sm hover:bg-blue-700 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50"
+        >
+          {isLoading ? "Signing in..." : "Sign in"}
+        </button>
+      </form>

+      {authType === "LOCAL" && (
+        <div className="mt-4 text-center text-sm">
+          <p className="text-gray-600 dark:text-gray-400">
+            Don't have an account?{" "}
+            <Link href="/register" className="font-medium text-blue-600 hover:text-blue-500 dark:text-blue-400">
+              Register here
+            </Link>
+          </p>
+        </div>
+      )}
+    </div>
+  );
+}
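For reference, a minimal sketch of the request body this form builds for `POST /auth/jwt/login` (an OAuth2 password grant, form-encoded as in the file above). The helper name and credentials here are illustrative, not part of the codebase:

```typescript
// Builds the x-www-form-urlencoded body LocalLoginForm sends to /auth/jwt/login.
// buildLoginBody is a hypothetical helper; the credentials are placeholders.
function buildLoginBody(username: string, password: string): string {
  const formData = new URLSearchParams();
  formData.append("username", username);   // fastapi-users expects "username", even for emails
  formData.append("password", password);
  formData.append("grant_type", "password"); // OAuth2 password grant
  return formData.toString();
}

console.log(buildLoginBody("user@example.com", "secret"));
// → "username=user%40example.com&password=secret&grant_type=password"
```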
@@ -1,5 +1,89 @@
+"use client";
+
+import { useState, useEffect, Suspense } from "react";
 import { GoogleLoginButton } from "./GoogleLoginButton";
+import { LocalLoginForm } from "./LocalLoginForm";
+import { Logo } from "@/components/Logo";
+import { AmbientBackground } from "./AmbientBackground";
+import { useSearchParams } from "next/navigation";
+import { Loader2 } from "lucide-react";
+
+function LoginContent() {
+  const [authType, setAuthType] = useState<string | null>(null);
+  const [registrationSuccess, setRegistrationSuccess] = useState(false);
+  const [isLoading, setIsLoading] = useState(true);
+  const searchParams = useSearchParams();
+
+  useEffect(() => {
+    // Check if the user was redirected from registration
+    if (searchParams.get("registered") === "true") {
+      setRegistrationSuccess(true);
+    }
+
+    // Get the auth type from environment variables
+    setAuthType(process.env.NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE || "GOOGLE");
+    setIsLoading(false);
+  }, [searchParams]);
+
+  // Show loading state while determining auth type
+  if (isLoading) {
+    return (
+      <div className="relative w-full overflow-hidden">
+        <AmbientBackground />
+        <div className="mx-auto flex h-screen max-w-lg flex-col items-center justify-center">
+          <Logo className="rounded-md" />
+          <div className="mt-8 flex items-center space-x-2">
+            <Loader2 className="h-5 w-5 animate-spin text-muted-foreground" />
+            <span className="text-muted-foreground">Loading...</span>
+          </div>
+        </div>
+      </div>
+    );
+  }
+
+  if (authType === "GOOGLE") {
+    return <GoogleLoginButton />;
+  }
+
+  return (
+    <div className="relative w-full overflow-hidden">
+      <AmbientBackground />
+      <div className="mx-auto flex h-screen max-w-lg flex-col items-center justify-center">
+        <Logo className="rounded-md" />
+        <h1 className="my-8 text-xl font-bold text-neutral-800 dark:text-neutral-100 md:text-4xl">
+          Sign In
+        </h1>
+
+        {registrationSuccess && (
+          <div className="mb-4 w-full rounded-md bg-green-50 p-4 text-sm text-green-500 dark:bg-green-900/20 dark:text-green-200">
+            Registration successful! You can now sign in with your credentials.
+          </div>
+        )}
+
+        <LocalLoginForm />
+      </div>
+    </div>
+  );
+}
+
+// Loading fallback for Suspense
+const LoadingFallback = () => (
+  <div className="relative w-full overflow-hidden">
+    <AmbientBackground />
+    <div className="mx-auto flex h-screen max-w-lg flex-col items-center justify-center">
+      <Logo className="rounded-md" />
+      <div className="mt-8 flex items-center space-x-2">
+        <Loader2 className="h-5 w-5 animate-spin text-muted-foreground" />
+        <span className="text-muted-foreground">Loading...</span>
+      </div>
+    </div>
+  </div>
+);
+
 export default function LoginPage() {
-  return <GoogleLoginButton />;
+  return (
+    <Suspense fallback={<LoadingFallback />}>
+      <LoginContent />
+    </Suspense>
+  );
 }
149 surfsense_web/app/register/page.tsx Normal file
@@ -0,0 +1,149 @@
+"use client";
+import React, { useState, useEffect } from "react";
+import { useRouter } from "next/navigation";
+import Link from "next/link";
+import { Logo } from "@/components/Logo";
+import { AmbientBackground } from "../login/AmbientBackground";
+
+export default function RegisterPage() {
+  const [email, setEmail] = useState("");
+  const [password, setPassword] = useState("");
+  const [confirmPassword, setConfirmPassword] = useState("");
+  const [error, setError] = useState("");
+  const [isLoading, setIsLoading] = useState(false);
+  const router = useRouter();
+
+  // Check authentication type and redirect if not LOCAL
+  useEffect(() => {
+    const authType = process.env.NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE || "GOOGLE";
+    if (authType !== "LOCAL") {
+      router.push("/login");
+    }
+  }, [router]);
+
+  const handleSubmit = async (e: React.FormEvent) => {
+    e.preventDefault();
+
+    // Form validation
+    if (password !== confirmPassword) {
+      setError("Passwords do not match");
+      return;
+    }
+
+    setIsLoading(true);
+    setError("");
+
+    try {
+      const response = await fetch(
+        `${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/auth/register`,
+        {
+          method: "POST",
+          headers: {
+            "Content-Type": "application/json",
+          },
+          body: JSON.stringify({
+            email,
+            password,
+            is_active: true,
+            is_superuser: false,
+            is_verified: false,
+          }),
+        }
+      );
+
+      const data = await response.json();
+
+      if (!response.ok) {
+        throw new Error(data.detail || "Registration failed");
+      }
+
+      // Redirect to login page after successful registration
+      router.push("/login?registered=true");
+    } catch (err: any) {
+      setError(err.message || "An error occurred during registration");
+    } finally {
+      setIsLoading(false);
+    }
+  };
+
+  return (
+    <div className="relative w-full overflow-hidden">
+      <AmbientBackground />
+      <div className="mx-auto flex h-screen max-w-lg flex-col items-center justify-center">
+        <Logo className="rounded-md" />
+        <h1 className="my-8 text-xl font-bold text-neutral-800 dark:text-neutral-100 md:text-4xl">
+          Create an Account
+        </h1>
+
+        <div className="w-full max-w-md">
+          <form onSubmit={handleSubmit} className="space-y-4">
+            {error && (
+              <div className="rounded-md bg-red-50 p-4 text-sm text-red-500 dark:bg-red-900/20 dark:text-red-200">
+                {error}
+              </div>
+            )}
+
+            <div>
+              <label htmlFor="email" className="block text-sm font-medium text-gray-700 dark:text-gray-300">
+                Email
+              </label>
+              <input
+                id="email"
+                type="email"
+                required
+                value={email}
+                onChange={(e) => setEmail(e.target.value)}
+                className="mt-1 block w-full rounded-md border border-gray-300 bg-white px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-blue-500 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
+              />
+            </div>
+
+            <div>
+              <label htmlFor="password" className="block text-sm font-medium text-gray-700 dark:text-gray-300">
+                Password
+              </label>
+              <input
+                id="password"
+                type="password"
+                required
+                value={password}
+                onChange={(e) => setPassword(e.target.value)}
+                className="mt-1 block w-full rounded-md border border-gray-300 bg-white px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-blue-500 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
+              />
+            </div>
+
+            <div>
+              <label htmlFor="confirmPassword" className="block text-sm font-medium text-gray-700 dark:text-gray-300">
+                Confirm Password
+              </label>
+              <input
+                id="confirmPassword"
+                type="password"
+                required
+                value={confirmPassword}
+                onChange={(e) => setConfirmPassword(e.target.value)}
+                className="mt-1 block w-full rounded-md border border-gray-300 bg-white px-3 py-2 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-blue-500 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
+              />
+            </div>
+
+            <button
+              type="submit"
+              disabled={isLoading}
+              className="w-full rounded-md bg-blue-600 px-4 py-2 text-white shadow-sm hover:bg-blue-700 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50"
+            >
+              {isLoading ? "Creating account..." : "Register"}
+            </button>
+          </form>
+
+          <div className="mt-4 text-center text-sm">
+            <p className="text-gray-600 dark:text-gray-400">
+              Already have an account?{" "}
+              <Link href="/login" className="font-medium text-blue-600 hover:text-blue-500 dark:text-blue-400">
+                Sign in
+              </Link>
+            </p>
+          </div>
+        </div>
+      </div>
+    </div>
+  );
+}
48
surfsense_web/app/sitemap.ts
Normal file
@@ -0,0 +1,48 @@
+import type { MetadataRoute } from 'next'
+
+export default function sitemap(): MetadataRoute.Sitemap {
+  return [
+    {
+      url: 'https://www.surfsense.net/',
+      lastModified: new Date(),
+      changeFrequency: 'yearly',
+      priority: 1,
+    },
+    {
+      url: 'https://www.surfsense.net/privacy',
+      lastModified: new Date(),
+      changeFrequency: 'monthly',
+      priority: 0.9,
+    },
+    {
+      url: 'https://www.surfsense.net/terms',
+      lastModified: new Date(),
+      changeFrequency: 'monthly',
+      priority: 0.9,
+    },
+    {
+      url: 'https://www.surfsense.net/docs',
+      lastModified: new Date(),
+      changeFrequency: 'weekly',
+      priority: 0.9,
+    },
+    {
+      url: 'https://www.surfsense.net/docs/installation',
+      lastModified: new Date(),
+      changeFrequency: 'weekly',
+      priority: 0.9,
+    },
+    {
+      url: 'https://www.surfsense.net/docs/docker-installation',
+      lastModified: new Date(),
+      changeFrequency: 'weekly',
+      priority: 0.9,
+    },
+    {
+      url: 'https://www.surfsense.net/docs/manual-installation',
+      lastModified: new Date(),
+      changeFrequency: 'weekly',
+      priority: 0.9,
+    },
+  ]
+}
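Every entry in the sitemap above repeats the same object shape with a different URL. As an illustration only, the docs entries could be generated from a path list with a small helper; `buildEntries` is hypothetical and does not exist in the repository:

```typescript
// Hypothetical helper: build sitemap entries that share defaults from a path list.
type SitemapEntry = {
  url: string;
  lastModified: Date;
  changeFrequency: "yearly" | "monthly" | "weekly";
  priority: number;
};

function buildEntries(
  base: string,
  paths: string[],
  changeFrequency: SitemapEntry["changeFrequency"] = "weekly",
  priority = 0.9
): SitemapEntry[] {
  return paths.map((path) => ({
    // URL resolution mirrors the absolute URLs written out by hand above.
    url: new URL(path, base).toString(),
    lastModified: new Date(),
    changeFrequency,
    priority,
  }));
}

const docEntries = buildEntries("https://www.surfsense.net", [
  "/docs",
  "/docs/installation",
  "/docs/docker-installation",
  "/docs/manual-installation",
]);
console.log(docEntries.map((e) => e.url));
```

The hand-written version in the diff is arguably clearer for a fixed, small set of routes; a helper like this only pays off once the path list grows.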
@@ -19,6 +19,17 @@ export function ModernHeroWithGradients() {
       <DarkModeGradient />

       <div className="relative z-20 flex flex-col items-center justify-center overflow-hidden rounded-3xl p-4 md:p-12 lg:p-16">
+        <div className="flex justify-center w-full mb-4">
+          <Link href="https://github.com/MODSetter/SurfSense" target="_blank" rel="noopener noreferrer">
+            <img
+              src="https://trendshift.io/api/badge/repositories/13606"
+              alt="MODSetter%2FSurfSense | Trendshift"
+              style={{ width: "250px", height: "55px" }}
+              width={250}
+              height={55}
+            />
+          </Link>
+        </div>
         <Link
           href="/docs"
           className="flex items-center gap-1 rounded-full border border-gray-200 bg-gradient-to-b from-gray-50 to-gray-100 px-4 py-1 text-center text-sm text-gray-800 shadow-sm dark:border-[#404040] dark:bg-gradient-to-b dark:from-[#5B5B5D] dark:to-[#262627] dark:text-white dark:shadow-inner dark:shadow-purple-500/10"
@@ -36,7 +47,7 @@
         </h1>
       </div>
       <p className="mx-auto max-w-3xl py-6 text-center text-base text-gray-600 dark:text-neutral-300 md:text-lg lg:text-xl">
-        A Customizable AI Research Agent just like NotebookLM or Perplexity, but connected to external sources such as search engines (Tavily), Slack, Linear, Notion, YouTube, GitHub and more.
+        A Customizable AI Research Agent just like NotebookLM or Perplexity, but connected to external sources such as search engines (Tavily, LinkUp), Slack, Linear, Notion, YouTube, GitHub and more.
       </p>
       <div className="flex flex-col items-center gap-6 py-6 sm:flex-row">
         <Link
@@ -1,6 +1,6 @@
 "use client";
 import { cn } from "@/lib/utils";
-import { IconMenu2, IconX, IconBrandGoogleFilled } from "@tabler/icons-react";
+import { IconMenu2, IconX, IconBrandGoogleFilled, IconUser } from "@tabler/icons-react";
 import {
   motion,
   AnimatePresence,
@@ -62,26 +62,10 @@ export const Navbar = () => {

 const DesktopNav = ({ navItems, visible }: NavbarProps) => {
   const [hoveredIndex, setHoveredIndex] = useState<number | null>(null);

   const handleGoogleLogin = () => {
-    // Redirect to Google OAuth authorization URL
-    fetch(`${process.env.NEXT_PUBLIC_FASTAPI_BACKEND_URL}/auth/google/authorize`)
-      .then((response) => {
-        if (!response.ok) {
-          throw new Error('Failed to get authorization URL');
-        }
-        return response.json();
-      })
-      .then((data) => {
-        if (data.authorization_url) {
-          window.location.href = data.authorization_url;
-        } else {
-          console.error('No authorization URL received');
-        }
-      })
-      .catch((error) => {
-        console.error('Error during Google login:', error);
-      });
+    // Redirect to the login page
+    window.location.href = '/login';
   };

   return (
@@ -89,8 +73,8 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {
       onMouseLeave={() => setHoveredIndex(null)}
       animate={{
         backdropFilter: "blur(16px)",
         background: visible
           ? "rgba(var(--background-rgb), 0.8)"
           : "rgba(var(--background-rgb), 0.6)",
         width: visible ? "38%" : "80%",
         height: visible ? "48px" : "64px",
@@ -115,7 +99,7 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {
       } as React.CSSProperties}
     >
       <div className="flex flex-row items-center gap-2">
         <Logo className="h-8 w-8 rounded-md" />
         <span className="dark:text-white/90 text-gray-800 text-lg font-bold">SurfSense</span>
       </div>
       <div className="flex items-center gap-4">
@@ -191,8 +175,8 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {
           variant="outline"
           className="hidden cursor-pointer md:flex items-center gap-2 rounded-full dark:bg-white/20 dark:hover:bg-white/30 dark:text-white bg-gray-100 hover:bg-gray-200 text-gray-800 border-0"
         >
-          <IconBrandGoogleFilled className="h-4 w-4" />
-          <span>Sign in with Google</span>
+          <IconUser className="h-4 w-4" />
+          <span>Sign in</span>
         </Button>
       </motion.div>
     )}
@@ -204,19 +188,19 @@ const DesktopNav = ({ navItems, visible }: NavbarProps) => {

 const MobileNav = ({ navItems, visible }: NavbarProps) => {
   const [open, setOpen] = useState(false);

   const handleGoogleLogin = () => {
     // Redirect to the login page
     window.location.href = "./login";
   };

   return (
     <>
       <motion.div
         animate={{
           backdropFilter: "blur(16px)",
           background: visible
             ? "rgba(var(--background-rgb), 0.8)"
             : "rgba(var(--background-rgb), 0.6)",
           width: visible ? "80%" : "90%",
           y: visible ? 0 : 8,
@@ -241,7 +225,7 @@ const MobileNav = ({ navItems, visible }: NavbarProps) => {
         } as React.CSSProperties}
       >
         <div className="flex flex-row justify-between items-center w-full">
           <Logo className="h-8 w-8 rounded-md" />
           <div className="flex items-center gap-2">
             <ThemeTogglerComponent />
             {open ? (
@@ -294,8 +278,8 @@ const MobileNav = ({ navItems, visible }: NavbarProps) => {
             variant="outline"
             className="flex cursor-pointer items-center gap-2 mt-4 w-full justify-center rounded-full dark:bg-white/20 dark:hover:bg-white/30 dark:text-white bg-gray-100 hover:bg-gray-200 text-gray-800 border-0"
           >
-            <IconBrandGoogleFilled className="h-4 w-4" />
-            <span>Sign in with Google</span>
+            <IconUser className="h-4 w-4" />
+            <span>Sign in</span>
           </Button>
         </motion.div>
       )}
@@ -11,7 +11,7 @@ import {
   Link,
   Webhook,
 } from 'lucide-react';
-import { IconBrandNotion, IconBrandSlack, IconBrandYoutube, IconBrandGithub, IconLayoutKanban } from "@tabler/icons-react";
+import { IconBrandNotion, IconBrandSlack, IconBrandYoutube, IconBrandGithub, IconLayoutKanban, IconLinkPlus } from "@tabler/icons-react";
 import { Button } from '@/components/ui/button';
 import { Connector, ResearchMode } from './types';

@@ -20,6 +20,8 @@ export const getConnectorIcon = (connectorType: string) => {
   const iconProps = { className: "h-4 w-4" };

   switch(connectorType) {
+    case 'LINKUP_API':
+      return <IconLinkPlus {...iconProps} />;
     case 'LINEAR_CONNECTOR':
       return <IconLayoutKanban {...iconProps} />;
     case 'GITHUB_CONNECTOR':
@@ -145,7 +147,7 @@ export const ConnectorButton = ({ selectedConnectors, onClick, connectorSources
   return (
     <Button
       variant="outline"
-      className="h-7 px-2 text-xs font-medium rounded-md border-border relative overflow-hidden group scale-90 origin-left"
+      className="h-8 px-2 text-xs font-medium rounded-md border-border relative overflow-hidden group"
       onClick={onClick}
       aria-label={selectedCount === 0 ? "Select Connectors" : `${selectedCount} connectors selected`}
     >
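The new `LINKUP_API` branch above extends the existing switch-on-string dispatch from connector type to icon. A standalone sketch of that pattern, with the JSX icon components replaced by string stand-ins so it runs outside React (the `default` fallback is an assumption of this sketch, not taken from the diff):

```typescript
// Stand-in for getConnectorIcon: map a connector type string to an icon name.
// Names mirror the Tabler icons used in the real component.
function getConnectorIconName(connectorType: string): string {
  switch (connectorType) {
    case "LINKUP_API":
      return "IconLinkPlus";
    case "LINEAR_CONNECTOR":
      return "IconLayoutKanban";
    case "GITHUB_CONNECTOR":
      return "IconBrandGithub";
    default:
      return "Webhook"; // assumed fallback for unknown connector types
  }
}

console.log(getConnectorIconName("LINKUP_API")); // → IconLinkPlus
```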
@@ -15,11 +15,11 @@ type SegmentedControlProps<T extends string> = {
  */
 function SegmentedControl<T extends string>({ value, onChange, options }: SegmentedControlProps<T>) {
   return (
-    <div className="flex rounded-md border border-border overflow-hidden scale-90 origin-left">
+    <div className="flex h-7 rounded-md border border-border overflow-hidden">
       {options.map((option) => (
         <button
           key={option.value}
-          className={`flex items-center gap-1 px-2 py-1 text-xs transition-colors ${
+          className={`flex h-full items-center gap-1 px-2 text-xs transition-colors ${
             value === option.value
               ? 'bg-primary text-primary-foreground'
               : 'hover:bg-muted'
@@ -30,5 +30,6 @@ export const editConnectorSchema = z.object({
   SERPER_API_KEY: z.string().optional(),
   TAVILY_API_KEY: z.string().optional(),
   LINEAR_API_KEY: z.string().optional(),
+  LINKUP_API_KEY: z.string().optional(),
 });

 export type EditConnectorFormValues = z.infer<typeof editConnectorSchema>;
@@ -1,4 +1,4 @@
-import React, { useMemo } from "react";
+import React, { useMemo, useState, useEffect } from "react";
 import ReactMarkdown from "react-markdown";
 import rehypeRaw from "rehype-raw";
 import rehypeSanitize from "rehype-sanitize";
@@ -6,6 +6,10 @@ import remarkGfm from "remark-gfm";
 import { cn } from "@/lib/utils";
 import { Citation } from "./chat/Citation";
 import { Source } from "./chat/types";
+import { Prism as SyntaxHighlighter } from "react-syntax-highlighter";
+import { oneLight, oneDark } from "react-syntax-highlighter/dist/cjs/styles/prism";
+import { Check, Copy } from "lucide-react";
+import { useTheme } from "next-themes";

 interface MarkdownViewerProps {
   content: string;
@@ -75,16 +79,19 @@ export function MarkdownViewer({ content, className, getCitationSource }: Markdo
       td: ({node, ...props}: any) => <td className="px-3 py-2 border-t border-border" {...props} />,
       code: ({node, className, children, ...props}: any) => {
         const match = /language-(\w+)/.exec(className || '');
+        const language = match ? match[1] : '';
         const isInline = !match;
-        return isInline
-          ? <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>
-          : (
-            <div className="relative my-4">
-              <pre className="bg-muted p-4 rounded-md overflow-x-auto">
-                <code className="text-xs" {...props}>{children}</code>
-              </pre>
-            </div>
-          );
+
+        if (isInline) {
+          return <code className="bg-muted px-1 py-0.5 rounded text-xs" {...props}>{children}</code>;
+        }
+
+        // For code blocks, add syntax highlighting and copy functionality
+        return (
+          <CodeBlock language={language} {...props}>
+            {String(children).replace(/\n$/, '')}
+          </CodeBlock>
+        );
       }
     };
   }, [getCitationSource]);
@@ -102,6 +109,102 @@ export function MarkdownViewer({ content, className, getCitationSource }: Markdo
   );
 }

+// Code block component with syntax highlighting and copy functionality
+const CodeBlock = ({ children, language }: { children: string, language: string }) => {
+  const [copied, setCopied] = useState(false);
+  const { resolvedTheme, theme } = useTheme();
+  const [mounted, setMounted] = useState(false);
+
+  // Prevent hydration issues
+  useEffect(() => {
+    setMounted(true);
+  }, []);
+
+  const handleCopy = async () => {
+    await navigator.clipboard.writeText(children);
+    setCopied(true);
+    setTimeout(() => setCopied(false), 2000);
+  };
+
+  // Choose theme based on current system/user preference
+  const isDarkTheme = mounted && (resolvedTheme === 'dark' || theme === 'dark');
+  const syntaxTheme = isDarkTheme ? oneDark : oneLight;
+
+  return (
+    <div className="relative my-4 group">
+      <div className="absolute right-2 top-2 z-10">
+        <button
+          onClick={handleCopy}
+          className="p-1.5 rounded-md bg-background/80 hover:bg-background border border-border flex items-center justify-center transition-colors"
+          aria-label="Copy code"
+        >
+          {copied ?
+            <Check size={14} className="text-green-500" /> :
+            <Copy size={14} className="text-muted-foreground" />
+          }
+        </button>
+      </div>
+      {mounted ? (
+        <SyntaxHighlighter
+          language={language || 'text'}
+          style={{
+            ...syntaxTheme,
+            'pre[class*="language-"]': {
+              ...syntaxTheme['pre[class*="language-"]'],
+              margin: 0,
+              border: 'none',
+              borderRadius: '0.375rem',
+              background: 'var(--syntax-bg)'
+            },
+            'code[class*="language-"]': {
+              ...syntaxTheme['code[class*="language-"]'],
+              border: 'none',
+              background: 'var(--syntax-bg)'
+            }
+          }}
+          customStyle={{
+            margin: 0,
+            borderRadius: '0.375rem',
+            fontSize: '0.75rem',
+            lineHeight: '1.5rem',
+            backgroundColor: 'var(--syntax-bg)',
+            border: 'none',
+          }}
+          codeTagProps={{
+            className: "font-mono",
+            style: {
+              border: 'none',
+              background: 'var(--syntax-bg)'
+            }
+          }}
+          showLineNumbers={false}
+          wrapLines={false}
+          lineProps={{
+            style: {
+              wordBreak: 'break-all',
+              whiteSpace: 'pre-wrap',
+              border: 'none',
+              borderBottom: 'none',
+              paddingLeft: 0,
+              paddingRight: 0,
+              margin: '0.25rem 0'
+            }
+          }}
+          PreTag="div"
+        >
+          {children}
+        </SyntaxHighlighter>
+      ) : (
+        <div className="bg-muted p-4 rounded-md">
+          <pre className="m-0 p-0 border-0">
+            <code className="text-xs font-mono border-0 leading-6">{children}</code>
+          </pre>
+        </div>
+      )}
+    </div>
+  );
+};
+
 // Helper function to process citations within React children
 const processCitationsInReactChildren = (children: React.ReactNode, getCitationSource: (id: number) => Source | null): React.ReactNode => {
   // If children is not an array or string, just return it
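The renderer above tells fenced code blocks apart from inline code by the `language-*` class that react-markdown puts on fenced blocks; inline code has no such class, so the regex match is `null` and `isInline` is true. The same parsing logic in isolation:

```typescript
// How the MarkdownViewer code renderer reads the language from the className
// react-markdown supplies (e.g. "language-tsx" for a ```tsx fence).
function parseCodeClassName(className?: string): { language: string; isInline: boolean } {
  const match = /language-(\w+)/.exec(className || "");
  return { language: match ? match[1] : "", isInline: !match };
}

console.log(parseCodeClassName("language-tsx")); // fenced block: language "tsx", not inline
console.log(parseCodeClassName(undefined));      // inline code: no language, inline
```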
@@ -14,6 +14,7 @@ import {
   Info,
   ExternalLink,
   Trash2,
+  Podcast,
   type LucideIcon,
 } from "lucide-react"

@@ -45,7 +46,8 @@ export const iconMap: Record<string, LucideIcon> = {
   AlertCircle,
   Info,
   ExternalLink,
-  Trash2
+  Trash2,
+  Podcast
 }

 const defaultData = {
28
surfsense_web/components/ui/slider.tsx
Normal file
@@ -0,0 +1,28 @@
+"use client"
+
+import * as React from "react"
+import * as SliderPrimitive from "@radix-ui/react-slider"
+
+import { cn } from "@/lib/utils"
+
+const Slider = React.forwardRef<
+  React.ElementRef<typeof SliderPrimitive.Root>,
+  React.ComponentPropsWithoutRef<typeof SliderPrimitive.Root>
+>(({ className, ...props }, ref) => (
+  <SliderPrimitive.Root
+    ref={ref}
+    className={cn(
+      "relative flex w-full touch-none select-none items-center",
+      className
+    )}
+    {...props}
+  >
+    <SliderPrimitive.Track className="relative h-2 w-full grow overflow-hidden rounded-full bg-secondary">
+      <SliderPrimitive.Range className="absolute h-full bg-primary" />
+    </SliderPrimitive.Track>
+    <SliderPrimitive.Thumb className="block h-5 w-5 rounded-full border-2 border-primary bg-background ring-offset-background transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:pointer-events-none disabled:opacity-50" />
+  </SliderPrimitive.Root>
+))
+Slider.displayName = SliderPrimitive.Root.displayName
+
+export { Slider }
@ -1,8 +1,9 @@
|
||||||
---
|
---
|
||||||
title: Docker Installation
|
title: Docker Installation
|
||||||
description: Setting up SurfSense using Docker
|
description: Setting up SurfSense using Docker
|
||||||
full: true
|
full: true
|
||||||
---
|
---
|
||||||
|
|
||||||
## Known Limitations
|
## Known Limitations
|
||||||
|
|
||||||
⚠️ **Important Note:** Currently, the following features have limited functionality when running in Docker:
|
⚠️ **Important Note:** Currently, the following features have limited functionality when running in Docker:
|
||||||
|
|
@ -12,8 +13,7 @@ full: true
|
||||||
|
|
||||||
We're actively working to resolve these limitations in future releases.
|
We're actively working to resolve these limitations in future releases.
|
||||||
|
|
||||||
|
# Docker Installation
|
||||||
# Docker Installation
|
|
||||||
|
|
||||||
This guide explains how to run SurfSense using Docker Compose, which is the preferred and recommended method for deployment.
|
This guide explains how to run SurfSense using Docker Compose, which is the preferred and recommended method for deployment.
|
||||||
|
|
||||||
|
|
@ -32,125 +32,203 @@ Before you begin, ensure you have:
|
||||||
## Installation Steps
|
## Installation Steps
|
||||||
|
|
||||||
1. **Configure Environment Variables**
|
1. **Configure Environment Variables**
|
||||||
|
Set up the necessary environment variables:
|
||||||
Set up the necessary environment variables:
|
|
||||||
|
|
||||||
**Linux/macOS:**
|
|
||||||
```bash
|
|
||||||
# Copy example environment files
|
|
||||||
cp surfsense_backend/.env.example surfsense_backend/.env
|
|
||||||
cp surfsense_web/.env.example surfsense_web/.env
|
|
||||||
```
|
|
||||||
|
|
||||||
**Windows (Command Prompt):**
|
|
||||||
```cmd
|
|
||||||
copy surfsense_backend\.env.example surfsense_backend\.env
|
|
||||||
copy surfsense_web\.env.example surfsense_web\.env
|
|
||||||
```
|
|
||||||
|
|
||||||
**Windows (PowerShell):**
|
|
||||||
```powershell
|
|
||||||
Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env
|
|
||||||
Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env
|
|
||||||
```
|
|
||||||
|
|
||||||
Edit both `.env` files and fill in the required values:
|
**Linux/macOS:**
|
||||||
|
|
||||||
**Backend Environment Variables:**
|
```bash
|
||||||
|
# Copy example environment files
|
||||||
|
cp surfsense_backend/.env.example surfsense_backend/.env
|
||||||
|
cp surfsense_web/.env.example surfsense_web/.env
|
||||||
|
cp .env.example .env # For Docker-specific settings
|
||||||
|
```
|
||||||
|
|
||||||
| ENV VARIABLE | DESCRIPTION |
|
**Windows (Command Prompt):**
|
||||||
|--------------|-------------|
|
|
||||||
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
|
|
||||||
| SECRET_KEY | JWT Secret key for authentication (should be a secure random string) |
|
|
||||||
| GOOGLE_OAUTH_CLIENT_ID | Google OAuth client ID obtained from Google Cloud Console |
|
|
||||||
| GOOGLE_OAUTH_CLIENT_SECRET | Google OAuth client secret obtained from Google Cloud Console |
|
|
||||||
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
|
|
||||||
| EMBEDDING_MODEL | Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`) |
|
|
||||||
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
|
|
||||||
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
|
|
||||||
| FAST_LLM | LiteLLM routed smaller, faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
|
|
||||||
| STRATEGIC_LLM | LiteLLM routed advanced LLM for complex tasks (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
|
|
||||||
| LONG_CONTEXT_LLM | LiteLLM routed LLM for longer context windows (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
|
|
||||||
| UNSTRUCTURED_API_KEY | API key for Unstructured.io service for document parsing |
|
|
||||||
| FIRECRAWL_API_KEY | API key for Firecrawl service for web crawling |
|
|
||||||
|
|
||||||
Include API keys for the LLM providers you're using. For example:
|
```cmd
|
||||||
- `OPENAI_API_KEY`: If using OpenAI models
|
copy surfsense_backend\.env.example surfsense_backend\.env
|
||||||
- `GEMINI_API_KEY`: If using Google Gemini models
|
copy surfsense_web\.env.example surfsense_web\.env
|
||||||
|
copy .env.example .env
|
||||||
For other LLM providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
|
```
|
||||||
|
|
||||||
**Frontend Environment Variables:**
|
**Windows (PowerShell):**
|
||||||
|
|
||||||
| ENV VARIABLE | DESCRIPTION |
|
```powershell
|
||||||
|--------------|-------------|
|
Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env
|
||||||
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend service (e.g., `http://localhost:8000`) |
|
Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env
|
||||||
|
Copy-Item -Path .env.example -Destination .env
|
||||||
|
```
|
||||||
|
|
||||||
|
Edit all `.env` files and fill in the required values:
|
||||||
|
|
||||||
|
### Docker-Specific Environment Variables
|
||||||
|
|
||||||
|
| ENV VARIABLE | DESCRIPTION | DEFAULT VALUE |
|
||||||
|
|----------------------------|-----------------------------------------------------------------------------|---------------------|
|
||||||
|
| FRONTEND_PORT | Port for the frontend service | 3000 |
|
||||||
|
| BACKEND_PORT | Port for the backend API service | 8000 |
|
||||||
|
| POSTGRES_PORT | Port for the PostgreSQL database | 5432 |
|
||||||
|
| PGADMIN_PORT | Port for pgAdmin web interface | 5050 |
|
||||||
|
| POSTGRES_USER | PostgreSQL username | postgres |
|
||||||
|
| POSTGRES_PASSWORD | PostgreSQL password | postgres |
|
||||||
|
| POSTGRES_DB | PostgreSQL database name | surfsense |
|
||||||
|
| PGADMIN_DEFAULT_EMAIL | Email for pgAdmin login | admin@surfsense.com |
|
||||||
|
| PGADMIN_DEFAULT_PASSWORD | Password for pgAdmin login | surfsense |
|
||||||
|
| NEXT_PUBLIC_API_URL | URL of the backend API (used by frontend) | http://backend:8000 |
|
||||||
|
|
||||||
|
**Backend Environment Variables:**
|
||||||
|
|
||||||
|
| ENV VARIABLE | DESCRIPTION |
|
||||||
|
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
|
| DATABASE_URL | PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`) |
| SECRET_KEY | JWT secret key for authentication (should be a secure random string) |
| AUTH_TYPE | Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication |
| NEXT_FRONTEND_URL | URL where your frontend application is hosted (e.g., `http://localhost:3000`) |
| EMBEDDING_MODEL | Name of the embedding model (e.g., `openai://text-embedding-ada-002`, `anthropic://claude-v1`, `mixedbread-ai/mxbai-embed-large-v1`) |
| RERANKERS_MODEL_NAME | Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`) |
| RERANKERS_MODEL_TYPE | Type of reranker model (e.g., `flashrank`) |
| FAST_LLM | LiteLLM-routed smaller, faster LLM (e.g., `openai/gpt-4o-mini`, `ollama/deepseek-r1:8b`) |
| STRATEGIC_LLM | LiteLLM-routed advanced LLM for complex tasks (e.g., `openai/gpt-4o`, `ollama/gemma3:12b`) |
| LONG_CONTEXT_LLM | LiteLLM-routed LLM for longer context windows (e.g., `gemini/gemini-2.0-flash`, `ollama/deepseek-r1:8b`) |
| ETL_SERVICE | Document parsing service: `UNSTRUCTURED` (supports 34+ formats) or `LLAMACLOUD` (supports 50+ formats, including legacy document types) |
| UNSTRUCTURED_API_KEY | API key for the Unstructured.io document parsing service (required if ETL_SERVICE=UNSTRUCTURED) |
| LLAMA_CLOUD_API_KEY | API key for the LlamaCloud document parsing service (required if ETL_SERVICE=LLAMACLOUD) |
| FIRECRAWL_API_KEY | API key for the Firecrawl web crawling service |
| TTS_SERVICE | Text-to-Speech API provider for podcasts (e.g., `openai/tts-1`, `azure/neural`, `vertex_ai/`). See [supported providers](https://docs.litellm.ai/docs/text_to_speech#supported-providers) |
| STT_SERVICE | Speech-to-Text API provider for podcasts (e.g., `openai/whisper-1`). See [supported providers](https://docs.litellm.ai/docs/audio_transcription#supported-providers) |

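The DATABASE_URL above follows SQLAlchemy's async DSN format. As a quick sanity check, you can assemble it from the individual Postgres settings; the helper below is an illustrative sketch, not part of SurfSense:

```python
from urllib.parse import quote_plus


def build_database_url(user: str, password: str, host: str, port: int, db: str) -> str:
    """Assemble an asyncpg DSN; quote_plus guards against special characters in credentials."""
    return f"postgresql+asyncpg://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{db}"


print(build_database_url("postgres", "postgres", "localhost", 5432, "surfsense"))
# postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense
```

Quoting matters if your password contains characters like `@` or `:`, which would otherwise break DSN parsing.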
Include API keys for your chosen LLM providers:

| ENV VARIABLE | DESCRIPTION |
|--------------------|-----------------------------------------------------------------------------|
| `OPENAI_API_KEY` | Required if using OpenAI models |
| `GEMINI_API_KEY` | Required if using Google Gemini models |
| `ANTHROPIC_API_KEY`| Required if using Anthropic models |

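A common pitfall is configuring a model whose provider key is missing. A hedged sketch of a pre-flight check (the prefix-to-key mapping and function name are illustrative, not SurfSense code):

```python
# Map LiteLLM model prefixes to the API key each provider needs (illustrative subset).
PROVIDER_KEYS = {
    "openai/": "OPENAI_API_KEY",
    "gemini/": "GEMINI_API_KEY",
    "anthropic/": "ANTHROPIC_API_KEY",
}


def missing_provider_keys(env: dict) -> list:
    """Return provider keys required by the configured LLMs but absent from env."""
    missing = []
    for var in ("FAST_LLM", "STRATEGIC_LLM", "LONG_CONTEXT_LLM"):
        model = env.get(var, "")
        for prefix, key in PROVIDER_KEYS.items():
            if model.startswith(prefix) and not env.get(key) and key not in missing:
                missing.append(key)
    return missing


env = {"FAST_LLM": "openai/gpt-4o-mini", "STRATEGIC_LLM": "gemini/gemini-2.0-flash"}
print(missing_provider_keys(env))
# ['OPENAI_API_KEY', 'GEMINI_API_KEY']
```

Local providers such as `ollama/` need no key, which the sketch handles by only checking prefixes it knows about.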
### Google OAuth Configuration (if AUTH_TYPE=GOOGLE)

| ENV VARIABLE | DESCRIPTION |
|----------------------------|-----------------------------------------------------------------------------|
| `GOOGLE_OAUTH_CLIENT_ID` | Client ID from Google Cloud Console |
| `GOOGLE_OAUTH_CLIENT_SECRET` | Client secret from Google Cloud Console |

**Optional Backend LangSmith Observability:**

| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| LANGSMITH_TRACING | Enable LangSmith tracing (e.g., `true`) |
| LANGSMITH_ENDPOINT | LangSmith API endpoint (e.g., `https://api.smith.langchain.com`) |
| LANGSMITH_API_KEY | Your LangSmith API key |
| LANGSMITH_PROJECT | LangSmith project name (e.g., `surfsense`) |

**Optional Backend LiteLLM API Base URLs:**

| ENV VARIABLE | DESCRIPTION |
|--------------|-------------|
| FAST_LLM_API_BASE | Custom API base URL for the fast LLM |
| STRATEGIC_LLM_API_BASE | Custom API base URL for the strategic LLM |
| LONG_CONTEXT_LLM_API_BASE | Custom API base URL for the long context LLM |
| TTS_SERVICE_API_BASE | Custom API base URL for the Text-to-Speech (TTS) service |
| STT_SERVICE_API_BASE | Custom API base URL for the Speech-to-Text (STT) service |

For other LLM providers, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
### Frontend Environment Variables

| ENV VARIABLE | DESCRIPTION |
| ------------------------------- | ---------------------------------------------------------- |
| NEXT_PUBLIC_FASTAPI_BACKEND_URL | URL of the backend service (e.g., `http://localhost:8000`) |
| NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE | Must match the backend AUTH_TYPE: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication |
| NEXT_PUBLIC_ETL_SERVICE | Document parsing service, matching the backend ETL_SERVICE (`UNSTRUCTURED` or `LLAMACLOUD`); determines which file formats the upload interface accepts |

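Because the `NEXT_PUBLIC_*` values must mirror their backend counterparts, a small consistency check over the two env mappings can catch drift early. This is a sketch; the mapping and function are illustrative, not part of SurfSense:

```python
# Frontend variables that must mirror a backend counterpart.
MIRRORED = {
    "NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE": "AUTH_TYPE",
    "NEXT_PUBLIC_ETL_SERVICE": "ETL_SERVICE",
}


def env_mismatches(frontend: dict, backend: dict) -> list:
    """Return (frontend_var, frontend_value, backend_value) triples that disagree."""
    return [
        (fvar, frontend.get(fvar), backend.get(bvar))
        for fvar, bvar in MIRRORED.items()
        if frontend.get(fvar) != backend.get(bvar)
    ]


backend = {"AUTH_TYPE": "LOCAL", "ETL_SERVICE": "UNSTRUCTURED"}
frontend = {"NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE": "GOOGLE",
            "NEXT_PUBLIC_ETL_SERVICE": "UNSTRUCTURED"}
print(env_mismatches(frontend, backend))
# [('NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE', 'GOOGLE', 'LOCAL')]
```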
2. **Build and Start Containers**

   Start the Docker containers:

   **Linux/macOS/Windows:**

   ```bash
   docker compose up --build
   ```

   To run in detached mode (in the background):

   **Linux/macOS/Windows:**

   ```bash
   docker compose up -d
   ```

   **Note for Windows users:** If you're using an older Docker Desktop version, you may need to use `docker-compose` (with a hyphen) instead of `docker compose`.

3. **Access the Applications**

   Once the containers are running, you can access:

   - Frontend: [http://localhost:3000](http://localhost:3000)
   - Backend API: [http://localhost:8000](http://localhost:8000)
   - API Documentation: [http://localhost:8000/docs](http://localhost:8000/docs)
   - pgAdmin: [http://localhost:5050](http://localhost:5050)

## Using pgAdmin

pgAdmin is included in the Docker setup to help you manage the PostgreSQL database. To connect:

1. Open pgAdmin at [http://localhost:5050](http://localhost:5050)
2. Log in with the credentials from your `.env` file (default: admin@surfsense.com / surfsense)
3. Right-click "Servers" > "Create" > "Server"
4. In the "General" tab, name your connection (e.g., "SurfSense DB")
5. In the "Connection" tab, enter:
   - Host: `db`
   - Port: `5432`
   - Maintenance database: `surfsense`
   - Username: `postgres` (or your custom POSTGRES_USER)
   - Password: `postgres` (or your custom POSTGRES_PASSWORD)
6. Click "Save" to connect

## Useful Docker Commands

### Container Management

- **Stop containers:**

  **Linux/macOS/Windows:**

  ```bash
  docker compose down
  ```

- **View logs:**

  **Linux/macOS/Windows:**

  ```bash
  # All services
  docker compose logs -f

  # Specific service
  docker compose logs -f backend
  docker compose logs -f frontend
  docker compose logs -f db
  ```

- **Restart a specific service:**

  **Linux/macOS/Windows:**

  ```bash
  docker compose restart backend
  ```

- **Execute commands in a running container:**

  **Linux/macOS/Windows:**

  ```bash
  # Backend
  docker compose exec backend python -m pytest

  # Frontend
  docker compose exec frontend pnpm lint
  ```

## Troubleshooting

- For frontend dependency issues, check the `Dockerfile` in the frontend directory.
- **Windows-specific:** If you encounter line ending issues (CRLF vs LF), configure Git to handle line endings with `git config --global core.autocrlf true` before cloning the repository.

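As an alternative to per-user Git configuration, a `.gitattributes` file at the repository root pins line endings for everyone who clones, regardless of their local `core.autocrlf` setting. A minimal example (SurfSense may not ship one):

```text
# Normalize text files and check them out with LF endings
* text=auto eol=lf
```

Forcing LF in the working tree also keeps shell scripts runnable inside Linux containers on Windows hosts.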
## Next Steps

Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and sign in with your Google account (or email/password, depending on your AUTH_TYPE).