mirror of https://github.com/MODSetter/SurfSense.git synced 2026-05-09 07:42:39 +02:00

API Test Bot fd9eddf7fa test: Add Vitest configuration and initial tests for the DexScreener connect form.

2026-02-01 15:05:19 +07:00


Story 1.1: DexScreener Connector Integration

📋 Story Overview

Story ID: 1.1
Story Title: DexScreener Connector Integration
Epic: SurfSense Connectors Enhancement
Priority: High
Status: ✅ Implementation Complete (2026-02-01)
Created: 2026-01-31

🎯 User Story

As a SurfSense user tracking cryptocurrency markets
I want to connect my DexScreener data to SurfSense
So that I can search and chat with AI about my tracked trading pairs and token data

📝 Description

Implement a custom connector for the DexScreener API that allows users to:

Configure tracked tokens across multiple blockchain networks
Automatically index trading pair data (prices, volume, liquidity, etc.)
Search and retrieve indexed crypto market data
Use AI chat with context from DexScreener trading pairs

This connector will integrate with SurfSense's existing connector architecture, following the established patterns from Luma, Slack, and other connectors.

✅ Acceptance Criteria

AC1: Connector Configuration ✅

User can add DexScreener connector via API endpoint
User can configure multiple tokens to track (up to 50)
Each token config includes: chain ID, token address, optional name
User can update connector configuration
User can delete connector
Configuration is persisted in database

AC2: API Integration ✅

Connector successfully calls DexScreener API endpoints
Stays within the DexScreener rate limit (300 req/min) and backs off on HTTP 429 responses
Implements retry logic with exponential backoff
Validates API responses
Handles API errors gracefully (network failures, invalid data, etc.)
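
The retry behaviour in AC2 can be sketched as follows. This is a minimal illustration, not the connector's actual code: RateLimitedError, backoff_delays, and call_with_backoff are hypothetical names, and the base delay, cap, and retry count are assumed values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error raised when the API answers with HTTP 429."""


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Un-jittered delay before each retry: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitedError with exponential backoff plus jitter."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitedError:
            if attempt == max_retries:
                raise  # retries exhausted; the caller can log and skip the token
            # Up to 10% jitter keeps parallel workers from retrying in lockstep.
            sleep(delays[attempt] + random.uniform(0, 0.1 * delays[attempt]))
    raise AssertionError("unreachable")
```

The sleep function is injectable so that tests can record the delays instead of waiting for them.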

AC3: Data Indexing ✅

Fetches trading pairs for all configured tokens
Converts pair data to markdown format with all key metrics:
- Token information (names, symbols, addresses)
- Price data (USD, native, 24h changes)
- Volume metrics (24h, 6h, 1h)
- Liquidity information
- Market cap and FDV
- Transaction counts
Generates unique identifier hash for each pair
Generates content hash to detect changes
Creates document chunks for vector search
Generates embeddings using the configured embedding model
Stores documents in database with proper metadata
Updates existing documents when content changes
Skips unchanged documents
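
The two hashes in AC3 serve different purposes: one identifies a pair across indexing runs, the other detects whether its content changed. A minimal sketch, assuming SHA-256 and a key layout chosen here for illustration (the helper names are hypothetical stand-ins, not the project's own utilities):

```python
import hashlib
from typing import Optional


def unique_identifier_hash(chain_id: str, pair_address: str, search_space_id: int) -> str:
    """Stable identity for a pair: the same inputs always produce the same hash."""
    # Addresses are not lower-cased because some chains use case-sensitive encodings.
    key = f"DEXSCREENER_CONNECTOR:{chain_id}:{pair_address}:{search_space_id}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def content_hash(markdown: str, search_space_id: int) -> str:
    """Fingerprint of the rendered content; it changes whenever any metric changes."""
    return hashlib.sha256(f"{search_space_id}:{markdown}".encode("utf-8")).hexdigest()


def needs_update(stored_hash: Optional[str], markdown: str, search_space_id: int) -> bool:
    """True when the document is new (no stored hash) or its content has changed."""
    return stored_hash != content_hash(markdown, search_space_id)
```

Because prices and volumes move constantly, most pairs will likely report a changed content hash on every run; the skip path mainly helps for inactive pairs.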

AC4: Periodic Indexing ✅

Indexing task is registered with Celery
Periodic scheduler triggers indexing (default: 60 min interval)
Manual indexing can be triggered via API
Last indexed timestamp is updated after successful indexing
Indexing errors are logged properly
Failed indexing doesn't block future attempts

AC5: Search Integration ✅

Indexed DexScreener data appears in search results
Documents are searchable by:
- Token names and symbols
- Pair addresses
- Chain IDs
- DEX names
- Price ranges
- Volume metrics
Search results include relevant metadata
Vector search returns semantically similar pairs

AC6: AI Chat Integration ✅

AI chat can access DexScreener data as context
Chat responses include relevant trading pair information
Citations link to DexScreener URLs
Metadata is properly formatted in chat responses

🏗️ Technical Implementation

Database Schema Changes

File: app/db.py

from enum import Enum

class SearchSourceConnectorType(str, Enum):
    # ... existing types
    DEXSCREENER_CONNECTOR = "DEXSCREENER_CONNECTOR"  # new connector type

class DocumentType(str, Enum):
    # ... existing types
    DEXSCREENER_CONNECTOR = "DEXSCREENER_CONNECTOR"  # documents produced by the connector

Components to Implement

1. Connector Class

File: app/connectors/dexscreener_connector.py

Methods:

__init__() - Initialize connector (no API key needed)
make_request(endpoint, params) - Generic API request handler
search_pairs(query) - Search for trading pairs
get_token_pairs(chain_id, token_address) - Get all pairs for a token
get_pair_by_address(chain_id, pair_address) - Get specific pair details
get_multiple_tokens(chain_id, token_addresses) - Batch query tokens
format_pair_to_markdown(pair) - Convert pair data to markdown
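
As an illustration of the last method, a standalone format_pair_to_markdown might look like the sketch below. The input keys (baseToken, quoteToken, priceUsd, volume.h24, and so on) are assumed to follow the shape of a DexScreener pair object and should be checked against the live API; the markdown layout itself is an arbitrary choice.

```python
from typing import Any


def format_pair_to_markdown(pair: dict[str, Any]) -> str:
    """Render one trading pair as a small markdown document."""
    # `or {}` guards against keys that are present but null.
    base = pair.get("baseToken") or {}
    quote = pair.get("quoteToken") or {}
    volume = pair.get("volume") or {}
    liquidity = pair.get("liquidity") or {}
    lines = [
        f"# {base.get('symbol', '?')}/{quote.get('symbol', '?')} on {pair.get('dexId', 'unknown')}",
        "",
        f"- Chain: {pair.get('chainId', 'unknown')}",
        f"- Pair address: {pair.get('pairAddress', 'unknown')}",
        f"- Price (USD): {pair.get('priceUsd', 'n/a')}",
        f"- Price (native): {pair.get('priceNative', 'n/a')}",
        f"- Volume 24h: {volume.get('h24', 'n/a')}",
        f"- Liquidity (USD): {liquidity.get('usd', 'n/a')}",
        f"- Market cap: {pair.get('marketCap', 'n/a')}",
        f"- FDV: {pair.get('fdv', 'n/a')}",
        f"- URL: {pair.get('url', 'n/a')}",
    ]
    return "\n".join(lines)
```

Missing fields fall back to a placeholder rather than raising, which matches the goal of skipping or tolerating incomplete pairs instead of aborting the run.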

2. Indexer

File: app/tasks/connector_indexers/dexscreener_indexer.py

Function: async def index_dexscreener_pairs(session, connector_id, search_space_id, user_id, start_date=None, end_date=None, update_last_indexed=True)

Required Imports:

from .base import (
    check_document_by_unique_identifier,
    check_duplicate_document_by_hash,
    get_connector_by_id,
    get_current_timestamp,
    logger,
    update_connector_last_indexed,
)

Logic:

Get connector from database
Extract token configuration from connector.config.get("tokens")
Initialize DexScreener connector
For each tracked token:
- Fetch all trading pairs
- For each pair:
  - Format to markdown
  - Generate hashes
  - Check if document exists
  - Create or update document
  - Batch commit every 10 documents
Update last_indexed_at timestamp
Log success/failure
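
The create/update/skip decision and the batch commit in the logic above can be sketched without a database. In this simplified version, which is an assumption-laden stand-in for the real indexer, a plain dict plays the role of the documents table and commit is any callable; index_pairs is a hypothetical name.

```python
import hashlib
from typing import Callable, Iterable


def index_pairs(
    pairs: Iterable[tuple[str, str]],
    store: dict[str, str],
    commit: Callable[[], None],
    batch_size: int = 10,
) -> dict[str, int]:
    """Index (unique_id, markdown) items; `store` maps unique_id to content hash."""
    stats = {"created": 0, "updated": 0, "skipped": 0}
    pending = 0
    for unique_id, markdown in pairs:
        if not markdown.strip():
            stats["skipped"] += 1  # empty content: nothing to index
            continue
        digest = hashlib.sha256(markdown.encode("utf-8")).hexdigest()
        previous = store.get(unique_id)
        if previous == digest:
            stats["skipped"] += 1  # unchanged since the last run
            continue
        stats["updated" if previous is not None else "created"] += 1
        store[unique_id] = digest
        pending += 1
        if pending >= batch_size:
            commit()  # flush a full batch
            pending = 0
    if pending:
        commit()  # flush the final partial batch
    return stats
```

Accepting any iterable means pairs can be streamed from a generator, so the full result set never has to sit in memory at once.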

3. API Routes

File: app/routes/dexscreener_add_connector_route.py

Endpoints:

POST /connectors/dexscreener/add - Add/update connector
DELETE /connectors/dexscreener - Delete connector
GET /connectors/dexscreener/test - Test API connection

4. Celery Task

File: app/tasks/celery_tasks/connector_tasks.py

Task: index_dexscreener_pairs_task(connector_id, search_space_id, user_id, start_date, end_date)

Note: Requires both the Celery task wrapper and an async helper function:

@celery_app.task(bind=True) decorator on index_dexscreener_pairs_task
async def _index_dexscreener_pairs(...) helper that creates session and calls indexer
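
The wrapper-plus-helper pattern exists because Celery calls a synchronous function while the indexer is async. The sketch below shows only that bridge, using the standard library so it runs without Celery installed; the decorator is left as a comment, and the helper body is a placeholder rather than the real session and indexer calls.

```python
import asyncio
from typing import Optional


async def _index_dexscreener_pairs(
    connector_id: int,
    search_space_id: int,
    user_id: str,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> dict:
    """Async helper. The real one would open a DB session and call the indexer."""
    await asyncio.sleep(0)  # placeholder for the awaited session and indexer calls
    return {"connector_id": connector_id, "search_space_id": search_space_id, "user_id": user_id}


# In the application this function would carry @celery_app.task(bind=True),
# which adds a leading `self` argument; both are omitted in this sketch.
def index_dexscreener_pairs_task(
    connector_id: int,
    search_space_id: int,
    user_id: str,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
) -> dict:
    """Synchronous entry point: run the async helper on a fresh event loop."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(
            _index_dexscreener_pairs(connector_id, search_space_id, user_id, start_date, end_date)
        )
    finally:
        loop.close()  # always release the loop, even if indexing fails
```

Creating and closing a loop per task invocation is one common approach; it avoids reusing a loop that an earlier failed task may have left in a bad state.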

5. Scheduler Integration

File: app/utils/periodic_scheduler.py

Mappings: Add to CONNECTOR_TASK_MAP dictionary:

SearchSourceConnectorType.DEXSCREENER_CONNECTOR: "index_dexscreener_pairs"

6. Routes Registration

File: app/routes/__init__.py

Import and include dexscreener_add_connector_router

7. Indexer Export

File: app/tasks/connector_indexers/__init__.py

Export index_dexscreener_pairs

🔗 Dependencies

External APIs

DexScreener API: https://api.dexscreener.com
- No authentication required
- Rate limit: 300 requests/minute
- Free tier available
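
One way to stay under the 300 requests/minute limit on the client side is a sliding-window counter. The class below is a hypothetical sketch, not part of the connector; the clock is injectable so the behaviour can be tested without real waiting.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_calls` recorded calls within any `period` seconds."""

    def __init__(
        self,
        max_calls: int = 300,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Register that a request was just sent."""
        self._calls.append(self.clock())
```

A caller would sleep for wait_time() before each request and call record() after sending it. This complements, rather than replaces, backing off when the server still returns 429.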

Internal Dependencies

httpx - HTTP client for API requests
app.utils.document_converters - Document processing utilities
app.services.llm_service - LLM for embeddings and summaries
app.services.task_logging_service - Task logging
SQLAlchemy models and sessions
Celery for background tasks

📊 Data Model

Connector Config Schema

{
  "tokens": [
    {
      "chain": "solana",
      "address": "So11111111111111111111111111111111111111112",
      "name": "Wrapped SOL"
    },
    {
      "chain": "ethereum",
      "address": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
      "name": "Wrapped ETH"
    }
  ]
}
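
A config in this shape could be validated before it is persisted, along the lines of the sketch below. The 50-token limit comes from AC1; the chain-slug pattern and the address check are assumptions chosen to stay chain-agnostic, and TokenConfig and parse_tokens are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional

MAX_TOKENS = 50  # limit stated in AC1
CHAIN_SLUG = re.compile(r"^[a-z0-9]+$")  # assumed shape of chain identifiers


@dataclass(frozen=True)
class TokenConfig:
    chain: str
    address: str
    name: Optional[str] = None


def parse_tokens(config: dict[str, Any]) -> list[TokenConfig]:
    """Validate the connector config and return typed token entries."""
    tokens = config.get("tokens")
    if not isinstance(tokens, list) or not tokens:
        raise ValueError("'tokens' must be a non-empty list")
    if len(tokens) > MAX_TOKENS:
        raise ValueError(f"at most {MAX_TOKENS} tokens may be tracked")
    parsed = []
    for i, raw in enumerate(tokens):
        if not isinstance(raw, dict):
            raise ValueError(f"tokens[{i}] must be an object")
        chain, address = raw.get("chain"), raw.get("address")
        if not isinstance(chain, str) or not CHAIN_SLUG.match(chain):
            raise ValueError(f"tokens[{i}].chain must be a lowercase chain identifier")
        # Address formats differ per chain, so only reject empty or whitespace-bearing values.
        if not isinstance(address, str) or not address or any(c.isspace() for c in address):
            raise ValueError(f"tokens[{i}].address must be a non-empty string without spaces")
        parsed.append(TokenConfig(chain, address, raw.get("name")))
    return parsed
```

Raising ValueError with the offending index gives the API layer enough detail to return a clear error for a malformed token config.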

Document Metadata Schema

{
  "pair_address": "string",
  "chain_id": "string",
  "dex_id": "string",
  "base_token_symbol": "string",
  "base_token_address": "string",
  "quote_token_symbol": "string",
  "quote_token_address": "string",
  "price_usd": "string",
  "price_native": "string",
  "volume_h24": "number",
  "liquidity_usd": "number",
  "market_cap": "number",
  "fdv": "number",
  "url": "string",
  "indexed_at": "string (ISO 8601)"
}
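
The metadata above could be derived from a raw pair object roughly as follows. As before, the input keys are assumed to match the DexScreener pair response and should be verified; build_metadata is a hypothetical helper. Prices stay as strings and the volume, liquidity, market cap, and FDV fields are coerced to numbers, mirroring the types in the schema.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _to_float(value: Any) -> Optional[float]:
    """Coerce to float, returning None for missing or non-numeric values."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def build_metadata(pair: dict[str, Any], indexed_at: Optional[datetime] = None) -> dict[str, Any]:
    """Map a raw pair object onto the document metadata schema."""
    base = pair.get("baseToken") or {}
    quote = pair.get("quoteToken") or {}
    stamp = indexed_at or datetime.now(timezone.utc)
    return {
        "pair_address": pair.get("pairAddress"),
        "chain_id": pair.get("chainId"),
        "dex_id": pair.get("dexId"),
        "base_token_symbol": base.get("symbol"),
        "base_token_address": base.get("address"),
        "quote_token_symbol": quote.get("symbol"),
        "quote_token_address": quote.get("address"),
        "price_usd": pair.get("priceUsd"),
        "price_native": pair.get("priceNative"),
        "volume_h24": _to_float((pair.get("volume") or {}).get("h24")),
        "liquidity_usd": _to_float((pair.get("liquidity") or {}).get("usd")),
        "market_cap": _to_float(pair.get("marketCap")),
        "fdv": _to_float(pair.get("fdv")),
        "url": pair.get("url"),
        "indexed_at": stamp.isoformat(),  # ISO 8601, timezone-aware
    }
```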

🧪 Testing Strategy

Unit Tests

Test DexScreenerConnector API methods
Test markdown formatting logic
Test error handling for API failures
Test rate limit handling
Test data validation

Integration Tests

Test full indexing flow
Test connector CRUD operations
Test periodic scheduling
Test search integration
Test AI chat context retrieval

Manual Testing

Add connector with test tokens
Verify API test endpoint works
Trigger manual indexing
Check database for created documents
Search for indexed pairs
Use AI chat to query trading data
Verify periodic indexing runs
Test connector update and deletion

⚠️ Edge Cases & Error Handling

API Errors

Rate Limit Exceeded (429): Implement exponential backoff, log warning
Network Timeout: Retry with a bounded timeout; skip the token if the failure persists
Invalid Response: Log error, skip pair, continue indexing
Token Not Found: Log warning, skip token

Data Validation

Missing pair_address: Skip pair, log warning
Empty markdown content: Skip pair, log warning
Invalid chain_id: Validate against known chains
Malformed token config: Reject at API level with clear error

Database Errors

Duplicate documents: Update existing based on content hash
Transaction failures: Roll back, log error, retry
Connection issues: Retry with backoff

Performance Considerations

Large token lists: Batch commits every 10 documents
Slow API responses: Set reasonable timeout (30s)
Memory usage: Process pairs iteratively, not all at once

📈 Success Metrics

Functional Metrics

Connector can be added, updated, and deleted successfully
API test endpoint returns valid data
Indexing completes without errors
Documents searchable within 5 minutes of indexing
AI chat provides accurate trading pair information

Performance Metrics

API response time < 5 seconds for test endpoint
Indexing time < 2 minutes for 50 tokens (avg 5 pairs each)
Search latency < 500ms
Rate limit compliance: 0 violations
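
As a rough sanity check on these targets, the arithmetic below works out the per-pair time budget and the request headroom. It assumes one API request per tracked token, which the story does not state explicitly.

```python
TOKENS = 50                # maximum tracked tokens (AC1)
AVG_PAIRS_PER_TOKEN = 5    # average used in the indexing-time target
TIME_BUDGET_S = 120        # "< 2 minutes" target
RATE_LIMIT_PER_MIN = 300   # DexScreener rate limit
REQUESTS_PER_TOKEN = 1     # assumption: one request fetches all pairs for a token

total_pairs = TOKENS * AVG_PAIRS_PER_TOKEN                   # 250 documents per run
total_requests = TOKENS * REQUESTS_PER_TOKEN                 # 50 API requests per run
per_pair_budget_s = TIME_BUDGET_S / total_pairs              # 0.48 s per pair
allowed_requests = RATE_LIMIT_PER_MIN * TIME_BUDGET_S // 60  # 600 requests allowed in the window
headroom = allowed_requests - total_requests                 # 550 requests to spare
```

Under these assumptions the rate limit is not the bottleneck; the 0.48 s per pair has to cover formatting, embedding, and database writes, so that is where the time target is most likely to be at risk.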

Quality Metrics

0 critical bugs in production
< 1% indexing failure rate
100% test coverage for core logic
All acceptance criteria met

🚀 Deployment Plan

Pre-deployment

Review and merge PR
Run all tests in CI/CD
Database migration (add enum values)
Deploy to staging environment
Smoke test on staging

Deployment Steps

Deploy backend changes
Verify Celery workers pick up new task
Verify periodic scheduler includes new connector
Monitor logs for errors
Test connector addition via API

Post-deployment

Monitor error logs for 24 hours
Check indexing task success rate
Verify search results quality
Gather user feedback

Rollback Plan

If critical issues occur:

Remove connector type from periodic scheduler
Disable connector routes
Revert database migration if needed
Investigate and fix issues
Redeploy with fixes

📚 Documentation

User Documentation

Add DexScreener to connector list in user guide
Document how to add DexScreener connector
Explain token configuration format
Provide example API requests
Document supported chains

Developer Documentation

Add inline code comments
Document API endpoints in OpenAPI spec
Update connector architecture docs
Add troubleshooting guide
Document rate limit handling

🔐 Security Considerations

No API Keys: The DexScreener API is public, so there are no credentials to store
Input Validation: Validate chain IDs and token addresses
Rate Limiting: Respect API rate limits to avoid IP bans
Data Privacy: No PII collected or stored
Error Messages: Don't expose internal system details in API responses

🎯 Definition of Done

All acceptance criteria met
Code reviewed and approved
Unit tests written and passing
Integration tests written and passing
Manual testing completed
Documentation updated
Deployed to staging and tested
Deployed to production
Monitoring in place
User guide updated
Story marked as complete

💬 Notes

DexScreener API is free and doesn't require authentication
Rate limit of 300 req/min is generous for typical use cases
Consider implementing priority queue for high-value tokens in future
May want to add support for custom indexing intervals per token
Consider adding alerts for significant price changes in future enhancement

Story Created By: Antigravity AI
Date: 2026-01-31
Last Updated: 2026-01-31
