SurfSense/_bmad-output/dexscreener-connector-implementation-plan.md
API Test Bot 44dc7904f3 docs: Add BMAD stories and implementation plan for DexScreener
- Add Story 1.1: DexScreener Connector Backend (complete)
- Add Story 1.2: DexScreener Connector Frontend UI (ready for dev)
- Add implementation plan with architecture decisions
- Include assets and documentation structure
- Stories follow BMAD methodology for AI-driven development
2026-01-31 17:27:21 +07:00

473 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# DexScreener Connector - Implementation Plan
## 📋 Tổng Quan
Sau khi research kỹ [DexScreener API Documentation](https://docs.dexscreener.com/api/reference) và phân tích source code SurfSense, đây là phương án implementation chính xác nhất cho DexScreener Connector.
## 🔍 DexScreener API Research Findings
### Base Information
- **Base URL**: `https://api.dexscreener.com`
- **Authentication**: KHÔNG cần API key (public API)
- **Rate Limits**:
- Profile/Ads endpoints: 60 requests/minute
- Pair/Token endpoints: **300 requests/minute**
- **Max Results**: Search endpoint trả về tối đa 30 pairs
### Core Endpoints
#### 1. Search Pairs
```
GET /latest/dex/search?q={query}
Rate Limit: 300 req/min
Max Results: 30 pairs
```
**Use Case**: Tìm kiếm trading pairs theo token name, symbol, hoặc address
**Response Structure**:
```json
{
"schemaVersion": "1.0.0",
"pairs": [
{
"chainId": "solana",
"dexId": "raydium",
"url": "https://dexscreener.com/solana/...",
"pairAddress": "...",
"baseToken": {
"address": "...",
"name": "Token Name",
"symbol": "TKN"
},
"quoteToken": {
"address": "...",
"name": "USD Coin",
"symbol": "USDC"
},
"priceNative": "0.00123",
"priceUsd": "1.23",
"txns": {
"m5": { "buys": 10, "sells": 5 },
"h1": { "buys": 100, "sells": 50 },
"h6": { "buys": 500, "sells": 250 },
"h24": { "buys": 2000, "sells": 1000 }
},
"volume": {
"h24": 1000000,
"h6": 250000,
"h1": 50000,
"m5": 5000
},
"priceChange": {
"m5": 1.5,
"h1": 5.2,
"h6": 10.5,
"h24": 25.3
},
"liquidity": {
"usd": 500000,
"base": 1000000,
"quote": 500000
},
"fdv": 10000000,
"marketCap": 5000000,
"pairCreatedAt": 1640000000000
}
]
}
```
#### 2. Get Token Pairs
```
GET /latest/dex/tokens/{chainId}/{tokenAddress}
Rate Limit: 300 req/min
```
**Use Case**: Lấy tất cả pools/pairs của một token cụ thể
#### 3. Get Specific Pair
```
GET /latest/dex/pairs/{chainId}/{pairAddress}
Rate Limit: 300 req/min
```
**Use Case**: Lấy thông tin chi tiết của một pair cụ thể
#### 4. Get Multiple Tokens
```
GET /tokens/v1/{chainId}/{tokenAddresses}
Rate Limit: 300 req/min
Max: 30 addresses (comma-separated)
```
**Use Case**: Batch query nhiều tokens cùng lúc
## 🏗️ SurfSense Architecture Analysis
### Pattern Đã Xác Định
#### 1. Connector Class Pattern
**File**: `app/connectors/{name}_connector.py`
**Responsibilities**:
- Initialize với API credentials (nếu cần)
- Methods để fetch data từ external API
- Methods để format data sang markdown
- Error handling cho API calls
**Example từ LumaConnector**:
```python
class LumaConnector:
def __init__(self, api_key: str | None = None):
self.api_key = api_key
self.base_url = "https://api.lu.ma"
def make_request(self, endpoint: str, params: dict | None = None):
# Handle API calls with error handling
def get_events_by_date_range(self, start_date: str, end_date: str):
# Fetch data from API
def format_event_to_markdown(self, event: dict) -> str:
# Convert to markdown for indexing
```
#### 2. Indexer Pattern
**File**: `app/tasks/connector_indexers/{name}_indexer.py`
**Responsibilities**:
- Async function `index_{name}()`
- Get connector từ database
- Extract config (API keys, etc.)
- Initialize connector class
- Fetch data từ API
- Loop qua items:
- Generate `unique_identifier_hash` (để track duplicates)
- Generate `content_hash` (để detect content changes)
- Check existing documents
- Create/Update `Document` objects với:
- `chunks` (text chunks cho vector search)
- `embedding` (vector embedding)
- `metadata` (structured data)
- Batch commit to database
- Update `last_indexed_at` timestamp
**Key Functions Used**:
```python
from app.utils.document_converters import (
create_document_chunks,
generate_content_hash,
generate_document_summary,
generate_unique_identifier_hash,
)
```
#### 3. Routes Pattern
**File**: `app/routes/{name}_add_connector_route.py`
**Endpoints**:
- `POST /connectors/{name}/add` - Add/Update connector
- `DELETE /connectors/{name}` - Delete connector
- `GET /connectors/{name}/test` - Test connection
**Example từ luma_add_connector_route.py**:
```python
@router.post("/connectors/luma/add")
async def add_luma_connector(
request: AddLumaConnectorRequest,
user: User = Depends(current_active_user),
session: AsyncSession = Depends(get_async_session),
):
# Check existing connector
# Create or update SearchSourceConnector
# Store config in connector.config JSON field
```
#### 4. Database Schema
**File**: `app/db.py`
**SearchSourceConnectorType Enum**:
```python
class SearchSourceConnectorType(str, Enum):
LUMA_CONNECTOR = "LUMA_CONNECTOR"
SLACK_CONNECTOR = "SLACK_CONNECTOR"
# ... thêm DEXSCREENER_CONNECTOR
```
**SearchSourceConnector Model**:
```python
class SearchSourceConnector(Base):
id: int
name: str
connector_type: SearchSourceConnectorType
config: dict # JSON field để store API keys, settings
search_space_id: int
user_id: UUID
is_indexable: bool
last_indexed_at: datetime
```
#### 5. Celery Tasks
**File**: `app/tasks/celery_tasks/connector_tasks.py`
**Pattern**:
```python
@celery_app.task(name="index_luma_events", bind=True)
def index_luma_events_task(
self,
connector_id: int,
search_space_id: int,
user_id: str,
start_date: str | None = None,
end_date: str | None = None,
):
# Wrapper cho async indexer function
return asyncio.run(_index_luma_events(...))
```
#### 6. Periodic Scheduler
**File**: `app/utils/periodic_scheduler.py`
**Mapping**:
```python
CONNECTOR_TYPE_TO_TASK_NAME = {
SearchSourceConnectorType.LUMA_CONNECTOR: "index_luma_events",
# ... thêm mapping cho DexScreener
}
CONNECTOR_TYPE_TO_TASK = {
SearchSourceConnectorType.LUMA_CONNECTOR: index_luma_events_task,
# ... thêm task cho DexScreener
}
```
## 📝 Implementation Plan
### Phase 1: Core Components
#### 1.1. Database Schema Update
**File**: `app/db.py`
**Changes**:
```python
class SearchSourceConnectorType(str, Enum):
# ... existing types
DEXSCREENER_CONNECTOR = "DEXSCREENER_CONNECTOR"
class DocumentType(str, Enum):
# ... existing types
DEXSCREENER_CONNECTOR = "DEXSCREENER_CONNECTOR"
```
#### 1.2. Connector Class
**File**: `app/connectors/dexscreener_connector.py`
Xem full implementation trong artifacts.
#### 1.3. Indexer
**File**: `app/tasks/connector_indexers/dexscreener_indexer.py`
Xem full implementation trong artifacts.
### Phase 2: API Routes & Integration
#### 2.1. Routes
**File**: `app/routes/dexscreener_add_connector_route.py`
Xem full implementation trong artifacts.
#### 2.2. Celery Task
**File**: `app/tasks/celery_tasks/connector_tasks.py`
**Add to existing file**:
```python
# Add import
from app.tasks.connector_indexers import index_dexscreener_pairs
# Add task
@celery_app.task(name="index_dexscreener_pairs", bind=True)
def index_dexscreener_pairs_task(
self,
connector_id: int,
search_space_id: int,
user_id: str,
):
"""Celery task for indexing DexScreener pairs."""
try:
return asyncio.run(
_index_dexscreener_pairs(
connector_id=connector_id,
search_space_id=search_space_id,
user_id=user_id,
)
)
except Exception as e:
logger.error(f"DexScreener indexing task failed: {str(e)}", exc_info=True)
raise
async def _index_dexscreener_pairs(
connector_id: int,
search_space_id: int,
user_id: str,
):
"""Async wrapper for DexScreener indexing."""
async with get_async_session_context() as session:
return await index_dexscreener_pairs(
session=session,
connector_id=connector_id,
search_space_id=search_space_id,
user_id=user_id,
)
```
#### 2.3. Periodic Scheduler
**File**: `app/utils/periodic_scheduler.py`
**Add to existing mappings**:
```python
# Add to CONNECTOR_TYPE_TO_TASK_NAME
CONNECTOR_TYPE_TO_TASK_NAME = {
# ... existing mappings
SearchSourceConnectorType.DEXSCREENER_CONNECTOR: "index_dexscreener_pairs",
}
# Add import
from app.tasks.celery_tasks.connector_tasks import index_dexscreener_pairs_task
# Add to CONNECTOR_TYPE_TO_TASK
CONNECTOR_TYPE_TO_TASK = {
# ... existing mappings
SearchSourceConnectorType.DEXSCREENER_CONNECTOR: index_dexscreener_pairs_task,
}
```
#### 2.4. Routes Registration
**File**: `app/routes/__init__.py`
**Add**:
```python
# Add import
from app.routes.dexscreener_add_connector_route import router as dexscreener_add_connector_router
# Add to router includes (after other connector routes)
router.include_router(dexscreener_add_connector_router)
```
#### 2.5. Indexer Export
**File**: `app/tasks/connector_indexers/__init__.py`
**Add**:
```python
# Add import
from .dexscreener_indexer import index_dexscreener_pairs
# Add to __all__
__all__ = [
# ... existing exports
"index_dexscreener_pairs",
]
```
## 🔄 Usage Flow
### 1. Add Connector via API
```bash
curl -X POST "http://localhost:8000/api/connectors/dexscreener/add" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"space_id": 1,
"tokens": [
{
"chain": "solana",
"address": "So11111111111111111111111111111111111111112",
"name": "Wrapped SOL"
},
{
"chain": "ethereum",
"address": "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"name": "Wrapped ETH"
}
]
}'
```
### 2. Test Connection
```bash
curl -X GET "http://localhost:8000/api/connectors/dexscreener/test?chain=solana&token_address=So11111111111111111111111111111111111111112" \
-H "Authorization: Bearer YOUR_TOKEN"
```
### 3. Trigger Manual Indexing
Indexing sẽ được trigger tự động qua:
- **Periodic scheduler**: Mỗi 60 phút (configurable)
- **Manual trigger**: Qua search_source_connectors_routes.py endpoint
### 4. Search Indexed Data
Data được index sẽ tự động available trong:
- AI Chat với context từ DexScreener
- Search results
- Document retrieval
## ⚠️ Important Considerations
### Rate Limiting
- DexScreener API: 300 requests/minute
- Với 50 tokens tracked, mỗi lần index = 50 requests
- Recommended indexing interval: **60 minutes**
- Implement exponential backoff nếu hit rate limit
### Data Freshness
- Crypto market data thay đổi nhanh
- Consider shorter intervals (15-30 min) cho high-priority tokens
- Implement priority queue cho important tokens
### Storage Optimization
- Mỗi pair = 1 document với chunks
- 50 tokens × 5 pairs average = 250 documents
- Monitor storage usage và implement cleanup cho old data
### Error Handling
- Network failures: Retry với exponential backoff
- API errors: Log và skip, không block toàn bộ indexing
- Invalid data: Validate trước khi index
## 🎯 Next Steps
1. **Implement Phase 1**: Core components (connector, indexer, DB schema)
2. **Test locally**: Verify API calls và data formatting
3. **Implement Phase 2**: Routes và integration
4. **Test end-to-end**: Add connector → Index → Search
5. **Deploy**: Monitor performance và adjust intervals
6. **Optimize**: Based on usage patterns và feedback
## 📊 Success Metrics
- ✅ Connector successfully fetches data from DexScreener API
- ✅ Data được format chính xác sang markdown
- ✅ Documents được index với proper chunks và embeddings
- ✅ Search results include DexScreener data
- ✅ AI Chat có context từ crypto market data
- ✅ Periodic indexing runs without errors
- ✅ Rate limits được respect
---
**Note**: Implementation này dựa trên:
- Official DexScreener API Documentation
- Existing SurfSense connector patterns (Luma, Slack, etc.)
- Best practices từ production connectors