2026-01-31 14:48:53 +07:00
# Hướng Dẫn Tạo Custom Connectors cho SurfSense
## Tổng Quan
Bạn **hoàn toàn có thể** tạo custom connectors để kết nối đến các API bên ngoài như DexScreener, DefiLlama để phân tích token crypto. SurfSense có kiến trúc connector rất linh hoạt và dễ mở rộng.
## Kiến Trúc Connector
Mỗi connector trong SurfSense bao gồm 3 phần chính:
### 1. **Connector Class** (`app/connectors/`)
- Xử lý logic kết nối đến API bên ngoài
- Fetch và transform data
- Format data thành markdown để indexing
### 2. **API Routes** (`app/routes/`)
- Endpoint để add/delete/test connector
- Lưu config vào database
- Xác thực user
### 3. **Database Schema** (`app/db.py`)
- Định nghĩa connector type trong `SearchSourceConnectorType` enum
- Lưu trữ config (API keys, settings) trong `SearchSourceConnector` table
## Ví Dụ: Tạo DexScreener Connector
### Bước 1: Tạo Connector Class
Tạo file `/Users/mac_1/Documents/GitHub/SurfSense/surfsense_backend/app/connectors/dexscreener_connector.py` :
```python
"""
DexScreener Connector Module
A module for fetching token data and analytics from DexScreener API.
"""
from typing import Any
import requests
from datetime import datetime
class DexScreenerConnector:
"""Class for retrieving token data from DexScreener."""
def __init__ (self, api_key: str | None = None):
"""
Initialize the DexScreenerConnector class.
Args:
api_key: DexScreener API key (optional for public endpoints)
"""
self.api_key = api_key
self.base_url = "https://api.dexscreener.com/latest"
def set_api_key(self, api_key: str) -> None:
"""Set the DexScreener API key."""
self.api_key = api_key
def get_headers(self) -> dict[str, str]:
"""Get headers for DexScreener API requests."""
headers = {"Content-Type": "application/json"}
if self.api_key:
headers["Authorization"] = f"Bearer {self.api_key}"
return headers
def search_pairs(
self, query: str
) -> tuple[list[dict[str, Any]], str | None]:
"""
Search for trading pairs by token address or symbol.
Args:
query: Token address or symbol to search
Returns:
Tuple containing (pairs list, error message or None)
"""
try:
url = f"{self.base_url}/dex/search"
params = {"q": query}
response = requests.get(
url,
headers=self.get_headers(),
params=params,
timeout=10
)
if response.status_code == 200:
data = response.json()
pairs = data.get("pairs", [])
return pairs, None
else:
return [], f"API request failed with status {response.status_code}"
except Exception as e:
return [], f"Error searching pairs: {str(e)}"
def get_token_pairs(
self, token_address: str
) -> tuple[list[dict[str, Any]], str | None]:
"""
Get all pairs for a specific token address.
Args:
token_address: Token contract address
Returns:
Tuple containing (pairs list, error message or None)
"""
try:
url = f"{self.base_url}/dex/tokens/{token_address}"
response = requests.get(
url,
headers=self.get_headers(),
timeout=10
)
if response.status_code == 200:
data = response.json()
pairs = data.get("pairs", [])
return pairs, None
else:
return [], f"API request failed with status {response.status_code}"
except Exception as e:
return [], f"Error fetching token pairs: {str(e)}"
def get_pair_by_address(
self, pair_address: str
) -> tuple[dict[str, Any] | None, str | None]:
"""
Get detailed information about a specific pair.
Args:
pair_address: Pair contract address
Returns:
Tuple containing (pair data dict, error message or None)
"""
try:
url = f"{self.base_url}/dex/pairs/{pair_address}"
response = requests.get(
url,
headers=self.get_headers(),
timeout=10
)
if response.status_code == 200:
data = response.json()
pair = data.get("pair")
return pair, None
else:
return None, f"API request failed with status {response.status_code}"
except Exception as e:
return None, f"Error fetching pair: {str(e)}"
def format_pair_to_markdown(self, pair: dict[str, Any]) -> str:
"""
Convert a trading pair to markdown format for indexing.
Args:
pair: The pair object from DexScreener API
Returns:
Markdown string representation of the pair
"""
# Extract basic info
chain_id = pair.get("chainId", "Unknown")
dex_id = pair.get("dexId", "Unknown")
pair_address = pair.get("pairAddress", "")
# Token info
base_token = pair.get("baseToken", {})
quote_token = pair.get("quoteToken", {})
base_name = base_token.get("name", "Unknown")
base_symbol = base_token.get("symbol", "Unknown")
quote_name = quote_token.get("name", "Unknown")
quote_symbol = quote_token.get("symbol", "Unknown")
# Price and volume
price_native = pair.get("priceNative", "N/A")
price_usd = pair.get("priceUsd", "N/A")
volume_24h = pair.get("volume", {}).get("h24", "N/A")
liquidity_usd = pair.get("liquidity", {}).get("usd", "N/A")
# Price changes
price_change_5m = pair.get("priceChange", {}).get("m5", "N/A")
price_change_1h = pair.get("priceChange", {}).get("h1", "N/A")
price_change_24h = pair.get("priceChange", {}).get("h24", "N/A")
# Build markdown
markdown = f"# {base_symbol}/{quote_symbol} Trading Pair\n\n"
markdown += f"**Chain:** {chain_id}\n"
markdown += f"**DEX:** {dex_id}\n"
markdown += f"**Pair Address:** `{pair_address}` \n\n"
markdown += "## Token Information\n\n"
markdown += f"### Base Token: {base_name} ({base_symbol})\n"
markdown += f"- **Address:** `{base_token.get('address', 'N/A')}` \n\n"
markdown += f"### Quote Token: {quote_name} ({quote_symbol})\n"
markdown += f"- **Address:** `{quote_token.get('address', 'N/A')}` \n\n"
markdown += "## Market Data\n\n"
markdown += f"- **Price (Native):** {price_native}\n"
markdown += f"- **Price (USD):** ${price_usd}\n"
markdown += f"- **24h Volume:** ${volume_24h}\n"
markdown += f"- **Liquidity (USD):** ${liquidity_usd}\n\n"
markdown += "## Price Changes\n\n"
markdown += f"- **5 minutes:** {price_change_5m}%\n"
markdown += f"- **1 hour:** {price_change_1h}%\n"
markdown += f"- **24 hours:** {price_change_24h}%\n\n"
# Add URL if available
url = pair.get("url")
if url:
markdown += f"**DexScreener URL:** {url}\n\n"
# Add timestamp
markdown += f"*Data fetched at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}*\n"
return markdown
def get_all_token_data(
self, token_addresses: list[str]
) -> tuple[list[str], str | None]:
"""
Fetch and format data for multiple tokens.
Args:
token_addresses: List of token contract addresses
Returns:
Tuple containing (list of markdown documents, error message or None)
"""
documents = []
errors = []
for address in token_addresses:
pairs, error = self.get_token_pairs(address)
if error:
errors.append(f"Error for {address}: {error}")
continue
for pair in pairs:
markdown_doc = self.format_pair_to_markdown(pair)
documents.append(markdown_doc)
error_msg = "; ".join(errors) if errors else None
return documents, error_msg
```
### Bước 2: Thêm Connector Type vào Database
Sửa file `/Users/mac_1/Documents/GitHub/SurfSense/surfsense_backend/app/db.py` :
```python
class SearchSourceConnectorType(str, Enum):
# ... existing connectors ...
DEXSCREENER_CONNECTOR = "DEXSCREENER_CONNECTOR"
DEFILLAMA_CONNECTOR = "DEFILLAMA_CONNECTOR"
```
### Bước 3: Tạo API Routes
Tạo file `/Users/mac_1/Documents/GitHub/SurfSense/surfsense_backend/app/routes/dexscreener_add_connector_route.py` :
```python
import logging
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel, Field
from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.future import select
from app.db import (
SearchSourceConnector,
SearchSourceConnectorType,
User,
get_async_session,
)
from app.users import current_active_user
logger = logging.getLogger(__name__)
router = APIRouter()
class AddDexScreenerConnectorRequest(BaseModel):
"""Request model for adding a DexScreener connector."""
api_key: str | None = Field(None, description="DexScreener API key (optional)")
space_id: int = Field(..., description="Search space ID")
token_addresses: list[str] = Field(
default_factory=list,
description="List of token addresses to track"
)
@router .post("/connectors/dexscreener/add")
async def add_dexscreener_connector(
request: AddDexScreenerConnectorRequest,
user: User = Depends(current_active_user),
session: AsyncSession = Depends(get_async_session),
):
"""
Add a new DexScreener connector for the authenticated user.
Args:
request: The request containing DexScreener config and space_id
user: Current authenticated user
session: Database session
Returns:
Success message and connector details
Raises:
HTTPException: If connector already exists or validation fails
"""
try:
# Check if a DexScreener connector already exists
result = await session.execute(
select(SearchSourceConnector).filter(
SearchSourceConnector.search_space_id == request.space_id,
SearchSourceConnector.user_id == user.id,
SearchSourceConnector.connector_type
== SearchSourceConnectorType.DEXSCREENER_CONNECTOR,
)
)
existing_connector = result.scalars().first()
config = {
"token_addresses": request.token_addresses,
}
if request.api_key:
config["api_key"] = request.api_key
if existing_connector:
# Update existing connector
existing_connector.config = config
existing_connector.is_indexable = True
await session.commit()
await session.refresh(existing_connector)
logger.info(
f"Updated existing DexScreener connector for user {user.id}"
)
return {
"message": "DexScreener connector updated successfully",
"connector_id": existing_connector.id,
"connector_type": "DEXSCREENER_CONNECTOR",
}
# Create new DexScreener connector
db_connector = SearchSourceConnector(
name="DexScreener Token Tracker",
connector_type=SearchSourceConnectorType.DEXSCREENER_CONNECTOR,
config=config,
search_space_id=request.space_id,
user_id=user.id,
is_indexable=True,
)
session.add(db_connector)
await session.commit()
await session.refresh(db_connector)
logger.info(
f"Successfully created DexScreener connector for user {user.id}"
)
return {
"message": "DexScreener connector added successfully",
"connector_id": db_connector.id,
"connector_type": "DEXSCREENER_CONNECTOR",
}
except IntegrityError as e:
await session.rollback()
logger.error(f"Database integrity error: {str(e)}")
raise HTTPException(
status_code=409,
detail="A DexScreener connector already exists for this user.",
) from e
except Exception as e:
await session.rollback()
logger.error(f"Unexpected error: {str(e)}", exc_info=True)
raise HTTPException(
status_code=500,
detail=f"Failed to add DexScreener connector: {str(e)}",
) from e
@router .delete("/connectors/dexscreener")
async def delete_dexscreener_connector(
space_id: int,
user: User = Depends(current_active_user),
session: AsyncSession = Depends(get_async_session),
):
"""Delete the DexScreener connector."""
try:
result = await session.execute(
select(SearchSourceConnector).filter(
SearchSourceConnector.search_space_id == space_id,
SearchSourceConnector.user_id == user.id,
SearchSourceConnector.connector_type
== SearchSourceConnectorType.DEXSCREENER_CONNECTOR,
)
)
connector = result.scalars().first()
if not connector:
raise HTTPException(
status_code=404,
detail="DexScreener connector not found.",
)
await session.delete(connector)
await session.commit()
return {"message": "DexScreener connector deleted successfully"}
except HTTPException:
raise
except Exception as e:
await session.rollback()
logger.error(f"Unexpected error: {str(e)}", exc_info=True)
raise HTTPException(
status_code=500,
detail=f"Failed to delete DexScreener connector: {str(e)}",
) from e
@router .get("/connectors/dexscreener/test")
async def test_dexscreener_connector(
space_id: int,
user: User = Depends(current_active_user),
session: AsyncSession = Depends(get_async_session),
):
"""Test the DexScreener connector."""
try:
# Get the connector
result = await session.execute(
select(SearchSourceConnector).filter(
SearchSourceConnector.search_space_id == space_id,
SearchSourceConnector.user_id == user.id,
SearchSourceConnector.connector_type
== SearchSourceConnectorType.DEXSCREENER_CONNECTOR,
)
)
connector = result.scalars().first()
if not connector:
raise HTTPException(
status_code=404,
detail="DexScreener connector not found.",
)
# Import and test
from app.connectors.dexscreener_connector import DexScreenerConnector
api_key = connector.config.get("api_key")
dex = DexScreenerConnector(api_key=api_key)
# Test with a sample search
pairs, error = dex.search_pairs("WETH")
if error:
raise HTTPException(
status_code=400,
detail=f"Failed to connect to DexScreener: {error}",
)
return {
"message": "DexScreener connector is working correctly",
"sample_pairs_count": len(pairs),
"tracked_tokens": connector.config.get("token_addresses", []),
}
except HTTPException:
raise
except Exception as e:
logger.error(f"Unexpected error: {str(e)}", exc_info=True)
raise HTTPException(
status_code=500,
detail=f"Failed to test DexScreener connector: {str(e)}",
) from e
```
### Bước 4: Đăng Ký Routes
Sửa file `/Users/mac_1/Documents/GitHub/SurfSense/surfsense_backend/app/main.py` :
```python
# Import route
from app.routes import dexscreener_add_connector_route
# Add to app
app.include_router(
dexscreener_add_connector_route.router,
prefix="/api",
tags=["connectors"]
)
```
### Bước 5: Tạo Indexing Logic
Tạo file để xử lý indexing tự động:
```python
# app/connectors/dexscreener_indexer.py
from app.connectors.dexscreener_connector import DexScreenerConnector
from app.db import SearchSourceConnector
import logging
logger = logging.getLogger(__name__)
async def index_dexscreener_data(connector: SearchSourceConnector):
"""
Index data from DexScreener connector.
This function will be called periodically by the indexing service.
"""
try:
config = connector.config
api_key = config.get("api_key")
token_addresses = config.get("token_addresses", [])
if not token_addresses:
logger.warning(f"No token addresses configured for connector {connector.id}")
return []
# Initialize connector
dex = DexScreenerConnector(api_key=api_key)
# Fetch data for all tracked tokens
documents, error = dex.get_all_token_data(token_addresses)
if error:
logger.error(f"Error indexing DexScreener data: {error}")
return []
logger.info(f"Successfully indexed {len(documents)} documents from DexScreener")
return documents
except Exception as e:
logger.error(f"Error in DexScreener indexing: {str(e)}", exc_info=True)
return []
```
## Ví Dụ: DefiLlama Connector
Tương tự, bạn có thể tạo DefiLlama connector:
```python
# app/connectors/defillama_connector.py
"""
DefiLlama Connector Module
A module for fetching DeFi protocol data and TVL analytics from DefiLlama API.
"""
from typing import Any
import requests
from datetime import datetime
class DefiLlamaConnector:
"""Class for retrieving DeFi data from DefiLlama."""
def __init__ (self):
"""Initialize the DefiLlamaConnector class."""
self.base_url = "https://api.llama.fi"
def get_protocol(
self, protocol_slug: str
) -> tuple[dict[str, Any] | None, str | None]:
"""
Get detailed information about a specific protocol.
Args:
protocol_slug: Protocol slug (e.g., "uniswap", "aave")
Returns:
Tuple containing (protocol data dict, error message or None)
"""
try:
url = f"{self.base_url}/protocol/{protocol_slug}"
response = requests.get(url, timeout=10)
if response.status_code == 200:
return response.json(), None
else:
return None, f"API request failed with status {response.status_code}"
except Exception as e:
return None, f"Error fetching protocol: {str(e)}"
def get_tvl_history(
self, protocol_slug: str
) -> tuple[list[dict[str, Any]], str | None]:
"""
Get TVL history for a protocol.
Args:
protocol_slug: Protocol slug
Returns:
Tuple containing (TVL history list, error message or None)
"""
protocol_data, error = self.get_protocol(protocol_slug)
if error:
return [], error
tvl_history = protocol_data.get("tvl", [])
return tvl_history, None
def get_all_protocols(self) -> tuple[list[dict[str, Any]], str | None]:
"""
Get list of all protocols with basic info.
Returns:
Tuple containing (protocols list, error message or None)
"""
try:
url = f"{self.base_url}/protocols"
response = requests.get(url, timeout=10)
if response.status_code == 200:
return response.json(), None
else:
return [], f"API request failed with status {response.status_code}"
except Exception as e:
return [], f"Error fetching protocols: {str(e)}"
def get_chain_tvl(
self, chain: str
) -> tuple[dict[str, Any] | None, str | None]:
"""
Get TVL for a specific chain.
Args:
chain: Chain name (e.g., "Ethereum", "BSC")
Returns:
Tuple containing (chain TVL data, error message or None)
"""
try:
url = f"{self.base_url}/tvl/{chain}"
response = requests.get(url, timeout=10)
if response.status_code == 200:
return response.json(), None
else:
return None, f"API request failed with status {response.status_code}"
except Exception as e:
return None, f"Error fetching chain TVL: {str(e)}"
def format_protocol_to_markdown(self, protocol: dict[str, Any]) -> str:
"""
Convert protocol data to markdown format.
Args:
protocol: Protocol data from DefiLlama API
Returns:
Markdown string representation
"""
name = protocol.get("name", "Unknown Protocol")
slug = protocol.get("slug", "")
symbol = protocol.get("symbol", "N/A")
# TVL data
tvl = protocol.get("tvl", 0)
chain_tvls = protocol.get("chainTvls", {})
# Categories and chains
category = protocol.get("category", "N/A")
chains = protocol.get("chains", [])
# Build markdown
markdown = f"# {name}\n\n"
if symbol != "N/A":
markdown += f"**Symbol:** {symbol}\n"
markdown += f"**Category:** {category}\n"
markdown += f"**Slug:** `{slug}` \n\n"
markdown += "## Total Value Locked (TVL)\n\n"
markdown += f"**Current TVL:** ${tvl:,.2f}\n\n"
if chains:
markdown += "### TVL by Chain\n\n"
for chain in chains:
chain_tvl = chain_tvls.get(chain, 0)
markdown += f"- ** {chain}:** ${chain_tvl:,.2f}\n"
markdown += "\n"
# Description
description = protocol.get("description")
if description:
markdown += f"## Description\n\n{description}\n\n"
# Links
url = protocol.get("url")
twitter = protocol.get("twitter")
if url or twitter:
markdown += "## Links\n\n"
if url:
markdown += f"- **Website:** {url}\n"
if twitter:
markdown += f"- **Twitter:** https://twitter.com/{twitter}\n"
markdown += "\n"
markdown += f"*Data fetched at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}*\n"
return markdown
```
## Cách Sử Dụng
### 1. Thêm Connector qua API
```bash
# Add DexScreener connector
curl -X POST "http://localhost:8000/api/connectors/dexscreener/add" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"space_id": 1,
"token_addresses": [
"0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",
"0xdAC17F958D2ee523a2206206994597C13D831ec7"
],
"api_key": "optional_api_key"
}'
```
### 2. Test Connector
```bash
curl -X GET "http://localhost:8000/api/connectors/dexscreener/test?space_id=1" \
-H "Authorization: Bearer YOUR_TOKEN"
```
### 3. Xóa Connector
```bash
curl -X DELETE "http://localhost:8000/api/connectors/dexscreener?space_id=1" \
-H "Authorization: Bearer YOUR_TOKEN"
```
## Tích Hợp vào Indexing Pipeline
Để connector tự động index data định kỳ, bạn cần:
1. **Thêm vào Connector Service** (`app/services/connector_service.py` )
2. **Đăng ký indexing task** trong background job scheduler
3. **Cấu hình re-indexing interval** (mặc định 60 phút)
Ví dụ:
```python
# app/services/connector_service.py
async def index_connector_data(connector: SearchSourceConnector):
"""Index data from any connector type."""
if connector.connector_type == SearchSourceConnectorType.DEXSCREENER_CONNECTOR:
from app.connectors.dexscreener_indexer import index_dexscreener_data
return await index_dexscreener_data(connector)
elif connector.connector_type == SearchSourceConnectorType.DEFILLAMA_CONNECTOR:
from app.connectors.defillama_indexer import index_defillama_data
return await index_defillama_data(connector)
# ... other connector types
```
2026-02-01 15:05:19 +07:00
---
## ⚠️ CRITICAL: Enable RAG Retrieval
**This is the most commonly missed step when adding new connectors!**
### The Problem
Even if your connector successfully:
1. ✅ Stores data in the database
2. ✅ Indexes data into vector store
3. ✅ Shows up in the UI
The LLM **WILL NOT** be able to retrieve this data unless you add the connector to the RAG mapping.
### The Fix
**File:** `surfsense_backend/app/agents/new_chat/chat_deepagent.py`
**Add your connector to `_CONNECTOR_TYPE_TO_SEARCHABLE` :**
```python
_CONNECTOR_TYPE_TO_SEARCHABLE = {
"GMAIL": "GMAIL",
"GOOGLE_DRIVE_CONNECTOR": "GOOGLE_DRIVE",
"SLACK_CONNECTOR": "SLACK",
"NOTION_CONNECTOR": "NOTION",
# ... other connectors ...
# ✅ ADD YOUR NEW CONNECTOR HERE
"DEXSCREENER_CONNECTOR": "DEXSCREENER_CONNECTOR",
"YOUR_CONNECTOR": "YOUR_CONNECTOR", # Example
}
```
### Why This Matters
This mapping is used by `_map_connectors_to_searchable_types()` to:
1. Build the list of available search spaces for the LLM
2. Include connector types in the tool description
3. Enable the LLM to search your connector's data
**Without this mapping:**
- LLM won't know your connector exists
- LLM can't search your indexed data
- Users will get "I don't have access to that data" responses
### Verification
After adding the mapping, test with a user query:
```bash
# Example for DexScreener
curl -X POST http://localhost:8000/api/chat \
-H "Authorization: Bearer $TOKEN" \
-d '{
"message": "What is the current price of WETH?",
"space_id": 7
}'
```
**Expected:** LLM retrieves data and provides answer with citations.
**If failed:** Check that:
1. Connector is in `_CONNECTOR_TYPE_TO_SEARCHABLE`
2. Connector type matches exactly (case-sensitive)
3. Data is indexed in the correct `search_space_id`
---
2026-01-31 14:48:53 +07:00
## Best Practices
### 1. **Error Handling**
- Luôn return tuple `(data, error)` để dễ xử lý
- Log errors chi tiết để debug
- Graceful degradation khi API fail
### 2. **Rate Limiting**
- Respect API rate limits
- Implement exponential backoff
- Cache data khi có thể
### 3. **Data Formatting**
- Format data thành markdown rõ ràng
- Include metadata (timestamps, sources)
- Structured format để vector search hiệu quả
### 4. **Security**
- Encrypt API keys trong database
- Validate user input
- Implement proper authentication
### 5. **Performance**
- Async/await cho I/O operations
- Batch requests khi có thể
- Optimize indexing frequency
## Kết Luận
Với kiến trúc connector linh hoạt của SurfSense, bạn có thể:
✅ Kết nối đến **bất kỳ API nào** (DexScreener, DefiLlama, CoinGecko, etc.)
✅ Tự động **index và search** data từ nhiều nguồn
✅ **Customize** logic fetch và format data
✅ **Scale** dễ dàng với nhiều connectors
Connector system cho phép bạn biến SurfSense thành một **unified search platform** cho mọi loại data - từ emails, documents đến crypto analytics!