Comments & Mentions Implementation Guide

Implementation guide for adding Google Docs-style comments and @mentions to shared chats in SurfSense.


Table of Contents

  1. Overview
  2. Architecture Decisions
  3. Database Design
  4. API Design
  5. Permission Model
  6. Business Rules
  7. File Structure
  8. Implementation Phases
  9. Testing Checklist
  10. Out of Scope

Overview

Problem Statement

When team members view a shared AI chat, they cannot discuss specific answers in place. Users currently have to screenshot or copy-paste the output into external tools (Slack/Teams) to ask questions or validate data. This context switching causes friction and fragments organizational knowledge.

User Story

As a user, I want to comment on AI responses and reply to existing comments so that my team can see my thoughts without leaving SurfSense.

Solution

Users can place comments on AI chat responses and reply to existing comments, similar to Google Docs, but limited to a single level of nesting per AI response.


Architecture Decisions

Threading Model

AI Response (message_id: 123)
├── Comment A (parent_id: NULL)      ← Top-level comment
│   ├── Reply A1 (parent_id: A)      ← Reply
│   ├── Reply A2 (parent_id: A)      ← Reply
│   └── Reply A3 (parent_id: A)      ← Reply
├── Comment B (parent_id: NULL)      ← Top-level comment
│   └── Reply B1 (parent_id: B)      ← Reply
└── Comment C (parent_id: NULL)      ← Top-level comment (no replies)
  • Multiple top-level comments allowed per AI response
  • One level of nesting (replies to comments, but no replies to replies)
  • API enforces nesting limit, not DB constraint
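
Because the nesting limit lives in the API rather than in a database constraint, the reply endpoint has to check the parent before inserting. A minimal sketch of that check, assuming a hypothetical in-memory `Comment` record and a hypothetical `validate_reply_parent` helper (neither name comes from the SurfSense codebase):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    """Hypothetical stand-in for a chat_comments row."""
    id: int
    message_id: int
    parent_id: Optional[int] = None


class CannotReplyToReply(ValueError):
    """Raised when the target of a reply is itself a reply."""


def validate_reply_parent(parent: Comment) -> None:
    # A reply may only target a top-level comment (parent_id is NULL).
    if parent.parent_id is not None:
        raise CannotReplyToReply("Cannot reply to a reply")


comment_a = Comment(id=1, message_id=123)
reply_a1 = Comment(id=2, message_id=123, parent_id=1)

validate_reply_parent(comment_a)  # top-level parent: accepted
try:
    validate_reply_parent(reply_a1)  # parent is a reply: rejected
except CannotReplyToReply as exc:
    print(exc)
```

In a real route this error would presumably be translated into the 400 response listed under Error Responses.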

Comment Anchoring

  • Comments attach to AI responses only (not user messages)
  • Comments attach to entire message (not text selection)
  • Validated by checking message.role == 'assistant'

@Mentions

  • Storage format: @[uuid] in raw content
  • Display format: @Display Name in rendered content
  • Mentions extracted and stored in separate table for notifications
  • Only search space members can be mentioned
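
The storage and display formats above can be illustrated with a small parser and renderer. This is a sketch only: the regular expression, the function names, and the choice to leave unresolved IDs untouched in the rendered text are assumptions, not part of the existing code.

```python
import re
import uuid
from typing import Dict, List

# Assumed pattern for the storage format @[uuid].
MENTION_RE = re.compile(
    r"@\[([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})\]"
)


def extract_mentions(content: str) -> List[uuid.UUID]:
    """Return unique mentioned user IDs in order of first appearance."""
    seen: List[uuid.UUID] = []
    for raw in MENTION_RE.findall(content):
        user_id = uuid.UUID(raw)
        if user_id not in seen:
            seen.append(user_id)
    return seen


def render_mentions(content: str, names: Dict[uuid.UUID, str]) -> str:
    """Replace @[uuid] with @Display Name; unknown IDs are left as-is."""
    def _substitute(match: "re.Match[str]") -> str:
        name = names.get(uuid.UUID(match.group(1)))
        return f"@{name}" if name else match.group(0)

    return MENTION_RE.sub(_substitute, content)


john = uuid.UUID("12345678-1234-5678-1234-567812345678")
raw = f"Is this accurate? @[{john}]"
print(render_mentions(raw, {john: "John Doe"}))
```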

Read/Unread Mentions

  • Simple boolean on mention record
  • Marked read when user clicks the mention notification
  • No automatic read-on-view (keep it simple)
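
The read flag logic is small enough to sketch in memory. The `Mention` record and the three helpers below are hypothetical, and restricting "mark read" to the mentioned user is an assumption the guide does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mention:
    """Hypothetical stand-in for a chat_comment_mentions row."""
    id: int
    mentioned_user_id: str
    read: bool = False


def mark_read(mentions: List[Mention], mention_id: int, user_id: str) -> bool:
    """Mark one mention as read; returns False if the user has no such mention."""
    for mention in mentions:
        if mention.id == mention_id and mention.mentioned_user_id == user_id:
            mention.read = True
            return True
    return False


def mark_all_read(mentions: List[Mention], user_id: str) -> int:
    """Mark every unread mention of the user as read; returns how many changed."""
    changed = 0
    for mention in mentions:
        if mention.mentioned_user_id == user_id and not mention.read:
            mention.read = True
            changed += 1
    return changed


def unread_count(mentions: List[Mention], user_id: str) -> int:
    """Count the user's mentions whose read flag is still False."""
    return sum(
        1 for m in mentions if m.mentioned_user_id == user_id and not m.read
    )
```

Listing or viewing comments never touches the flag here, which matches the "no automatic read-on-view" decision.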

Cascade Behavior

  • Deleting a comment deletes all its replies (DB CASCADE)
  • Deleting/regenerating a message deletes its comments (DB CASCADE)
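
To see both cascades in action without a running PostgreSQL instance, the following sketch uses Python's built-in `sqlite3` as a stand-in. The table definitions are reduced to the columns that matter for the cascade, and SQLite only enforces foreign keys when the pragma is enabled, so this is an illustration rather than the real migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific; off by default
conn.executescript(
    """
    CREATE TABLE new_chat_messages (id INTEGER PRIMARY KEY, role TEXT NOT NULL);
    CREATE TABLE chat_comments (
        id INTEGER PRIMARY KEY,
        message_id INTEGER NOT NULL
            REFERENCES new_chat_messages(id) ON DELETE CASCADE,
        parent_id INTEGER
            REFERENCES chat_comments(id) ON DELETE CASCADE,
        content TEXT NOT NULL
    );
    INSERT INTO new_chat_messages (id, role) VALUES (123, 'assistant');
    INSERT INTO chat_comments (id, message_id, parent_id, content) VALUES
        (1, 123, NULL, 'Comment A'),
        (2, 123, 1, 'Reply A1'),
        (3, 123, NULL, 'Comment B');
    """
)


def count_comments() -> int:
    return conn.execute("SELECT COUNT(*) FROM chat_comments").fetchone()[0]


# Deleting Comment A also removes Reply A1; Comment B remains.
conn.execute("DELETE FROM chat_comments WHERE id = 1")
after_comment_delete = count_comments()

# Deleting the message removes every remaining comment on it.
conn.execute("DELETE FROM new_chat_messages WHERE id = 123")
after_message_delete = count_comments()

print(after_comment_delete, after_message_delete)
```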

Backend-First Architecture

  • All business logic in backend
  • Frontend is a thin consumer
  • Backend returns computed fields: can_edit, can_delete, is_edited, content_rendered

Database Design

Table: chat_comments

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | SERIAL | PK | Primary key |
| message_id | INTEGER | FK → new_chat_messages(id) ON DELETE CASCADE, NOT NULL | Which AI response |
| parent_id | INTEGER | FK → chat_comments(id) ON DELETE CASCADE, NULLABLE | NULL = top-level, otherwise = reply |
| author_id | UUID | FK → user(id) ON DELETE SET NULL | Who wrote it |
| content | TEXT | NOT NULL | Plain text, may contain @[uuid] |
| created_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Creation time |
| updated_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Last edit time |

Indexes:

  • idx_comments_message_id on message_id
  • idx_comments_parent_id on parent_id
  • idx_comments_author_id on author_id
  • idx_comments_created_at on created_at

Table: chat_comment_mentions

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| id | SERIAL | PK | Primary key |
| comment_id | INTEGER | FK → chat_comments(id) ON DELETE CASCADE, NOT NULL | Which comment |
| mentioned_user_id | UUID | FK → user(id) ON DELETE CASCADE, NOT NULL | Who was mentioned |
| read | BOOLEAN | NOT NULL, DEFAULT FALSE | Has user seen it |
| created_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | When mentioned |

Constraints:

  • UNIQUE (comment_id, mentioned_user_id) - A user can be mentioned at most once per comment

Indexes:

  • idx_mentions_user_unread on (mentioned_user_id) WHERE read = FALSE (partial index)
  • idx_mentions_comment_id on comment_id
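
The uniqueness constraint and the partial index can be tried out with `sqlite3`, which supports both features. This sketch is an assumption-laden stand-in for the PostgreSQL schema: foreign keys are omitted, the UUID is stored as text, and the boolean is an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chat_comment_mentions (
        id INTEGER PRIMARY KEY,
        comment_id INTEGER NOT NULL,
        mentioned_user_id TEXT NOT NULL,
        "read" INTEGER NOT NULL DEFAULT 0,
        UNIQUE (comment_id, mentioned_user_id)
    );
    -- Partial index: only unread rows are indexed.
    CREATE INDEX idx_mentions_user_unread
        ON chat_comment_mentions (mentioned_user_id) WHERE "read" = 0;
    CREATE INDEX idx_mentions_comment_id
        ON chat_comment_mentions (comment_id);
    """
)

INSERT_SQL = (
    "INSERT INTO chat_comment_mentions (comment_id, mentioned_user_id) "
    "VALUES (?, ?)"
)
conn.execute(INSERT_SQL, (5, "user-a"))

# A second mention of the same user in the same comment violates UNIQUE.
try:
    conn.execute(INSERT_SQL, (5, "user-a"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The unread-count query is the one the partial index is meant to serve.
unread = conn.execute(
    'SELECT COUNT(*) FROM chat_comment_mentions '
    'WHERE mentioned_user_id = ? AND "read" = 0',
    ("user-a",),
).fetchone()[0]
print(duplicate_rejected, unread)
```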

Schema Diagram

new_chat_messages (existing)
         │
         │ 1:N
         ▼
┌──────────────────────────────────┐
│       chat_comments              │
├──────────────────────────────────┤
│ id (PK)                          │
│ message_id (FK)                  │
│ parent_id (FK, self-ref)         │
│ author_id (FK → user)            │
│ content                          │
│ created_at                       │
│ updated_at                       │
└──────────────────────────────────┘
         │
         │ 1:N
         ▼
┌──────────────────────────────────┐
│   chat_comment_mentions          │
├──────────────────────────────────┤
│ id (PK)                          │
│ comment_id (FK)                  │
│ mentioned_user_id (FK → user)    │
│ read                             │
│ created_at                       │
└──────────────────────────────────┘

API Design

Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| GET | /api/v1/messages/{message_id}/comments | List comments with replies |
| POST | /api/v1/messages/{message_id}/comments | Create top-level comment |
| POST | /api/v1/comments/{comment_id}/replies | Reply to a comment |
| PUT | /api/v1/comments/{comment_id} | Edit comment (author only) |
| DELETE | /api/v1/comments/{comment_id} | Delete comment + replies |
| GET | /api/v1/users/me/mentions | List user's mentions |
| POST | /api/v1/mentions/{mention_id}/read | Mark mention as read |
| POST | /api/v1/users/me/mentions/read-all | Mark all mentions read |

Request Schemas

Create/Update Comment:

{
  "content": "string (1-5000 chars)"
}

Response Schemas

Comment Response:

{
  "id": 1,
  "message_id": 123,
  "content": "Is this accurate? @[uuid-here]",
  "content_rendered": "Is this accurate? @John Doe",
  "author": {
    "id": "uuid",
    "display_name": "Jane Smith",
    "avatar_url": "https://...",
    "email": "jane@example.com"
  },
  "created_at": "2024-01-15T10:42:00Z",
  "updated_at": "2024-01-15T10:42:00Z",
  "is_edited": false,
  "can_edit": true,
  "can_delete": true,
  "reply_count": 2,
  "replies": [
    {
      "id": 2,
      "content": "Yes, verified",
      "content_rendered": "Yes, verified",
      "author": { ... },
      "created_at": "...",
      "updated_at": "...",
      "is_edited": false,
      "can_edit": false,
      "can_delete": false
    }
  ]
}

Comment List Response:

{
  "comments": [ ... ],
  "total_count": 5
}
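
One way to produce the nested list response from flat database rows is to group replies by `parent_id`. The sketch below uses plain dictionaries; `build_comment_list` is a hypothetical helper, and treating `total_count` as the number of top-level comments is an assumption, since the guide does not define it.

```python
from collections import defaultdict
from typing import Any, Dict, List


def build_comment_list(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group flat comment rows into top-level comments with nested replies."""
    replies_by_parent: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        if row["parent_id"] is not None:
            replies_by_parent[row["parent_id"]].append(row)

    comments = []
    for row in rows:
        if row["parent_id"] is None:
            # Replies are shown oldest first under their parent.
            replies = sorted(
                replies_by_parent[row["id"]], key=lambda r: r["created_at"]
            )
            comments.append(
                {**row, "replies": replies, "reply_count": len(replies)}
            )

    comments.sort(key=lambda c: c["created_at"])
    # Assumption: total_count counts top-level comments only.
    return {"comments": comments, "total_count": len(comments)}


rows = [
    {"id": 1, "parent_id": None, "created_at": "2024-01-15T10:42:00Z"},
    {"id": 2, "parent_id": 1, "created_at": "2024-01-15T10:45:00Z"},
    {"id": 3, "parent_id": None, "created_at": "2024-01-15T10:43:00Z"},
]
print(build_comment_list(rows)["comments"][0]["reply_count"])
```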

Mention Response:

{
  "id": 1,
  "read": false,
  "created_at": "2024-01-15T10:42:00Z",
  "comment": {
    "id": 5,
    "content_preview": "Hey, can you check...",
    "author": {
      "display_name": "John Doe",
      "avatar_url": "..."
    },
    "created_at": "..."
  },
  "context": {
    "thread_id": 123,
    "thread_title": "Q3 Analysis",
    "message_id": 456,
    "search_space_id": 1,
    "search_space_name": "Finance Team"
  }
}

Mention List Response:

{
  "mentions": [ ... ],
  "unread_count": 3
}

Permission Model

New Permissions

Add to Permission enum in db.py:

| Permission | Value |
|------------|-------|
| COMMENTS_CREATE | "comments:create" |
| COMMENTS_READ | "comments:read" |
| COMMENTS_DELETE | "comments:delete" |
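
As an illustration, the three new members might look like this. The sketch assumes the enum is a string-valued `Enum`; the existing members in db.py are not shown, and the real base class may differ.

```python
from enum import Enum


class Permission(str, Enum):
    # Existing members of the real enum in db.py are omitted from this sketch.
    COMMENTS_CREATE = "comments:create"
    COMMENTS_READ = "comments:read"
    COMMENTS_DELETE = "comments:delete"


# Members can be looked up by their string value.
print(Permission("comments:read") is Permission.COMMENTS_READ)
```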

Default Role Assignments

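| Role | comments:read | comments:create | comments:delete |
|------|---------------|-----------------|-----------------|
| Owner | Yes | Yes | Yes |
| Admin | Yes | Yes | Yes |
| Editor | Yes | Yes | No |
| Viewer | Yes | Yes | No |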

Authorization Rules

| Action | Who Can Do It |
|--------|---------------|
| Read comments | Anyone with comments:read |
| Create comment | Anyone with comments:create |
| Create reply | Anyone with comments:create |
| Edit comment | Author only |
| Delete own comment | Author |
| Delete any comment | Anyone with comments:delete |

Business Rules

Validation Rules

  1. Message must be AI response: message.role == 'assistant'
  2. Reply parent must be top-level: parent.parent_id IS NULL
  3. Content not empty: 1-5000 characters
  4. Mentioned users must be search space members
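
Rules 1 and 3 can be checked before anything is written to the database. The following is a minimal sketch: `validate_new_comment` and `CommentValidationError` are hypothetical names, stripping surrounding whitespace before measuring length is an assumption, and the length error message is invented for illustration.

```python
MAX_COMMENT_LENGTH = 5000


class CommentValidationError(ValueError):
    """Raised when a create-comment request breaks a validation rule."""


def validate_new_comment(message_role: str, content: str) -> str:
    """Validate a top-level comment and return the cleaned content."""
    # Rule 1: comments attach to AI responses only.
    if message_role != "assistant":
        raise CommentValidationError(
            "Comments can only be added to AI responses"
        )
    # Rule 3: 1-5000 characters. Assumption: whitespace-only counts as empty.
    cleaned = content.strip()
    if not 1 <= len(cleaned) <= MAX_COMMENT_LENGTH:
        # Hypothetical message; the guide does not specify one for this case.
        raise CommentValidationError("Content must be 1-5000 characters")
    return cleaned


print(validate_new_comment("assistant", "  Is this accurate?  "))
```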

Computed Fields (Backend Returns)

| Field | Logic |
|-------|-------|
| is_edited | updated_at > created_at |
| can_edit | comment.author_id == current_user.id |
| can_delete | author_id == user.id OR has_permission("comments:delete") |
| content_rendered | Replace @[uuid] with @Display Name |
| reply_count | Count of replies |
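
The three boolean fields can be derived from a comment row plus the current user's identity and permissions. This sketch assumes a hypothetical `CommentRow` record and passes permissions as a set of strings; a deleted author (`author_id` of NULL) is treated as matching nobody.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Set


@dataclass
class CommentRow:
    """Hypothetical stand-in for the columns the computed fields need."""
    author_id: Optional[str]
    created_at: datetime
    updated_at: datetime


def computed_fields(
    comment: CommentRow, current_user_id: str, permissions: Set[str]
) -> Dict[str, bool]:
    # A NULL author (deleted user) never matches the current user.
    is_author = (
        comment.author_id is not None and comment.author_id == current_user_id
    )
    return {
        "is_edited": comment.updated_at > comment.created_at,
        "can_edit": is_author,
        "can_delete": is_author or "comments:delete" in permissions,
    }


created = datetime(2024, 1, 15, 10, 42, tzinfo=timezone.utc)
edited = CommentRow("jane", created, created + timedelta(minutes=5))
print(computed_fields(edited, "john", {"comments:delete"}))
```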

Mention Processing

  1. On comment create/update, parse @[uuid] patterns from content
  2. Validate each UUID is a member of the search space
  3. Insert/update records in chat_comment_mentions
  4. Invalid UUIDs: silently ignore (don't create mention record)
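
The four steps above could be sketched as a pure function that decides which mention rows to add and which to drop. `plan_mention_sync` is a hypothetical name, and removing rows for mentions that disappear on edit is an assumption; the guide only says "insert/update".

```python
import re
import uuid
from typing import Iterable, Set, Tuple

# Deliberately loose pattern so malformed IDs reach the UUID check below.
BRACKET_RE = re.compile(r"@\[([^\]]+)\]")


def plan_mention_sync(
    content: str,
    member_ids: Iterable[uuid.UUID],
    existing_ids: Iterable[uuid.UUID],
) -> Tuple[Set[uuid.UUID], Set[uuid.UUID]]:
    """Return (to_insert, to_delete) for a comment's mention rows."""
    mentioned: Set[uuid.UUID] = set()
    for raw in BRACKET_RE.findall(content):
        try:
            mentioned.add(uuid.UUID(raw))
        except ValueError:
            continue  # step 4: invalid UUIDs are silently ignored

    # Step 2: only search space members produce mention rows.
    valid = mentioned & set(member_ids)
    existing = set(existing_ids)
    # Assumption: mentions removed during an edit lose their rows.
    return valid - existing, existing - valid


alice = uuid.UUID("11111111-1111-1111-1111-111111111111")
to_insert, to_delete = plan_mention_sync(
    f"@[{alice}] please check, cc @[nobody]", [alice], []
)
print(to_insert, to_delete)
```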

Error Responses

| Scenario | HTTP Status | Message |
|----------|-------------|---------|
| Message not found | 404 | "Message not found" |
| Message is not AI response | 400 | "Comments can only be added to AI responses" |
| Comment not found | 404 | "Comment not found" |
| Cannot reply to reply | 400 | "Cannot reply to a reply" |
| Not authorized to edit | 403 | "You can only edit your own comments" |
| Not authorized to delete | 403 | "You do not have permission to delete this comment" |
| Mention not found | 404 | "Mention not found" |
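
Keeping the status codes and messages in one place makes it easier to stay consistent with the table. The sketch below is one possible arrangement: the error keys, the `CommentAPIError` class, and the `{"detail": ...}` body shape are all hypothetical.

```python
from typing import Any, Dict, Tuple

# Scenario key -> (HTTP status, message), copied from the table above.
ERRORS: Dict[str, Tuple[int, str]] = {
    "message_not_found": (404, "Message not found"),
    "not_ai_response": (400, "Comments can only be added to AI responses"),
    "comment_not_found": (404, "Comment not found"),
    "reply_to_reply": (400, "Cannot reply to a reply"),
    "edit_forbidden": (403, "You can only edit your own comments"),
    "delete_forbidden": (
        403,
        "You do not have permission to delete this comment",
    ),
    "mention_not_found": (404, "Mention not found"),
}


class CommentAPIError(Exception):
    """Service-layer error carrying the status and message for a scenario."""

    def __init__(self, code: str) -> None:
        self.code = code
        self.status_code, self.message = ERRORS[code]
        super().__init__(self.message)


def to_response(exc: CommentAPIError) -> Tuple[int, Dict[str, Any]]:
    # Assumed response body shape; adapt to the project's error format.
    return exc.status_code, {"detail": exc.message}


print(to_response(CommentAPIError("reply_to_reply")))
```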

File Structure

surfsense_backend/app/
├── routes/
│   └── comments_routes.py         # All comment endpoints
│
├── services/
│   └── comments_service.py        # Business logic + DB access
│
├── schemas/
│   └── comments.py                # Request/response schemas
│
├── utils/
│   └── comments.py                # Mention parsing, helpers
│
├── db.py                          # Add models (ChatComment, ChatCommentMention)
│
└── alembic/versions/
    └── XX_add_comments_tables.py  # Migration

Implementation Phases

Phase 1: Foundation (P0)

| Task | Description |
|------|-------------|
| 1.1 | Create chat_comments table migration |
| 1.2 | Create chat_comment_mentions table migration |
| 1.3 | Add ChatComment model to db.py |
| 1.4 | Add ChatCommentMention model to db.py |
| 1.5 | Add comment permissions to Permission enum |
| 1.6 | Update DEFAULT_ROLE_PERMISSIONS |

Phase 2: Core CRUD (P0)

| Task | Description |
|------|-------------|
| 2.1 | Create Pydantic schemas in schemas/comments.py |
| 2.2 | Implement GET /messages/{id}/comments |
| 2.3 | Implement POST /messages/{id}/comments |
| 2.4 | Implement POST /comments/{id}/replies |
| 2.5 | Implement PUT /comments/{id} |
| 2.6 | Implement DELETE /comments/{id} |

Phase 3: Mentions (P1)

| Task | Description |
|------|-------------|
| 3.1 | Create mention parser in utils/comments.py |
| 3.2 | Validate mentioned users are search space members |
| 3.3 | Insert mentions on comment create/update |
| 3.4 | Return content_rendered with names resolved |
| 3.5 | Implement GET /users/me/mentions |
| 3.6 | Implement POST /mentions/{id}/read |
| 3.7 | Implement POST /users/me/mentions/read-all |

Phase 4: Authorization (P1)

| Task | Description |
|------|-------------|
| 4.1 | Check comments:read on list endpoint |
| 4.2 | Check comments:create on create/reply |
| 4.3 | Check author-only on edit |
| 4.4 | Check author OR comments:delete on delete |
| 4.5 | Validate message is AI response |
| 4.6 | Validate parent is top-level |

Phase 5: Response Enrichment (P1)

| Task | Description |
|------|-------------|
| 5.1 | Return can_edit per comment |
| 5.2 | Return can_delete per comment |
| 5.3 | Return is_edited |
| 5.4 | Return author info (name, avatar, email fallback) |
| 5.5 | Return reply_count per top-level comment |

Phase 6: Edge Cases (P2)

| Task | Description |
|------|-------------|
| 6.1 | Handle deleted user (author_id SET NULL) |
| 6.2 | Handle mention of user no longer in search space |
| 6.3 | Clear error when replying to deleted comment |

Testing Checklist (to be done manually)

Functional Tests

  • Create comment on AI response
  • Create comment on user message (should fail)
  • Reply to comment
  • Reply to reply (should fail)
  • Edit own comment
  • Edit other's comment (should fail)
  • Delete own comment (and verify replies deleted)
  • Delete other's comment as owner/admin
  • Delete other's comment as editor/viewer (should fail)

Mention Tests

  • Mention valid user creates mention record
  • Mention invalid UUID is ignored
  • Mention non-member is ignored
  • content_rendered shows display names
  • List mentions returns correct data
  • Mark mention read updates read flag
  • Mark all read updates all mentions

Edge Case Tests

  • Comment with 10+ replies scrolls properly (frontend)
  • Delete parent comment cascades to replies
  • Regenerate AI response deletes comments
  • Comment on one-word AI response (frontend min height)
  • User A deletes comment while User B replies → clear error

Permission Tests

  • Viewer can read comments
  • Viewer can create comments
  • Viewer cannot delete other's comments
  • Owner can delete any comment

Out of Scope

The following are explicitly NOT included in v1:

| Feature | Reason |
|---------|--------|
| Multiple threads per message | Simplicity |
| Text selection comments | Complexity |
| Rich text (bold/italic/images) | Complexity |
| Email/push notifications | Infrastructure |
| Emoji reactions | Scope |
| Comment on user messages | Focus on AI responses |
| Nested replies (> 2 levels) | UX complexity |

Future Considerations (Not Now)

These are documented for future reference but NOT to be implemented now:

  1. Stale comment detection: Store message_content_hash to detect if AI response changed
  2. Real-time sync: Electric-SQL integration for live updates
  3. Notification center: Dedicated page for all notifications
  4. Email digests: Periodic email summaries of mentions
  5. Comment reactions: Thumbs up, etc.

References

  • Existing patterns: routes/new_chat_routes.py, services/connector_service.py
  • Permission system: db.py (Permission enum, DEFAULT_ROLE_PERMISSIONS)
  • Schema patterns: schemas/new_chat.py

Important rules

  • Never perform actions in bulk. Always get my approval before doing anything I did not specifically mention in the prompt.
  • Commit after each task. For the backend, make sure there are no ruff errors (ruff check, ruff format); for the frontend, run pnpm format:fix in the web project.
  • Commit messages should be minimal, and commits should use the --no-verify flag.