feat: various UI fixes, prompt optimizations, and allowing duplicate docs

- Updated `content_hash` in the `Document` model to remove global uniqueness, allowing identical content across different paths.
- Enhanced `_create_document` function to handle path uniqueness and prevent session-poisoning from `IntegrityError`.
- Added detailed comments for clarity on the changes and their implications.
- Introduced new citation handling in the editor for improved user experience with citation jumps.
- Updated package dependencies in the frontend for better functionality.
This commit is contained in:
DESKTOP-RTLN3BA\$punk 2026-04-28 21:30:53 -07:00
parent e6433f78c4
commit b9a66cb417
26 changed files with 1540 additions and 852 deletions

View file

@ -976,7 +976,15 @@ class Document(BaseModel, TimestampMixin):
document_metadata = Column(JSON, nullable=True)
content = Column(Text, nullable=False)
content_hash = Column(String, nullable=False, index=True, unique=True)
# ``content_hash`` is intentionally NOT globally unique. In a real
# filesystem two files at different paths can hold identical bytes,
# and the agent's ``write_file`` flow needs that semantic to support
# copy / duplicate operations. Path uniqueness lives on
# ``unique_identifier_hash`` (per search space). The hash remains
# indexed because connector indexers consult it as a change-detection
# / cross-source dedup hint via :func:`check_duplicate_document`.
# See migration 133.
content_hash = Column(String, nullable=False, index=True)
unique_identifier_hash = Column(String, nullable=True, index=True, unique=True)
embedding = Column(Vector(config.embedding_model_instance.dimension))