| layout | title | parent |
|---|---|---|
| default | Document Embeddings Chunk ID | Swahili (Beta) |
Document Embeddings Chunk ID
Beta Translation: This document was translated via Machine Learning and as such may not be 100% accurate. All non-English languages are currently classified as Beta.
Overview
Document embeddings storage currently stores chunk text directly in the vector store payload, duplicating data that already exists in Garage. This spec replaces chunk text storage with chunk_id references.
Current State
from dataclasses import dataclass, field

@dataclass
class ChunkEmbeddings:
    chunk: bytes = b""
    vectors: list[list[float]] = field(default_factory=list)

@dataclass
class DocumentEmbeddingsResponse:
    error: Error | None = None
    chunks: list[str] = field(default_factory=list)
Vector store payload:
payload={"doc": chunk} # Duplicates Garage content
Design
Schema Changes
ChunkEmbeddings - replace chunk with chunk_id:
@dataclass
class ChunkEmbeddings:
    chunk_id: str = ""
    vectors: list[list[float]] = field(default_factory=list)
DocumentEmbeddingsResponse - return chunk_ids instead of chunks:
@dataclass
class DocumentEmbeddingsResponse:
    error: Error | None = None
    chunk_ids: list[str] = field(default_factory=list)
Vector Store Payload
All stores (Qdrant, Milvus, Pinecone):
payload={"chunk_id": chunk_id}
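As a minimal sketch of this payload change, the helper below builds one store-agnostic record per vector, with only the chunk_id in the payload. The function name `build_records`, the record layout, and the sample ID `doc-1/chunk-0` are assumptions made for illustration; the actual Qdrant, Milvus, and Pinecone writers each go through their own client calls.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkEmbeddings:
    chunk_id: str = ""
    vectors: list[list[float]] = field(default_factory=list)


def build_records(emb: ChunkEmbeddings) -> list[dict]:
    """Build one record per vector; the payload carries only the chunk_id.

    Hypothetical helper: the record layout is illustrative and does not
    correspond to any specific vector store API.
    """
    return [
        {"vector": vec, "payload": {"chunk_id": emb.chunk_id}}
        for vec in emb.vectors
    ]


# Illustrative input; the ID format is an assumption
emb = ChunkEmbeddings(chunk_id="doc-1/chunk-0", vectors=[[0.1, 0.2], [0.3, 0.4]])
records = build_records(emb)
```

Every vector that belongs to the same chunk points back to the same chunk_id, so the chunk text is never copied into the store.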
Document RAG Changes
The document RAG processor fetches chunk content from Garage:
# Get chunk_ids from embeddings store
chunk_ids = await self.rag.doc_embeddings_client.query(...)
# Fetch chunk content from Garage
docs = []
for chunk_id in chunk_ids:
    content = await self.rag.librarian_client.get_document_content(
        chunk_id, self.user
    )
    docs.append(content)
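The two-step lookup can be exercised end to end with in-memory stand-ins. In the sketch below, `StubEmbeddingsClient`, `StubLibrarianClient`, `fetch_docs`, and the `query` signature are all hypothetical; only the `get_document_content(chunk_id, user)` call shape is taken from the snippet above.

```python
import asyncio


class StubEmbeddingsClient:
    """Stand-in for the embeddings query client; returns chunk IDs only."""

    async def query(self, vectors, limit=2):
        # Assumed signature; the spec elides the real arguments
        return ["chunk-a", "chunk-b"][:limit]


class StubLibrarianClient:
    """Stand-in for the librarian client that reads content from Garage."""

    def __init__(self, store):
        self.store = store

    async def get_document_content(self, chunk_id, user):
        return self.store[chunk_id]


async def fetch_docs(embeddings, librarian, vectors, user):
    # Step 1: the vector store yields chunk IDs, not chunk text
    chunk_ids = await embeddings.query(vectors)
    # Step 2: resolve each ID to its content, preserving result order
    docs = []
    for chunk_id in chunk_ids:
        docs.append(await librarian.get_document_content(chunk_id, user))
    return chunk_ids, docs


librarian = StubLibrarianClient(
    {"chunk-a": "first chunk", "chunk-b": "second chunk"}
)
chunk_ids, docs = asyncio.run(
    fetch_docs(StubEmbeddingsClient(), librarian, [[0.1, 0.2]], "alice")
)
```

Keeping `chunk_ids` alongside `docs` lets the caller report which chunk each piece of retrieved text came from.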
API/SDK Changes
DocumentEmbeddingsClient returns chunk_ids:
return resp.chunk_ids # Changed from resp.chunks
Wire format (DocumentEmbeddingsResponseTranslator):
result["chunk_ids"] = obj.chunk_ids # Changed from chunks
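A rough sketch of how the response might be serialised once the field is renamed. The function `response_to_wire` is hypothetical and is not the real DocumentEmbeddingsResponseTranslator; the `error` field is simplified to an optional string here because the schema's `Error` type is not shown in this document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DocumentEmbeddingsResponse:
    # Simplified: the real schema uses an Error type instead of str
    error: Optional[str] = None
    chunk_ids: list[str] = field(default_factory=list)


def response_to_wire(obj: DocumentEmbeddingsResponse) -> dict:
    """Hypothetical translator step: dataclass to JSON-ready dict."""
    result = {}
    if obj.error is not None:
        result["error"] = obj.error
    result["chunk_ids"] = obj.chunk_ids  # previously result["chunks"]
    return result


wire = response_to_wire(DocumentEmbeddingsResponse(chunk_ids=["chunk-a"]))
```

Clients that still read a `chunks` key from the wire format will find it missing, so the rename has to be applied on both sides of the API.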
CLI Changes
The CLI tool displays chunk_ids (users can fetch the content separately if needed).
Files to Change
Schema
trustgraph-base/trustgraph/schema/knowledge/embeddings.py - ChunkEmbeddings
trustgraph-base/trustgraph/schema/services/query.py - DocumentEmbeddingsResponse
Messaging/Translators
trustgraph-base/trustgraph/messaging/translators/embeddings_query.py - DocumentEmbeddingsResponseTranslator
Client
trustgraph-base/trustgraph/base/document_embeddings_client.py - return chunk_ids
Python SDK/API
trustgraph-base/trustgraph/api/flow.py - document_embeddings_query
trustgraph-base/trustgraph/api/socket_client.py - document_embeddings_query
trustgraph-base/trustgraph/api/async_flow.py - if applicable
trustgraph-base/trustgraph/api/bulk_client.py - import/export of document embeddings
trustgraph-base/trustgraph/api/async_bulk_client.py - import/export of document embeddings
Embeddings Service
trustgraph-flow/trustgraph/embeddings/document_embeddings/embeddings.py - pass chunk_id
Storage Writers
trustgraph-flow/trustgraph/storage/doc_embeddings/qdrant/write.py
trustgraph-flow/trustgraph/storage/doc_embeddings/milvus/write.py
trustgraph-flow/trustgraph/storage/doc_embeddings/pinecone/write.py
Query Services
trustgraph-flow/trustgraph/query/doc_embeddings/qdrant/service.py
trustgraph-flow/trustgraph/query/doc_embeddings/milvus/service.py
trustgraph-flow/trustgraph/query/doc_embeddings/pinecone/service.py
Gateway
trustgraph-flow/trustgraph/gateway/dispatch/document_embeddings_query.py
trustgraph-flow/trustgraph/gateway/dispatch/document_embeddings_export.py
trustgraph-flow/trustgraph/gateway/dispatch/document_embeddings_import.py
Document RAG
trustgraph-flow/trustgraph/retrieval/document_rag/rag.py - add librarian client
trustgraph-flow/trustgraph/retrieval/document_rag/document_rag.py - fetch from Garage
CLI
trustgraph-cli/trustgraph/cli/invoke_document_embeddings.py
trustgraph-cli/trustgraph/cli/save_doc_embeds.py
trustgraph-cli/trustgraph/cli/load_doc_embeds.py
Benefits
- Single source of truth - chunk text lives only in Garage
- Reduced storage in the vector store
- Enables query-time verification via chunk_id
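To give a feel for the storage reduction, the sketch below compares the serialised size of a payload that embeds chunk text with one that holds only a chunk_id. The chunk text and ID are made-up stand-ins, and JSON size is only a proxy; how much a given vector store actually saves depends on its own encoding.

```python
import json

# Stand-in for a chunk of document text (1200 characters)
chunk_text = "lorem ipsum " * 100

old_payload = {"doc": chunk_text}            # current: text duplicated
new_payload = {"chunk_id": "doc-1/chunk-0"}  # proposed: reference only

old_size = len(json.dumps(old_payload).encode("utf-8"))
new_size = len(json.dumps(new_payload).encode("utf-8"))

print(f"old payload: {old_size} bytes, new payload: {new_size} bytes")
```

The new payload size stays constant regardless of chunk length, while the old one grows with every chunk stored.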