edit readme (#335)

Ray 2026-06-22 23:28:08 +08:00 committed by GitHub
parent fe89f246f2
commit 54346716bd
GPG key ID: B5690EEEBB952194


@@ -13,7 +13,7 @@
# PageIndex: Vectorless, Reasoning-based RAG
-<p align="center"><b>Reasoning-based RAG&nbsp;&nbsp;No Vector DB, No Chunking&nbsp;&nbsp;Context-Aware Retrieval&nbsp;&nbsp;Human-like</b></p>
+<p align="center"><b>Reasoning-based RAG&nbsp;&nbsp;No Vector DB, No Chunking&nbsp;&nbsp;Context-Aware Retrieval&nbsp;&nbsp;Reads Like Humans</b></p>
<h4 align="center">
<a href="https://vectify.ai">🌐 Website</a>&nbsp;&nbsp;
@@ -45,7 +45,9 @@
# 📑 Introduction to PageIndex
-Are you frustrated with vector database retrieval accuracy for long professional documents? Traditional vector-based RAG relies on semantic *similarity* rather than true *relevance*. But **similarity ≠ relevance** — what we truly need in retrieval is **relevance**, and that requires **reasoning**. When working with professional documents that demand domain expertise and multi-step reasoning, similarity search often falls short — missing what's relevant but not similar, and returning what's similar yet not relevant.
+**PageIndex is a vectorless, reasoning-based RAG engine that mirrors how humans read, delivering traceable, explainable, and context-aware retrieval, without vector databases or chunking.**
+Are you frustrated with vector database retrieval accuracy for long professional documents? Traditional vector-based RAG relies on semantic *similarity* rather than true *relevance*. But **similarity ≠ relevance** — what we truly need in retrieval is **relevance**, and that requires **reasoning**. When working with professional documents that demand contextual understanding, domain expertise, and multi-step reasoning, similarity search often falls short — missing what's relevant but not similar, and returning what's similar yet not relevant.
Inspired by AlphaGo, we propose **[PageIndex](https://vectify.ai/pageindex)** — a **vectorless**, **reasoning-based RAG** system that builds a **hierarchical tree index** from long documents and uses LLMs to **reason** *over that index* for **agentic, context-aware retrieval**. The retrieval is *traceable* and *explainable*, with no vector DBs or chunking.
PageIndex simulates how *human experts* navigate and extract knowledge from complex documents through *tree search*, enabling LLMs to *think* and *reason* their way to the most relevant document sections. It performs retrieval in two steps: