fix image

zmtomorrow 2025-08-22 09:57:01 +01:00
parent df637598e8
commit af64e253f5

@@ -40,7 +40,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"PageIndex generates a searchable tree structure of documents, enabling reasoning-based retrieval through tree search — without vectors.\n",
+"PageIndex generates a searchable tree structure of documents, enabling reasoning-based retrieval through tree search. \n",
"\n",
+"<div align=\"center\">\n",
+" <img src=\"https://pageindex.ai/static/images/vectorless_rag_workflow.png\" width=\"100%\">\n",
+"</div>\n",
+"\n",
"- **No Vectors Needed**: Uses document structure and LLM reasoning for retrieval.\n",
"- **No Chunking Needed**: Documents are organized into natural sections rather than artificial chunks.\n",
@@ -57,11 +61,7 @@
"This notebook demonstrates a simple example of **vectorless RAG** with PageIndex through the following steps:\n",
"- [x] Build a PageIndex tree structure of a document\n",
"- [x] Perform reasoning-based retrieval with tree search\n",
-"- [x] Generate answers based on the retrieved context\n",
-"\n",
-"The figure below shows an overview of the workflow:\n",
-"\n",
-"<img src=\"https://pageindex.ai/static/images/vectorless_rag_workflow.png\" width=\"70%\">"
+"- [x] Generate answers based on the retrieved context"
]
},
{
@@ -85,7 +85,7 @@
"id": "edTfrizMFK4c"
},
"source": [
-"#### Install dependencies"
+"### Install dependencies"
]
},
{
@@ -106,7 +106,7 @@
"id": "WVEWzPKGcG1M"
},
"source": [
-"#### Setup environment"
+"### Setup environment"
]
},
{
@@ -134,7 +134,7 @@
"id": "AR7PLeVbcG1N"
},
"source": [
-"#### Define utility functions"
+"### Define utility functions"
]
},
{
@@ -198,7 +198,7 @@
"id": "Mzd1VWjwMUJL"
},
"source": [
-"#### Submit a document for PageIndex tree generation"
+"### Submit a document with PageIndex SDK"
]
},
{
@@ -222,7 +222,7 @@
}
],
"source": [
-"# You can also use our GitHub repo to generate PageIndex tree\n",
+"# You can also use our GitHub repo to generate PageIndex structure\n",
"# https://github.com/VectifyAI/PageIndex\n",
"\n",
"pdf_url = \"https://arxiv.org/pdf/2501.12948.pdf\"\n",
@@ -244,7 +244,7 @@
"id": "4-Hrh0azcG1N"
},
"source": [
-"#### Get the generated PageIndex tree structure"
+"### Get the generated PageIndex tree structure"
]
},
{
@@ -356,7 +356,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"#### Use LLM for tree search and identify nodes that might contain relevant context"
+"### Use LLM for tree search and identify nodes that might contain relevant context"
]
},
{
@@ -396,7 +396,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"#### Print retrieved nodes and reasoning process"
+"### Print retrieved nodes and reasoning process"
]
},
{
@@ -455,7 +455,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"#### Extract relevant context from retrieved nodes"
+"### Extract relevant context from retrieved nodes"
]
},
{
@@ -507,12 +507,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"#### Generate answer based on retrieved context"
+"### Generate answer based on retrieved context"
]
},
{
"cell_type": "code",
-"execution_count": null,
+"execution_count": 59,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -564,7 +564,7 @@
"id": "_1kaGD3GcG1O"
},
"source": [
-"## 🎯 What's Next\n",
+"# 🎯 What's Next\n",
"\n",
"This notebook has demonstrated a basic, minimal example of **reasoning-based**, **vectorless** RAG with PageIndex. The workflow illustrates the core idea:\n",
"> *Generating a hierarchical tree structure from a document, reasoning over that tree structure, and extracting relevant context, without relying on a vector database or top-k similarity search*.\n",
@@ -581,7 +581,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## 🔎 Learn More About PageIndex\n",
+"# 🔎 Learn More About PageIndex\n",
" <a href=\"https://vectify.ai\">🏠 Homepage</a>&nbsp; • &nbsp;\n",
" <a href=\"https://dash.pageindex.ai\">🖥️ Dashboard</a>&nbsp; • &nbsp;\n",
" <a href=\"https://docs.pageindex.ai/quickstart\">📚 API Docs</a>&nbsp; • &nbsp;\n",