fix image

zmtomorrow 2025-08-22 09:57:01 +01:00
parent df637598e8
commit af64e253f5

@@ -40,7 +40,11 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"PageIndex generates a searchable tree structure of documents, enabling reasoning-based retrieval through tree search — without vectors.\n", "PageIndex generates a searchable tree structure of documents, enabling reasoning-based retrieval through tree search. \n",
"\n",
"<div align=\"center\">\n",
" <img src=\"https://pageindex.ai/static/images/vectorless_rag_workflow.png\" width=\"100%\">\n",
"</div>\n",
"\n", "\n",
"- **No Vectors Needed**: Uses document structure and LLM reasoning for retrieval.\n", "- **No Vectors Needed**: Uses document structure and LLM reasoning for retrieval.\n",
"- **No Chunking Needed**: Documents are organized into natural sections rather than artificial chunks.\n", "- **No Chunking Needed**: Documents are organized into natural sections rather than artificial chunks.\n",
@@ -57,11 +61,7 @@
"This notebook demonstrates a simple example of **vectorless RAG** with PageIndex through the following steps:\n", "This notebook demonstrates a simple example of **vectorless RAG** with PageIndex through the following steps:\n",
"- [x] Build a PageIndex tree structure of a document\n", "- [x] Build a PageIndex tree structure of a document\n",
"- [x] Perform reasoning-based retrieval with tree search\n", "- [x] Perform reasoning-based retrieval with tree search\n",
"- [x] Generate answers based on the retrieved context\n", "- [x] Generate answers based on the retrieved context"
"\n",
"The figure below shows an overview of the workflow:\n",
"\n",
"<img src=\"https://pageindex.ai/static/images/vectorless_rag_workflow.png\" width=\"70%\">"
] ]
}, },
{ {
@@ -85,7 +85,7 @@
"id": "edTfrizMFK4c" "id": "edTfrizMFK4c"
}, },
"source": [ "source": [
"#### Install dependencies" "### Install dependencies"
] ]
}, },
{ {
@@ -106,7 +106,7 @@
"id": "WVEWzPKGcG1M" "id": "WVEWzPKGcG1M"
}, },
"source": [ "source": [
"#### Setup environment" "### Setup environment"
] ]
}, },
{ {
@@ -134,7 +134,7 @@
"id": "AR7PLeVbcG1N" "id": "AR7PLeVbcG1N"
}, },
"source": [ "source": [
"#### Define utility functions" "### Define utility functions"
] ]
}, },
{ {
@@ -198,7 +198,7 @@
"id": "Mzd1VWjwMUJL" "id": "Mzd1VWjwMUJL"
}, },
"source": [ "source": [
"#### Submit a document for PageIndex tree generation" "### Submit a document with PageIndex SDK"
] ]
}, },
{ {
@@ -222,7 +222,7 @@
} }
], ],
"source": [ "source": [
"# You can also use our GitHub repo to generate PageIndex tree\n", "# You can also use our GitHub repo to generate PageIndex structure\n",
"# https://github.com/VectifyAI/PageIndex\n", "# https://github.com/VectifyAI/PageIndex\n",
"\n", "\n",
"pdf_url = \"https://arxiv.org/pdf/2501.12948.pdf\"\n", "pdf_url = \"https://arxiv.org/pdf/2501.12948.pdf\"\n",
@@ -244,7 +244,7 @@
"id": "4-Hrh0azcG1N" "id": "4-Hrh0azcG1N"
}, },
"source": [ "source": [
"#### Get the generated PageIndex tree structure" "### Get the generated PageIndex tree structure"
] ]
}, },
{ {
@@ -356,7 +356,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"#### Use LLM for tree search and identify nodes that might contain relevant context" "### Use LLM for tree search and identify nodes that might contain relevant context"
] ]
}, },
{ {
@@ -396,7 +396,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"#### Print retrieved nodes and reasoning process" "### Print retrieved nodes and reasoning process"
] ]
}, },
{ {
@@ -455,7 +455,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"#### Extract relevant context from retrieved nodes" "### Extract relevant context from retrieved nodes"
] ]
}, },
{ {
@@ -507,12 +507,12 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"#### Generate answer based on retrieved context" "### Generate answer based on retrieved context"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": 59,
"metadata": { "metadata": {
"colab": { "colab": {
"base_uri": "https://localhost:8080/", "base_uri": "https://localhost:8080/",
@@ -564,7 +564,7 @@
"id": "_1kaGD3GcG1O" "id": "_1kaGD3GcG1O"
}, },
"source": [ "source": [
"## 🎯 What's Next\n", "# 🎯 What's Next\n",
"\n", "\n",
"This notebook has demonstrated a basic, minimal example of **reasoning-based**, **vectorless** RAG with PageIndex. The workflow illustrates the core idea:\n", "This notebook has demonstrated a basic, minimal example of **reasoning-based**, **vectorless** RAG with PageIndex. The workflow illustrates the core idea:\n",
"> *Generating a hierarchical tree structure from a document, reasoning over that tree structure, and extracting relevant context, without relying on a vector database or top-k similarity search*.\n", "> *Generating a hierarchical tree structure from a document, reasoning over that tree structure, and extracting relevant context, without relying on a vector database or top-k similarity search*.\n",
@@ -581,7 +581,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 🔎 Learn More About PageIndex\n", "# 🔎 Learn More About PageIndex\n",
" <a href=\"https://vectify.ai\">🏠 Homepage</a>&nbsp; • &nbsp;\n", " <a href=\"https://vectify.ai\">🏠 Homepage</a>&nbsp; • &nbsp;\n",
" <a href=\"https://dash.pageindex.ai\">🖥️ Dashboard</a>&nbsp; • &nbsp;\n", " <a href=\"https://dash.pageindex.ai\">🖥️ Dashboard</a>&nbsp; • &nbsp;\n",
" <a href=\"https://docs.pageindex.ai/quickstart\">📚 API Docs</a>&nbsp; • &nbsp;\n", " <a href=\"https://docs.pageindex.ai/quickstart\">📚 API Docs</a>&nbsp; • &nbsp;\n",