PageIndex/pageindex
BukeLy 6fd237986e Recognize whole-line bold as level-1 heading in markdown parser
extract_nodes_from_markdown now matches `**Title**` lines as level-1
headings (alongside ATX `#` headings) and records the heading level on
each node as it is produced. extract_node_text_content reads the level
from the node instead of re-running a `^#{1,6}` regex on the source
line; that regex never matched bold lines, so it was silently dropping
bold-heading nodes from OCR / MinerU output.

Bold maps to level 1 even when mixed with `#` / `##` / `###`:
bold-as-heading is a courtesy heuristic for non-ATX markdown sources,
and CommonMark has no concept of bold heading depth.
2026-04-28 15:23:34 +08:00
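
The behaviour described in the commit message can be illustrated with a minimal sketch. This is not the code from page_index_md.py: the function name `extract_headings`, the regexes, and the node keys (`title`, `level`, `line_num`) are assumptions made for illustration. It shows the two ideas the message names: a whole-line `**Title**` is treated as a level-1 heading, and the level is stored on the node when it is created so later stages do not need to re-parse the source line.

```python
import re

# Hypothetical patterns, not taken from page_index_md.py.
ATX_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")      # "# Title" .. "###### Title"
BOLD_LINE_RE = re.compile(r"^\*\*([^*]+)\*\*$")    # a line that is only "**Title**"


def extract_headings(markdown_text):
    """Return heading nodes with the level attached at creation time."""
    nodes = []
    in_code_block = False
    for line_num, raw in enumerate(markdown_text.split("\n"), start=1):
        line = raw.strip()
        # Skip fenced code so "# comment" lines inside it are not headings.
        if line.startswith("```"):
            in_code_block = not in_code_block
            continue
        if in_code_block or not line:
            continue
        atx = ATX_RE.match(line)
        if atx:
            nodes.append({"title": atx.group(2),
                          "level": len(atx.group(1)),
                          "line_num": line_num})
            continue
        bold = BOLD_LINE_RE.match(line)
        if bold:
            # Bold headings always map to level 1 in this sketch.
            nodes.append({"title": bold.group(1).strip(),
                          "level": 1,
                          "line_num": line_num})
    return nodes


doc = "**Intro**\n\n## Methods\nSome **bold** text.\n**Results**\n"
for node in extract_headings(doc):
    print(node["level"], node["title"])
# 1 Intro
# 2 Methods
# 1 Results
```

Because each node carries its own `level`, a consumer can read `node["level"]` directly. Inline bold inside a sentence (`Some **bold** text.`) is not matched, since the bold pattern must span the whole line.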
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125) | 2026-03-26 23:19:50 +08:00 |
| `client.py` | Disable agent tracing and auto-add litellm/ prefix for retrieve_model | 2026-03-29 00:55:57 +08:00 |
| `config.yaml` | Disable agent tracing and auto-add litellm/ prefix for retrieve_model | 2026-03-29 00:55:57 +08:00 |
| `page_index.py` | Restructure examples directory and improve document storage (#189) | 2026-03-28 04:28:59 +08:00 |
| `page_index_md.py` | Recognize whole-line bold as level-1 heading in markdown parser | 2026-04-28 15:23:34 +08:00 |
| `retrieve.py` | Restructure examples directory and improve document storage (#189) | 2026-03-28 04:28:59 +08:00 |
| `utils.py` | Add PageIndexClient with agent-based retrieval via OpenAI Agents SDK (#125) | 2026-03-26 23:19:50 +08:00 |