update print

Ray 2025-08-21 22:10:04 +08:00
parent 368ab3873a
commit 90ff7b7d27
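The change below is mechanical: the notebook's `print_markdown` helper, which joined its arguments and rendered them through `IPython.display.Markdown`, is dropped in favor of plain `print` calls that work in any Python runtime. A minimal sketch of the before/after pattern (the `heading` function is a hypothetical stand-in for illustration; the notebook simply inlines `print(...)` directly):

```python
# Before this commit (requires an IPython display frontend):
# def print_markdown(*lines):
#     text = "\n".join(lines)
#     display(Markdown(text))
# print_markdown('## Retrieved Nodes', '---')

# After this commit: plain text headings, no IPython dependency.
def heading(title):
    # Hypothetical helper for illustration; the notebook inlines print(...).
    return f"{title}:"

print(heading("Retrieved Nodes"))
```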

@@ -125,7 +125,7 @@
},
{
"cell_type": "code",
"execution_count": 137,
"execution_count": 7,
"metadata": {
"id": "hmj3POkDcG1N"
},
@@ -154,10 +154,6 @@
" cleaned_tree = remove_fields(tree.copy(), exclude_fields)\n",
" pprint(cleaned_tree, sort_dicts=False, width=150)\n",
"\n",
"def print_markdown(*lines):\n",
" text = \"\\n\".join(lines)\n",
" display(Markdown(text))\n",
"\n",
"def create_node_mapping(tree):\n",
" \"\"\"Create a mapping of node_id to node for quick lookup\"\"\"\n",
" def get_all_nodes(tree):\n",
@@ -235,7 +231,7 @@
},
{
"cell_type": "code",
"execution_count": 138,
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -245,23 +241,11 @@
"outputId": "dc944660-38ad-47ea-d358-be422edbae53"
},
"outputs": [
{
"data": {
"text/markdown": [
"## Simplified Tree Structure of the Document\n",
"---"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Simplified Tree Structure of the Document:\n",
"[{'title': 'DeepSeek-R1: Incentivizing Reasoning Capability in...',\n",
" 'node_id': '0000',\n",
" 'prefix_summary': '# DeepSeek-R1: Incentivizing Reasoning Capability ...',\n",
@@ -323,7 +307,7 @@
"source": [
"if pi_client.is_retrieval_ready(doc_id):\n",
" tree = pi_client.get_tree(doc_id, node_summary=True)['result']\n",
" print_markdown('## Simplified Tree Structure of the Document', '---')\n",
" print('Simplified Tree Structure of the Document:')\n",
" print_tree(tree)\n",
"else:\n",
" print(\"Processing document, please try again later...\")"
@@ -397,48 +381,13 @@
"outputId": "6bb6d052-ef30-4716-f88e-be98bcb7ebdb"
},
"outputs": [
{
"data": {
"text/markdown": [
"## Reasoning Process\n",
"---"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"The question asks for the conclusions in the document. The most direct and relevant node is '5. Conclusion, Limitations, and Future Work' (node_id: 0019), as it explicitly contains the conclusion section. Additionally, the 'Abstract' (node_id: 0001) often summarizes the main findings and conclusions, and the 'Discussion' (node_id: 0018) may also contain concluding remarks or synthesis of results. However, the primary and most comprehensive source for conclusions is node 0019."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"## Retrieved Nodes\n",
"---"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Reasoning Process:\n",
"The question asks for the conclusions in the document. Typically, conclusions are found in sections explicitly titled 'Conclusion' or in combined sections such as 'Conclusion, Limitations, and Future Work.' In this document tree, node 0019 ('5. Conclusion, Limitations, and Future Work') is the most directly relevant. Additionally, sometimes the 'Abstract' (node 0001) and 'Discussion' (node 0018) sections may contain summary statements or high-level conclusions, but the primary and most likely location for formal conclusions is node 0019.\n",
"Retrieved Nodes:\n",
"Node ID: 0019\t Page: 16\t Title: 5. Conclusion, Limitations, and Future Work\n"
]
}
@@ -447,10 +396,10 @@
"node_map = create_node_mapping(tree)\n",
"tree_search_result_json = json.loads(tree_search_result)\n",
"\n",
"print_markdown('## Reasoning Process', '---')\n",
"print_markdown(tree_search_result_json['thinking'])\n",
"print('Reasoning Process:')\n",
"print(tree_search_result_json['thinking'])\n",
"\n",
"print_markdown('## Retrieved Nodes', '---')\n",
"print('Retrieved Nodes:')\n",
"for node_id in tree_search_result_json[\"node_list\"]:\n",
" node = node_map[node_id]\n",
" print(f\"Node ID: {node['node_id']}\\t Page: {node['page_index']}\\t Title: {node['title']}\")"
@@ -474,7 +423,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 31,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -485,43 +434,26 @@
},
"outputs": [
{
"data": {
"text/markdown": [
"## Retrieved Context\n",
"---"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"## 5. Conclusion, Limitations, and Future Work\n",
"\n",
"In this work, we share our journey in enhancing model reasoning abilities through reinforcement learning. DeepSeek-R1-Zero represents a pure RL approach without relying on cold-start data, achieving strong performance across various tasks. DeepSeek-R1 is more powerful, leveraging cold-start data alongside iterative RL fine-tuning. Ultimately, DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on a range of tasks.\n",
"\n",
"We further explore distilling the reasoning capability to small dense models. We use DeepSeek-R1 as the teacher model to generate 800K training samples, and fine-tune several small dense models. The results are promising: DeepSeek-R1-Distill-Qwen-1.5B outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks with $28.9 \\%$ on AIME and $83.9 \\%$ on MATH. Other dense models also achieve impressive results, significantly outperforming other instruction-tuned models based on the same underlying checkpoints.\n",
"\n",
"In the fut ..."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
"name": "stdout",
"output_type": "stream",
"text": [
"Retrieved Context:\n",
"## 5. Conclusion, Limitations, and Future Work\n",
"\n",
"In this work, we share our journey in enhancing model reasoning abilities through reinforcement learning. DeepSeek-R1-Zero represents a pure RL approach without relying on cold-start data, achieving strong performance across various tasks. DeepSeek-R1 is more powerful, leveraging cold-start data alongside iterative RL fine-tuning. Ultimately, DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on a range of tasks.\n",
"\n",
"We further explore distilling the reasoning capability to small dense models. We use DeepSeek-R1 as the teacher model to generate 800K training samples, and fine-tune several small dense models. The results are promising: DeepSeek-R1-Distill-Qwen-1.5B outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks with $28.9 \\%$ on AIME and $83.9 \\%$ on MATH. Other dense models also achieve impressive results, significantly outperforming other instruction-tuned models based on the same underlying checkpoints.\n",
"\n",
"In the fut ...\n"
]
}
],
"source": [
"node_list = json.loads(tree_search_result)[\"node_list\"]\n",
"relevant_content = \"\\n\\n\".join(node_map[node_id][\"text\"] for node_id in node_list)\n",
"\n",
"print_markdown('## Retrieved Context', '---')\n",
"print_markdown(relevant_content[:1000] + ' ...')"
"print('Retrieved Context:')\n",
"print(relevant_content[:1000] + ' ...')"
]
},
{
@@ -533,7 +465,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -544,34 +476,19 @@
},
"outputs": [
{
"data": {
"text/markdown": [
"## Generated Answer\n",
"---"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
"**Conclusions in this document:**\n",
"\n",
"- DeepSeek-R1-Zero, a pure reinforcement learning (RL) model without cold-start data, achieves strong performance across various tasks.\n",
"- DeepSeek-R1, which combines cold-start data with iterative RL fine-tuning, is even more powerful and achieves performance comparable to OpenAI-o1-1217 on a range of tasks.\n",
"- The reasoning capabilities of DeepSeek-R1 can be successfully distilled into smaller dense models, with DeepSeek-R1-Distill-Qwen-1.5B outperforming GPT-4o and Claude-3.5-Sonnet on math benchmarks.\n",
"- Other small dense models fine-tuned with DeepSeek-R1 data also significantly outperform other instruction-tuned models based on the same checkpoints."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
"name": "stdout",
"output_type": "stream",
"text": [
"Generated Answer:\n",
"**Conclusions in this document:**\n",
"\n",
"- DeepSeek-R1-Zero, a pure reinforcement learning (RL) model without cold-start data, achieves strong performance across various tasks.\n",
"- DeepSeek-R1, which combines cold-start data with iterative RL fine-tuning, is more powerful and achieves performance comparable to OpenAI-o1-1217 on a range of tasks.\n",
"- Distilling DeepSeek-R1's reasoning capabilities into smaller dense models is effective; for example, DeepSeek-R1-Distill-Qwen-1.5B outperforms GPT-4o and Claude-3.5-Sonnet on math benchmarks.\n",
"- Other small dense models fine-tuned with DeepSeek-R1 data also significantly outperform other instruction-tuned models based on the same checkpoints.\n",
"\n",
"These results demonstrate the effectiveness of RL-based approaches and distillation for enhancing model reasoning abilities and performance.\n"
]
}
],
"source": [
@@ -584,9 +501,9 @@
"Provide a clear, concise answer based only on the context provided.\n",
"\"\"\"\n",
"\n",
"print_markdown('## Generated Answer', '---')\n",
"print('Generated Answer:')\n",
"answer = await call_llm(answer_prompt)\n",
"print_markdown(answer)"
"print(answer)"
]
},
{