Merge branch 'geekan:main' into feat_st_game

2026-06-05 14:55:18 +02:00 · 2024-03-26 09:51:02 +08:00 · 2024-03-26 09:51:02 +08:00 · 64350d2c6d
commit 64350d2c6d
parent fac1d35223 12948a5482
303 changed files with 9155 additions and 3587 deletions
--- a/examples/crawl_webpage.py
+++ b/examples/crawl_webpage.py
@ -1,22 +0,0 @@
-# -*- encoding: utf-8 -*-
-"""
-@Date    :   2024/01/24 15:11:27
-@Author  :   orange-crow
-@File    :   crawl_webpage.py
-"""
-
-from metagpt.roles.ci.code_interpreter import CodeInterpreter
-
-
-async def main():
-    prompt = """Get data from `paperlist` table in https://papercopilot.com/statistics/iclr-statistics/iclr-2024-statistics/,
-    and save it to a csv file. paper title must include `multiagent` or `large language model`. *notice: print key variables*"""
-    ci = CodeInterpreter(goal=prompt, use_tools=True)
-
-    await ci.run(prompt)
-
-
-if __name__ == "__main__":
-    import asyncio
-
-    asyncio.run(main())
--- a/examples/data/rag/travel.txt
+++ b/examples/data/rag/travel.txt
@ -0,0 +1 @@
+Bob likes traveling.
--- a/examples/data/rag/writer.txt
+++ b/examples/data/rag/writer.txt
@ -0,0 +1,109 @@
+Productivity
+I think I am at least somewhat more productive than average, and people sometimes ask me for productivity tips.  So I decided to just write them all down in one place.
+
+Compound growth gets discussed as a financial concept, but it works in careers as well, and it is magic.  A small productivity gain, compounded over 50 years, is worth a lot.  So it’s worth figuring out how to optimize productivity. If you get 10% more done and 1% better every day compared to someone else, the compounded difference is massive. 
+
+What you work on
+
+Famous writers have some essential qualities, creativity and discipline
+
+It doesn’t matter how fast you move if it’s in a worthless direction.  Picking the right thing to work on is the most important element of productivity and usually almost ignored.  So think about it more!  Independent thought is hard but it’s something you can get better at with practice.
+
+The most impressive people I know have strong beliefs about the world, which is rare in the general population.  If you find yourself always agreeing with whomever you last spoke with, that’s bad.  You will of course be wrong sometimes, but develop the confidence to stick with your convictions.  It will let you be courageous when you’re right about something important that most people don’t see.
+
+I make sure to leave enough time in my schedule to think about what to work on.  The best ways for me to do this are reading books, hanging out with interesting people, and spending time in nature.
+
+I’ve learned that I can’t be very productive working on things I don’t care about or don’t like.  So I just try not to put myself in a position where I have to do them (by delegating, avoiding, or something else).  Stuff that you don’t like is a painful drag on morale and momentum.
+
+By the way, here is an important lesson about delegation: remember that everyone else is also most productive when they’re doing what they like, and do what you’d want other people to do for you—try to figure out who likes (and is good at) doing what, and delegate that way.  
+
+If you find yourself not liking what you’re doing for a long period of time, seriously consider a major job change.  Short-term burnout happens, but if it isn’t resolved with some time off, maybe it’s time to do something you’re more interested in. 
+
+I’ve been very fortunate to find work I like so much I’d do it for free, which makes it easy to be really productive.
+
+It’s important to learn that you can learn anything you want, and that you can get better quickly.  This feels like an unlikely miracle the first few times it happens, but eventually you learn to trust that you can do it.
+
+Doing great work usually requires colleagues of some sort.  Try to be around smart, productive, happy, and positive people that don’t belittle your ambitions.  I love being around people who push me and inspire me to be better.  To the degree you are able to, avoid the opposite kind of people—the cost of letting them take up your mental cycles is horrific. 
+
+You have to both pick the right problem and do the work.  There aren’t many shortcuts.  If you’re going to do something really important, you are very likely going to work both smart and hard.  The biggest prizes are heavily competed for.  This isn’t true in every field (there are great mathematicians who never spend that many hours a week working) but it is in most.
+
+Prioritization
+
+Writers have to work hard to be successful 
+
+My system has three key pillars: “Make sure to get the important shit done”, “Don’t waste time on stupid shit”, and “make a lot of lists”.
+
+I highly recommend using lists.  I make lists of what I want to accomplish each year, each month, and each day.  Lists are very focusing, and they help me with multitasking because I don’t have to keep as much in my head.  If I’m not in the mood for some particular task, I can always find something else I’m excited to do.
+
+I prefer lists written down on paper.  It’s easy to add and remove tasks.  I can access them during meetings without feeling rude.  I re-transcribe lists frequently, which forces me to think about everything on the list and gives me an opportunity to add and remove items.
+
+I don’t bother with categorization or trying to size tasks or anything like that (the most I do is put a star next to really important items).  
+
+I try to prioritize in a way that generates momentum.  The more I get done, the better I feel, and then the more I get done.  I like to start and end each day with something I can really make progress on.
+
+I am relentless about getting my most important projects done—I’ve found that if I really want something to happen and I push hard enough, it usually happens. 
+
+I try to be ruthless about saying no to stuff, and doing non-critical things in the quickest way possible.  I probably take this too far—for example, I am almost sure I am terse to the point of rudeness when replying to emails.
+
+Passion and adaptability are key qualities to writers
+
+I generally try to avoid meetings and conferences as I find the time cost to be huge—I get the most value out of time in my office.  However, it is critical that you keep enough space in your schedule to allow for chance encounters and exposure to new people and ideas.  Having an open network is valuable; though probably 90% of the random meetings I take are a waste of time, the other 10% really make up for it.
+
+I find most meetings are best scheduled for 15-20 minutes, or 2 hours.  The default of 1 hour is usually wrong, and leads to a lot of wasted time.
+
+I have different times of day I try to use for different kinds of work.  The first few hours of the morning are definitely my most productive time of the day, so I don’t let anyone schedule anything then.  I try to do meetings in the afternoon.  I take a break, or switch tasks, whenever I feel my attention starting to fade. 
+
+I don’t think most people value their time enough—I am surprised by the number of people I know who make $100 an hour and yet will spend a couple of hours doing something they don’t want to do to save $20.
+
+Also, don’t fall into the trap of productivity porn—chasing productivity for its own sake isn’t helpful.  Many people spend too much time thinking about how to perfectly optimize their system, and not nearly enough asking if they’re working on the right problems.  It doesn’t matter what system you use or if you squeeze out every second if you’re working on the wrong thing.
+
+The right goal is to allocate your year optimally, not your day.
+
+Physical factors
+
+Very likely what is optimal for me won’t be optimal for you.  You’ll have to experiment to find out what works best for your body.  It’s definitely worth doing—it helps in all aspects of life, and you’ll feel a lot better and happier overall.
+
+It probably took a little bit of my time every week for a few years to arrive at what works best for me, but my sense is if I do a good job at all the below I’m at least 1.5x more productive than if not.
+
+Sleep seems to be the most important physical factor in productivity for me.  Some sort of sleep tracker to figure out how to sleep best is helpful.  I’ve found the only thing I’m consistent with are in the set-it-and-forget-it category, and I really like the Emfit QS+Active.
+
+I like a cold, dark, quiet room, and a great mattress (I resisted spending a bunch of money on a great mattress for years, which was stupid—it makes a huge difference to my sleep quality.  I love this one).  Not eating a lot in the few hours before sleep helps.  Not drinking alcohol helps a lot, though I’m not willing to do that all the time.
+
+I use a Chili Pad to be cold while I sleep if I can’t get the room cold enough, which is great but loud (I set it up to have the cooler unit outside my room).
+
+When traveling, I use an eye mask and ear plugs.
+
+Writers usually have empathy to write good books.
+
+This is likely to be controversial, but I take a low dose of sleeping pills (like a third of a normal dose) or a very low dose of cannabis whenever I can’t sleep.  I am a bad sleeper in general, and a particularly bad sleeper when I travel.  It likely has tradeoffs, but so does not sleeping well.  If you can already sleep well, I wouldn’t recommend this.
+
+I use a full spectrum LED light most mornings for about 10-15 minutes while I catch up on email.  It’s great—if you try nothing else in here, this is the thing I’d try.  It’s a ridiculous gain for me.  I like this one, and it’s easy to travel with.
+
+Exercise is probably the second most important physical factor.  I tried a number of different exercise programs for a few months each and the one that seemed best was lifting heavy weights 3x a week for an hour, and high intensity interval training occasionally.  In addition to productivity gains, this is also the exercise program that makes me feel the best overall.  
+
+The third area is nutrition.  I very rarely eat breakfast, so I get about 15 hours of fasting most days (except an espresso when I wake up).  I know this is contrary to most advice, and I suspect it’s not optimal for most people, but it definitely works well for me.
+
+Eating lots of sugar is the thing that makes me feel the worst and that I try hardest to avoid.  I also try to avoid foods that aggravate my digestion or spike up inflammation (for example, very spicy foods).  I don’t have much willpower when it comes to sweet things, so I mostly just try to keep junk food out of the house.
+
+I have one big shot of espresso immediately when I wake up and one after lunch.  I assume this is about 200mg total of caffeine per day.  I tried a few other configurations; this was the one that worked by far the best.  I otherwise aggressively avoid stimulants, but I will have more coffee if I’m super tired and really need to get something done.
+
+If a writer wants to be super, they should include innovative thinking.
+
+I’m vegetarian and have been since I was a kid, and I supplement methyl B-12, Omega-3, Iron, and Vitamin D-3.  I got to this list with a year or so of quarterly blood tests; it’s worked for me ever since (I re-test maybe every year and a half or so).  There are many doctors who will happily work with you on a super comprehensive blood test (and services like WellnessFX).  I also go out of my way to drink a lot of protein shakes, which I hate and I wouldn’t do if I weren’t vegetarian.
+
+Other stuff
+
+Here’s what I like in a workspace: natural light, quiet, knowing that I won’t be interrupted if I don’t want to be, long blocks of time, and being comfortable and relaxed (I’ve got a beautiful desk with a couple of 4k monitors on it in my office, but I spend almost all my time on my couch with my laptop).
+
+I wrote custom software for the annoying things I have to do frequently, which is great.  I also made an effort to learn to type really fast and the keyboard shortcuts that help with my workflow.
+
+Like most people, I sometimes go through periods of a week or two where I just have no motivation to do anything (I suspect it may have something to do with nutrition).  This sucks and always seems to happen at inconvenient times.  I have not figured out what to do about it besides wait for the fog to lift, and to trust that eventually it always does.  And I generally try to avoid people and situations that put me in bad moods, which is good advice whether you care about productivity or not.
+
+In general, I think it’s good to overcommit a little bit.  I find that I generally get done what I take on, and if I have a little bit too much to do it makes me more efficient at everything, which is a way to train to avoid distractions (a great habit to build!).  However, overcommitting a lot is disastrous.
+
+Don’t neglect your family and friends for the sake of productivity—that’s a very stupid tradeoff (and very likely a net productivity loss, because you’ll be less happy).  Don’t neglect doing things you love or that clear your head either.
+
+Finally, to repeat one more time: productivity in the wrong direction isn’t worth anything at all.  Think more about what to work on.
+
+Open-Mindedness and curiosity are essential to writers
+
--- a/examples/data/search_kb/example.json
+++ b/examples/data/search_kb/example.json
--- a/examples/data/search_kb/example.xlsx
+++ b/examples/data/search_kb/example.xlsx
--- a/examples/debate.py
+++ b/examples/debate.py
@ -5,6 +5,7 @@ Author: garylin2099
@Modified By: mashenquan, 2023-11-1. In accordance with Chapter 2.1.3 of RFC 116, modify the data type of the `send_to`
        value of the `Message` object; modify the argument type of `get_by_actions`.
 """
+
 import asyncio
 import platform
 from typing import Any
@ -105,4 +106,4 @@ def main(idea: str, investment: float = 3.0, n_round: int = 10):


 if __name__ == "__main__":
-    fire.Fire(main)
+    fire.Fire(main)  # run as python debate.py --idea="TOPIC" --investment=3.0 --n_round=5
--- a/examples/debate_simple.py
+++ b/examples/debate_simple.py
@ -8,14 +8,17 @@
 import asyncio

 from metagpt.actions import Action
+from metagpt.config2 import Config
 from metagpt.environment import Environment
 from metagpt.roles import Role
 from metagpt.team import Team

-action1 = Action(name="AlexSay", instruction="Express your opinion with emotion and don't repeat it")
-action1.llm.model = "gpt-4-1106-preview"
-action2 = Action(name="BobSay", instruction="Express your opinion with emotion and don't repeat it")
-action2.llm.model = "gpt-3.5-turbo-1106"
+gpt35 = Config.default()
+gpt35.llm.model = "gpt-3.5-turbo-1106"
+gpt4 = Config.default()
+gpt4.llm.model = "gpt-4-1106-preview"
+action1 = Action(config=gpt4, name="AlexSay", instruction="Express your opinion with emotion and don't repeat it")
+action2 = Action(config=gpt35, name="BobSay", instruction="Express your opinion with emotion and don't repeat it")
 alex = Role(name="Alex", profile="Democratic candidate", goal="Win the election", actions=[action1], watch=[action2])
 bob = Role(name="Bob", profile="Republican candidate", goal="Win the election", actions=[action2], watch=[action1])
 env = Environment(desc="US election live broadcast")
--- a/examples/di/README.md
+++ b/examples/di/README.md
@ -0,0 +1,20 @@
+# Data Interpreter (DI)
+
+## What is Data Interpreter
+Data Interpreter is an agent that solves data-related problems through code. It understands user requirements, makes plans, writes code for execution, and uses tools if necessary. These capabilities enable it to tackle a wide range of scenarios; please check out the examples below. For overall design and technical details, please see our [paper](https://arxiv.org/abs/2402.18679).
+
+## Example List
+- Data visualization
+- Machine learning modeling
+- Image background removal
+- Solve math problems
+- Receipt OCR
+- Tool usage: web page imitation
+- Tool usage: web crawling
+- Tool usage: text2image
+- Tool usage: email summarization and response
+- More on the way!
+
+Please see the [docs](https://docs.deepwisdom.ai/main/en/guide/use_cases/agent/interpreter/intro.html) for more explanation.
+
+We are continuously releasing code; stay tuned!
--- a/examples/di/arxiv_reader.py
+++ b/examples/di/arxiv_reader.py
@ -0,0 +1,21 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main():
+    template = "https://arxiv.org/list/{tag}/pastweek?skip=0&show=300"
+    tags = ["cs.ai", "cs.cl", "cs.lg", "cs.se"]
+    urls = [template.format(tag=tag) for tag in tags]
+    prompt = f"""This is a collection of arxiv urls: '{urls}' .
+Record each article, remove duplicates by title (they may have multiple tags), filter out papers related to 
+large language model / agent / llm, print top 100 and visualize the word count of the titles"""
+    di = DataInterpreter(react_mode="react", tools=["scrape_web_playwright"])
+
+    await di.run(prompt)
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    asyncio.run(main())
--- a/examples/di/crawl_webpage.py
+++ b/examples/di/crawl_webpage.py
@ -0,0 +1,40 @@
+# -*- encoding: utf-8 -*-
+"""
+@Date    :   2024/01/24 15:11:27
+@Author  :   orange-crow
+@File    :   crawl_webpage.py
+"""
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+PAPER_LIST_REQ = """
+Get data from `paperlist` table in https://papercopilot.com/statistics/iclr-statistics/iclr-2024-statistics/,
+and save it to a csv file. paper title must include `multiagent` or `large language model`. *notice: print key variables*
+"""
+
+ECOMMERCE_REQ = """
+Get products data from website https://scrapeme.live/shop/ and save it as a csv file.
+**Notice: Firstly parse the web page encoding and the text HTML structure;
+The first page product name, price, product URL, and image URL must be saved in the csv;**
+"""
+
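+NEWS_36KR_REQ = """Get all startup financing information from the 36kr venture platform https://pitchhub.36kr.com/financing-flash, **Note: this is a Chinese website**;
+The following is a rough workflow; adjust the tasks in the current plan as appropriate based on the result of each step:
+1. Crawl the page and save the html structure locally;
+2. Directly print the 2000 characters of html content following the 7th occurrence of the *`快讯`* keyword, as an *example of the news flash html content*;
+3. Reflect on the patterns in the *example of the news flash html content*, and design regular expressions to extract the title, link, and time of each *`快讯`*;
+4. Filter the startup financing *`快讯`* from the last 3 days and print the first 5 as a list[dict].
+5. Save all results to a local csv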
+"""
+
+
+async def main():
+    di = DataInterpreter(tools=["scrape_web_playwright"])
+
+    await di.run(ECOMMERCE_REQ)
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    asyncio.run(main())
--- a/examples/di/custom_tool.py
+++ b/examples/di/custom_tool.py
@ -0,0 +1,36 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+"""
+@Time    : 2024/3/22 10:54
+@Author  : alexanderwu
+@File    : custom_tool.py
+"""
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+from metagpt.tools.tool_registry import register_tool
+
+
+@register_tool()
+def magic_function(arg1: str, arg2: int) -> dict:
+    """
+    The magic function that does something.
+
+    Args:
+        arg1 (str): ...
+        arg2 (int): ...
+
+    Returns:
+        dict: ...
+    """
+    return {"arg1": arg1 * 3, "arg2": arg2 * 5}
+
+
+async def main():
+    di = DataInterpreter(tools=["magic_function"])
+    await di.run("Just call the magic function with arg1 'A' and arg2 2. Tell me the result.")
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    asyncio.run(main())
--- a/examples/di/data_visualization.py
+++ b/examples/di/data_visualization.py
@ -0,0 +1,17 @@
+import asyncio
+
+from metagpt.logs import logger
+from metagpt.roles.di.data_interpreter import DataInterpreter
+from metagpt.utils.recovery_util import save_history
+
+
+async def main(requirement: str = ""):
+    di = DataInterpreter()
+    rsp = await di.run(requirement)
+    logger.info(rsp)
+    save_history(role=di)
+
+
+if __name__ == "__main__":
+    requirement = "Run data analysis on sklearn Iris dataset, include a plot"
+    asyncio.run(main(requirement))
--- a/examples/di/email_summary.py
+++ b/examples/di/email_summary.py
@ -0,0 +1,33 @@
+# -*- encoding: utf-8 -*-
+"""
+@Date    :   2024/02/07 
+@Author  :   Tuo Zhou
+@File    :   email_summary.py
+"""
+import os
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main():
+    email_account = "your_email_account"
+    # your password will stay only on your device and not go to LLM api
+    os.environ["email_password"] = "your_email_password"
+
+    ### Prompt for automatic email reply, uncomment to try this too ###
+    # prompt = f"""I will give you your Outlook email account ({email_account}) and password (email_password item in the environment variable). You need to find the latest email in my inbox with the sender's suffix @gmail.com and reply "Thank you! I have received your email~"""""
+
+    ### Prompt for automatic email summary ###
+    prompt = f"""I will give you your Outlook email account ({email_account}) and password (email_password item in the environment variable).
+            Firstly, Please help me fetch the latest 5 senders and full letter contents.
+            Then, summarize each of the 5 emails into one sentence (you can do this by yourself, no need to import other models to do this) and output them in a markdown format."""
+
+    di = DataInterpreter()
+
+    await di.run(prompt)
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    asyncio.run(main())
--- a/examples/di/imitate_webpage.py
+++ b/examples/di/imitate_webpage.py
@ -5,19 +5,18 @@
@Author  : mannaandpoem
@File    : imitate_webpage.py
 """
-from metagpt.roles.ci.code_interpreter import CodeInterpreter
+from metagpt.roles.di.data_interpreter import DataInterpreter


 async def main():
    web_url = "https://pytorch.org/"
    prompt = f"""This is a URL of webpage: '{web_url}' .
 Firstly, utilize Selenium and WebDriver for rendering. 
-Secondly, convert image to a webpage including HTML, CSS and JS in one go. 
-Finally, save webpage in a text file. 
+Secondly, convert image to a webpage including HTML, CSS and JS in one go.
 Note: All required dependencies and environments have been fully installed and configured."""
-    ci = CodeInterpreter(goal=prompt, use_tools=True)
+    di = DataInterpreter(tools=["GPTvGenerator"])

-    await ci.run(prompt)
+    await di.run(prompt)


 if __name__ == "__main__":
--- a/examples/di/machine_learning.py
+++ b/examples/di/machine_learning.py
@ -0,0 +1,23 @@
+import fire
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+WINE_REQ = "Run data analysis on sklearn Wine recognition dataset, include a plot, and train a model to predict wine class (20% as validation), and show validation accuracy."
+
+DATA_DIR = "path/to/your/data"
+# sales_forecast data from https://www.kaggle.com/datasets/aslanahmedov/walmart-sales-forecast/data
+SALES_FORECAST_REQ = f"""Train a model to predict sales for each department in every store (use the last 40 weeks of records as the validation dataset and the rest as the training dataset), plot total sales trends, print the metric, and plot scatter plots of
+ground truth and predictions on validation data. Dataset is {DATA_DIR}/train.csv, the metric is weighted mean absolute error (WMAE) for test data. Notice: *print* key variables to get more information for the next task step.
+"""
+
+REQUIREMENTS = {"wine": WINE_REQ, "sales_forecast": SALES_FORECAST_REQ}
+
+
+async def main(use_case: str = "wine"):
+    mi = DataInterpreter()
+    requirement = REQUIREMENTS[use_case]
+    await mi.run(requirement)
+
+
+if __name__ == "__main__":
+    fire.Fire(main)
--- a/examples/di/machine_learning_with_tools.py
+++ b/examples/di/machine_learning_with_tools.py
@ -0,0 +1,16 @@
+import asyncio
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main(requirement: str):
+    role = DataInterpreter(use_reflection=True, tools=["<all>"])
+    await role.run(requirement)
+
+
+if __name__ == "__main__":
+    data_path = "your/path/to/titanic"
+    train_path = f"{data_path}/split_train.csv"
+    eval_path = f"{data_path}/split_eval.csv"
+    requirement = f"This is a titanic passenger survival dataset, your goal is to predict passenger survival outcome. The target column is Survived. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report accuracy on the eval data. Train data path: '{train_path}', eval data path: '{eval_path}'."
+    asyncio.run(main(requirement))
--- a/examples/di/ocr_receipt.py
+++ b/examples/di/ocr_receipt.py
@ -0,0 +1,21 @@
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main():
+    # Notice: pip install metagpt[ocr] before using this example
+    image_path = "image.jpg"
+    language = "English"
+    requirement = f"""This is a {language} receipt image.
+    Your goal is to perform OCR on images using PaddleOCR, output text content from the OCR results and discard 
+    coordinates and confidence levels, then recognize the total amount from ocr text content, and finally save as table. 
+    Image path: {image_path}.
+    NOTE: The environments for Paddle and PaddleOCR are ready and have been fully installed."""
+    di = DataInterpreter()
+
+    await di.run(requirement)
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    asyncio.run(main())
--- a/examples/di/rm_image_background.py
+++ b/examples/di/rm_image_background.py
@ -0,0 +1,15 @@
+import asyncio
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main(requirement: str = ""):
+    di = DataInterpreter()
+    await di.run(requirement)
+
+
+if __name__ == "__main__":
+    image_path = "/your/path/to/the/image.jpeg"
+    save_path = "/your/intended/save/path/for/image_rm_bg.png"
+    requirement = f"This is an image, you need to use the python toolkit rembg to remove the background of the image and save the result. image path:{image_path}; save path:{save_path}."
+    asyncio.run(main(requirement))
--- a/examples/di/sd_tool_usage.py
+++ b/examples/di/sd_tool_usage.py
@ -4,12 +4,12 @@
 # @Desc    :
 import asyncio

-from metagpt.roles.ci.code_interpreter import CodeInterpreter
+from metagpt.roles.di.data_interpreter import DataInterpreter


 async def main(requirement: str = ""):
-    code_interpreter = CodeInterpreter(use_tools=True, goal=requirement)
-    await code_interpreter.run(requirement)
+    di = DataInterpreter(tools=["SDEngine"])
+    await di.run(requirement)


 if __name__ == "__main__":
--- a/examples/di/solve_math_problems.py
+++ b/examples/di/solve_math_problems.py
@ -0,0 +1,14 @@
+import asyncio
+
+from metagpt.roles.di.data_interpreter import DataInterpreter
+
+
+async def main(requirement: str = ""):
+    di = DataInterpreter()
+    await di.run(requirement)
+
+
+if __name__ == "__main__":
+    requirement = "Solve this math problem: The greatest common divisor of positive integers m and n is 6. The least common multiple of m and n is 126. What is the least possible value of m + n?"
+    # answer: 60 (m = 18, n = 42)
+    asyncio.run(main(requirement))
--- a/examples/llm_hello_world.py
+++ b/examples/llm_hello_world.py
@ -6,16 +6,25 @@
@File    : llm_hello_world.py
 """
 import asyncio
-from pathlib import Path

 from metagpt.llm import LLM
 from metagpt.logs import logger
-from metagpt.utils.common import encode_image


 async def main():
    llm = LLM()
-    logger.info(await llm.aask("hello world"))
+    # llm type check
+    question = "what's your name"
+    logger.info(f"{question}: ")
+    logger.info(await llm.aask(question))
+    logger.info("\n\n")
+
+    logger.info(
+        await llm.aask(
+            "who are you", system_msgs=["act as a robot, just answer 'I am a robot' if the question is 'who are you'"]
+        )
+    )
+
    logger.info(await llm.aask_batch(["hi", "write python hello world."]))

    hello_msg = [{"role": "user", "content": "count from 1 to 10. split by newline."}]
@ -29,12 +38,6 @@ async def main():
    if hasattr(llm, "completion"):
        logger.info(llm.completion(hello_msg))

-    # check if the configured llm supports llm-vision capacity. If not, it will throw a error
-    invoice_path = Path(__file__).parent.joinpath("..", "tests", "data", "invoices", "invoice-2.png")
-    img_base64 = encode_image(invoice_path)
-    res = await llm.aask(msg="if this is a invoice, just return True else return False", images=[img_base64])
-    assert "true" in res.lower()
-

 if __name__ == "__main__":
    asyncio.run(main())
--- a/examples/llm_vision.py
+++ b/examples/llm_vision.py
@ -0,0 +1,23 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# @Desc   : example of using the LLM vision capability
+
+import asyncio
+from pathlib import Path
+
+from metagpt.llm import LLM
+from metagpt.utils.common import encode_image
+
+
+async def main():
+    llm = LLM()
+
+    # Check whether the configured LLM supports vision input. If not, it will raise an error
+    invoice_path = Path(__file__).parent.joinpath("..", "tests", "data", "invoices", "invoice-2.png")
+    img_base64 = encode_image(invoice_path)
+    res = await llm.aask(msg="if this is an invoice, just return True else return False", images=[img_base64])
+    assert "true" in res.lower()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
--- a/examples/rag_pipeline.py
+++ b/examples/rag_pipeline.py
@ -0,0 +1,211 @@
+"""RAG pipeline"""
+
+import asyncio
+
+from pydantic import BaseModel
+
+from metagpt.const import DATA_PATH, EXAMPLE_DATA_PATH
+from metagpt.logs import logger
+from metagpt.rag.engines import SimpleEngine
+from metagpt.rag.schema import (
+    BM25RetrieverConfig,
+    ChromaIndexConfig,
+    ChromaRetrieverConfig,
+    FAISSRetrieverConfig,
+    LLMRankerConfig,
+)
+
+DOC_PATH = EXAMPLE_DATA_PATH / "rag/writer.txt"
+QUESTION = "What are key qualities to be a good writer?"
+
+TRAVEL_DOC_PATH = EXAMPLE_DATA_PATH / "rag/travel.txt"
+TRAVEL_QUESTION = "What does Bob like?"
+
+LLM_TIP = "If you are not sure, just answer I don't know."
+
+
+class Player(BaseModel):
+    """To demonstrate rag add objs."""
+
+    name: str = ""
+    goal: str = "Win The 100-meter Sprint."
+    tool: str = "Red Bull Energy Drink."
+
+    def rag_key(self) -> str:
+        """For search"""
+        return self.goal
+
+
+class RAGExample:
+    """Show how to use RAG."""
+
+    def __init__(self):
+        self.engine = SimpleEngine.from_docs(
+            input_files=[DOC_PATH],
+            retriever_configs=[FAISSRetrieverConfig(), BM25RetrieverConfig()],
+            ranker_configs=[LLMRankerConfig()],
+        )
+
+    async def run_pipeline(self, question=QUESTION, print_title=True):
+        """Run the RAG pipeline with FAISS and BM25 retrievers and an LLM ranker. Prints something like:
+
+        Retrieve Result:
+        0. Productivi..., 10.0
+        1. I wrote cu..., 7.0
+        2. I highly r..., 5.0
+
+        Query Result:
+        Passion, adaptability, open-mindedness, creativity, discipline, and empathy are key qualities to be a good writer.
+        """
+        if print_title:
+            self._print_title("Run Pipeline")
+
+        nodes = await self.engine.aretrieve(question)
+        self._print_retrieve_result(nodes)
+
+        answer = await self.engine.aquery(question)
+        self._print_query_result(answer)
+
+    async def add_docs(self):
+        """Show how to add docs.
+
+        Before the docs are added, the LLM answers that it doesn't know.
+        After the docs are added, the LLM gives the correct answer. Prints something like:
+
+        [Before add docs]
+        Retrieve Result:
+
+        Query Result:
+        Empty Response
+
+        [After add docs]
+        Retrieve Result:
+        0. Bob like..., 10.0
+
+        Query Result:
+        Bob likes traveling.
+        """
+        self._print_title("Add Docs")
+
+        travel_question = f"{TRAVEL_QUESTION}{LLM_TIP}"
+        travel_filepath = TRAVEL_DOC_PATH
+
+        logger.info("[Before add docs]")
+        await self.run_pipeline(question=travel_question, print_title=False)
+
+        logger.info("[After add docs]")
+        self.engine.add_docs([travel_filepath])
+        await self.run_pipeline(question=travel_question, print_title=False)
+
+    async def add_objects(self, print_title=True):
+        """Show how to add objects.
+
+        Before the objects are added, the engine retrieves nothing.
+        After the objects are added, the engine retrieves the matching object. Prints something like:
+
+        [Before add objs]
+        Retrieve Result:
+
+        [After add objs]
+        Retrieve Result:
+        0. 100m Sprin..., 10.0
+
+        [Object Detail]
+        {'name': 'Mike', 'goal': 'Win The 100-meter Sprint', 'tool': 'Red Bull Energy Drink'}
+        """
+        if print_title:
+            self._print_title("Add Objects")
+
+        player = Player(name="Mike")
+        question = f"{player.rag_key()}"
+
+        logger.info("[Before add objs]")
+        await self._retrieve_and_print(question)
+
+        logger.info("[After add objs]")
+        self.engine.add_objs([player])
+
+        try:
+            nodes = await self._retrieve_and_print(question)
+
+            logger.info("[Object Detail]")
+            player: Player = nodes[0].metadata["obj"]
+            logger.info(player.name)
+        except Exception as e:
+            logger.error(f"nodes is empty, the LLM didn't answer correctly, exception: {e}")
+
+    async def init_objects(self):
+        """Show how to build an engine from objects. Prints something like:
+
+        Same as add_objects.
+        """
+        self._print_title("Init Objects")
+
+        pre_engine = self.engine
+        self.engine = SimpleEngine.from_objs(retriever_configs=[FAISSRetrieverConfig()])
+        await self.add_objects(print_title=False)
+        self.engine = pre_engine
+
+    async def init_and_query_chromadb(self):
+        """Show how to use ChromaDB and how to save and load an index. Prints something like:
+
+        Query Result:
+        Bob likes traveling.
+        """
+        self._print_title("Init And Query ChromaDB")
+
+        # save index
+        output_dir = DATA_PATH / "rag"
+        SimpleEngine.from_docs(
+            input_files=[TRAVEL_DOC_PATH],
+            retriever_configs=[ChromaRetrieverConfig(persist_path=output_dir)],
+        )
+
+        # load index
+        engine = SimpleEngine.from_index(
+            index_config=ChromaIndexConfig(persist_path=output_dir),
+        )
+
+        # query
+        answer = await engine.aquery(TRAVEL_QUESTION)
+        self._print_query_result(answer)
+
+    @staticmethod
+    def _print_title(title):
+        logger.info(f"{'#'*30} {title} {'#'*30}")
+
+    @staticmethod
+    def _print_retrieve_result(result):
+        """Print retrieve result."""
+        logger.info("Retrieve Result:")
+
+        for i, node in enumerate(result):
+            logger.info(f"{i}. {node.text[:10]}..., {node.score}")
+
+        logger.info("")
+
+    @staticmethod
+    def _print_query_result(result):
+        """Print query result."""
+        logger.info("Query Result:")
+
+        logger.info(f"{result}\n")
+
+    async def _retrieve_and_print(self, question):
+        nodes = await self.engine.aretrieve(question)
+        self._print_retrieve_result(nodes)
+        return nodes
+
+
+async def main():
+    """RAG pipeline"""
+    e = RAGExample()
+    await e.run_pipeline()
+    await e.add_docs()
+    await e.add_objects()
+    await e.init_objects()
+    await e.init_and_query_chromadb()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
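Reviewer note: the sample outputs in the docstrings above follow the format produced by `_print_retrieve_result`, which logs the index, the first 10 characters of the node text, and the score. The sketch below reproduces only that formatting so the truncation can be checked without an engine or an LLM. `FakeNode` and `format_retrieve_result` are hypothetical names used for illustration; they are not part of this commit or of MetaGPT.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node; only `text` and `score` are used."""

    text: str
    score: float


def format_retrieve_result(nodes: List[FakeNode]) -> List[str]:
    """Build one line per node: index, first 10 characters of the text, then the score."""
    return [f"{i}. {node.text[:10]}..., {node.score}" for i, node in enumerate(nodes)]


if __name__ == "__main__":
    for line in format_retrieve_result([FakeNode("Bob likes traveling.", 10.0)]):
        print(line)
```

An empty result list yields no lines, which matches the empty "Retrieve Result:" sections shown in the "before" cases of the docstrings.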
--- a/examples/rag_search.py
+++ b/examples/rag_search.py
@ -0,0 +1,21 @@
+"""Agent with RAG search."""
+
+import asyncio
+
+from examples.rag_pipeline import DOC_PATH, QUESTION
+from metagpt.logs import logger
+from metagpt.rag.engines import SimpleEngine
+from metagpt.roles import Sales
+
+
+async def search():
+    """Agent with RAG search."""
+
+    store = SimpleEngine.from_docs(input_files=[DOC_PATH])
+    role = Sales(profile="Sales", store=store)
+    result = await role.run(QUESTION)
+    logger.info(result)
+
+
+if __name__ == "__main__":
+    asyncio.run(search())
--- a/examples/reverse_engineering.py
+++ b/examples/reverse_engineering.py
@ -0,0 +1,72 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import asyncio
+import shutil
+from pathlib import Path
+
+import typer
+
+from metagpt.actions.rebuild_class_view import RebuildClassView
+from metagpt.actions.rebuild_sequence_view import RebuildSequenceView
+from metagpt.context import Context
+from metagpt.llm import LLM
+from metagpt.logs import logger
+from metagpt.utils.git_repository import GitRepository
+from metagpt.utils.project_repo import ProjectRepo
+
+app = typer.Typer(add_completion=False, pretty_exceptions_show_locals=False)
+
+
+@app.command("", help="Python project reverse engineering.")
+def startup(
+    project_root: str = typer.Argument(
+        default="",
+        help="Specify the root directory of the existing project for reverse engineering.",
+    ),
+    output_dir: str = typer.Option(default="", help="Specify the output directory path for reverse engineering."),
+):
+    package_root = Path(project_root)
+    if not package_root.exists():
+        raise FileNotFoundError(f"{project_root} does not exist")
+    if not _is_python_package_root(package_root):
+        raise FileNotFoundError(f'There are no "*.py" files under "{project_root}".')
+    init_file = package_root / "__init__.py"  # used by pyreverse
+    init_file_exists = init_file.exists()
+    if not init_file_exists:
+        init_file.touch()
+
+    if not output_dir:
+        output_dir = package_root / "../reverse_engineering_output"
+    logger.info(f"output dir:{output_dir}")
+    try:
+        asyncio.run(reverse_engineering(package_root, Path(output_dir)))
+    finally:
+        if not init_file_exists:
+            init_file.unlink(missing_ok=True)
+        tmp_dir = package_root / "__dot__"
+        if tmp_dir.exists():
+            shutil.rmtree(tmp_dir, ignore_errors=True)
+
+
+def _is_python_package_root(package_root: Path) -> bool:
+    for file_path in package_root.iterdir():
+        if file_path.is_file():
+            if file_path.suffix == ".py":
+                return True
+    return False
+
+
+async def reverse_engineering(package_root: Path, output_dir: Path):
+    ctx = Context()
+    ctx.git_repo = GitRepository(output_dir)
+    ctx.repo = ProjectRepo(ctx.git_repo)
+    action = RebuildClassView(name="ReverseEngineering", i_context=str(package_root), llm=LLM(), context=ctx)
+    await action.run()
+
+    action = RebuildSequenceView(name="ReverseEngineering", llm=LLM(), context=ctx)
+    await action.run()
+
+
+if __name__ == "__main__":
+    app()
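Reviewer note: `_is_python_package_root` above only inspects files directly under the given directory, so a project whose `.py` files all live in subdirectories is rejected. The sketch below is a minimal equivalent written with `any()`, exercised against a temporary directory to show that behavior. `has_top_level_py_file` and `demo` are hypothetical names for illustration, not part of this commit.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def has_top_level_py_file(package_root: Path) -> bool:
    """Return True if any regular file directly under package_root has a .py suffix."""
    return any(p.is_file() and p.suffix == ".py" for p in package_root.iterdir())


def demo() -> Tuple[bool, bool]:
    """Check a directory before and after a top-level .py file is created."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # A nested module alone does not count: only direct children are inspected.
        (root / "sub").mkdir()
        (root / "sub" / "nested.py").touch()
        before = has_top_level_py_file(root)
        (root / "module.py").touch()
        after = has_top_level_py_file(root)
    return before, after


if __name__ == "__main__":
    print(demo())
```

If nested packages should also qualify, `package_root.rglob("*.py")` would be one way to widen the check, though that changes the behavior of the committed function.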
--- a/examples/search_kb.py
+++ b/examples/search_kb.py
@ -1,33 +0,0 @@
-#!/usr/bin/env python
-# -*- coding: utf-8 -*-
-"""
-@File    : search_kb.py
-@Modified By: mashenquan, 2023-12-22. Delete useless codes.
-"""
-import asyncio
-
-from langchain.embeddings import OpenAIEmbeddings
-
-from metagpt.config2 import config
-from metagpt.const import DATA_PATH, EXAMPLE_PATH
-from metagpt.document_store import FaissStore
-from metagpt.logs import logger
-from metagpt.roles import Sales
-
-
-def get_store():
-    llm = config.get_openai_llm()
-    embedding = OpenAIEmbeddings(openai_api_key=llm.api_key, openai_api_base=llm.base_url)
-    return FaissStore(DATA_PATH / "example.json", embedding=embedding)
-
-
-async def search():
-    store = FaissStore(EXAMPLE_PATH / "example.json")
-    role = Sales(profile="Sales", store=store)
-    query = "Which facial cleanser is good for oily skin?"
-    result = await role.run(query)
-    logger.info(result)
-
-
-if __name__ == "__main__":
-    asyncio.run(search())
--- a/examples/search_with_specific_engine.py
+++ b/examples/search_with_specific_engine.py
@ -4,21 +4,17 @@
 """
 import asyncio

+from metagpt.config2 import Config
 from metagpt.roles import Searcher
-from metagpt.tools.search_engine import SearchEngine, SearchEngineType
+from metagpt.tools.search_engine import SearchEngine


 async def main():
    question = "What are the most interesting human facts?"
-    kwargs = {"api_key": "", "cse_id": "", "proxy": None}
-    # Serper API
-    # await Searcher(search_engine=SearchEngine(engine=SearchEngineType.SERPER_GOOGLE, **kwargs)).run(question)
-    # SerpAPI
-    # await Searcher(search_engine=SearchEngine(engine=SearchEngineType.SERPAPI_GOOGLE, **kwargs)).run(question)
-    # Google API
-    # await Searcher(search_engine=SearchEngine(engine=SearchEngineType.DIRECT_GOOGLE, **kwargs)).run(question)
-    # DDG API
-    await Searcher(search_engine=SearchEngine(engine=SearchEngineType.DUCK_DUCK_GO, **kwargs)).run(question)
+
+    search = Config.default().search
+    kwargs = search.model_dump()
+    await Searcher(search_engine=SearchEngine(engine=search.api_type, **kwargs)).run(question)


 if __name__ == "__main__":
--- a/examples/write_novel.py
+++ b/examples/write_novel.py
@ -14,6 +14,22 @@ from metagpt.actions.action_node import ActionNode
 from metagpt.llm import LLM


+class Chapter(BaseModel):
+    name: str = Field(default="Chapter 1", description="The name of the chapter.")
+    content: str = Field(default="...", description="The content of the chapter. No more than 1000 words.")
+
+
+class Chapters(BaseModel):
+    chapters: List[Chapter] = Field(
+        default=[
+            {"name": "Chapter 1", "content": "..."},
+            {"name": "Chapter 2", "content": "..."},
+            {"name": "Chapter 3", "content": "..."},
+        ],
+        description="The chapters of the novel.",
+    )
+
+
 class Novel(BaseModel):
    name: str = Field(default="The Lord of the Rings", description="The name of the novel.")
    user_group: str = Field(default="...", description="The user group of the novel.")
@ -28,22 +44,17 @@ class Novel(BaseModel):
    ending: str = Field(default="...", description="The ending of the novel.")


-class Chapter(BaseModel):
-    name: str = Field(default="Chapter 1", description="The name of the chapter.")
-    content: str = Field(default="...", description="The content of the chapter. No more than 1000 words.")
-
-
 async def generate_novel():
    instruction = (
-        "Write a novel named 'Harry Potter in The Lord of the Rings'. "
+        "Write a novel named 'Reborn in Skyrim'. "
        "Fill the empty nodes with your own ideas. Be creative! Use your own words!"
        "I will tip you $100,000 if you write a good novel."
    )
    novel_node = await ActionNode.from_pydantic(Novel).fill(context=instruction, llm=LLM())
-    chap_node = await ActionNode.from_pydantic(Chapter).fill(
+    chap_node = await ActionNode.from_pydantic(Chapters).fill(
        context=f"### instruction\n{instruction}\n### novel\n{novel_node.content}", llm=LLM()
    )
-    print(chap_node.content)
+    print(chap_node.instruct_content)


 asyncio.run(generate_novel())