Merge pull request #1183 from geekan/code_interpreter

Some small enhancement for DI
Alexander Wu 2024-04-11 19:59:54 +08:00 committed by GitHub
commit 4135ac8330
12 changed files with 431 additions and 33 deletions

@@ -12,9 +12,97 @@ ## Example List
- Tool usage: web page imitation
- Tool usage: web crawling
- Tool usage: text2image
- Tool usage: email summarization and response\
- Tool usage: email summarization and response
- More on the way!
Please see the [docs](https://docs.deepwisdom.ai/main/en/guide/use_cases/agent/interpreter/intro.html) for more explanation.
We are continuously releasing code; stay tuned!
## Experiments in the Paper
Before running the experiments, download the [di_dataset](https://drive.google.com/drive/folders/17SpI9WL9kzd260q2DArbXKNcqhidjA7s?usp=sharing) and place it in the specified path (default DATA_PATH, where DATA_PATH = METAGPT_ROOT / "data").
To reproduce the results in the paper, run the following commands:
```
python run_ml_benchmark.py --task_name 04_titanic
```
```
python run_open_ended_tasks.py --task_name 14_image_background_removal --data_dir directory_to_di_dataset --use_reflection True
```
The `run_ml_benchmark.py` and `run_open_ended_tasks.py` scripts implement the pipeline of the Data Interpreter.
Some key arguments:
- `--task_name`: required; the task to run, e.g. `04_titanic` or `14_image_background_removal`. Refer to the tables below for available task names.
- `--data_dir`: optional; the directory that contains `di_dataset` (default is `DATA_PATH`).
- `--use_reflection`: optional; whether to use reflection (default is True).
### Data Interpreter Dataset Structure
di_dataset
- ml_benchmark
  - 04_titanic
  - 05_house-prices-advanced-regression-techniques
  - 06_santander-customer-transaction-prediction
  - 07_icr-identify-age-related-conditions
  - 08_santander-value-prediction-challenge
- open_ended_tasks
  - 01_ocr
  - 02_ocr
  - 03_ocr
  - 14_image_background_removal
  - 16_image_2_code_generation
  - 17_image_2_code_generation
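To confirm that the download landed where the scripts expect it, the folders listed above can be checked before starting a run. The sketch below is illustrative only: `missing_dataset_dirs` and `ML_BENCHMARK_DIRS` are hypothetical names, not part of MetaGPT, and it assumes `di_dataset` sits directly under the data directory. Only the `ml_benchmark` entries are treated as folders, since the open-ended entries may be plain files.

```python
from pathlib import Path

# Sub-folder names copied from the dataset tree above.
ML_BENCHMARK_DIRS = [
    "04_titanic",
    "05_house-prices-advanced-regression-techniques",
    "06_santander-customer-transaction-prediction",
    "07_icr-identify-age-related-conditions",
    "08_santander-value-prediction-challenge",
]


def missing_dataset_dirs(data_dir):
    """Hypothetical helper: list expected folders absent under <data_dir>/di_dataset."""
    root = Path(data_dir) / "di_dataset"
    expected = [root / "ml_benchmark" / name for name in ML_BENCHMARK_DIRS]
    expected.append(root / "open_ended_tasks")
    # Report paths relative to di_dataset so the output stays short.
    return [str(path.relative_to(root)) for path in expected if not path.is_dir()]
```

An empty list means every expected folder was found; anything else names what still needs to be downloaded or moved.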
### ML-Benchmark Dataset and Requirements
ML-Benchmark contains 8 typical machine learning datasets.
| ID | Task Name | Dataset Name | User Requirement |
|----|-----------------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 01 | 01_iris | Iris | Run data analysis on sklearn Iris dataset, include a plot |
| 02 | 02_wines_recognition | Wine recognition | Run data analysis on sklearn Wine recognition dataset, include a plot, and train a model to predict wine class with 20% as test set, and show prediction accuracy |
| 03 | 03_breast_cancer | Breast Cancer | Run data analysis on sklearn Wisconsin Breast Cancer dataset, include a plot, train a model to predict targets (20% as validation), and show validation accuracy |
| 04 | 04_titanic | Titanic | This is a titanic passenger survival dataset, your goal is to predict passenger survival outcome. The target column is Survived. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report accuracy on the eval data. Train data path: '{data_dir}/ml_benchmark/04_titanic/split_train.csv', eval data path: '{data_dir}/ml_benchmark/04_titanic/split_eval.csv'. |
| 05 | 05_house_prices | House Prices | This is a house price dataset, your goal is to predict the sale price of a property based on its features. The target column is SalePrice. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report RMSE between the logarithm of the predicted value and the logarithm of the observed sales price on the eval data. Train data path: '{data_dir}/ml_benchmark/05_house-prices-advanced-regression-techniques/split_train.csv', eval data path: '{data_dir}/ml_benchmark/05_house-prices-advanced-regression-techniques/split_eval.csv'. |
| 06 | 06_santander_customer | Santander Customer | This is a customers financial dataset. Your goal is to predict which customers will make a specific transaction in the future. The target column is target. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report AUC on the eval data. Train data path: '{data_dir}/ml_benchmark/06_santander-customer-transaction-prediction/split_train.csv', eval data path: '{data_dir}/ml_benchmark/06_santander-customer-transaction-prediction/split_eval.csv' . |
| 07 | 07_icr_identify | ICR - Identifying | This is a medical dataset with over fifty anonymized health characteristics linked to three age-related conditions. Your goal is to predict whether a subject has or has not been diagnosed with one of these conditions. The target column is Class. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report F1 Score on the eval data. Train data path: '{data_dir}/ml_benchmark/07_icr-identify-age-related-conditions/split_train.csv', eval data path: '{data_dir}/ml_benchmark/07_icr-identify-age-related-conditions/split_eval.csv' . |
| 08 | 08_santander_value | Santander Value | This is a customers financial dataset. Your goal is to predict the value of transactions for each potential customer. The target column is target. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report RMSLE on the eval data. Train data path: '{data_dir}/ml_benchmark/08_santander-value-prediction-challenge/split_train.csv', eval data path: '{data_dir}/ml_benchmark/08_santander-value-prediction-challenge/split_eval.csv' . |
**Note**:
1. `data_dir` is the directory that contains the `di_dataset` folder; the scripts look for the data under `{data_dir}/di_dataset/`.
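The `{data_dir}` placeholder in the requirements is filled with ordinary Python string formatting. The snippet below is a minimal sketch of that substitution: `TEMPLATE` is a shortened stand-in for one requirement (assuming the `di_dataset` prefix used in `requirements_prompt.py`), and `fill_requirement` is a hypothetical name.

```python
# Shortened stand-in for a requirement template; only the placeholder matters here.
TEMPLATE = "Train data path: '{data_dir}/di_dataset/ml_benchmark/04_titanic/split_train.csv'"


def fill_requirement(template: str, data_dir) -> str:
    """Substitute the data directory into a requirement template."""
    return template.format(data_dir=data_dir)


filled = fill_requirement(TEMPLATE, "/tmp/data")
print(filled)
```

Because `str.format` treats every brace pair as a field, a requirement that needs a literal `{` or `}` would have to double it (`{{`, `}}`).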
### Open-Ended Tasks Dataset and Requirements
The Open-Ended Tasks set comprises 20 moderately challenging open-ended tasks that require the Data Interpreter to understand user requirements, plan and decompose tasks, and generate and execute code.
| ID | Task Name | Scenario | Scenario Description | User Requirement |
|----|-----------------------------|------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 01 | 01_ocr | OCR | Scan all the necessary fields and amounts from the given file and then create an Excel sheet with the extracted data. | This is an English invoice image. Your goal is to perform OCR on the image, extract the total amount from ocr result and save as table, using PaddleOCR. The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/open_ended_tasks/01_ocr.png' |
| 02 | 02_ocr | OCR | Scan all the necessary fields and amounts from the given file and then create an Excel sheet with the extracted data. | This is a Chinese invoice image. Your goal is to perform OCR on the image and only output the recognized text word results, nothing else is needed, then extract the total amount and receipt ID starting with 'No' from ocr text words results and save as table, using PaddleOCR. The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/open_ended_tasks/02_ocr.jpg' |
| 03 | 03_ocr | OCR | Scan all the necessary fields and amounts from the given file and then create an Excel sheet with the extracted data. | This is an invoice image for OCR. Your goal is to perform OCR on the image, extract the total amount and save it into an Excel table format, using PaddleOCR with lang='en' The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/open_ended_tasks/03_ocr.jpg' |
| 04 | 04_web_search_and_crawling | Web search and crawling | Crawling and organizing web form information | Get data from `paperlist` table in https://papercopic.com/statistics/iclr-statistics/iclr-2024-statistics/ , and save it to a csv file. paper title must include `multiagent` or `large language model`. **notice: print key variables** |
| 05 | 05_web_search_and_crawling | Web search and crawling | Crawling and organizing web form information | Obtain the CPI data from https://www.stats.gov.cn/sj/sjjd/202307/t20230718_1941322.html, please follow this plan step by step: 1. Detect the encoding type and HTML structure of the target webpage. 2. Crawl the webpage, de-duplicate the body content, convert it to a clear paragraph suitable for reading as plain text, and save it to target.txt. 3. Design multiple regular expressions to match key sentences in target.txt, use try-except statements to combine the various regular expression matches, note that the webpage text is in Chinese. 4. Finally, use a Chinese summary to summarize the key sentences to answer the user's request. **Note: If it is a code block, print out the key variable results of the code block; if it is webpage text, print the first 200 characters.** |
| 06 | 06_web_search_and_crawling | Web search and crawling | Crawling and organizing web form information | Get products data from website https://scrapeme.live/shop/ and save it as a csv file. Notice: Firstly parse the web page encoding and the text HTML structure; The first page product name, price, product URL, and image URL must be saved in the csv; |
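| 07 | 07_web_search_and_crawling | Web search and crawling | Crawling and organizing web form information | Retrieve all startup financing information from the 36kr venture platform https://pitchhub.36kr.com/financing-flash . **Note: this is a Chinese website**. Below is a rough workflow; adjust the tasks in the current plan as appropriate based on the result of each step: 1. Crawl the page and save its HTML structure locally; 2. Directly print the 2000 characters of HTML content that follow the 7th occurrence of the keyword **快讯**, as a **sample of the news-flash HTML content**; 3. Reflect on the patterns in the **sample of the news-flash HTML content** and design regular expressions to extract the title, link, and time of each **news flash**; 4. Filter the startup financing **news flashes** from the last 3 days and print the first 5 as list[dict]. 5. Save all results to a local csv file. (The prompt itself is written in Chinese.) |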
| 08 | 08_email_reply | Email reply | Filter through my emails and respond to them as necessary | You are an agent that automatically reads and replies to emails. I will give you your Outlook email account and password. You need to check the content of the latest email and return it to me. If the email address suffix of this email is @xxx.xxx, please automatically reply with "I've received your email and will reply as soon as possible. Thank you!" Email account: xxx@xxx.xxx Email Password: xxxx |
| 09 | 09_web_page_imitation | Web page imitation | Using Selenium and WebDriver to access a webpage and convert it to an image, with the assistance of GPT-4V to mimic the creation of a one-page website. | This is a URL of webpage: https://medium.com/ . Firstly, utilize Selenium and WebDriver for rendering. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a text file. All required dependencies and environments have been fully installed and configured. |
| 10 | 10_web_page_imitation | Web page imitation | Using Selenium and WebDriver to access a webpage and convert it to an image, with the assistance of GPT-4V to mimic the creation of a one-page website. | This is a URL of webpage: https://pytorch.org/ . Firstly, utilize Selenium and WebDriver for rendering. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured. |
| 11 | 11_web_page_imitation | Web page imitation | Using Selenium and WebDriver to access a webpage and convert it to an image, with the assistance of GPT-4V to mimic the creation of a one-page website. | This is a URL of webpage: https://www.kaggle.com/ . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured. |
| 12 | 12_web_page_imitation | Web page imitation | Using Selenium and WebDriver to access a webpage and convert it to an image, with the assistance of GPT-4V to mimic the creation of a one-page website. | This is a URL of webpage: https://chat.openai.com/auth/login . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured. |
| 13 | 13_web_page_imitation | Web page imitation | Using Selenium and WebDriver to access a webpage and convert it to an image, with the assistance of GPT-4V to mimic the creation of a one-page website. | This is a URL of webpage: https://deepmind.google/technologies/gemini/#introduction . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured. |
| 14 | 14_image_background_removal | Image Background Removal | Remove the background of a given image | This is an image, you need to use python toolkit rembg remove the background of the image. image path:'{data_dir}/open_ended_tasks/14_image_background_removal.jpg'; save path:'{data_dir}/open_ended_tasks/14_image_background_removal_result.jpg' |
| 15 | 15_text2img | Text2Img | Use SD tools to generate images | I want to generate an image of a beautiful girl using the stable diffusion text2image tool, sd_url = "http://your.sd.service.ip:port" |
| 16 | 16_image_2_code_generation | Image2Code Generation | Web code generation | This is a image. First, convert the image to webpage code including HTML, CSS and JS in one go, and finally save webpage code in a file.The image path: '{data_dir}/open_ended_tasks/16_image_2_code_generation.png'. NOTE: All required dependencies and environments have been fully installed and configured. |
| 17 | 17_image_2_code_generation | Image2Code Generation | Web code generation | This is a image. First, convert the image to webpage code including HTML, CSS and JS in one go, and finally save webpage code in a file.The image path: '{data_dir}/open_ended_tasks/17_image_2_code_generation.png'. NOTE: All required dependencies and environments have been fully installed and configured. |
| 18 | 18_generate_games | Generate games using existing repo | Game tool usage (pyxel) | Create a Snake game. Players need to control the movement of the snake to eat food and grow its body, while avoiding the snake's head touching their own body or game boundaries. Games need to have basic game logic, user interface. During the production process, please consider factors such as playability, beautiful interface, and convenient operation of the game. Note: pyxel environment already satisfied |
| 19 | 19_generate_games | Generate games using existing repo | Game tool usage (pyxel) | You are a professional game developer, please use pyxel software to create a simple jumping game. The game needs to include a character that can move left and right on the screen. When the player presses the spacebar, the character should jump. Please ensure that the game is easy to operate, with clear graphics, and complies with the functional limitations of pyxel software. Note: pyxel environment already satisfied |
| 20 | 20_generate_games | Generate games using existing repo | Game tool usage (pyxel) | Make a mouse click game that click button as many times as possible in 30 seconds using pyxel. Note: pyxel environment already satisfied |
**Note**:
1. `data_dir` is the directory that contains the `di_dataset` folder.
2. Replace the placeholder email account and password with real credentials in `requirements_prompt.py`.
3. Replace the placeholder `sd_url` with the actual `sd_url` in `requirements_prompt.py`.
4. Code for "Generate games using existing repo" and the Math benchmark is still being integrated. Stay tuned.

@@ -0,0 +1,65 @@
# ML-Benchmark requirements
IRIS_REQ = "Run data analysis on sklearn Iris dataset, include a plot"
WINES_RECOGNITION_REQ = "Run data analysis on sklearn Wine recognition dataset, include a plot, and train a model to predict wine class with 20% as test set, and show prediction accuracy"
BREAST_CANCER_WISCONSIN_REQ = "Run data analysis on sklearn Wisconsin Breast Cancer dataset, include a plot, train a model to predict targets (20% as validation), and show validation accuracy"
TITANIC_REQ = "This is a titanic passenger survival dataset, your goal is to predict passenger survival outcome. The target column is Survived. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report accuracy on the eval data. Train data path: '{data_dir}/di_dataset/ml_benchmark/04_titanic/split_train.csv', eval data path: '{data_dir}/di_dataset/ml_benchmark/04_titanic/split_eval.csv'."
HOUSE_PRICES_ADVANCED_REGRESSION_TECHNIQUES_REQ = "This is a house price dataset, your goal is to predict the sale price of a property based on its features. The target column is SalePrice. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report RMSE between the logarithm of the predicted value and the logarithm of the observed sales price on the eval data. Train data path: '{data_dir}/di_dataset/ml_benchmark/05_house-prices-advanced-regression-techniques/split_train.csv', eval data path: '{data_dir}/di_dataset/ml_benchmark/05_house-prices-advanced-regression-techniques/split_eval.csv'."
SANTANDER_CUSTOMER_TRANSACTION_PREDICTION_REQ = "This is a customers financial dataset. Your goal is to predict which customers will make a specific transaction in the future. The target column is target. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report AUC on the eval data. Train data path: '{data_dir}/di_dataset/ml_benchmark/06_santander-customer-transaction-prediction/split_train.csv', eval data path: '{data_dir}/di_dataset/ml_benchmark/06_santander-customer-transaction-prediction/split_eval.csv' ."
ICR_IDENTITY_AGE_RELATED_CONDITIONS_REQ = "This is a medical dataset with over fifty anonymized health characteristics linked to three age-related conditions. Your goal is to predict whether a subject has or has not been diagnosed with one of these conditions. The target column is Class. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report F1 Score on the eval data. Train data path: '{data_dir}/di_dataset/ml_benchmark/07_icr-identify-age-related-conditions/split_train.csv', eval data path: '{data_dir}/di_dataset/ml_benchmark/07_icr-identify-age-related-conditions/split_eval.csv' ."
SANTANDER_VALUE_PREDICTION_CHALLENGE_REQ = "This is a customers financial dataset. Your goal is to predict the value of transactions for each potential customer. The target column is target. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report RMSLE on the eval data. Train data path: '{data_dir}/di_dataset/ml_benchmark/08_santander-value-prediction-challenge/split_train.csv', eval data path: '{data_dir}/di_dataset/ml_benchmark/08_santander-value-prediction-challenge/split_eval.csv' ."
# Open-Ended Tasks requirements
OCR_REQ_01 = "This is an English invoice image. Your goal is to perform OCR on the image, extract the total amount from ocr result and save as table, using PaddleOCR. The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/di_dataset/open_ended_tasks/01_ocr.png'"
OCR_REQ_02 = "This is a Chinese invoice image. Your goal is to perform OCR on the image and only output the recognized text word results, nothing else is needed, then extract the total amount and receipt ID starting with 'No' from ocr text words results and save as table, using PaddleOCR. The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/di_dataset/open_ended_tasks/02_ocr.jpg'"
OCR_REQ_03 = "This is an invoice image for OCR. Your goal is to perform OCR on the image, extract the total amount and save it into an Excel table format, using PaddleOCR with lang='en' The PaddleOCR environment has been fully installed, try to use Paddleocr as much as possible. Image path: '{data_dir}/di_dataset/open_ended_tasks/03_ocr.jpg'"
WEB_SEARCH_AND_CRAWLING_REQ_04 = "Get data from `paperlist` table in https://papercopic.com/statistics/iclr-statistics/iclr-2024-statistics/ , and save it to a csv file. paper title must include `multiagent` or `large language model`. **notice: print key variables**"
WEB_SEARCH_AND_CRAWLING_REQ_05 = "Obtain the CPI data from https://www.stats.gov.cn/sj/sjjd/202307/t20230718_1941322.html, please follow this plan step by step: 1. Detect the encoding type and HTML structure of the target webpage. 2. Crawl the webpage, de-duplicate the body content, convert it to a clear paragraph suitable for reading as plain text, and save it to target.txt. 3. Design multiple regular expressions to match key sentences in target.txt, use try-except statements to combine the various regular expression matches, note that the webpage text is in Chinese. 4. Finally, use a Chinese summary to summarize the key sentences to answer the user's request. **Note: If it is a code block, print out the key variable results of the code block; if it is webpage text, print the first 200 characters.**"
WEB_SEARCH_AND_CRAWLING_REQ_06 = "Get products data from website https://scrapeme.live/shop/ and save it as a csv file. Notice: Firstly parse the web page encoding and the text HTML structure; The first page product name, price, product URL, and image URL must be saved in the csv;"
WEB_SEARCH_AND_CRAWLING_REQ_07 = "从36kr创投平台https://pitchhub.36kr.com/financing-flash所有初创企业融资的信息, **注意: 这是⼀个中⽂⽹站**; 下⾯是⼀个⼤致流程, 你会根据每⼀步的运⾏结果对当前计划中的任务做出适当调整: 1. 爬取并本地保存html结构; 2. 直接打印第7个**快讯**关键词后2000个字符的html内容, 作为**快讯的html内容示例**; 3. 反思**快讯的html内容示例**中的规律, 设计正则匹配表达式来获取**快讯**的标题、链接、时间; 4. 筛选最近3天的初创企业融资**快讯**, 以list[dict]形式打印前5个。5. 将全部结果存在本地csv中"
EMAIL_REPLY_REQ_08 = """You are an agent that automatically reads and replies to emails. I will give you your Outlook email account and password. You need to check the content of the latest email and return it to me. If the email address suffix of this email is @xxx.xxx, please automatically reply with "I've received your email and will reply as soon as possible. Thank you!" Email account: xxx@xxx.xxx Email Password: xxxx"""
WEB_PAGE_IMITATION_REQ_09 = "This is a URL of webpage: https://medium.com/ . Firstly, utilize Selenium and WebDriver for rendering. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a text file. All required dependencies and environments have been fully installed and configured."
WEB_PAGE_IMITATION_REQ_10 = "This is a URL of webpage: https://pytorch.org/ . Firstly, utilize Selenium and WebDriver for rendering. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured."
WEB_PAGE_IMITATION_REQ_11 = "This is a URL of webpage: https://www.kaggle.com/ . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured."
WEB_PAGE_IMITATION_REQ_12 = "This is a URL of webpage: https://chat.openai.com/auth/login . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured."
WEB_PAGE_IMITATION_REQ_13 = "This is a URL of webpage: https://deepmind.google/technologies/gemini/#introduction . Firstly, utilize Selenium and WebDriver to render the webpage, ensuring the browser window is maximized for an optimal viewing experience. Secondly, convert image to a webpage including HTML, CSS and JS in one go. Finally, save webpage in a file. NOTE: All required dependencies and environments have been fully installed and configured."
IMAGE_BACKGROUND_REMOVAL_REQ_14 = "This is an image, you need to use python toolkit rembg remove the background of the image. image path:'{data_dir}/di_dataset/open_ended_tasks/14_image_background_removal.jpg'; save path:'{data_dir}/di_dataset/open_ended_tasks/14_image_background_removal_result.jpg'"
TEXT2IMG_REQ_15 = """I want to generate an image of a beautiful girl using the stable diffusion text2image tool, sd_url = 'http://your.sd.service.ip:port'"""
IMAGE2CODE_GENERATION_REQ_16 = "This is a image. First, convert the image to webpage code including HTML, CSS and JS in one go, and finally save webpage code in a file.The image path: '{data_dir}/di_dataset/open_ended_tasks/16_image_2_code_generation.png'. NOTE: All required dependencies and environments have been fully installed and configured."
IMAGE2CODE_GENERATION_REQ_17 = "This is a image. First, convert the image to webpage code including HTML, CSS and JS in one go, and finally save webpage code in a file.The image path: '{data_dir}/di_dataset/open_ended_tasks/17_image_2_code_generation.png'. NOTE: All required dependencies and environments have been fully installed and configured."
GENERATE_GAMES_REQ_18 = "Create a Snake game. Players need to control the movement of the snake to eat food and grow its body, while avoiding the snake's head touching their own body or game boundaries. Games need to have basic game logic, user interface. During the production process, please consider factors such as playability, beautiful interface, and convenient operation of the game. Note: pyxel environment already satisfied"
GENERATE_GAMES_REQ_19 = "You are a professional game developer, please use pyxel software to create a simple jumping game. The game needs to include a character that can move left and right on the screen. When the player presses the spacebar, the character should jump. Please ensure that the game is easy to operate, with clear graphics, and complies with the functional limitations of pyxel software. Note: pyxel environment already satisfied"
GENERATE_GAMES_REQ_20 = "Make a mouse click game that click button as many times as possible in 30 seconds using pyxel. Note: pyxel environment already satisfied"
ML_BENCHMARK_REQUIREMENTS = {
"01_iris": IRIS_REQ,
"02_wines_recognition": WINES_RECOGNITION_REQ,
"03_breast_cancer": BREAST_CANCER_WISCONSIN_REQ,
"04_titanic": TITANIC_REQ,
"05_house_prices": HOUSE_PRICES_ADVANCED_REGRESSION_TECHNIQUES_REQ,
"06_santander_customer": SANTANDER_CUSTOMER_TRANSACTION_PREDICTION_REQ,
"07_icr_identify": ICR_IDENTITY_AGE_RELATED_CONDITIONS_REQ,
"08_santander_value": SANTANDER_VALUE_PREDICTION_CHALLENGE_REQ,
}
OPEN_ENDED_TASKS_REQUIREMENTS = {
"01_ocr": OCR_REQ_01,
"02_ocr": OCR_REQ_02,
"03_ocr": OCR_REQ_03,
"04_web_search_and_crawling": WEB_SEARCH_AND_CRAWLING_REQ_04,
"05_web_search_and_crawling": WEB_SEARCH_AND_CRAWLING_REQ_05,
"06_web_search_and_crawling": WEB_SEARCH_AND_CRAWLING_REQ_06,
"07_web_search_and_crawling": WEB_SEARCH_AND_CRAWLING_REQ_07,
"08_email_reply": EMAIL_REPLY_REQ_08,
"09_web_page_imitation": WEB_PAGE_IMITATION_REQ_09,
"10_web_page_imitation": WEB_PAGE_IMITATION_REQ_10,
"11_web_page_imitation": WEB_PAGE_IMITATION_REQ_11,
"12_web_page_imitation": WEB_PAGE_IMITATION_REQ_12,
"13_web_page_imitation": WEB_PAGE_IMITATION_REQ_13,
"14_image_background_removal": IMAGE_BACKGROUND_REMOVAL_REQ_14,
"15_text2img": TEXT2IMG_REQ_15,
"16_image_2_code_generation": IMAGE2CODE_GENERATION_REQ_16,
"17_image_2_code_generation": IMAGE2CODE_GENERATION_REQ_17,
"18_generate_games": GENERATE_GAMES_REQ_18,
"19_generate_games": GENERATE_GAMES_REQ_19,
"20_generate_games": GENERATE_GAMES_REQ_20,
}
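Both dictionaries map a task name to a requirement template, so a mistyped `--task_name` surfaces as a bare `KeyError`. A minimal sketch of a friendlier lookup follows; `lookup_requirement` and `DEMO_TABLE` are hypothetical and only stand in for the dictionaries defined above.

```python
import difflib


def lookup_requirement(table: dict, task_name: str, data_dir: str) -> str:
    """Hypothetical helper: fetch a template by task name and fill in data_dir."""
    if task_name not in table:
        # Suggest the closest known task names instead of failing silently.
        hints = difflib.get_close_matches(task_name, list(table), n=3)
        raise KeyError(f"Unknown task {task_name!r}; close matches: {hints}")
    return table[task_name].format(data_dir=data_dir)


# Tiny stand-in table with shortened templates.
DEMO_TABLE = {
    "01_iris": "Run data analysis on sklearn Iris dataset, include a plot",
    "04_titanic": "Train data path: '{data_dir}/di_dataset/ml_benchmark/04_titanic/split_train.csv'",
}

print(lookup_requirement(DEMO_TABLE, "04_titanic", "/tmp/data"))
```

Templates without a placeholder, such as the Iris one, pass through `format` unchanged.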

@@ -0,0 +1,22 @@
import os

import fire

from examples.di.requirements_prompt import ML_BENCHMARK_REQUIREMENTS
from metagpt.const import DATA_PATH
from metagpt.roles.di.data_interpreter import DataInterpreter
from metagpt.tools.tool_recommend import TypeMatchToolRecommender


# Ensure the ML-Benchmark dataset has been downloaded before using this example.
async def main(task_name, data_dir=DATA_PATH, use_reflection=True):
    if data_dir != DATA_PATH and not os.path.exists(os.path.join(data_dir, "di_dataset/ml_benchmark")):
        raise FileNotFoundError(f"ML-Benchmark dataset not found in {data_dir}.")

    requirement = ML_BENCHMARK_REQUIREMENTS[task_name].format(data_dir=data_dir)
    di = DataInterpreter(use_reflection=use_reflection, tool_recommender=TypeMatchToolRecommender(tools=["<all>"]))
    await di.run(requirement)


if __name__ == "__main__":
    fire.Fire(main)

@@ -0,0 +1,22 @@
import os

import fire

from examples.di.requirements_prompt import OPEN_ENDED_TASKS_REQUIREMENTS
from metagpt.const import DATA_PATH
from metagpt.roles.di.data_interpreter import DataInterpreter
from metagpt.tools.tool_recommend import TypeMatchToolRecommender


# Ensure Open-Ended Tasks dataset has been downloaded before using this example.
async def main(task_name, data_dir=DATA_PATH, use_reflection=True):
    if data_dir != DATA_PATH and not os.path.exists(os.path.join(data_dir, "di_dataset/open_ended_tasks")):
        raise FileNotFoundError(f"Open-ended task dataset not found in {data_dir}.")

    requirement = OPEN_ENDED_TASKS_REQUIREMENTS[task_name].format(data_dir=data_dir)
    di = DataInterpreter(use_reflection=use_reflection, tool_recommender=TypeMatchToolRecommender(tools=["<all>"]))
    await di.run(requirement)


if __name__ == "__main__":
    fire.Fire(main)

@@ -1,7 +1,7 @@
 from __future__ import annotations
 import json
-from typing import Literal, Union
+from typing import Literal
 from pydantic import Field, model_validator
@@ -39,7 +39,7 @@ class DataInterpreter(Role):
     use_plan: bool = True
     use_reflection: bool = False
     execute_code: ExecuteNbCode = Field(default_factory=ExecuteNbCode, exclude=True)
-    tools: Union[str, list[str]] = []  # Use special symbol ["<all>"] to indicate use of all registered tools
+    tools: list[str] = []  # Use special symbol ["<all>"] to indicate use of all registered tools
     tool_recommender: ToolRecommender = None
     react_mode: Literal["plan_and_act", "react"] = "plan_and_act"
     max_react_loop: int = 10  # used for react mode
@ -50,7 +50,7 @@ class DataInterpreter(Role):
self.use_plan = (
self.react_mode == "plan_and_act"
) # create a flag for convenience, overwrite any passed-in value
if self.tools:
if self.tools and not self.tool_recommender:
self.tool_recommender = BM25ToolRecommender(tools=self.tools)
self.set_actions([WriteAnalysisCode])
self._set_state(0)
@ -108,7 +108,7 @@ class DataInterpreter(Role):
plan_status = self.planner.get_plan_status() if self.use_plan else ""
# tool info
if self.tools:
if self.tool_recommender:
context = (
self.working_memory.get()[-1].content if self.working_memory.get() else ""
) # thoughts from _think stage in 'react' mode
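
Note on the hunks above: the validator now builds a `BM25ToolRecommender` from `tools` only when no `tool_recommender` was passed in, and the later check keys on `tool_recommender` rather than `tools`. A minimal sketch of that precedence rule follows. `StubRecommender` and `StubInterpreter` are hypothetical stand-ins written for illustration, not MetaGPT classes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubRecommender:
    """Hypothetical stand-in for a tool recommender."""

    kind: str
    tools: list


@dataclass
class StubInterpreter:
    """Hypothetical stand-in that mirrors only the defaulting rule."""

    tools: list = field(default_factory=list)
    tool_recommender: Optional[StubRecommender] = None

    def __post_init__(self):
        # Fall back to a BM25-style recommender only if none was supplied.
        if self.tools and not self.tool_recommender:
            self.tool_recommender = StubRecommender("bm25", self.tools)


no_tools = StubInterpreter()
from_tools = StubInterpreter(tools=["<all>"])
explicit = StubInterpreter(tools=["<all>"], tool_recommender=StubRecommender("type_match", ["<all>"]))
```

Under this rule an explicitly supplied recommender is never overwritten, which is what lets the example scripts pass `TypeMatchToolRecommender` directly.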

@ -171,7 +171,7 @@ class OneHotEncode(DataPreprocessTool):
def __init__(self, features: list):
self.features = features
self.model = OneHotEncoder(handle_unknown="ignore", sparse=False)
self.model = OneHotEncoder(handle_unknown="ignore", sparse_output=False)
def transform(self, df: pd.DataFrame) -> pd.DataFrame:
ts_data = self.model.transform(df[self.features])

@ -1,3 +1,4 @@
import ast
import inspect
from metagpt.utils.parse_docstring import GoogleDocstringParser, remove_spaces
@ -5,9 +6,10 @@ from metagpt.utils.parse_docstring import GoogleDocstringParser, remove_spaces
PARSER = GoogleDocstringParser
def convert_code_to_tool_schema(obj, include: list[str] = None):
def convert_code_to_tool_schema(obj, include: list[str] = None) -> dict:
"""Converts an object (function or class) to a tool schema by inspecting the object"""
docstring = inspect.getdoc(obj)
assert docstring, "no docstring found for the objects, skip registering"
# assert docstring, "no docstring found for the objects, skip registering"
if inspect.isclass(obj):
schema = {"type": "class", "description": remove_spaces(docstring), "methods": {}}
@ -27,6 +29,16 @@ def convert_code_to_tool_schema(obj, include: list[str] = None):
return schema
def convert_code_to_tool_schema_ast(code: str) -> list[dict]:
"""Converts a code string to a list of tool schemas by parsing the code with AST"""
visitor = CodeVisitor(code)
parsed_code = ast.parse(code)
visitor.visit(parsed_code)
return visitor.get_tool_schemas()
def function_docstring_to_schema(fn_obj, docstring) -> dict:
"""
Converts a function's docstring into a schema dictionary.
@ -62,3 +74,67 @@ def get_class_method_docstring(cls, method_name):
if method.__doc__:
return method.__doc__
return None # No docstring found in the class hierarchy
class CodeVisitor(ast.NodeVisitor):
"""Visit and convert the AST nodes within a code file to tool schemas"""
def __init__(self, source_code: str):
self.tool_schemas = {} # {tool_name: tool_schema}
self.source_code = source_code
def visit_ClassDef(self, node):
class_schemas = {"type": "class", "description": remove_spaces(ast.get_docstring(node)), "methods": {}}
for body_node in node.body:
if isinstance(body_node, (ast.FunctionDef, ast.AsyncFunctionDef)) and (
not body_node.name.startswith("_") or body_node.name == "__init__"
):
func_schemas = self._get_function_schemas(body_node)
class_schemas["methods"].update({body_node.name: func_schemas})
class_schemas["code"] = ast.get_source_segment(self.source_code, node)
self.tool_schemas[node.name] = class_schemas
def visit_FunctionDef(self, node):
self._visit_function(node)
def visit_AsyncFunctionDef(self, node):
self._visit_function(node)
def _visit_function(self, node):
if node.name.startswith("_"):
return
function_schemas = self._get_function_schemas(node)
function_schemas["code"] = ast.get_source_segment(self.source_code, node)
self.tool_schemas[node.name] = function_schemas
def _get_function_schemas(self, node):
docstring = remove_spaces(ast.get_docstring(node))
overall_desc, param_desc = PARSER.parse(docstring)
return {
"type": "async_function" if isinstance(node, ast.AsyncFunctionDef) else "function",
"description": overall_desc,
"signature": self._get_function_signature(node),
"parameters": param_desc,
}
def _get_function_signature(self, node):
args = []
defaults = dict(zip([arg.arg for arg in node.args.args][-len(node.args.defaults) :], node.args.defaults))
for arg in node.args.args:
arg_str = arg.arg
if arg.annotation:
annotation = ast.unparse(arg.annotation)
arg_str += f": {annotation}"
if arg.arg in defaults:
default_value = ast.unparse(defaults[arg.arg])
arg_str += f" = {default_value}"
args.append(arg_str)
return_annotation = ""
if node.returns:
return_annotation = f" -> {ast.unparse(node.returns)}"
return f"({', '.join(args)}){return_annotation}"
def get_tool_schemas(self):
return self.tool_schemas
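
Note on `_get_function_signature` above: it relies on the fact that `node.args.defaults` lines up with the last positional parameters of a function. The standalone sketch below shows the same idea with an index offset instead of a dict. `signature_of` is a name chosen here for illustration, and `ast.unparse` requires Python 3.9 or later. Like the method in the diff, it covers only plain positional parameters, not `*args`, `**kwargs`, or keyword-only arguments.

```python
import ast


def signature_of(source: str) -> str:
    """Rebuild a signature string for the first function defined in `source`."""
    node = ast.parse(source).body[0]
    pos_args = node.args.args
    defaults = node.args.defaults
    # Defaults belong to the trailing len(defaults) positional parameters.
    offset = len(pos_args) - len(defaults)
    parts = []
    for i, arg in enumerate(pos_args):
        text = arg.arg
        if arg.annotation is not None:
            text += f": {ast.unparse(arg.annotation)}"
        if i >= offset:
            text += f" = {ast.unparse(defaults[i - offset])}"
        parts.append(text)
    returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
    return f"({', '.join(parts)}){returns}"


example = signature_of("def f(a, b: int = 1) -> str:\n    pass")
```

Because the source is only parsed and never imported, this approach works on files whose imports cannot be resolved.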

@ -3,7 +3,6 @@ from __future__ import annotations
import json
from typing import Any
import jieba
import numpy as np
from pydantic import BaseModel, field_validator
from rank_bm25 import BM25Okapi
@ -182,7 +181,7 @@ class BM25ToolRecommender(ToolRecommender):
self.bm25 = BM25Okapi(tokenized_corpus)
def _tokenize(self, text):
return jieba.lcut(text) # FIXME: needs more sophisticated tokenization
return text.split() # FIXME: needs more sophisticated tokenization
async def recall_tools(self, context: str = "", plan: Plan = None, topk: int = 20) -> list[Tool]:
query = plan.current_task.instruction if plan else context
@ -193,7 +192,7 @@ class BM25ToolRecommender(ToolRecommender):
recalled_tools = [list(self.tools.values())[index] for index in top_indexes]
logger.info(
f"Recalled tools: \n{[tool.name for tool in recalled_tools]}; Scores: {[doc_scores[index] for index in top_indexes]}"
f"Recalled tools: \n{[tool.name for tool in recalled_tools]}; Scores: {[np.round(doc_scores[index], 4) for index in top_indexes]}"
)
return recalled_tools
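
Note on the `_tokenize` change above: `str.split()` splits on whitespace only, so text written without spaces stays a single token. The short sketch below illustrates that behavior. The `tokenize` helper is a local copy made for illustration and does not import from MetaGPT.

```python
def tokenize(text: str) -> list[str]:
    # Same rule as the new _tokenize: split on runs of whitespace.
    return text.split()


spaced = tokenize("read the csv file")
unspaced_cjk = tokenize("读取数据文件")  # no whitespace, so one token
```

This is presumably what the remaining `FIXME` refers to: queries in languages written without spaces will match tool descriptions far less often than they would with a word segmenter.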

@ -10,14 +10,17 @@ from __future__ import annotations
import inspect
import os
from collections import defaultdict
from typing import Union
from pathlib import Path
import yaml
from pydantic import BaseModel
from metagpt.const import TOOL_SCHEMA_PATH
from metagpt.logs import logger
from metagpt.tools.tool_convert import convert_code_to_tool_schema
from metagpt.tools.tool_convert import (
convert_code_to_tool_schema,
convert_code_to_tool_schema_ast,
)
from metagpt.tools.tool_data_type import Tool, ToolSchema
@ -27,21 +30,23 @@ class ToolRegistry(BaseModel):
def register_tool(
self,
tool_name,
tool_path,
schema_path="",
tool_code="",
tags=None,
tool_source_object=None,
include_functions=None,
verbose=False,
tool_name: str,
tool_path: str,
schemas: dict = None,
schema_path: str = "",
tool_code: str = "",
tags: list[str] = None,
tool_source_object=None, # can be any classes or functions
include_functions: list[str] = None,
verbose: bool = False,
):
if self.has_tool(tool_name):
return
schema_path = schema_path or TOOL_SCHEMA_PATH / f"{tool_name}.yml"
schemas = make_schema(tool_source_object, include_functions, schema_path)
if not schemas:
schemas = make_schema(tool_source_object, include_functions, schema_path)
if not schemas:
return
@ -117,9 +122,6 @@ def make_schema(tool_source_object, include, path):
schema = convert_code_to_tool_schema(tool_source_object, include=include)
with open(path, "w", encoding="utf-8") as f:
yaml.dump(schema, f, sort_keys=False)
# import json
# with open(str(path).replace("yml", "json"), "w", encoding="utf-8") as f:
# json.dump(schema, f, ensure_ascii=False, indent=4)
except Exception as e:
schema = {}
logger.error(f"Fail to make schema: {e}")
@ -127,15 +129,49 @@ def make_schema(tool_source_object, include, path):
return schema
def validate_tool_names(tools: Union[list[str], str]) -> str:
def validate_tool_names(tools: list[str]) -> dict[str, Tool]:
assert isinstance(tools, list), "tools must be a list of str"
valid_tools = {}
for key in tools:
# one can define either tool names or tool type names, take union to get the whole set
if TOOL_REGISTRY.has_tool(key):
# one can define either tool names OR tool tags OR tool path, take union to get the whole set
# if tool paths are provided, they will be registered on the fly
if os.path.isdir(key) or os.path.isfile(key):
valid_tools.update(register_tools_from_path(key))
elif TOOL_REGISTRY.has_tool(key):
valid_tools.update({key: TOOL_REGISTRY.get_tool(key)})
elif TOOL_REGISTRY.has_tool_tag(key):
valid_tools.update(TOOL_REGISTRY.get_tools_by_tag(key))
else:
logger.warning(f"invalid tool name or tool type name: {key}, skipped")
return valid_tools
def register_tools_from_file(file_path) -> dict[str, Tool]:
file_name = Path(file_path).name
if not file_name.endswith(".py") or file_name == "setup.py" or file_name.startswith("test"):
return {}
registered_tools = {}
code = Path(file_path).read_text(encoding="utf-8")
tool_schemas = convert_code_to_tool_schema_ast(code)
for name, schemas in tool_schemas.items():
tool_code = schemas.pop("code", "")
TOOL_REGISTRY.register_tool(
tool_name=name,
tool_path=file_path,
schemas=schemas,
tool_code=tool_code,
)
registered_tools.update({name: TOOL_REGISTRY.get_tool(name)})
return registered_tools
def register_tools_from_path(path) -> dict[str, Tool]:
tools_registered = {}
if os.path.isfile(path):
tools_registered.update(register_tools_from_file(path))
elif os.path.isdir(path):
for root, _, files in os.walk(path):
for file in files:
file_path = os.path.join(root, file)
tools_registered.update(register_tools_from_file(file_path))
return tools_registered
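
Note on `register_tools_from_path` and `register_tools_from_file` above: together they walk a path and skip anything that is not a `.py` file, as well as `setup.py` and files whose names start with `test`. The sketch below reproduces only that file-selection step using the standard library. `is_candidate_tool_file`, `collect_tool_files`, and `demo` are hypothetical names used for illustration, and nothing is registered.

```python
import os
import tempfile
from pathlib import Path


def is_candidate_tool_file(file_name: str) -> bool:
    # Mirrors the filter in the diff: .py files, excluding setup.py and test*.
    return file_name.endswith(".py") and file_name != "setup.py" and not file_name.startswith("test")


def collect_tool_files(path: str) -> list[str]:
    """Return candidate tool files under a file or directory path."""
    if os.path.isfile(path):
        return [path] if is_candidate_tool_file(Path(path).name) else []
    found = []
    for root, _, files in os.walk(path):
        for name in files:
            if is_candidate_tool_file(name):
                found.append(os.path.join(root, name))
    return sorted(found)


def demo() -> list[str]:
    """Build a throwaway tree and list which files would be scanned."""
    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("my_tool.py", "setup.py", "test_my_tool.py", "notes.txt", "sub/other_tool.py"):
            target = Path(tmp, rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("", encoding="utf-8")
        return [os.path.relpath(p, tmp) for p in collect_tool_files(tmp)]
```

Calling `demo()` returns only `my_tool.py` and the file under `sub`, relative to the temporary directory.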

@ -3,7 +3,7 @@ from typing import Tuple
def remove_spaces(text):
return re.sub(r"\s+", " ", text).strip()
return re.sub(r"\s+", " ", text).strip() if text else ""
class DocstringParser:
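
Note on the `remove_spaces` change above: the added guard matters because `ast.get_docstring` returns `None` for undocumented nodes. The snippet below copies the new one-line body from the diff and exercises both cases.

```python
import re


def remove_spaces(text):
    # Collapse whitespace runs; treat None or "" as an empty description.
    return re.sub(r"\s+", " ", text).strip() if text else ""


collapsed = remove_spaces("  This is\n   a docstring.  ")
missing = remove_spaces(None)
```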

@ -69,5 +69,4 @@ imap_tools==1.5.0 # Used by metagpt/tools/libs/email_login.py
qianfan==0.3.2
dashscope==1.14.1
rank-bm25==0.2.2 # for tool recommendation
jieba==0.42.1 # for tool recommendation
gymnasium==0.29.1
gymnasium==0.29.1

@ -2,7 +2,10 @@ from typing import Literal, Union
import pandas as pd
from metagpt.tools.tool_convert import convert_code_to_tool_schema
from metagpt.tools.tool_convert import (
convert_code_to_tool_schema,
convert_code_to_tool_schema_ast,
)
class DummyClass:
@ -128,3 +131,91 @@ def test_convert_code_to_tool_schema_function():
def test_convert_code_to_tool_schema_async_function():
schema = convert_code_to_tool_schema(dummy_async_fn)
assert schema.get("type") == "async_function"
TEST_CODE_FILE_TEXT = '''
import pandas as pd # imported obj should not be parsed
from some_module1 import some_imported_function, SomeImportedClass # imported obj should not be parsed
from ..some_module2 import some_imported_function2 # relative import should not result in an error
class MyClass:
"""This is a MyClass docstring."""
def __init__(self, arg1):
"""This is the constructor docstring."""
self.arg1 = arg1
def my_method(self, arg2: Union[list[str], str], arg3: pd.DataFrame, arg4: int = 1, arg5: Literal["a","b","c"] = "a") -> Tuple[int, str]:
"""
This is a method docstring.
Args:
arg2 (Union[list[str], str]): A union of a list of strings and a string.
...
Returns:
Tuple[int, str]: A tuple of an integer and a string.
"""
return self.arg4 + arg5
async def my_async_method(self, some_arg) -> str:
return "hi"
def _private_method(self): # private should not be parsed
return "private"
def my_function(arg1, arg2) -> dict:
"""This is a function docstring."""
return arg1 + arg2
def my_async_function(arg1, arg2) -> dict:
return arg1 + arg2
def _private_function(): # private should not be parsed
return "private"
'''
def test_convert_code_to_tool_schema_ast():
expected = {
"MyClass": {
"type": "class",
"description": "This is a MyClass docstring.",
"methods": {
"__init__": {
"type": "function",
"description": "This is the constructor docstring.",
"signature": "(self, arg1)",
"parameters": "",
},
"my_method": {
"type": "function",
"description": "This is a method docstring. ",
"signature": "(self, arg2: Union[list[str], str], arg3: pd.DataFrame, arg4: int = 1, arg5: Literal['a', 'b', 'c'] = 'a') -> Tuple[int, str]",
"parameters": "Args: arg2 (Union[list[str], str]): A union of a list of strings and a string. ... Returns: Tuple[int, str]: A tuple of an integer and a string.",
},
"my_async_method": {
"type": "async_function",
"description": "",
"signature": "(self, some_arg) -> str",
"parameters": "",
},
},
"code": 'class MyClass:\n """This is a MyClass docstring."""\n def __init__(self, arg1):\n """This is the constructor docstring."""\n self.arg1 = arg1\n\n def my_method(self, arg2: Union[list[str], str], arg3: pd.DataFrame, arg4: int = 1, arg5: Literal["a","b","c"] = "a") -> Tuple[int, str]:\n """\n This is a method docstring.\n \n Args:\n arg2 (Union[list[str], str]): A union of a list of strings and a string.\n ...\n \n Returns:\n Tuple[int, str]: A tuple of an integer and a string.\n """\n return self.arg4 + arg5\n \n async def my_async_method(self, some_arg) -> str:\n return "hi"\n \n def _private_method(self): # private should not be parsed\n return "private"',
},
"my_function": {
"type": "function",
"description": "This is a function docstring.",
"signature": "(arg1, arg2) -> dict",
"parameters": "",
"code": 'def my_function(arg1, arg2) -> dict:\n """This is a function docstring."""\n return arg1 + arg2',
},
"my_async_function": {
"type": "function",
"description": "",
"signature": "(arg1, arg2) -> dict",
"parameters": "",
"code": "def my_async_function(arg1, arg2) -> dict:\n return arg1 + arg2",
},
}
schemas = convert_code_to_tool_schema_ast(TEST_CODE_FILE_TEXT)
assert schemas == expected