diff --git a/metagpt/ext/sela/README.md b/metagpt/ext/sela/README.md
index 829306e36..a942fdb7d 100644
--- a/metagpt/ext/sela/README.md
+++ b/metagpt/ext/sela/README.md
@@ -1,29 +1,26 @@
 # SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
-
-
 ## 1. Data Preparation
-- Download Datasets:https://deepwisdom.feishu.cn/drive/folder/RVyofv9cvlvtxKdddt2cyn3BnTc?from=from_copylink
-- Download and prepare datasets from scratch:
-```
-cd data
-python dataset.py --save_analysis_pool
-python hf_data.py --save_analysis_pool
-```
+You can either download the datasets from the link below or prepare them from scratch.
+- **Download Datasets:** [Dataset Link](https://deepwisdom.feishu.cn/drive/folder/RVyofv9cvlvtxKdddt2cyn3BnTc?from=from_copylink)
+- **Download and prepare datasets from scratch:**
+  ```bash
+  cd data
+  python dataset.py --save_analysis_pool
+  python hf_data.py --save_analysis_pool
+  ```
 
-## 2. Configs
+## 2. Configurations
 
 ### Data Config
-`datasets.yaml` Provide base prompts, metrics, target columns for respective datasets
-
-- Modify `datasets_dir` to the root directory of all the datasets in `data.yaml`
-
+- **`datasets.yaml`:** Provides base prompts, metrics, and target columns for the respective datasets.
+- **`data.yaml`:** Set `datasets_dir` to the base directory of all prepared datasets.
 
 ### LLM Config
-```
+```yaml
 llm:
   api_type: 'openai'
   model: deepseek-coder
@@ -32,237 +29,57 @@ ### LLM Config
   temperature: 0.5
 ```
 
-### Budget
-Experiment rollouts k = 5, 10, 20
-
-
-### Prompt Usage
-
-- Use the function `generate_task_requirement` in `dataset.py` to get task requirement.
-  - If the method is non-DI-based, set `is_di=False`.
-  - Use `utils.DATA_CONFIG` as `data_config`
-
 ## 3. SELA
 
 ### Run SELA
 
 #### Setup
-In the root directory,
-```
+```bash
 pip install -e .
-cd expo
+cd metagpt/ext/sela
 pip install -r requirements.txt
 ```
 
-#### Run
+#### Running Experiments
 
-- Examples
-  ```
-  python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
-  python run_experiment.py --exp_mode mcts --task house-prices --rollouts 10 --low_is_better
-  ```
-
-
-- `--rollouts`
-  The number of rollouts
-
-- `--use_fixed_insights`
-  In addition to the generated insights, include the fixed insights saved in `expo/insights/fixed_insights.json`
-
-- `--low_is_better`
-  If the dataset has reg metric, remember to use `--low_is_better`
-
-- `--from_scratch`
-  Do not use pre-processed insight pool, generate new insight pool based on dataset before running MCTS, facilitating subsequent tuning to propose search space prompts
-
-- `--role_timeout`
-  The timeout for the role
-
-  This feature limits the duration of a single simulation, making the experiment duration more controllable (for example, if you do ten rollouts and set role_timeout to 1,000, the experiment will stop at the latest after 10,000s)
-
-
-- `--max_depth`
-  The maximum depth of MCTS, default is 4 (nodes at this depth directly return the previous simulation result without further expansion)
-
-- `--load_tree`
-  If MCTS was interrupted due to certain reasons but had already run multiple rollouts, you can use `--load_tree`.
-
-  For example:
-  ```
+- **Examples:**
+  ```bash
   python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
-  ```
-
-  If this was interrupted after running three rollouts, you can use `--load_tree`:
-  ```
-  python run_experiment.py --exp_mode mcts --task titanic --rollouts 7 --load_tree
+  python run_experiment.py --exp_mode mcts --task house-prices --rollouts 10 --low_is_better
   ```
 
+#### Parameters
 
-#### Ablation Study
+- **`--rollouts`:** The number of rollouts.
+- **`--use_fixed_insights`:** Include fixed insights saved in `expo/insights/fixed_insights.json`.
+- **`--low_is_better`:** Use this if the dataset has a regression metric.
+- **`--from_scratch`:** Generate a new insight pool based on the dataset before running MCTS.
+- **`--role_timeout`:** Limits the duration of a single simulation (e.g., `10 rollouts with timeout 1,000` = max 10,000s).
+- **`--max_depth`:** Set the maximum depth of MCTS (default is 4).
+- **`--load_tree`:** Load an existing MCTS tree if the previous experiment was interrupted.
+  - Example:
+    ```bash
+    python run_experiment.py --exp_mode mcts --task titanic --rollouts 10
+    ```
+  - To resume:
+    ```bash
+    python run_experiment.py --exp_mode mcts --task titanic --rollouts 7 --load_tree
+    ```
 
-**DI RandomSearch**
+### Ablation Study
 
-- Single insight
-`python run_experiment.py --exp_mode rs --task titanic --rs_mode single`
+**RandomSearch**
 
-- Set insight
-`python run_experiment.py --exp_mode rs --task titanic --rs_mode set`
+- **Use a single insight:**
+  ```bash
+  python run_experiment.py --exp_mode rs --task titanic --rs_mode single
+  ```
-
-## 4. Evaluation
-
-Each baseline needs to produce `dev_predictions.csv`和`test_predictions.csv`. Each csv file only needs a `target` column.
-
-- Use the function `evaluate_score` to evaluate.
-
-#### MLE-Bench
-**Note: mle-bench requires python 3.11 or higher**
-```
-git clone https://github.com/openai/mle-bench.git
-cd mle-bench
-pip install -e .
-```
-
-```
-mlebench prepare -c --data-dir
-```
-
-Enter the following command to run the experiment:
-```
-python run_experiment.py --exp_mode mcts --custom_dataset_dir --rollouts 10 --from_scratch --role_timeout 3600
-```
-
-
-## 5. Baselines
-
-### AIDE
-
-#### Setup
-The version of AIDE we use is dated September 30, 2024
-```
-git clone https://github.com/WecoAI/aideml.git
-git checkout 77953247ea0a5dc1bd502dd10939dd6d7fdcc5cc
-```
-
-Modify `aideml/aide/utils/config.yaml` - change `k_fold_validation`, `code model`, and `feedback model` as follows:
-
-```yaml
-# agent hyperparams
-agent:
-  # how many improvement iterations to run
-  steps: 10
-  # whether to instruct the agent to use CV (set to 1 to disable)
-  k_fold_validation: 1
-  # LLM settings for coding
-  code:
-    model: deepseek-coder
-    temp: 0.5
-
-  # LLM settings for evaluating program output / tracebacks
-  feedback:
-    model: deepseek-coder
-    temp: 0.5
-
-  # hyperparameters for the tree search
-  search:
-    max_debug_depth: 3
-    debug_prob: 0.5
-    num_drafts: 5
-```
-
-Since Deepseek is compatible to OpenAI's API, change `base_url` into `your own url`,`api_key` into `your api key`
-
-```
-export OPENAI_API_KEY="your api key"
-export OPENAI_BASE_URL="your own url"
-```
-
-Modify `aideml/aide/backend/__init__.py`'s line 30 and below:
-
-```python
-model_kwargs = model_kwargs | {
-    "model": model,
-    "temperature": temperature,
-    "max_tokens": max_tokens,
-}
-if "claude-" in model:
-    query_func = backend_anthropic.query
-else:
-    query_func = backend_openai.query
-```
-
-Since deepseekV2.5 no longer supports system message using function call, modify `aideml/aide/agent.py`'s line 312:
-
-```python
-response = cast(
-    dict,
-    query(
-        system_message=None,
-        user_message=prompt,
-        func_spec=review_func_spec,
-        model=self.acfg.feedback.model,
-        temperature=self.acfg.feedback.temp,
-    ),
-)
-```
-
-Modify and install:
-
-```
-cd aideml
-pip install -e .
-```
-
-#### Run
-
-Run the following script to get the running results, a `log` folder and a `workspace` folder will be generated in the current directory
-The `log` folder will contain the experimental configuration and the generated scheme, and the `workspace` folder will save the final results generated by aide
-
-```
-python runner/aide.py
-```
-
-### Autogluon
-#### Setup
-```
-pip install -U pip
-pip install -U setuptools wheel
-pip install autogluon==1.1.1
-```
-
-For Tabular data:
-```
-python run_expriment.py --exp_mode autogluon --task {task_name}
-```
-For Multimodal data:
-```
-python run_expriment.py --exp_mode autogluon --task {task_name} --is_multimodal
-```
-Replace {task_name} with the specific task you want to run.
-
-
-### AutoSklearn
-#### System requirements
-auto-sklearn has the following system requirements:
-
-- Linux operating system (for example Ubuntu)
-
-- Python (>=3.7)
-
-- C++ compiler (with C++11 supports)
-
-In case you try to install Auto-sklearn on a system where no wheel files for the pyrfr package are provided (see here for available wheels) you also need:
-
-- SWIG [(get SWIG here).](https://www.swig.org/survey.html)
-
-For an explanation of missing Microsoft Windows and macOS support please check the Section [Windows/macOS compatibility](https://automl.github.io/auto-sklearn/master/installation.html#windows-macos-compatibility).
-
-#### Setup
-```
-pip install auto-sklearn==0.15.0
-```
-
-#### Run
-```
-python run_experiment.py --exp_mode autosklearn --task titanic
-```
-
-### Base DI
-For setup, check 4.
-- `python run_experiment.py --exp_mode base --task titanic --num_experiments 10`
-- Specifically instruct DI to use AutoGluon: `--special_instruction ag`
-- Specifically instruct DI to use the stacking ensemble method: `--special_instruction stacking`
\ No newline at end of file
+- **Use a set of insights:**
+  ```bash
+  python run_experiment.py --exp_mode rs --task titanic --rs_mode set
+  ```
\ No newline at end of file
diff --git a/metagpt/ext/sela/runner/README.md b/metagpt/ext/sela/runner/README.md
new file mode 100644
index 000000000..7c031f1ee
--- /dev/null
+++ b/metagpt/ext/sela/runner/README.md
@@ -0,0 +1,198 @@
+# SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
+
+This document provides instructions for running the baseline models. Before starting, prepare the datasets as instructed in `sela/README.md`.
+
+## Baselines
+
+### 1. AIDE
+
+#### Setup
+
+We use the AIDE version from September 30, 2024. Clone the repository and check out the specified commit:
+
+```bash
+git clone https://github.com/WecoAI/aideml.git
+git -C aideml checkout 77953247ea0a5dc1bd502dd10939dd6d7fdcc5cc
+```
+
+Modify `aideml/aide/utils/config.yaml` to set the following parameters:
+
+```yaml
+# agent hyperparams
+agent:
+  steps: 10  # Number of improvement iterations
+  k_fold_validation: 1  # Set to 1 to disable cross-validation
+  code:
+    model: deepseek-coder
+    temp: 0.5
+  feedback:
+    model: deepseek-coder
+    temp: 0.5
+  search:
+    max_debug_depth: 3
+    debug_prob: 0.5
+    num_drafts: 5
+```
+
+Update your OpenAI API credentials in the environment:
+
+```bash
+export OPENAI_API_KEY="your api key"
+export OPENAI_BASE_URL="your own url"
+```
+
+Modify `aideml/aide/backend/__init__.py` (line 30 and below):
+
+```python
+model_kwargs = model_kwargs | {
+    "model": model,
+    "temperature": temperature,
+    "max_tokens": max_tokens,
+}
+if "claude-" in model:
+    query_func = backend_anthropic.query
+else:
+    query_func = backend_openai.query
+```
+
+Since DeepSeek V2.5 no longer supports system messages when function calling is used, modify `aideml/aide/agent.py` (line 312):
+
+```python
+response = cast(
+    dict,
+    query(
+        system_message=None,
+        user_message=prompt,
+        func_spec=review_func_spec,
+        model=self.acfg.feedback.model,
+        temperature=self.acfg.feedback.temp,
+    ),
+)
+```
+
+Finally, install AIDE:
+
+```bash
+cd aideml
+pip install -e .
+```
+
+#### Run
+
+Execute the following script to generate results. A `log` folder (containing experimental configurations) and a `workspace` folder (storing final results) will be created:
+
+```bash
+python runner/aide.py
+```
+
+---
+
+### 2. AutoGluon
+
+#### Setup
+
+Install AutoGluon:
+
+```bash
+pip install -U pip
+pip install -U setuptools wheel
+pip install autogluon==1.1.1
+```
+
+#### Run
+
+For Tabular data:
+
+```bash
+python run_experiment.py --exp_mode autogluon --task {task_name}
+```
+
+For Multimodal data:
+
+```bash
+python run_experiment.py --exp_mode autogluon --task {task_name} --is_multimodal
+```
+
+Replace `{task_name}` with the specific task you want to run.
+
+---
+
+### 3. AutoSklearn
+
+**Note:**
+AutoSklearn requires:
+- Linux operating system (e.g., Ubuntu)
+- Python (>=3.7)
+- C++ compiler (with C++11 support)
+
+If installing on a system without wheel files for the `pyrfr` package, you also need:
+
+- [SWIG](https://www.swig.org/survey.html)
+
+Refer to the [Windows/macOS compatibility](https://automl.github.io/auto-sklearn/master/installation.html#windows-macos-compatibility) section for further details.
+
+#### Setup
+
+Install AutoSklearn:
+
+```bash
+pip install auto-sklearn==0.15.0
+```
+
+#### Run
+
+Execute the following command for the Titanic task:
+
+```bash
+python run_experiment.py --exp_mode autosklearn --task titanic
+```
+
+---
+
+### 4. Base Data Interpreter
+
+Run the following command for the Titanic task:
+
+```bash
+python run_experiment.py --exp_mode base --task titanic --num_experiments 10
+```
+
+---
+
+### 5. Custom Baselines
+
+To run additional baselines:
+
+- Each baseline must produce `dev_predictions.csv` and `test_predictions.csv` with a `target` column.
+- Use the `evaluate_score` function for evaluation.
+
+---
+
+## MLE-Bench
+
+**Note:** MLE-Bench requires Python 3.11 or higher.
+
+#### Setup
+
+Clone the repository and install:
+
+```bash
+git clone https://github.com/openai/mle-bench.git
+cd mle-bench
+pip install -e .
+```
+
+Prepare the data:
+
+```bash
+mlebench prepare -c --data-dir
+```
+
+#### Run the MLE-Bench Experiment
+
+Run the following command to execute the experiment:
+
+```bash
+python run_experiment.py --exp_mode mcts --custom_dataset_dir --rollouts 10 --from_scratch --role_timeout 3600
+```
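
For the custom-baselines requirement described above (a `dev_predictions.csv` and a `test_predictions.csv`, each containing only a `target` column), the files can be produced with a few lines of stdlib Python. This is a minimal sketch: the helper name `write_predictions` and the dummy prediction values are illustrative, not part of SELA; only the file names and the `target` column come from this document.

```python
import csv

def write_predictions(path, predictions):
    """Write a one-column CSV whose only header is `target`,
    the format the custom-baseline evaluation expects."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["target"])          # single required column
        writer.writerows([p] for p in predictions)

# Illustrative dummy predictions for the dev and test splits.
write_predictions("dev_predictions.csv", [0, 1, 1, 0])
write_predictions("test_predictions.csv", [1, 0])
```

With both files written this way, they can then be passed to the `evaluate_score` function mentioned in the Custom Baselines section.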