MetaGPT/expo
Yizhou Chi 3a57060e25 1. add eval_func for sela and compatibility to others
2. llm extract score (use all code block and execution results)
3. add argument for custom dataset dir
4. dataset custom requirement support
2024-10-12 17:16:51 +08:00

SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning

1. Data Preparation

2. Configs

Data Config

datasets.yaml provides the base prompts, metrics, and target columns for the respective datasets.

  • In data.yaml, set datasets_dir to the root directory of all the datasets.

LLM Config

llm:
  api_type: 'openai'
  model: deepseek-coder
  base_url: "https://oneapi.deepwisdom.ai/v1"
  api_key: sk-xxx
  temperature: 0.5

Budget

Experiment rollouts k = 5, 10, 20

Prompt Usage

  • Use the function generate_task_requirement in dataset.py to get task requirement.
    • If the method is non-DI-based, set is_di=False.
    • Use utils.DATA_CONFIG as data_config

3. SELA

Run SELA

Setup

In the root directory,

pip install -e .

cd expo

pip install -r requirements.txt

Run

  • python run_experiment.py --exp_mode mcts --task titanic --rollouts 10

If the dataset uses a regression metric (where a lower score is better), remember to pass --low_is_better:

  • python run_experiment.py --exp_mode mcts --task house_prices --rollouts 10 --low_is_better
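The effect of a "lower is better" switch can be illustrated with a minimal sketch. This is not SELA's implementation; the helper name best_score and the example scores are made up for illustration. It only shows why the direction of the metric has to be stated when candidate solutions are compared.

```python
def best_score(scores, low_is_better=False):
    """Return the best value from a list of scores for the given metric direction."""
    if not scores:
        raise ValueError("scores must not be empty")
    # For error-style metrics (e.g. RMSE) the smallest value wins;
    # for accuracy-style metrics the largest value wins.
    return min(scores) if low_is_better else max(scores)


# Accuracy-style metric: higher is better.
print(best_score([0.78, 0.81, 0.80]))
# Error-style metric: lower is better.
print(best_score([0.142, 0.131, 0.139], low_is_better=True))
```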

To include the fixed insights saved in expo/insights/fixed_insights.json in addition to the generated insights, add the flag:

  • --use_fixed_insights
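The commands above can also be assembled programmatically, for example to sweep the rollout budgets k = 5, 10, 20 listed under Budget. The sketch below only uses flags that appear in this README; the helper name build_sela_command is hypothetical and not part of the repository.

```python
import shlex


def build_sela_command(task, rollouts, low_is_better=False, use_fixed_insights=False):
    """Build the argument list for one SELA (MCTS) run of run_experiment.py."""
    cmd = [
        "python", "run_experiment.py",
        "--exp_mode", "mcts",
        "--task", task,
        "--rollouts", str(rollouts),
    ]
    if low_is_better:
        cmd.append("--low_is_better")
    if use_fixed_insights:
        cmd.append("--use_fixed_insights")
    return cmd


# Print one command per rollout budget for a regression task.
for k in (5, 10, 20):
    print(shlex.join(build_sela_command("house_prices", k, low_is_better=True)))
```

To actually launch a run, the returned list can be passed to subprocess.run(cmd, check=True) from inside the expo directory.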

Ablation Study

DI RandomSearch

  • Single insight: python run_experiment.py --exp_mode aug --task titanic --aug_mode single

  • Set insight: python run_experiment.py --exp_mode aug --task titanic --aug_mode set

4. Evaluation

Each baseline needs to produce dev_predictions.csv and test_predictions.csv. Each CSV file only needs a target column.

  • Use the function evaluate_score to evaluate.
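A baseline could write the two prediction files with a small helper like the one below. This is a sketch under assumptions: the helper name write_predictions is hypothetical, and the column header "target" is a guess based on the phrase "a target column", so check which header evaluate_score expects before relying on it.

```python
import csv
import tempfile
from pathlib import Path


def write_predictions(output_dir, dev_preds, test_preds, target_col="target"):
    """Write dev_predictions.csv and test_predictions.csv, each with one column."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for split, preds in (("dev", dev_preds), ("test", test_preds)):
        path = out / f"{split}_predictions.csv"
        with path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([target_col])          # header row (assumed name)
            writer.writerows([p] for p in preds)   # one prediction per row
        paths[split] = path
    return paths


# Example: write predictions into a temporary directory and list the files.
with tempfile.TemporaryDirectory() as tmp:
    written = write_predictions(tmp, dev_preds=[0, 1, 1], test_preds=[1, 0])
    print(sorted(p.name for p in written.values()))
```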

5. Baselines

DS Agent

git clone https://github.com/guosyjlu/DS-Agent.git

Modify lines 46-48 of its deployment/generate.py as follows, so that it uses the DeepSeek API instead of the GPT API:

messages = [{"role": "user", "content": prompt}]

if 'gpt' in llm:
    response = openai.ChatCompletion.create(**{"messages": messages,**raw_request})
    raw_completion = response["choices"][0]["message"]["content"]
    
elif llm == 'deepseek-coder':
    from openai import OpenAI
    client = OpenAI(
        api_key="yours", 
        base_url="https://oneapi.deepwisdom.ai/v1"
    )
    response = client.chat.completions.create(
        model="deepseek-coder",
        messages=[
            # {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": prompt},
        ],
        temperature=temperature,
        stream=False
    )
    raw_completion = response.choices[0].message.content

completion = raw_completion.split("```python")[1].split("```")[0]
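The final split above raises an IndexError when the reply contains no python-fenced block. A more defensive variant might look like the sketch below. It is not part of DS-Agent: the helper name extract_python_block and the fallback of returning the whole reply are assumptions, and unlike the original line it also strips surrounding whitespace.

```python
# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_python_block(raw_completion):
    """Return the first python-fenced block, or the whole reply if none is found."""
    opener = FENCE + "python"
    if opener not in raw_completion:
        # Assumed fallback: treat the entire reply as code.
        return raw_completion.strip()
    after_opener = raw_completion.split(opener, 1)[1]
    return after_opener.split(FENCE, 1)[0].strip()


reply = "Here is the code:\n" + FENCE + "python\nprint(1)\n" + FENCE + "\nDone."
print(extract_python_block(reply))
```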

After the modification, create a new deployment/test.sh and run the following two lines, where $TASK is the name of the task you want to test:

python -u generate.py --llm deepseek-coder --task $TASK --shot 1 --retrieval > "$TASK".txt 2>&1 

python -u evaluation.py --path "deepseek-coder_True_1" --task $TASK --device 0  > "$TASK"_eval.txt 2>&1 

AIDE

Setup

git clone https://github.com/WecoAI/aideml.git

Modify aideml/aide/utils/config.yaml as follows:

# path to the task data directory
data_dir: null

# either provide a path to a plaintext file describing the task
desc_file: null
# or provide the task goal (and optionally evaluation information) as arguments
goal: null
eval: null

log_dir: logs
workspace_dir: workspaces

# whether to unzip any archives in the data directory
preprocess_data: True
# whether to copy the data to the workspace directory (otherwise it will be symlinked)
# copying is recommended to prevent the agent from accidentally modifying the original data
copy_data: True

exp_name: null # a random experiment name will be generated if not provided

# settings for code execution
exec:
  timeout: 3600
  agent_file_name: runfile.py
  format_tb_ipython: False

# agent hyperparams
agent:
  # how many improvement iterations to run
  steps: 10
  # whether to instruct the agent to use CV (set to 1 to disable)
  k_fold_validation: 1
  # whether to instruct the agent to generate a prediction function
  expose_prediction: False
  # whether to provide the agent with a preview of the data
  data_preview: True

  # LLM settings for coding
  code:
    model: deepseek-coder
    temp: 0.5

  # LLM settings for evaluating program output / tracebacks
  feedback:
    model: deepseek-coder
    temp: 0.5

  # hyperparameters for the tree search
  search:
    max_debug_depth: 3
    debug_prob: 0.5
    num_drafts: 5

Because DeepSeek is fully compatible with the OpenAI API, you only need to set base_url to your own URL and api_key to your own key:

export OPENAI_API_KEY="your-key"
export OPENAI_BASE_URL="your-url"
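Before starting a long run it can help to confirm that both variables are visible to Python. The following is a minimal sketch; the helper name read_openai_env is hypothetical and is not part of AIDE or the OpenAI client.

```python
import os


def read_openai_env(environ=None):
    """Return the API key and base URL, or raise if either variable is unset."""
    env = os.environ if environ is None else environ
    required = ("OPENAI_API_KEY", "OPENAI_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"api_key": env["OPENAI_API_KEY"], "base_url": env["OPENAI_BASE_URL"]}


# Example with an explicit mapping instead of the real environment.
settings = read_openai_env({
    "OPENAI_API_KEY": "sk-xxx",
    "OPENAI_BASE_URL": "https://example.com/v1",
})
print(settings["base_url"])
```

Calling read_openai_env() with no argument checks the real process environment.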

Modify line 30 of aideml/aide/backend/__init__.py as follows:

model_kwargs = model_kwargs | {
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if "claude-" in model:
        query_func = backend_anthropic.query
    else:
        query_func = backend_openai.query

Because DeepSeek V2.5 no longer supports function calls together with a system message, modify line 312 of aideml/aide/agent.py as follows:

response = cast(
            dict,
            query(
                system_message=None,
                user_message=prompt,
                func_spec=review_func_spec,
                model=self.acfg.feedback.model,
                temperature=self.acfg.feedback.temp,
            ),
        )

After the modifications, install AIDE:

cd aideml
pip install -e .

Run

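Run the script below to obtain the results. It creates a log folder and a workspace folder in the current directory. The log folder contains the configuration used for the experiment and a record of the generated solutions; the workspace folder stores the final result files produced by AIDE.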

python experimenter/aide.py

Autogluon

Setup

pip install -U pip
pip install -U setuptools wheel
pip install autogluon

For Tabular data:

python run_experiment.py --exp_mode autogluon --task {task_name}

For Multimodal data:

python run_experiment.py --exp_mode autogluon --task {task_name} --is_multimodal

Replace {task_name} with the specific task you want to run.

Provide the GitHub link and describe the commands and parameter settings used.

AutoSklearn

System requirements

auto-sklearn has the following system requirements:

  • Linux operating system (for example Ubuntu)

  • Python (>=3.7)

  • C++ compiler (with C++11 support)

In case you try to install Auto-sklearn on a system where no wheel files for the pyrfr package are provided (see here for available wheels) you also need:

For an explanation of missing Microsoft Windows and macOS support please check the Section Windows/macOS compatibility.

Setup

pip install auto-sklearn

Run

python run_experiment.py --exp_mode autosklearn --task titanic

Base DI

For setup, see the Setup steps under 3. SELA.

  • python run_experiment.py --exp_mode base --task titanic --num_experiments 10
  • Specifically instruct DI to use AutoGluon: --special_instruction ag
  • Specifically instruct DI to use the stacking ensemble method: --special_instruction stacking