mirror of https://github.com/FoundationAgents/MetaGPT.git
synced 2026-05-12 09:12:38 +02:00

Update Latest Review

This commit is contained in:
parent 92e520ded2
commit f0a3a3f739

15 changed files with 276 additions and 348 deletions
@@ -5,7 +5,7 @@ # AFlow: Automating Agentic Workflow Generation
 [Read our paper on arXiv](https://arxiv.org/abs/2410.10762)

 <p align="center">
-<a href=""><img src="../../docs/resources/AFLOW-performance.jpg" alt="Performance Of AFLOW" title="Performance of AFlow<sub>1</sub>" width="80%"></a>
+<a href=""><img src="../../docs/resources/aflow/AFLOW-performance.jpg" alt="Performance Of AFLOW" title="Performance of AFlow<sub>1</sub>" width="80%"></a>
 </p>

 ## Framework Components
@@ -17,7 +17,7 @@ ## Framework Components
 - **Evaluator**: Assesses workflow performance on given tasks. Provides feedback to guide the optimization process towards more effective workflows. See `metagpt/ext/aflow/scripts/evaluator.py` for details.

 <p align="center">
-<a href=""><img src="../../docs/resources/AFLOW-method.jpg" alt="Performance Of AFLOW" title="Framework of AFlow <sub>1</sub>" width="80%"></a>
+<a href=""><img src="../../docs/resources/aflow/AFLOW-method.jpg" alt="Performance Of AFLOW" title="Framework of AFlow <sub>1</sub>" width="80%"></a>
 </p>

 ## Datasets
@@ -26,7 +26,7 @@ ### Experimental Datasets
 We conducted experiments on six datasets (HumanEval, MBPP, GSM8K, MATH, HotpotQA, DROP) and provide their evaluation code. The data can be found in this [datasets](https://drive.google.com/uc?export=download&id=1DNoegtZiUhWtvkd2xoIuElmIi4ah7k8e) link, or you can download them using `metagpt/ext/aflow/data/download_data.py`

 <p align="center">
-<a href=""><img src="../../docs/resources/AFLOW-experiment.jpg" alt="Performance Of AFLOW" title="Comparison bewteen AFlow and other methods <sub>1</sub>" width="80%"></a>
+<a href=""><img src="../../docs/resources/aflow/AFLOW-experiment.jpg" alt="Performance Of AFLOW" title="Comparison bewteen AFlow and other methods <sub>1</sub>" width="80%"></a>
 </p>

 ### Custom Datasets
@@ -34,31 +34,41 @@ ### Custom Datasets

 ## Quick Start

-1. Configure your search in `optimize.py`:
-   - Open `examples/aflow/optimize.py`
-   - Set the following parameters:
+1. Configure optimization parameters:
+   - Use command line arguments or modify default parameters in `examples/aflow/optimize.py`:
 ```python
-dataset: DatasetType = "MATH"  # Ensure the type is consistent with DatasetType
-sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
-question_type: QuestionType = "math"  # Ensure the type is consistent with QuestionType
-optimized_path: str = "metagpt/ext/aflow/scripts/optimized"  # Optimized Result Save Path
-initial_round: int = 1  # Corrected the case from Initial_round to initial_round
-max_rounds: int = 20  # The max iteration of AFLOW.
-check_convergence: bool = True  # Whether Early Stop
-validation_rounds: int = 5  # The validation rounds of AFLOW.
-if_fisrt_optimize = True  # You should change it to False after the first optimize.
+--dataset MATH  # Dataset type (HumanEval/MBPP/GSM8K/MATH/HotpotQA/DROP)
+--sample 4  # Sample count - number of workflows to be resampled
+--question_type math  # Question type (math/code/qa)
+--optimized_path PATH  # Optimized result save path
+--initial_round 1  # Initial round
+--max_rounds 20  # Max iteration rounds for AFLOW
+--check_convergence  # Whether to enable early stop
+--validation_rounds 5  # Validation rounds for AFLOW
+--if_first_optimize  # Set True for first optimization, False afterwards
 ```
+   - Adjust these parameters according to your specific requirements and dataset
-2. Set up parameters in `config/config2.yaml` (see `examples/aflow/config2.example.yaml` for reference)
-3. Set the operator you want to use in `optimize.py` and in `optimized_path/template/operator.py`, `optimized_path/template/operator.json`. You can reference our implementation to add operators for specific datasets
-4. When you first run, you can download the datasets and initial rounds by setting `download(["datasets", "initial_rounds"])` in `examples/aflow/optimize.py`
-5. (Optional) Add your custom dataset and corresponding evaluation function following the [Custom Datasets](#custom-datasets) section
-6. (Optional) If you want to use a portion of the validation data, you can set `va_list` in `examples/aflow/evaluator.py`
-6. Run `python -m examples.aflow.optimize` to start the optimization process!
+2. Configure LLM parameters in `config/config2.yaml` (see `examples/aflow/config2.example.yaml` for reference)
+3. Set up operators in `optimize.py` and in `optimized_path/template/operator.py`, `optimized_path/template/operator.json`. You can reference our implementation to add operators for specific datasets
+4. For first-time use, download datasets and initial rounds by setting `download(["datasets", "initial_rounds"])` in `examples/aflow/optimize.py`
+5. (Optional) Add your custom dataset and corresponding evaluation function following the [Custom Datasets](#custom-datasets) section
+6. (Optional) If you want to use a portion of the validation data, you can set `va_list` in `examples/aflow/evaluator.py`
+7. Run the optimization:
+```bash
+# Using default parameters
+python -m examples.aflow.optimize
+
+# Or with custom parameters
+python -m examples.aflow.optimize --dataset MATH --sample 4 --question_type math
+```

 ## Reproduce the Results in the Paper
-1. We provide the raw data obtained from our experiments (link), including the workflows and prompts generated in each iteration, as well as their trajectories on the validation dataset. We also provide the optimal workflow for each dataset and the corresponding data on the test dataset. You can download these data using `metagpt/ext/aflow/data/download_data.py`.
+1. We provide the raw data obtained from our experiments ([download link](https://drive.google.com/uc?export=download&id=1Sr5wjgKf3bN8OC7G6cO3ynzJqD4w6_Dv)), including the workflows and prompts generated in each iteration, as well as their trajectories on the validation dataset. We also provide the optimal workflow for each dataset and the corresponding data on the test dataset. You can download these data using `metagpt/ext/aflow/data/download_data.py`.
 2. You can directly reproduce our experimental results by running the scripts in `examples/aflow/experiments`.
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "DROP"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "GSM8K"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "HotpotQA"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "HumanEval"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "MATH"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -6,14 +6,6 @@
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

-# DatasetType, QuestionType, and OptimizerType definitions
-# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
-# QuestionType = Literal["math", "code", "qa"]
-# OptimizerType = Literal["Graph", "Test"]
-
-# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-# download(["datasets", "initial_rounds"])
-
 # Crucial Parameters
 dataset: DatasetType = "MBPP"  # Ensure the type is consistent with DatasetType
 sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
@@ -3,25 +3,33 @@
 # @Author : didi
 # @Desc : Entrance of AFlow.

+import argparse
+
 from metagpt.configs.models_config import ModelsConfig
 from metagpt.ext.aflow.data.download_data import download
-from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType
+from metagpt.ext.aflow.scripts.optimizer import Optimizer

 # DatasetType, QuestionType, and OptimizerType definitions
 # DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
 # QuestionType = Literal["math", "code", "qa"]
 # OptimizerType = Literal["Graph", "Test"]

-# Crucial Parameters
-dataset: DatasetType = "MATH"  # Ensure the type is consistent with DatasetType
-sample: int = 4  # Sample Count, which means how many workflows will be resampled from generated workflows
-question_type: QuestionType = "math"  # Ensure the type is consistent with QuestionType
-optimized_path: str = "metagpt/ext/aflow/scripts/optimized"  # Optimized Result Save Path
-initial_round: int = 1  # Corrected the case from Initial_round to initial_round
-max_rounds: int = 20  # The max iteration of AFLOW.
-check_convergence: bool = True  # Whether Early Stop
-validation_rounds: int = 5  # The validation rounds of AFLOW.
-if_fisrt_optimize = True  # You should change it to False after the first optimize.
+def parse_args():
+    parser = argparse.ArgumentParser(description="AFlow Optimizer")
+    parser.add_argument("--dataset", type=str, default="MATH", help="Dataset type")
+    parser.add_argument("--sample", type=int, default=4, help="Sample count")
+    parser.add_argument("--question_type", type=str, default="math", help="Question type")
+    parser.add_argument(
+        "--optimized_path", type=str, default="metagpt/ext/aflow/scripts/optimized", help="Optimized result save path"
+    )
+    parser.add_argument("--initial_round", type=int, default=1, help="Initial round")
+    parser.add_argument("--max_rounds", type=int, default=20, help="Max iteration rounds")
+    parser.add_argument("--check_convergence", type=bool, default=True, help="Whether to enable early stop")
+    parser.add_argument("--validation_rounds", type=int, default=5, help="Validation rounds")
+    parser.add_argument("--if_first_optimize", type=bool, default=True, help="Whether this is first optimization")
+    return parser.parse_args()
+

 # Config llm model, you can modify `config/config2.yaml` to use more llms.
 mini_llm_config = ModelsConfig.default().get("gpt-4o-mini")
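One caveat about the `parse_args` shown in this hunk: `type=bool` does not parse booleans the way the help text implies, because `bool()` applied to any non-empty string (including `"False"`) returns `True`, so `--check_convergence False` would still enable early stop. A minimal standalone sketch of the usual workaround (the `str2bool` converter here is illustrative, not part of the commit):

```python
import argparse


def str2bool(value: str) -> bool:
    # argparse passes the raw CLI string to type(); bool("False") is True,
    # so an explicit converter is needed for flag-like options.
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser(description="bool-flag sketch")
parser.add_argument("--check_convergence", type=str2bool, default=True)
args = parser.parse_args(["--check_convergence", "False"])
print(args.check_convergence)  # False; with type=bool this would be True
```

On Python 3.9+, `action=argparse.BooleanOptionalAction` (giving paired `--flag/--no-flag` options) is another common fix.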
@@ -37,24 +45,26 @@ operators = [
     "Programmer",  # It's for math
 ]

-# Create an optimizer instance
-optimizer = Optimizer(
-    dataset=dataset,  # Config dataset
-    question_type=question_type,  # Config Question Type
-    opt_llm_config=claude_llm_config,  # Config Optimizer LLM
-    exec_llm_config=mini_llm_config,  # Config Execution LLM
-    check_convergence=check_convergence,  # Whether Early Stop
-    operators=operators,  # Config Operators you want to use
-    optimized_path=optimized_path,  # Config Optimized workflow's file path
-    sample=sample,  # Only Top(sample) rounds will be selected.
-    initial_round=initial_round,  # Optimize from initial round
-    max_rounds=max_rounds,  # The max iteration of AFLOW.
-    validation_rounds=validation_rounds,  # The validation rounds of AFLOW.
-)
-
+if __name__ == "__main__":
+    args = parse_args()
+
+    # Create an optimizer instance
+    optimizer = Optimizer(
+        dataset=args.dataset,  # Config dataset
+        question_type=args.question_type,  # Config Question Type
+        opt_llm_config=claude_llm_config,  # Config Optimizer LLM
+        exec_llm_config=mini_llm_config,  # Config Execution LLM
+        check_convergence=args.check_convergence,  # Whether Early Stop
+        operators=operators,  # Config Operators you want to use
+        optimized_path=args.optimized_path,  # Config Optimized workflow's file path
+        sample=args.sample,  # Only Top(sample) rounds will be selected.
+        initial_round=args.initial_round,  # Optimize from initial round
+        max_rounds=args.max_rounds,  # The max iteration of AFLOW.
+        validation_rounds=args.validation_rounds,  # The validation rounds of AFLOW.
+    )
+
+    # When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
-download(["datasets", "initial_rounds"], if_first_download=if_fisrt_optimize)
+    download(["datasets", "initial_rounds"], if_first_download=args.if_first_optimize)
+    # Optimize workflow via setting the optimizer's mode to 'Graph'
+    optimizer.optimize("Graph")
+    # Test workflow via setting the optimizer's mode to 'Test'
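The shape of this refactor, argparse defaults plus an `if __name__ == "__main__"` guard, can be sketched without MetaGPT installed. The stub below mirrors a few of the flags and swaps the `Optimizer` call for a plain dict so the argument plumbing is visible; all names are illustrative, not the repo's API:

```python
import argparse


def parse_args(argv=None):
    # Mirrors a subset of the flags added in optimize.py; accepting an
    # explicit argv list keeps the function testable without sys.argv.
    parser = argparse.ArgumentParser(description="AFlow Optimizer (sketch)")
    parser.add_argument("--dataset", type=str, default="MATH", help="Dataset type")
    parser.add_argument("--sample", type=int, default=4, help="Sample count")
    parser.add_argument("--question_type", type=str, default="math", help="Question type")
    parser.add_argument("--max_rounds", type=int, default=20, help="Max iteration rounds")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Stand-in for Optimizer(...).optimize("Graph"): just echo the resolved
    # configuration so the CLI-to-constructor wiring is easy to check.
    return {"dataset": args.dataset, "sample": args.sample, "max_rounds": args.max_rounds}


if __name__ == "__main__":
    print(main())
```

Keeping the guard (rather than constructing the optimizer at import time, as the old module-level code did) means the script can be imported for testing without triggering downloads or optimization runs.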