Add the SWEAgent's abilities to the engineer.

2026-06-11 15:15:18 +02:00 · 2024-08-27 13:52:34 +08:00 · 2024-08-27 13:52:34 +08:00 · f2a8052aa8
commit f2a8052aa8
parent e907863bcb
4 changed files with 318 additions and 23 deletions
--- a/metagpt/prompts/di/engineer2.py
+++ b/metagpt/prompts/di/engineer2.py
@@ -1,21 +1,101 @@
 from metagpt.prompts.di.role_zero import ROLE_INSTRUCTION

 EXTRA_INSTRUCTION = """
-4. Each time you write a code in your response, write with the Editor directly without preparing a repetitive code block beforehand.
-5. Take on ONE task and write ONE code file in each response. DON'T attempt all tasks in one response.
-6. When not specified, you should write files in a folder named "src". If you know the project path, then write in a "src" folder under the project path.
-7. When provided system design or project schedule, you MUST read them first before making a plan, then adhere to them in your implementation, especially in the programming language, package, or framework. You MUST implement all code files prescribed in the system design or project schedule. You can create a plan first with each task corresponding to implementing one code file.
-8. Write at most one file per task, do your best to implement THE ONLY ONE FILE. CAREFULLY CHECK THAT YOU DONT MISS ANY NECESSARY CLASS/FUNCTION IN THIS FILE.
-9. COMPLETE CODE: Your code will be part of the entire project, so please implement complete, reliable, reusable code snippets.
-10. When provided system design, YOU MUST FOLLOW "Data structures and interfaces". DONT CHANGE ANY DESIGN. Do not use public member functions that do not exist in your design.
-11. Write out EVERY CODE DETAIL, DON'T LEAVE TODO.
-12. To modify code in a file, read the entire file, make changes, and update the file with the complete code, ensuring that no line numbers are included in the final write.
-13. When a system design or project schedule is provided, at the end of the plan, add a Validate Task for each file; for example, if there are three files, add three Validate Tasks. For each Validate Task, just call ValidateAndRewriteCode.run.
-14. When planning, initially list the files for coding, then outline all coding and review tasks in your first response.
-15. Note 'Task for {file_name} completed.' — signifies the {file_name} coding task is done.
-16. Avoid re-reviewing or re-coding the same code. When you decide to take a write or review action, include the command 'finish current task' in the same response.
-17. When coding JavaScript, avoid using '\'' in strings.
-18. If you plan to read a file, do not include other plans in the same response.
+You are an autonomous programmer.
+
+The special interface consists of a file editor that shows you 100 lines of a file at a time.
+
+You can use any bash commands you want (e.g., find, grep, cat, ls, cd) or any custom special tools (including `edit`) by calling Bash.run. 
+Edit all the files you need.
+
+You should carefully observe the behavior and results of the previous action, and avoid triggering repeated errors.
+
+However, the Bash.run does NOT support interactive session commands (e.g. python, vim), so please do not invoke them.
+
+In addition to the terminal, I also provide additional tools. If provided an issue link, you MUST navigate to the issue page using Browser tool to understand the issue, before starting your fix.
+
+Your first action must be to check if the repository exists at the current path. If it exists, navigate to the repository path. If the repository doesn't exist, please download it and then navigate to it.
+All subsequent actions must be performed within this repository path. Do not leave this directory to execute any actions at any time.
+Your terminal session has started, and you can use any bash commands or the special interface to help you. Edit all the files you need.
+
+Note:
+1. Each time you write a code in your response, write with the Editor directly without preparing a repetitive code block beforehand.
+2. Take on ONE task and write ONE code file in each response. DON'T attempt all tasks in one response.
+3. When not specified, you should write files in a folder named "src". If you know the project path, then write in a "src" folder under the project path.
+4. When provided system design or project schedule, you MUST read them first before making a plan, then adhere to them in your implementation, especially in the programming language, package, or framework. You MUST implement all code files prescribed in the system design or project schedule. You can create a plan first with each task corresponding to implementing one code file.
+5. Write at most one file per task, do your best to implement THE ONLY ONE FILE. CAREFULLY CHECK THAT YOU DONT MISS ANY NECESSARY CLASS/FUNCTION IN THIS FILE.
+6. COMPLETE CODE: Your code will be part of the entire project, so please implement complete, reliable, reusable code snippets.
+7. When provided system design, YOU MUST FOLLOW "Data structures and interfaces". DONT CHANGE ANY DESIGN. Do not use public member functions that do not exist in your design.
+8. Write out EVERY CODE DETAIL, DON'T LEAVE TODO.
+9. To modify code in a file, read the entire file, make changes, and update the file with the complete code, ensuring that no line numbers are included in the final write.
+10. When a system design or project schedule is provided, at the end of the plan, add a Validate Task for each file; for example, if there are three files, add three Validate Tasks. For each Validate Task, just call ValidateAndRewriteCode.run.
+11. When planning, initially list the files for coding, then outline all coding and review tasks in your first response.
+12. Note 'Task for {file_name} completed.' — signifies the {file_name} coding task is done.
+13. Avoid re-reviewing or re-coding the same code. When you decide to take a write or review action, include the command 'finish current task' in the same response.
+14. If you plan to read a file, do not include other plans in the same response.
+15. Your terminal session has started, and you can use any bash commands or the special interface to help you. Edit all the files you need.
+16. When editing files, it is easy to accidentally specify a wrong line number or to write code with incorrect indentation. Always check the code after you issue an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it.
+17. If you need to modify code or fix a bug, use the command "Bash.run" and pass "edit 14:14 <<EOF... EOF" as the cmd.
+19. If you open a file and need to get to an area around a specific line that is not in the first 100 lines, say line 583, don't just use the scroll_down command multiple times. Instead, use the goto 583 command. It's much quicker. 
+20. Always make sure to look at the currently open file and the current working directory (which appears right after the currently open file). The currently open file might be in a different directory than the working directory! Note that some commands, such as 'create', open files, so they might change the currently open file.
+21. When editing files, it is easy to accidentally specify a wrong line number or to write code with incorrect indentation. Always check the code after you issue an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it.
+22. After editing, verify the changes to ensure correct line numbers and proper indentation. Adhere to PEP8 standards for Python code.
+23. NOTE ABOUT THE EDIT COMMAND: Indentation really matters! When editing a file, make sure to insert appropriate indentation before each line, and ensure the code adheres to PEP8 standards. If an edit command fails, you can try to edit the file again to correct the indentation, but don't repeat the same command without changes.
+24. YOU CAN ONLY ENTER ONE COMMAND AT A TIME and must wait for feedback, plan your commands carefully.
+25. You cannot use any interactive session commands (e.g. python, vim) in this environment, but you can write scripts and run them. E.g. you can write a python script and then run it with `python <script_name>.py`.
+26. To avoid syntax errors when editing files multiple times, consider opening the file to view the surrounding code related to the error line and make modifications based on this context.
+27. When using the `edit` command, remember it operates within a closed range. This is crucial to prevent accidental deletion of non-targeted code during code replacement.
+28. Ensure to observe the currently open file and the current working directory, which is displayed right after the open file. The open file might be in a different directory than the working directory. Remember, commands like 'create' open files and might alter the current open file.
+29. Use search commands (`search_dir`, `search_file`, `find_file`) and navigation commands (`open`, `goto`) effectively to locate and modify files. Follow these steps and considerations for optimal results:
+    **General Search Guidelines:**
+    - Ensure you are in the repository's root directory before starting your search.
+    - Always double-check the current working directory and the currently open file to avoid confusion.
+    - Avoid repeating failed search commands without modifications to improve efficiency.
+
+    **Strategies for Searching and Navigating Files:**
+
+    1. **If you know the file's location:**
+       - Use the `open` command directly to open the file.
+       - Use `search_file` to find the `search_term` within the currently open file.
+       - Alternatively, use the `goto` command to jump to the specified line.
+       - **Boundary Consideration:** Ensure the file path is correctly specified and accessible.
+
+    2. **If you know the filename but not the exact location:**
+       - Use `find_file` to locate the file in the directory.
+       - Use `open` to open the file once located.
+       - Use `search_file` to find the `search_term` within the file.
+       - Use `goto` to jump to the specified line if needed.
+       - **Boundary Consideration:** Handle cases where the file may exist in multiple directories by verifying the correct path before opening.
+
+    3. **If you know the symbol but not the file's location:**
+       - Use `search_dir_and_preview` to find files containing the symbol within the directory.
+       - Review the search results to identify the relevant file(s).
+       - Use `open` to open the identified file.
+       - Use `search_file` to locate the `search_term` within the open file.
+       - Use `goto` to jump to the specified line.
+       - **Boundary Consideration:** Be thorough in reviewing multiple search results to ensure you open the correct file. Consider using more specific search terms if initial searches return too many results.
+
+    **Search Tips:**
+    - The `<search_term>` for `search_dir_and_preview`, `find_file`, or `search_file` should be an existing class name, function name, or file name.
+    - Enclose terms like `def` or `class` in quotes when searching for functions or classes (e.g., `search_dir_and_preview 'def apow'` or `search_file 'class Pow'`).
+    - Use wildcard characters (`*`, `?`) in search terms to broaden or narrow down your search scope.
+    - If search commands return too many results, refine your search criteria or use more specific terms.
+    - If a search command fails, modify the search criteria and check for typos or incorrect paths, then try again.
+    - Use feedback from observations and bash command output in the trajectory to guide adjustments in your search strategy.
+
+30. Save the code change:
+  - If you need to submit changes to the remote repository, first use the regular git commit command to save the changes locally, then use git push for pushing, and if requested, `git_create_pull` in Available Commands for creating pull request.
+  - If you don't need to submit code changes to the remote repository, use the command Bash.run('submit') to commit the changes locally.
+31. If provided an issue link, you MUST go to the issue page using Browser tool to understand the issue before starting your fix.
+32. When the edit fails, try to enlarge the starting line.
+33. When using the Bash.run tool's edit command, the response must contain a single command only.
+"""
+
+CURRENT_BASH_STATE = """
+# Output Next Step
+The current bash state is:
+(Open file: {open_file})
+(Current directory: {working_dir})
 """


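Notes 17 and 27 in the prompt above describe `edit M:N <<EOF ... EOF` as a replacement over a closed line range. The following is a minimal sketch of that semantics, assuming 1-indexed line numbers with an inclusive end. `build_edit_cmd` and `apply_closed_range_edit` are hypothetical helpers written for illustration; they are not part of MetaGPT's `edit` tool.

```python
def build_edit_cmd(start: int, end: int, new_lines: list[str]) -> str:
    """Build the heredoc-style command string (format copied from the prompt text)."""
    body = "\n".join(new_lines)
    return f"edit {start}:{end} <<EOF\n{body}\nEOF"


def apply_closed_range_edit(
    lines: list[str], start: int, end: int, new_lines: list[str]
) -> list[str]:
    """Replace lines start..end (1-indexed, both ends inclusive) with new_lines.

    Assumption: this mirrors the "closed range" behaviour the prompt describes.
    """
    if not 1 <= start <= end <= len(lines):
        raise ValueError(f"invalid range {start}:{end} for a {len(lines)}-line file")
    # lines[: start - 1] keeps everything before the range,
    # lines[end:] keeps everything after the inclusive end line.
    return lines[: start - 1] + new_lines + lines[end:]


source = ["import os", "x = 1", "y = 2", "print(x + y)"]
edited = apply_closed_range_edit(source, 2, 3, ["x = 10", "y = 20"])
print(edited)
print(build_edit_cmd(14, 14, ["from PIL.Image import Image"]))
```

Because both ends are inclusive, `edit 14:14` replaces exactly one line, and a range that is one line too wide silently deletes a neighbouring line. That is presumably why the prompt asks the agent to re-check the file after every edit.
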
--- a/metagpt/roles/di/engineer2.py
+++ b/metagpt/roles/di/engineer2.py
@@ -1,12 +1,17 @@
 from __future__ import annotations

+import json
+
 from pydantic import Field

 from metagpt.actions.write_code_review import ValidateAndRewriteCode
-from metagpt.prompts.di.engineer2 import ENGINEER2_INSTRUCTION
+from metagpt.logs import logger
+from metagpt.prompts.di.engineer2 import CURRENT_BASH_STATE, ENGINEER2_INSTRUCTION
 from metagpt.roles.di.role_zero import RoleZero
+from metagpt.schema import Message
 from metagpt.strategy.experience_retriever import ENGINEER_EXAMPLE
-from metagpt.tools.libs.terminal import Terminal
+from metagpt.tools.libs.git import git_create_pull
+from metagpt.tools.libs.terminal import Bash


 class Engineer2(RoleZero):
@@ -15,21 +20,71 @@ class Engineer2(RoleZero):
    goal: str = "Take on game, app, and web development."
    instruction: str = ENGINEER2_INSTRUCTION

-    terminal: Terminal = Field(default_factory=Terminal, exclude=True)
+    # terminal: Terminal = Field(default_factory=Terminal, exclude=True)
+    terminal: Bash = Field(default_factory=Bash, exclude=True)

-    tools: list[str] = ["Plan", "Editor:write,read", "RoleZero", "Terminal:run_command", "ValidateAndRewriteCode"]
+    tools: list[str] = [
+        "Plan",
+        "Editor:write,read",
+        "RoleZero",
+        "ValidateAndRewriteCode",
+        "Bash",
+        "Browser:goto,scroll",
+        "git_create_pull",
+    ]
+    # Swe Agent ability
+    run_eval: bool = False
+    output_diff: str = ""
+    max_react_loop: int = 40
+
+    async def _think(self) -> bool:
+        await self._format_instruction()
+        res = await super()._think()
+        return res
+
+    async def _format_instruction(self):
+        """
+        Formats the instruction message for the SWE agent.
+        Runs the "state" command in the terminal, parses its output as JSON,
+        and uses it to format the `_instruction` template.
+        """
+        state_output = await self.terminal.run("state")
+        bash_state = json.loads(state_output)
+        self.cmd_prompt_current_state = CURRENT_BASH_STATE.format(**bash_state).strip()

    def _update_tool_execution(self):
        validate = ValidateAndRewriteCode()
-
        self.tool_execution_map.update(
            {
-                "Terminal.run_command": self.terminal.run_command,
                "ValidateAndRewriteCode.run": validate.run,
                "ValidateAndRewriteCode": validate.run,
+                "Bash.run": self.eval_terminal_run if self.run_eval else self.terminal.run,
+                "RoleZero.ask_human": self.ask_human,
+                "RoleZero.reply_to_human": self.reply_to_human,
+                "git_create_pull": git_create_pull,
            }
        )

+    async def eval_terminal_run(self, cmd):
+        """change command pull/push/commit to end."""
+        if any([cmd_key_word in cmd for cmd_key_word in ["pull", "push", "commit"]]):
+            # The SWEAgent attempts to submit the repository after fixing the bug, thereby reaching the end of the fixing process.
+            # Set self.rc.todo to None to stop the engineer, which then triggers the _save_git_diff function to save the diff.
+            logger.info(f"SWEAgent use cmd:{cmd}")
+            logger.info("Current test case is finished.")
+            # stop the sweagent
+            self._set_state(-1)
+            command_output = "Current test case is finished."
+        else:
+            command_output = await self.terminal.run(cmd)
+        return command_output
+
+    async def _act(self) -> Message:
+        message = await super()._act()
+        if self.run_eval:
+            await self._save_git_diff()
+        return message
+
    def _retrieve_experience(self) -> str:
        return ENGINEER_EXAMPLE

@@ -42,3 +97,27 @@ class Engineer2(RoleZero):
            command_output += "All tasks are finished.\n"
        command_output += await super()._run_special_command(cmd)
        return command_output
+
+    async def _save_git_diff(self):
+        """
+        Saves the git diff once the agent has stopped.
+
+        When `self.rc.todo` is None, runs `submit` and generates a patch using `git diff --cached`.
+        Stores the cleaned patch in `output_diff`. Logs any exceptions.
+
+        This function is specifically added for SWE bench evaluation.
+        """
+        # If todo switches to None, it indicates that this is the final round of reactions, and the Swe-Agent will stop. Use git diff to store any changes made.
+        if not self.rc.todo:
+            print("finish current task *******************************************************")
+            from metagpt.tools.swe_agent_commands.swe_agent_utils import extract_patch
+
+            try:
+                logger.info(await self.terminal.run("submit"))
+                diff_output = await self.terminal.run("git diff --cached")
+                clear_diff = extract_patch(diff_output)
+                logger.info(f"Diff output: \n{clear_diff}")
+                if clear_diff:
+                    self.output_diff = clear_diff
+            except Exception as e:
+                logger.error(f"Error during submission: {e}")
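`_format_instruction` in the diff above assumes that the `state` command prints a JSON object whose keys match the placeholders in `CURRENT_BASH_STATE`. Below is a minimal sketch of that step in isolation. The sample `state` output is made up for illustration, and the fallback for malformed output is an assumption rather than something this commit does (the committed code lets the exception propagate).

```python
import json

# Template copied from metagpt/prompts/di/engineer2.py in this commit.
CURRENT_BASH_STATE = """
# Output Next Step
The current bash state is:
(Open file: {open_file})
(Current directory: {working_dir})
"""


def format_bash_state(state_output: str) -> str:
    """Parse the JSON printed by `state` and fill the prompt template."""
    try:
        bash_state = json.loads(state_output)
    except json.JSONDecodeError:
        # Assumed fallback for illustration only.
        bash_state = {}
    if not isinstance(bash_state, dict):
        bash_state = {}
    values = {
        "open_file": bash_state.get("open_file", "n/a"),
        "working_dir": bash_state.get("working_dir", "n/a"),
    }
    return CURRENT_BASH_STATE.format(**values).strip()


# Hypothetical output of the `state` command.
sample = '{"open_file": "/workspace/MetaGPT/provider/openai_api.py", "working_dir": "/workspace/MetaGPT"}'
print(format_bash_state(sample))
```

Selecting the two expected keys explicitly, instead of unpacking the whole parsed object into `format`, keeps a missing key from raising `KeyError` in the middle of a think step.
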
--- a/metagpt/roles/di/swe_agent.py
+++ b/metagpt/roles/di/swe_agent.py
@@ -38,11 +38,25 @@ class SWEAgent(RoleZero):
    def _update_tool_execution(self):
        self.tool_execution_map.update(
            {
-                "Bash.run": self.terminal.run,
+                "Bash.run": self.eval_terminal_run if self.run_eval else self.terminal.run,
                "git_create_pull": git_create_pull,
            }
        )

+    async def eval_terminal_run(self, cmd):
+        """change command pull/push/commit to end."""
+        if any([cmd_key_word in cmd for cmd_key_word in ["pull", "push", "commit"]]):
+            # Observe that SWEAgent tries to submit the repo after fixing the bug.
+            # Set self.rc.todo to None and use git diff to record the change.
+            logger.info(f"SWEAgent use cmd:{cmd}")
+            logger.info("finish current task")
+            # stop the sweagent
+            self._set_state(-1)
+            command_output = "Current test case is finished."
+        else:
+            command_output = await self.terminal.run(cmd)
+        return command_output
+
    async def _format_instruction(self):
        """
        Formats the instruction message for the SWE agent.
@@ -57,7 +71,7 @@ class SWEAgent(RoleZero):
    async def _act(self) -> Message:
        message = await super()._act()
        if self.run_eval:
-            self._parse_commands_for_eval()
+            await self._parse_commands_for_eval()
        return message

    async def _parse_commands_for_eval(self):
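`eval_terminal_run`, added to both roles in this commit, ends the run whenever the command string contains `pull`, `push`, or `commit` as a substring. The sketch below reproduces that check and contrasts it with a stricter token-based variant. `is_terminating_git` is a hypothetical alternative for comparison only and is not in the commit.

```python
TERMINATING_KEYWORDS = ["pull", "push", "commit"]


def is_terminating(cmd: str) -> bool:
    """Substring check, equivalent to the condition in eval_terminal_run."""
    return any(keyword in cmd for keyword in TERMINATING_KEYWORDS)


def is_terminating_git(cmd: str) -> bool:
    """Hypothetical stricter variant: match only `git <keyword> ...`."""
    tokens = cmd.split()
    return len(tokens) >= 2 and tokens[0] == "git" and tokens[1] in TERMINATING_KEYWORDS


for cmd in ["git push origin test-fix", "ls -la", "search_file 'commit_message'"]:
    print(f"{cmd!r}: substring={is_terminating(cmd)}, token={is_terminating_git(cmd)}")
```

The substring form also fires on unrelated commands such as a search for `commit_message`, which would stop an evaluation run early. The token form avoids that but misses compound commands like `cd repo && git push`, so neither is a drop-in answer; which trade-off suits evaluation runs is a judgment call.
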
--- a/metagpt/strategy/experience_retriever.py
+++ b/metagpt/strategy/experience_retriever.py
@@ -952,6 +952,128 @@ Explanation: to review the code, call ValidateAndRewriteCode.run.
    }
 ]
 ```
+
+## example 5
+I have received a GitHub issue URL.
+I will use browser to review the detailed information of this issue in order to understand the problem.
+```json
+[
+    {
+        "command_name": "Browser.goto",
+        "args": {
+            "url": "https://github.com/geekan/MetaGPT/issues/1275"
+        }
+    }
+]
+```
+## example 6
+I need to locate the `openai_api.py` file, so I will search for it.
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "find_file 'openai_api.py'"   
+        }
+    }
+]
+```
+
+## example 7
+I have located the openai_api.py file. I want to edit this file, so I will open it first.
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "open '/workspace/MetaGPT/provider/openai_api.py'"   
+        }
+    }
+]
+```
+
+## example 8
+I've found the bug and will start fixing it. I'll pay close attention to the indentation.
+Since I only need to modify a few lines in this file, I will use the Bash.run tool with the edit command. 
+Note that the edit command must be executed in a single response, so this step will only involve using the edit command.
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "edit 93:95 <<EOF\\n        usage = None\\n        collected_messages = []\\n        async for chunk in response:\\n            if chunk.usage is not None:\\n                usage = CompletionUsage(**chunk.usage)\\n            chunk_message = chunk.choices[0].delta.content or '' if chunk.choices else ''  # extract the message\\n            finish_reason = (\\n                chunk.choices[0].finish_reason if chunk.choices and hasattr(chunk.choices[0], 'finish_reason') else None\\n            )\\n            log_llm_stream(chunk_message)\\nEOF"
+        }
+    }
+]
+```
+
+## example 9
+Due to a syntax error related to an undefined name 'Image', I need to address this issue even though it is not directly related to our work.
+Let's try importing the package to fix it.
+
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "edit 14:14 <<EOF\\nfrom PIL.Image import Image\\nEOF"
+        }
+    }
+]
+```
+
+## example 10
+### Save the Changes (Required): After all changes have been made, save them to the repository.
+I must choose one of the following two methods.
+
+#### Just save the changes locally; it only needs one action.
+Thought: The bug has been fixed. Let's submit the changes.
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "submit"
+        }
+    }
+]
+```
+#### Save the changes and commit them to the remote repository.
+
+##### Push the changes from the local repository to the remote repository.
+Thought: All changes have been saved, let's push the code to the remote repository.
+```json
+[
+    {
+        "command_name": "Bash.run",
+        "args": {
+            "cmd": "git push origin test-fix"
+        }
+    }
+]
+```
+
+
+##### Create a pull request (Optional): Merge the changes from the new branch into the master branch.
+Thought: Now that the changes have been pushed to the remote repository, due to the user's requirement, let's create a pull request to merge the changes into the master branch.
+```json
+[
+    {
+        "command_name": "git_create_pull",
+        "args": {
+            "base": "master",
+            "head": "test-fix",
+            "base_repo_name": "garylin2099/MetaGPT",
+            "head_repo_name": "seeker-jie/MetaGPT",
+            "app_name": "github",
+            "title": "Fix Issue #1275: produced TypeError: openai.types.completion_usage.CompletionUsage() argument after ** must be a mapping, not NoneType",
+            "body": "This pull request addresses issue #1275 by ensuring that chunk.usage is not None before passing it to CompletionUsage."
+        }
+    }
+]
+```
+
+
 """