Merge branch 'geekan:main' into feat_st_game

This commit is contained in:
better629 2024-03-26 09:51:02 +08:00 committed by GitHub
commit 64350d2c6d
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
303 changed files with 9155 additions and 3587 deletions

View file

@ -1,22 +0,0 @@
# -*- encoding: utf-8 -*-
"""
@Date : 2024/01/24 15:11:27
@Author : orange-crow
@File : crawl_webpage.py
"""
from metagpt.roles.ci.code_interpreter import CodeInterpreter
async def main():
prompt = """Get data from `paperlist` table in https://papercopilot.com/statistics/iclr-statistics/iclr-2024-statistics/,
and save it to a csv file. paper title must include `multiagent` or `large language model`. *notice: print key variables*"""
ci = CodeInterpreter(goal=prompt, use_tools=True)
await ci.run(prompt)
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -0,0 +1 @@
Bob likes traveling.

View file

@ -0,0 +1,109 @@
Productivity
I think I am at least somewhat more productive than average, and people sometimes ask me for productivity tips. So I decided to just write them all down in one place.
Compound growth gets discussed as a financial concept, but it works in careers as well, and it is magic. A small productivity gain, compounded over 50 years, is worth a lot. So its worth figuring out how to optimize productivity. If you get 10% more done and 1% better every day compared to someone else, the compounded difference is massive.
What you work on
Famous writers have some essential qualities, creativity and discipline
It doesnt matter how fast you move if its in a worthless direction. Picking the right thing to work on is the most important element of productivity and usually almost ignored. So think about it more! Independent thought is hard but its something you can get better at with practice.
The most impressive people I know have strong beliefs about the world, which is rare in the general population. If you find yourself always agreeing with whomever you last spoke with, thats bad. You will of course be wrong sometimes, but develop the confidence to stick with your convictions. It will let you be courageous when youre right about something important that most people dont see.
I make sure to leave enough time in my schedule to think about what to work on. The best ways for me to do this are reading books, hanging out with interesting people, and spending time in nature.
Ive learned that I cant be very productive working on things I dont care about or dont like. So I just try not to put myself in a position where I have to do them (by delegating, avoiding, or something else). Stuff that you dont like is a painful drag on morale and momentum.
By the way, here is an important lesson about delegation: remember that everyone else is also most productive when theyre doing what they like, and do what youd want other people to do for you—try to figure out who likes (and is good at) doing what, and delegate that way.
If you find yourself not liking what youre doing for a long period of time, seriously consider a major job change. Short-term burnout happens, but if it isnt resolved with some time off, maybe its time to do something youre more interested in.
Ive been very fortunate to find work I like so much Id do it for free, which makes it easy to be really productive.
Its important to learn that you can learn anything you want, and that you can get better quickly. This feels like an unlikely miracle the first few times it happens, but eventually you learn to trust that you can do it.
Doing great work usually requires colleagues of some sort. Try to be around smart, productive, happy, and positive people that dont belittle your ambitions. I love being around people who push me and inspire me to be better. To the degree you able to, avoid the opposite kind of people—the cost of letting them take up your mental cycles is horrific.
You have to both pick the right problem and do the work. There arent many shortcuts. If youre going to do something really important, you are very likely going to work both smart and hard. The biggest prizes are heavily competed for. This isnt true in every field (there are great mathematicians who never spend that many hours a week working) but it is in most.
Prioritization
Writers have to work hard to be successful
My system has three key pillars: “Make sure to get the important shit done”, “Dont waste time on stupid shit”, and “make a lot of lists”.
I highly recommend using lists. I make lists of what I want to accomplish each year, each month, and each day. Lists are very focusing, and they help me with multitasking because I dont have to keep as much in my head. If Im not in the mood for some particular task, I can always find something else Im excited to do.
I prefer lists written down on paper. Its easy to add and remove tasks. I can access them during meetings without feeling rude. I re-transcribe lists frequently, which forces me to think about everything on the list and gives me an opportunity to add and remove items.
I dont bother with categorization or trying to size tasks or anything like that (the most I do is put a star next to really important items).
I try to prioritize in a way that generates momentum. The more I get done, the better I feel, and then the more I get done. I like to start and end each day with something I can really make progress on.
I am relentless about getting my most important projects done—Ive found that if I really want something to happen and I push hard enough, it usually happens.
I try to be ruthless about saying no to stuff, and doing non-critical things in the quickest way possible. I probably take this too far—for example, I am almost sure I am terse to the point of rudeness when replying to emails.
Passion and adaptability are key qualities to writers
I generally try to avoid meetings and conferences as I find the time cost to be huge—I get the most value out of time in my office. However, it is critical that you keep enough space in your schedule to allow for chance encounters and exposure to new people and ideas. Having an open network is valuable; though probably 90% of the random meetings I take are a waste of time, the other 10% really make up for it.
I find most meetings are best scheduled for 15-20 minutes, or 2 hours. The default of 1 hour is usually wrong, and leads to a lot of wasted time.
I have different times of day I try to use for different kinds of work. The first few hours of the morning are definitely my most productive time of the day, so I dont let anyone schedule anything then. I try to do meetings in the afternoon. I take a break, or switch tasks, whenever I feel my attention starting to fade.
I dont think most people value their time enough—I am surprised by the number of people I know who make $100 an hour and yet will spend a couple of hours doing something they dont want to do to save $20.
Also, dont fall into the trap of productivity porn—chasing productivity for its own sake isnt helpful. Many people spend too much time thinking about how to perfectly optimize their system, and not nearly enough asking if theyre working on the right problems. It doesnt matter what system you use or if you squeeze out every second if youre working on the wrong thing.
The right goal is to allocate your year optimally, not your day.
Physical factors
Very likely what is optimal for me wont be optimal for you. Youll have to experiment to find out what works best for your body. Its definitely worth doing—it helps in all aspects of life, and youll feel a lot better and happier overall.
It probably took a little bit of my time every week for a few years to arrive at what works best for me, but my sense is if I do a good job at all the below Im at least 1.5x more productive than if not.
Sleep seems to be the most important physical factor in productivity for me. Some sort of sleep tracker to figure out how to sleep best is helpful. Ive found the only thing Im consistent with are in the set-it-and-forget-it category, and I really like the Emfit QS+Active.
I like a cold, dark, quiet room, and a great mattress (I resisted spending a bunch of money on a great mattress for years, which was stupid—it makes a huge difference to my sleep quality. I love this one). Not eating a lot in the few hours before sleep helps. Not drinking alcohol helps a lot, though Im not willing to do that all the time.
I use a Chili Pad to be cold while I sleep if I cant get the room cold enough, which is great but loud (I set it up to have the cooler unit outside my room).
When traveling, I use an eye mask and ear plugs.
Writers usually have empathy to write good books.
This is likely to be controversial, but I take a low dose of sleeping pills (like a third of a normal dose) or a very low dose of cannabis whenever I cant sleep. I am a bad sleeper in general, and a particularly bad sleeper when I travel. It likely has tradeoffs, but so does not sleeping well. If you can already sleep well, I wouldnt recommend this.
I use a full spectrum LED light most mornings for about 10-15 minutes while I catch up on email. Its great—if you try nothing else in here, this is the thing Id try. Its a ridiculous gain for me. I like this one, and its easy to travel with.
Exercise is probably the second most important physical factor. I tried a number of different exercise programs for a few months each and the one that seemed best was lifting heavy weights 3x a week for an hour, and high intensity interval training occasionally. In addition to productivity gains, this is also the exercise program that makes me feel the best overall.
The third area is nutrition. I very rarely eat breakfast, so I get about 15 hours of fasting most days (except an espresso when I wake up). I know this is contrary to most advice, and I suspect its not optimal for most people, but it definitely works well for me.
Eating lots of sugar is the thing that makes me feel the worst and that I try hardest to avoid. I also try to avoid foods that aggravate my digestion or spike up inflammation (for example, very spicy foods). I dont have much willpower when it comes to sweet things, so I mostly just try to keep junk food out of the house.
I have one big shot of espresso immediately when I wake up and one after lunch. I assume this is about 200mg total of caffeine per day. I tried a few other configurations; this was the one that worked by far the best. I otherwise aggressively avoid stimulants, but I will have more coffee if Im super tired and really need to get something done.
If a writer want to be super, then should include innovative thinking.
Im vegetarian and have been since I was a kid, and I supplement methyl B-12, Omega-3, Iron, and Vitamin D-3. I got to this list with a year or so of quarterly blood tests; its worked for me ever since (I re-test maybe every year and a half or so). There are many doctors who will happily work with you on a super comprehensive blood test (and services like WellnessFX). I also go out of my way to drink a lot of protein shakes, which I hate and I wouldnt do if I werent vegetarian.
Other stuff
Heres what I like in a workspace: natural light, quiet, knowing that I wont be interrupted if I dont want to be, long blocks of time, and being comfortable and relaxed (Ive got a beautiful desk with a couple of 4k monitors on it in my office, but I spend almost all my time on my couch with my laptop).
I wrote custom software for the annoying things I have to do frequently, which is great. I also made an effort to learn to type really fast and the keyboard shortcuts that help with my workflow.
Like most people, I sometimes go through periods of a week or two where I just have no motivation to do anything (I suspect it may have something to do with nutrition). This sucks and always seems to happen at inconvenient times. I have not figured out what to do about it besides wait for the fog to lift, and to trust that eventually it always does. And I generally try to avoid people and situations that put me in bad moods, which is good advice whether you care about productivity or not.
In general, I think its good to overcommit a little bit. I find that I generally get done what I take on, and if I have a little bit too much to do it makes me more efficient at everything, which is a way to train to avoid distractions (a great habit to build!). However, overcommitting a lot is disastrous.
Dont neglect your family and friends for the sake of productivity—thats a very stupid tradeoff (and very likely a net productivity loss, because youll be less happy). Dont neglect doing things you love or that clear your head either.
Finally, to repeat one more time: productivity in the wrong direction isnt worth anything at all. Think more about what to work on.
Open-Mindedness and curiosity are essential to writers

View file

@ -5,6 +5,7 @@ Author: garylin2099
@Modified By: mashenquan, 2023-11-1. In accordance with Chapter 2.1.3 of RFC 116, modify the data type of the `send_to`
value of the `Message` object; modify the argument type of `get_by_actions`.
"""
import asyncio
import platform
from typing import Any
@ -105,4 +106,4 @@ def main(idea: str, investment: float = 3.0, n_round: int = 10):
if __name__ == "__main__":
fire.Fire(main)
fire.Fire(main) # run as python debate.py --idea="TOPIC" --investment=3.0 --n_round=5

View file

@ -8,14 +8,17 @@
import asyncio
from metagpt.actions import Action
from metagpt.config2 import Config
from metagpt.environment import Environment
from metagpt.roles import Role
from metagpt.team import Team
action1 = Action(name="AlexSay", instruction="Express your opinion with emotion and don't repeat it")
action1.llm.model = "gpt-4-1106-preview"
action2 = Action(name="BobSay", instruction="Express your opinion with emotion and don't repeat it")
action2.llm.model = "gpt-3.5-turbo-1106"
gpt35 = Config.default()
gpt35.llm.model = "gpt-3.5-turbo-1106"
gpt4 = Config.default()
gpt4.llm.model = "gpt-4-1106-preview"
action1 = Action(config=gpt4, name="AlexSay", instruction="Express your opinion with emotion and don't repeat it")
action2 = Action(config=gpt35, name="BobSay", instruction="Express your opinion with emotion and don't repeat it")
alex = Role(name="Alex", profile="Democratic candidate", goal="Win the election", actions=[action1], watch=[action2])
bob = Role(name="Bob", profile="Republican candidate", goal="Win the election", actions=[action2], watch=[action1])
env = Environment(desc="US election live broadcast")

20
examples/di/README.md Normal file
View file

@ -0,0 +1,20 @@
# Data Interpreter (DI)
## What is Data Interpreter
Data Interpreter is an agent who solves data-related problems through codes. It understands user requirements, makes plans, writes codes for execution, and uses tools if necessary. These capabilities enable it to tackle a wide range of scenarios, please check out the examples below. For overall design and technical details, please see our [paper](https://arxiv.org/abs/2402.18679).
## Example List
- Data visualization
- Machine learning modeling
- Image background removal
- Solve math problems
- Receipt OCR
- Tool usage: web page imitation
- Tool usage: web crawling
- Tool usage: text2image
- Tool usage: email summarization and response\
- More on the way!
Please see the [docs](https://docs.deepwisdom.ai/main/en/guide/use_cases/agent/interpreter/intro.html) for more explanation.
We are continuously releasing codes, stay tuned!

View file

@ -0,0 +1,21 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main():
template = "https://arxiv.org/list/{tag}/pastweek?skip=0&show=300"
tags = ["cs.ai", "cs.cl", "cs.lg", "cs.se"]
urls = [template.format(tag=tag) for tag in tags]
prompt = f"""This is a collection of arxiv urls: '{urls}' .
Record each article, remove duplicates by title (they may have multiple tags), filter out papers related to
large language model / agent / llm, print top 100 and visualize the word count of the titles"""
di = DataInterpreter(react_mode="react", tools=["scrape_web_playwright"])
await di.run(prompt)
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -0,0 +1,40 @@
# -*- encoding: utf-8 -*-
"""
@Date : 2024/01/24 15:11:27
@Author : orange-crow
@File : crawl_webpage.py
"""
from metagpt.roles.di.data_interpreter import DataInterpreter
PAPER_LIST_REQ = """"
Get data from `paperlist` table in https://papercopilot.com/statistics/iclr-statistics/iclr-2024-statistics/,
and save it to a csv file. paper title must include `multiagent` or `large language model`. *notice: print key variables*
"""
ECOMMERCE_REQ = """
Get products data from website https://scrapeme.live/shop/ and save it as a csv file.
**Notice: Firstly parse the web page encoding and the text HTML structure;
The first page product name, price, product URL, and image URL must be saved in the csv;**
"""
NEWS_36KR_REQ = """从36kr创投平台https://pitchhub.36kr.com/financing-flash 所有初创企业融资的信息, **注意: 这是一个中文网站**;
下面是一个大致流程, 你会根据每一步的运行结果对当前计划中的任务做出适当调整:
1. 爬取并本地保存html结构;
2. 直接打印第7个*`快讯`*关键词后2000个字符的html内容, 作为*快讯的html内容示例*;
3. 反思*快讯的html内容示例*中的规律, 设计正则匹配表达式来获取*`快讯`*的标题链接时间;
4. 筛选最近3天的初创企业融资*`快讯`*, 以list[dict]形式打印前5个
5. 将全部结果存在本地csv中
"""
async def main():
di = DataInterpreter(tools=["scrape_web_playwright"])
await di.run(ECOMMERCE_REQ)
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -0,0 +1,36 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Time : 2024/3/22 10:54
@Author : alexanderwu
@File : custom_tool.py
"""
from metagpt.roles.di.data_interpreter import DataInterpreter
from metagpt.tools.tool_registry import register_tool
@register_tool()
def magic_function(arg1: str, arg2: int) -> dict:
"""
The magic function that does something.
Args:
arg1 (str): ...
arg2 (int): ...
Returns:
dict: ...
"""
return {"arg1": arg1 * 3, "arg2": arg2 * 5}
async def main():
di = DataInterpreter(tools=["magic_function"])
await di.run("Just call the magic function with arg1 'A' and arg2 2. Tell me the result.")
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -0,0 +1,17 @@
import asyncio
from metagpt.logs import logger
from metagpt.roles.di.data_interpreter import DataInterpreter
from metagpt.utils.recovery_util import save_history
async def main(requirement: str = ""):
di = DataInterpreter()
rsp = await di.run(requirement)
logger.info(rsp)
save_history(role=di)
if __name__ == "__main__":
requirement = "Run data analysis on sklearn Iris dataset, include a plot"
asyncio.run(main(requirement))

View file

@ -0,0 +1,33 @@
# -*- encoding: utf-8 -*-
"""
@Date : 2024/02/07
@Author : Tuo Zhou
@File : email_summary.py
"""
import os
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main():
email_account = "your_email_account"
# your password will stay only on your device and not go to LLM api
os.environ["email_password"] = "your_email_password"
### Prompt for automatic email reply, uncomment to try this too ###
# prompt = f"""I will give you your Outlook email account ({email_account}) and password (email_password item in the environment variable). You need to find the latest email in my inbox with the sender's suffix @gmail.com and reply "Thank you! I have received your email~"""""
### Prompt for automatic email summary ###
prompt = f"""I will give you your Outlook email account ({email_account}) and password (email_password item in the environment variable).
Firstly, Please help me fetch the latest 5 senders and full letter contents.
Then, summarize each of the 5 emails into one sentence (you can do this by yourself, no need to import other models to do this) and output them in a markdown format."""
di = DataInterpreter()
await di.run(prompt)
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -5,19 +5,18 @@
@Author : mannaandpoem
@File : imitate_webpage.py
"""
from metagpt.roles.ci.code_interpreter import CodeInterpreter
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main():
web_url = "https://pytorch.org/"
prompt = f"""This is a URL of webpage: '{web_url}' .
Firstly, utilize Selenium and WebDriver for rendering.
Secondly, convert image to a webpage including HTML, CSS and JS in one go.
Finally, save webpage in a text file.
Secondly, convert image to a webpage including HTML, CSS and JS in one go.
Note: All required dependencies and environments have been fully installed and configured."""
ci = CodeInterpreter(goal=prompt, use_tools=True)
di = DataInterpreter(tools=["GPTvGenerator"])
await ci.run(prompt)
await di.run(prompt)
if __name__ == "__main__":

View file

@ -0,0 +1,23 @@
import fire
from metagpt.roles.di.data_interpreter import DataInterpreter
WINE_REQ = "Run data analysis on sklearn Wine recognition dataset, include a plot, and train a model to predict wine class (20% as validation), and show validation accuracy."
DATA_DIR = "path/to/your/data"
# sales_forecast data from https://www.kaggle.com/datasets/aslanahmedov/walmart-sales-forecast/data
SALES_FORECAST_REQ = f"""Train a model to predict sales for each department in every store (split the last 40 weeks records as validation dataset, the others is train dataset), include plot total sales trends, print metric and plot scatter plots of
groud truth and predictions on validation data. Dataset is {DATA_DIR}/train.csv, the metric is weighted mean absolute error (WMAE) for test data. Notice: *print* key variables to get more information for next task step.
"""
REQUIREMENTS = {"wine": WINE_REQ, "sales_forecast": SALES_FORECAST_REQ}
async def main(use_case: str = "wine"):
mi = DataInterpreter()
requirement = REQUIREMENTS[use_case]
await mi.run(requirement)
if __name__ == "__main__":
fire.Fire(main)

View file

@ -0,0 +1,16 @@
import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main(requirement: str):
role = DataInterpreter(use_reflection=True, tools=["<all>"])
await role.run(requirement)
if __name__ == "__main__":
data_path = "your/path/to/titanic"
train_path = f"{data_path}/split_train.csv"
eval_path = f"{data_path}/split_eval.csv"
requirement = f"This is a titanic passenger survival dataset, your goal is to predict passenger survival outcome. The target column is Survived. Perform data analysis, data preprocessing, feature engineering, and modeling to predict the target. Report accuracy on the eval data. Train data path: '{train_path}', eval data path: '{eval_path}'."
asyncio.run(main(requirement))

View file

@ -0,0 +1,21 @@
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main():
# Notice: pip install metagpt[ocr] before using this example
image_path = "image.jpg"
language = "English"
requirement = f"""This is a {language} receipt image.
Your goal is to perform OCR on images using PaddleOCR, output text content from the OCR results and discard
coordinates and confidence levels, then recognize the total amount from ocr text content, and finally save as table.
Image path: {image_path}.
NOTE: The environments for Paddle and PaddleOCR are all ready and has been fully installed."""
di = DataInterpreter()
await di.run(requirement)
if __name__ == "__main__":
import asyncio
asyncio.run(main())

View file

@ -0,0 +1,15 @@
import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main(requirement: str = ""):
di = DataInterpreter()
await di.run(requirement)
if __name__ == "__main__":
image_path = "/your/path/to/the/image.jpeg"
save_path = "/your/intended/save/path/for/image_rm_bg.png"
requirement = f"This is a image, you need to use python toolkit rembg to remove the background of the image and save the result. image path:{image_path}; save path:{save_path}."
asyncio.run(main(requirement))

View file

@ -4,12 +4,12 @@
# @Desc :
import asyncio
from metagpt.roles.ci.code_interpreter import CodeInterpreter
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main(requirement: str = ""):
code_interpreter = CodeInterpreter(use_tools=True, goal=requirement)
await code_interpreter.run(requirement)
di = DataInterpreter(tools=["SDEngine"])
await di.run(requirement)
if __name__ == "__main__":

View file

@ -0,0 +1,14 @@
import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter
async def main(requirement: str = ""):
di = DataInterpreter()
await di.run(requirement)
if __name__ == "__main__":
requirement = "Solve this math problem: The greatest common divisor of positive integers m and n is 6. The least common multiple of m and n is 126. What is the least possible value of m + n?"
# answer: 60 (m = 18, n = 42)
asyncio.run(main(requirement))

View file

@ -6,16 +6,25 @@
@File : llm_hello_world.py
"""
import asyncio
from pathlib import Path
from metagpt.llm import LLM
from metagpt.logs import logger
from metagpt.utils.common import encode_image
async def main():
llm = LLM()
logger.info(await llm.aask("hello world"))
# llm type check
question = "what's your name"
logger.info(f"{question}: ")
logger.info(await llm.aask(question))
logger.info("\n\n")
logger.info(
await llm.aask(
"who are you", system_msgs=["act as a robot, just answer 'I'am robot' if the question is 'who are you'"]
)
)
logger.info(await llm.aask_batch(["hi", "write python hello world."]))
hello_msg = [{"role": "user", "content": "count from 1 to 10. split by newline."}]
@ -29,12 +38,6 @@ async def main():
if hasattr(llm, "completion"):
logger.info(llm.completion(hello_msg))
# check if the configured llm supports llm-vision capacity. If not, it will throw a error
invoice_path = Path(__file__).parent.joinpath("..", "tests", "data", "invoices", "invoice-2.png")
img_base64 = encode_image(invoice_path)
res = await llm.aask(msg="if this is a invoice, just return True else return False", images=[img_base64])
assert "true" in res.lower()
if __name__ == "__main__":
asyncio.run(main())

23
examples/llm_vision.py Normal file
View file

@ -0,0 +1,23 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Desc : example to run the ability of LLM vision
import asyncio
from pathlib import Path
from metagpt.llm import LLM
from metagpt.utils.common import encode_image
async def main():
llm = LLM()
# check if the configured llm supports llm-vision capacity. If not, it will throw a error
invoice_path = Path(__file__).parent.joinpath("..", "tests", "data", "invoices", "invoice-2.png")
img_base64 = encode_image(invoice_path)
res = await llm.aask(msg="if this is a invoice, just return True else return False", images=[img_base64])
assert "true" in res.lower()
if __name__ == "__main__":
asyncio.run(main())

211
examples/rag_pipeline.py Normal file
View file

@ -0,0 +1,211 @@
"""RAG pipeline"""
import asyncio
from pydantic import BaseModel
from metagpt.const import DATA_PATH, EXAMPLE_DATA_PATH
from metagpt.logs import logger
from metagpt.rag.engines import SimpleEngine
from metagpt.rag.schema import (
BM25RetrieverConfig,
ChromaIndexConfig,
ChromaRetrieverConfig,
FAISSRetrieverConfig,
LLMRankerConfig,
)
DOC_PATH = EXAMPLE_DATA_PATH / "rag/writer.txt"
QUESTION = "What are key qualities to be a good writer?"
TRAVEL_DOC_PATH = EXAMPLE_DATA_PATH / "rag/travel.txt"
TRAVEL_QUESTION = "What does Bob like?"
LLM_TIP = "If you not sure, just answer I don't know."
class Player(BaseModel):
"""To demonstrate rag add objs."""
name: str = ""
goal: str = "Win The 100-meter Sprint."
tool: str = "Red Bull Energy Drink."
def rag_key(self) -> str:
"""For search"""
return self.goal
class RAGExample:
"""Show how to use RAG."""
def __init__(self):
self.engine = SimpleEngine.from_docs(
input_files=[DOC_PATH],
retriever_configs=[FAISSRetrieverConfig(), BM25RetrieverConfig()],
ranker_configs=[LLMRankerConfig()],
)
async def run_pipeline(self, question=QUESTION, print_title=True):
"""This example run rag pipeline, use faiss&bm25 retriever and llm ranker, will print something like:
Retrieve Result:
0. Productivi..., 10.0
1. I wrote cu..., 7.0
2. I highly r..., 5.0
Query Result:
Passion, adaptability, open-mindedness, creativity, discipline, and empathy are key qualities to be a good writer.
"""
if print_title:
self._print_title("Run Pipeline")
nodes = await self.engine.aretrieve(question)
self._print_retrieve_result(nodes)
answer = await self.engine.aquery(question)
self._print_query_result(answer)
async def add_docs(self):
"""This example show how to add docs.
Before add docs llm anwser I don't know.
After add docs llm give the correct answer, will print something like:
[Before add docs]
Retrieve Result:
Query Result:
Empty Response
[After add docs]
Retrieve Result:
0. Bob like..., 10.0
Query Result:
Bob likes traveling.
"""
self._print_title("Add Docs")
travel_question = f"{TRAVEL_QUESTION}{LLM_TIP}"
travel_filepath = TRAVEL_DOC_PATH
logger.info("[Before add docs]")
await self.run_pipeline(question=travel_question, print_title=False)
logger.info("[After add docs]")
self.engine.add_docs([travel_filepath])
await self.run_pipeline(question=travel_question, print_title=False)
async def add_objects(self, print_title=True):
"""This example show how to add objects.
Before add docs, engine retrieve nothing.
After add objects, engine give the correct answer, will print something like:
[Before add objs]
Retrieve Result:
[After add objs]
Retrieve Result:
0. 100m Sprin..., 10.0
[Object Detail]
{'name': 'Mike', 'goal': 'Win The 100-meter Sprint', 'tool': 'Red Bull Energy Drink'}
"""
if print_title:
self._print_title("Add Objects")
player = Player(name="Mike")
question = f"{player.rag_key()}"
logger.info("[Before add objs]")
await self._retrieve_and_print(question)
logger.info("[After add objs]")
self.engine.add_objs([player])
try:
nodes = await self._retrieve_and_print(question)
logger.info("[Object Detail]")
player: Player = nodes[0].metadata["obj"]
logger.info(player.name)
except Exception as e:
logger.error(f"nodes is empty, llm don't answer correctly, exception: {e}")
async def init_objects(self):
"""This example show how to from objs, will print something like:
Same as add_objects.
"""
self._print_title("Init Objects")
pre_engine = self.engine
self.engine = SimpleEngine.from_objs(retriever_configs=[FAISSRetrieverConfig()])
await self.add_objects(print_title=False)
self.engine = pre_engine
async def init_and_query_chromadb(self):
"""This example show how to use chromadb. how to save and load index. will print something like:
Query Result:
Bob likes traveling.
"""
self._print_title("Init And Query ChromaDB")
# save index
output_dir = DATA_PATH / "rag"
SimpleEngine.from_docs(
input_files=[TRAVEL_DOC_PATH],
retriever_configs=[ChromaRetrieverConfig(persist_path=output_dir)],
)
# load index
engine = SimpleEngine.from_index(
index_config=ChromaIndexConfig(persist_path=output_dir),
)
# query
answer = engine.query(TRAVEL_QUESTION)
self._print_query_result(answer)
@staticmethod
def _print_title(title):
logger.info(f"{'#'*30} {title} {'#'*30}")
@staticmethod
def _print_retrieve_result(result):
"""Print retrieve result."""
logger.info("Retrieve Result:")
for i, node in enumerate(result):
logger.info(f"{i}. {node.text[:10]}..., {node.score}")
logger.info("")
@staticmethod
def _print_query_result(result):
"""Print query result."""
logger.info("Query Result:")
logger.info(f"{result}\n")
async def _retrieve_and_print(self, question):
nodes = await self.engine.aretrieve(question)
self._print_retrieve_result(nodes)
return nodes
async def main():
"""RAG pipeline"""
e = RAGExample()
await e.run_pipeline()
await e.add_docs()
await e.add_objects()
await e.init_objects()
await e.init_and_query_chromadb()
if __name__ == "__main__":
asyncio.run(main())

21
examples/rag_search.py Normal file
View file

@ -0,0 +1,21 @@
"""Agent with RAG search."""
import asyncio
from examples.rag_pipeline import DOC_PATH, QUESTION
from metagpt.logs import logger
from metagpt.rag.engines import SimpleEngine
from metagpt.roles import Sales
async def search():
"""Agent with RAG search."""
store = SimpleEngine.from_docs(input_files=[DOC_PATH])
role = Sales(profile="Sales", store=store)
result = await role.run(QUESTION)
logger.info(result)
if __name__ == "__main__":
asyncio.run(search())

View file

@ -0,0 +1,72 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import asyncio
import shutil
from pathlib import Path
import typer
from metagpt.actions.rebuild_class_view import RebuildClassView
from metagpt.actions.rebuild_sequence_view import RebuildSequenceView
from metagpt.context import Context
from metagpt.llm import LLM
from metagpt.logs import logger
from metagpt.utils.git_repository import GitRepository
from metagpt.utils.project_repo import ProjectRepo
app = typer.Typer(add_completion=False, pretty_exceptions_show_locals=False)
@app.command("", help="Python project reverse engineering.")
def startup(
project_root: str = typer.Argument(
default="",
help="Specify the root directory of the existing project for reverse engineering.",
),
output_dir: str = typer.Option(default="", help="Specify the output directory path for reverse engineering."),
):
package_root = Path(project_root)
if not package_root.exists():
raise FileNotFoundError(f"{project_root} not exists")
if not _is_python_package_root(package_root):
raise FileNotFoundError(f'There are no "*.py" files under "{project_root}".')
init_file = package_root / "__init__.py" # used by pyreverse
init_file_exists = init_file.exists()
if not init_file_exists:
init_file.touch()
if not output_dir:
output_dir = package_root / "../reverse_engineering_output"
logger.info(f"output dir:{output_dir}")
try:
asyncio.run(reverse_engineering(package_root, Path(output_dir)))
finally:
if not init_file_exists:
init_file.unlink(missing_ok=True)
tmp_dir = package_root / "__dot__"
if tmp_dir.exists():
shutil.rmtree(tmp_dir, ignore_errors=True)
def _is_python_package_root(package_root: Path) -> bool:
for file_path in package_root.iterdir():
if file_path.is_file():
if file_path.suffix == ".py":
return True
return False
async def reverse_engineering(package_root: Path, output_dir: Path):
ctx = Context()
ctx.git_repo = GitRepository(output_dir)
ctx.repo = ProjectRepo(ctx.git_repo)
action = RebuildClassView(name="ReverseEngineering", i_context=str(package_root), llm=LLM(), context=ctx)
await action.run()
action = RebuildSequenceView(name="ReverseEngineering", llm=LLM(), context=ctx)
await action.run()
if __name__ == "__main__":
app()

View file

@ -1,33 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@File : search_kb.py
@Modified By: mashenquan, 2023-12-22. Delete useless codes.
"""
import asyncio
from langchain.embeddings import OpenAIEmbeddings
from metagpt.config2 import config
from metagpt.const import DATA_PATH, EXAMPLE_PATH
from metagpt.document_store import FaissStore
from metagpt.logs import logger
from metagpt.roles import Sales
def get_store():
llm = config.get_openai_llm()
embedding = OpenAIEmbeddings(openai_api_key=llm.api_key, openai_api_base=llm.base_url)
return FaissStore(DATA_PATH / "example.json", embedding=embedding)
async def search():
store = FaissStore(EXAMPLE_PATH / "example.json")
role = Sales(profile="Sales", store=store)
query = "Which facial cleanser is good for oily skin?"
result = await role.run(query)
logger.info(result)
if __name__ == "__main__":
asyncio.run(search())

View file

@ -4,21 +4,17 @@
"""
import asyncio
from metagpt.config2 import Config
from metagpt.roles import Searcher
from metagpt.tools.search_engine import SearchEngine, SearchEngineType
from metagpt.tools.search_engine import SearchEngine
async def main():
question = "What are the most interesting human facts?"
kwargs = {"api_key": "", "cse_id": "", "proxy": None}
# Serper API
# await Searcher(search_engine=SearchEngine(engine=SearchEngineType.SERPER_GOOGLE, **kwargs)).run(question)
# SerpAPI
# await Searcher(search_engine=SearchEngine(engine=SearchEngineType.SERPAPI_GOOGLE, **kwargs)).run(question)
# Google API
# await Searcher(search_engine=SearchEngine(engine=SearchEngineType.DIRECT_GOOGLE, **kwargs)).run(question)
# DDG API
await Searcher(search_engine=SearchEngine(engine=SearchEngineType.DUCK_DUCK_GO, **kwargs)).run(question)
search = Config.default().search
kwargs = search.model_dump()
await Searcher(search_engine=SearchEngine(engine=search.api_type, **kwargs)).run(question)
if __name__ == "__main__":

View file

@ -14,6 +14,22 @@ from metagpt.actions.action_node import ActionNode
from metagpt.llm import LLM
class Chapter(BaseModel):
name: str = Field(default="Chapter 1", description="The name of the chapter.")
content: str = Field(default="...", description="The content of the chapter. No more than 1000 words.")
class Chapters(BaseModel):
chapters: List[Chapter] = Field(
default=[
{"name": "Chapter 1", "content": "..."},
{"name": "Chapter 2", "content": "..."},
{"name": "Chapter 3", "content": "..."},
],
description="The chapters of the novel.",
)
class Novel(BaseModel):
name: str = Field(default="The Lord of the Rings", description="The name of the novel.")
user_group: str = Field(default="...", description="The user group of the novel.")
@ -28,22 +44,17 @@ class Novel(BaseModel):
ending: str = Field(default="...", description="The ending of the novel.")
class Chapter(BaseModel):
name: str = Field(default="Chapter 1", description="The name of the chapter.")
content: str = Field(default="...", description="The content of the chapter. No more than 1000 words.")
async def generate_novel():
instruction = (
"Write a novel named 'Harry Potter in The Lord of the Rings'. "
"Write a novel named 'Reborn in Skyrim'. "
"Fill the empty nodes with your own ideas. Be creative! Use your own words!"
"I will tip you $100,000 if you write a good novel."
)
novel_node = await ActionNode.from_pydantic(Novel).fill(context=instruction, llm=LLM())
chap_node = await ActionNode.from_pydantic(Chapter).fill(
chap_node = await ActionNode.from_pydantic(Chapters).fill(
context=f"### instruction\n{instruction}\n### novel\n{novel_node.content}", llm=LLM()
)
print(chap_node.content)
print(chap_node.instruct_content)
asyncio.run(generate_novel())