Commit graph

509 commits

Author SHA1 Message Date
didi
d8c7174fc0 Update HotpotQA's init round 2024-10-21 23:13:29 +08:00
didi
23eec00b00 Update operator.py 2024-10-21 23:11:48 +08:00
didi
2d1d7ca219 Update Operator & Benchmark 2024-10-21 23:08:51 +08:00
Zhaoyang Yu
fe3fca514a Create download_data.py
Need to run the script in the aflow/data path
2024-10-21 16:34:48 +08:00
didi
c194415b35 Update mbpp & math's eval 2024-10-21 12:50:17 +08:00
didi
efa00f8bbb Update 2024-10-21 12:46:17 +08:00
didi
ade10684b7 Update Operator's code 2024-10-21 11:27:50 +08:00
Zhaoyang Yu
6ebf3c47c2 Update drop.py
Change comments into English, fix the in/out params' types.
Fix "too many values to unpack" at line 140.
Unify the quotes.
Remove "if" at line 148.
2024-10-19 11:40:27 +08:00
didi
17f3cd4955 Refactor Evaluator 2024-10-19 07:41:59 +08:00
didi
5d6fa7a68f Update readme.md 2024-10-19 07:35:44 +08:00
didi
ebcacdd648 Update print error 2024-10-18 13:57:01 +08:00
didi
2b788b21f6 Update Annotation to English, And Update Operator.json 2024-10-18 13:50:51 +08:00
better629
d99054ab5e Merge branch 'main' into main 2024-10-17 16:25:31 +08:00
didi
6aedc4a068 Update AFlow 2024-10-17 15:47:09 +08:00
Zhaoyang Yu
cea3473002 Update evaluator.py
change the data path of DROP
2024-10-16 20:19:19 +08:00
Zhaoyang Yu
859ee3d2e3 fix test() 2024-10-16 20:15:15 +08:00
Zhaoyang Yu
390b65fda3 Update HotpotQA 2024-10-16 19:55:10 +08:00
didi
eea94865ad Update Eval 2024-10-16 12:06:34 +08:00
didi
bb229f2319 Update AFlow 2024-10-16 11:49:18 +08:00
didi
eae351466f Update AFlow 2024-10-16 11:44:01 +08:00
femto
a7efa27ce0 rm 2024-10-11 16:48:08 +08:00
didi
040a7324eb Update test_curve.py 2024-09-26 21:56:54 +08:00
didi
e8f6186a56 update 2024-09-26 20:06:57 +08:00
didi
f14830b16a Update optimizer.py 2024-09-25 16:47:12 +08:00
didi
8dfe2de34c Update 2024-09-25 16:46:20 +08:00
didi
6a84a9d49b Update HumanEval Eval 2024-09-24 19:28:03 +08:00
didi
c7f44e956d Update 2024-09-22 21:45:53 +08:00
didi
99a9f7b6e9 update humaneval data path and add baseline data 2024-09-22 19:35:06 +08:00
didi
e3bcedc298 Update Benchmark's data 2024-09-22 15:54:32 +08:00
didi
22e8f9d7fc Update baseline and benchmark; update evaluator 2024-09-22 15:46:50 +08:00
didi
63f3f884c9 Update for fengwei 2024-09-16 18:13:30 +08:00
didi
53890a5f86 Update the HotpotQA benchmark code and the corresponding Self Consistency implementation 2024-09-13 12:56:18 +08:00
didi
0704f341de Update the entry point for the eval index 2024-09-11 17:53:52 +08:00
didi
b805da0bbe Fix the eval bug and update the new baseline 2024-09-11 17:00:14 +08:00
didi
b9a2d94da2 Update the xml-compile method and the remaining baselines 2024-09-11 15:21:33 +08:00
Zhaoyang Yu
f691c5f439 Update QA 2024-09-10 21:51:38 +08:00
didi
bdf865eb0d Merge branch 'main' of https://github.com/didiforgithub/MetaGPT-MathAI 2024-09-10 19:06:17 +08:00
didi
7f45ef6231 Merge branch 'main' of https://github.com/didiforgithub/MetaGPT-MathAI 2024-09-10 19:04:05 +08:00
Zhaoyang Yu
445a2e6048 Update QA 2024-09-10 18:59:13 +08:00
didi
ab112462a5 Add cost calculation 2024-09-10 18:51:18 +08:00
Zhaoyang Yu
257b994409 Update drop.py 2024-09-10 18:45:54 +08:00
Zhaoyang Yu
68e87da378 Update Hotpotqa 2024-09-10 18:27:20 +08:00
Zhaoyang Yu
4ce18d7f48 Update baselines 2024-09-10 16:51:26 +08:00
didi
0b0a49d772 Update hotpotqa baseline 2024-09-10 12:48:28 +08:00
didi
c7c34cda7d Update human Eval 2024-09-10 10:58:39 +08:00
didi
62ffa730e0 update humaneval baseline & hotpotqa baseline 2024-09-10 10:57:21 +08:00
didi
7ffe68b499 Refactor Evaluator 2024-09-09 18:19:22 +08:00
didi
4e0a896bdc Commit baseline examples; change how the context-fill format is recognized 2024-09-09 17:17:15 +08:00
didi
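ca560a844f Merge Eval and Optimize
Open issues
1. eval has some parameter discrepancies (path, csv tests)
2. optimize: try the new flow (the optimize curve after optimization); write the optimize template
3. Get optimize running on every dataset
4. Create the baseline folder
5. Create a method for collecting experiment data
6. Move out of ags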
2024-09-08 23:04:01 +08:00
didi
c3903412b4 Update Operator Optimize Method. 2024-09-02 16:47:03 +08:00