wiseflow/test
2024-12-23 10:12:52 +08:00
webpage_samples add test 2024-12-18 22:45:20 +08:00
fetching_for_sample.py add test for v0.3.6 2024-12-23 10:12:52 +08:00
find_article_or_list.py add test for v0.3.6 2024-12-23 10:12:52 +08:00
get_info_test.py add test for v0.3.6 2024-12-23 10:12:52 +08:00
image.png add test 2024-12-18 22:45:20 +08:00
openai_wrapper.py add test for v0.3.6 2024-12-23 10:12:52 +08:00
prompts.py add test for v0.3.6 2024-12-23 10:12:52 +08:00
README.md add test for v0.3.6 2024-12-23 10:12:52 +08:00
vl_pic_test.py add test for v0.3.6 2024-12-23 10:12:52 +08:00

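| Model | Prompt language | Missing characters | Instruction non-compliance | Recognition errors | Hallucinations | Total score | Rating |
|---|---|---|---|---|---|---|---|
| Qwen/Qwen2-VL-72B-Instruct | cn prompt | 2 | 1 | 3 | 0 | 6 | |
| Qwen/Qwen2-VL-72B-Instruct | en prompt | 2 | 1 | 1 | 0 | 4 | 👍 |
| OpenGVLab/InternVL2-26B | cn prompt | 1 | 0 | 2 | 0 | 3 | 👍👍 |
| OpenGVLab/InternVL2-26B | en prompt | 0 | 2 | 3 | 0 | 5 | |
| Pro/Qwen/Qwen2-VL-7B-Instruct | cn prompt | 1 | 1 | 2 | 1 | 5 | |
| Pro/Qwen/Qwen2-VL-7B-Instruct | en prompt | 0 | 2 | 3 | 0 | 5 | |
| Pro/OpenGVLab/InternVL2-8B | cn prompt | 3 | 2 | 2 | 0 | 7 | |
| Pro/OpenGVLab/InternVL2-8B | en prompt | 2 | 2 | 4 | 1 | 9 | |
| deepseek-ai/deepseek-vl2 | cn prompt | 1 | 1 | 1 | 1 | 4 | 👍 |
| deepseek-ai/deepseek-vl2 | en prompt | 3 | 0 | 1 | 4 | 8 | |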
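
In every row, the total score equals the sum of the four error counts, so a lower total appears to mean a better result (the rows marked 👍 have the lowest totals). The following is a minimal sketch of that reading, not code from this repository: the names `RESULTS`, `total_score`, and `rank` are hypothetical, and the numbers are copied from the table above.

```python
# Error counts copied from the results table, in this order:
# (missing characters, instruction non-compliance, recognition errors, hallucinations)
# Assumption: the total score is the plain sum of the four counts (lower is better).
RESULTS = {
    ("Qwen/Qwen2-VL-72B-Instruct", "cn"): (2, 1, 3, 0),
    ("Qwen/Qwen2-VL-72B-Instruct", "en"): (2, 1, 1, 0),
    ("OpenGVLab/InternVL2-26B", "cn"): (1, 0, 2, 0),
    ("OpenGVLab/InternVL2-26B", "en"): (0, 2, 3, 0),
    ("Pro/Qwen/Qwen2-VL-7B-Instruct", "cn"): (1, 1, 2, 1),
    ("Pro/Qwen/Qwen2-VL-7B-Instruct", "en"): (0, 2, 3, 0),
    ("Pro/OpenGVLab/InternVL2-8B", "cn"): (3, 2, 2, 0),
    ("Pro/OpenGVLab/InternVL2-8B", "en"): (2, 2, 4, 1),
    ("deepseek-ai/deepseek-vl2", "cn"): (1, 1, 1, 1),
    ("deepseek-ai/deepseek-vl2", "en"): (3, 0, 1, 4),
}


def total_score(counts):
    """Return the total error count for one (model, prompt language) run."""
    return sum(counts)


def rank(results):
    """Return (model, prompt language) keys ordered from fewest to most errors."""
    return sorted(results, key=lambda key: total_score(results[key]))


totals = {key: total_score(counts) for key, counts in RESULTS.items()}
best = rank(RESULTS)[0]

if __name__ == "__main__":
    for model, lang in rank(RESULTS):
        print(f"{totals[(model, lang)]:>2}  {model} ({lang} prompt)")
```

Sorting by the summed count treats all four error types as equally costly. If hallucinations should weigh more than missing characters, `total_score` could take per-category weights instead.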