PinchBench
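The "Lobster" model leaderboard evaluates how well different large language models perform on real-world agent automation tasks, helping developers choose the right model for their use case. It benchmarks LLMs as AI agents across standardized coding tasks.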
Tags: Agent ecosystem, AI agent testing, AI coding agent, benchmark, coding assistant, LLM benchmark, model comparison, openclaw
