Why SWE-bench Verified no longer measures frontier coding capabilities

Article URL: https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/

Comments URL: https://news.ycombinator.com/item?id=47910388

Points: 110

Comments: 76