Hacker News
SWE-bench Verified no longer measures frontier coding capabilities
(openai.com)
304 points by kmdupree 20 hours ago | 168 comments