Hacker News
Senior SWE-Bench: open-source benchmark that assesses agents as senior engineers
(senior-swe-bench.snorkel.ai)
138 points by matt_d 15 hours ago | 95 comments