Hacker News
lieret's submissions
1. Show HN: All the LM solutions on SWE-bench are bloated compared to humans (twitter.com/klieret)
1 point by lieret 44 days ago
2. Show HN: New eval from SWE-bench team evaluates LMs based on goals not tickets (codeclash.ai)
5 points by lieret 5 months ago | 1 comment
3. Show HN: Randomly switching between LMs at every step boosts SWE-bench score (swebench.com)
5 points by lieret 8 months ago | 1 comment
4. GPT-5 on SWE-bench: Cost and performance deep-dive (mini-swe-agent.com)
4 points by lieret 8 months ago | 3 comments
5. Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds (swebench.com)
2 points by lieret 8 months ago
6. Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of Python (github.com/swe-agent)
7 points by lieret 8 months ago | 4 comments