Hacker News | Sajarin's comments


People aren't good at detecting AI-generated or AI-edited comments, so I'm unsure how effective this policy will be. Though I guess there are still some obvious signs of AI speak like em-dashes and sycophantic ("it's not X, it's Y!") phrasing.

Bit of a shameless plug, but I wrote an HN AI comment detector game[0] (with AI), and most of my friends and fellow HN users who tried it out couldn't detect the generated comments.

[0]: https://psychosis.hn/

[1]: https://sajarin.com/blog/psychosis/


Something I've noticed through moderation is that people are much more easily duped by generated comments if they like the content and/or agree with the point. We've seen several cases where a bot-generated comment has been heavily upvoted and sits at the top of the thread for hours, and any comments calling it out for being generated languish at the bottom of the subthread below other enthusiastic, heavily upvoted replies. This shouldn't be surprising, given what we've seen of LLM chatbots being tuned to be sycophantic, but it's interesting to see it in effect on HN.

This is another reason why it's good to email us (hn@ycombinator.com) rather than commenting when you see generated comments.


Do you have reason to believe that you have a reliable way in these cases of determining whether the comment is generated?


Having been reading generated comments almost daily for over three years now, I have a pretty good sense of it. There are a bunch of signals: how new the account is; how the comments look visually (the capitalization and layout of the paragraphs, particularly when all of one user's comments are displayed in a list). Em-dashes and short, emphatic sentences make it more obvious, of course.

There are cases that are more borderline; usually when someone has used a translation service or has used an LLM to polish up a comment they wrote themselves. For these ones there's less certainty, and whilst we discourage them, we're not as rigid in our aversion to them or as eager to ban accounts that do it.

But ones that are entirely generated are still pretty easy to spot, even just from visual appearance.


> HN AI comment detector game

Looks cool, but how exactly do you gather proven-to-be human comments?

I think it would be better if you used pre-ChatGPT (Nov 30 2022, I think?) stories.


I appreciate the restraint in not calling your game "AIdle".


It’s certainly hard to detect in isolation, but the thing that gives it away is the comment history.

All the AI accounts I’ve seen repeatedly post the exact same cookie-cutter top-level comments over and over again. Typically some vapid observation followed by an obviously forced question serving as engagement bait. The paragraphs and sentence structure even look visually similar across comments when you scroll down the history page.

Just look at a few of these accounts and you’ll easily be able to recognize AI posts on your own.

https://news.ycombinator.com/threads?id=naomi_kynes
https://news.ycombinator.com/threads?id=aplomb1026
https://news.ycombinator.com/threads?id=decker_dev
https://news.ycombinator.com/threads?id=CloakHQ
https://news.ycombinator.com/threads?id=coolcoder9520
https://news.ycombinator.com/threads?id=ptak_dev
https://news.ycombinator.com/threads?id=oliver_dr
https://news.ycombinator.com/threads?id=agent5ravi
https://news.ycombinator.com/threads?id=yuyuqueen
https://news.ycombinator.com/threads?id=entrustai
https://news.ycombinator.com/threads?id=coder_decoder
https://news.ycombinator.com/threads?id=mergisi
https://news.ycombinator.com/threads?id=JEONSEWON
https://news.ycombinator.com/threads?id=devonkelley
https://news.ycombinator.com/threads?id=iam_circuit
https://news.ycombinator.com/threads?id=robotmem
https://news.ycombinator.com/threads?id=RovaAI
https://news.ycombinator.com/threads?id=ajstars
https://news.ycombinator.com/threads?id=priowise
https://news.ycombinator.com/threads?id=Yanko_11
https://news.ycombinator.com/threads?id=zacklee-aud
https://news.ycombinator.com/threads?id=shablulman
https://news.ycombinator.com/threads?id=octoclaw
https://news.ycombinator.com/threads?id=zacklee1988
https://news.ycombinator.com/threads?id=bhekanik
https://news.ycombinator.com/threads?id=webpolis
https://news.ycombinator.com/threads?id=claud_ia
https://news.ycombinator.com/threads?id=david_iqlabs
https://news.ycombinator.com/threads?id=yamarldfst
https://news.ycombinator.com/threads?id=julius_eth_dev
https://news.ycombinator.com/threads?id=vexnull
https://news.ycombinator.com/threads?id=idorozin


> obvious signs of AI speak like em-dashes

Some of us were trained or self-taught to write that way. Even "it's not X, it's Y" is a legitimate and subjectively effective communication tool, and there are those of us who, whether through training or by modeling others, have picked it up as a habit. It's not AI that started this; AI learned it from us.

Crap - I just did it, didn't I? Awww double crap! Did it again...


Forums and comments are not written like formal prose or novels. Corporate-speak is also not typically used in these environments unless you are representing corporate.

So I think it's fine to scrutinize commenters who write that way.

Besides, the biggest offense of AI speak is making everything seem like a grand epiphany and revolutionary discovery. Aka engagement bait.


Shameless plug, but I made a similar tree here: https://sajarin.com/blog/modeltree/


Thanks, that's way more useful to me.

Allow me to contribute:

> Magistral: Magist(rate) + stral? Mag(nificent) + stral? Nobody knows.

That's just French for "masterful", or a way to describe lectures. There's a sense of greatness in that word that contrasts with the Mini in Ministral, which in turn might be a pun on "ménestrel" (minstrel), "ministre" (minister), or made to sound like Minitel (or all of the above).


This is great, I found it much more interesting to view this tree vs. the timeline alone.


psychosis.hn is a daily game. Every day we fetch three stories from a previous front page of HN, each with 5-7 AI comments threaded into the discussion. They have personas, reply to real people, and sometimes have real comments reparented underneath them.
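
For anyone curious about the mechanics, here's a rough sketch of how a pipeline like that might look. It is not the actual implementation; the function names, persona handling, and generation stub are illustrative, though the item data really is available from the public HN Firebase API (https://github.com/HackerNews/API):

    # Illustrative sketch only, not the real psychosis.hn pipeline.
    import random
    import requests

    HN_API = "https://hacker-news.firebaseio.com/v0"

    def fetch_item(item_id):
        """Fetch a single HN item (story or comment) as JSON via the public Firebase API."""
        return requests.get(f"{HN_API}/item/{item_id}.json", timeout=10).json()

    def fetch_story_with_comments(story_id, limit=30):
        """Return a story plus its top-level comments."""
        story = fetch_item(story_id)
        comments = [fetch_item(k) for k in story.get("kids", [])[:limit]]
        return story, comments

    def generate_comment(persona):
        # Placeholder: in practice this would prompt an LLM with the persona,
        # the story, and the surrounding thread.
        return f"[generated comment in the voice of {persona}]"

    def thread_ai_comments(comments, personas, n=5):
        """Insert n generated comments at random positions among the real ones."""
        mixed = list(comments)
        for persona in random.sample(personas, n):
            fake = {"by": persona, "text": generate_comment(persona), "is_ai": True}
            mixed.insert(random.randrange(len(mixed) + 1), fake)
        return mixed

The real work is in the generation prompts and in making the threading (and reparenting of real replies) look natural, which the sketch glosses over.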


I've got to say, that's pretty damn good, and pretty damn scary.


Seconded. I scored 0, missing all the bots and falsely marking some real human comments.


Wish this post got more of a response, so thanks so much for giving it a try! Hope it was at least fun (and maybe a bit horrifying) :)


Responses (well, all engagement) are a lottery*, so don't let it get you down. :)

* See e.g. mine: https://news.ycombinator.com/submitted?id=ben_w


Sonnet numbering has been weirder in the past.

Opus 3.5 was scrapped even though Sonnet 3.5 and Haiku 3.5 were released.

Not to mention Sonnet 3.7 (while Opus was still on version 3).

Shameless source: https://sajarin.com/blog/modeltree/


I like this tree visualization! The background with little squares is making the text difficult to read, though.


Thanks for the feedback, friend! Updated to make it (hopefully) a little easier to read.


Thanks, that means a lot! Let me know if you have any feedback or suggestions, I would love to work on any improvements :)


Those smooth chunks are all (mostly) public park land, known as the Presidio and part of the Golden Gate National Recreation Area.


You know your city.


I think this comes off a bit too strong (as do the replies to this, to be fair).

The example isn't quite accurate. If a friend bought you lunch, the social norm of reciprocity would incline you towards buying them lunch in the future (i.e., with part of your paycheck).

Free open source software is a public good. While there is no obligation to give back, giving back helps that public good become more useful to other people (including your future self). I'm against making contribution an obligation, but I'm not against light social pressure upon philanthropists who have the means (which is what the parent comment was doing).


In the lunch example, reciprocation would be releasing additional software under free software licenses, not payments.

There should be zero social pressure, as gifts do not convey obligation. It was the software author’s explicit choice when licensing and publishing the software to make clear that payment is not expected.


Do you routinely struggle in social situations? Do you frequently have people tell you that you misinterpreted social cues?

You are correct that no legal obligation was created, but generally people feel that if you got something from a community that helped you succeed greatly, you do have an obligation to throw something back to the organization to help it help others.

If you don't, that's generally classified by people as being a jackass.


Gifts do confer obligations. This is widely agreed upon in human society. If you ignore this there will be consequences, just no legal ones.


What did Ilya see? (or rather what could he no longer bear to see?)

> Academics distorting graphs to make their benchmarks appear more impressive

> Lavish 1.5 million dollar bonuses for everyone at the company

> Releasing an open-source model that doesn't even use multi-head latent attention, in an open-source AI world led by Chinese labs

> Constantly overhyping models as scary and dangerous to buy time to lobby against competitors and delay product launches

> Failing to match that hype, as AGI is not yet here


I wonder if anyone has done an analysis of HN user sentiment toward the various AI models over time. I'd be curious to see what that looks like. I'm seeing more and more people talk positively about Gemini and Google (and having used Gemini recently, I align with that sentiment).

I think Bard (lol) and Gemini got a late start, so lots of folks dismissed them, but I feel like they've fully caught up. Definitely excited to see what Gemini 3 vs GPT-5 vs Claude 4 looks like!
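
On the sentiment question, a crude starting point (purely a sketch; the model names, query terms, and the choice of VADER as the scorer are arbitrary) would be to pull comments from the public Algolia HN Search API and score them:

    # Rough sketch, not a real study: average sentiment of recent HN comments per model name.
    import requests
    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    sia = SentimentIntensityAnalyzer()

    def mean_sentiment(query, pages=3):
        """Average VADER compound score over recent HN comments matching `query`."""
        scores = []
        for page in range(pages):
            resp = requests.get(
                "https://hn.algolia.com/api/v1/search_by_date",
                params={"query": query, "tags": "comment", "hitsPerPage": 100, "page": page},
                timeout=10,
            ).json()
            for hit in resp.get("hits", []):
                text = hit.get("comment_text") or ""  # note: raw HN HTML, escapes and all
                if text:
                    scores.append(sia.polarity_scores(text)["compound"])
        return sum(scores) / len(scores) if scores else 0.0

    for model in ["Gemini", "GPT-4o", "Claude"]:
        print(model, round(mean_sentiment(model), 3))

Bucketing by each hit's created_at_i timestamp instead of averaging everything would give the over-time view.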


I'm using the Windsurf IDE, so I have all the main models available. Mainly doing Python, JS, HTML, CSS, and some Go. I have found Claude 3.7 outperforms Gemini 2.5, ChatGPT 4.1, 4o, DeepSeek, etc., for my work in most cases.

I suspect I experience some performance throttling with Gemini 2.5 in my Windsurf setup, because it's just not as good as anecdotal reports by others (and benchmarks) suggest.

I also seem to run up against a kind of LLM laziness sometimes, when they seemingly can't be bothered to answer a challenging prompt ... a consequence of load balancing in action, perhaps.


Windsurf is about to lose its ability to use other models since it got bought by OpenAI. Still a very cool tool, though!


Who cares about sentiment when you can just look at a proxy for usage: https://openrouter.ai/rankings

EDIT: Specifically: https://openrouter.ai/rankings/programming?view=week


Gemini hit the top of a bunch of leaderboards recently, which probably prompted folks to try it out, and they found it useful.

