Hacker Newsnew | past | comments | ask | show | jobs | submit | lostmsu's commentslogin

There might be something to read between the lines for Putin.

Putin has nuclear weapons. What does he have to fear?

An assassination.

That doesn't mean there are many of these staying alive.

Not exactly, but pretty close: https://artificialanalysis.ai/models/capabilities/coding?mod...

Somewhere between Haiku 4.5 and Sonnet 4.5


> Somewhere between Haiku 4.5 and Sonnet 4.5

That's like saying "somewhere between Eliza and Haiku 4.5". Haiku is not even a so-called 'reasoning model'.¹

¹ To preempt the easily-offended, this is what the latest Opus 4.6 in today's Claude Code update says: "Claude Haiku 4.5 is not a reasoning model — it's optimized for speed and cost efficiency. It's the fastest model in the Claude family, good for quick, straightforward tasks, but it doesn't have extended thinking/reasoning capabilities."


Haiku 4.5 is a reasoning model. [0]

[0]: https://www-cdn.anthropic.com/7aad69bf12627d42234e01ee7c3630...

> Claude Haiku 4.5, a new hybrid reasoning large language model from Anthropic in our small, fast model class.

> As with each model released by Anthropic beginning with Claude Sonnet 3.7, Claude Haiku 4.5 is a hybrid reasoning model. This means that by default the model will answer a query rapidly, but users have the option to toggle on “extended thinking mode”, where the model will spend more time considering its response before it answers. Note that our previous model in the Haiku small-model class, Claude Haiku 3.5, did not have an extended thinking mode.


Sure, marketing people gonna market. But Haiku's 'extended thinking' mode is very different than the reasoning capabilities of Sonnet or Opus.

I would absolutely believe mar-ticles that Qwen has achieved Haiku 4.5 'extended thinking' levels of coding prowess.


>Sure, marketing people gonna market.

Oh HN never change.


Not sure what this means, but as a marketing person myself, here's what happened: One day, an Anthropican involved in the Haiku 4.5 launch shrugged, weighed the odds of getting spanked for equating "extended thinking" with "reasoning", and then used Claude to generate copy declaring that. It's not rocket surgery!

It's mainly that people on here, regardless of profession, speak incorrectly but confidentally about things that could be easily verified with a Google search or basic familiarity with the thing in question.

Haiku 4.5 is a reasoning model, regardless of whatever hallucination you read. Being a hybrid reasoning model means that, depending on the complexity of the question and whether you explicitly enable reasoning (this is "extended thinking" in the API and other interfaces) when making a request to the LLM, it will emit reasoning tokens separately prior to the tokens used in the main response.

I love your theory that there was some mix up on their side because they were lazy and it was just some marketing dude being quirky with the technical language.


> It's mainly that people on here, regardless of profession, speak incorrectly but confidentally about things that could be easily verified with a Google search or basic familiarity with the thing in question.

Yep. And if your heart wants to call Haiku a "reasoning model", obviously you must listen. It doesn't meet that bar for me for a couple reasons: (1) It lacks both "adaptive thinking" and "interleaved thinking" (per Anthropic, both critical for reasoning models), and (2) it also performed unacceptably with a real-world collection of very basic reasoning tasks that I tried using it for.¹ I'm glad you're having better luck with it.

That said, it's a great and affordable little model for what it was designed for!

¹ I once made the mistake of converting a bunch of skills (which require basic reasoning) to use Haiku for Axiom (https://charleswiltgen.github.io/Axiom/). It failed miserably, and wow, did users let me have it. On the bright side, as a result I'm now far better at testing models' ability to reason.


We are all reasonable people here, and while you are (mostly) correct, I think we can all agree that Anthropic documentation sucks. If I have to infer from the doc:

* Haiku 4.5 by default doesn't think, i.e. it has a default thinking budget of 0.

* By setting a non-zero thinking budget, Haiku 4.5 can think. My guess is that Claude Code may set this differently for different tasks, e.g. thinking for Explore, no thinking for Compact.

* This hybrid thinking is different from the adaptive thinking introduced in Opus 4.6, which when enabled, can automatically adjust the thinking level based on task difficulty.


Looks much closer to Haiku than Sonnet.

Maybe "Qwen3.5 122B offers Haiku 4.5 performance on local computers" would be a more realistic and defensible claim.


I won't disagree - the guideline prescribes to keep the original title as much as possible, and I failed to find more neutral source.

Looking at their benchmarks there doesn't appear to be meaningful difference between their quants and bartowsky quants.

No our Qwen3.5 new ones show the opposite see https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks

Am I misreading the table?

  Unsloth Q4_K_M

  PPL:       6.6053     KLD 99.9%: 0.5478     KLD mean: 0.0192

  bartowski Qwen_Q4_K_M

  PPL:       6.6097     KLD 99.9%: 0.5771     KLD mean: 0.0182

Barely noticeable drop in PPL; noticeable KLD drop (good, 5%); but worse KLD mean (bad, 5%).

You forgot to check the disk sapce - _M and _XL are not the same across quants:

Unsloth Q4_K_M 18.49GB 0.5478 KLD 99.9% 0.0192 mean

Unsloth Q4_K_XL 19.17GB 0.4097 KLD 99.9% 0.0137 mean

bartowski Q4_K_M 19.77GB 0.5771 KLD 99.9% 0.0182 mean


The table doesn't have bartowski Q4_K_XL to compare, but given the metrics of _Ms aren't universally better it's unclear if smaller size doesn't come with a cost.

> And of course they're also going to train on your private inputs. It's right there in the TOS.

Anthropic actually says they won't train on your private inputs on paid plans as long as you opted out. Unlike Google and OpenAI.


GitHub also does it fully automatically (but they don't share explicit criteria).

Your list is missing Nazi parties somewhere between the non whites and voting rights. And for most of the countries in the world - gun owners at the top of the list. Just speaking from historical perspective.

Maybe I'm misunderstanding, but are you saying that banning Nazi Parties and gun regulation are the first steps toward fascism and autocracy?

I'm saying the list above carefully includes a bunch of more or less universally recognized good things, with the subject added on top, implying that the "left" views on sexuality are also good things. But that form of argument is lying to you because this list omits bad things and other things in grey area.

To be fair, that depends on what the poster meant by "to be targeted". The list looks like it implies banning or criminalizing, but again, no one is being banned or criminalized under the legislation we are discussing.


Nazis can fuck off.

PS. Damn son, you put your LinkedIn out there for everyone to see too? https://www.linkedin.com/in/victor-msu


re: LinkedIn: well yeah. I strongly believe nobody should have "одни слова для кухонь, другие — для улиц".

I'll give you credit for at least putting a name and face to completely retarded beliefs. Here's hoping you get a clue, cheers.

Are you saying that's impossible? Can you give relative estimates?

I'm not saying it's impossible. I said "which have seen zero prosecutions in the US", i.e. to date.

My estimated timeline at this point is 3 years.


From reading this link it sounds like OpenAI successfully dodged oligopoly bullet.

How exactly does this work? Does it use Gmaps API?

Yes, to self-host it you will need a Google maps API key.

In the related links at the bottom, https://gdir.telae.net/links.html, the Git repo https://github.com/pafoster/gdir.telae.net is available along with some other cool things.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: