Hacker News | DetroitThrow's comments

>In the world of harness development I think that's an interesting question to answer!

The challenge isn't about harness development though, and a sufficiently complex harness can solve these tasks rather easily.

And presenting it as if you've made a novel development for solving ARC-AGI-3 leads me to believe you're willing to waste all of our time for your benefit at every step in the future.


> a sufficiently complex harness can solve these tasks rather easily.

I claim this is not so easily done, and earlier iterations of ARC-AGI did not have the constraint in the first place. You want something that generalizes across all puzzles (hopefully even the private ones), and these puzzles are extremely diverse ... and hard; telling the model the controls and some basic guidelines for the game is the only "obvious" thing you can do.

The other point of my reply was efficiency, both in creating and in using the harness. The discussed solution is something that anyone (in fact, likely even an LLM itself) can cook up in a few minutes; it's not much more than a game control wrapper, so the agent can play around with the game in live Python, plus some generalities as laid out in the prompt.

(But I'm always happy to be proven wrong. What harnesses did you have in mind?)


The harness seems extremely benchmark-specific, which gives them a huge advantage over what most models can use. This isn't a qualifying score for that reason.

Here is the ARC-AGI-3 specific harness by the way - lots of challenge information encoded inside: https://github.com/symbolica-ai/ARC-AGI-3-Agents/blob/symbol...


Um, yes, this is an extremely benchmark-specific harness. It has a ton of knowledge encoded about the tasks at hand. The tweet is dishonest even in the best light.

The hard part of these tests isn't purely reasoning ability ffs.


The S-300 is very good AA, but in practice modern SEAD with a sizeable number of planes can outrange them, and they're not great at protecting themselves. We saw this in India-Pakistan and are seeing it again in Iran-USA. You can see more of a stalemate when they aren't getting outranged, as in Ukraine-Russia.

I am talking about the Chinese clones, not the original (is there a difference?).

As you mention they did not fare very well in the India-Pakistan conflict.


There are a few of these guys who make posts about technology that doesn't materialize after a few years; they can be ignored. There are plenty of pro-China observers out there who offer grounded analysis of the Chinese military-industrial base without claiming that China has unobtainium technology. /r/LessCredibleDefence has a shortlist of these propagandists.

Yeah it's certainly unimaginable that the civilization that invented gunpowder, cannons, guns, rockets a thousand years ago can make it for cheap now :)

'Hypersonic' missile makes it sound like it's alien technology; no it's solid boosters that do not follow the usual ballistic trajectory, with a computer from 1970.

The raw materials cost less than half of a standard car.


"no it's solid boosters that do not follow the usual ballistic trajectory"

Hypersonics do not. They are extremely fast and extremely low flying.


No, hypersonic is a marketing term here that indicates 'difficult to intercept'.

It does not imply anything about speed, just automatic or controlled maneuvering later in the stage than normal missiles do.


The very definition of hypersonic requires at least Mach 5 in terms of speed.

sigh


We have had Mach 5 missiles for about 60-80 years now; that's not what the novelty is.

Mach 5, high maneuverability, inside the atmosphere. Normally a non-ballistic trajectory. That's been the goal for a very long time.

https://www.armyupress.army.mil/Journals/NCO-Journal/Archive...

https://en.wikipedia.org/wiki/Hypersonic_weapon

Do you have something to add to this discussion?

Are we just redoing definitions, or what?


> Mach 5, high maneuverability, inside the atmosphere.

Out of these, Mach 5 and inside the atmosphere have been doable for several decades. Pretty much all countries that make missiles can make missiles with these two characteristics.

My point, which you seem to either misunderstand or deliberately misrepresent, is the other one - "maneuverability" - being the distinguishing factor for what we call hypersonic missiles. That makes these difficult to defend against.

Think of it like calling humans hyper-limbed animals, but limbs being not what really distinguishes humans from, say, chimpanzees.


Maneuverability isn't new either; AIM-9Xs can pull 60 Gs.

MUTANT missiles will take that a step further as the tech works through the Program Office trials.

I'm not sure what point it is you're trying to make here, this whole thread seems like a silly waste of time.

There are people on this site that work directly in the offices of these programs.


> I'm not sure what point it is you're trying to make here, this whole thread seems like a silly waste of time.

Yeah, step one before replying to something should be trying to understand the point.

> There are people on this site that work directly in the offices of these programs.

Maybe even in this thread!


> Maybe even in this thread!

Maybe! :-)


I've only read a few short blurbs about this. What makes you think the booster doesn't follow a normal ballistic trajectory?

That's pretty much the entire point of what people are calling hypersonic missiles. All ballistic missiles fly at hypersonic speeds. The advance is being able to do so at low altitude with maneuverability.

You are correct, but I should point out that Russia has described its Kinzhal missiles as hypersonic, when they are really more of a traditional ballistic missile fired horizontally. So very fast (Mach 10), but not as maneuverable as what the U.S. has been calling hypersonic.

Since the original story here does not provide many details, we can't know which side of that fence this falls on (assuming it is real).


Was there any evidence that the Kinzhals fired, for example, toward Kyiv during the current conflict were fired on a depressed trajectory? I remember reading one account that looked like a plain old interception of a ballistic missile. (which is impressive enough to someone who remembers when "Patriot missile" was not exactly synonymous with excellence)

Kinzhals being intercepted all the time could also be propaganda or missile defense having progressed more than publicly known.

It's not a great idea in war to assume your enemy is incompetent (even when they are).


> That's pretty much the entire point of what people are calling hypersonic missiles.

Most missiles endowed with the "hypersonic" moniker are simply theater ballistic missiles used for standard ballistic missile things, which is part of why I asked the question.

> The advance is being able to do so at low altitude with maneuverability.

Hate to burst your bubble but arms dealers and governments are as capable as anyone else of marketing spin.


Every security engineer I know working at Azure is on the verge of self-harm because of the current situation, or is the dumbest IC I've ever met and somebody I think should have never become a security engineer. Sample size ~12.

That is quite the indictment.

I am not very close with every one of these engineers, and some no longer work at MSFT, but yes, talking to employees in Seattle working on security made me never want to use Azure.

Last I heard, the CO+I org has some pretty serious cultural problems that contribute to this, and which will not be easily solved.

>It works great.

When it is online, I agree with everything aside from the "fast" part, actually. But many companies have a secondary service for async comms/chat for when Teams cannot be online, which compares poorly to Slack.


Thanks for that thought. Horrible.

300-400 miles depending on conditions

That's still phenomenal imho.

I think the gap is smaller than it has been in the past, but I largely agree with you: larger work is generally done much better with Claude Code.
