Hacker News | new | past | comments | ask | show | jobs | submit | XCSme's comments | login

Their stats look ok, but when I tested it[0], it was 4x slower than 4.20.

[0]: https://aibenchy.com/compare/x-ai-grok-4-20-medium/x-ai-grok...


It is cheaper per token, but it seems to reason a lot more, leading to costs similar to 4.20's, but performance is better (similar to what 4.20 had[0]).

Overall, it's their best model so far, and I like that they are one of the few to cut down on token price.

[0]: https://aibenchy.com/compare/x-ai-grok-4-20-medium/x-ai-grok...


Is it possible to do some sort of Binary* Search ("Binary Star", as in the A* search algorithm, where we use heuristics)?

    a: [1,3,5,7,8,9,10,15]  
    x: 8 (query value)
For this array, we would compare a[0], a[3], a[7] (left/mid/right) against x by subtracting x=8 from each.

And we would get d=[-7, -1, 7]

Now, normally, with binary search, because x > a[mid], we would go to (mid+right)/2, BUT we already have some extra information: we see that x is closer to a[3] (diff of 1) than to a[7] (diff of 7), so instead of going to the middle between mid and right, we could choose a new "mid" point that's closer to the desired value (maybe as a ratio of d[mid] to d[right]).

So left=mid+1 and right stays the same, but the new mid is NOT the midpoint of left and right; it is (left+right)/2 + ratioOffset,

where ratioOffset shifts mid closer to left or right, depending on d.

The idea is quite obvious, so I am pretty sure it already exists.

But what if we use SIMD with it? Then we would know not only which block the number is in, but also which part of the block the number is likely in. Or is this what the article actually says?
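A minimal sketch of that "ratioOffset" idea (this is what interpolation search does, as the reply below notes); the array and query value are the ones from the example above, and the function name is just my own:

```python
def interpolation_search(a, x):
    """Probe proportionally to where x sits between a[lo] and a[hi],
    instead of always probing the index midpoint.

    Assumes a is sorted and roughly uniformly distributed.
    Returns the index of x, or -1 if absent.
    """
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= x <= a[hi]:
        if a[hi] == a[lo]:
            mid = lo  # all remaining values equal; avoid division by zero
        else:
            # The "ratioOffset" in one step: mid lands at the fraction
            # (x - a[lo]) / (a[hi] - a[lo]) of the way through [lo, hi].
            mid = lo + (x - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[mid] == x:
            return mid
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

a = [1, 3, 5, 7, 8, 9, 10, 15]
print(interpolation_search(a, 8))  # prints 4 (a[4] == 8)
```

On the example array it probes a[3]=7 first (since 8 is about half of the 1..15 range), then jumps straight to a[4]=8, matching the intuition above.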


Yeah, this is basically interpolation search.

Oh, that's what the article was referring to with "interpolation".

Weird that I hadn't heard about it before; is it not used much in practice?

One reason I could see is that binary search is fast enough and easy to implement. Even on the largest datasets, it's still just a few tens of loop iterations.
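To put "a few tens of loop iterations" in numbers, binary search's worst case is ceil(log2(n)) probes:

```python
import math

# Worst-case probe counts for binary search at various array sizes.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, math.ceil(math.log2(n)))
# 1000 -> 10, a million -> 20, a billion -> 30 probes
```

So even a billion-element sorted array takes only about 30 comparisons.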


But with new hardware coming out, and maybe models being smart enough to help with optimizing themselves and reducing inference costs even more, I think we should still expect the costs to go down.

Can you use those AI cards for gaming too?

Or do the makers intentionally nerf them, in order to better segment the markets/product lines?


The drivers often need per-game optimisations, which these will be missing, but I doubt Intel would nerf them; they just rely on you not paying a lot for RAM the game won't use.

I actually meant it in a different way. I would get it for local AI stuff, but being able to game on it would be a huge plus, otherwise I would need two different machines.

Much as I want diversity, a 3090 would be a billion times better for games and can probably hold its own for a broader AI workload, anything other than running highly quantised models that don't fit in 24GB with relatively small contexts.

A 3090 is what I have now.

But I hope to somehow get 48GB or 64GB of VRAM in a GPU that's also gaming-ready.

I was looking at maybe getting a Mac Studio for this reason, but I don't think a Mac is really good for gaming.


It'll work just fine for gaming. It's what the B770 would have been if it had 32GB RAM and ever got released.

They nerf gaming cards to make money on the pro cards. Since this is a pro card it's not nerfed.

I am a bit confused by the separation between VSCode and Copilot. If I cancel my Pro+ subscription, can I still use Copilot with my own OpenRouter key?

Yeah, but you get the benefit of using any model of your choice.

That's so cool. I would have expected the domain to go for hundreds of thousands or millions, or, more likely, to not be purchasable for some reason. I can see a future where google.com is purchased for fun by some robot in 200 years.

Do we also have to pay for the API usage? Then they will actually be profitable, lol

Their API issues seem to have been resolved; it now performs[0] as expected, at a similar level to GLM 5.

[0]: https://aibenchy.com/compare/deepseek-deepseek-v4-flash-high...

