Hacker News | 44za12's comments

I read it as an article in defence of boring tech with a fancier/clickbaity title.

Here’s the more honest one I wrote a while back:

https://aazar.me/posts/in-defense-of-boring-technology


While I agree with your points, this one could be more nuanced:

> Infrastructure: Bare Server > Containers > Kubernetes

The problem with recommending a bare server first is that bare metal fails. Every couple of years a component gives out - a PSU, a RAID controller, a drive. A bare metal server is also more expensive than a VPS.

Paradoxically, a k3s distro with 3 small nodes and a load balancer at Hetzner may cost you less than a bare metal server and will definitely give you much better availability in the long run, albeit with less performance for the same money.


In 5 years of running 3x Dell R620s 24/7 - which were already 9 years old when I got them - I had two sticks of RAM have ECC errors, and one PSU fail. The RAM technically didn’t have to be replaced, but I chose to. The PSU of course had a hot spare, so the system switched over and informed me without issue.

IME, hardware is much more reliable than people think.


Specialised models easily beat SOTA, case in point: https://nehmeailabs.com/flashcheck


All of us use more or less the same keyboards, so maybe a human "randomly" typing a large number isn't as random as we'd like to think. Just as "asdf" and "xcvb" are common strings because those keys sit together, there's probably some pattern here as well.


Especially for those very large numbers in the top ten (like 166884362531608099236779 with 6779 searches), and the relatively small number of total "votes" (probably less than a million), I think the only likely explanation for their rank is ballot-stuffing.


That means there is less entropy than in purely random strings, not that this specific number would be so far outside the distribution. My money would be on someone hammering it.
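As a rough sketch of what "less entropy" means here: a per-digit frequency estimate, applied to the number from the thread. (This only measures symbol frequencies, not keyboard adjacency; a uniformly random digit string approaches log2(10) ≈ 3.32 bits per digit.)

```python
from collections import Counter
from math import log2

def shannon_entropy(s: str) -> float:
    """Bits per symbol, estimated from symbol frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Human-typed "random" numbers tend to repeat nearby keys,
# which pulls this estimate below the uniform ceiling.
typed = "166884362531608099236779"
print(round(shannon_entropy(typed), 2))
```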


This is the way. I actually mapped out the decision tree for this exact process and more here:

https://github.com/NehmeAILabs/llm-sanity-checks


That's interesting. Is there any kind of mapping to these respective models somewhere?


Yes, I included a 'Model Selection Cheat Sheet' in the README (scroll down a bit).

I map them by task type:

Tiny (<3B): Gemma 3 1B (could try 4B as well), Phi-4-mini (good for classification).

Small (8B-17B): Qwen 3 8B, Llama 4 Scout (good for RAG/extraction).

Frontier: GPT-5, Llama 4 Maverick, GLM, Kimi.

Is that what you meant?


At the risk of being obvious: do you have a tiny LLM gating this decision, classifying each task and directing it to the appropriate model?
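A minimal sketch of such a gate, with a plain keyword heuristic standing in for the tiny classifier LLM (the tier names, keywords, and model labels below are all made up for illustration):

```python
# Routing layer: a cheap classifier decides which model tier
# handles a request before anything expensive is called.
TIERS = {
    "classification": "tiny-3b",
    "extraction": "small-8b",
    "reasoning": "frontier",
}

def classify_task(prompt: str) -> str:
    """Stand-in for a tiny gating LLM that returns a task type."""
    p = prompt.lower()
    if "label" in p or "categorize" in p:
        return "classification"
    if "extract" in p or "parse" in p:
        return "extraction"
    return "reasoning"

def route(prompt: str) -> str:
    """Pick the model tier for a prompt."""
    return TIERS[classify_task(prompt)]

print(route("Extract the invoice total from this email"))  # → small-8b
```

In practice the heuristic would be replaced by an actual sub-3B model call, with "reasoning" as the fallback tier when the classifier is unsure.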


>Before you reach for a frontier model, ask yourself: does this actually need a trillion-parameter model?

>Most tasks don't. This repo helps you figure out which ones.

About a year ago I was testing Gemini 2.5 Pro and Gemini 2.5 Flash for agentic coding. I found they could both do the same task, but Gemini Pro was way slower and more expensive.

This blew my mind because I'd previously been obsessed with "best/smartest model", and suddenly realized what I actually wanted was "fastest/dumbest/cheapest model that can handle my task!"


For simple extraction tasks, a delimiter-separated string uses 11 tokens vs 35 for JSON. Output tokens are the latency bottleneck.
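A rough illustration of the size difference, and of how trivially the delimiter form parses back given a fixed field order (the record and field names are invented):

```python
import json

# The same extracted record in two output formats.
record = {"name": "Ada", "age": "36", "city": "London"}

as_json = json.dumps(record)          # keys, quotes, braces all cost tokens
as_delim = "|".join(record.values())  # "Ada|36|London"

# Parsing the delimiter form back, given an agreed field order:
fields = ["name", "age", "city"]
parsed = dict(zip(fields, as_delim.split("|")))

# Character counts; token counts shrink similarly since the
# structural overhead (keys, quotes, braces) disappears.
print(len(as_json), len(as_delim))
```

The trade-off is that the field order becomes an implicit contract between prompt and parser, so it only suits stable, simple schemas.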


Shameless plug.

I’ve been using a CLI tool I created over 2 years ago; it just works. I had more ideas but never got around to incorporating them.

https://github.com/44za12/horcrux


6 years for me if we're counting :)

https://github.com/edify42/otp-codegen


Love the minimalism.


Have been using remove.bg for this for years now.


Yes, I’ve built a free tool that delivers the same background removal results as remove.bg


Like a semaphore?


A semaphore limits concurrency; this one automatically groups (batches) input.
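As a sketch of the distinction: a hypothetical micro-batcher that groups items arriving within a short window, flushing when the batch is full or the window closes (names and thresholds are illustrative, not the library's API):

```python
import asyncio

async def micro_batcher(queue, process_batch, max_size=8, max_wait=0.05):
    """Group queued items into batches: flush when full or after max_wait."""
    while True:
        batch = [await queue.get()]  # block until the first item arrives
        deadline = asyncio.get_running_loop().time() + max_wait
        while len(batch) < max_size:
            timeout = deadline - asyncio.get_running_loop().time()
            if timeout <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break
        await process_batch(batch)

async def demo():
    q = asyncio.Queue()
    results = []

    async def handle(batch):
        results.append(batch)

    task = asyncio.create_task(micro_batcher(q, handle, max_size=3, max_wait=0.01))
    for i in range(3):
        await q.put(i)
    await asyncio.sleep(0.05)  # let the batcher drain the queue
    task.cancel()
    return results

print(asyncio.run(demo()))  # → [[0, 1, 2]]
```

A semaphore would have let three handlers run side by side; the batcher instead delivers all three items to one handler call.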


I’ve had great luck with all Gemma 3 variants; on certain tasks the quantized 27B has worked as well as 2.5 Flash. Can’t wait to get my hands dirty with this one.


Can you benchmark Kimi K2 and GLM 4.5 as well? Would be interesting to see where they land.


