It's very much necessary but not sufficient. In practice, sample complexity matters a lot too; it's also an asymptotic notion, but a more informative one. For example, the central limit theorem is far more powerful than the law of large numbers.
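To make that concrete: the law of large numbers only says the sample mean X̄_n converges to the true mean μ, with no rate attached. The CLT supplies the rate: √n(X̄_n − μ) → N(0, σ²), so the error shrinks like σ/√n. That rate is what actually tells you how many samples you need.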
I don't think that this is true. You need an infinite number of dimensions for this (think Taylor expansions, Fourier expansions, infinitely wide or deep NNs...).
I don't think they meant "in O(1) steps". I think they meant "the day someone figures out how to keep many thousands of qubits entangled while operating on them with gates will be the same day we have the first QC that can start breaking encryption in reasonable time", where, of course, "same day" is also an exaggeration. But the general point is that we need a single breakthrough to get there, and it's very hard to estimate how long a breakthrough will take to appear.
No LLM in the loop. The consolidation pass is deterministic:
Pull the N most recent active memories (default 30) with embeddings
Pairwise cosine similarity, threshold 0.85
For each similar pair, check if they share extracted entities
Shared entities + similarity 0.85-0.98 → flag as potential contradiction (same topic, maybe different facts)
No shared entities + similarity > 0.85 → redundancy (mark for consolidation)
Second pass at 0.65 threshold specifically for substitution-category pairs (e.g., "MySQL" vs "PostgreSQL" in otherwise-similar sentences) — these are usually real contradictions even at lower similarity
Consolidation then collapses the redundancy set into canonical memories with combined importance/certainty. No LLM call, no randomness. Reproducible, cheap, runs in a background tick every ~5 minutes.
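For the curious, here's a minimal Rust sketch of the pairing logic described above, under stated assumptions: the `Memory` and `Flag` types and every name below are hypothetical stand-ins, not the crate's actual definitions, and the 0.65 substitution-pair pass is omitted.

```rust
// Illustrative types only; not the crate's real definitions.
struct Memory {
    id: u64,
    embedding: Vec<f32>,
    entities: Vec<String>, // extracted entities
}

enum Flag {
    Contradiction(u64, u64), // same topic, possibly conflicting facts
    Redundancy(u64, u64),    // near-duplicates, mark for consolidation
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

fn shares_entity(a: &Memory, b: &Memory) -> bool {
    a.entities.iter().any(|e| b.entities.contains(e))
}

// First pass at the 0.85 threshold; the 0.65 substitution pass is omitted.
fn classify_pairs(memories: &[Memory]) -> Vec<Flag> {
    let mut flags = Vec::new();
    for (i, a) in memories.iter().enumerate() {
        for b in &memories[i + 1..] {
            let sim = cosine(&a.embedding, &b.embedding);
            if sim <= 0.85 {
                continue;
            }
            if shares_entity(a, b) && sim <= 0.98 {
                // same topic, maybe different facts
                flags.push(Flag::Contradiction(a.id, b.id));
            } else {
                // near-duplicate: mark for consolidation
                flags.push(Flag::Redundancy(a.id, b.id));
            }
        }
    }
    flags
}

fn main() {
    let m1 = Memory { id: 1, embedding: vec![1.0, 0.0], entities: vec!["Postgres".into()] };
    let m2 = Memory { id: 2, embedding: vec![0.8, 0.3], entities: vec!["Postgres".into()] };
    for f in classify_pairs(&[m1, m2]) {
        match f {
            Flag::Contradiction(a, b) => println!("contradiction: {a} vs {b}"),
            Flag::Redundancy(a, b) => println!("redundancy: {a} vs {b}"),
        }
    }
}
```

Because the pass only sorts pairs into two buckets with fixed thresholds, it's trivially reproducible: same inputs, same flags.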
The LLM could improve this (better merge decisions, better entity alignment) but the tradeoff is cost and non-determinism. v1 is deterministic on purpose.
Source: crates/yantrikdb-core/src/cognition/triggers.rs and consolidate.rs next to it.
> Pull the N most recent active memories (default 30) with embeddings
> Pairwise cosine similarity, threshold 0.85
So your system is unable to differentiate between AWS and Azure (~0.95 similarity), and probably can't consistently differentiate between someone saying they love something and someone saying they hate it.
A Bayesian network is a really general concept. It applies to any multidimensional probability distribution: it's a graph that encodes independence relations between the variables. Ish.
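Concretely: a Bayesian network over variables x_1, ..., x_n encodes the factorization P(x_1, ..., x_n) = Π_i P(x_i | parents(x_i)). Any joint distribution admits such a factorization (take a fully connected DAG, which is just the chain rule); the structure only tells you something when edges are missing, since each missing edge asserts a conditional independence.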
I haven't taken the time to review the paper, but if the claim stands, it means we might have another tool in our toolbox for understanding transformers.
Funny statement to find in a discussion about... research results on the fundamentals.