kccqzy · 2026-06-24T21:07:58 1782335278

This just feels like mostly a complaint of missing features in Traefik.

kccqzy · 2026-06-24T20:53:34 1782334414

> seems like soooo much efficiency waiting to be unlocked at the chip level

Well if you are exclusively using GPUs that are general purpose, of course you leave so much efficiency on the table. That’s why Google started making TPUs more than a decade ago. I remember the kerfuffle when Google fired Timnit Gebru after Gebru’s paper used GPUs to calculate the environmental impact of LLMs while ignoring the efficiency of TPUs; this basically made Jeff Dean very angry because of that wide efficiency gap.

redox99 · 2026-06-25T01:45:57 1782351957

These NVIDIA GPUs aren't general purpose in the way that you think. They can't even run games. NVIDIA Blackwell is probably slightly more efficient than TPUs for training. Do you really expect a $4 trillion company, with the majority of its revenue coming from AI for some years now, not to have built its flagship product fully around AI? The GPU name stuck around, but they are pretty terrible at graphics.

The real efficiency win in these chips is that they are made for inference only. You can throw away the vast majority of a chip if you only need a few ops, a single precision (like INT8 or FP8) and don't need ultra fast interconnects.

jacques_chester · 2026-06-24T21:01:40 1782334900

That ... wasn't the kerfuffle

janalsncm · 2026-06-25T01:00:36 1782349236

She wrote the stochastic parrots paper.

Google’s internal review blocked it from publication. Stated reasons were about paper quality. You can speculate whether that was the real reason.

Gebru issued an ultimatum email and said she would resign if some list of conditions weren’t met.

Google said “thanks, we accept your resignation”.

She claims it is retaliation, but it seems more like an own-goal if you ask me. She basically handed Google the solution to their problem.

Practical lesson: don’t tell your employer you might quit before you’re ok with leaving.

Herring · 2026-06-24T23:55:58 1782345358

It kind of was. I really hate gaslighting, but GP is not inaccurate. Google claimed it did not meet their bar for publication because it ignored recent research on how to reduce the environmental and bias-related risks of LLMs. On the other hand, a large org is unlikely to subsidize high-profile research that makes it look bad. And Gebru was critical of Google’s internal culture and diversity efforts…

qnleigh · 2026-06-25T07:33:07 1782372787

I haven't read any of these papers, but given the environmental impact of LLMs in 2026, it seems like Timnit Gebru has been thoroughly vindicated...

kccqzy · 2026-06-24T20:45:06 1782333906

Well Google has reduced reliance on Broadcom already. They found a new hardware partner, MediaTek, that’s probably much, much cheaper than Broadcom.

https://finance.yahoo.com/sectors/technology/articles/broadc...

mschuster91 · 2026-06-24T22:06:44 1782338804

> Well Google has reduced reliance on Broadcom already. They found a new hardware partner, MediaTek

Oh dear god. I'm actually feeling sorry for Google at that point. Good luck, you'll need it...

kccqzy · 2026-06-24T23:22:58 1782343378

My hunch is that this change is driven by bean counters.

amelius · 2026-06-25T11:44:13 1782387853

Who says Google isn't doing its own designs mostly?

kccqzy · 2026-06-25T12:40:29 1782391229

Oh they definitely are. But as a transitional step, replacing Broadcom with MediaTek is probably mostly about cost.

rasz · 2026-06-25T01:05:07 1782349507

MediaTek was spun out of UMC. UMC was a powerhouse of ASIC design.

kccqzy · 2026-06-23T21:09:21 1782248961

In Sunnyvale the police definitely do this! At one point the sting was a small child crossing the road repeatedly while an officer observed nearby.

kccqzy · 2026-06-23T20:54:36 1782248076

I also suspect that the frequency of outdoor exercise matters even if the total duration of outdoor exercise remains the same. Subjectively, I feel much healthier when doing thirty minutes of outdoor exercise six times a week, than when doing one hour of outdoor exercise three times a week. But then of course, all the causal effects could have been caused by a different factor (say dopamine release) than vitamin D.

kccqzy · 2026-06-23T00:37:08 1782175028

That unfortunately doesn’t match my experience at all. My Claude often runs rg in the repo attempting to find things that need to be changed. And of course Claude still needs to invoke the build tool to ensure the change can be compiled, which necessarily involves reading almost every single file at least for a fresh checkout? Or did you envision the build tool being completely remote?

kccqzy · 2026-06-23T00:26:08 1782174368

If the goal is just to make it work like google3, then hg and jj and sapling can all already achieve this. There’s no need for a new contender here. The differentiation must come from something else.

But of course at Google the file system part (CitC) is a layer beneath the version control system and is shared across different vcs tools.

zdgeier · 2026-06-23T02:12:13 1782180733

I do think hosting is an important part of the VCS story. I agree that hg and jj and sapling are capable of being front ends to a google3-like backend, with a GitHub-like thing to support it (Google has this internally for jj). Of course some people are working on hosting solutions for these, but it feels wrong to me that hosting platforms and their underlying VCS are not made by the same team. IMO people like google3 so much because it’s one integrated system, which is the approach I’m trying with Oak.

kccqzy · 2026-06-25T00:17:09 1782346629

Well even at Google the hosting solutions and the VCS are not made by the same team. I lack imagination in thinking how being made by the same team can improve things, but that’s on me. Good luck!

adastra22 · 2026-06-23T07:14:22 1782198862

Public hg and jj are just a front-end to git. No virtual file system overlay or anything like that. Meta has open sourced many of the components of sapling, but there is no plumbing to put it all together in the same configuration.

kccqzy · 2026-06-22T22:48:56 1782168536

The AMD 395 supports up to 128GB unified RAM. So still not enough even at 1-bit quant unfortunately.

monksy · 2026-06-23T02:54:14 1782183254

96 GB VRAM is the max it supports.

cpburns2009 · 2026-06-23T13:45:35 1782222335

That's the max you can statically allocate in the BIOS. It's best to leave that at the minimum (500 MB I think), and let the drivers dynamically allocate. You can use up to about 120 GB on Linux.

selfhoster11 · 2026-06-23T15:16:17 1782227777

Under Linux it is allegedly 110GB, but I’m not sure.

kccqzy · 2026-06-22T20:23:51 1782159831

I was sad in a different way. I immediately realized that this could be solved by dynamic programming by computing the recurrence F(x,y)=F(x-1,y)+F(x,y-1) with the base case F(0,0)=1 and F(x,y)=0 if x<0 or y<0. The problem is that I immediately jumped to generating functions as a tool to solve this. I defined G(u,v)=\sum_x \sum_y F(x,y) u^x v^y. After maybe ten minutes of manipulation I arrived at the closed form for G(u,v)=1/(1-u-v). At this point I recognized its series expansion and its coefficients are just given by the binomial theorem.

I feel sad because I had forgotten the simple and intuitive construction of choosing “go down” and “go right” directions. When a person learns more advanced mathematics, it is often the case that the person just applies such advanced mathematics by rote without realizing that a solution can be found with more elementary mathematics and more creativity. It reminded me of the time in middle school before derivatives were taught, when my teacher reminded me that using derivatives to solve a problem would receive no credit.
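
For anyone who wants to check the two routes against each other, here is a minimal sketch in Python. It implements the recurrence exactly as stated above (with the base case taking priority at the origin) and compares it with the binomial coefficient C(x+y, x) that both the generating function 1/(1-u-v) and the "choose which steps go right" argument give. The function names `F` and `closed_form` are just labels picked for this sketch.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def F(x: int, y: int) -> int:
    """Number of lattice paths via the recurrence F(x,y) = F(x-1,y) + F(x,y-1)."""
    if x < 0 or y < 0:
        return 0  # stepping off the grid contributes nothing
    if x == 0 and y == 0:
        return 1  # base case: exactly one (empty) path to the origin
    return F(x - 1, y) + F(x, y - 1)


def closed_form(x: int, y: int) -> int:
    """Coefficient of u^x v^y in 1/(1-u-v): choose which x of the x+y steps go right."""
    return comb(x + y, x)


# Compare the recurrence with the closed form on a small grid.
for x in range(8):
    for y in range(8):
        assert F(x, y) == closed_form(x, y)
```

The memoized recurrence does O(x·y) work, while the closed form is a single binomial coefficient, which is the elementary shortcut the comment is lamenting having missed.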

srean · 2026-06-23T09:33:10 1782207190

There is nothing wrong in using generating functions. A very handy and powerful tool. I wish I was better at it than I am.

It is a common experience in mathematical problem solving that the first solution leads to more insight which illuminates a shorter slap-my-forehead solution -- bruised forehead.

kccqzy · 2026-06-22T17:25:14 1782149114

Yeah when I read a model’s chain-of-thought I have a tendency to interrupt it because it’s going in the wrong direction. But usually the end result is still fine.