I don't mean to be 'that guy', but after a quick review, this really feels like low-effort AI slop to me.
There is nothing wrong using AI tools to write code, but nothing here seems to have taken more than a generic 'write me a small LLM in PyTorch' prompt, or any specific human understanding.
The bar for what constitutes an engineering feat on HN seems to have shifted significantly.
I don't really understand the point of this project or how it demystifies anything. Click the browser demo and I get a generic AI chat screen. Is the readme the part that "demystifies" something? I feel like I am living in a bizarro world. Is this all AI? Are all the comments here from bots?
Yes. The Cray supercomputers from the 80s were crazy good matmul machines in particular. The quad-CPU Cray X-MP (1984) could sustain 800 MFLOPS to 1 GFLOPS, and with a 1 GB SSD, had enough compute power and bandwidth to train a 7-10M-parameter language model in about six months, and infer at 18-25 tok/sec.
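A quick back-of-envelope check makes the six-month figure plausible. All the inputs below are assumptions: the ~6 FLOPs per parameter per token rule of thumb for forward+backward passes, and a hypothetical ~300M-token training set.

```python
# Rough training-time estimate for a small LM on a Cray X-MP.
params = 7e6                 # model size (low end of the 7-10M range)
tokens = 3e8                 # assumed training-set size, ~300M tokens
flops_per_param_token = 6    # ~6 FLOPs per parameter per token (fwd + bwd)
sustained_flops = 8e8        # 800 MFLOPS sustained

total_flops = params * tokens * flops_per_param_token
days = total_flops / sustained_flops / 86400
print(f"{days:.0f} days")    # on the order of six months
```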
A mid-90s Cray T3E could have handled GPT-2 124M, 24 years before OpenAI.
I also had a punch-card computer from 1965 learn XOR with backpropagation.
The hardware was never the bottleneck, the ideas were.
Post-quantum crypto is a good example of this. Lattice-based schemes were theorized in the 90s, but they took decades to actually reach production. The math existed, the hardware existed, and the ideas for making it work were just not there yet.
I am a bit surprised, but I guess everything eventually wears out.
In the 1980s I worked as a field engineer who supported a lot of PDP-11s. They were very reliable for the time; tape drives and disks were the #1 maintenance items. Actually having to open up the processor and change a board was not a regular activity.
Other machines of that era, like those from Gould, Perkin-Elmer, or DG, gave regular practice in the art of repairing processors.
Guess I expect them to work forever. Like a Toyota.
I encounter two main failure modes. First, the bipolar PROMs degrade at the atomic level: the metal ions in the fuses tend to migrate or 'regrow' over decades, causing bit rot.
Second, the backplanes suffer from mechanical fatigue. After forty years of thermal expansion and structural flexing, especially when inserting boards, the traces and solder joints develop stress cracks. Both are a pain to repair.
XENIX's second target processor was an 11/34 with a programmer's workbench. That nightmare took 3-4 years... Microsoft years, while they used the PDP-11/70 for development.
Thanks for reposting! I'm the author of ATTN-11. Happy to answer any questions about the fixed-point arithmetic, the PDP-11 hardware, or the training process.
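For anyone curious about what "fixed-point arithmetic" means in practice here: I don't know the exact format ATTN-11 uses, but a common scheme on 16-bit machines is a Q-format representation, where a chosen number of low bits hold the fraction. A minimal sketch, assuming a hypothetical Q3.12 layout:

```python
# Illustrative only -- ATTN-11's actual fixed-point format may differ.
FRAC_BITS = 12               # hypothetical Q3.12: 12 fractional bits

def to_fixed(x: float) -> int:
    """Encode a float as a scaled integer."""
    return int(round(x * (1 << FRAC_BITS)))

def fx_mul(a: int, b: int) -> int:
    # 16x16 -> 32-bit product, then arithmetic right shift back to scale.
    return (a * b) >> FRAC_BITS

x = to_fixed(0.5)
y = to_fixed(-1.25)
print(fx_mul(x, y) / (1 << FRAC_BITS))   # -0.625
```

The appeal on a PDP-11 is that every operation is plain integer multiply/shift, which the EIS instructions handle natively, with no floating-point hardware required.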
Incredible work! Fitting a transformer into 32 KB of RAM is crazy.
For those reading about this project who don't know the PDP-11, it can be hard to appreciate how difficult working within these memory limits is.
Here is a visual guide to the PDP-11 architecture: https://vectree.io/c/pdp-11-hardware-architecture
That PDP-11 was the most fun minicomputer of the late 1970s in my opinion. Growing up in NH about an hour north of Digital's HQ all sorts of schools from primary to secondary as well as museums had PDP-8, PDP-10, PDP-11 and later VAX machines.
The PDP-11 had a timesharing OS called RSTS/E which could give maybe 10 people a BASIC programming experience a little better than an Apple ][. If you were messing with 8-bit microcomputers in 1981, you might think a 16-bit future would look like the PDP-11, but the 1970 design was long in the tooth by 1980: like the 8-bit micros, it was limited to a 64 KB logical address space. Virtual memory let it offer 64 KB environments to more users, but didn't let a single user have a bigger environment.
Fun stuff! At one point I wondered about building something similar. But I lack the AI chops, and have too many other projects going on anyway.
I'm curious as to the type of memory in the 11/34. I also have a working PDP-11, an 11/05 with 32KW of actual core. I wonder what performance would be like with EIS emulation grafted in. Stunningly slow, I imagine.
I also have a working design for a small Transformer on the original Game Boy. It has around 4000 parameters fitting in the 8 KB cartridge SRAM, where the "saved game" is the trained model. A TI-82 with its 32 KB of RAM would be even more comfortable.
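The Game Boy numbers line up neatly if you assume 16-bit weights, which is one plausible reading of the "around 4000 parameters in 8 KB" budget (the actual design may allocate things differently):

```python
# Hedged sketch: one way ~4000 parameters could fill 8 KB of cartridge SRAM.
sram_bytes = 8 * 1024
bytes_per_weight = 2             # assuming 16-bit fixed-point weights
max_params = sram_bytes // bytes_per_weight
print(max_params)                # 4096 -- consistent with "around 4000"
```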
Around the same time (1984), there was also another very cool piece of technology that often gets overlooked: the CMU WARP. It wasn't as flashy as the Crays and the Connection Machine, but it was the first systolic array accelerator (what we'd now call TPUs). It packed as many MFLOPS as a Cray-1.
It's also the computer that powered the Chevrolet Navlab self-driving car in 1986.
I've been building a functional language for differentiable programming that compiles to JAX. The core idea is homoiconicity applied to ML, models are data structures that can inspect and transform themselves.
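To make "models are data structures that can inspect and transform themselves" concrete, here's a toy sketch in plain Python. None of these names come from the actual language; they're made up to illustrate the idea that when a model is ordinary data, a generic traversal can rewrite its own architecture:

```python
# Hypothetical sketch of models-as-data; all names are invented.
def dense(n_in, n_out):
    return {"op": "dense", "w": [[0.0] * n_in for _ in range(n_out)]}

model = {"op": "chain", "layers": [dense(4, 8), dense(8, 2)]}

def map_layers(f, node):
    # Because the model is plain data, any code can walk and rewrite it.
    if node["op"] == "chain":
        return {"op": "chain",
                "layers": [map_layers(f, l) for l in node["layers"]]}
    return f(node)

def widen(layer):
    # Example transform: add one input column to every dense layer.
    if layer["op"] == "dense":
        return {**layer, "w": [row + [0.0] for row in layer["w"]]}
    return layer

wider = map_layers(widen, model)
print(len(wider["layers"][0]["w"][0]))   # 5: each dense row gained a column
```

This is essentially what JAX's pytree machinery does for parameter trees; compiling a homoiconic surface language down to JAX seems like a natural fit for that model.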
For those interested, this guy is revamping the Emacs widget library with something more modern and platform agnostic, based on SDL: https://appetrosyan.github.io/posts/
Interesting, thanks for sharing! I've had thoughts about making vui.el backend-agnostic so it could target different widget implementations (like xwidgets or even native GUI). An SDL-based widget library could potentially be one of those backends. I'd need to dig into appetrosyan's work before I can say anything intelligent about it, though. And of course, it's just an idea; I'm unlikely to dive deep without a practical need (time is limited, sadly).
My only complaint regarding the Zed editor is the inability to display two panes of the sidebar one below the other. Not only is it impossible to display them together, but switching between them requires clicking a tiny button in the status bar. To make matters worse, performing a search hides the symbols and the tree view.