They’re possibly great at generating alpha in highly complex systems that compose LLMs with tabular machine learning and other analytical techniques at a large scale. So yea, certainly not for these users.
You could maybe somehow relate it to supply and demand dynamics, but the idea that higher taxes can start reducing receipts after a certain point is widely credited to Laffer (although apparently there are signals of others saying similar things throughout the history of economics). The Laffer model displays an inverted U curve.
Classical supply and demand models do not show that increasing the price of x causes quantities to first go up then down after a certain inflection point…
You could probably make the relationship through utility or a proxy model (you can model anything with enough imagination in economics), but to call it “the result of” as if it is immediately related is not accurate. It is apparent at first sight when you compare the curves…
If I raise the price of my product, fewer people will buy it but I'll make more profit per-unit -- so the amount of money I make is an inverted U with price on the x axis and money on the y axis, and I should set the price at the inflection point.
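A minimal sketch of that inverted U, assuming a hypothetical linear demand curve (the parameters `a` and `b` are made up for illustration and do not come from the thread). It ignores costs, so it tracks revenue rather than profit, and the price it picks is the top of the curve:

```python
def demand(price, a=100.0, b=2.0):
    """Hypothetical linear demand: units sold fall as the price rises."""
    return max(a - b * price, 0.0)


def revenue(price, a=100.0, b=2.0):
    """Money taken in at a given price (costs are ignored in this sketch)."""
    return price * demand(price, a, b)


def revenue_maximizing_price(a=100.0, b=2.0):
    """Peak of p * (a - b*p): the derivative a - 2*b*p is zero at p = a / (2*b)."""
    return a / (2.0 * b)


if __name__ == "__main__":
    # Revenue rises, peaks, then falls back to zero as the price climbs.
    for p in (0, 10, 25, 40, 50):
        print(f"price={p:>3}  revenue={revenue(p):8.1f}")
    print("best price:", revenue_maximizing_price())
```

With a different demand shape the peak moves, but the rise-then-fall pattern holds whenever quantity sold eventually drops to zero.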
France's government expenditure is 57% of its GDP according to the OECD. It's probably an all-time record in the annals of global economic history in peacetime. It's more than the late Soviet Union spent around 1985-1990 according to the IMF / World Bank.
The 2025 deficit ran at around 150 billion euros. The Zucman wealth tax would raise 25 billion in the most optimistic projected scenario (so, one sixth of the deficit at most). This is the very best case as projected by its proponents - there's a decent chance receipts would be significantly lower than that.
You're absolutely right that right wing parties did not, and almost certainly will not, solve any of these issues. Neither will a wealth tax. In my opinion, this is only solved in a bang.
I agree a lot of these things end in a bang at some point.
Western rich world boomers have gotten a great deal - unusually stable & high growth economies, relative housing abundance, and peace during their peak working life during which all sorts of retirement related benefits were introduced.
Unfortunately many of those benefits only work in a higher population growth, higher economic growth, lower retired lifespans world that no longer exists.
I find this analysis confusing. PMF for coding was likely reached some time last year. Profitability, which is different, we don’t know. The article kind of confuses both without making a strong economic case or using numbers in a compelling way. I don’t understand what the Uber case has to do with this either. The Uber COO clearly said that at least in terms of ROI he’s not seeing the results either.
My take is the product has been very useful for coding (PMF) for months. But it’s certainly not useful at any cost…
What I also find confusing though is that folks seem to ignore trajectory which is maybe the biggest lede to bury. As Simon says, we have had "good enough" coding agents for 6 months, that is a blink of an eye, and at my company my job has now completely changed. It's almost like a dream.
And that's just one inflection point. We've had several and there are many more on the horizon. So while I could be convinced that ROI is maybe not even positive today despite the ridiculous enterprise spend, it's perfectly rational to pave the way today for what's coming over the next few months let alone years down the line.
There may be additional major leaps forward, and there may not. I kind of struggle to imagine what the next step actually is. Certainly there will be improvements in performance (speed) and cost. But at a point you reach a barrier where the limiting factor is the specificity of the human prompt and our ability to manage all the code we’re generating.
Somewhat oversimplifying; writing software and building apps was a bottleneck - now it is not. What is the next bottleneck that LLMs can solve? Is there one? And is there enough publicly available data to solve it repeatably at scale? Or did we just automate stack overflow searches and now we’re stuck again?
Or is the endgame of this innovation cycle the complete removal of interaction with machines through code? Will we simply interact with machine coworkers purely through natural language? Can an LLM make PowerPoint slides and run a meeting? So far not seeing much progress on that.
Judging from the fact that the Opus 4.5 inflection point was not really anticipated, and we still don’t really know what threshold was crossed that suddenly made agentic coding accessible to so many more people, I think it’s safe to say we don’t know what the thresholds will be until they’re crossed. The fact that we don’t know exactly what they’ll be isn’t a good reason to think there won’t be any more.
I think we have quite good reason to expect more. As I said, we already know (caveat with your level of irrational skepticism toward the overwhelming evidence) that the best existing models are better than the ones publicly available.
For what it's worth, at PyCon US this year I ran into a few people with access to Claude Mythos and they confirmed that it's notably better at writing code than public Claude Opus 4.7.
> caveat with your level of irrational skepticism toward the overwhelming evidence
If you can talk about my irrational skepticism (because I said that "we don't know the future", I suppose?), can I talk about your total lack of common sense?
Because the economy has been growing in the last decades does not mean that it will keep growing for the next decades. Because LLMs have been improving in the last few years does not mean that they will keep improving in the next few years. Maybe, maybe not, your guess is as good as mine. If you know the future, put your money where your mouth is and invest everything you own in LLM companies.
Your overwhelming evidence is about the past: it has been improving in the past.
I must have thought I wrote something that I didn't actually write in the previous comment — I can't figure out what "as I said" is supposed to be about.
In any case, maybe I was too subtle. I was talking about Mythos, a model that continues the trend, but which is not available to the public yet. The "overwhelming evidence" is the testimony of the people who have used it. The irrational skepticism was people who don't believe that testimony. In other words, we do know the future, because we know that model and others like it will come out soon.
Mythos is already here, you cannot use it for predictions just because you don't have access to it...
I just have an issue with all the people saying "I predicted this 10 years ago" (implying something like "you should listen to me, I make good predictions") while conveniently forgetting all the things they predicted wrongly, or the survivorship bias.
We don't know that AIs will continue improving at the pace they have, because we don't know the future. Some people will guess right, some won't. And those who guess right will be tempted to believe that they guessed right because they are more clever. All we can say is that it is possible that it improves, and it is possible that it stops improving.
I am currently eating lunch. Meanwhile Claude is triaging and writing reproducers for 70+ tickets nobody has had time to look at. Next it will attempt to fix them. I have not read the tickets. I will not look at the code until there are review ready PRs and a code review bot has done the first pass.
In other words, most of the prompting will also go away.
Feels like everyone should be on one hand. On the other hand it also feels like a massive recalibration of what companies can/should do. They spend massive amounts of money on AWS, Datadog, GitHub, CircleCI, et al. If it becomes easier to host/roll your own it's a big increase in the demand for engineers.
Ultimately software is everything these days and the economics make the demand insatiable. We've gone through many cycles of "X" but on computers/web/mobile. There's going to be a massive amount of "X" but with AI companies that will need engineers.
Or at least this is what I tell myself to sleep at night.
If I don't stay ahead of the curve, yes. But I can't stop that development. What I can do is leverage the technology enough to be more valuable than those who don't. By e.g. knowing how to set up processes like the above.
Ultimately, we'll need UBI or large scale cuts in working hours or similar if AI progresses to the point of mass unemployment - the alternative would be massive social unrest. In the meantime I expect to keep doing better than average.
Pmf is this weirdly defined thing where "if you're not sure you have it then you don't".
I think it was clearly useful for months to people who had tried it and taken the time to understand it, but now that knowledge has spread to the point where wallet holders are convinced it's not just a passing fad or hype, so now pmf can be "claimed".
I agree it's weird to say "those people have pmf" though, usually it's something you define for yourself
> Pmf is this weirdly defined thing where "if you're not sure you have it then you don't".
I'm not sure if this runs counter to your point or not, but: I don't see any future where LLMs aren't a core part of Software Engineering. The horse is out of the barn. There is no going back.
Yeah but the product is not “LLM” it’s “proprietary frontier model LLM paid by the token”.
And I don’t even necessarily disagree with OP! It’s more like the competition is shifting so quickly that your competitors could undercut your PMF in a blink of an eye.
History bears out that cheap and satisficing soundly beats expensive and optimal every time. Until we have smarter and more prescient decision makers in leadership, the bottleneck on output will be the quality of decision making not the quality of code. Trying more things faster and cheaper will win.
> clearly useful for people who took the time to understand it
people -> programmers, I haven’t met a non-developer who reports getting more time out of current AI platforms than they put in. If anything I’ve anecdotally heard the opposite, introducing AI at work creates so much slop (output) it takes more time to process it all without a tangible bump in overall productivity
That's why most here shouldn’t engage in the discussion - they parrot on about benefits without identifying and articulating the costs and, moreover, how it affects the firm's financial position.
The article also treats the word "good" as load-bearing in a way that should have you questioning their analysis:
"I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done."
MongoDB was once backed up by adoption across the industry. Or for a more recent example, blockchain took off like wildfire across the industry before ultimately fizzling out in all but the most niche applications.
Not saying this trend will do the same, just that the industry adopting something doesn't guarantee its success.
I don’t think those are really comparable. The blockchain was trendy hype, relatively few companies actually adopted it. Where did Netflix use the blockchain? Google?
By comparison almost all tech companies I know have leaned heavily into AI.
It’s not supposed to be logical, it’s an LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry. Read any/all of the other posts and you won’t find much skepticism but you will find a lot of shilling how great it all is.
I like his other posts. He's bullish on AI, which is fine. I'd like to read a mix of bearish and bullish level-headed takes from people who are subject matter experts. His technical credentials are well past discussion - I love Django, and he comes across as a pretty upbeat but level-headed guy. Certainly beats radical takes in either direction from people who have no clue what they're talking about. It's just this article that I find rather confusing.
That's how I feel about most of your writing. I click through most times when I see you either on the front page or in the comments, and I generally walk away feeling like I have food for thought, without necessarily buying everything wholesale. It's part of why I keep coming back.
My root comment simply represented my two cents about the current post. I don't think anything about the post is outrageously incorrect or anything, just somewhat confusing. You're a very prolific contributor in this community and I don't think me or anyone else that welcomes your takes expects everything you write to rock our collective socks every single time, anyway.
If you want an "LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry" there are plenty out there. I'm not one of them.
People are confusing "excitement" with "evangelism". Your blog is definitely on the pro-AI side of things, but as you say, it's not one-sided or uncritical.
All of these are about AI misuse, not skepticism of AI. By skepticism I mean doubting whether AI actually delivers on its promises which, based on this last post, sounds like something you think we're already past.
Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products.
It's hard for me to write about skepticism that coding agents deliver on their promises when I've been using them daily and know, for an absolute fact, that they boost my own productivity.
(And that's after taking into account the METR paper that says engineers over-estimate their productivity with these tools.)
I have plenty of doubts about AI delivering on its promises outside of coding. I don't write about AGI because I think it's science-fiction hysteria. I write about slop precisely because it represents a mis-use of AI that demonstrates people completely misunderstanding what it's useful for.
Love when people say "its promises". What specifically are you disappointed with? Simon's posts are high quality and evidence driven. AI has already delivered an incredible amount. Read Epoch for industry trends and analyses, and METR too; everything points to a pretty consistent picture.
"Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products."
Oh yes, tons and tons, especially on HN. But the plural of anecdote is not data. Enterprise spend speaks for itself. You are using AI-coded functional products all the time. Do you want like a diff history for the Google codebase or something?
Tbf the OPs blog and comments (including their sibling to your comment) are also heavily anecdotal.
> I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done.
Claiming a grand inflection point based on your own personal usage is very anecdotal.
If that were it I would absolutely agree with you. But this experience maps exactly to adoption trends. My job in the last 6 months has become so unrecognizable to me it’s insane, the adoption at the very least at large companies is truly truly incredible, and it really does coincide with the quality of Opus 4.5 (which has now been surpassed).
"Adoption trends" are just herd behavior which may or may not be driven by compelling anecdotes and may or may not be evidence of something more. I'm just saying it seems wrong to dismiss the post the way you did when the OP in question and your own post here are just more anecdotes.
No, if that were really true you wouldn’t see what you’re seeing today. You wouldn’t see entire companies completely retooled and refactored around these tools. You would see the mistake of “this is actually just herd behavior”, which involves such a colossal amount of impact to these companies' entire stack and bottom line, resulting in systemic collapse. You don’t see that. Company leadership are not some idiot class of people, I don’t know why this is people’s prior. If companies get adoption wrong in either direction they are completely screwed. So you’re seeing people putting money where their mouth is, across the board.
Compelling anecdotes are not even the main source of evidence. Look at the enormous body of work on measurement of these systems. I always point people to the Epoch capability index as a good summary statistic of capabilities, or METR's time horizon data, which has now been topped out. They had a recent update to the dataset, after which the corrected plots pointed to an even faster acceleration than before.
> You wouldn’t see entire companies completely retooled and refactored around these tools.
That's exactly what I'd expect people who are driven by hype and FOMO and YOLO and anecdotal evidence to do.
> resulting in systemic collapse.
Many people are noting the system is collapsing. Maybe it's not going as quickly as you expect, but there's definitely evidence of this from increased service outage frequency, billion dollar notes being passed in a circle between companies, open projects refusing AI contributions entirely because they're overwhelmed by crap, Sam Altman begging governments to force citizens to buy their product through "universal basic compute", etc.
> Look at the enormous body of work on measurement of these systems.
It's certainly possible to measure anything. Benchmarks are a form of evidence but they famously a) don't represent reality and b) can be easily gamed.
> That's exactly what I'd expect people who are driven by hype and FOMO and YOLO and anecdotal evidence to do.
Not at this scale.
> Many people are noting the system is collapsing.
On HN? Any piece of evidence to support this? Service outage frequency is not a sign of systemic collapse. Billion dollar notes passed in a circle is brought up a lot and misunderstands how finance works. "open projects refusing AI contributions entirely because they're overwhelmed by crap" is not a systemic collapse, it's not being able to adapt to a new world with new challenges. Btw "slop" is getting less and less sloppy.
> Sam Altman begging governments to force citizens to buy their product through "universal basic compute", etc.
Very interested in some citation detail that sounds like a headline quote of something more complex.
> It's certainly possible to measure anything. Benchmarks are a form of evidence but they famously a) don't represent reality and b) can be easily gamed.
I work on benchmarks for a living, and I can tell you both of these things are true, but only partially; in aggregate they all tell a consistent story. Not to mention, static OSS benchmarks are not what these companies rely on. They have live traffic, the ability to run A/B tests, and full conversation traces. To ignore this is pretty incredible.
My point was claiming a broad inflection point based on your own personal usage is not "evidence driven", it's anecdote-driven. It's hard to disprove any claim you made because you didn't really make one that's disprovable, and your opinion on it now is still just an opinion.
Those have been around for a long long time, you may be focusing on anecdotes but the adoption numbers and performance trends speak for themselves and we’ve had performance trends for years. People can argue about whether or not enterprise level adoption has a clear ROI today but the fact that we’re at the point where entire large scale companies are already completely refactored, directly after opus 4.5, if that’s not a convincing enough signal I don’t know what is.
I think we're in agreement then; the point I was responding to was saying your blog was evidence-driven, and we can both agree it's not -- at least to the standard that would pass peer-review.
What they told me and I can read online is that they don’t because they can’t operate on the Austin highways. Have you read anything that’s more detailed?
For all his weirdness and moral failings, I don’t see Altman saying things like whites being under apartheid in the US. And worse. Multiple times a day. Every day.
Outcomes matter, when both people are leading you to undemocratic societies why should I care about their messaging when both result in worse material lives for 100s of millions of people?
Psychopaths wear a mask of sanity but underneath they have no moral framework. Some people can hate black people or Muslims or anything like that but these qualities are very human. Things like racism have existed across cultures for more than several millennia among many people. A psychopath is a different ballgame.
A psychopath can plunge a knife into a baby's face and feel no emotion. The only reason why the psychopath doesn’t plunge a knife into a baby's face is because he has nothing to gain from it. The psychopath appears more sane than a racist because the psychopath is usually better at pretending to be sane simply because he is unable to comprehend the passionate yet racial hatred that the racist feels.
What I am saying is Musk does not fit that archetype as much. Altman fits that archetype more. Neither actually crosses the threshold to be called a “psychopath” but if psychopathy was a gradient Altman is further down the line than Musk. Much further.
That is the literal definition of what a psychopath is. You’re operating from a lack of understanding of the definition.
You’re trying to tell me, in an infinitely smug, condescending, and unnecessarily long-winded manner that you know as a clinical fact that Sam Altman is a psychopath?
Dario Amodei said in the most recent interview with Dwarkesh that Anthropic currently achieves an increase of around 20-30% in coding productivity, which tracks with my experience. What do you do to reap orders of magnitude more?
Also, how much more money do you make? Or are you working less?
I can clearly see, and feel, Dario's associates' "increased productivity" in their Claude Code/Chat/Cowork desktop product...
So many updates, pretty much daily, so many tweaks to the interface. Sometimes the tweaks are a bit dumb, sometimes completely trivial, and other times they just undo what they did previously.
My favorite examples of their newfound velocity:
1. When there is a nice feature that was easily discovered, and then it was gone...
2. When the "customize" section moves around to random places in settings, or entirely out of it
Special mention to their scatterbrained keyboard shortcuts strategy
> What do you do to reap orders of magnitude more?
I don't know what Dario Amodei says, does, how Anthropic is run or structured, or what kind of people work there, so I can't comment on that.
I do know about myself, though. The increase is very real, measured by the number of (Linear) issues resolved. No, I haven't changed how I open or close those issues, I've been using the system for years. During the first three months of 2026 I went through 12x more issues per month than during past years.
But I guess I am not an "average programmer": 35 years of experience means I can work with AI as I would with a small (but very skilled) team. I can architect systems, notice unnecessary complexity, and intuitively choose solutions that are more maintainable. And I am a single-person company, with no managers to report to, KPIs to achieve, presentations to make, etc.
I do not make "more money". It's a mature SaaS. Changes in revenue are over the long term, on a scale of years more than months, and implementing features is no longer enough: marketing is needed for more growth.
But, to be honest, I am tired of defending myself this way. It's not the first time I post this metric, I thought people would find it an interesting data point. Instead, I get downvoted (see my comment above which currently sits at -1 in spite of being objective and factual), and then get plenty of responses asking me to defend my statements.
Come to think of it, I'd rather not convince people about AI increasing productivity so much. I'm not really sure why I bother to post here anymore. I'd rather have everyone (including my competition) believe whatever they want to believe and not use AI.
I've been selling and managing my own projects for over six years, so I don't count myself in the antisocial camp. But LLMs haven't changed the fact that I like to have at least five hours of deep work every day.
I hope you appreciate the irony of saying that in a thread where we are discussing that OpenAI's main competitor is engaging in blatantly anti-consumer behavior.
There is no irony: both of them are bad (for different reasons, but bad) and this is not a matter of choosing the "lesser evil". Both of them should be treated as toxic and rejected as strongly as possible.
You're at least describing someone who sounds hard-working... what's the problem?
I'd be more concerned if I was someone who signed up to play ping pong two hours a day and do a bi-weekly commit.
There was a time not so long ago where I was watching "a day in the life of a software engineer" videos on YouTube and I was wondering if some of these were parodies. I still remember one in particular which I'm pretty sure was a parody, but it was only marginally distinguishable from the others.
I do believe in hardship. As sacrifice. It yields long term benefits for oneself, and for society.
But submission into slavery for immediate gain accomplishes little, and costs society a lot more (physical and mental health issues are a huge burden).
Those parodies you saw were caricatures of elite engineers who sacrificed decades of their lives to become so competent. They can work from home, eat pasta while glancing over a PR, and just hit approve.
That you resent the luxury doesn't make it undeserved privilege.
I've met programmers who severely outclassed me. It was extremely uncomfortable and it took me months to accept that reality and reshape my hurt ego into curiosity and desire to learn from someone clearly superior in the craft.
That being said, most people in the privileged positions you described are there by sheer luck and connections. In the very very best-case scenario that offends them the least: they stumbled upon an opportune position and were smart enough to make full use of it... in the first 6 months (when people pay the most attention and lasting impressions are formed). And then rode the reputation they made for years. Their value as engineers on the team after the initial honest burst of productivity becomes... very unclear from that point on, shall we say.
Again, I've met engineers who fully deserved their privileges. 2-3 times over 24 years of career though (a good chunk of it as a contractor so I've been around). My anecdotal evidence obviously means nothing but we all develop pattern-matching skills with time, making me think what I saw is generally the statistical curve that would apply almost everywhere. Maybe.