It has been that way for a while now. I see Veritasium video titles and thumbnails change quite often; it can be quite annoying, as it sometimes gives the appearance of a whole new video.
A/B testing a title feels wrong to me; it's almost as bad as A/B testing a UUID.
Just pick a title and stick to it unless you need to fix a factual error.
Right, but then there's this thing called "shared reality" and once you break it, all kinds of bad consequences happen.
This is even worse, as it also breaks temporal continuity for individual reality. E.g. I expect that if I saw a video titled X today, I'll be able to find it under title X tomorrow, and if I can't, it's one of the rare/marginal cases when it got banned/deleted/retitled, or I just misremembered. Titles becoming unstable in the general case is a bad situation.
I watched it a few days ago and this descriptive title was part of the reason I clicked. I generally trust 3B1B anyway but normally a title like "This picture broke my brain" would put me off.
In case you're curious, when I ran that title/thumbnail AB test, the option "This picture broke my brain" did end up winning. I was a bit disappointed, because I didn't really _want_ it to win, but I did include it out of curiosity. Ultimately, I changed it to the other title, mostly because I like it better, and the margin was small.
I was genuinely torn about how to title this, because one of my aims is that it stands to be enjoyed by people outside the usual online-math-viewing circles, especially the first 12 minutes, and leaning into the idea of a complex log risks alienating some of those.
That level of granularity would be interesting. For what it's worth, the metric they go by is not click-through rate; it's expected total watch time. For example, if you have two thumbnails, A and B, and for every 100 impressions of A, there are 51 total minutes of watch time, and for every 100 impressions of B, there are 49 total, then what you'd see in the dashboard is "51% A, 49% B". More total clicks with less engagement will not necessarily win out.
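A minimal sketch of the arithmetic being described, in Python (the per-impression normalization is my guess at how the dashboard number falls out of the stated example, not YouTube's actual computation):

```python
# Hypothetical sketch of a watch-time-based A/B metric: each variant has
# impressions and total minutes watched; the dashboard shows each variant's
# share of the combined per-impression watch rate.
variants = {
    "A": {"impressions": 1000, "watch_minutes": 510},  # 51 min per 100 impressions
    "B": {"impressions": 1000, "watch_minutes": 490},  # 49 min per 100 impressions
}

# Normalize to watch time per impression.
rates = {k: v["watch_minutes"] / v["impressions"] for k, v in variants.items()}

# Express each variant's share of the combined rate as a percentage.
total = sum(rates.values())
shares = {k: round(100 * r / total) for k, r in rates.items()}
print(shares)  # {'A': 51, 'B': 49}
```

Note that under this metric a variant with more clicks but shorter average sessions can still lose, which matches the "more total clicks with less engagement will not necessarily win out" point.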
I generally agree that it's a pretty wild choice to just let creators put up multiple titles. That said, it's hard not to play with the shiny toy when it's sitting right there, especially if you know it may mean the lesson reaches more people. In this case, I genuinely don't know what the "right" title is, even setting engagement aside. Is it fundamentally about analyzing an Escher piece? Is it fundamentally a lesson on complex analysis, and complex logs in particular? It's both, but you don't always want to cram two stories into one title. This becomes all the more challenging when titles are, inescapably, marketing.
perhaps a bit inappropriate of me to say so here as it is off-topic, but i am going to take the opportunity anyways:
big thanks for all of your work making math both enjoyable and accessible. my kids (and i) love your videos. your positive impact extends far and wide.
As annoying as those titles are, the work that you (and a few others, like Veritasium) do makes it well worth the tradeoff. Just keep reminding everyone that the annoying title gets the video into the brains of thousands of other people who aren't subscribed yet. It's a tiny price to pay for astounding value.
Everyone who watches your videos loves them and wants everyone else to watch them.
This is a really fun project and the family interview transcripts + LLM workflow feels like a genuinely good use of the technology.
I would probably have ended well before "I exported my Google Maps location history, Uber trips, bank transactions, and Shazam history."
Aside: I've started seeing a lot of AI projects in this category say some variation of:
> it runs on your machine, your data stays with you, and any model can read it
I don’t think people fully appreciate the tension in those claims, especially when the model most are reaching for is Claude or GPT or Gemini. I think these things need more precise language about where data actually goes and what tradeoffs users are implicitly accepting.
48 GB is not consumer hardware. But fundamentally, there are economies of scale due to batching, power distribution, better utilization, etc., that mean data-center tokens will be cheaper. Also, as the cost of training (frontier) models increases, it's not clear the Chinese companies will continue open-sourcing them. Notice, for example, that Qwen-Max is not open source.
Nothing obviously prevents using this approach, e.g. for 3B-active or 10B-active models, which do run on consumer hardware. I'd love to see how the 3B performs with this on the MacBook Neo, for example. More relevantly, data-center scale tokens are only cheaper for the specific type of tokens data centers sell. If you're willing to wait long enough for your inferences (and your overall volume is low enough that you can afford this) you can use approaches like OP's (offloading read-only data to storage) to handle inference on low-performing, slow "edge" devices.
It is consumer hardware in the sense that MacBook Pros come with this RAM size as a base configuration and that you can buy them as a consumer, without having to sign a special B2B contract, show that your company is big and reputable enough, and order a minimum of 10 or 100.
Technically that's correct (which as we all know is the best kind of correct), but really, how many consumers are buying a high-end MacBook Pro with 48GB or more of RAM? That's a very small percentage of the population. In these kinds of discussions, "consumer" is being used as a proxy for "something your average home laptop buyer might have". And a 48GB MBP is not that.
I know it's annoying, because a 48GB MBP is indeed technically "consumer hardware", but please understand the context and don't be pedantic. You know what the GP meant. (And if not, that's... kinda on you.)
Assuming 'moat' – they'll push the frontier forward; they don't really have to worry until progress levels off.
At that point, I suppose there's still paid harnesses (people have always paid for IDEs despite FOSS options) partly for mindshare, and they could use expertise & compute capacity to provide application-specific training for enterprises that need it.
It can, sure. However, I will not pay to be lectured on topics I have no interest in being lectured on. I'll keep my money, they can keep the sermon. Let's see who has more to gain from listening to the other. If they want my money, what I want to hear/see matters a whole lot more than what they want to preach to me.
They simply forgot the golden rule: he who has the gold makes the rules. Let them rediscover it.
Some of it is reasonable, and then some is obviously just what rich people want you to think. Like how America paid Hollywood a lot to always show the US military as macho and always on the right side of wars.
When I was studying Computer Science in college, I once remarked how lucky we, English speakers, are that programming languages use English nouns and verbs. A ton of my classmates were here on a student visa, and English was not their first language. I always thought that programming in English put me at an advantage on the learning curve. I also always thought it was silly when someone would quip that programming should count for “foreign language” credit. Anyway, always cool to see non-English programming languages.
At the risk of going against the hivemind, I disagree.
I taught myself programming quite early in my life, way before I had a good command of the English language. I read books in my native language and talked on programming forums in my native language. In the end, the "English" in programming languages is just a handful of keywords, and it didn't hinder me one bit that I had no idea "int" stood for "integer".
Of course, I started by writing code like "bool es_primo(int numero)" (in my language), but there's nothing in C that says identifiers must be English; that's just convention. The standard library and packages would be a problem nowadays, but back then standard libraries were thin, and a name like "strcpy" is obscure anyway. The real hard part was always learning how to program and design properly.
And for more advanced topics, documentation and learning materials available only in English are a HUGE problem for ESL speakers, because one has to actually read and understand them. But this is not something a programming language can help with.
That's coming from a Spanish speaker used to the Latin alphabet, QWERTY, etc. I imagine you'd find it much more difficult if C were written in Chinese or Arabic, for instance.
I have a similar experience. I learned English much later than my first programming languages, and picking up some keywords and basic APIs was never an issue (it was BASIC and C/C++ at the time). Maybe I would occasionally look up in a dictionary what "needle" and "haystack" meant in a code snippet, and I was puzzled by the ubiquitous "foo, bar, baz", which to my relief turned out to be equally cryptic to native speakers. I still don't think of code as a kind of English prose; it occupies a separate part of my brain from the natural languages.
For people that use similar keyboards, I don't imagine it's that different. Though, like you said, occasionally knowing that bool means Boolean or int means integer may make it slightly easier for English speakers. I think a big disadvantage would likely be for people from, say, China, who use incredibly different keyboards. If I had to add a wildly different second language and switch to it every time I wanted to create a var, import something, or write an if statement, I'm not sure I would've continued learning to code. It may have been one step too many.
True. English is a major reason why India is the IT back-office for most of the western world. I too have personally observed how my fellow classmates, who had done their schooling in their regional language, struggled with the coursework in college because it was solely in English. And some of them were state rankers - it felt bad to realise that they had to put in twice the effort needed to keep up their grades. I think there's a lot of potential wasted in India because of this kind of hardship / struggle - a lot of intelligent people are held back just because they lack an aptitude for multilingualism.
Naah, my non-english-speaking friends say that the keywords are less than 1% complexity of a programmer's job, so it really doesn't matter.
Also, in most languages you can already name variables/classes/members with any Unicode letters. So only "if/for/while" keywords and stdlib classes remain English. It makes little sense to translate those.
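For instance, Python has allowed Unicode identifiers since Python 3 (PEP 3131), so only the keywords and stdlib names stay English. A quick sketch, reusing the "es_primo" example from upthread:

```python
# Python 3 accepts Unicode identifiers (PEP 3131); only keywords like
# "def", "if", "return" and stdlib names remain English.
def es_primo(número: int) -> bool:
    """Return True if the (Spanish-named) argument is prime."""
    if número < 2:
        return False
    for divisor in range(2, int(número ** 0.5) + 1):
        if número % divisor == 0:
            return False
    return True

print([n for n in range(20) if es_primo(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```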
However, in the vast majority of cases, non-ASCII characters are rarely used for variable or function names during programming. This is because they can cause conflicts when using different encoding systems, and some automation tools fail to recognize them. Consequently, programmers in non-English speaking regions must invest more effort into naming variables than English speakers, as they have to translate all localized expressions into English.
When Toss, a Korean unicorn startup, announced that they would start using Korean for variable names within financial contexts, it sparked significant debate and a wide range of reactions among Korean programmers.
Nah. If anything, treating keywords as special sigils actually helps.
Also, not all natural languages are suitable for programming languages. In highly inflected languages you often end up with grammatically incorrect forms. Or with stilted language.
Thank you for your empathy. English has been one of the most widespread languages across the globe, though, so it is reasonable to use English in many coding projects.
It may also be reasonable to make localized translations of a programming language. This is rarely done in practice, for obvious reasons. An exception is Excel's function names. People who don't know English, or hardly know it, appreciate it.
That’s the least of their problems. The best computer science textbooks are published first and foremost in English and only translated belatedly. The research papers are in English and not often translated. Even the manuals of both commercial and FOSS programming tools tend not to be translated. A few keywords is what, half an hour of rote memorization?
It wasn’t “revoked under Biden.” That implies the Biden administration (or any administration) gets to define this. They don’t. Recessions in the United States are generally demarcated by NBER.¹
>It does imply that because the Trump admin killed the group involved with preventing pandemics[1]
No it doesn't, not without massively reading in between the lines. This is getting to absurd levels of nitpicking over wording, like "autistic people" vs "people with autism".
>I assume you are being disingenuous by using that claim while also trying to smear the Biden admin.
Two can play at this game. I assume you're being disingenuous by trying to put words in my mouth over tiny disagreements in wording.
Interesting article you’ve linked. I’m not sure I agree, but it was a good read and food for thought in any case.
Work is still being done on how to bulletproof input “sanitization”. Research like [1] is what I love to discover, because it’s genuinely promising. If you can formally separate out the “decider” from the “parser” unit (in this case, by running two models), together with a small allowlisted set of tool calls, it might just be possible to get around the injection risks.
Sanitization isn’t enough. We need a way to separate code and data (not just to sanitize out instructions from data) that is deterministic. If there’s a “decide whether this input is code or data” model in the mix, you’ve already lost: that model can make a bad call, be influenced or tricked, and then you’re hosed.
At a fundamental level, having two contexts as suggested by some of the research in this area isn’t enough; errors or bad LLM judgement can still leak things back and forth between them. We need something like an SQL driver’s injection prevention: when you use it correctly, code/data confusion cannot occur since the two types of information are processed separately at the protocol level.
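To make the SQL analogy concrete, here is the driver-level pattern being referenced, sketched with Python's built-in sqlite3 module: the query structure (code) and the user input (data) travel through separate channels, so no input string can alter the query's shape.

```python
import sqlite3

# The query structure (code) is fixed; user input (data) is bound via a
# "?" placeholder, so the driver never parses it as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

malicious = "alice' OR '1'='1"  # classic injection payload

# Parameterized: the payload is treated purely as data, so no rows match.
safe = conn.execute("SELECT * FROM users WHERE name = ?", (malicious,)).fetchall()
print(safe)  # []

# A legitimate value still works through the same channel.
ok = conn.execute("SELECT * FROM users WHERE name = ?", ("alice",)).fetchall()
print(ok)  # [('alice',)]
```

The guarantee here is structural, not judgment-based: no model or heuristic decides whether the input "looks like" SQL, which is exactly the property current LLM pipelines lack.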
The linked article isn't describing a form of input sanitization, it's a complete separation between trusted and untrusted contexts. The trusted model has no access to untrusted input, and the untrusted model has no access to tools.
That’s still only as good as the ability of the trusted model to delineate instructions from data. The untrusted model will inevitably be compromised so as to pass bad data to the trusted model.
I have significant doubt that a P-LLM (as in the camel paper) operating a programming-language-like instruction set with “really good checks” is sufficient to avoid this issue. If it were, the P-LLM could be replaced with a deterministic tool call.