I wish a smarter person would research or comment on this theory I have: Training a model to measure the entropy of human generated content vs LLM generated content might be the best approach to detecting LLM generated content.
Consider the "Will Smith eating spaghetti test": if you compare that video against footage of Will Smith actually eating spaghetti, I naively expect the main difference would be entropy (not similarity). When we say something looks "real", I think we're just talking about our expectation of entropy for that scene. A model could detect that it's a person eating spaghetti and compare the measured entropy to the entropy it expects for such a scene based on its training. In other words, train a model with explicit entropy measurements alongside the actual training data.
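To make the idea concrete, here's a toy sketch of an entropy-based signal (a hypothetical illustration only: it uses bag-of-words Shannon entropy and an arbitrary threshold, whereas a real detector would measure token-level surprisal under a trained language model, as described above):

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per word) of the word distribution in `text`.

    This is a crude proxy: a serious detector would compare per-token
    surprisal under a language model against the entropy it "expects"
    for that kind of content.
    """
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_generated(text: str, threshold: float = 3.0) -> bool:
    """Hypothetical classifier: flag unusually low-entropy text.

    The threshold here is made up; in the proposal above it would be
    learned from paired human/LLM training data per scene or domain.
    """
    return shannon_entropy(text) < threshold
```

For example, a maximally repetitive string like `"a a a a"` has entropy 0.0 bits, while four distinct words give 2.0 bits; the whole question is whether real human and LLM output separate cleanly on a measure like this at realistic scales.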
That's basically how "AI detectors" work: they're just ML models trained to classify content as human- or LLM-generated. As we all (hopefully) know, despite provider claims, they don't really work well.
In a non-adversarial context (where the author isn't disclosing AI use, but also isn't actively trying to hide it), AI image detection is giving me great results.
I think the problems (currently) are more with text, or with post-processing of other media to hide its AI origin.
Something like that would probably work for six months. This is going to be like CAPTCHAs. Schools have been trying to do this for essays for years. They're failing. The machines will win.
The idea is interesting, but it's still operating within the content-analysis paradigm. As soon as entropy-based detectors become popular, the next generation of LLMs will be fine-tuned specifically to generate higher-entropy text to evade them.
It's a cat-and-mouse game where the generator will always be one step ahead. It's far more robust to analyze signals that are hard to fake at scale: domain age, anomalous publication frequency, and unnatural link structures.
I doubt AI slop is the solution to AI slop; it's far too error-prone. The problem is that we already had a slop-driven advertising/attention economy, AI just made it more visible.
Any AI model can easily increase entropy by adding bits of information, and we'd end up in a weird AI info war with ordinary people as the victims. As consumers of information, we'd be dealing with unknown spaghetti. Generating false info is just too easy for a model.