Wall Street’s thrown a trillion dollars at a technology that can’t tell you how many r’s are in “strawberry.” Let that sink in for a second.
Large language models – ChatGPT, Claude, Gemini, the whole gang – have convinced the smartest money managers on the planet that we’re basically three fiscal quarters away from artificial general intelligence. Tech companies are building nuclear reactors to power data centers. CEOs are promising us AI agents that’ll negotiate our salaries and plan our vacations. There’s just one tiny problem: these systems don’t actually understand language. They’re just really, really good at faking it.
And here’s the kicker – cutting-edge research from neuroscientists and cognitive scientists has been screaming this from the rooftops for years. We’ve just been too dazzled by the parlor tricks to listen.
The Pattern-Matching Illusion
I spent an hour last week watching ChatGPT confidently explain why a marble would fall upward if you put it in a cup and turned the cup upside down. Not because it’s stupid (it’s not stupid, it’s not anything), but because it doesn’t have a mental model of physics. It has statistical patterns of how words typically follow other words when humans discuss cups and marbles and gravity.
What LLMs Actually Do
Think of it this way. If you fed a sufficiently powerful algorithm every recipe ever written, it could probably generate a pretty convincing new recipe for chocolate chip cookies. It would know that butter and sugar usually appear near the beginning, that you cream them together, that chocolate chips get folded in. But it wouldn’t know what “creaming” actually does to the butter’s molecular structure, wouldn’t understand why room-temperature butter matters, couldn’t tell you why the cookies spread in the oven.
That’s basically what’s happening every time you chat with an AI. Extremely sophisticated autocomplete, running on billions of parameters and trained on most of the internet. It’s kind of miraculous, honestly. But it’s not thinking.
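Here’s roughly what that looks like stripped to its skeleton. This is a toy bigram model I cooked up for illustration – the corpus, the function names, everything about it is made up – and real LLMs are transformers with billions of parameters, not word-pair counters. But the training objective has the same shape: learn which token tends to come next, then keep emitting likely next tokens.

```python
import random
from collections import Counter, defaultdict

# Toy "training corpus" -- stands in for the internet-scale text real models see.
corpus = (
    "cream the butter and sugar together then fold in the chocolate chips "
    "and bake the cookies until the edges are golden"
).split()

# Count which word follows which: a bigram model, the crudest possible "LLM".
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def generate(start: str, length: int = 8) -> str:
    """Generate text by repeatedly sampling a statistically plausible next word."""
    words = [start]
    for _ in range(length):
        counts = next_word_counts.get(words[-1])
        if not counts:
            break  # nothing ever followed this word in the corpus
        candidates, weights = zip(*counts.items())
        words.append(random.choices(candidates, weights=weights)[0])
    return " ".join(words)

print(generate("cream"))
# Output looks recipe-shaped, but nothing here "knows" what creaming does to butter.
```

Scale that idea up by a few hundred billion parameters and you get fluent paragraphs instead of recipe fragments. The gap between “produces plausible continuations” and “understands butter” doesn’t shrink on its own, though.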

Recent studies from MIT and Harvard neuroscience labs have shown something fascinating – and slightly uncomfortable for the AI hype machine. When you look at brain imaging of humans processing language versus the computational patterns in LLMs, you see completely different architectures doing completely different things. Human brains build rich, multimodal representations: when you read “dog,” your brain activates visual cortex (what dogs look like), auditory regions (barking sounds), maybe even motor areas if you’ve pet a lot of dogs. The language is just an index to all that embodied knowledge.
LLMs? They’ve got relationships between tokens. Sophisticated, high-dimensional relationships, sure. But relationships between symbols, not between the things symbols represent.
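You can see what “relationships between symbols” means in miniature. The sketch below uses made-up three-dimensional vectors standing in for the thousands-of-dimensions embeddings real models learn: “dog” lands geometrically close to “cat” and far from “spreadsheet” purely because of how the numbers line up. No fur, no barking, no vet bills anywhere in the system.

```python
import math

# Made-up, low-dimensional embeddings purely for illustration.
# Real models learn vectors with thousands of dimensions from co-occurrence patterns.
embeddings = {
    "dog":         [0.81, 0.62, 0.10],
    "cat":         [0.77, 0.70, 0.05],
    "spreadsheet": [0.02, 0.11, 0.96],
}

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Angle-based similarity: near 1.0 means 'points the same way', near 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["dog"], embeddings["cat"]))          # high
print(cosine_similarity(embeddings["dog"], embeddings["spreadsheet"]))  # low
# The geometry encodes how the symbols co-occur in text, not what a dog is.
```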
Why This Should Terrify Investors (But Probably Won’t)
You know what’s wild? The market cap of companies primarily valued on their AI capabilities has ballooned past a trillion dollars. Microsoft, Google, NVIDIA (basically printing money selling GPUs), a dozen well-funded startups promising to revolutionize everything from legal work to software engineering. And nearly all of that valuation rests on one assumption: that “really good at predicting text” scales smoothly into “actually understands things.” That assumption is exactly what the research keeps undercutting, which is the part that should terrify anyone holding the bag.
The Fundamental Limit Nobody Wants to Discuss
Here’s what the research actually shows: LLMs hit a wall with reasoning tasks that require genuine understanding. Not the wall we keep pushing back with more parameters and compute – a harder wall, baked into the architecture. They can’t:
- Handle novel situations consistently: If the training data doesn’t contain examples of a specific type of problem, performance drops off a cliff. Humans generalize from principles. LLMs interpolate from examples.
- Maintain coherent world models: Ask an LLM a series of questions about a fictional scenario you invent on the spot, track its answers, and you’ll find contradictions. It’s not remembering and updating a model of the situation – it’s generating plausible next tokens.
- Know what they don’t know: This might be the scariest one. LLMs hallucinate with the same confident tone they use for accurate information, because confidence in the output is just… another pattern to match. (The sketch after this list shows why.)
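Here’s what that last point looks like mechanically. This is a toy sketch with invented numbers, not any real model’s internals: the final step of generation turns raw scores (logits) over candidate next tokens into probabilities with softmax. Those probabilities measure “fits the pattern of the training text,” nothing more. There’s no second channel that reports whether the model actually knows the thing.

```python
import math

def softmax(logits: dict[str, float]) -> dict[str, float]:
    """Convert raw scores into a probability distribution over candidate next tokens."""
    exps = {token: math.exp(score) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}

# Invented logits for the next token after a prompt asking for a citation.
# A fabricated-but-plausible-sounding author can score as high as a real one,
# because the score measures "fits the pattern", not "exists in the world".
logits = {"Smith": 4.1, "Johnson": 3.9, "I": 0.5, "(": 1.2}
probabilities = softmax(logits)

for token, p in sorted(probabilities.items(), key=lambda kv: -kv[1]):
    print(f"{token!r}: {p:.2f}")
# The model will happily emit 'Smith' with ~53% probability whether or not
# any such paper by Smith was ever written.
```

Fluency and factual grounding never meet anywhere in that pipeline, which is why the hallucinated citation arrives in the same confident voice as the real one.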
A cognitive scientist I talked to (off the record, because apparently questioning the AI narrative is career suicide in some circles) put it this way: “We’ve built a system that’s phenomenal at mimicking the surface structure of human knowledge. But surface structure isn’t understanding. It’s not even close.”
The Language Trap
Part of the problem is that we humans are suckers for anything that uses language fluently. Language feels like the ultimate marker of intelligence – it’s basically the thing that separates us from other animals (well, that and opposable thumbs and reality TV).

Why We Keep Falling For It
There’s actually a name for this: the ELIZA effect, named after a laughably simple chatbot from the 1960s that convinced people it understood their problems by basically just rephrasing their statements as questions. “I’m sad about my mother.” “Why are you sad about your mother?” People projected understanding onto pattern matching.
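If you’ve never seen how little machinery that took, here’s a minimal ELIZA-style sketch. The regexes, the pronoun table, and the wording are my own toy version, not Weizenbaum’s original DOCTOR script, but the trick is the same: match a phrase, swap the pronouns, hand the statement back as a question.

```python
import re

# A handful of toy rules in the spirit of ELIZA's 1960s "DOCTOR" script.
# (These patterns and responses are illustrative, not the original script.)
rules = [
    (re.compile(r"i'?m (sad|angry|worried) about (.+)", re.IGNORECASE),
     "Why are you {0} about {1}?"),
    (re.compile(r"i (?:feel|think) (.+)", re.IGNORECASE),
     "What makes you feel {0}?"),
    (re.compile(r"my (mother|father|boss) (.+)", re.IGNORECASE),
     "Tell me more about your {0}."),
]

pronoun_swaps = {"my": "your", "me": "you", "i": "you", "am": "are"}

def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones ('my mother' -> 'your mother')."""
    return " ".join(pronoun_swaps.get(word.lower(), word) for word in fragment.split())

def respond(statement: str) -> str:
    """Reflect the user's statement back as a question via pure pattern matching."""
    cleaned = statement.strip().rstrip(".!?")
    for pattern, template in rules:
        match = pattern.search(cleaned)
        if match:
            return template.format(*(reflect(group) for group in match.groups()))
    return "Please, go on."  # the classic fallback when nothing matches

print(respond("I'm sad about my mother."))
# -> "Why are you sad about your mother?"  No model of sadness, mothers, or you.
```

Run it and you get the exact exchange from the paragraph above, produced by three regexes and a pronoun table.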
We’re doing the same thing now, just with way more expensive pattern matching. When ChatGPT writes you a thoughtful-sounding email or explains a concept clearly, your brain automatically attributes comprehension. It sounds like someone who gets it. But that’s the trap – sounding like understanding and actual understanding are completely different things.
“Language is not the same as thought. It’s a tool thought uses. We’ve built incredibly sophisticated tools that use the tool, and somehow convinced ourselves that’s the same as building the thing that does the thinking.”
The researchers publishing these findings aren’t Luddites or AI skeptics (mostly). They’re neuroscientists and linguists who’ve spent decades studying how actual intelligence works, and they’re watching Silicon Valley reinvent the philosophical zombie – a thing that acts intelligent without having any internal experience or understanding.
So Where Does That Leave Us?
Look, I’m not saying LLMs are useless. They’re genuinely transformative for certain tasks. Need to summarize a document? Generate marketing copy variations? Draft a boilerplate email? This technology is legitimately incredible for that stuff. It’s autocomplete on steroids, and autocomplete has always been useful.
But we’ve convinced ourselves we’re building something we’re not. The path from “really good at predicting text” to “actually intelligent” isn’t just long – it might not exist at all. At minimum, there’s zero evidence the current approach gets us there, no matter how many parameters we add or how much compute we throw at it.
The Real Innovation Might Be Elsewhere
What’s actually interesting – and what gets drowned out in all the hype – is research on hybrid systems. Models that combine language capabilities with genuine world modeling, with embodied learning, with something resembling actual reasoning architecture. That’s probably harder and less sexy and definitely won’t make you a billionaire by next quarter. But it might actually lead somewhere.
The trillion-dollar question (literally) is whether the market figures this out before or after the current bubble pops. History suggests after. We’re really good at throwing money at things that seem magical until we figure out they’re just extremely complicated tricks.
In the meantime, maybe we should listen to the people who study intelligence for a living when they tell us that language alone isn’t it. Just a thought.