Why Shallow AGI Will Beat You at Everything and Not Truly Understand: A Professional's Guide to Working With Brilliant Imposters

  • 3 days ago
  • 5 min read


 It's 9:12 AM on a Wednesday in 2027. I'm stress-testing a scenario model I've been building, when the AI finishes something in forty seconds that would have taken my team a full sprint. The output is better than what we'd have produced. More thorough. Better structured. And when I push it on a specific regulatory edge case that requires real-world judgement about how a regulator actually behaves in practice, it gives me an answer that is technically correct, impressively argued, and completely wrong.


Not wrong in the way a junior analyst gets things wrong. Wrong in the way someone who has read every book about swimming but never touched water gets things wrong.

This post introduces what I'm calling Shallow AGI: the increasingly likely scenario where AI systems outperform humans across virtually every measurable cognitive task while possessing a form of understanding that remains fundamentally, stubbornly brittle. Getting your head around this paradox will help you carve out your role in an AI-first world of work.


Unfortunately, almost everyone is stuck in a binary argument. Sam Altman declared in December 2025 that OpenAI had effectively built AGI, that it "kinda went whooshing by." On the other side, Gary Marcus and Karl Friston argue that LLMs are "just a mapping between content and content" with nothing in the middle. Marcus calls this “Broad, Shallow Intelligence” (BSI). Many professionals have picked a camp and stopped thinking.


If you don't move past this, you end up in one of two bad places: over-trusting systems that fail when stakes are highest, or writing off the most powerful cognitive tool we've ever had because it can't pass a philosophy exam. I don't think either position is tenable.


Here's where I've landed: we are about to get AI that is better than humans at almost everything measurable, and it will still not truly understand. This is the era of 'Shallow AGI'. For most professional purposes, that gap will matter far less than the sceptics think. But it will matter far more than the optimists admit. And the question of when it matters is one almost nobody is asking carefully enough.


With that, let's dive in.


The Brilliant Imposter Has Arrived

For most of professional history, competence and understanding were inseparable. You couldn't produce expert-level analysis without expert-level comprehension. The two came bundled because nothing existed to separate them.


AI has torn that bundle apart. And honestly, we haven't caught up with what that means.

Ethan Mollick coined the term "Jagged Frontier" to describe this bizarre capability profile: superhuman at differential medical diagnosis but stumped by simple visual puzzles. As he wrote in December 2025, this jaggedness "remains a key feature of AI and an endless source of confusion."


The confusion runs deeper than uneven benchmarks. Research published by Quattrociocchi's team in PNAS in 2025 identified what they termed "epistemia": the illusion of understanding produced when probabilistic output mimics deeper reflection. I think this is the right framing. The AI generates plausible patterns. We generate meaning. They look identical on paper until they don't.


Yann LeCun, now building his startup AMI Labs around an alternative architecture, put it plainly in January 2026: the path to genuine understanding through scaling language models is "complete bullshit." He argues that breakthroughs in world models and in encoding uncertainty are prerequisites for anything resembling true comprehension. Shojaee and colleagues' 2025 paper 'The Illusion of Thinking' highlighted the fragility of Large Reasoning Models: these systems excel at medium-complexity tasks but fall down on both easy tasks and hard ones. It is worth noting, though, that a rebuttal titled 'The Illusion of the Illusion of Thinking' was published shortly thereafter.


In any event, what frustrates me about the positions of Marcus, LeCun and others is that none of this stops the systems from being wildly, disruptively useful.


When "Good Enough" Outperforms "Truly Understood"

Thirty years across consulting and banking has taught me something that the AI debate keeps missing: most professional work doesn't require deep understanding. It requires reliable pattern execution.


In my years managing large, complex risk teams, the biggest risk was never that an analyst lacked philosophical depth about the nature of risk. It was that they'd miss a pattern, misclassify an event, or fail to connect a regulatory requirement to an operational process. These are precisely the tasks where AI is already better than us.


Mollick's research with Boston Consulting Group showed consultants using GPT-4 finished tasks 25% faster and produced 40% higher quality results. The tool worked as a skill leveller: the weakest performers saw a 43% jump. The system didn't understand consulting. It executed patterns of consulting better than most consultants. That should unsettle us more than it does.


With two of my kids already at university, I think about this constantly. The skills my generation trained in, the careful accumulation of domain knowledge over decades, are being compressed into API calls.


The metaphor I keep returning to is the autopilot (and you will see this in some of my earlier posts). A Boeing 787's flight management system doesn't understand aerodynamics the way a pilot does. It can't reason from first principles about novel icing conditions. But it flies the plane better than humans in 99.7% of conditions. The question isn't whether it understands; it's whether you're happy to let it fly the plane.


Shallow AGI is the autopilot for knowledge work. The 0.3% where it fails? That's where human expertise earns its keep for the time being.


The Shallow AGI Playbook: Three Principles for Professionals

The way I've started thinking about working with Shallow AGI comes down to three zones. I'm sure these will change over time, but they're holding up so far.



First, the Delegation Zone. Tasks where pattern execution is the work: drafting, summarising, data analysis, code generation, scenario modelling. Here, shallow understanding is irrelevant. Professional value shifts from doing this work to validating it. Mollick's December 2025 analysis showed AI reproduced twelve work-years of Cochrane medical review in two days with better accuracy than humans. The bottleneck wasn't intelligence; it was navigating actual scientific practice, like emailing authors for unpublished data.


Second, the Supervision Zone. The AI produces excellent first drafts that need expert judgement to land properly. Regulatory interpretation, stakeholder communication, risk appetite statements. Anything where context and institutional memory shape what "right" actually looks like. This is where shallowness bites you. My work in risk management lives here. The PRA doesn't just publish rules; it signals intent through emphasis, omission, and tone. An AI can parse the text. Only a professional currently can parse the institution.


Third, and this is the one that worries me most, the Vigilance Zone. These are failure modes that shallow understanding makes invisible. The AI won't flag when it's crossed from competent execution into confident hallucination. It won't show you the hesitation a human expert has at the edge of what they know. Your job here isn't to do the work. It's to spot when the AI has stopped doing the work and started performing it.
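If it helps to see the ordering of the three zones made concrete, here is a deliberately toy sketch in Python. Every attribute name and routing rule below is my own invention for illustration, not a method from any of the research cited; the one thing it tries to capture is that stakes trump context, and context trumps pattern work.

```python
from dataclasses import dataclass

@dataclass
class Task:
    """A crude caricature of a knowledge-work task."""
    pattern_bound: bool   # is the work mostly reliable pattern execution?
    needs_context: bool   # does "right" depend on institutional memory?
    high_stakes: bool     # does a confident-but-wrong answer cause real harm?

def triage(task: Task) -> str:
    """Route a task to one of the three zones.

    Ordering matters: stakes trump everything, context trumps pattern work.
    """
    if task.high_stakes:
        return "vigilance"    # watch for the AI performing the work, not doing it
    if task.needs_context:
        return "supervision"  # AI drafts; expert judgement lands it
    if task.pattern_bound:
        return "delegation"   # validate the output instead of producing it
    return "supervision"      # default to a human in the loop when unsure

# Examples from this post, coded as my own guesses:
drafting = Task(pattern_bound=True, needs_context=False, high_stakes=False)
regulatory = Task(pattern_bound=True, needs_context=True, high_stakes=False)
edge_case = Task(pattern_bound=False, needs_context=True, high_stakes=True)
```

The point of the sketch is the order of the checks, not the booleans themselves; in practice each attribute is a judgement call, which is rather the whole argument.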


And the Harvard Business School Jagged Frontier project? It found that consultants leaning on AI without critical oversight, on a complex domain-specific task chosen to sit outside the frontier, were 19 percentage points less likely to produce a correct answer. Think about that. The illusion of competence might be the defining professional risk of the next five years.


The Bottom Line

The old question was: is this system truly intelligent?


I think the better question is: where does the shallowness actually matter?


That's not me conceding to the hype cycle. It's something more uncomfortable: a recognition that "understanding" was always just a proxy for "reliable performance." And that proxy has broken. We have systems that perform reliably without understanding, and fail precisely where understanding would have saved them. The professionals who do well from here will be the ones who can spot that boundary in real time (and engineer context appropriately to mitigate the risk), not the ones still arguing about whether the machine truly thinks.


Until next time, you'll find me running scenarios against the machine, looking for the moment its confidence exceeds its comprehension, and quietly grateful that twenty plus years of regulatory intuition still counts for something in the 0.3%.

 
 
 



© 2023 by therealityof.ai. All rights reserved
