• What Sudoku reveals about the limits of LLMs

    From TechnologyDaily@1337:1/100 to All on Tue May 26 10:00:26 2026
    What Sudoku reveals about the limits of LLMs

    Date:
    Tue, 26 May 2026 08:54:12 +0000

    Description:
    The world's most advanced AI models can't solve Sudoku. That matters.

    FULL STORY
    ======================================================================
    We need to talk about LLM reasoning. For all the fanfare about performance gains, the most
    sophisticated AI models continue to fail at tests of basic reasoning.

    In a study last year, Sapient Intelligence found that o3-mini-high, Claude 3.7, and DeepSeek R1 all scored exactly 0% on Sudoku-Extreme (a collection of hard Sudokus).

    By Zuzanna Stamirowska, CEO and co-founder of Pathway.

    The fact that the most powerful AI systems
    struggle with a puzzle most of us can solve in a short train journey exposes
    a structural limit built into the LLMs that are anticipated to reshape the economy and society.

    That promise isn't all hype. But it's contingent on moving towards models
    built for reasoning under constraints, as well as language problems.

    That is precisely what the transformer architecture cannot do, with implications that extend far beyond games.

    How we got here

    We need to put this into context. The companies behind the world's most widely used LLMs compete in many ways while still gathering around an architectural orthodoxy. Rather than replace the transformer architecture that launched the first
    LLMs, these companies doubled down by betting on an ever-increasing scale of training data to make models smarter and building fragile workarounds.

    Mechanisms have not yet been introduced to address the fact that LLMs treat every problem as a language problem, converting it into text and attempting
    to solve it by predicting the next token, one step at a time. Each word of a model's output commits it to a direction. LLMs lack an internal reasoning
    space large enough to keep multiple competing possibilities open at once
    while solving a problem.

    Which brings us to Sudoku. Sudoku is governed by rigid rules that are deceptively simple. Every digit from one to nine must appear exactly once in each row, column and three-by-three box. A completed grid is easy to check: the solution either holds, or it doesn't. But solving it requires reasoning under constraints, not just describing them.
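
    To make the "easy to check" point concrete, here is a minimal Python sketch of a checker for a completed grid. The function name is_valid_solution, the list-of-lists grid format and the pattern-generated example grid are illustrative assumptions, not anything taken from the study mentioned above.

```python
def is_valid_solution(grid):
    """Return True if a completed 9x9 grid obeys the row, column and box rules.

    `grid` is assumed to be a list of 9 lists of 9 integers (a hypothetical
    format chosen for this sketch).
    """
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Each unit holds nine cells, so matching {1..9} means every digit
    # appears exactly once in that unit.
    return all(unit == digits for unit in rows + cols + boxes)


# A completed grid built from a simple shifting pattern (illustrative only).
example = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
```

    Checking takes a single pass over 27 units. Nothing in it searches or backtracks, which is why verification is so much easier than solving.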

    And that distinction is where transformer-based LLMs hit a wall, since they can't hold multiple candidate paths in parallel. They can't step back to reconsider a dead end without verbalizing every intermediate thought. Sudoku doesn't care how fluently you can describe the rules. It demands that those who take on the challenge search, backtrack and converge.

    This problem is largely invisible for language tasks, of which there are many in everyday life and at which today's LLMs excel. But Sudoku doesn't live in language, and neither do most of the reasoning problems that LLMs need to be able to solve to break new ground.

    Getting Over Workarounds

    By now, we've all used LLMs enough to know this: they're creative. Faced with a typical Sudoku, reasoning models with a clever enough prompt and access to code execution tools may write a Python script for a Sudoku solver and run the code. It works, but only because the rules are precise enough to be expressed as an algorithm.

    The model hasn't reasoned through the puzzle; it has formalized the
    constraints as a program and handed the problem off, and that's not the same
    as reasoning. For problems where the rules are less rigid, and based on interpretation or shifting context, that escape route closes, and the model
    is out of options.
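
    For illustration, the kind of script described above might look like the following plain backtracking solver. This is a sketch under assumptions, not code produced by any particular model: the names candidates and solve, the use of 0 for an empty cell and the demo puzzle are all hypothetical choices made here.

```python
def candidates(grid, r, c):
    """Digits that can legally go in the empty cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill empty cells (0) in place; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d       # commit to a guess
                    if solve(grid):      # search deeper
                        return True
                grid[r][c] = 0           # dead end: undo and backtrack
                return False
    return True  # no empty cells left


# Demo puzzle (illustrative only): a pattern-generated grid with 27 cells blanked.
full = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [[0 if (r + c) % 3 == 0 else v for c, v in enumerate(row)]
          for r, row in enumerate(full)]
```

    Calling solve on a copy of puzzle fills the blanks in place. The search, the undo step and the convergence all live in the program's call stack, not in the model's text output, which is the sense in which the problem has been handed off.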

    Fine-tuning tells a similar story. With enough bespoke training data, models can produce plausible solutions to particular problems. But test them on
    novel configurations and performance collapses. The model was acting on surface patterns, not native reasoning.

    Taken together, these results punch a hole in a common narrative in AI today. We're told that AI has evolved from the development of niche models built for one purpose (like playing Go or an Atari game) to general models that perform across a dizzying range of problems. Sudoku is a relatively simple test of that promise.

    The fact that today's most advanced models can't pass it without workarounds says something about the depth of that general reasoning. It's thin.

    Why This Matters Beyond Sudoku

    Sudoku is a useful test because the skills it demands are not unique to puzzles. Some of the most critical workflows in medicine, law, operations and planning are constraint problems in disguise. In
    medicine, doctors choose therapies that must balance efficacy, side effects, drug interactions and patient history simultaneously. In law, practitioners navigate shifting regulatory constraints, conflicting precedents and client context. In operations, teams trade off schedules, supply chains and resource allocation in dynamic conditions.

    AI models wedded to reasoning through language alone can't be meaningfully integrated into these workflows. That's where the promise of AI integration in society bumps up against reality.

    The path forward is not more parameters or longer chains of verbalized reasoning. It's a leap forward to a better architecture: one that grants
    models a larger internal reasoning space, intrinsic memory that supports continual learning and the ability to work through non-language problems without forcing everything through text.

    Think of a chess grandmaster playing twenty simultaneous games with his eyes closed, internalizing patterns and navigating each search space without verbalizing every step. That is what latent reasoning looks like, and what
    the transformer architecture can't deliver. The work of AI neo-labs, including Pathway's BDH (Dragon Hatchling) architecture, is showing that it can be done once the break from the transformer is made.

    The Post-Transformer Moment

    Post-transformer frontier models must keep what transformers are genuinely great at, namely language understanding and generation, while adding the ability to solve non-language problems that current LLMs can't handle.

    The real prize in doing so is creating AI capable of reasoning through constraints natively: the kind of capability that scheduling, compliance, planning and operations have always needed.

    That's the true step towards AGI that we need to strive for next.

    This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.

    The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing, find out more here: https://www.techradar.com/pro/perspectives-how-to-submit



    ======================================================================
    Link to news story: https://www.techradar.com/pro/what-sudoku-reveals-about-the-limits-of-llms


    --- Mystic BBS v1.12 A49 (Linux/64)
    * Origin: tqwNet Technology News (1337:1/100)