This new AI, called Centaur, claims it can predict your decisions

What if an AI could take a psychology experiment like a person—and guess your next move before you make it?

Scientists in Germany say they’ve built an AI system that’s unusually good at guessing what people will do next in classic psychology tasks.

The model, called Centaur, was introduced alongside a giant dataset of human choices and reported to hit about 64% accuracy when predicting people’s behavior across many different experiments.

The work appeared on July 2, 2025 in Nature, with additional coverage the following week that brought it to a wider audience.

What Centaur actually is

Centaur is a language-model–based system trained to act as a stand-in for a person taking an experiment.

Instead of learning just one narrow task, it was fine-tuned on Psych-101, a curated collection of trial-by-trial data from more than 60,000 participants who made over 10 million choices across 160 classic behavioral studies.

Give Centaur the “script” of an experiment—what the participant saw, was told, and did—and it tries to pick the next response the way a real person would. 

How it was trained

Think of it like teaching a friend the rules of dozens of different games at once.

The researchers showed Centaur full transcripts of past psychology tasks and asked it to predict the next move for each participant.

When the prediction didn’t match what people actually did, they nudged the model toward the real choice and repeated that loop.

Over time, the system started landing much closer to human behavior—well enough to beat other models that try to mimic how we think.
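The "predict, compare, nudge" loop described above is essentially standard cross-entropy training. The toy sketch below illustrates the idea in plain Python; it is not the actual Centaur code (which fine-tunes a large language model on real transcripts), and the one-parameter choice model and simulated participants here are made up purely for illustration:

```python
# Toy illustration of the training loop: predict a choice, compare it to
# what the (simulated) participant actually did, and nudge the model's
# parameter toward the observed choice. Hypothetical throughout.
import math
import random

random.seed(0)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def make_trials(n):
    # Fake "transcripts": each trial offers two payoffs, and a simulated
    # participant picks the higher one 80% of the time.
    trials = []
    for _ in range(n):
        payoffs = [random.random(), random.random()]
        best = 0 if payoffs[0] >= payoffs[1] else 1
        choice = best if random.random() < 0.8 else 1 - best
        trials.append((payoffs, choice))
    return trials

def train(trials, lr=0.5, epochs=20):
    # A single weight: how strongly payoff drives the predicted choice.
    w = 0.0
    for _ in range(epochs):
        for payoffs, choice in trials:
            probs = softmax([w * p for p in payoffs])
            # Cross-entropy gradient step: move w toward the real choice.
            for k in range(len(payoffs)):
                target = 1.0 if k == choice else 0.0
                w += lr * (target - probs[k]) * payoffs[k]
    return w

def accuracy(w, trials):
    correct = 0
    for payoffs, choice in trials:
        probs = softmax([w * p for p in payoffs])
        correct += (probs.index(max(probs)) == choice)
    return correct / len(trials)

trials = make_trials(500)
before = accuracy(0.0, trials)      # untrained baseline
after = accuracy(train(trials), trials)
```

After training, the weight grows positive and the model predicts the higher-payoff option, so its accuracy climbs toward the 80% ceiling set by the noisy simulated participant—the same flavor of improvement, on a vastly smaller scale, that the researchers report for Centaur.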

What makes this different from past efforts

Earlier models of human decision-making tended to be bespoke: great at one task, brittle elsewhere.

Centaur’s pitch is generality.

In testing, it outperformed established cognitive and machine-learning baselines and held up in situations it hadn’t seen before, including new cover stories or structural tweaks to the tasks.

That’s a key shift from “fit a model to one experiment” toward “start with a broad, unified predictor and see how far it generalizes.”

Another detail that stands out: Centaur doesn’t just guess the choice; it can also estimate reaction times—how quickly a person is likely to respond under changing conditions.

That’s useful for designing studies, because timing often matters as much as the answer itself.

Why researchers care

If you can simulate how people tend to respond, you can prototype experiments like you would prototype software.

Instead of recruiting a giant sample to test every idea, you can run your design through Centaur first, see where it’s likely to be confusing or ambiguous, and tighten it up.

The team describes this as a “virtual laboratory”: a way to predict behavior “in any situation described in natural language,” then go collect real data more efficiently.

Beyond logistics, there are bigger research questions in play.

Cognitive scientists want to understand how attention, memory, and learning interact when we make choices.

A model that can reliably mirror outcomes across many tasks gives them a new probe: they can ask where the model succeeds, where it fails, and what that pattern suggests about the ingredients needed to explain human behavior.

Even if Centaur is “just” a strong predictor, narrowing the space of good explanations is a real contribution.

Caution, context, and healthy skepticism

Outside experts are intrigued but not sold on what Centaur tells us about the mind.

Brenden Lake, a cognitive scientist at New York University who was not involved, called it “a step beyond” in prediction—but raised a core question: does the model actually capture how we think, or is it simply matching our final choices without the same inner steps?

That distinction matters if psychologists want to use models like this to explain mental processes, not just forecast them.

There’s also the real-world angle. Predicting behavior in carefully controlled lab tasks is one thing; everyday life is messy, social, and full of hidden variables.

Even supportive coverage has noted that it’s not obvious how far these results carry into complex, open-ended situations without careful validation.

That humility will be important as people discuss potential applications in policy, product design, or clinical settings.

What comes next for Centaur

The team plans to expand the Psych-101 dataset with demographic and psychological details—things like age, socioeconomic status, and personality measures.

The idea is to see whether the model can tailor predictions to different kinds of people and whether those differences line up with patterns psychologists already know from decades of research.

They also see room to use Centaur as a proxy “participant” to explore questions in mental-health research—for example, how certain symptoms might shift the way decisions are made.

On the tools side, a more data-rich Centaur could help scientists optimize experiments before running them with human participants: choosing clearer instructions, better difficulty levels, or more sensitive timings.

That could mean smaller, faster studies that still answer big questions, which is good news for labs with limited time and budgets. 

What it could mean outside the lab

It’s easy to jump to splashy scenarios—“AI that knows what you’ll do before you do”—but the realistic near-term uses are more grounded.

In education, for instance, if you can predict how a student is likely to step through a problem, you can compare teaching strategies and pick the one that’s most likely to help.

In healthcare, a research group might simulate how people with a particular condition tend to weigh risk and reward, then design assessments or supports around those tendencies.

These are plausible, research-assisted applications—not automated mind-reading.

The transparency problem

By the authors’ own description, Centaur is still a black box.

It’s very good at predicting what people will choose; it’s less clear why.

That doesn’t make it useless—forecasts can be valuable on their own—but it does put a ceiling on certain claims.

If a model predicts choices but can’t show the steps, then psychologists still need to do the hard work of connecting those outcomes to mechanisms like memory retrieval, attention shifts, or habit formation.

The next stage of research will likely focus on testing whether patterns inside the model line up with patterns seen in human data across tasks, and where they diverge.

Bottom line

Centaur is a notable moment for the young field where large language models meet cognitive science.

Instead of building one-off models for each experiment, researchers now have a general system that often behaves like a human participant across many tasks and can be used to plan, stress-test, and compare ideas before recruiting a single person.

That doesn’t settle the debate about understanding vs. prediction, and it doesn’t mean we’ve bottled human thought.

But it does offer a practical tool that could make psychology research faster and, in some cases, more insightful.

As more data are added and more tough tests are run, we’ll learn whether Centaur is mostly a clever simulator—or the start of a deeper bridge between how we decide and how machines can model those decisions.

 


Jordan Cooper

Jordan Cooper is a pop-culture writer and vegan-snack reviewer with roots in music blogging. Known for approachable, insightful prose, Jordan connects modern trends—from K-pop choreography to kombucha fermentation—with thoughtful food commentary. In his downtime, he enjoys photography, experimenting with fermentation recipes, and discovering new indie music playlists.
