
Can Robots Grasp Human Intentions? Theory of Mind Meets Bayesian Prediction

This article explores how understanding others' mental states—from basic intentions to recursive mindreading—can be modeled with Bayesian inference and applied to robots for predicting human behavior in scenarios like pedestrian crossing, shopping assistance, and multi‑agent games.


1. Levels of Mind

Theory of Mind describes the ability to understand one's own and others' mental states and can be divided into hierarchical levels:

First‑order intention: an individual forms their own beliefs, desires, and intentions (e.g., knowing they like apples).

Second‑order intention: an individual infers another's mental state (e.g., recognizing that Lisa likes apples).

Third‑order and above: an individual predicts what one person thinks about a third party (e.g., "I think John believes Mary likes apples"). Higher orders involve recursive inference, which is crucial in social interaction and strategic games.

Research shows healthy adults can handle fourth‑order reasoning, while children develop basic second‑order abilities between ages 4‑6.

2. Classic Case: False Belief Task

The false‑belief task tests second‑order Theory of Mind. A typical experiment:

Little Ming places a cup in a green cabinet.

After Ming leaves, Little Hong moves the cup to a blue cabinet.

When Ming returns, which cabinet will he search?

If participants answer "green cabinet," they demonstrate second‑order reasoning by recognizing Ming's false belief.

Experiments show 3‑4‑year‑olds usually fail, about 57% of 4‑6‑year‑olds succeed, and 86% of 6‑9‑year‑olds answer correctly, indicating developmental growth of Theory of Mind.

3. Intent Prediction and Bayesian Theorem

Beyond understanding mental states, we must predict behavior. For example, a driver predicts whether a pedestrian will cross the street.

Will the pedestrian cross?

Will the pedestrian stop?

What path will the pedestrian take?

3.1 Bayesian Theorem Formula

The basic formula is:

P(A|B) = P(B|A) × P(A) / P(B)

where:

P(A|B) — posterior probability: the probability of A after observing B.

P(A) — prior probability: the initial probability of A.

P(B|A) — likelihood: the probability of observing B given that A holds.

P(B) — evidence: the overall probability of B (a normalizing factor).

3.2 Example: Predicting Pedestrian Intent

Assume a pedestrian at a curb may either cross or stop. Define:

A₁: the pedestrian intends to cross.

A₂: the pedestrian intends to stop.

B: the pedestrian takes a step forward.

Using Bayes' theorem, we update the crossing probability from the prior (e.g., a 50% chance) in light of the observed step forward.
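The update above can be sketched in a few lines. The likelihoods below (how often crossers versus stoppers step forward) are illustrative assumptions, not measured values:

```python
def bayes_update(prior_cross, p_step_given_cross, p_step_given_stop):
    """Return P(cross | step forward) via Bayes' theorem."""
    prior_stop = 1.0 - prior_cross
    # Evidence P(B): total probability of observing the step forward.
    evidence = (p_step_given_cross * prior_cross
                + p_step_given_stop * prior_stop)
    return p_step_given_cross * prior_cross / evidence

# 50% prior; assume crossers step forward 90% of the time, stoppers 20%.
posterior = bayes_update(0.5, 0.9, 0.2)
print(round(posterior, 3))  # 0.818
```

A single step forward lifts the crossing estimate from 50% to about 82%; repeated observations can be chained by feeding each posterior back in as the next prior.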

3.3 Example: Service Robot in a Mall

A service robot predicts whether a customer will proceed to checkout. Define:

A₁: the customer intends to pay.

A₂: the customer intends to continue shopping.

B: the customer moves toward the checkout area.

If the prior probability of checkout is high (e.g., 90% when moving toward the counter), the robot can adjust its strategy, such as offering quick‑pay options or a shopping bag.
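A hypothetical sketch of how the robot might turn that posterior into a decision. The 0.9 prior comes from the text; the likelihoods and the 0.8 decision threshold are assumptions chosen for illustration:

```python
def checkout_posterior(prior_pay, p_move_given_pay, p_move_given_shop):
    """Return P(pay | moves toward checkout) via Bayes' theorem."""
    evidence = (p_move_given_pay * prior_pay
                + p_move_given_shop * (1.0 - prior_pay))
    return p_move_given_pay * prior_pay / evidence

p = checkout_posterior(0.9, 0.95, 0.3)
# Act only when the robot is sufficiently confident.
action = "offer quick-pay" if p > 0.8 else "keep observing"
print(round(p, 3), action)  # 0.966 offer quick-pay
```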

4. Advanced Application: Recursive Theory of Mind

In multi‑agent interactions like negotiations or games, agents must predict not only others' actions but also others' predictions of themselves, known as recursive Theory of Mind.

4.1 Case: Number Guessing Game

Each player chooses an integer between 0 and 100.

The goal is to pick a number closest to two‑thirds of the average.

Players must anticipate others' choices and adjust accordingly.

Iterative reasoning leads players from an initial guess of 50 to 33, then 22, and so on, but most people stop after a few steps, clustering around 15‑25.
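The iteration described above is easy to make concrete: a level-k player assumes everyone else reasons one level shallower and multiplies the previous guess by 2/3.

```python
def level_k_guesses(start=50.0, levels=5):
    """Iterated best responses in the two-thirds-of-the-average game."""
    guesses = [start]
    for _ in range(levels):
        guesses.append(guesses[-1] * 2.0 / 3.0)
    return guesses

for k, guess in enumerate(level_k_guesses()):
    print(f"level {k}: {guess:.1f}")
# level 0: 50.0, level 1: 33.3, level 2: 22.2, level 3: 14.8, ...
```

Unbounded recursion drives the guess toward 0 (the Nash equilibrium), but as the text notes, real players typically stop after two or three levels, which is why answers cluster around 15‑25.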

4.2 How Robots Perform Recursive Reasoning

Robots can build hierarchical Bayesian models:

Construct a prior model of other agents' preferences.

Update the model with observed behavior using Bayes.

Assume the other agents also predict the robot, and adjust its own policy.

This process can be expressed mathematically: the robot's action at time t is chosen to maximize expected utility under its current belief about the other agents' intentions, and that belief is itself updated from each new observation.
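One step of this recursion can be sketched in a simple two-action coordination game. The payoff matrix and the opponent model are illustrative assumptions: a level-0 opponent acts on a fixed prior, and a level-1 robot best-responds to its prediction of that opponent.

```python
# Robot's payoff[(robot_action, other_action)] for a coordination game:
# both agents score only when they choose the same side.
PAYOFF = {("left", "left"): 1.0, ("left", "right"): 0.0,
          ("right", "left"): 0.0, ("right", "right"): 1.0}

def best_response(p_other_left):
    """Pick the robot action with the highest expected payoff,
    given its belief that the other agent goes left with this probability."""
    def expected(action):
        return (PAYOFF[(action, "left")] * p_other_left
                + PAYOFF[(action, "right")] * (1.0 - p_other_left))
    return max(("left", "right"), key=expected)

# Level-1 reasoning: the robot believes the other agent goes left 70% of
# the time, so it coordinates by also going left.
print(best_response(0.7))  # left
```

Deeper recursion (level-2 and beyond) nests this step: the robot models an opponent who is itself running `best_response` against a model of the robot.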

4.3 Inverse Reinforcement Learning (IRL)

IRL enables robots to infer hidden human goals from observed behavior, contrasting with standard reinforcement learning that learns from a predefined reward function.

In a mall, IRL helps a robot learn that customers who pick up items usually head to checkout, while those waiting may be seeking assistance.
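A much-simplified sketch of the idea behind IRL as used in the mall example: infer which hidden goal best explains the observed behavior. The candidate goals, likelihood table, and observations are illustrative assumptions, not a full IRL algorithm.

```python
# P(observed action | goal): e.g., "walk_to_counter" is likely if the
# customer's goal is checkout, "stand_still" if they want assistance.
LIKELIHOOD = {
    "checkout":   {"pick_up_item": 0.6, "walk_to_counter": 0.8, "stand_still": 0.1},
    "assistance": {"pick_up_item": 0.3, "walk_to_counter": 0.1, "stand_still": 0.7},
}

def infer_goal(observations, prior=None):
    """Return posterior P(goal | observations), treating actions as independent."""
    prior = prior or {"checkout": 0.5, "assistance": 0.5}
    scores = dict(prior)
    for obs in observations:
        for goal in scores:
            scores[goal] *= LIKELIHOOD[goal][obs]
    total = sum(scores.values())
    return {goal: s / total for goal, s in scores.items()}

posterior = infer_goal(["pick_up_item", "walk_to_counter"])
print(max(posterior, key=posterior.get))  # checkout
```

Real IRL methods recover a full reward function over states rather than a discrete goal label, but the inference pattern (explain observed behavior by the hidden objective that makes it most likely) is the same.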

4.4 Neural Architectures for Long‑Term Interaction

To handle extended interactions, robots can use:

A recurrent neural network (RNN) to encode the history of past interactions.

A Transformer, whose attention mechanism focuses on the most relevant parts of that history, improving intent‑prediction accuracy.

These models also enhance conversational AI by better understanding dialogue context.

Combining Theory of Mind with Bayesian inference and modern neural architectures is driving AI toward more intelligent and human‑like behavior in autonomous driving, robotics, and social AI.

Tags: Artificial Intelligence · Bayesian inference · Robotics · Multi-agent systems · Theory of Mind · Intent prediction
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
