Why Do Large Language Models Hallucinate? Causes, Risks, and Multi‑Dimensional Solutions

This article systematically examines the root causes of hallucinations in large language models, evaluates their pros and cons, and presents a comprehensive set of optimization techniques—including prompt engineering, RAG, sampling tweaks, supervised fine‑tuning, LoRA, RLHF, chain‑of‑thought reasoning, and agent/workflow designs—to build more reliable and trustworthy AI applications.

Tencent Technical Engineering

LLM Working Principle Overview

Before diving into hallucinations, it is helpful to understand how large language models (LLMs) work. LLMs are trained on massive amounts of unlabelled text and learn statistical patterns of language, enabling them to generate fluent responses to a wide range of prompts.

[Figure: LLM overview diagram]

Training Process

1. Pre‑training

LLMs first undergo self‑supervised pre‑training on large‑scale internet text (web pages, encyclopedias, news, forums, books, etc.). This stage teaches the model basic language structure and expression.

2. Post‑training / Fine‑tuning

In a second stage, the model is fine‑tuned on task‑ or domain‑specific labelled data to improve performance on particular applications.

3. Alignment

Finally, alignment (e.g., RLHF) incorporates human feedback to steer the model toward outputs that respect human values and reduce harmful or inaccurate content.

[Figure: Training stages diagram]

Data Sources

The pre‑training corpus inevitably contains errors, outdated facts, bias, and noise. Because the model treats all tokens equally, it learns both correct and incorrect information, which later manifests as hallucinations.

Training Objective

The core goal of pre‑training is to endow the model with strong language understanding and generation abilities, not to verify factual correctness. Consequently, the model learns "language patterns" rather than "ground‑truth facts".

Inference Process

1. Tokenization and Encoding

Input text is split into tokens and converted into token IDs that the model can process.

2. Embedding Mapping

Token IDs are mapped to high‑dimensional embedding vectors, forming the input matrix for subsequent layers.

3. Positional Encoding

Positional encodings are added so the model can distinguish the order of tokens.

4. Transformer Layer Processing

The embedding matrix passes through multiple Transformer blocks, each containing multi‑head self‑attention, feed‑forward networks, and normalization, allowing the model to capture contextual relationships.

5. Linear Transformation + Softmax

The final layer’s output is linearly projected onto the vocabulary space and normalized with Softmax to obtain a probability distribution over the next token.

6. Sampling Generation

Based on the chosen sampling strategy (e.g., Top‑k, Top‑p, temperature), a token is sampled from the distribution.

7. Autoregressive Generation

The newly generated token is appended to the sequence, and the forward pass and sampling (steps 4‑6) repeat on the extended sequence until an end‑of‑sequence token is produced or a maximum length is reached. This token‑by‑token loop is what "autoregressive" means, and it is what keeps the output coherent; a minimal sketch follows below.
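The sampling and autoregressive steps can be made concrete with a short sketch. The code below is a minimal, framework‑free illustration in Python/NumPy; the `model` forward function, the vocabulary, and the token IDs are hypothetical stand‑ins rather than any particular library's API.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.95):
    """Steps 5-6: turn raw next-token logits into a sampled token ID."""
    logits = logits / temperature                    # <1 sharpens, >1 flattens
    top_idx = np.argsort(logits)[-top_k:]            # Top-k: keep k most likely
    probs = np.exp(logits[top_idx] - logits[top_idx].max())
    probs /= probs.sum()                             # softmax over kept tokens
    order = np.argsort(probs)[::-1]                  # sort descending
    keep = order[np.cumsum(probs[order]) <= top_p]   # Top-p: smallest nucleus
    keep = order[:max(len(keep), 1)]                 # always keep at least one
    p = probs[keep] / probs[keep].sum()
    return int(np.random.choice(top_idx[keep], p=p))

def generate(model, token_ids, eos_id, max_len=128):
    """Step 7: append each sampled token and re-run the forward pass."""
    while len(token_ids) < max_len:
        next_id = sample_next_token(model(token_ids))  # hypothetical forward pass
        token_ids.append(next_id)
        if next_id == eos_id:
            break
    return token_ids
```

Because the next token is drawn at random from a probability distribution, even a well‑trained model can occasionally sample a plausible‑looking but wrong continuation; this is one mechanical source of hallucination.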

[Figure: Inference flow diagram]

Hallucination Types

Factual Hallucination – the model outputs statements that directly contradict objective reality (e.g., "water boils at 150 °C").

Contextual Hallucination – the response deviates from the user’s context (e.g., answering about "Apple" the company when the user asked about the fruit).

Logical Hallucination – the output contains self‑contradictions or violates common sense (e.g., claiming a person can be in Beijing and New York simultaneously).

Advantages and Disadvantages of Hallucination

While hallucinations can harm reliability, they also provide creative freedom. In creative domains such as storytelling, music, or design, the model’s propensity to generate novel content can inspire new ideas. However, in factual or safety‑critical scenarios, hallucinations lead to misinformation, user mistrust, and the propagation of false data that contaminates future training corpora.

Multi‑Dimensional Optimization Strategies

Prompt Design

Clear, unambiguous prompts with well‑defined boundaries, task decomposition, and illustrative examples (in‑context learning) help the model focus on the intended answer space.
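As an illustration, a few‑shot prompt with a clear role, explicit boundaries, and worked examples might look like the sketch below; the task and wording are hypothetical, not a prescribed template.

```python
# A hypothetical few-shot classification prompt: the role, the closed label
# set, and the two worked examples all narrow the model's answer space.
prompt = """You are a customer-support classifier. Label each message as
exactly one of: billing, technical, other. If you are unsure, answer "other".

Message: "I was charged twice this month."
Label: billing

Message: "The app crashes when I upload a photo."
Label: technical

Message: "{user_message}"
Label:"""
# Fill in {user_message} via str.format at call time.
```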

Pre‑fill Techniques

Injecting structured placeholders or tool‑call tokens (e.g., <|im_start|>assistant<tool_call>) forces the model to produce outputs in a desired format, reducing downstream parsing errors.
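One common realization works at the chat‑API level rather than with raw control tokens: the request ends with a partially written assistant turn, and the model must continue from that prefix. The sketch below assumes an API that accepts such assistant pre‑fill (as several chat‑completions APIs do); the field names follow a common convention and are not vendor‑specific.

```python
# Pre-fill: the last message is an unfinished assistant turn, so the model
# continues from the injected prefix instead of free-forming its reply.
messages = [
    {"role": "system", "content": "Reply only with a JSON object."},
    {"role": "user", "content": "Extract the city from: 'Flying to Paris on Monday.'"},
    # The pre-filled prefix locks the completion into the desired structure.
    {"role": "assistant", "content": '{"city": "'},
]
```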

Retrieval‑Augmented Generation (RAG)

RAG combines external knowledge retrieval with LLM generation. By feeding relevant documents or chat histories into the model, it can ground its answers in up‑to‑date facts, dramatically lowering hallucination risk.
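In outline, a RAG call retrieves the top‑scoring passages and splices them into the prompt before generation. The sketch below assumes hypothetical `embed`, `vector_store`, and `llm` helpers; any embedding model and vector index could fill those roles.

```python
def answer_with_rag(question, embed, vector_store, llm, k=3):
    """Ground the answer in retrieved passages (all helpers are stand-ins)."""
    passages = vector_store.search(embed(question), top_k=k)  # 1. retrieve
    context = "\n\n".join(p.text for p in passages)           # 2. assemble context
    prompt = (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)                                        # 3. generate
```

Instructing the model to admit ignorance when the context is insufficient is itself a hallucination guard: it gives the model a sanctioned alternative to inventing an answer.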

[Figure: RAG workflow diagram]

Supervised Fine‑Tuning (SFT)

SFT adjusts the model on high‑quality, task‑specific instruction‑response pairs. It improves performance on targeted tasks without altering the base model’s architecture.
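SFT data is typically stored as instruction/response records. The sketch below uses the widely adopted Alpaca‑style schema (instruction/input/output); exact field names vary by framework, so treat this as one common convention rather than a standard.

```python
import json

# One SFT record in the Alpaca-style schema; field names vary by framework.
record = {
    "instruction": "Summarize the support ticket in one sentence.",
    "input": "User reports login failures after upgrading to version 2.3.1...",
    "output": "Logins fail for some users after the 2.3.1 upgrade.",
}

# Datasets are usually stored one JSON object per line (JSONL).
with open("sft_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```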

Parameter‑Efficient Fine‑Tuning (LoRA)

LoRA injects low‑rank trainable matrices (A, B) into selected Transformer weight matrices (e.g., the attention Q and V projections) while freezing the original parameters. Because only the small factors are trained, storage and compute costs drop dramatically: adapting a 3×3 matrix with a rank‑1 decomposition needs only a 3×1 and a 1×3 factor, 6 parameters instead of 9, and the savings grow rapidly at realistic layer sizes, while most of the base model's capabilities are preserved.
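Concretely, the adapted weight is W' = W + (α/r)·ΔW, where ΔW is a rank‑r product of the two small factors, W stays frozen, and only the factors train. Below is a minimal NumPy sketch of the forward pass (shapes only, no training loop; the initialization scheme shown is one common choice).

```python
import numpy as np

class LoRALinear:
    """y = x @ (W + (alpha/r) * A @ B); W is frozen, only A and B train."""
    def __init__(self, W, r=8, alpha=16):
        d_in, d_out = W.shape
        self.W = W                                 # frozen pretrained weight
        self.A = np.random.randn(d_in, r) * 0.01   # low-rank factor (trainable)
        self.B = np.zeros((r, d_out))              # zero-init: delta starts at 0
        self.scale = alpha / r

    def __call__(self, x):
        return x @ self.W + self.scale * (x @ self.A @ self.B)
```

At realistic sizes the savings dominate: with r = 8, a 4096×4096 projection trains 2·4096·8 ≈ 65K adapter parameters instead of 16.8M, roughly a 256× reduction.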

[Figure: LoRA architecture diagram]

Reinforcement Learning from Human Feedback (RLHF)

RLHF further aligns the model by rewarding outputs that match human preferences. Common algorithms include PPO (Proximal Policy Optimization), DPO (Direct Preference Optimization), KTO (Kahneman‑Tversky Optimization), and GRPO (Group Relative Policy Optimization). These methods use preference data (chosen vs. rejected responses) to fine‑tune the policy, often with a KL penalty that keeps it close to the SFT reference model.
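As one concrete instance, the DPO loss compares the policy's log‑probabilities on the chosen and rejected responses against a frozen reference model. A minimal sketch, assuming the per‑response log‑probabilities have already been summed:

```python
import numpy as np

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: raise the policy's margin for 'chosen' over 'rejected' relative
    to the frozen reference; beta sets the implicit KL-penalty strength."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)
```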

[Figure: RLHF pipeline diagram]

Chain‑of‑Thought (CoT) Reasoning

CoT prompts encourage the model to generate step‑by‑step reasoning before the final answer, improving accuracy on complex mathematical, logical, or multi‑step tasks. Reflection mechanisms can be added to let the model review and correct its own reasoning.
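A minimal contrast between a direct prompt and a CoT prompt; the question and wording are illustrative only.

```python
# Direct prompt: the model must jump straight to the answer.
direct = "Q: A store sells pens at 3 for $2. How much do 12 pens cost? A:"

# CoT prompt: the worked reasoning invites step-by-step decomposition.
cot = (
    "Q: A store sells pens at 3 for $2. How much do 12 pens cost?\n"
    "A: Let's think step by step. 12 pens is 12 / 3 = 4 groups of 3. "
    "Each group costs $2, so 4 * 2 = $8. The answer is $8."
)
# At inference time, simply appending "Let's think step by step." after a
# question is often enough to elicit similar intermediate reasoning.
```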

Agent / Workflow Architecture

Complex real‑world tasks are often decomposed into sub‑tasks handled by multiple agents or workflow stages. A global planning step gives the model a holistic view, while dynamic tool calls, result summarization, reflection, and re‑planning improve robustness. Smaller models can handle simple steps, and larger models can be reserved for challenging reasoning.
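In skeleton form, such a loop alternates between planning, tool calls, and reflection. The `llm` and `tools` objects below are hypothetical stand‑ins for a model client and a tool registry:

```python
def run_agent(task, llm, tools, max_steps=8):
    """A plan-act-reflect skeleton; llm and tools are hypothetical stand-ins."""
    plan = llm(f"Break this task into concrete steps: {task}")   # global planning
    history = [f"Plan:\n{plan}"]
    for _ in range(max_steps):
        action = llm("Given the plan and the results so far, name the next "
                     "tool call, or say DONE.\n" + "\n".join(history))
        if action.strip().startswith("DONE"):
            break
        result = tools.call(action)                              # dynamic tool call
        history.append(f"Action: {action}\nResult: {result}")
        # Reflection: let the model critique the result before moving on.
        history.append("Critique: " + llm(f"Check this result for errors: {result}"))
    return llm("Summarize the outcome for the user.\n" + "\n".join(history))
```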

[Figure: Agent workflow diagram]

Conclusion

Hallucination remains a central obstacle to trustworthy LLM deployment. By understanding its origins—random sampling, noisy training data, limited context windows, and autoregressive generation—and applying a layered set of mitigations—from prompt engineering and RAG to LoRA fine‑tuning, RLHF, CoT reasoning, and sophisticated agent workflows—practitioners can substantially improve factual consistency while preserving the creative benefits of generative AI.

