Can AI‑Generated “Silicon Samples” Replace Real Survey Respondents?

The article explains how large language models can simulate virtual respondents—called silicon samples—to generate synthetic survey data, outlines the four fidelity criteria for evaluating their credibility, and demonstrates practical workflows with the open‑source EDSL Python library.

Model Perspective

Why Synthetic Survey Data?

Collecting hundreds of valid questionnaire responses is time‑consuming, especially when the target population is hard to reach (e.g., niche professions or historical cohorts). Recent advances in large language models (LLMs) enable a new approach: using LLMs to simulate virtual respondents with specific demographic attributes, producing data that can be analyzed like real survey answers. This concept is termed “silicon samples.”

What Are Silicon Samples and How Are They Trusted?

Traditional “carbon‑based samples” are real human responses; silicon samples are generated by LLMs. The model is prompted to adopt a persona (gender, age, occupation, education, region) and answer a questionnaire as that persona. The method can be used for questionnaire pilot testing, filling missing data, or building simulated experimental populations.

Research design stage: test questionnaire logic and wording.

Data collection stage: simulate hard‑to‑reach groups or augment real samples.

Mechanism‑modeling stage: create virtual agent populations for social‑simulation experiments.

The reliability of silicon samples varies across these uses; the design stage is most robust, while mechanism modeling requires the most caution.

Algorithm Fidelity Criteria

To assess credibility, the article draws on Argyle et al. (2023, Political Analysis), who propose four algorithmic‑fidelity criteria:

Social Science Turing Test: Can the generated text be distinguished from a real human answer?

Backward Continuity: Does the response reflect the assigned demographic traits (e.g., low‑income, rural, middle‑school education)?

Forward Continuity: Are answers consistent across successive questionnaire items?

Pattern Correspondence: Do internal variable relationships (e.g., higher education ↔ higher political participation) match known real‑world patterns?

These criteria focus on whether the model truly understands social structure rather than merely producing plausible text. They are qualitative judgments; no quantitative thresholds are provided.

Operational Strategy: Role‑Playing Agents

The core technique is a Role‑Playing Agent (RLPA). By embedding explicit demographic information in the prompt, the LLM assumes a consistent persona throughout the survey. Managing multiple agents, questions, and models systematically is essential for generating analyzable datasets. The open‑source EDSL (Expected Parrot Domain‑Specific Language) library implements this workflow.
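The persona‑embedding idea can be sketched in a few lines. The function below is illustrative only: the trait names and prompt wording are assumptions for this example, not EDSL internals.

```python
# A minimal sketch of assembling a role-playing prompt from
# demographic traits; wording and trait keys are hypothetical.
def persona_prompt(traits: dict) -> str:
    described = ", ".join(f"{k}: {v}" for k, v in traits.items())
    return (
        "You are answering a survey. Stay in character as a person "
        f"with these attributes ({described}) for every question."
    )

prompt = persona_prompt(
    {"age": 34, "occupation": "mechanic", "region": "rural Ohio"}
)
print(prompt)
```

Libraries like EDSL handle this prompt construction internally, so the researcher supplies only the trait dictionary.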

Practical Example 1 – Designing a Survey and Agents with EDSL

EDSL is a Python package for LLM‑driven survey research. Install it with:

pip install edsl

Then define a simple questionnaire:

from edsl import QuestionLinearScale, QuestionFreeText, Survey

q_enjoy = QuestionLinearScale(
    question_name="enjoy",
    question_text="On a scale from 1 to 5, how much do you enjoy reading?",
    question_options=[1, 2, 3, 4, 5],
    option_labels={1: "Not at all", 5: "Very much"}
)

q_favorite_place = QuestionFreeText(
    question_name="favorite_place",
    question_text="Describe your favorite place for reading."
)

survey = Survey(questions=[q_enjoy, q_favorite_place])

Create three agents with different occupational personas:

from edsl import AgentList, Agent

agents = AgentList(
    Agent(traits={"persona": p}) for p in ["artist", "mechanic", "sailor"]
)

Select LLMs and run the survey:

from edsl import ModelList, Model

models = ModelList(Model(m) for m in ["gpt-4o", "gemini-pro"])
results = survey.by(agents).by(models).run()

The resulting structured dataset can be converted to a pandas DataFrame with to_pandas() for downstream analysis. The entire process—from installation to first results—takes roughly an afternoon for a Python‑savvy researcher.
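Once in a DataFrame, the results can be sliced like any survey dataset. The frame below is mocked up for illustration; the `agent.*`/`answer.*` column naming follows EDSL's flattening convention, but the values are invented rather than produced by a real run:

```python
import pandas as pd

# Stand-in for survey.by(agents).by(models).run().to_pandas();
# values are illustrative only.
df = pd.DataFrame({
    "agent.persona": ["artist", "mechanic", "sailor"] * 2,
    "model.model": ["gpt-4o"] * 3 + ["gemini-pro"] * 3,
    "answer.enjoy": [5, 3, 4, 4, 2, 4],
})

# Mean enjoyment per persona, pooled across the two models.
summary = df.groupby("agent.persona")["answer.enjoy"].mean()
print(summary)
```

Comparing such summaries across models is also a cheap robustness check: if two LLMs disagree sharply for the same persona, the silicon sample deserves extra scrutiny.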

Practical Example 2 – Advanced Control: Piping, Memory, and Scenarios

EDSL supports fine‑grained questionnaire logic, crucial for the “forward continuity” fidelity dimension.

Piping (Passing Prior Answers)

Use placeholders to feed a previous answer into a subsequent question:

from edsl import QuestionNumerical, QuestionYesNo, Survey

q1 = QuestionNumerical(
    question_name="random_number",
    question_text="Pick a random number between 1 and 1,000."
)

q2 = QuestionYesNo(
    question_name="prime",
    question_text="Is this a prime number: {{random_number.answer}}"
)

survey = Survey([q1, q2])
results = survey.run()

Memory (Providing Full Context)

Memory attaches both the prior question and its answer to the next prompt, ensuring consistent stance across a series of attitude questions. Implemented via add_targeted_memory:

survey = Survey([q1, q2]).add_targeted_memory(q2, q1)

Scenarios (Parameterised Context)

Define a list of scenarios to generate multiple datasets under different conditions:

from edsl import ScenarioList

scenarios = ScenarioList.from_list(
    "activity", ["reading", "running", "relaxing"]
)

Insert the placeholder {{activity}} into question text; EDSL substitutes the appropriate activity during execution, adding a “scenario” dimension to the output.
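Conceptually, scenario substitution expands one parameterised question into one concrete question per scenario value. A plain‑Python sketch of that expansion (EDSL performs the equivalent internally at run time):

```python
# One templated question expands into one question per scenario.
template = "How much do you enjoy {{ activity }}?"
activities = ["reading", "running", "relaxing"]

questions = [
    template.replace("{{ activity }}", a) for a in activities
]
for q in questions:
    print(q)
```

In EDSL itself, scenarios attach to a survey the same way agents and models do, via survey.by(scenarios), before calling run().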

Value and Limitations

Silicon samples are valuable but must be used cautiously. Their main advantages are:

Cost reduction in early research phases: Rapidly identify ambiguous wording or logical flaws before recruiting real participants.

Supplementing small‑sample studies: Provide additional data when certain groups are hard to recruit, provided researchers understand model biases and disclose the synthetic nature of the data.

Key limitations include structural biases in LLM training data (e.g., over‑representation of highly educated, urban, English‑speaking populations) and the inability of models to faithfully reproduce cultural nuances or subjective attitudes without sufficient exposure in their training corpus.

Potential Uses in Mathematical Modeling

For readers interested in mathematical modeling, silicon samples can support:

Prior construction for Bayesian models: Generate simulated decision‑making data to inform prior distributions.

Initializing agent‑based or system‑dynamics models: Produce a realistic initial population rather than a uniform random one.

Stress testing and sensitivity analysis: Create multiple synthetic datasets with varied personas to assess model robustness.

All applications should still adhere to the four fidelity standards to verify reliability.
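The stress‑testing idea can be illustrated with a toy sensitivity sweep: vary the persona mix of a synthetic population and watch how an aggregate outcome shifts. All numbers below are invented for illustration; no real personas or LLM calls are involved.

```python
import random

random.seed(42)

# Hypothetical mean agreement (1-5 attitude item) per persona type.
persona_mean = {"urban": 3.8, "rural": 2.9}

def simulate(mix_urban: float, n: int = 1000) -> float:
    """Average response for a population with the given urban share."""
    total = 0.0
    for _ in range(n):
        p = "urban" if random.random() < mix_urban else "rural"
        total += random.gauss(persona_mean[p], 0.5)
    return total / n

# Sensitivity analysis: sweep the persona mix, observe the aggregate.
for share in (0.2, 0.5, 0.8):
    print(f"urban share {share:.0%}: mean = {simulate(share):.2f}")
```

If a downstream model's conclusions flip under plausible changes to the persona mix, that fragility should be reported alongside the results.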

Conclusion

The silicon‑sample approach offers a promising, low‑cost way to augment social‑science research, especially when combined with the well‑documented EDSL toolkit. While it cannot fully replace real human data, it serves as a useful complement when researchers are aware of its biases and validate outputs against the fidelity criteria.

Artificial Intelligence · Python · LLM · Survey Research · Synthetic Data · EDSL
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
