Using AI to Generate Massive Data for JMeter Performance Tests

This article shows how AI can help write Groovy JSR223 scripts that dynamically generate large‑scale, unique, or realistically distributed test data inside JMeter threads, eliminating CSV I/O bottlenecks and improving maintainability for performance testing.

Advanced AI Application Practice

Oct 23, 2025

Core Idea

Generate test data dynamically inside JMeter threads to eliminate the I/O bottleneck of external CSV files. AI is used to produce the Groovy logic for these dynamic generators.

Solution Comparison

CSV data file – simple and intuitive for small, static data sets (under ten thousand rows). Large files become I/O bottlenecks and are hard to maintain.

JMeter functions – built‑in and lightweight, suitable for simple random numbers or counters. Cannot create complex structured data.

JSR223 scripts (Groovy) – unlimited data volume, high performance, flexible for complex business rules. Requires writing scripts, which AI can assist with.

Scenario 1 – Unique User Registration Data

Requirement: each request must use a unique username, email, and a random password while supporting high concurrency.

Prompt to AI :

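Please write a JMeter JSR223 Sampler script (in Groovy) that generates a large volume of unique user registration data. Requirements:
Goal: generate unique data for every request of every JMeter thread.
Generation rules:
username: "perf_user_" followed by a globally unique, incrementing numeric ID.
email: "test_" followed by the same unique ID and "@example.com".
password: a random string of 8 to 12 characters containing uppercase letters, lowercase letters, and digits.
Key point: the ID must stay unique under high concurrency. Store the generated values in JMeter variables (via vars) so that later elements such as HTTP requests can reference them.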

AI‑generated Groovy script :

// JSR223 Sampler (Groovy) – generate unique user data
import java.util.concurrent.atomic.AtomicLong

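// A plain script variable is neither predefined nor shared between threads,
// so keep the counter in props, which all threads in this JMeter JVM share.
// "perfUserIdCounter" is an arbitrary property name chosen for this script.
AtomicLong globalCounter
synchronized (props) {
    globalCounter = (AtomicLong) props.get("perfUserIdCounter")
    if (globalCounter == null) {
        globalCounter = new AtomicLong(1L)
        props.put("perfUserIdCounter", globalCounter)
    }
}

long uniqueId = globalCounter.getAndIncrement()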
String username = "perf_user_" + uniqueId
String email = "test_" + uniqueId + "@example.com"

def random = new Random()
def lower = "abcdefghijklmnopqrstuvwxyz"
def upper = lower.toUpperCase()
def digits = "0123456789"
def allChars = lower + upper + digits

def passwordLength = random.nextInt(5) + 8 // 8‑12 characters
def passwordBuilder = new StringBuilder()
// Ensure at least one of each character type
passwordBuilder.append(lower.charAt(random.nextInt(lower.length())))
passwordBuilder.append(upper.charAt(random.nextInt(upper.length())))
passwordBuilder.append(digits.charAt(random.nextInt(digits.length())))
(4..passwordLength).each {
    passwordBuilder.append(allChars.charAt(random.nextInt(allChars.length())))
}
// Shuffle so the three guaranteed characters are not always at the front
def passwordChars = passwordBuilder.toString().toList()
Collections.shuffle(passwordChars, random)
String password = passwordChars.join('')

vars.put("username", username)
vars.put("email", email)
vars.put("password", password)

return "Data generated successfully."

Usage in JMeter:

Add a Thread Group.

Add a JSR223 Sampler under the Thread Group and paste the script. Select groovy as the language.

Reference the variables in an HTTP Request with ${username}, ${email} and ${password}.

Scenario 2 – Correlated Order Data (Product ID & Quantity)

Requirement: simulate an e‑commerce order where each order contains 1‑3 items, each with a random SKU ID (1‑1000) and quantity (1‑5). The script must output a JSON request body.

Prompt to AI :

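Please write a JMeter JSR223 Sampler (Groovy) script that simulates the data for an e-commerce order request.
Business background: the system has 1,000 products, each with a fixed price. The performance test needs to simulate users buying 1 to 3 products at random.
Generation rules:
Product ID (skuId): chosen at random from 1 to 1000.
Purchase quantity (quantity): chosen at random from 1 to 5.
Generate a list of order items, items, whose length is 1 to 3 (random). Each list entry is a JSON object containing skuId and quantity.
Finally, the script must output a complete JSON string as the request body and store it in the JMeter variable requestBody.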

AI‑generated Groovy script :

// JSR223 Sampler (Groovy) – generate order request data
import groovy.json.JsonOutput
import java.util.concurrent.ThreadLocalRandom

def random = ThreadLocalRandom.current()

int itemCount = random.nextInt(1, 4) // 1, 2 or 3
def items = []
(1..itemCount).each {
    def skuId = random.nextInt(1, 1001) // 1‑1000
    def quantity = random.nextInt(1, 6) // 1‑5
    items << [skuId: skuId, quantity: quantity]
}

def requestBodyMap = [items: items]
def jsonRequestBody = JsonOutput.toJson(requestBodyMap)

vars.put("requestBody", jsonRequestBody)

return "Order data generated."

In the HTTP Request sampler set Body Data to ${requestBody} and add a Content‑Type header with value application/json.
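
To sanity-check the payload before wiring it into a test plan, the generation logic from the script above can be wrapped in a function and round-tripped through JsonSlurper outside JMeter. This is a minimal sketch under assumptions: buildOrderBody is a hypothetical helper name that does not appear in the original script, and the vars.put call is deliberately left out so the code runs as a plain Groovy script.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.util.concurrent.ThreadLocalRandom

// Hypothetical helper: builds the same kind of body as the sampler script,
// but returns it so it can be inspected outside JMeter.
String buildOrderBody() {
    def random = ThreadLocalRandom.current()
    int itemCount = random.nextInt(1, 4)        // 1, 2 or 3 items
    def items = (1..itemCount).collect {
        [skuId   : random.nextInt(1, 1001),     // 1 to 1000
         quantity: random.nextInt(1, 6)]        // 1 to 5
    }
    return JsonOutput.toJson([items: items])
}

// Parse the JSON back and check that every generated field stays in range.
def parsed = new JsonSlurper().parseText(buildOrderBody())
assert parsed.items.size() in (1..3)
parsed.items.each { item ->
    assert item.skuId in (1..1000)
    assert item.quantity in (1..5)
}
println parsed
```

Inside JMeter, the returned string would be stored with vars.put("requestBody", ...) exactly as in the sampler script.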

Scenario 3 – Realistic Data Distribution (City & Amount)

Requirement: generate city names and order amounts according to weighted distributions that reflect real‑world traffic.

Prompt to AI :

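Please write a Groovy script that generates data following a realistic distribution.
User city distribution: 60% of users come from ["北京","上海","广州","深圳"] (Beijing, Shanghai, Guangzhou, Shenzhen), 30% from other provincial capitals, and 10% from third-tier cities. Generate city names at random according to these weights.
Order amount distribution: 70% of orders are between 50 and 200 yuan, 25% between 200 and 1,000 yuan, and 5% between 1,000 and 5,000 yuan. Generate amounts according to these weights.
Store the generated city and amount in the JMeter variables city and amount.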

AI‑generated Groovy script :

// JSR223 Sampler (Groovy) – simulate realistic data distribution
import java.util.concurrent.ThreadLocalRandom

def random = ThreadLocalRandom.current()

// 1. Weighted city selection: pick a tier by weight, then a city within it
def tier1Cities = ["北京", "上海", "广州", "深圳"]
def tier2Cities = ["杭州", "成都", "武汉", "南京", "西安", "郑州", "长沙", "沈阳"]
def tier3Cities = ["保定", "扬州", "九江", "临沂", "绵阳"]

def tierRoll = random.nextDouble()
def cityPool
if (tierRoll < 0.6) {
    cityPool = tier1Cities // 60%
} else if (tierRoll < 0.9) {
    cityPool = tier2Cities // 30%
} else {
    cityPool = tier3Cities // 10%
}

def city = cityPool[random.nextInt(cityPool.size())]
vars.put("city", city)

// 2. Weighted amount generation
def range = random.nextDouble()
def amount
if (range < 0.7) {
    amount = random.nextDouble(50, 200) // 70% between 50‑200
} else if (range < 0.95) {
    amount = random.nextDouble(200, 1000) // 25% between 200‑1000
} else {
    amount = random.nextDouble(1000, 5000) // 5% between 1000‑5000
}
// Format with exactly two decimals; Locale.ROOT keeps "." as the decimal separator
vars.put("amount", String.format(Locale.ROOT, "%.2f", amount))

return "Realistic data generated."
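
When there are more than a handful of weighted buckets, chained if/else thresholds become hard to maintain. One common alternative is cumulative-weight selection: draw a number below the total weight, then walk the entries until the running sum covers it. The sketch below is illustrative only; pickWeighted and the tier labels are hypothetical names, and the fixed seed is there just to make the empirical check repeatable.

```groovy
// Hypothetical helper: returns a key with probability proportional to its weight,
// using cumulative weights (walk the entries until the roll is covered).
def pickWeighted(Map<String, Integer> weights, Random rng) {
    int total = weights.values().sum() as int
    int roll = rng.nextInt(total)            // 0 to total - 1
    int cumulative = 0
    for (entry in weights) {
        cumulative += entry.value
        if (roll < cumulative) {
            return entry.key
        }
    }
    throw new IllegalStateException("weights must be positive")
}

def tierWeights = [tier1: 60, tier2: 30, tier3: 10]   // percentages from the requirement
def rng = new Random(42L)                              // fixed seed only for a repeatable check

// Empirical check: over many draws the shares should approach 60/30/10.
def counts = [tier1: 0, tier2: 0, tier3: 0]
100000.times {
    def tier = pickWeighted(tierWeights, rng)
    counts[tier] = counts[tier] + 1
}
counts.each { tier, n -> println "${tier}: ${n / 1000.0}%" }
```

The same helper could pick an amount range as well as a city tier, so both distributions are driven by one weight table each.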

Best‑Practice Summary

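Language choice : Use Groovy in JSR223 Samplers. Groovy scripts can be compiled once and the compiled form reused on later iterations, which is why it is the recommended language for performance-sensitive scripting in JMeter.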

Performance optimisation :

Prefer ThreadLocalRandom over Random for better concurrency.

Avoid creating large objects inside the script loop.

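Keep "Cache compiled script if available" enabled so the script is compiled only once, and do not embed ${...} references in the script text, because a cached script only sees the first substituted value. Read values with vars.get(...) or pass them through the Parameters field instead.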

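Uniqueness & thread‑safety : Use concurrent utilities such as AtomicLong to generate globally unique IDs safely under high load. The counter must be a single instance shared by all threads (for example, stored in props); in distributed runs each load generator has its own JVM and therefore its own counter, so IDs are unique only per generator.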
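
The uniqueness claim can be checked outside JMeter with a small standalone script. This is a sketch under assumptions: the thread count and IDs per thread are arbitrary illustration values, and a plain thread pool stands in for JMeter's thread group. In JMeter the counter would have to live somewhere all threads can reach, such as the props object.

```groovy
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicLong

def counter = new AtomicLong(1L)              // single instance shared by all workers
def seen = ConcurrentHashMap.newKeySet()      // thread-safe set of issued IDs
int threads = 8                               // arbitrary values for the demonstration
int idsPerThread = 10000

def pool = Executors.newFixedThreadPool(threads)
threads.times {
    pool.execute({
        idsPerThread.times { seen.add(counter.getAndIncrement()) }
    } as Runnable)
}
pool.shutdown()
assert pool.awaitTermination(30, TimeUnit.SECONDS)

// Every ID was issued exactly once, so the set size equals the number of requests.
assert seen.size() == threads * idsPerThread
println "issued ${seen.size()} unique IDs"
```

Replacing the AtomicLong with a plain long incremented via ++ would typically make the final assertion fail, because concurrent increments can be lost.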

Debugging : Insert log.info(...) statements to print generated values; comment them out after debugging to avoid log overload.

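Seed control : For reproducible runs, use a seeded java.util.Random (e.g., new Random(123L)) that is created once per thread and reused across iterations, for example by keeping it in vars with vars.putObject(...). ThreadLocalRandom does not support setSeed. A fixed seed makes every run produce the same sequence, so it trades randomness for repeatability.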
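
The difference between the two generators can be seen in a few lines of standalone Groovy. This is only a sketch: the seed 123L and the value range are arbitrary, and it runs outside JMeter.

```groovy
import java.util.concurrent.ThreadLocalRandom

// Two generators built from the same seed yield the same sequence.
def first = new Random(123L)
def second = new Random(123L)
def runA = (1..5).collect { first.nextInt(1000) }
def runB = (1..5).collect { second.nextInt(1000) }
assert runA == runB

// ThreadLocalRandom cannot be seeded: setSeed throws by design.
def seedRejected = false
try {
    ThreadLocalRandom.current().setSeed(123L)
} catch (UnsupportedOperationException ignored) {
    seedRejected = true
}
assert seedRejected
```

Note that creating new Random(123L) on every sampler iteration would return the same first value each time, which is why the seeded instance should be created once and reused.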
