Tagged articles
2 articles
Old Zhang's AI Learning
May 16, 2026 · Artificial Intelligence

Can Your PC Run Large Language Models? Meet BenchLoop, the Local Benchmarking Tool

BenchLoop is a CLI‑plus‑Web application that lets you reproducibly benchmark locally‑run LLMs across seven suites—including speed, tool‑calling, coding and agent tasks—while recording hardware details, scoring results with a weighted formula, and optionally publishing them to a public leaderboard.

AI Evaluation · BenchLoop · LLM benchmarking
14 min read
Xiaohongshu Tech REDtech
May 12, 2026 · Artificial Intelligence

Treating Automated Testing as AI Coding: Xiaohongshu GUI Agent Real‑World Review

During the 2026 Spring Festival promotion, Xiaohongshu replaced manual UI testing with a three‑layer AI‑driven GUI Agent. The agent executed over 43,000 runs across 106 devices and 128 scenarios, reaching 58% automation, 82% adoption of AI‑generated cases, 68% bug recall, and 98% stability at roughly $1 per test case, while drastically cutting token costs.

AI Coding · Automated Testing · Code-as-Action
23 min read