Tagged articles
7 articles
Woodpecker Software Testing
Apr 30, 2026 · Artificial Intelligence

2026 Open-Source Landscape of AI Testing Tools

The article surveys the 2026 open‑source ecosystem for AI testing, detailing programmable runtimes, AI‑specific quality dimensions, testing‑as‑code practices, observability integration, real‑world case studies, and remaining challenges such as multimodal support and long‑context stability.

AI testing · DevOps · LLM
8 min read
Woodpecker Software Testing
Apr 19, 2026 · Artificial Intelligence

Common LLM Testing Pitfalls That 90% of Test Experts Encounter

The article examines four frequent mistakes when testing large language models—misusing functional coverage, conflating hallucination detection with fact‑checking, ignoring multi‑turn interaction decay, and relying on traditional performance metrics—while offering concrete verification methods, tools, and real‑world results to improve AI quality assurance.

AI quality assurance · LLM testing · cognitive SLA
8 min read
Wu Shixiong's Large Model Academy
Mar 19, 2026 · Artificial Intelligence

Making LLM Answers Trustworthy: Citation Attribution and Hallucination Detection

This article explains why simple prompt‑based citation is insufficient for Retrieval‑Augmented Generation, introduces a sentence‑level attribution pipeline, combines semantic similarity with NLI verification, and presents practical hallucination detection and structured JSON output to ensure answer reliability.

LLM reliability · NLI · Prompt engineering
10 min read
PaperAgent
Jan 5, 2026 · Artificial Intelligence

How QuCo‑RAG Replaces Model Confidence with Objective Evidence to Cut Hallucinations

QuCo‑RAG introduces a dynamic retrieval‑augmented generation framework that quantifies uncertainty using pre‑training corpus statistics, replacing unreliable model confidence with objective frequency and co‑occurrence evidence, achieving millisecond‑level hallucination detection, superior multi‑hop QA performance, and cross‑model transferability across various LLMs.

Dynamic Retrieval · LLM · Retrieval Augmented Generation
9 min read
DataFunTalk
Oct 7, 2025 · Artificial Intelligence

Can Reinforcement Learning Spot Hallucinations in LLMs? Introducing RL4HS

Apple’s new paper presents RL4HS, a reinforcement‑learning framework that uses span‑level rewards and class‑aware policy optimization to detect hallucinated text spans in large language models, outperforming GPT‑5 and other baselines and offering more precise, auditable error identification.

RL4HS · hallucination detection · reinforcement learning
9 min read
DataFunSummit
Sep 29, 2025 · Artificial Intelligence

How to Detect and Prevent Hallucinations in LLM‑Powered NL2SQL Systems

This article explains the nature, types, and causes of hallucinations in large language models used for NL2SQL, reviews both unsupervised and supervised detection methods, and introduces an efficient token‑confidence based Active Sampling Detection (ASD) approach with practical deployment examples and future research directions.

AI Safety · ASD · LLM
19 min read
Xiaohongshu Tech REDtech
Jan 16, 2025 · Artificial Intelligence

Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection

The authors present a semantic‑graph‑enhanced uncertainty modeling framework that captures token, sentence, and paragraph dependencies, propagates uncertainty through entity relations and contradiction probabilities, and achieves roughly a 20% gain in paragraph‑level hallucination detection on WikiBio and NoteSum compared with existing uncertainty‑based baselines.

Semantic Graph · Sentence-level Modeling · Token-level Modeling
13 min read