Tagged articles
4 articles
DataFunTalk
Jan 21, 2026 · Artificial Intelligence

Why Traditional Coding Benchmarks Miss the Mark: Inside OctoCodingBench’s Process‑Level Evaluation

The article examines the rapid progress of AI coding agents, critiques existing benchmarks that only measure final correctness, and introduces OctoCodingBench—a new suite that simulates real‑world constraints, records full interaction traces, and evaluates both task success and strict process compliance across multiple languages.

AI Evaluation · LLM-as-judge · coding agents
10 min read
DevOps in Software Development
Jan 14, 2026 · Information Security

Can a Unified Software Factory Meet Strict Secret‑Management Requirements?

The article analyzes how military‑grade software factories can reconcile unified development platforms with strict secret‑management requirements by focusing on process‑based governance, data classification, personnel behavior, and built‑in compliance mechanisms that make secret handling an intrinsic, auditable part of the development workflow.

DevOps · Secret Management · Software Factory
8 min read
360 Quality & Efficiency
Mar 29, 2018 · Operations

Reflections on Testing Challenges: Session Synchronization, Configuration Migration, Identifier Issues, and Process Compliance

The article shares a tester's three‑year journey, detailing real‑world problems such as session synchronization failures, configuration migration oversights, missing identifiers, and process compliance lapses, while offering analysis, root‑cause explanations, and practical lessons learned for improving software testing and operations.

Bug Analysis · Session Management · Software Testing
7 min read