Artificial Intelligence

Jul 12, 2026 · Artificial Intelligence

Does One Update Really Strengthen a Policy? PIRL and PIPO for Closed‑Loop RL

The paper by researchers from Beihang, Peking University and Meituan proposes PIRL, a new RL‑post‑training perspective that treats policy improvement as the optimization objective, and PIPO, a plug‑and‑play framework that adds a verification loop to amplify beneficial updates and suppress harmful ones, demonstrating consistent gains across math reasoning, code and tool‑use tasks.

PIPO · PIRL · RL post-training

0 likes · 9 min read

AI Engineer Programming

Jul 12, 2026 · Artificial Intelligence

Building a Full-Agent Observability and Quality Evaluation System: From Data Collection to the Data Flywheel

This article presents a comprehensive, engineering‑focused practice for observing and evaluating large‑model agents, covering new data‑collection challenges, a three‑layer observability architecture, offline and online testing pipelines, quality‑gate mechanisms, and a self‑reinforcing data flywheel that continuously improves performance, cost, and safety.

AIOps · Agent · Data Flywheel

0 likes · 18 min read

AI Architecture Hub

Jul 12, 2026 · Artificial Intelligence

Inside Claude Code: How 8× Code Output Reveals a Shift in Software Engineering Bottlenecks

Anthropic’s Q2 2026 internal review shows Claude Code boosting daily merged code lines eight‑fold, yet this surge uncovers new bottlenecks in verification, responsibility tracing, and collaboration, prompting a systematic redesign of engineering processes.

AI-assisted coding · Agent workflow · Automation

0 likes · 14 min read

AI Architecture Path

Jul 12, 2026 · Artificial Intelligence

Archify v2.10: Open‑Source AI Drawing Skill Generates Diagrams in One Sentence

Archify v2.10, an open‑source AI drawing skill, lets developers describe system architecture, workflows, sequence or data‑flow diagrams in plain language and instantly produces high‑resolution, dual‑theme SVG/HTML outputs, while offering auto‑validation, zero‑dependency sharing, and detailed comparisons with Mermaid, Draw.io and Excalidraw.

AI · Architecture · DevOps

0 likes · 15 min read

TonyBai

Jul 12, 2026 · Artificial Intelligence

Why AI Ignores Messy Code but Your Token Bill Doesn’t

A recent study shows that while AI coding agents can complete tasks equally well on clean or messy code, cleaner code consistently reduces token consumption and file revisits, leading to lower operational costs for developers.

AI coding agents · Claude Code · code cleanliness

0 likes · 15 min read

Liangxu Linux

Jul 11, 2026 · Artificial Intelligence

OpenAI Launches Next‑Gen Coding Assistant Codex: Key Technical Highlights and Will It Disrupt Programmers?

OpenAI’s new cloud‑based Codex AI assistant builds on the same Transformer architecture as GitHub Copilot, offering multi‑language support and faster responses, but real‑world testing shows only about 80% of its output is usable, with the remaining 20% containing bugs or mismatches—especially in embedded development—leading the author to argue that Codex is a helpful code‑completion tool rather than a revolutionary replacement for skilled engineers.

AI coding assistant · Codex · GitHub Copilot

0 likes · 7 min read

Java Tech Enthusiast

Jul 11, 2026 · Artificial Intelligence

Choosing Between LangChain4j and Spring AI for Enterprise Projects: What Really Matters

An interviewer's question about whether to use LangChain4j or Spring AI reveals that the true decision for enterprise Java AI projects hinges on team expertise, project requirements, and long‑term maintenance rather than just feature counts or ecosystem size.

AI frameworks · Enterprise Architecture · Java

0 likes · 5 min read

21CTO

Jul 11, 2026 · Artificial Intelligence

OpenAI’s No.2 Executive Steps Down for Health, Shifts to Part‑Time Advisor Amid Pre‑IPO Turbulence

OpenAI’s product chief Fidji Simo announced she will resign and become a part‑time advisor due to worsening POTS, while the company grapples with declining ChatGPT market share, Anthropic’s revenue surge, and a postponed IPO timeline into 2027.

AI leadership · AI market · Anthropic

0 likes · 7 min read

Machine Learning Algorithms & Natural Language Processing

Jul 11, 2026 · Artificial Intelligence

Overthinking Large Language Models: New DoS Threat to Reasoning Models Unveiled

The paper introduces a black‑box hierarchical genetic algorithm that automatically perturbs the logical structure of reasoning questions to induce excessive chain‑of‑thought in large language models, dramatically inflating output tokens (up to 26.1× on MATH) and creating a DoS‑style resource‑exhaustion attack, with extensive experiments across multiple models demonstrating the vulnerability and its transferability.

DoS attack · LLM · hierarchical genetic algorithm

0 likes · 10 min read

Machine Learning Algorithms & Natural Language Processing

Jul 11, 2026 · Artificial Intelligence

GPT-5.6 Solves 50‑Year‑Old Graph Theory Conjecture in an Hour with a 700‑Word Prompt and 64 Sub‑Agents

GPT‑5.6’s Sol Ultra model proved the long‑standing Cycle Double Cover Conjecture within an hour by orchestrating 64 sub‑agents using a detailed 700‑word prompt, illustrating how label‑based reductions and dynamic multi‑agent coordination can turn complex graph‑theoretic proofs into tractable linear‑algebra problems.

Cycle Double Cover Conjecture · GPT-5.6 · graph theory

0 likes · 12 min read

Radish, Keep Going!

Jul 11, 2026 · Artificial Intelligence

Beyond the Scores: What Really Matters in the GPT‑5.6 Release

The GPT‑5.6 launch brings three model tiers, new pricing, and a voice tool, but developers care more about prompting quirks, code verbosity, quota economics, regional access, and real‑world usability than the headline benchmark numbers.

AI Deployment · GPT-5.6 · OpenAI

0 likes · 9 min read

Mingyi World Elasticsearch

Jul 11, 2026 · Artificial Intelligence

Migrating Elasticsearch Vectors to Easysearch: Why Changing Only the Field Name Isn’t Enough

The article explains that while dense vector data can be moved from Elasticsearch to Easysearch, the field types, indexing algorithms, query DSL, and filter semantics differ, requiring careful mapping changes, query rewrites, and thorough validation to avoid mismatched results.

Easysearch · LSH · Migration

0 likes · 11 min read

Old Zhang's AI Learning

Jul 11, 2026 · Artificial Intelligence

Unsloth’s Dynamic NVFP4 Makes Qwen3.6 Run 2.5× Faster Than NVIDIA’s Official Quantization

Unsloth’s Dynamic NVFP4 quantization (W4A4) lets Qwen3.6‑27B run up to 2.5× faster on Blackwell GPUs while keeping near‑BF16 accuracy, adds FP8 KV‑Cache calibration, provides detailed hardware requirements, benchmark tables, and step‑by‑step deployment guides via vLLM, SGLang or Unsloth Studio.

Benchmark · Blackwell GPU · Dynamic Quantization

0 likes · 13 min read

DataFunSummit

Jul 11, 2026 · Artificial Intelligence

Why Diversity Beats Data Scale: Insights from MiniMax & Fudan’s DIVE Paper

The DIVE study shows that expanding the diversity of tool pools and task structures, rather than merely increasing the amount of homogeneous training data, dramatically improves LLM agents' ability to generalize to unseen tools, as demonstrated by a 12k‑vs‑48k experiment and reinforced by a four‑stage synthesis pipeline and RL fine‑tuning.

AI agents · DIVE · LLM training

0 likes · 14 min read

DataFunSummit

Jul 11, 2026 · Artificial Intelligence

Agent Architecture and Practice: Building the Next‑Generation Recommendation and Search Systems

The article analyzes the technical evolution of AI‑driven recommendation and search, covering Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's generative ranking model GRAB, while presenting design choices, performance metrics, and real‑world deployment results.

AI agents · Agentic RAG · Recommendation Systems

0 likes · 5 min read

Data Party THU

Jul 11, 2026 · Artificial Intelligence

From Prompt to Loop: A Comprehensive 7,500‑Word Review of AI Engineering Paradigms

This article surveys the four major AI engineering paradigms—Prompt, Context, Harness, and Loop—detailing their technical logic, practical implementations, trade‑offs, and real‑world incidents, while providing concrete guidelines and comparative analysis for building autonomous AI systems.

AI agents · Autonomous AI · Context Engineering

0 likes · 25 min read

Advanced AI Application Practice

Jul 11, 2026 · Artificial Intelligence

11 Mind‑Blowing GPT‑5.6 Design Cases That Showcase Its New Capabilities

The article presents eleven striking GPT‑5.6 design examples—from a voxel‑style Manhattan and Blender‑driven scenes to city‑floating islands, a 3D globe dashboard, procedural terrain, a Google‑Earth clone, UI replica, Kyoto street view, a 3D watch, a GTA‑style world, and a Xiaohongshu clone—highlighting the model's design power, cost, token usage, and code size compared to earlier versions.

3D Modeling · Cost Comparison · Design automation

0 likes · 7 min read

Machine Heart

Jul 11, 2026 · Artificial Intelligence

Real-Time Multi-Shot Long Video Generation: Introducing ShotStream (ECCV 2026)

ShotStream tackles the high latency and zero‑interaction problems of multi‑shot long video generation by proposing a streaming architecture with a dual‑cache memory, discontinuous RoPE, and a two‑stage self‑forcing distillation, achieving over 25× speedup to 16 FPS on a single H200 GPU and outperforming existing bidirectional and autoregressive models.

ECCV 2026 · ShotStream · dual cache

0 likes · 8 min read

DataFunTalk

Jul 11, 2026 · Artificial Intelligence

Ending the AI Coding Loop: Applying Control Theory for Safe Incremental Automation

The article critiques blind AI coding loops that generate massive, unreviewed PRs and proposes a control‑theory‑based framework—using sensors, controllers, and actuators—to make AI‑assisted code changes incremental, measurable, and safely integrated into real‑world engineering workflows.

AI coding · Effect-TS · agent loops

0 likes · 13 min read

DataFunSummit

Jul 11, 2026 · Artificial Intelligence

Tencent CodeBuddy’s AI DLC Slashes Training Time and Costs with a Unified Spark‑Ray Service

The article explains how Tencent CodeBuddy’s AI DLC platform unifies Spark batch processing and Ray training in a single service to eliminate data movement and turn agent trajectories into reusable training data. The result is training cycles shortened from monthly to weekly, in‑place computation on billions of features, and operational costs cut by 60%.

AI DLC · Data Lake · GPU utilization

0 likes · 2 min read
