Machine Learning Algorithms & Natural Language Processing
Apr 28, 2026 · Artificial Intelligence

Why DeepSeek V4 Insists on Batch Invariance—and What It Costs

DeepSeek V4 achieves ultra‑long context, complex training pipelines, and custom high‑performance kernels by enforcing batch invariance, a design that guarantees bit‑wise identical outputs across varying batch shapes but incurs lower GPU utilization, reduced small‑batch speed, and added engineering complexity.

DeepSeek V4 · GPU utilization · LLM engineering
0 likes · 8 min read
AI Architecture Hub
Apr 21, 2026 · Artificial Intelligence

Why Harness Architecture Turns LLMs into Production‑Ready Agents

This article explains why the Harness architecture—linking prompts, context, and runtime support—is the decisive factor that turns large language models from demo prototypes into reliable production agents, detailing its core capabilities, structural components, execution loop, design trade‑offs, and industry trends.

AI Operations · Agent Harness · Context Management
0 likes · 35 min read
AI Architecture Hub
Mar 15, 2026 · Artificial Intelligence

How OpenClaw Solves Long‑Task Context Challenges for AI Agents

This article analyzes the real‑world pain points of long‑running AI agents, breaks down OpenClaw’s core concepts, explains its three‑layer context‑compression pipeline, presents four key engineering decisions, shares six practical techniques with their essential parameters, and compares OpenClaw with competing approaches.

AI agents · LLM engineering · OpenClaw
0 likes · 17 min read
Code Mala Tang
Mar 9, 2026 · Artificial Intelligence

How Claude’s New Prompt Caching Cuts Token Costs by 90% for Long‑Running Agents

Claude’s API now automatically caches the static parts of prompts — system instructions, tool definitions, and context — so repeated calls reuse these sections at only 10% of the standard token price, dramatically reducing costs for multi‑turn agents; developers must, however, keep prefixes stable and avoid cache‑breaking changes.

Claude API · LLM engineering · Token Optimization
0 likes · 15 min read
dbaplus Community
Jan 21, 2026 · Information Security

How Large Language Models Transform Data Security: Frameworks, Challenges, and Real-World Practices

This article reviews the current state, feasibility, industry adoption, concrete deployment scenarios, and future directions of applying large language models to data security, covering technical challenges, architectural designs, prompt engineering, privacy‑preserving techniques, and practical case studies.

AI applications · Information Security · LLM engineering
0 likes · 21 min read