What Enterprises Are Learning from the State of Agent Engineering Report
The recent LangChain "State of Agent Engineering" report, combined with data from the AI‑Native Application Architecture whitepaper, reveals rapid production adoption of AI agents, persistent quality challenges, widespread observability, multi‑model strategies, and evolving evaluation practices across organizations of all sizes.
Background
LangChain released the State of Agent Engineering report, which surveys enterprise adoption of AI agents, identifies challenges, and outlines emerging trends. This article translates and reorganizes the report for Chinese readers and compares its findings with the AI‑Native Application Architecture whitepaper (September 2023) to highlight regional similarities and differences in agent engineering.
Key Findings
Production Adoption
57% of respondents have agents in production, with large enterprises leading the wave. An additional 30.4% are actively developing agents with clear launch plans, indicating a shift from proof‑of‑concept to sustained deployment.
Quality as Primary Barrier
Quality concerns (accuracy, relevance, consistency, tone, brand/policy compliance) are the top obstacle for 32% of participants, while cost worries have decreased compared with previous years.
Observability as Standard
Nearly 89% of organizations have implemented some form of observability for agents, far exceeding the 52% adoption rate of formal evaluation (evals). Detailed tracing of multi‑step reasoning and tool calls is now considered a basic requirement.
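The report does not prescribe tooling, but per‑step tracing of an agent run typically looks something like the minimal sketch below, built on OpenTelemetry's Python SDK: one root span per agent run, one child span per tool call. The span names, attributes, and stubbed tool logic are illustrative assumptions, not from the report.

```python
# Minimal sketch of per-step agent tracing with OpenTelemetry.
# Span names and attributes are illustrative, not from the report.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)
tracer = trace.get_tracer("agent-demo")

def call_tool(name: str, payload: dict) -> dict:
    # Each tool call becomes a child span, so a multi-step run shows up
    # as one trace with one span per reasoning/tool step.
    with tracer.start_as_current_span(f"tool:{name}") as span:
        span.set_attribute("tool.name", name)
        span.set_attribute("tool.payload_size", len(str(payload)))
        return {"result": f"stub output for {name}"}  # stand-in for real tool logic

with tracer.start_as_current_span("agent-run") as root:
    root.set_attribute("agent.task", "answer customer question")
    call_tool("search", {"query": "refund policy"})
    call_tool("summarize", {"doc_id": "kb-123"})
```

Exporting to the console keeps the example self-contained; in production the same spans would flow to a tracing backend.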
Multi‑Model Strategies
OpenAI’s GPT series remains dominant, but Gemini, Claude, and open‑source models see significant use. Over 75% of teams employ multiple models, routing tasks based on complexity, cost, and latency rather than locking to a single vendor.
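As a rough illustration of such routing, here is a hypothetical policy that picks a model using a crude complexity proxy under a caller-supplied latency budget. The model names, prices, latencies, and thresholds are invented for the example; real routers use richer signals.

```python
# Hypothetical multi-model routing policy: choose a model per request
# by complexity, cost, and latency. All numbers below are illustrative.
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative
    p95_latency_ms: int        # illustrative

MODELS = [
    ModelProfile("small-open-source", 0.0002, 300),
    ModelProfile("mid-tier-model", 0.0006, 600),
    ModelProfile("frontier-model", 0.0100, 1500),
]

def route(prompt: str, latency_budget_ms: int) -> ModelProfile:
    # Crude complexity proxy: longer prompts go to stronger (pricier) models,
    # subject to the caller's latency budget.
    complexity = len(prompt.split())
    candidates = [m for m in MODELS if m.p95_latency_ms <= latency_budget_ms]
    if not candidates:
        candidates = MODELS  # degrade gracefully rather than fail
    if complexity > 500:
        return max(candidates, key=lambda m: m.cost_per_1k_tokens)  # strongest
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)      # cheapest

print(route("Summarize this ticket.", latency_budget_ms=800).name)
```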
Use‑Case Distribution
Customer service (26.5%) and research/data analysis (24.4%) together account for more than half of primary deployment scenarios.
In enterprises with >10,000 employees, internal productivity improvement is the top use case (26.8%), followed by customer service and research.
Evaluation and Testing
Observability is widespread, but evaluation lags: 52.4% run offline evals on test sets, and only 37.3% conduct online evals, though the latter is growing rapidly. Teams combine automated LLM‑as‑Judge assessments with human review, especially for high‑risk or nuanced tasks.
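A minimal sketch of that combination might look like the following: an automated judge scores each output against a test set, and borderline cases are flagged for human review. The stubbed judge, dataset shape, and 0.7 threshold are assumptions made for illustration, not details from the report.

```python
# Offline-eval sketch: LLM-as-Judge scoring plus a human-review flag
# for low-confidence outputs. The judge is stubbed so the example runs offline.
from typing import Callable

def llm_judge_stub(question: str, answer: str) -> float:
    # In practice this would call a judge model with a rubric prompt;
    # a fixed score keeps the sketch self-contained.
    return 0.8

def run_offline_eval(
    dataset: list[dict],
    generate: Callable[[str], str],
    judge: Callable[[str, str], float],
    review_threshold: float = 0.7,  # assumed cutoff, tune per task
) -> list[dict]:
    results = []
    for example in dataset:
        answer = generate(example["question"])
        score = judge(example["question"], answer)
        results.append({
            "question": example["question"],
            "answer": answer,
            "score": score,
            # Route borderline outputs to human review, matching the
            # automated-judge-plus-human-review pattern the report describes.
            "needs_human_review": score < review_threshold,
        })
    return results

if __name__ == "__main__":
    dataset = [{"question": "What is our refund window?"}]
    print(run_offline_eval(dataset, lambda q: "30 days.", llm_judge_stub))
```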
Infrastructure and Fine‑Tuning
About one‑third of organizations invest in self‑hosted infrastructure for open‑source models to address cost, data sovereignty, or regulatory requirements. Fine‑tuning remains uncommon; 57% rely on prompting and retrieval‑augmented generation instead.
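For context, the prompting-plus-RAG pattern steers model behavior with retrieved context placed in the prompt rather than with weight updates. The toy keyword retriever and prompt template below are illustrative assumptions; production systems typically use vector search.

```python
# Bare-bones sketch of prompting plus retrieval-augmented generation.
# The document store, lexical retriever, and template are illustrative only.
DOCS = {
    "refunds": "Refunds are available within 30 days of purchase.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    # Toy lexical overlap scoring; real systems use embeddings/vector search.
    scored = sorted(
        DOCS.values(),
        key=lambda doc: len(set(query.lower().split()) & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    # Behavior is steered with in-context grounding, not fine-tuned weights.
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```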
Implications
Enterprises have moved beyond the question of “whether” to deploy agents and are now focused on “how” to do so reliably, efficiently, and at scale. The main production barriers are quality, latency, and security, while observability and multi‑model orchestration are becoming essential capabilities for successful agent engineering.
References
LangChain, State of Agent Engineering: https://www.langchain.com/state-of-agent-engineering
AI‑Native Application Architecture whitepaper: https://developer.aliyun.com/ebook/8479