AI Handles 80% of a Medical Triage Agent, Product Managers Cover the Rest
The article walks through a medical triage AI Agent built with LangChain, LangGraph, and LangSmith: the framework supplies the core model and tool interfaces, graph‑based orchestration manages complex branching, loops and human‑in‑the‑loop steps, and tracing plus evaluation demonstrate reliability to product managers.
Setting the Stage
Before diving in, the three core concepts—LangChain, LangGraph, and LangSmith—are defined as the building blocks of an AI Agent lifecycle, analogous to parts suppliers, assembly lines, and quality‑control stations in car manufacturing.
Phase 1: Build the Skeleton with LangChain
Standardised Model Interface
LangChain abstracts away the differences between Claude, GPT‑4, Gemini and other LLM APIs. By changing a single model‑name string, the underlying request format, parameters and response parsing are adapted automatically, allowing rapid model swaps without rewriting integration code. This standardisation matters for product managers who need to benchmark multiple models on medical reasoning.
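The pattern can be shown in a minimal plain‑Python sketch (this is not LangChain's actual API; the adapter classes and `init_model` helper are hypothetical stand‑ins for the provider integrations):

```python
# Sketch of a unified chat-model interface: each provider adapter hides its
# own request/response format behind the same invoke(prompt) -> str contract.
class ClaudeAdapter:
    def invoke(self, prompt: str) -> str:
        # A real adapter would call the Anthropic API here; stubbed for illustration.
        return f"[claude] {prompt}"

class GPT4Adapter:
    def invoke(self, prompt: str) -> str:
        return f"[gpt-4] {prompt}"

_REGISTRY = {"claude": ClaudeAdapter, "gpt-4": GPT4Adapter}

def init_model(name: str):
    """Swap the underlying model by changing a single name string."""
    return _REGISTRY[name]()

model = init_model("gpt-4")
answer = model.invoke("Which department treats morning dizziness with tinnitus?")
```

Benchmarking a second model is then a one‑string change (`init_model("claude")`), which is the property the article attributes to LangChain's standardised interface.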
Tool‑Calling Wrapper
Agents can invoke external tools such as a disease‑knowledge base or department‑matching database. Each tool is defined by a name, description, and execution function. During a triage conversation, the model decides when and which tool to call, passing relevant arguments (e.g., symptoms “dizziness”, “morning‑worsening”, “tinnitus”).
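A tool in this sense is just a name, a description (which the model reads to decide when to call it), and an execution function. A plain‑Python sketch under those assumptions (the knowledge‑base contents and tool name here are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Tool:
    name: str
    description: str
    func: Callable[[List[str]], str]

def search_knowledge_base(symptoms: List[str]) -> str:
    # Stub: a real implementation would query a disease-knowledge base.
    if "dizziness" in symptoms and "tinnitus" in symptoms:
        return "Possible vestibular disorder; consider ENT referral."
    return "No strong match found."

kb_tool = Tool(
    name="disease_knowledge_base",
    description="Look up likely conditions for a list of symptoms.",
    func=search_knowledge_base,
)

# During the conversation, the model decides when to call the tool
# and which arguments to pass:
result = kb_tool.func(["dizziness", "morning-worsening", "tinnitus"])
```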
Agent Pre‑built Architecture (ReAct)
LangChain’s built‑in ReAct loop (Reason → Act → Observe → Reason) enables multi‑step reasoning. In the example, the agent first recognises the need for more information, calls the knowledge‑base tool, observes the results, and then decides whether to ask follow‑up questions or generate a structured recommendation.
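The loop itself is simple enough to sketch in plain Python. Here a scripted stand‑in plays the model's role (first requesting a tool call, then finishing), so the control flow — not any real LLM call — is what's illustrated:

```python
# Minimal ReAct loop: Reason -> Act -> Observe -> Reason, until the
# "model" decides it has enough information to answer.
def scripted_model(history: list) -> dict:
    # Stand-in for the LLM's reasoning step.
    if not any(h.startswith("Observation:") for h in history):
        return {"action": "lookup", "input": "dizziness, tinnitus"}
    return {"action": "finish", "input": "Recommend ENT (otolaryngology)."}

def lookup(query: str) -> str:
    # Stand-in for the knowledge-base tool.
    return f"Knowledge base matched '{query}' to vestibular disorders."

def react_loop(question: str, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = scripted_model(history)                  # Reason
        if step["action"] == "finish":
            return step["input"]
        observation = lookup(step["input"])             # Act
        history.append(f"Observation: {observation}")   # Observe
    return "Step limit reached."
```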
Phase 2: Compose the Workflow with LangGraph
When the basic agent runs, limitations appear: real‑world triage requires conditional branches, loops, and human approval. LangGraph models the agent as a directed graph where each node represents an action (symptom collection, information‑sufficiency check, emergency assessment, department matching, recommendation generation) and edges encode transition rules.
Persistent Execution: After each node, the state is checkpointed, allowing the system to resume after a crash without losing patient input.
State Backtracking: If an unexpected recommendation occurs, the full state history (dialogue, tool outputs, decisions) can be inspected.
Memory Management: Short‑term memory holds the current conversation; long‑term memory can store past triage records for chronic‑patient follow‑up.
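Persistence and backtracking both reduce to one mechanism: snapshot the full state after every node. A plain‑Python sketch of that idea (not LangGraph's actual checkpointer API):

```python
import copy

class Checkpointer:
    """Snapshot the agent state after every node, so a crashed run can
    resume from the latest checkpoint and any earlier state can be
    inspected for backtracking."""
    def __init__(self):
        self.history = []

    def save(self, node: str, state: dict):
        # Deep-copy so later mutations don't rewrite past checkpoints.
        self.history.append((node, copy.deepcopy(state)))

    def latest(self):
        return self.history[-1] if self.history else None

cp = Checkpointer()
state = {"symptoms": ["dizziness"], "messages": []}
cp.save("collect_symptoms", state)

state["symptoms"].append("tinnitus")
cp.save("sufficiency_check", state)

# Resume from the last checkpoint, or inspect any earlier one:
node, restored = cp.latest()
```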
The graph includes loops for information sufficiency (re‑collect symptoms until enough data is gathered) and conditional branches for emergency detection (immediate advice to call 120, China's medical emergency number) and age‑specific handling (different pathways for children).
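The graph structure above can be sketched in plain Python: nodes are functions over a shared state dict, and the edges — including the sufficiency loop and the emergency branch — are encoded by the next‑node name each node returns. (This mirrors the shape of a LangGraph `StateGraph`, but the node names and logic here are illustrative, not the article's actual implementation.)

```python
def collect_symptoms(state):
    state["symptoms"].append(state["pending"].pop(0))
    return "sufficiency_check"

def sufficiency_check(state):
    # Loop edge: go back and re-collect until we have enough data.
    return "emergency_check" if len(state["symptoms"]) >= 2 else "collect_symptoms"

def emergency_check(state):
    # Conditional branch: emergencies short-circuit the normal pathway.
    if "chest pain" in state["symptoms"]:
        state["advice"] = "Call 120 immediately."
        return "END"
    return "match_department"

def match_department(state):
    state["advice"] = "Recommend ENT (otolaryngology)."
    return "END"

NODES = {
    "collect_symptoms": collect_symptoms,
    "sufficiency_check": sufficiency_check,
    "emergency_check": emergency_check,
    "match_department": match_department,
}

def run(state, entry="collect_symptoms"):
    node = entry
    while node != "END":
        node = NODES[node](state)
    return state
```

Running `run({"symptoms": [], "pending": ["dizziness", "tinnitus"]})` walks the sufficiency loop twice before reaching department matching, while a state containing "chest pain" exits through the emergency branch instead.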
Phase 3: Validate Quality with LangSmith
LangSmith provides full‑trace visibility, batch evaluation, and an interactive Studio for debugging.
Trace: Every node's input, output, latency, token usage, and tool calls are recorded and visualised as a waterfall diagram.
Eval: A test set of 100 real triage cases is run; metrics such as triage accuracy, emergency‑recognition recall, follow‑up relevance, and recommendation readability are computed.
Studio: Engineers can replay a failing case, pause at any node, modify intermediate data, and observe the impact on the final recommendation.
Eval results (e.g., overall accuracy 87%, emergency recall 95%) guide iterative improvements—updating knowledge‑base content, adjusting prompts, or refining graph edges.
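The two headline metrics reduce to simple counts over a labeled test set. A sketch of how they could be computed (the cases and field names below are hypothetical, not LangSmith's evaluation API):

```python
# Each labeled case records whether the agent routed to the correct
# department, whether it was a true emergency, and whether the agent
# flagged it as one.
cases = [
    {"dept_ok": True,  "is_emergency": False, "flagged": False},
    {"dept_ok": True,  "is_emergency": True,  "flagged": True},
    {"dept_ok": False, "is_emergency": False, "flagged": False},
    {"dept_ok": True,  "is_emergency": True,  "flagged": True},
]

def triage_accuracy(cases):
    """Share of cases routed to the correct department."""
    return sum(c["dept_ok"] for c in cases) / len(cases)

def emergency_recall(cases):
    """Share of true emergencies the agent actually flagged."""
    emergencies = [c for c in cases if c["is_emergency"]]
    return sum(c["flagged"] for c in emergencies) / len(emergencies)

accuracy = triage_accuracy(cases)   # 0.75 on this toy set
recall = emergency_recall(cases)    # 1.0 on this toy set
```

Recall is measured only over true emergencies because a missed emergency is the costliest failure mode in triage, which is why the article tracks it separately from overall accuracy.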
Phase 4: Deploy and Operate
In production, LangSmith shifts from a debugging tool to a monitoring hub, tracking metrics like average response time, emergency‑trigger frequency, human‑approval rates, and drop‑off ratios. Anomalies (e.g., a sudden spike in emergency triggers) are traced back to data or graph changes, fixed, re‑evaluated, and redeployed, completing a closed‑loop lifecycle.
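Detecting "a sudden spike in emergency triggers" is essentially a rolling‑rate check against a baseline. A minimal sketch of that monitoring idea (the window size, baseline rate, and alert threshold are illustrative assumptions):

```python
from collections import deque

class RateMonitor:
    """Alert when the emergency-trigger rate over a recent window
    rises well above the long-run baseline rate."""
    def __init__(self, window=100, baseline=0.05, factor=3.0):
        self.window = deque(maxlen=window)
        self.baseline = baseline
        self.factor = factor

    def record(self, triggered: bool) -> bool:
        """Record one triage session; return True if an alert fires."""
        self.window.append(triggered)
        rate = sum(self.window) / len(self.window)
        # Only alert once the window is full, to avoid noisy early rates.
        return len(self.window) == self.window.maxlen and rate > self.baseline * self.factor

monitor = RateMonitor(window=10, baseline=0.05, factor=3.0)
# Eight normal sessions, then two emergency triggers in a row:
alerts = [monitor.record(t) for t in [False] * 8 + [True, True]]
```

Once an alert fires, the trace history points at the sessions responsible, which is what closes the fix → re‑evaluate → redeploy loop the article describes.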
Takeaways for Product Managers
Product managers should:
Define the agent’s required capabilities (model, tools, data sources) using LangChain’s abstraction.
Sketch the workflow as a state‑graph (nodes, edges, conditions) to communicate with engineers via LangGraph.
Design evaluation criteria (accuracy, emergency detection, follow‑up relevance) and leverage LangSmith for systematic testing and monitoring.
Understanding these three layers—capability, orchestration, and observability—provides a concrete framework for turning vague AI product ideas into reliable, measurable solutions.
PMTalk Product Manager Community
One of China's top product manager communities, gathering 210,000 product managers, operations specialists, designers and other internet professionals; over 800 leading product experts nationwide are signed authors; hosts more than 70 product and growth events each year; all the product manager knowledge you want is right here.