How to Build Custom LLM and Chat Models in LangChain: A Step‑by‑Step Guide

This article explains why and how to create custom LLM and chat model wrappers in LangChain by inheriting the base classes and implementing core methods such as _call and _generate. It then walks through a complete example, an echo-based chat model, covering streaming support and the benefits for testing.


Why create custom models?

Integrate new models: expose a standard LangChain interface for internal models or newly released cloud services that are not yet supported.

Modify existing behavior: wrap an existing model such as ChatOpenAI and add custom retry policies, request logging, or default-parameter overrides.

Testing: implement a deterministic mock model to avoid costly API calls during unit and integration tests, making the test suite faster and cheaper.

How to create a custom model

1. Custom LLM (simple string‑in / string‑out interface)

Inherit from langchain.llms.base.LLM.

Implement the core method:

def _call(self, prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any) -> str:
    # generate a response string from *prompt*
    ...

Provide a unique identifier by implementing the _llm_type property, e.g. return "my_custom_llm".
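
Pulling these steps together, here is a minimal sketch of a complete custom LLM. The class name ReverseLLM and its reverse-the-prompt behavior are illustrative placeholders only; the imports use the langchain_core paths of recent releases, while older versions expose the same classes under langchain.llms.base and langchain.callbacks.manager.

from typing import Any, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import LLM


class ReverseLLM(LLM):
    """Toy LLM that answers by reversing the prompt."""

    @property
    def _llm_type(self) -> str:
        # Unique identifier surfaced in logs and callbacks.
        return "reverse_llm"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        # Replace this with a call to your internal model or external service.
        return prompt[::-1]


llm = ReverseLLM()
print(llm.invoke("hello"))  # -> "olleh"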

2. Custom Chat Model (recommended for conversational agents)

Inherit from langchain.chat_models.base.BaseChatModel.

Implement the generation method:

def _generate(self, messages: List[BaseMessage], stop: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any) -> ChatResult:
    # *messages* is a list of BaseMessage objects (SystemMessage, HumanMessage, AIMessage)
    # return a ChatResult whose ChatGeneration wraps an AIMessage
    ...

(Optional) Implement _stream to yield ChatGenerationChunk objects for character‑by‑character streaming.

Implement _llm_type to return a distinct model name.

Example: EchoChatModel (custom chat model)

The file example_2_custom_chat_model.py defines EchoChatModel, a minimal echo bot that demonstrates all required hooks.

Model identifier: _llm_type returns the string "echo_chat_model" for easy debugging.

Core generation logic: _generate counts the incoming messages, extracts the content of the last message, wraps it in an AIMessage inside a ChatGeneration, and returns a ChatResult. An optional llm_output dictionary can carry metadata such as token counts.

def _generate(self, messages: List[BaseMessage], **kwargs) -> ChatResult:
    num_messages = len(messages)
    last_content = messages[-1].content
    ai_msg = AIMessage(content=last_content)
    # ChatResult wraps the reply in a list of ChatGeneration objects
    generation = ChatGeneration(message=ai_msg)
    return ChatResult(generations=[generation], llm_output={"num_messages": num_messages})

Optional streaming: _stream yields one ChatGenerationChunk per character, enabling real-time display.

def _stream(self, messages: List[BaseMessage], **kwargs):
    response = self._generate(messages).generations[0].message.content
    for ch in response:
        # each chunk carries an AIMessageChunk holding a single character
        yield ChatGenerationChunk(message=AIMessageChunk(content=ch))
main() demonstrates the usage flow:

Instantiate EchoChatModel and print model._llm_type to confirm the model identifier.

Create a list containing a SystemMessage (role definition) and a HumanMessage (user query), then call model.invoke(messages) to obtain a single AIMessage response.

Call model.stream(messages), consume the yielded message chunks (built from the ChatGenerationChunk objects that _stream produces), and concatenate their content to reconstruct the full reply. A self-contained sketch of the whole flow follows below.
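
To make the flow concrete, the following is a minimal, self-contained sketch of what example_2_custom_chat_model.py might look like once the snippets above are assembled; the actual file may differ, and the import paths assume the langchain_core package (older releases expose the same classes under langchain.schema and langchain.chat_models.base).

from typing import Any, Iterator, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, AIMessageChunk, BaseMessage, HumanMessage, SystemMessage
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult


class EchoChatModel(BaseChatModel):
    """Minimal chat model that echoes the last incoming message."""

    @property
    def _llm_type(self) -> str:
        return "echo_chat_model"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        ai_msg = AIMessage(content=messages[-1].content)
        return ChatResult(
            generations=[ChatGeneration(message=ai_msg)],
            llm_output={"num_messages": len(messages)},
        )

    def _stream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[ChatGenerationChunk]:
        response = self._generate(messages).generations[0].message.content
        for ch in response:
            yield ChatGenerationChunk(message=AIMessageChunk(content=ch))


def main() -> None:
    model = EchoChatModel()
    print(model._llm_type)  # -> "echo_chat_model"

    messages = [
        SystemMessage(content="You are an echo bot."),
        HumanMessage(content="Hello, LangChain!"),
    ]
    reply = model.invoke(messages)  # returns a single AIMessage
    print(reply.content)            # -> "Hello, LangChain!"

    # stream() yields message chunks; concatenate their content to rebuild the reply
    streamed = "".join(chunk.content for chunk in model.stream(messages))
    print(streamed)                 # -> "Hello, LangChain!"


if __name__ == "__main__":
    main()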

Messages are first-class citizens: the custom chat model works with BaseMessage sequences, allowing system messages for role setting and AI messages for context management.

Structured output: returning a ChatResult containing an AIMessage lets the model plug seamlessly into downstream LangChain components such as parsers or routers (see the chaining sketch after this list).

Extensible design: developers can enrich _generate with caching, external API calls, or implement true incremental inference in _stream.
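
To illustrate the structured-output point, here is a small sketch of piping the echo model into a standard output parser. It assumes the EchoChatModel class from the sketch above and a LangChain version where chat models support the runnable (|) composition syntax.

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.output_parsers import StrOutputParser

# Because _generate returns a proper ChatResult/AIMessage, the model composes
# like any built-in chat model.
chain = EchoChatModel() | StrOutputParser()

result = chain.invoke([
    SystemMessage(content="You are an echo bot."),
    HumanMessage(content="Structured output just works."),
])
print(result)  # -> "Structured output just works."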

References

How to: create a custom chat model class – https://python.langchain.com/docs/how_to/custom_chat_model

How to: create a custom LLM class – https://python.langchain.com/docs/how_to/custom_llm
