Step‑by‑Step Guide to Setting Up OpenAI SDK, LangChain Agents, and Multimodal Calls with Python
This tutorial walks you through installing the uv package manager, configuring OpenAI credentials, running a basic OpenAI SDK example, exploring detailed API request and response fields, and building both blocking and streaming LangChain agents with custom tools and multimodal image inputs.
1. Environment Setup
Install the uv package manager. On Windows run:
```powershell
# Windows installation
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

On macOS/Linux run:

```shell
# macOS and Linux installation
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Install the OpenAI dependencies using uv:

```shell
uv add python-dotenv
uv add openai
```

2. Basic OpenAI SDK Example
Create a .env file with your credentials:
```
OPENAI_API_KEY=sk-YourKeyHere
OPENAI_BASE_URL=http://model.xxx.ai.srv/v1
```

The script `openai_sdk_demo.py` below loads the environment, creates an OpenAI client, sends a chat completion request, and prints the response.
```python
import os

from dotenv import load_dotenv
from openai import OpenAI


def main() -> None:
    load_dotenv()
    client = OpenAI(
        api_key=os.getenv("OPENAI_API_KEY"),
        base_url=os.getenv("OPENAI_BASE_URL"),
    )
    response = client.chat.completions.create(
        model="azure_openai/gpt-5.3-codex",
        messages=[
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": "你好, 你是谁?"},  # "Hello, who are you?"
        ],
        stream=False,
        temperature=0.9,
    )
    print(response.model_dump_json(indent=2))
    print(response.choices[0].message.content)


if __name__ == "__main__":
    main()
```

The sample output is a JSON object containing the assistant's reply, token usage statistics, and metadata.
3. OpenAI API Request Details
The request is a POST to `/chat/completions` with the headers `Content-Type: application/json` and `Authorization: Bearer <your API key>`. The JSON body includes fields such as:
- `model`: name of the model (e.g., `gpt-5.3-codex`)
- `messages`: ordered list of role/content pairs (`system`, `user`, `assistant`)
- `temperature`: sampling temperature (0–2; higher values produce more diverse output)
- `stream`: `true` for token-by-token streaming, `false` for a single response
- `max_completion_tokens`, `tool_choice`, `tools` (for function calling)
The response contains an identifier (`id`), an array of `choices` with message objects, token usage statistics, and a `finish_reason` indicating why generation stopped.
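To make these fields concrete, here is a minimal sketch of the request body and of how the response fields are typically read. The response values below are made up for illustration; a real call returns them from the server.

```python
import json

# Request body illustrating the fields listed above
# (model name taken from this tutorial's examples).
request_body = {
    "model": "azure_openai/gpt-5.3-codex",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello, who are you?"},
    ],
    "temperature": 0.9,
    "stream": False,
}

# A made-up response in the documented shape, to show where each field lives.
sample_response = json.loads("""
{
  "id": "chatcmpl-123",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "I am an AI assistant."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 20, "completion_tokens": 8, "total_tokens": 28}
}
""")

# The reply text, stop reason, and token accounting all have fixed locations.
reply = sample_response["choices"][0]["message"]["content"]
reason = sample_response["choices"][0]["finish_reason"]
total = sample_response["usage"]["total_tokens"]
```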
4. LangChain Blocking Call
Install LangChain and its OpenAI integration:
```shell
uv add langchain
uv add langchain-openai
```

Define a tool function, create a `ChatOpenAI` model, build an agent with the tool, and invoke it synchronously.
```python
import os

from dotenv import load_dotenv
from langchain.agents import create_agent, tool
from langchain_openai import ChatOpenAI


@tool
def get_weather(location: str) -> str:
    """Get the weather in a given location."""
    return f"Current weather in {location} is snow."


def main() -> None:
    load_dotenv()
    llm = ChatOpenAI(
        model=os.getenv("OPENAI_MODEL", "azure_openai/gpt-5.3-codex"),
        api_key=os.getenv("OPENAI_API_KEY"),
        base_url=os.getenv("OPENAI_BASE_URL"),
        temperature=0.2,
    )
    agent = create_agent(
        model=llm,
        tools=[get_weather],
        system_prompt="You are a helpful assistant.",
    )
    response = agent.invoke({
        # "What is the weather in Hangzhou today?"
        "messages": [{"role": "user", "content": "杭州今天天气如何?"}]
    })
    print(response)


if __name__ == "__main__":
    main()
```

The output shows the tool-call payload and the final answer returned by the model.
5. LangChain Streaming Call
Using the same setup, call agent.stream with stream_mode="messages" to receive tokens as they are generated.
```python
messages = agent.stream(
    {"messages": [{"role": "user", "content": "你是谁?"}]},  # "Who are you?"
    stream_mode="messages",
)
for token, metadata in messages:
    if token.content:
        print(token.content, end="", flush=True)
```

6. init_chat_model Helper
The init_chat_model function abstracts model initialization across providers, allowing a unified call without importing provider‑specific packages.
```python
import os

from langchain.chat_models import init_chat_model

llm = init_chat_model(
    model=os.getenv("OPENAI_MODEL", "azure_openai/gpt-5.3-codex"),
    model_provider="openai",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("OPENAI_BASE_URL"),
    temperature=0.2,
)
```

7. BaseMessage Types in LangChain
- `SystemMessage`: role `system`; used for background or role description.
- `HumanMessage`: role `user`; contains user input.
- `AIMessage`: role `assistant`; holds model output, tool calls, and metadata.
- `ToolMessage`: role `tool`; represents the result of a tool invocation.
8. Multimodal Example
Demonstrates sending an image URL together with a text prompt using a HumanMessage that contains a list of content blocks.
```python
import os

from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage, SystemMessage

llm = init_chat_model(
    model=os.getenv("OPENAI_MODEL", "azure_openai/gpt-5.3-codex"),
    model_provider="openai",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url=os.getenv("OPENAI_BASE_URL"),
    temperature=0.2,
)
agent = create_agent(model=llm, system_prompt="You are a helpful assistant.")
response = agent.invoke({
    "messages": [
        SystemMessage("你是一个热心的AI助手。"),  # "You are a helpful AI assistant."
        HumanMessage([
            # "Describe the content of the following image."
            {"type": "text", "text": "描述以下这张图片的内容。"},
            {"type": "image", "url": "https://i-blog.csdnimg.cn/direct/7e252620ed2d4e4a8b2d0910198f0401.png"},
        ]),
    ]
})
print(response)
```

This call returns a description generated by the model for the supplied image.
