Artificial Intelligence 45 min read

Comprehensive Guide to Using OpenAI APIs: Models, Prompts, Embeddings, Fine‑Tuning, LangChain, and Multimodal Applications

This article provides a detailed, step‑by‑step tutorial on OpenAI’s language models, API endpoints, prompt engineering, embeddings, moderation, fine‑tuning, LangChain workflows, memory management, and multimodal capabilities such as audio transcription and image generation, complete with code examples and practical usage tips.

Rare Earth Juejin Tech Community

Jun 12, 2023

Comprehensive Guide to Using OpenAI APIs: Models, Prompts, Embeddings, Fine‑Tuning, LangChain, and Multimodal Applications

Introduction

The author revisits basic machine‑learning concepts and explains the motivation for exploring OpenAI’s APIs, then presents a concise comparison of major models (BERT, GPT, T5, ChatGPT, InstructGPT) highlighting their architectures and use‑cases.

Getting Started with Tools

GitHub Copilot : code generation, usage limits, installation steps for VS Code and JetBrains.

SciSpace : PDF summarisation.

Glarity : video/audio summarisation.

NotionAI : article polishing and rewriting.

Preparation

OpenAI account creation (Google access, email, virtual phone number).

Python environment setup (Python, IDE, package manager, optional CUDA).

API Overview

Prompt Design

Effective prompts require clear instructions, high‑quality examples, and sometimes role‑playing. Example formats are shown using 人名a：地名a and a Q&A template.

Basic Endpoints

Text Completion – openai.Completion.create(...) Chat Completion –

openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages, temperature=0.5, max_tokens=2048, n=3, ...)

Embedding –

openai.Embedding.create(input="Your text", model="text-embedding-ada-002")

Moderation –

openai.Moderation.create(input=text)

Key Parameters

engine

(e.g., text-davinci-003 or gpt-3.5-turbo) max_tokens, temperature, top_p, presence_penalty, frequency_penalty,

logit_bias

Reducing Hallucinations

Provide reliable sources and few‑shot examples.

Set temperature=0 for deterministic answers.

Teach the model to answer "I don’t know" when uncertain.

Embedding Applications

Embeddings (model text-embedding-ada-002) enable similarity search, clustering, recommendation, outlier detection, and zero‑shot classification. Cosine similarity is recommended for distance calculation.

Moderation

The moderation endpoint checks content against OpenAI policy and returns a flagged boolean, category list, and confidence scores.

Fine‑Tuning

Fine‑tuning steps:

Prepare a JSONL training file (prompt + completion).

Upload and start fine‑tuning with

openai api fine_tunes.create --training_file data.jsonl --model curie --suffix "myproject"

Use the fine‑tuned model via its name.

Key hyper‑parameters include model, n_epochs, batch_size, learning_rate_multiplier, and optional compute_classification_metrics.

LlamaIndex (Second Brain)

Build a vector store from documents and query it:

import openai, os
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

openai.api_key = os.getenv("OPENAI_API_KEY")

documents = SimpleDirectoryReader('path/to/docs').load_data()
index = GPTVectorStoreIndex.from_documents(documents)
index.save_to_disk('index.json')

# Load and query
index = GPTVectorStoreIndex.load_from_disk('index.json')
response = index.query("Your question")
print(response)

LangChain Workflows

LLMChain & SimpleSequentialChain

Define prompts, create chains, and chain them together for translation, answering, and back‑translation:

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain, SimpleSequentialChain

en_to_zh_prompt = PromptTemplate(template="请把下面这句话翻译成英文：

{question}?", input_variables=["question"])
question_prompt = PromptTemplate(template="{english_question}", input_variables=["english_question"])
zh_to_cn_prompt = PromptTemplate(template="请把下面这一段翻译成中文：

{english_answer}?", input_variables=["english_answer"])

llm = OpenAI(model_name="text-davinci-003", temperature=0.5, max_tokens=2048)
question_chain = LLMChain(llm=llm, prompt=en_to_zh_prompt, output_key="english_question")
qa_chain = LLMChain(llm=llm, prompt=question_prompt, output_key="english_answer")
answer_chain = LLMChain(llm=llm, prompt=zh_to_cn_prompt)

chinese_qa = SimpleSequentialChain(chains=[question_chain, qa_chain, answer_chain], input_key="question", verbose=True)

result = chinese_qa.run(question="如何使用OpenAI的ChatCompletion？")
print(result)

Memory Management

Two main memory strategies:

ConversationBufferWindowMemory – keeps the last *k* turns.

from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(memory_key="chat_history", k=3)

ConversationSummaryMemory – summarizes older turns.

from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(llm=OpenAI())

Application Examples

Sentiment Analysis with Embeddings

import openai, os
from openai.embeddings_utils import cosine_similarity, get_embedding

openai.api_key = os.getenv("OPENAI_API_KEY")
model = "text-embedding-ada-002"

pos = get_embedding("好评", engine=model)
neg = get_embedding("差评", engine=model)
sample = get_embedding("这款产品真的很棒！", engine=model)
score = cosine_similarity(sample, pos) - cosine_similarity(sample, neg)
print(score)

Customer Service Bot (ChatCompletion + Tools)

from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)

def search_order(query):
    return "Order status ..."

def recommend_product(query):
    return "Recommended items ..."

def faq(query):
    return "Policy information ..."

tools = [
    Tool(name="Search Order", func=search_order, description="Answer questions about order status"),
    Tool(name="Recommend Product", func=recommend_product, description="Suggest products"),
    Tool(name="FAQ", func=faq, description="Handle policy questions")
]

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
answer = agent.run("我的订单号是2023ABCD，什么时候能到？")
print(answer)

Multimodal Extensions

Audio Transcription (Whisper)

import openai, os
openai.api_key = os.getenv("OPENAI_API_KEY")
audio = open("speech.mp3", "rb")
result = openai.Audio.transcribe("whisper-1", audio)
print(result["text"])

Speech Synthesis (Azure)

import os, azure.cognitiveservices.speech as speechsdk
config = speechsdk.SpeechConfig(subscription=os.getenv('AZURE_SPEECH_KEY'), region=os.getenv('AZURE_SPEECH_REGION'))
config.speech_synthesis_language = 'zh-CN'
config.speech_synthesis_voice_name = 'zh-CN-XiaoxiaoNeural'
audio_cfg = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
synthesizer = speechsdk.SpeechSynthesizer(speech_config=config, audio_config=audio_cfg)
synthesizer.speak_text_async("欢迎使用语音合成").get()

Image Classification (CLIP)

from PIL import Image
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open("cat.jpg")
labels = ["one cat", "two cats", "three cats"]
inputs = processor(text=[f"a photo of {l}" for l in labels], images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)
for i, label in enumerate(labels):
    print(f"{label}: {probs[0][i].item():.2%}")

Image Generation (Stable Diffusion)

from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# pipeline.to("cuda")  # optional GPU
image = pipeline("A futuristic city at sunset", num_inference_steps=100).images[0]
image.save("output.png")

All the above examples demonstrate how to combine OpenAI’s language models with external tools, LangChain orchestration, and multimodal APIs to build robust AI‑driven applications.

Content summarized from OpenAI official documentation and the course "AI 大模型之美".

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Prompt Engineering LangChain fine-tuning Embedding API OpenAI Multimodal

Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.