Comprehensive Guide to Using OpenAI APIs: Models, Prompts, Embeddings, Fine‑Tuning, LangChain, and Multimodal Applications
This article provides a detailed, step‑by‑step tutorial on OpenAI’s language models, API endpoints, prompt engineering, embeddings, moderation, fine‑tuning, LangChain workflows, memory management, and multimodal capabilities such as audio transcription and image generation, complete with code examples and practical usage tips.
Introduction
The author revisits basic machine‑learning concepts and explains the motivation for exploring OpenAI’s APIs, then presents a concise comparison of major models (BERT, GPT, T5, ChatGPT, InstructGPT) highlighting their architectures and use‑cases.
Getting Started with Tools
GitHub Copilot : code generation, usage limits, installation steps for VS Code and JetBrains.
SciSpace : PDF summarisation.
Glarity : video/audio summarisation.
NotionAI : article polishing and rewriting.
Preparation
OpenAI account creation (Google access, email, virtual phone number).
Python environment setup (Python, IDE, package manager, optional CUDA).
API Overview
Prompt Design
Effective prompts require clear instructions, high‑quality examples, and sometimes role‑playing. Example formats are shown using 人名a:地名a and a Q&A template.
Basic Endpoints
Text Completion – openai.Completion.create(...) Chat Completion –
openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages, temperature=0.5, max_tokens=2048, n=3, ...)Embedding –
openai.Embedding.create(input="Your text", model="text-embedding-ada-002")Moderation –
openai.Moderation.create(input=text)Key Parameters
engine(e.g., text-davinci-003 or gpt-3.5-turbo) max_tokens, temperature, top_p, presence_penalty, frequency_penalty,
logit_biasReducing Hallucinations
Provide reliable sources and few‑shot examples.
Set temperature=0 for deterministic answers.
Teach the model to answer "I don’t know" when uncertain.
Embedding Applications
Embeddings (model text-embedding-ada-002) enable similarity search, clustering, recommendation, outlier detection, and zero‑shot classification. Cosine similarity is recommended for distance calculation.
Moderation
The moderation endpoint checks content against OpenAI policy and returns a flagged boolean, category list, and confidence scores.
Fine‑Tuning
Fine‑tuning steps:
Prepare a JSONL training file (prompt + completion).
Upload and start fine‑tuning with
openai api fine_tunes.create --training_file data.jsonl --model curie --suffix "myproject".
Use the fine‑tuned model via its name.
Key hyper‑parameters include model, n_epochs, batch_size, learning_rate_multiplier, and optional compute_classification_metrics.
LlamaIndex (Second Brain)
Build a vector store from documents and query it:
import openai, os
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader
openai.api_key = os.getenv("OPENAI_API_KEY")
documents = SimpleDirectoryReader('path/to/docs').load_data()
index = GPTVectorStoreIndex.from_documents(documents)
index.save_to_disk('index.json')
# Load and query
index = GPTVectorStoreIndex.load_from_disk('index.json')
response = index.query("Your question")
print(response)LangChain Workflows
LLMChain & SimpleSequentialChain
Define prompts, create chains, and chain them together for translation, answering, and back‑translation:
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain, SimpleSequentialChain
en_to_zh_prompt = PromptTemplate(template="请把下面这句话翻译成英文:
{question}?", input_variables=["question"])
question_prompt = PromptTemplate(template="{english_question}", input_variables=["english_question"])
zh_to_cn_prompt = PromptTemplate(template="请把下面这一段翻译成中文:
{english_answer}?", input_variables=["english_answer"])
llm = OpenAI(model_name="text-davinci-003", temperature=0.5, max_tokens=2048)
question_chain = LLMChain(llm=llm, prompt=en_to_zh_prompt, output_key="english_question")
qa_chain = LLMChain(llm=llm, prompt=question_prompt, output_key="english_answer")
answer_chain = LLMChain(llm=llm, prompt=zh_to_cn_prompt)
chinese_qa = SimpleSequentialChain(chains=[question_chain, qa_chain, answer_chain], input_key="question", verbose=True)
result = chinese_qa.run(question="如何使用OpenAI的ChatCompletion?")
print(result)Memory Management
Two main memory strategies:
ConversationBufferWindowMemory – keeps the last *k* turns.
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(memory_key="chat_history", k=3)ConversationSummaryMemory – summarizes older turns.
from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(llm=OpenAI())Application Examples
Sentiment Analysis with Embeddings
import openai, os
from openai.embeddings_utils import cosine_similarity, get_embedding
openai.api_key = os.getenv("OPENAI_API_KEY")
model = "text-embedding-ada-002"
pos = get_embedding("好评", engine=model)
neg = get_embedding("差评", engine=model)
sample = get_embedding("这款产品真的很棒!", engine=model)
score = cosine_similarity(sample, pos) - cosine_similarity(sample, neg)
print(score)Customer Service Bot (ChatCompletion + Tools)
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
def search_order(query):
return "Order status ..."
def recommend_product(query):
return "Recommended items ..."
def faq(query):
return "Policy information ..."
tools = [
Tool(name="Search Order", func=search_order, description="Answer questions about order status"),
Tool(name="Recommend Product", func=recommend_product, description="Suggest products"),
Tool(name="FAQ", func=faq, description="Handle policy questions")
]
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
answer = agent.run("我的订单号是2023ABCD,什么时候能到?")
print(answer)Multimodal Extensions
Audio Transcription (Whisper)
import openai, os
openai.api_key = os.getenv("OPENAI_API_KEY")
audio = open("speech.mp3", "rb")
result = openai.Audio.transcribe("whisper-1", audio)
print(result["text"])Speech Synthesis (Azure)
import os, azure.cognitiveservices.speech as speechsdk
config = speechsdk.SpeechConfig(subscription=os.getenv('AZURE_SPEECH_KEY'), region=os.getenv('AZURE_SPEECH_REGION'))
config.speech_synthesis_language = 'zh-CN'
config.speech_synthesis_voice_name = 'zh-CN-XiaoxiaoNeural'
audio_cfg = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
synthesizer = speechsdk.SpeechSynthesizer(speech_config=config, audio_config=audio_cfg)
synthesizer.speak_text_async("欢迎使用语音合成").get()Image Classification (CLIP)
from PIL import Image
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open("cat.jpg")
labels = ["one cat", "two cats", "three cats"]
inputs = processor(text=[f"a photo of {l}" for l in labels], images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)
for i, label in enumerate(labels):
print(f"{label}: {probs[0][i].item():.2%}")Image Generation (Stable Diffusion)
from diffusers import DiffusionPipeline
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# pipeline.to("cuda") # optional GPU
image = pipeline("A futuristic city at sunset", num_inference_steps=100).images[0]
image.save("output.png")All the above examples demonstrate how to combine OpenAI’s language models with external tools, LangChain orchestration, and multimodal APIs to build robust AI‑driven applications.
Content summarized from OpenAI official documentation and the course "AI 大模型之美".
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Rare Earth Juejin Tech Community
Juejin, a tech community that helps developers grow.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
