Artificial Intelligence · 15 min read

Building a Custom LLM Chatbot with LangChain, ChromaDB, and LLaMA‑2

This tutorial explains how to leverage generative AI tools—including LLMs, embedding models, vector databases, and the LangChain framework—to create a custom chatbot that answers user queries using a knowledge base, with step‑by‑step code examples for Google Colab.


Since the release of ChatGPT, generative AI has rapidly advanced, offering many open‑source models and tools; this article demonstrates how to use these resources to build a custom chatbot powered by a large language model (LLM).

Generative AI differs from predictive AI by creating new content such as text, images, or audio. The focus here is on text generation driven by LLMs such as OpenAI GPT‑4, Meta LLaMA‑2, Google PaLM, and Anthropic Claude 2.

LLMs are deep‑learning models trained on massive text corpora; they can be adapted to specific tasks via fine‑tuning or, more simply, through context injection (prompt engineering) without modifying the model weights.

Context injection typically follows these steps: collect structured or unstructured data, load and split the data into text chunks, embed the chunks into vectors, store the vectors in a vector database (e.g., ChromaDB), retrieve the most similar chunks for a user query, and combine the retrieved context with a prompt template before sending it to the LLM.
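The retrieval steps above can be sketched in miniature. This is a toy illustration only: real pipelines use an embedding model and a vector database such as ChromaDB, whereas here tiny hand-made vectors and a brute-force cosine search stand in for both, and the chunk texts and vector values are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# 1) "Embedded" knowledge-base chunks (vectors are made up for illustration)
chunks = [
    ("We sell floral, woody, and citrus perfumes.", [0.9, 0.1, 0.0]),
    ("Shipping takes 3-5 business days.",           [0.1, 0.9, 0.2]),
]

# 2) "Embedded" user query
query_vector = [0.8, 0.2, 0.1]

# 3) Retrieve the most similar chunk
best_chunk, _ = max(chunks, key=lambda c: cosine_similarity(c[1], query_vector))

# 4) Inject the retrieved context into a prompt template
prompt = f"Context: {best_chunk}\nQuestion: What perfumes do you sell?\nAnswer:"
```

The same four moves — embed, store, retrieve by similarity, inject into the prompt — are what the LangChain components later in this article perform at scale.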

Embedding models convert tokens into high‑dimensional vectors; the article uses OpenAI’s text‑embedding‑ada‑002 (1536‑dimensional) and stores the vectors in ChromaDB.

LangChain is a framework that orchestrates LLMs, document loaders, splitters, embeddings, vector stores, and prompt templates. The tutorial uses LangChain’s CSVLoader, CharacterTextSplitter, OpenAIEmbeddings, Chroma, PromptTemplate, and RetrievalQA components.

In Google Colab, the required packages are installed, the necessary libraries are imported, and the Hugging Face LLaMA‑2‑7B‑chat‑hf model is loaded via a Transformers pipeline. The pipeline is wrapped with HuggingFacePipeline to create an LLM object.

A prompt template is defined to make the LLM act as a customer‑service chatbot for an online perfume company, and the template is passed to a RetrievalQA chain that connects the LLM, the Chroma vector store, and the prompt.

Finally, a sample query (“What types of perfumes do you sell?”) is run through the chain, and the response demonstrates how the system can provide concise, human‑like answers based on the custom knowledge base.

The tutorial concludes that even users with basic programming experience can build functional LLM applications by following these steps.

!pip install -q transformers einops accelerate langchain bitsandbytes
!pip install -qqq openai
!pip install -Uqqq chromadb

import os
import textwrap

import langchain
import chromadb
import transformers
import openai
import torch
from transformers import AutoTokenizer
from langchain import HuggingFacePipeline
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

!huggingface-cli login

os.environ["OPENAI_API_KEY"] = "INSERT_YOUR_API_KEY"

# Set up HuggingFace Pipeline with Llama-2-7b-chat-hf model
model = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",  # task
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    max_length=1000,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)

# LLM initialized in HuggingFace Pipeline wrapper
llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={'temperature': 0})

# Load documents locally as CSV
loader = CSVLoader('YOUR_CSV_FILE_PATH')
docs = loader.load()
docs[0]
# Output:
# Document(page_content='...Question: ...', metadata={'source': '/content/sample_data/Fragrances-Dataset.csv', 'row': 0})

# Split documents into text chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(docs)
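To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal fixed-size chunker. It only illustrates the sliding-window idea; LangChain's actual CharacterTextSplitter additionally prefers to split on a separator such as "\n\n" rather than cutting mid-word, and `split_text` here is a hypothetical helper, not a LangChain function.

```python
def split_text(text, chunk_size, chunk_overlap):
    """Slide a chunk_size-character window over text, stepping forward by
    chunk_size - chunk_overlap so consecutive chunks share chunk_overlap chars."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 2500 characters with chunk_size=1000 and no overlap -> three chunks
# of lengths 1000, 1000, and 500
chunks = split_text("a" * 2500, chunk_size=1000, chunk_overlap=0)
```

Overlap trades storage for context: with `chunk_overlap=200`, a sentence cut at a chunk boundary would still appear whole in the neighboring chunk.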

# Initialize the embedding function (defaults to OpenAI's text-embedding-ada-002)
embedding_function = OpenAIEmbeddings()

# Load the chunks into ChromaDB
db = Chroma.from_documents(docs, embedding_function)

# Design prompt template. Note: the RetrievalQA "stuff" chain fills both
# {context} (retrieved chunks) and {question} (the user query), so the
# template must declare both variables.
template = """You are a customer service chatbot for an online perfume company called Fragrances International.

{context}

Answer the customer's questions only using the source data provided.
If you are unsure, say "I don't know, please call our customer support".
Use engaging, courteous, and professional language similar to a customer representative.
Keep your answers concise.

Question: {question}
Answer:"""

# Initialize prompt using PromptTemplate via LangChain
prompt = PromptTemplate(template=template, input_variables=["context", "question"])
print(prompt.format(
    context="A customer is on the perfume company website and wants to chat with the website chatbot.",
    question="What types of perfumes do you sell?",
))

# Chain to tie all components together and query the LLM
chain_type_kwargs = {"prompt": prompt}
chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 1}),
    chain_type_kwargs=chain_type_kwargs,
)
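The `chain_type="stuff"` setting means the retrieved documents are simply concatenated ("stuffed") into the prompt's {context} slot for a single LLM call. The sketch below shows that idea with a hypothetical helper; the names are illustrative, not LangChain internals.

```python
def stuff_documents(docs, template, question, separator="\n\n"):
    """Concatenate retrieved document texts and fill the prompt template,
    mimicking what a 'stuff' chain does before calling the LLM."""
    context = separator.join(docs)
    return template.format(context=context, question=question)

template = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
filled = stuff_documents(
    ["We sell floral perfumes.", "We also sell woody perfumes."],
    template,
    "What types of perfumes do you sell?",
)
```

Because everything is packed into one prompt, "stuff" only works while the retrieved chunks fit in the model's context window, which is why the retriever above is limited to `k=1`.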

# Formatted printing
def print_response(response: str):
    print("\n".join(textwrap.wrap(response, width=80)))

# Running chain through LLM with query
query = "What types of perfumes do you sell?"
response = chain.run(query)
print_response(response)

Tags: Python · LLM · LangChain · Vector Database · embedding · chatbot · generative AI
Written by Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.