
Designing Generative AI Agents: Models, Tools, Extensions, Function Calls, and Data Storage

The article explains how generative AI agents combine language models, tool integration, self‑guided planning, prompt‑engineering frameworks, extensions, function calls, and vector‑based data storage to create adaptable, retrieval‑augmented systems that can interact with real‑world APIs and perform complex tasks.

DevOps

Jan 8, 2025


Humans excel at complex pattern‑recognition tasks and often rely on external tools such as books, search engines, or calculators to augment their prior knowledge before reaching conclusions.

Similarly, generative AI models can be trained to use tools that provide real‑time information or suggest real‑world actions, such as retrieving a customer's purchase history from a database to generate personalized recommendations or invoking APIs to send emails or execute financial transactions.

For this capability, a model must not only have access to a set of external tools but also be able to plan and execute tasks autonomously. This combination of reasoning, logic, and access to external information gives rise to the concept of an agent: a program that extends beyond the core generative model.

1. Model

In the context of agents, the model refers to the language model (LM) that serves as the central decision-maker. An agent may employ one or multiple LMs of any size, provided they can follow instruction-based reasoning frameworks such as ReAct, Chain-of-Thought (CoT), or Tree-of-Thoughts (ToT). Models can be general-purpose, multimodal, or fine-tuned for a specific agent architecture. Choose the model that best fits the target application and, ideally, one that has been trained on data reflecting the tools the agent will use.

2. Tools

Tools come in various forms and complexities, often aligning with common web API methods (GET, POST, PATCH, DELETE). They enable agents to update databases, fetch weather data, or retrieve other real‑world information, thereby supporting advanced systems such as Retrieval‑Augmented Generation (RAG) that extend the agent’s capabilities beyond the base model.
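As a rough illustration of the idea above, the sketch below registers two tools and tags each with the web API method it would correspond to. The `Tool` class, the tool names, and the in-memory data are all hypothetical stand-ins; a real agent would call actual services instead of dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    description: str   # text a model could read when deciding which tool to use
    http_method: str   # the web API verb this tool would map to
    handler: Callable[..., Any]


# In-memory stand-ins for real services (hypothetical data).
_WEATHER: Dict[str, Dict[str, Any]] = {"Berlin": {"temp_c": 4}}
_ORDERS: Dict[str, Dict[str, Any]] = {"o-1": {"status": "pending"}}


def get_weather(city: str) -> Dict[str, Any]:
    """Read-only lookup, analogous to a GET request."""
    return _WEATHER.get(city, {})


def update_order(order_id: str, status: str) -> Dict[str, Any]:
    """Partial update of an existing record, analogous to a PATCH request."""
    _ORDERS[order_id]["status"] = status
    return _ORDERS[order_id]


TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in [
        Tool("get_weather", "Fetch current weather for a city.", "GET", get_weather),
        Tool("update_order", "Change the status of an order.", "PATCH", update_order),
    ]
}


def call_tool(name: str, **kwargs: Any) -> Any:
    """Look up a tool by name and run it with the given arguments."""
    return TOOLS[name].handler(**kwargs)
```

For example, `call_tool("get_weather", city="Berlin")` reads from the stubbed weather data, while `call_tool("update_order", order_id="o-1", status="shipped")` mutates the stubbed order record.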

3. Agent vs. Model Differences

| Model | Agent |
| --- | --- |
| Knowledge limited to training data. | Extends knowledge by connecting to external systems via tools. |
| Performs a single inference per user query; no built-in conversation history. | Manages conversation history and can perform multi-turn reasoning based on orchestrated decisions. |
| No native tool implementation. | Tools are native components of the agent architecture. |
| No native logical layer; prompts are simple or use reasoning frameworks. | Uses native cognitive architectures built on reasoning frameworks like CoT or ReAct, or on agent frameworks such as LangChain. |

4. Common Prompt‑Engineering Frameworks

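ReAct provides a thought-process strategy in which a language model alternates between reasoning about a query and taking actions, using the result of each action to inform the next step. ReAct prompting is reported to improve both benchmark performance and human-AI interaction. CoT elicits intermediate reasoning steps before the final answer, with variants such as self-consistency, active prompting, and multimodal CoT. Tree-of-Thoughts (ToT) generalizes CoT by letting the model explore several candidate reasoning paths, which suits exploratory or strategic planning tasks.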
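The reason-then-act cycle of ReAct can be sketched as a small loop. This is a minimal sketch under stated assumptions: `scripted_model` is a hard-coded stand-in for a real language model call, the `Action: tool[input]` text format is one common convention rather than a fixed standard, and the toy `calculator` tool only handles integer expressions of the form `a op b`.

```python
import re
from typing import Callable, Dict


def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a op b' for integers and the operators + - *."""
    a, op, b = expression.split()
    x, y = int(a), int(b)
    return str({"+": x + y, "-": x - y, "*": x * y}[op])


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def scripted_model(transcript: str) -> str:
    """Stand-in for an LM call (hypothetical): acts first, then answers."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply.\nAction: calculator[6 * 7]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")


def react(question: str, model: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate model replies and tool observations until a final answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(reply)
        if match is None:
            break  # the model neither acted nor answered; stop early
        tool_name, tool_input = match.groups()
        observation = TOOLS[tool_name](tool_input)
        # Feed the tool result back so the next reasoning step can use it.
        transcript += f"\nObservation: {observation}"
    return "No answer within step budget."
```

The step budget (`max_steps`) is a design choice to keep a misbehaving model from looping indefinitely; swapping `scripted_model` for a real model client would leave the loop itself unchanged.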

5. Extensions (Custom Plugins)

Extensions act as standardized bridges between APIs and agents, allowing seamless execution of API calls without exposing the underlying implementation. They enable agents to learn from examples how to invoke specific endpoints and which parameters are required, supporting dynamic selection of the most appropriate extension for a given user query.
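One way to picture an extension is as a record that bundles a description, example queries, required parameters, and the agent-side call. The sketch below is illustrative only: the `Extension` class, the two stubbed APIs, and especially the word-overlap scoring in `select_extension` are assumptions. In a real agent the model itself chooses the extension, guided by the examples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Extension:
    name: str
    description: str
    examples: List[str]         # sample queries showing when to use it
    required_params: List[str]
    run: Callable[..., Dict]    # executed agent-side; API details stay hidden


def _flights_api(origin: str, destination: str) -> Dict:
    return {"flights": [f"{origin}->{destination} 09:00"]}  # stubbed response


def _weather_api(city: str) -> Dict:
    return {"city": city, "forecast": "sunny"}  # stubbed response


EXTENSIONS = [
    Extension("flights", "Search for flights between two cities.",
              ["flights from Paris to Rome"], ["origin", "destination"], _flights_api),
    Extension("weather", "Get the weather forecast for a city.",
              ["weather in Oslo tomorrow"], ["city"], _weather_api),
]


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def select_extension(query: str) -> Extension:
    """Naive stand-in for model-driven selection: most shared words wins."""
    query_tokens = _tokens(query)

    def score(ext: Extension) -> int:
        return len(query_tokens & _tokens(ext.description + " " + " ".join(ext.examples)))

    return max(EXTENSIONS, key=score)


def invoke(ext: Extension, **params: str) -> Dict:
    """Check required parameters, then run the extension."""
    missing = [p for p in ext.required_params if p not in params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    return ext.run(**params)
```

Here `select_extension("find flights from Austin to Zurich")` picks the flights extension, and `invoke` refuses to run it unless both `origin` and `destination` are supplied.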

6. Function Calls

Function calls resemble extensions but differ in where execution happens: the model outputs a function name and its arguments, and the client-side application, not the agent, makes the actual call. This approach is useful when direct API access is restricted, when additional data-transformation logic is needed, or when developers want to stub APIs during iterative development.
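The client-side half of a function call might look like the following sketch. It assumes the model emits its choice as JSON with `name` and `args` keys, which is a common but not universal shape; `fake_model_output` is a hard-coded stand-in for the model, and `display_cities` is a hypothetical client function.

```python
import json
from typing import Any, Callable, Dict, List


def fake_model_output(user_query: str) -> str:
    """Stand-in for the model: it only names a function and its arguments."""
    return json.dumps({
        "name": "display_cities",
        "args": {"cities": ["Vail", "Aspen"], "preferences": "skiing"},
    })


def display_cities(cities: List[str], preferences: str) -> str:
    """Client-side logic; a real app could call a private API or a stub here."""
    return f"{preferences}: " + ", ".join(sorted(cities))


# Only functions listed here can be triggered by model output.
CLIENT_FUNCTIONS: Dict[str, Callable[..., Any]] = {"display_cities": display_cities}


def execute_function_call(raw: str) -> Any:
    """Parse the model's output and run the named function on the client."""
    call = json.loads(raw)
    fn = CLIENT_FUNCTIONS.get(call["name"])
    if fn is None:
        raise KeyError(f"unknown function: {call['name']}")
    return fn(**call["args"])
```

Because the model never executes anything itself, the client keeps control over credentials, ordering, and any transformation applied before or after the real API call.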

7. Data Storage

Data storage lets developers provide raw documents to an agent. The documents are converted into embeddings and stored in a vector database. At query time the agent retrieves the most relevant entries and uses them to inform subsequent actions or responses, enabling Retrieval-Augmented Generation without costly re-training or fine-tuning.
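The retrieval step can be sketched with a toy in-memory store. This is a simplification under loud assumptions: `embed` uses plain word counts instead of a learned embedding model, `VectorStore` is a hypothetical class rather than any real vector database, and `build_prompt` shows only one simple way to pass retrieved text to a model.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, document: str) -> None:
        self._items.append((document, embed(document)))

    def search(self, query: str, k: int = 1) -> List[str]:
        """Return the k documents most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str, store: VectorStore) -> str:
    """Prepend the best-matching document so the model can ground its answer."""
    context = "\n".join(store.search(question, k=1))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Adding a document only requires computing its embedding, which is why this pattern avoids re-training: the model's weights never change, only the text placed in its prompt.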

Summary Comparison

| | Extension | Function Call | Data Storage |
| --- | --- | --- | --- |
| Execution | Agent-side execution | Client-side execution | Agent-side execution |
| Use Cases | Developers want agents to control API interactions; useful for multi-hop planning and local pre-built extensions (e.g., Vertex Search, code interpreter). | Security or authentication limits prevent direct API calls; time-ordering constraints; APIs not publicly exposed. | Developers need RAG with website content, PDFs, Word, CSV, spreadsheets, or unstructured data formats. |

