How to Combine Pandas with ChatGPT Using PandasAI for Smart Data Analysis
This guide shows how to install PandasAI, connect it to OpenAI's language model, and use natural‑language prompts to query and visualize Pandas DataFrames, including examples with sample data, database connections, and custom aggregations.
This article describes a new method of using Pandas together with ChatGPT to process data.
Pandas is an open‑source Python library for data manipulation and analysis, built around two data structures for structured data: Series and DataFrames.
In the AI field, Pandas is often used in the preprocessing steps of machine learning and deep learning pipelines: cleaning, reshaping, merging, and aggregating data into two‑dimensional tables that AI algorithms can consume directly.
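As a rough illustration of those preprocessing steps, the sketch below cleans, merges, and aggregates two small tables. The `sales` and `regions` tables and their values are invented for this example and do not come from the article.

```python
import pandas as pd

# Hypothetical raw inputs: one table with a missing value, one lookup table.
sales = pd.DataFrame({
    "country": ["France", "France", "Japan", "Japan"],
    "revenue": [100.0, None, 250.0, 50.0],
})
regions = pd.DataFrame({
    "country": ["France", "Japan"],
    "region": ["Europe", "Asia"],
})

# Clean: drop rows with a missing revenue value.
clean = sales.dropna(subset=["revenue"])

# Merge: attach the region to each remaining row.
merged = clean.merge(regions, on="country", how="left")

# Aggregate: total revenue per region, as a flat two-column table.
summary = merged.groupby("region", as_index=False)["revenue"].sum()
print(summary)
```

The result is a plain two‑dimensional table with one row per region, which is the kind of shape most modeling code expects as input.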
Install PandasAI
pip install pandasai
Import OpenAI and PandasAI
First import the installed pandasai library and the large language model (LLM) functionality. As of May 2023, PandasAI only supports OpenAI models.
import pandas as pd
from pandasai import PandasAI
# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})
# Instantiate an LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI(api_token="your_API_key")
pandas_ai = PandasAI(llm)
result = pandas_ai.run(df, prompt='Which are the 5 happiest countries?')
print(result)
To use the OpenAI API you need to generate your own API key.
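One common way to avoid pasting the key into source code is to read it from an environment variable. The sketch below assumes the key is stored in a variable named `OPENAI_API_KEY`; both that name and the helper `load_openai_key` are choices made for this example, not part of PandasAI.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


# Usage with the objects from the example above (not executed here):
# llm = OpenAI(api_token=load_openai_key())
```

Keeping the key out of the script also makes it safer to share the code or commit it to version control.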
Because of Pandas' flexibility, you can also connect to relational databases such as PostgreSQL:
# Build the connection URI (passing a URI string to pandas requires SQLAlchemy
# and a PostgreSQL driver such as psycopg2 to be installed)
pg_conn = "postgresql://YOUR_URI_HERE"
# SQL query to run against the database
query = """
SELECT *
FROM table_name
"""
# Load the query result into a DataFrame named df
df = pd.read_sql(query, pg_conn)
After loading data, you can interact with it via natural‑language prompts:
# Using pandas‑ai!
pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Place your prompt here')
You can also ask more complex questions, such as the sum of the GDPs of the two least happy countries:
pandas_ai.run(df, prompt='What is the sum of the GDPs of the 2 unhappiest countries?')
The above returns a numeric result, for example: 19012600725504
PandasAI can even generate plots:
pandas_ai.run(
    df,
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar"
)
Conclusion
ChatGPT and Pandas are powerful tools, and their combination can fundamentally change how we interact with and analyze data. ChatGPT’s advanced natural‑language processing enables intuitive, human‑like interaction with data, while PandasAI turns natural‑language queries into executable code, making data insights accessible without extensive programming.
This approach is especially useful for users unfamiliar with Python or Pandas transformations, allowing them to simply describe the desired outcome and let the AI generate the necessary code.
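For comparison, the two questions asked earlier can also be answered with a few lines of hand‑written Pandas, which is a handy way to sanity‑check what the model returns. This is only a sketch of one possible equivalent; it is not the code PandasAI actually generates, and the variable names `top5` and `gdp_sum` are chosen for this example.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})

# "Which are the 5 happiest countries?"
top5 = df.nlargest(5, "happiness_index")["country"].tolist()
print(top5)
# ['Canada', 'Australia', 'United Kingdom', 'Germany', 'United States']

# "What is the sum of the GDPs of the 2 unhappiest countries?"
gdp_sum = int(df.nsmallest(2, "happiness_index")["gdp"].sum())
print(gdp_sum)
# 19012600725504
```

The GDP total matches the figure quoted above, since the two lowest happiness scores in the sample belong to China and Japan.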
21CTO
