Master Pandas Basics: From DataFrames to Quick Data Insights
This tutorial introduces Pandas fundamentals, covering installation, DataFrame and Series concepts, creating, reading, and storing data, quick inspection methods, essential column operations, and handling of common data types such as strings, numerics, and dates.
01 Important Preface
This is the first article in a practical Python data analysis series, and it offers a first look at Pandas. Beginners often pick up Python syntax quickly, work through the classic book "Python for Data Analysis", and then feel lost when they try to apply that knowledge to real tasks.
The main obstacles are shallow understanding and too little practice. A related trap, "three methods, one confusion", is collecting many solutions to the same problem without practicing any one of them in depth.
02 Pandas Introduction
Pandas is a professional data analysis tool built on NumPy, offering flexible and efficient handling of various datasets. It provides two core data structures: DataFrame (similar to an Excel sheet) and Series (a single column of the sheet).
Before processing data, it is crucial to plan the analysis, clarify its purpose, and outline the workflow.
03 Create, Read, and Store
First, import the library:
    import pandas as pd
Creation: The most common way to build a DataFrame is from a dictionary of lists.
    df = pd.DataFrame({"Column1": [value1, value2], "Column2": [value3, value4]})
Reading: Load data from CSV or Excel files.
    df_csv = pd.read_csv("file.csv", engine="python")
    df_excel = pd.read_excel("file.xlsx")
Storing: Save DataFrames back to CSV or Excel.
    df.to_csv("output.csv", index=False)
    df.to_excel("output.xlsx", index=False)
04 Quick Data Inspection
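Use df.head() and df.tail() to view the first and last rows (five by default). df.info() lists each column's data type and non-null count, which makes missing values easy to spot. df.describe() provides statistical summaries (count, mean, standard deviation, min, quartiles, max) for numeric columns.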
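The create, store, read, and inspect steps can be sketched end to end as follows. This is a minimal sketch, not the author's own example: the column names, the sample values, and the temporary file location are all assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "City": ["Beijing", "Shanghai", "Shenzhen"],
    "Visitors": [1200, 950, 700],
})

# Store to CSV and read it back, using a temporary directory so that
# nothing is left on disk afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "output.csv")
    df.to_csv(csv_path, index=False)   # index=False skips the row labels
    df_csv = pd.read_csv(csv_path)

# Quick inspection of the reloaded data.
print(df_csv.head(2))      # first two rows
print(df_csv.tail(1))      # last row
df_csv.info()              # dtypes and non-null counts (prints directly)
print(df_csv.describe())   # summary statistics for the numeric column
```

Writing with index=False keeps the row labels out of the file, so the reloaded DataFrame has the same two columns as the original.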
05 Basic Column Operations
Add: df["NewColumn"] = new_values
Delete: df.drop("ColumnName", axis=1, inplace=True)
Select: df["ColumnName"] for a single column; df[["Col1", "Col2"]] for multiple columns
Modify: df["ExistingColumn"] = updated_values
06 Common Data Types and Operations
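Before turning to the individual data types, the four column operations from section 05 can be tried on a small DataFrame. This is a rough sketch under assumed data: the column names and values are hypothetical, and the delete step reassigns the result of drop rather than using inplace=True, which is a stylistic choice and not something the original prescribes.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Price": [10.0, 20.0, 30.0],
    "Note": ["x", "y", "z"],
})

# Add: assign a list (or Series) whose length matches the row count.
df["Quantity"] = [3, 1, 2]

# Delete: drop returns a new DataFrame; reassigning it has the same
# effect on df as passing inplace=True.
df = df.drop("Note", axis=1)

# Select: single brackets give a Series, double brackets a DataFrame.
prices = df["Price"]
subset = df[["Product", "Price"]]

# Modify: assigning to an existing column name overwrites that column.
df["Price"] = df["Price"] * 2

print(df)
```

Selecting with a list of names always returns a DataFrame, even when the list holds a single column, which is useful when later code expects two-dimensional data.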
String: Operate on text columns through the .str accessor, e.g., df["col"].str.replace("-", "") removes every hyphen from each value.
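A small sketch of the .str accessor in use, assuming a hypothetical column of hyphenated strings with one missing value. Passing regex=False is an addition here: it makes the replacement a literal match regardless of the pandas version's default.

```python
import pandas as pd

# Hypothetical text column; the None shows how missing values behave.
df = pd.DataFrame({"col": ["2019-01-05", "2019-12-31", None]})

# Remove hyphens from every value; missing values stay missing.
df["col_clean"] = df["col"].str.replace("-", "", regex=False)

# Other .str methods work element-wise in the same way.
df["col_len"] = df["col_clean"].str.len()

print(df)
```

The .str methods skip missing values instead of raising an error, so no separate null check is needed before cleaning.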
Numeric: Perform arithmetic directly, e.g., df["Visitors"] + 10000, or calculate sales:
    df["Sales"] = df["Visitors"] * df["ConversionRate"] * df["AvgOrderValue"]
Be aware of type mismatches; convert percentage strings to floats before calculations.
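The percentage conversion mentioned above might look like the following. This sketch assumes the conversion rate arrives as strings such as "5%"; that format, along with the sample values, is an assumption and not something the original specifies.

```python
import pandas as pd

# Hypothetical data: the conversion rate is stored as text like "5%".
df = pd.DataFrame({
    "Visitors": [1000, 2000],
    "ConversionRate": ["5%", "2.5%"],
    "AvgOrderValue": [80.0, 120.0],
})

# Strip the trailing percent sign, cast to float, and scale to a fraction
# ("5%" becomes 0.05). Multiplying the raw strings would not give numbers.
df["ConversionRate"] = df["ConversionRate"].str.rstrip("%").astype(float) / 100

# With every column numeric, the sales formula works element-wise.
df["Sales"] = df["Visitors"] * df["ConversionRate"] * df["AvgOrderValue"]

print(df)
```

Checking df.dtypes before and after the conversion is a quick way to confirm that the column is now numeric.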
Time: Convert string dates to datetime objects:
    df["Date"] = pd.to_datetime(df["DateString"])
After conversion, date arithmetic is straightforward, such as computing days until year-end:
    (pd.Timestamp("2019-12-31") - df["Date"]).dt.days
Summary
Understand what Pandas is and its core data structures.
Learn how to create, read, and store DataFrames.
Quickly inspect data with head, tail, info, and describe.
Perform basic column operations: add, delete, select, and modify.
Handle common data types: strings, numerics, and dates.
Each step is designed to be concise and practical, preparing you for more advanced case studies.
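As a closing sketch, the date handling from section 06 can be run end to end on a few sample rows. The date strings below are hypothetical, and the explicit format argument is an addition that assumes ISO-style input.

```python
import pandas as pd

# Hypothetical date strings in ISO format.
df = pd.DataFrame({"DateString": ["2019-12-01", "2019-12-25", "2019-12-31"]})

# Parse the strings into datetime values.
df["Date"] = pd.to_datetime(df["DateString"], format="%Y-%m-%d")

# Subtracting a Series from a Timestamp yields timedeltas;
# .dt.days extracts the whole-day component as integers.
year_end = pd.Timestamp("2019-12-31")
df["DaysToYearEnd"] = (year_end - df["Date"]).dt.days

print(df)
```

Dates later than the reference Timestamp would produce negative day counts, so it is worth checking the range of the data first.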
Python Crawling & Data Mining
