Fundamentals 5 min read

Master Pandas query: Simple Filters, Complex Conditions, and Math Operations

Learn how to use pandas' query() function to filter DataFrames with single or multiple conditions, combine logical operators, handle string queries, and perform inline arithmetic and built‑in functions, illustrated with a synthetic dataset and step‑by‑step code examples.

Model Perspective
Model Perspective
Model Perspective
Master Pandas query: Simple Filters, Complex Conditions, and Math Operations

Pandas' query() function provides a concise way to filter DataFrames, especially when many conditions are needed.

Creating a Sample Dataset

<code>import numpy as np
import pandas as pd
np.random.seed(1)

n = 20
ID = np.arange(1, n+1)
SCORE = np.random.normal(80, 10, n).astype('int')
SEX = ['male', 'famele'] * 10
AGE = np.random.randint(15, 17, n)
df = pd.DataFrame({'ID': ID, 'SEX': SEX, 'AGE': AGE, 'SCORE': SCORE})
</code>

The resulting DataFrame contains columns ID , SEX , AGE , and SCORE with 20 rows of mixed values.

Filtering with a Single Condition

When the filter contains only one condition, place the expression directly inside query() . For example, to select rows where SCORE is at least 90:

<code>df.query('SCORE>=90')
</code>

This returns the rows whose SCORE values are 90 or higher.

Filtering with Multiple Conditions

Combine conditions using logical operators and or or . Example: rows where SCORE > 80 and AGE > 15:

<code>df.query("SCORE>80 and AGE>15")
</code>

The query yields all records satisfying both criteria.

For string comparisons, wrap the whole query in double quotes and the string value in single quotes. Example: select male samples with SCORE > 80 and AGE > 15:

<code>df.query("SCORE>80 and AGE>15 and SEX=='male'")
</code>

This returns only the male rows that meet the numeric conditions.

Simple Mathematical Calculations in Queries

Arithmetic operations can be used directly in the query expression. Example: select rows where SCORE**2 + 100 is less than 4900:

<code>df.query("SCORE**2+100<4900")
</code>

The result includes rows whose squared score plus 100 stays below the threshold.

Python built‑in functions such as abs , sqrt , or exp can also be used. To find rows with SCORE between 50 and 70:

<code>df.query("abs(SCORE-60)<10")
</code>

This returns records where the score deviates from 60 by less than 10.

Reference: https://mp.weixin.qq.com/s/oi9y0NKuwyNQy4t23hvPIw

Pythondata analysisdataframepandasfilteringQuery
Model Perspective
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.