Why Do Data Analysis? 10 Practical Python Data Analysis Scenarios with Code Examples
The article explains the importance of data analysis for business insight, problem detection, decision support, operational optimization, forecasting, and competitiveness, and then presents ten practical Python code scenarios covering data loading, cleaning, filtering, aggregation, visualization, statistics, transformation, time‑series analysis, export, and machine‑learning applications.
Data analysis plays a crucial role in modern society by providing insights into business performance, uncovering problems and opportunities, supporting evidence‑based decision making, optimizing operations, enabling forecasting and planning, and enhancing competitiveness.
The following ten practical Python scenarios demonstrate how to apply data‑analysis techniques using pandas, matplotlib, and scikit‑learn.
1. Data Reading and Inspection:
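A minimal, self-contained sketch of this step, assuming a small made-up CSV held in a string (the column names `product`, `units`, and `price` are hypothetical, not from the original article). It shows the inspection calls that are usually worth running right after loading:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file such as 'data.csv'
csv_text = """product,units,price
apple,10,0.5
banana,,0.25
cherry,7,3.0
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)         # (rows, columns)
print(sample.dtypes)        # inferred type of each column
print(sample.isna().sum())  # number of missing values per column
```

Checking `dtypes` and the missing-value counts early tends to surface problems (numbers read as text, unexpected gaps) before they affect later steps.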
import pandas as pd

# Read the CSV file into a DataFrame
data = pd.read_csv('data.csv')

# View the first rows
print(data.head())

# Basic statistical summary of the numeric columns
print(data.describe())

2. Data Cleaning and Processing:
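A small sketch of the cleaning methods on made-up data (the `name` and `score` columns are assumptions for illustration). The point it illustrates is that `dropna`, `fillna`, and `drop_duplicates` return new objects by default and leave the original DataFrame untouched:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one duplicated row
raw = pd.DataFrame({
    "name": ["a", "b", "b", "c"],
    "score": [1.0, 2.0, 2.0, np.nan],
})

dropped = raw.dropna()             # rows containing NaN are removed
filled = raw.fillna({"score": 0})  # NaN in 'score' is replaced by 0
deduped = raw.drop_duplicates()    # the repeated ('b', 2.0) row is removed

print(len(raw), len(dropped), len(filled), len(deduped))
```

Because the original is not modified, the result has to be assigned to a variable for the cleaning to take effect.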
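# These methods return a new DataFrame, so assign the result to keep it

# Drop rows that contain missing values
data_dropped = data.dropna()

# Alternatively, fill missing values with 0
data_filled = data.fillna(0)

# Remove duplicate rows
data_unique = data.drop_duplicates()

3. Data Filtering: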
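A brief sketch of boolean-mask filtering on made-up data (the `region` and `amount` columns are hypothetical). Each condition needs its own parentheses because `&`, `|`, and `~` bind more tightly than comparison operators:

```python
import pandas as pd

# Hypothetical order data
orders = pd.DataFrame({
    "region": ["north", "south", "north", "east"],
    "amount": [5, 20, 15, 30],
})

# Single condition
large = orders[orders["amount"] > 10]

# Two conditions combined with AND
north_large = orders[(orders["amount"] > 10) & (orders["region"] == "north")]

# Membership test against several allowed values
north_or_east = orders[orders["region"].isin(["north", "east"])]

# Negation of a condition
not_north = orders[~(orders["region"] == "north")]

print(north_large)
```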
# Filter by a single condition
filtered_data = data[data['column'] > 10]

# Filter by multiple conditions (each condition needs its own parentheses)
filtered_data = data[(data['column1'] > 10) & (data['column2'] == 'value')]

4. Grouping and Aggregation:
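A short sketch of grouping on made-up sales data (the `region`, `product`, and `revenue` columns are assumptions). Besides a single aggregate, `agg` can compute several statistics per group in one call:

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "product": ["a", "b", "a", "a"],
    "revenue": [10, 20, 30, 40],
})

# Mean revenue per region
mean_by_region = sales.groupby("region")["revenue"].mean()

# Total revenue per (region, product) pair; the result has a MultiIndex
total_by_pair = sales.groupby(["region", "product"])["revenue"].sum()

# Several aggregates at once
summary = sales.groupby("region")["revenue"].agg(["mean", "sum", "count"])

print(summary)
```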
# Group by a column and compute the mean
grouped_data = data.groupby('column')['column2'].mean()

# Group by multiple columns and compute the sum
grouped_data = data.groupby(['column1', 'column2'])['column3'].sum()

5. Data Visualization:
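A sketch that draws the three chart types side by side on made-up monthly figures (the `month`, `sales`, and `visits` columns are hypothetical). It assumes a non-interactive backend and writes the image to an in-memory buffer so that it runs without a display; in a notebook or desktop session, `plt.show()` would be the usual choice instead:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly figures
monthly = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "sales": [12, 18, 9, 22],
    "visits": [100, 150, 80, 190],
})

# One figure with three separate axes, so the charts do not overlap
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

axes[0].bar(monthly["month"], monthly["sales"])
axes[0].set_title("Sales by month (bar)")

axes[1].scatter(monthly["visits"], monthly["sales"])
axes[1].set_xlabel("visits")
axes[1].set_ylabel("sales")
axes[1].set_title("Sales vs. visits (scatter)")

axes[2].plot(monthly["month"], monthly["sales"], marker="o")
axes[2].set_title("Sales trend (line)")

fig.tight_layout()

# Save to an in-memory PNG; pass a file name instead to write to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```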
import matplotlib.pyplot as plt

# Bar chart
plt.bar(data['column1'], data['column2'])
plt.show()

# Scatter plot
plt.scatter(data['column1'], data['column2'])
plt.show()

# Line plot
plt.plot(data['column1'], data['column2'])
plt.show()

6. Statistical Analysis:
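A small worked example on made-up numbers. One detail worth knowing is that pandas' `std()` computes the sample standard deviation by default (`ddof=1`); passing `ddof=0` gives the population value:

```python
import pandas as pd

# Hypothetical measurements
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

mean_value = values.mean()
median_value = values.median()
sample_std = values.std()            # divides by n - 1 (default, ddof=1)
population_std = values.std(ddof=0)  # divides by n

print(mean_value, median_value, sample_std, population_std)
```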
# Mean of a column
mean_value = data['column'].mean()

# Median of a column
median_value = data['column'].median()

# Standard deviation of a column
std_value = data['column'].std()

7. Data Transformation:
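A sketch of column transformations on made-up data (the `qty` and `status` columns are hypothetical). For simple arithmetic, a vectorized expression gives the same result as `apply` with a lambda and is usually faster; for value replacement, assigning the result back to the column avoids relying on in-place modification of a selected column:

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"qty": [1, 2, 3], "status": ["new", "old", "new"]})

# Vectorized arithmetic on the whole column
df["qty_doubled"] = df["qty"] * 2

# The same result using apply, which calls the function once per element
doubled_with_apply = df["qty"].apply(lambda x: x * 2)

# Replace values and assign the result back to the column
df["status"] = df["status"].replace({"new": "open", "old": "closed"})

print(df)
```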
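# Apply a function to a column
data['new_column'] = data['column'].apply(lambda x: x * 2)

# Replace values in a column; assign the result back rather than using
# inplace=True on a selected column, which may not update the DataFrame
data['column'] = data['column'].replace({'value1': 'new_value1', 'value2': 'new_value2'})

8. Time‑Series Analysis: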
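A sketch of resampling on made-up event data (the `date` and `amount` columns are assumptions). It shows that days with no records appear in the daily result with a sum of 0, and that the default weekly rule `'W'` labels each bin by the Sunday that ends the week:

```python
import pandas as pd

# Hypothetical events; 2024-01-01 is a Monday
events = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-03", "2024-01-08"],
    "amount": [10, 5, 7, 3],
})

# Parse the strings and use the dates as the index
events["date"] = pd.to_datetime(events["date"])
events = events.set_index("date")

# Daily sums: one row per calendar day between the first and last date
daily = events["amount"].resample("D").sum()

# Weekly sums: bins end on Sunday (2024-01-07 and 2024-01-14 here)
weekly = events["amount"].resample("W").sum()

print(daily)
print(weekly)
```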
# Convert the column to datetime
data['date_column'] = pd.to_datetime(data['date_column'])

# Set the datetime column as the index
data.set_index('date_column', inplace=True)

# Resample to daily, weekly, and monthly sums
# (pandas 2.2 and later prefer 'ME' for month-end; 'M' is deprecated there)
daily_data = data.resample('D').sum()
weekly_data = data.resample('W').sum()
monthly_data = data.resample('M').sum()

9. Data Export:
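A sketch of a CSV round trip on made-up data (the `city` and `sales` columns are hypothetical). It writes to an in-memory buffer rather than a file so that it runs anywhere, and reads the text back to confirm nothing was lost:

```python
import io

import pandas as pd

# Hypothetical report
report = pd.DataFrame({"city": ["Oslo", "Lima"], "sales": [3, 5]})

# Write CSV text to a buffer; index=False omits the row-number column
buffer = io.StringIO()
report.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Read the text back to check the round trip
restored = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
```

Note that `index=False` also drops a meaningful index, such as a datetime index set in an earlier step, so it is worth omitting that argument when the index carries data.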
# Export to CSV
data.to_csv('output.csv', index=False)

# Export to Excel (requires an Excel writer engine such as openpyxl)
data.to_excel('output.xlsx', index=False)

10. Machine Learning Application:
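A self-contained sketch of the train, predict, and evaluate cycle, assuming synthetic data generated from a known linear relationship (the coefficients 3.0 and -2.0, the intercept 1.0, and the noise level are made up for illustration). Fixing `random_state` makes the split reproducible:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic features and target: y = 3*x0 - 2*x1 + 1 + small noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(0, 0.1, size=200)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows only
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on rows the model has not seen
predictions = model.predict(X_test)
score = r2_score(y_test, predictions)

print(model.coef_, model.intercept_, score)
```

Scoring on the held-out rows, rather than on the training rows, gives a more honest picture of how the model behaves on new data.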
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# X (features) and y (target) must be defined first;
# the column names here are placeholders, not from a real dataset
X = data[['feature1', 'feature2']]
y = data['target']

# Split the dataset: 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the test set
predictions = model.predict(X_test)

These code snippets provide a hands‑on foundation for performing Python data analysis, from data ingestion and cleaning to visualization and simple machine‑learning modeling.
