Analyzing the World Happiness Index with Python: Data Preparation, Visualization, and Insights

This article demonstrates how to use Python, pandas, and Plotly to download, clean, and visualize World Happiness Report data from 2015‑2019, revealing the strongest correlations between happiness scores and factors such as GDP per capita, healthy life expectancy, and social support.

Python Programming Learning Circle

Dec 6, 2024

The article introduces the World Happiness Report, first published in 2012, which draws on Gallup World Poll surveys covering more than 150 countries. Respondents rate their current life on a ladder from 0 to 10, and these scores are used to rank nations.

It then outlines the analytical project: using Python to explore the happiness rankings and the factors influencing them.

Data preparation steps are described, including downloading CSV files for the years 2015‑2019 and loading them with pandas, adding a "year" column, and concatenating the yearly data frames.

# Install dependencies first (run this in a shell, not in Python):
#   pip install -r requirement.txt
import numpy as np
import pandas as pd
import os, sys
import matplotlib.pyplot as plt
import seaborn as sns
import plotly as py
import plotly.graph_objs as go
import plotly.express as px
from plotly.offline import init_notebook_mode, iplot, plot

# File path
file_path = os.path.dirname(os.path.abspath(__file__))

# Load data
df_2015 = pd.read_csv(f'{file_path}/2015.csv')
df_2016 = pd.read_csv(f'{file_path}/2016.csv')
df_2017 = pd.read_csv(f'{file_path}/2017.csv')
df_2018 = pd.read_csv(f'{file_path}/2018.csv')
df_2019 = pd.read_csv(f'{file_path}/2019.csv')

# Add year column
df_2015["year"] = "2015"
df_2016["year"] = "2016"
df_2017["year"] = "2017"
df_2018["year"] = "2018"
df_2019["year"] = "2019"

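# Merge all data (DataFrame.append was removed in pandas 2.0; use pd.concat)
df_all = pd.concat([df_2015, df_2016, df_2017, df_2018, df_2019], sort=False)
# Drop the stray index column if the CSV files were saved with one
df_all.drop(columns='Unnamed: 0', inplace=True, errors='ignore')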
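The plotting code in this article assumes unified column names such as `region` and `happiness`. If the downloaded yearly files label the same measure differently, the columns need to be harmonized before merging. The following is a minimal sketch of one way to do that. The mapping `COLUMN_MAP`, the helper `combine_years`, and the toy frames are all hypothetical and should be adapted to the actual file contents.

```python
import pandas as pd

# Hypothetical mapping from raw column labels to the unified names used in
# this article's plots; adjust it to whatever the downloaded files contain.
COLUMN_MAP = {
    "Country": "region",
    "Country or region": "region",
    "Happiness Score": "happiness",
    "Score": "happiness",
}


def combine_years(frames_by_year):
    """Rename columns, tag each frame with its year, and stack them."""
    parts = []
    for year, frame in frames_by_year.items():
        part = frame.rename(columns=COLUMN_MAP)
        part["year"] = str(year)  # strings keep the year categorical in plots
        parts.append(part)
    # ignore_index gives the combined frame a clean 0..n-1 index
    return pd.concat(parts, ignore_index=True, sort=False)


# Toy stand-ins for two yearly files (values are made up for illustration)
raw_2015 = pd.DataFrame({"Country": ["A", "B"], "Happiness Score": [7.1, 4.2]})
raw_2019 = pd.DataFrame({"Country or region": ["A", "B"], "Score": [7.3, 4.0]})

df_demo = combine_years({2015: raw_2015, 2019: raw_2019})
print(df_demo)
```

Keeping the mapping in one dictionary makes it easy to extend when another year introduces a new label for an existing measure.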

Using Plotly, a choropleth map of the 2019 happiness scores is generated, followed by a bar chart that displays the top‑10 and bottom‑10 countries in 2019.

# Top-10 and bottom-10 bar chart (assumes df_2019 is already sorted by rank)
rank_top10 = df_2019.head(10)[['rank', 'region', 'happiness']]
rank_last10 = df_2019.tail(10)[['rank', 'region', 'happiness']]
rank_concat = pd.concat([rank_top10, rank_last10])
fig = px.bar(rank_concat, x="region", y="happiness", color="region",
             title="2019 Global Most and Least Happy Countries")
plot(fig, filename=f'{file_path}/2019_happiness_ranking_top10_and_last10.html')

Scatter plots are then created to examine relationships between happiness and various indicators: GDP per capita, healthy life expectancy, and social support, each faceted by year and fitted with an OLS trend line.

# Note: trendline='ols' requires the statsmodels package to be installed

# GDP vs happiness
fig = px.scatter(df_all, x='gdp_per_capita', y='happiness', facet_row='year',
                 color='year', trendline='ols')
fig.update_layout(height=800, title_text='GDP per Capita and Happiness Index')
plot(fig, filename=f'{file_path}/gdp_and_happiness_score.html')

# Healthy life expectancy vs happiness
fig = px.scatter(df_all, x='healthy_life_expectancy', y='happiness', facet_row='year',
                 color='year', trendline='ols')
fig.update_layout(height=800, title_text='Healthy Life Expectancy and Happiness Index')
plot(fig, filename=f'{file_path}/healthy_life_expectancy_and_happiness_score.html')

# Social support vs happiness
fig = px.scatter(df_all, x='social_support', y='happiness', facet_row='year',
                 color='year', trendline='ols')
fig.update_layout(height=800, title_text='Social Support and Happiness Index')
plot(fig, filename=f'{file_path}/social_support_and_happiness_score.html')
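The OLS trend line in each facet is an ordinary least-squares fit of happiness on the chosen indicator. To read the fitted slope and intercept as numbers rather than from the chart, one option is to compute the same kind of fit per year with `numpy.polyfit`. This is a sketch only: the helper `ols_by_year` is hypothetical and the data below is synthetic, chosen so the fit is easy to verify by hand.

```python
import numpy as np
import pandas as pd


def ols_by_year(df, x_col, y_col="happiness"):
    """Return slope and intercept of a least-squares line for each year."""
    rows = []
    for year, group in df.groupby("year"):
        # A degree-1 polyfit minimizes squared error for y = slope * x + intercept
        slope, intercept = np.polyfit(group[x_col], group[y_col], 1)
        rows.append({"year": year, "slope": slope, "intercept": intercept})
    return pd.DataFrame(rows)


# Synthetic points that lie exactly on known lines, so the fit is easy to check
demo = pd.DataFrame({
    "year": ["2018"] * 3 + ["2019"] * 3,
    "gdp_per_capita": [0.5, 1.0, 1.5, 0.5, 1.0, 1.5],
    "happiness": [4.0, 5.0, 6.0, 3.5, 5.0, 6.5],
})
fits = ols_by_year(demo, "gdp_per_capita")
print(fits.round(3))
```

Comparing slopes across years gives a quick check on whether the relationship is stable over time or shifts from one report to the next.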

The analysis concludes that GDP per capita, healthy life expectancy, and social support show strong positive correlations with happiness scores, while freedom and government integrity have moderate to low correlations, and generosity shows a very weak correlation.
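The correlation strengths described above can be quantified with Pearson coefficients. The sketch below ranks every numeric column by the strength of its correlation with `happiness`, assuming column names like those used in the scatter plots. The helper `rank_correlations` is hypothetical, and the values are synthetic stand-ins rather than figures from the report.

```python
import pandas as pd


def rank_correlations(df, target="happiness"):
    """Pearson correlation of each numeric column with the target, strongest first."""
    corr = df.corr(numeric_only=True)[target].drop(target)
    # Sort by absolute value so strong negative correlations also rank high
    return corr.sort_values(key=lambda s: s.abs(), ascending=False)


# Synthetic data: gdp_per_capita is exactly linear in happiness, generosity is not
demo = pd.DataFrame({
    "happiness": [7.0, 6.0, 5.0, 4.0],
    "gdp_per_capita": [1.4, 1.2, 1.0, 0.8],
    "generosity": [0.2, 0.3, 0.1, 0.25],
    "year": ["2019"] * 4,  # non-numeric, so it is skipped by numeric_only=True
})
ranking = rank_correlations(demo)
print(ranking.round(3))
```

Applied to the merged data, a table like this turns "strong", "moderate", and "weak" into numbers that can be compared across indicators.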

Overall, the study demonstrates how Python’s data‑science ecosystem can be leveraged to extract meaningful insights from the World Happiness Report.
