Fundamentals 14 min read

Visualizing Invisible SO₂ After the Tonga Eruption with NASA Data and Python

This tutorial shows how to download NASA GES DISC SO₂ data for the 2022 Hunga Tonga–Hunga Ha'apai eruption, extract relevant fields with H5py, build a Pandas DataFrame, and create both scatter‑plot and time‑varying heat‑map visualizations using Matplotlib, Seaborn and Folium.

Code DAO
Code DAO
Code DAO
Visualizing Invisible SO₂ After the Tonga Eruption with NASA Data and Python

Getting the Data

SO₂ atmospheric composition data are available from NASA’s GES DISC. The Level‑2 product OMPS_NPP_NMSO2_PCA_L2 provides derived geophysical variables at the same resolution as the original Level‑1 data. The dataset covers 2022‑01‑12 to 2022‑01‑23, the period of the Tonga eruption.

Importing the Data

Required libraries are numpy, pandas, matplotlib.pyplot, seaborn, h5py, and folium. The HDF5 files listed in a downloaded .txt file are opened with h5py.File and stored in a list called dataset.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import h5py
%matplotlib inline
f = open('<text file location>', 'r')
files = [i.split('/')[-1] for i in f.read().split('
')[1:]]
dataset = []
for file in files:
    try:
        h5_read = h5py.File('<file location>/' + file, 'r')
        dataset.append(h5_read)
    except:
        print(file)

Selecting Keys

The HDF5 files contain many keys; the tutorial uses GEOLOCATION_DATA (latitude, longitude, UTC time) and SCIENCE_DATA (column‑amount SO₂ in Dobson Units).

list(dataset[0].keys())
# ['ANCILLARY_DATA', 'GEOLOCATION_DATA', 'SCIENCE_DATA', ...]

Creating the DataFrame

Helper function exact_h5 extracts a list from a given key. Latitude, longitude, SO₂ amount, and date are flattened into plain Python lists and combined into a Pandas DataFrame.

def exact_h5(h5_file, key):
    result = list(h5_file[key])
    return result

# Latitude
get_lat = [exact_h5(i, 'GEOLOCATION_DATA/Latitude') for i in dataset]
lat_ = [sum([list(ii) for ii in i], []) for i in get_lat]
lat = sum(lat_, [])

# Longitude (convert negative values to 0‑360 range)
get_lon = [exact_h5(i, 'GEOLOCATION_DATA/Longitude') for i in dataset]
lon_ = [sum([list(ii) for ii in i], []) for i in get_lon]
lon = sum(lon_, [])
lon_mod = [360+i if i < 0 else i for i in lon]

# SO₂ (Dobson Units)
get_so2 = [exact_h5(i, 'SCIENCE_DATA/ColumnAmountSO2') for i in dataset]
so2_ = [sum([list(ii) for ii in i], []) for i in get_so2]
so2 = sum(so2_, [])

# Date (YYYY‑MM‑DD)
get_date = [exact_h5(i, 'GEOLOCATION_DATA/UTC_CCSDS_A') for i in dataset]
date_ = [[ii.decode('UTF-8').split('T')[0] for ii in i] for i in get_date]
date = sum([list(np.repeat(i, 36)) for i in date_], [])

df = pd.DataFrame(zip(lat, lon_mod, so2, date), columns=['Lat', 'Long', 'SO2', 'Date'])

Analysis

The DataFrame is filtered to a 15‑degree square around the eruption epicenter (lat ‑35.55 to ‑5.45, long 169.61 to 199.61) and to rows with positive SO₂ values. Daily totals are computed by grouping on the Date column.

df_filtered = df[(df['Lat'] >= -35.55) & (df['Lat'] <= -5.45) &
                 (df['Long'] >= 169.61) & (df['Long'] <= 199.61) &
                 (df['SO2'] > 0)]

df_groupdate = df_filtered[['SO2','Date']].groupby('Date').sum().reset_index()

The resulting table shows a peak SO₂ emission on 2022‑01‑16. A line plot visualizes the daily SO₂ column amount.

sns.set_style('darkgrid')
df_groupdate.plot(x='Date', y='SO2', figsize=(11,6), lw=4)
plt.legend(labels=['SO2'])
plt.ylabel('Dobson Units')
plt.show()

Visualization – Scatter Plot

For each day, rows with SO₂ > 0.5 DU are selected to avoid overcrowding. The example shown is for 2022‑01‑17.

dates = sorted(set(df['Date']))[1:]  # remove 2022‑01‑11
df_plot = [df[(df['Date']==d) & (df['SO2']>0.5)] for d in dates]

sns.set_style('darkgrid')
plt.figure(figsize=(8,6))
g = sns.scatterplot(data=df_plot[5], x='Long', y='Lat',
                    palette='coolwarm', hue='SO2', linewidth=0.01,
                    hue_norm=(0.5,5.5))
g.set(xlim=(65,270), ylim=(-55,36))
plt.scatter(184.615, -20.55, color='gray', marker='^')
plt.text(191, -21.5, '<= Hunga Tonga–Hunga Ha\'apai', size=9, color='black')
plt.text(80, -50, '2022-01-17', size=10, color='black')
plt.xlabel('')
plt.ylabel('Latitude')
plt.xticks([])
plt.show()

All daily scatter plots can be saved as PNG files and combined into an animated GIF to illustrate the SO₂ plume movement over time.

save_order = list(range(1, len(df_plot)+1))
for i, d, so in zip(df_plot, dates, save_order):
    sns.set_style('darkgrid')
    plt.figure(figsize=(8,6))
    g = sns.scatterplot(data=i, x='Long', y='Lat', palette='coolwarm',
                        hue='SO2', linewidth=0.01, hue_norm=(0.5,5.5))
    g.set(xlim=(65,270), ylim=(-55,36))
    plt.scatter(184.615, -20.55, color='gray', marker='^')
    plt.text(191, -21.5, '<= Hunga Tonga–Hunga Ha\'apai', size=9, color='black')
    plt.text(80, -50, d, size=10, color='black')
    plt.xlabel('')
    plt.ylabel('Latitude')
    plt.xticks([])
    plt.savefig(f"{so}.png", bbox_inches='tight', pad_inches=0)
    plt.show()
from PIL import Image
import imageio
img = []
for i in save_order[:-1]:
    myImage = Image.open(f"{i}.png")
    img.append(myImage)
imageio.mimsave('so2_2.gif', img, duration=0.3)

Visualization – Heat Map

Folium creates a geographic heat map for a single day (e.g., 2022‑01‑17). Latitude, longitude, and SO₂ values are assembled into a list of points.

import folium
from folium import Map, plugins
from folium.plugins import HeatMap, HeatMapWithTime

# Data for 2022‑01‑17
d17 = df[df['Date']=='2022-01-17']
d17.dropna(inplace=True)
data17 = [[lat, lon, so2] for lat, lon, so2 in zip(d17.Lat, d17.Long, d17.SO2)]

m = folium.Map(location=[-20,160], zoom_start=4)
HeatMap(data17, min_opacity=0.05).add_to(m)
folium.LayerControl().add_to(m)
m.save('folium.html')

Visualization – Time‑Varying Heat Map

Folium’s HeatMapWithTime visualizes the daily evolution of the plume. Each day’s filtered DataFrame is transformed into a list of [lat, lon, so2] points, which are then passed to the function.

# List of daily DataFrames
df_plot = [df[(df['Date']==d) & (df['SO2']>0.5)] for d in dates]
# Convert to list of points per day
df_plot_f = [[[lat,lon,so2] for lat,lon,so2 in zip(df_i.Lat, df_i.Long, df_i.SO2)]
             for df_i in df_plot]

hm = folium.Map(location=[-20,160], zoom_start=3)
HeatMapWithTime(df_plot_f, auto_play=True, min_opacity=0.1).add_to(hm)
folium.LayerControl().add_to(hm)
hm.save('folium_s.html')

The workflow demonstrates that Python and publicly available NASA data can be combined to map post‑eruption SO₂ transport.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

PythonData VisualizationH5pyNASAFoliumSO2Volcanology
Code DAO
Written by

Code DAO

We deliver AI algorithm tutorials and the latest news, curated by a team of researchers from Peking University, Shanghai Jiao Tong University, Central South University, and leading AI companies such as Huawei, Kuaishou, and SenseTime. Join us in the AI alchemy—making life better!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.