How Geometric Deep Learning Enables Spherical CNNs for Rotationally Equivariant Vision

The article explains why traditional planar CNNs fail on spherical data, describes how encoding rotational symmetry through continuous spherical representations and spherical harmonics leads to spherical convolutions that are rotation‑equivariant, and outlines the practical computation using harmonic coefficients.


Convolutional neural networks (CNNs) revolutionized computer vision by encoding translational symmetry, but extending this success to data with non‑planar geometry—such as 360° images, cosmic microwave background maps, 3D medical scans, or mesh surfaces—requires a different approach.

Projecting spherical data onto a plane inevitably introduces distortion, as illustrated by the size discrepancy between Greenland and Africa on a Miller cylindrical map. This distortion breaks the translational equivariance that planar CNNs rely on, making direct application of standard convolutions ineffective for spherical images.

To respect the inherent symmetry of spherical data, models must be equivariant to rotations rather than translations. A rotation‑equivariant operation ensures that rotating the input on the sphere yields a correspondingly rotated output, preserving the relationship between features and their orientations.

The solution begins with a continuous representation of signals on the sphere: functions f: S² → ℝ. Like periodic signals on a circle, these can be expanded in a spherical harmonic basis, yielding a set of coefficients that capture low‑frequency (smooth) to high‑frequency (detailed) variations. Truncating the coefficient vector at a maximum degree gives a finite approximation whose accuracy improves as more degrees are kept.
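To make the expand-truncate-reconstruct idea concrete, here is a minimal sketch for the azimuthally symmetric (zonal) case, where the spherical harmonics reduce to Legendre polynomials P_ℓ(cos θ). The function names and the choice of Gauss–Legendre quadrature are illustrative assumptions, not part of the original article:

```python
import numpy as np

def legendre_coeffs(f, lmax, nquad=64):
    """Project a zonal spherical signal f(x), x = cos(theta), onto the
    Legendre polynomials P_l (the m = 0 spherical harmonics) using
    Gauss-Legendre quadrature: c_l = (2l+1)/2 * integral_{-1}^{1} f(x) P_l(x) dx."""
    x, w = np.polynomial.legendre.leggauss(nquad)  # quadrature nodes and weights on [-1, 1]
    fx = f(x)
    coeffs = []
    for l in range(lmax + 1):
        Pl = np.polynomial.legendre.legval(x, [0] * l + [1])  # P_l evaluated at the nodes
        coeffs.append((2 * l + 1) / 2 * np.sum(w * fx * Pl))
    return np.array(coeffs)

def reconstruct(coeffs, x):
    """Evaluate the truncated Legendre series at x = cos(theta)."""
    return sum(c * np.polynomial.legendre.legval(x, [0] * l + [1])
               for l, c in enumerate(coeffs))

# Band-limited test signal: 0.5*P_0 + 1*P_1 + 2*P_2, since P_2 = (3x^2 - 1)/2.
f = lambda x: 0.5 + x + (3 * x**2 - 1)
c = legendre_coeffs(f, lmax=4)
x = np.linspace(-1.0, 1.0, 5)
err = np.max(np.abs(reconstruct(c, x) - f(x)))
```

Because the test signal is band-limited at degree 2, the truncated expansion recovers it to machine precision and the coefficients beyond ℓ = 2 are essentially zero; the full case adds the order index m but works the same way.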

Using this representation, a spherical convolution is defined as (f * g)(ρ) = ∫_{S²} f(ω)·g(ρ⁻¹ω) dω, where ρ ∈ SO(3) is a rotation and g is a spherical filter; the output is a function on the rotation group rather than on the sphere. This formulation is rotation‑equivariant: rotating the input f by R merely shifts the output, (Rf * g)(ρ) = (f * g)(R⁻¹ρ), so features and their orientations move together.
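The equivariance claim can be checked in one short calculation, using only a change of variables and the rotation invariance of the surface measure dω (notation here is ours: Λ_R denotes the rotation operator acting on signals):

```latex
\begin{align*}
(\Lambda_R f * g)(\rho)
  &= \int_{S^2} f(R^{-1}\omega)\, g(\rho^{-1}\omega)\, d\omega
     && (\Lambda_R f)(\omega) := f(R^{-1}\omega) \\
  &= \int_{S^2} f(\omega')\, g(\rho^{-1} R\, \omega')\, d\omega'
     && \text{substitute } \omega = R\omega';\ d\omega \text{ is rotation-invariant} \\
  &= \int_{S^2} f(\omega')\, g\big((R^{-1}\rho)^{-1} \omega'\big)\, d\omega'
     && (R^{-1}\rho)^{-1} = \rho^{-1} R \\
  &= (f * g)(R^{-1}\rho) \;=\; \big(\Lambda_R (f * g)\big)(\rho).
\end{align*}
```

Rotating the input thus reproduces the original output evaluated at a shifted rotation, which is exactly the equivariance property the article describes.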

Although the convolution appears to require a costly integral over the sphere for every rotation ρ, the convolution theorem carries over to the spherical setting: in the harmonic domain, the coefficients of f * g are obtained by multiplying the coefficient matrices of f and g degree by degree. Practitioners can therefore implement spherical convolutions efficiently on GPUs as batched matrix multiplications in the spectral domain.
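For intuition, here is a minimal sketch of the spectral shortcut in the zonal case, where the per-degree "matrix multiplication" collapses to a scalar product per degree ℓ. The normalization factor 4π/(2ℓ+1) is one common convention and may differ between libraries; the function name is ours:

```python
import numpy as np

def spectral_conv(f_hat, g_hat):
    """Spectral-domain spherical convolution for zonal (m = 0) signals:
    each degree-l coefficient of f is scaled by the matching coefficient
    of the filter g.  For general signals the same idea becomes a small
    matrix product per degree l, which batches well on GPUs.
    Normalization 4*pi/(2l+1) is one common convention (an assumption here)."""
    l = np.arange(len(f_hat))
    return 4.0 * np.pi / (2 * l + 1) * f_hat * g_hat

# Example: a filter whose coefficients vanish beyond degree 1 acts as a
# low-pass filter -- high-degree content of f is removed, no integral needed.
f_hat = np.array([1.0, 2.0, 3.0, 4.0])
g_hat = np.array([1.0, 1.0, 0.0, 0.0])
out = spectral_conv(f_hat, g_hat)
```

The point of the sketch is structural: once both signals are in coefficient form, convolution is elementwise (or small-block) multiplication, so a learned spherical filter is just a trainable coefficient array.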

Key references include Cohen et al., “Spherical CNNs” (ICLR 2018) and Esteves et al., “Learning SO(3) Equivariant Representations with Spherical CNNs” (ECCV 2018).

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Computer Vision, geometric-deep-learning, spherical-harmonics, rotational equivariance, spherical CNN
Written by

Code DAO

We deliver AI algorithm tutorials and the latest news, curated by a team of researchers from Peking University, Shanghai Jiao Tong University, Central South University, and leading AI companies such as Huawei, Kuaishou, and SenseTime. Join us in the AI alchemy—making life better!
