Hybrid Rotationally Equivariant Generalized Spherical CNNs Explained

The article introduces hybrid rotationally equivariant spherical CNNs, explains generalized signals on the sphere and rotation group, describes how linear and nonlinear operations using Clebsch‑Gordan coefficients preserve equivariance, and demonstrates efficient architectures that achieve state‑of‑the‑art results on 3D shape classification and atomic energy prediction.

Code DAO

Spherical convolutions enable deep learning on data defined on the sphere, but introducing a rotationally equivariant non‑linearity is difficult. Quantum‑physics‑inspired Clebsch‑Gordan tensors provide a solution.

Signals on the Sphere and Rotation Group

Signals on the sphere (or on the rotation group SO(3)) are represented by spherical‑harmonic coefficients. For each degree ℓ = 0,1,2,… there are 2ℓ+1 harmonic components, collected into a vector. Rotation of a signal is a linear transformation of each degree‑ℓ vector by the Wigner‑D matrix D^{ℓ}(ρ). Convolution on the sphere is implemented by learned linear combinations of vectors within each degree.

Figure: Spherical harmonic basis
Figure: Wigner D matrix
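
The per-degree layout and the linear step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`num_coefficients`, `mix`, `apply_matrix`) are hypothetical, and a 90-degree rotation about the z-axis stands in for the degree-1 Wigner-D matrix (assuming a real, Cartesian-ordered basis). The point is only that mixing vectors of the same degree commutes with any matrix applied to every vector, which is why such a linear layer is equivariant.

```python
def num_coefficients(L):
    """Number of harmonic coefficients for degrees 0 .. L-1 (2l+1 per degree)."""
    return sum(2 * l + 1 for l in range(L))


def mix(vectors, weights):
    """Linear combinations of same-degree vectors.

    out[j] = sum_i weights[j][i] * vectors[i]
    """
    dim = len(vectors[0])
    return [
        [sum(w * v[m] for w, v in zip(row, vectors)) for m in range(dim)]
        for row in weights
    ]


def apply_matrix(matrix, vectors):
    """Apply the same matrix to every vector (the role played by D^l(rho))."""
    return [[sum(a * x for a, x in zip(row, v)) for row in matrix] for v in vectors]


# Two degree-1 vectors (length 2*1 + 1 = 3) and weights mapping 2 inputs to 3 outputs.
vectors = [[1, 0, 2], [0, 3, -1]]
weights = [[2, 1], [1, -1], [0, 4]]

# Stand-in for the degree-1 Wigner-D matrix: a 90-degree rotation about z.
D = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

rotate_then_mix = mix(apply_matrix(D, vectors), weights)
mix_then_rotate = apply_matrix(D, mix(vectors, weights))

print(num_coefficients(4))                  # 16 coefficients for degrees 0..3
print(rotate_then_mix == mix_then_rotate)   # True
```

Integer entries are used so that the two results can be compared exactly; with floating-point Wigner-D matrices the comparison would need a tolerance.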

Generalized Signals

Instead of a single vector per degree, an arbitrary number of vectors can be stored for each degree. Such collections, called generalized (rotatable) signals, no longer correspond directly to functions on the sphere but can still be rotated and convolved equivariantly because the same Wigner-D action applies to each vector. For example, a generalized signal might hold (2, 1, 3, 2, 0, 0, …) vectors for degrees ℓ = 0, 1, 2, 3, 4, 5, ….

Figure: Generalized signal example
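
One way to picture a generalized signal is as a nested list indexed by degree. The sketch below is only an illustration of the bookkeeping, using the per-degree counts (2, 1, 3, 2) from the example; the function names are hypothetical and the zero fill value is arbitrary.

```python
def make_generalized_signal(counts, fill=0.0):
    """signal[l] is a list of counts[l] vectors, each of length 2*l + 1."""
    return [
        [[fill] * (2 * l + 1) for _ in range(n)]
        for l, n in enumerate(counts)
    ]


def describe(signal):
    """(number of vectors, vector length) for every degree."""
    return [(len(vectors), 2 * l + 1) for l, vectors in enumerate(signal)]


def total_size(counts):
    """Total number of stored coefficients."""
    return sum(n * (2 * l + 1) for l, n in enumerate(counts))


counts = (2, 1, 3, 2)                 # vectors held at degrees 0, 1, 2, 3
signal = make_generalized_signal(counts)
shapes = describe(signal)
n_numbers = total_size(counts)

# Special cases expressed in the same format (band-limited at degree 3):
sphere_counts = (1, 1, 1, 1)          # ordinary signal on the sphere
so3_counts = (1, 3, 5, 7)             # signal on SO(3): 2l+1 vectors at degree l

print(shapes)      # [(2, 1), (1, 3), (3, 5), (2, 7)]
print(n_numbers)   # 34
```

Rotating such a signal means applying the degree-ℓ Wigner-D matrix to every vector stored at degree ℓ, regardless of how many there are.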

Equivariant Non‑Linearity via Outer Products

Applying a pointwise non-linearity to a sampled spherical representation breaks strict equivariance, because the sphere cannot be sampled uniformly, and it also requires costly conversions to and from the harmonic basis. Kondor et al. (2018) instead propose computing an outer product between two vectors of a generalized signal, then projecting the result back to a 2ℓ+1-dimensional space with a fixed linear map given by Clebsch-Gordan coefficients. The outer product of a degree-ℓ₁ vector and a degree-ℓ₂ vector can be projected onto any degree ℓ with |ℓ₁ − ℓ₂| ≤ ℓ ≤ ℓ₁ + ℓ₂. The quadratic outer product provides the non-linearity, while the Clebsch-Gordan projection guarantees that the whole pipeline remains rotation equivariant.

Figure: Outer-product non-linearity
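
The smallest non-trivial case can be checked by hand. For two degree-1 vectors written in a Cartesian basis, the projections of their outer product onto degree 0 and degree 1 are, up to constant factors, the dot product and the cross product. The sketch below assumes that Cartesian convention (so the degree-1 rotation is an ordinary 3×3 rotation matrix) and skips the degree-2 part; all function names are hypothetical.

```python
import math


def rotation(alpha, beta):
    """Rotation matrix Rz(alpha) * Rx(beta), an element of SO(3)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(rz[i][k] * rx[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotate(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def outer_product_projections(u, v):
    """Degree-0 and degree-1 parts of the outer product of two degree-1 vectors."""
    return dot(u, v), cross(u, v)


R = rotation(0.7, -1.3)
u, v = [1.0, 2.0, -0.5], [0.3, -1.0, 2.0]

scalar, vector = outer_product_projections(u, v)
scalar_rot, vector_rot = outer_product_projections(rotate(R, u), rotate(R, v))

# Degree 0 is unchanged by the rotation; degree 1 rotates with the inputs.
invariant = abs(scalar_rot - scalar) < 1e-9
equivariant = all(abs(a - b) < 1e-9 for a, b in zip(vector_rot, rotate(R, vector)))
print(invariant, equivariant)
```

Both outputs are quadratic in the inputs, so the operation is non-linear, yet rotating the inputs first or the outputs afterwards gives the same result.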

Generalized Spherical CNN

Using the equivariant linear transform and the Clebsch-Gordan-based non-linearity, a generalized spherical CNN (Kondor et al. 2018) is built. The only architectural difference from a conventional CNN stack is that the outer-product non-linearity is applied before the convolution rather than after it. The non-linearity makes the number of vectors per degree grow quadratically (with a cost of up to the fifth power of the maximum degree), and the convolution that follows brings that number back down. A more flexible ordering (linear → non-linear → linear) also subsumes the spherical CNN of Cohen et al. (2018). Hybrid models can combine both approaches.

Figure: Hybrid architecture
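
The quadratic growth in the number of vectors per degree is easy to count. The sketch below assumes that every ordered pair of vectors is multiplied and that outputs are truncated at the input band-limit; the source does not state these details, and `counts_after_outer_product` is a hypothetical helper.

```python
def counts_after_outer_product(counts):
    """Vectors per degree after taking outer products of all ordered pairs.

    A pair of vectors with degrees (l1, l2) contributes one vector to every
    degree l with |l1 - l2| <= l <= l1 + l2, truncated at the band-limit.
    """
    L = len(counts)
    out = [0] * L
    for l1, n1 in enumerate(counts):
        for l2, n2 in enumerate(counts):
            for l in range(abs(l1 - l2), min(l1 + l2, L - 1) + 1):
                out[l] += n1 * n2
    return out


before = [2, 1, 3, 2]
after = counts_after_outer_product(before)
after_doubled = counts_after_outer_product([2 * n for n in before])

print(before, "->", after)   # [2, 1, 3, 2] -> [18, 36, 48, 43]
print(after_doubled)         # four times larger: doubling inputs quadruples outputs
```

Eight input vectors become 145 output vectors here, which is why a linear layer that reduces the count is placed directly after the non-linearity.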

Efficient Generalized Spherical CNN

Quadratic growth of vectors per degree after the outer‑product non‑linearity is reduced by channel‑wise factorisation (Cobb et al. 2021). The collection of vectors is split into separate channels; the outer‑product non‑linearity is applied independently per channel and the results are recombined, yielding a speed‑up proportional to the number of channels.
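
The claimed speed-up follows from simple counting. The sketch below assumes the vectors are split evenly across channels and measures cost as the number of vector pairs whose outer product is taken; both function names are hypothetical.

```python
def pairs_without_channels(n_vectors):
    """Every vector is paired with every vector."""
    return n_vectors ** 2


def pairs_with_channels(n_vectors, n_channels):
    """Pairs are formed only inside each channel."""
    if n_vectors % n_channels:
        raise ValueError("this sketch assumes an even split across channels")
    per_channel = n_vectors // n_channels
    return n_channels * per_channel ** 2


full = pairs_without_channels(64)
factored = pairs_with_channels(64, 8)
speed_up = full // factored

print(full, factored, speed_up)   # 4096 512 8
```

With K vectors and C channels the pair count drops from K² to K²/C, so the saving equals the number of channels.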

Convolution is decomposed into three steps after channelisation: (i) a “shrink” step that linearly mixes vectors within each channel, (ii) an intra‑channel linear mix, and (iii) an inter‑channel linear mix. This factorisation enables expressive feature learning at lower cost.

Instead of computing all possible outer‑product pairs, a graph‑based method selects a minimal‑cost subset by constructing a degree‑mixing graph and extracting its minimum spanning tree.

Figure: Degree-mixing graph
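
A rough sketch of the graph-based selection, for one output degree, is given below. The source does not specify how edges are weighted, so the cost used here (the size of the outer product, (2ℓ₁+1)(2ℓ₂+1)) is an assumption, as are the function names and the restriction to pairs of distinct degrees. Nodes are input degrees, an edge joins two degrees whose product can reach the output degree, and Kruskal's algorithm keeps the cheapest set of edges that still connects them.

```python
def admissible_edges(l, L):
    """Pairs l1 < l2 below band-limit L whose product can reach degree l."""
    edges = []
    for l1 in range(L):
        for l2 in range(l1 + 1, L):
            if l2 - l1 <= l <= l1 + l2:
                cost = (2 * l1 + 1) * (2 * l2 + 1)   # assumed cost model
                edges.append((cost, l1, l2))
    return sorted(edges)


def minimum_spanning_forest(n_nodes, edges):
    """Kruskal's algorithm with a small union-find; edges must be sorted by cost."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    chosen = []
    for cost, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:                # edge joins two separate components
            parent[root_a] = root_b
            chosen.append((a, b))
    return chosen


L = 4
all_edges = admissible_edges(2, L)
kept = minimum_spanning_forest(L, all_edges)

print([(a, b) for _, a, b in all_edges])   # [(0, 2), (1, 2), (1, 3), (2, 3)]
print(kept)                                # [(0, 2), (1, 2), (1, 3)]
```

In this toy case the most expensive pair (2, 3) is dropped because degrees 2 and 3 are already connected through degree 1.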

Optimal sampling schemes for the sphere and rotation group (McEwen & Wiaux 2011; McEwen et al. 2015) are used to reduce the cost of sample‑based operations.

Experiments

3D Shape Classification

Meshes are projected onto a sphere by ray-casting from a bounding sphere, so that rotating the object corresponds to rotating the resulting spherical signal. Using the hybrid architecture, the model ranks in the top three on five SHREC'17 benchmark metrics while using far fewer parameters.

Figure: SHREC'17 results
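
The projection step can be sketched as follows. This is a minimal illustration, not the preprocessing used in the experiments: it assumes the recorded value is the distance from the bounding sphere to the first hit, uses the Möller-Trumbore ray-triangle test, and treats the mesh as a plain list of triangles. All names are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_distance(origin, direction, triangle, eps=1e-12):
    """Moller-Trumbore test: distance along the ray to the triangle, or None."""
    v0, v1, v2 = triangle
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:                 # ray parallel to the triangle plane
        return None
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) / det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) / det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) / det
    return t if t > eps else None


def cast_from_bounding_sphere(triangles, theta, phi, radius):
    """Shoot a ray from (theta, phi) on the bounding sphere toward the centre."""
    unit = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
    origin = tuple(radius * c for c in unit)
    direction = tuple(-c for c in unit)
    hits = [t for t in (ray_triangle_distance(origin, direction, tri)
                        for tri in triangles) if t is not None]
    return min(hits) if hits else None


# A one-triangle "mesh": the face x + y + z = 1 in the positive octant.
mesh = [((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))]
theta = math.acos(1.0 / math.sqrt(3.0))

depth = cast_from_bounding_sphere(mesh, theta, math.pi / 4, radius=2.0)
miss = cast_from_bounding_sphere(mesh, theta, 3 * math.pi / 4, radius=2.0)
print(depth, miss)
```

Evaluating `cast_from_bounding_sphere` on a grid of (θ, φ) values gives a sampled spherical signal that can then be converted to harmonic coefficients.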

Atomic Energy Prediction

Each atom’s directional charge distribution is treated as a spherical signal, preserving molecular rotational symmetry. The hybrid spherical CNN reduces root‑mean‑square error from 5.96 to 3.16 with substantially fewer parameters.

Figure: Energy regression RMSE

Conclusion

Converting arbitrary data into spherical representations and applying spherical CNNs exploits rotational symmetry even for non-spherical problems. Generalized signals allow a strictly equivariant non-linearity via Clebsch-Gordan-based outer products, and channel factorisation together with graph-based pair selection makes the approach computationally practical.

References

Cobb, Wallis, Mavor-Parker, Marignier, Price, d'Avezac, McEwen. Efficient Generalised Spherical CNNs. ICLR 2021. arXiv:2010.11661

Cohen, Geiger, Koehler, Welling. Spherical CNNs. ICLR 2018. arXiv:1801.10130

Esteves, Allen-Blanchette, Makadia, Daniilidis. Learning SO(3) Equivariant Representations with Spherical CNNs. ECCV 2018. arXiv:1711.06721

Kondor, Lin, Trivedi. Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network. NeurIPS 2018. arXiv:1806.09231

McEwen & Wiaux. A novel sampling theorem on the sphere. IEEE TSP 2011. arXiv:1110.6298

McEwen, Büttner, Leistedt, Peiris, Wiaux. A novel sampling theorem on the rotation group. IEEE SPL 2015. arXiv:1508.03101
