Hybrid Rotationally Equivariant Generalized Spherical CNNs Explained
The article introduces hybrid rotationally equivariant spherical CNNs, explains generalized signals on the sphere and rotation group, describes how linear and nonlinear operations using Clebsch‑Gordan coefficients preserve equivariance, and demonstrates efficient architectures that achieve state‑of‑the‑art results on 3D shape classification and atomic energy prediction.
Spherical convolutions enable deep learning on data defined on the sphere, but introducing a rotationally equivariant non‑linearity is difficult. Quantum‑physics‑inspired Clebsch‑Gordan tensors provide a solution.
Signals on the Sphere and Rotation Group
Signals on the sphere (or on the rotation group SO(3)) are represented by spherical‑harmonic coefficients. For each degree ℓ = 0,1,2,… there are 2ℓ+1 harmonic components, collected into a vector. Rotation of a signal is a linear transformation of each degree‑ℓ vector by the Wigner‑D matrix D^{ℓ}(ρ). Convolution on the sphere is implemented by learned linear combinations of vectors within each degree.
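To make the per-degree picture concrete, here is a minimal NumPy sketch of why this kind of "convolution" is equivariant. A random orthogonal matrix stands in for the Wigner-D matrix (constructing a real D^ℓ is omitted for brevity); the point is that learned mixing acts on the vector index while rotation acts on the harmonic index, so the two commute:

```python
import numpy as np

rng = np.random.default_rng(0)
ell, n_vec = 2, 4                    # degree 2 -> vectors of length 2*ell+1 = 5
F = rng.standard_normal((n_vec, 2 * ell + 1))   # n_vec degree-ell harmonic vectors

# Stand-in for the Wigner-D matrix D^ell(rho): any orthogonal (2ell+1)x(2ell+1) matrix.
Q, _ = np.linalg.qr(rng.standard_normal((2 * ell + 1, 2 * ell + 1)))

W = rng.standard_normal((3, n_vec))  # learned linear mixing of the 4 vectors into 3

# Mixing touches only the vector index, rotation only the harmonic index,
# so rotate-then-mix equals mix-then-rotate: the layer is rotation-equivariant.
rotate_then_mix = W @ (F @ Q.T)
mix_then_rotate = (W @ F) @ Q.T
assert np.allclose(rotate_then_mix, mix_then_rotate)
```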
Generalized Signals
Instead of a single vector per degree, an arbitrary number of vectors can be stored for each degree. Such collections, called generalized (rotatable) signals, no longer correspond directly to functions on the sphere but can still be rotated and convolved equivariantly, because the same Wigner‑D action applies to each vector. An example uses per‑degree vector counts (2, 1, 3, 2, 0, 0, …).
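A generalized signal can be stored as a mapping from degree to a stack of harmonic vectors. The sketch below (again using random orthogonal matrices as stand-ins for the Wigner-D matrices) uses the example counts (2, 1, 3, 2):

```python
import numpy as np

rng = np.random.default_rng(1)

# Generalized signal: degree ell holds n_ell vectors, each of length 2*ell+1.
counts = {0: 2, 1: 1, 2: 3, 3: 2}
signal = {ell: rng.standard_normal((n, 2 * ell + 1)) for ell, n in counts.items()}

def rotate(signal, wigner):
    """Rotate every degree-ell vector by the same degree-ell Wigner-D matrix."""
    return {ell: block @ wigner[ell].T for ell, block in signal.items()}

# Orthogonal stand-ins for the Wigner-D matrices of each degree.
wigner = {}
for ell in signal:
    Q, _ = np.linalg.qr(rng.standard_normal((2 * ell + 1, 2 * ell + 1)))
    wigner[ell] = Q

rotated = rotate(signal, wigner)
assert all(rotated[ell].shape == signal[ell].shape for ell in signal)
```

Because each per-degree block rotates independently, any number of vectors per degree remains rotatable even without an underlying function on the sphere.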
Equivariant Non‑Linearity via Outer Products
Applying a pointwise non‑linearity to sampled spherical representations breaks strict equivariance, because sampling on the sphere is non‑uniform and converting to and from the harmonic basis is costly. Kondor et al. (2018) instead compute an outer product between two vectors of a generalized signal, then project the result back to a (2ℓ+1)‑dimensional space with a linear map built from Clebsch‑Gordan coefficients. The quadratic outer product provides the non‑linearity, while the Clebsch‑Gordan projection guarantees that the whole pipeline remains rotation‑equivariant.
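The following is a minimal sketch of this idea using SymPy's Clebsch-Gordan coefficients; the function names `cg_matrix` and `cg_nonlinearity` are illustrative, not from the paper. Two degree-1 vectors are combined into a degree-2 output:

```python
import numpy as np
from sympy.physics.quantum.cg import CG

def cg_matrix(l1, l2, l_out):
    """Clebsch-Gordan coefficients projecting the (2l1+1)x(2l2+1) outer
    product onto the degree-l_out subspace."""
    mat = np.zeros((2 * l_out + 1, 2 * l1 + 1, 2 * l2 + 1))
    for m_out in range(-l_out, l_out + 1):
        for m1 in range(-l1, l1 + 1):
            for m2 in range(-l2, l2 + 1):
                if m1 + m2 == m_out:  # selection rule; other entries are zero
                    mat[m_out + l_out, m1 + l1, m2 + l2] = float(
                        CG(l1, m1, l2, m2, l_out, m_out).doit())
    return mat

def cg_nonlinearity(u, v, l1, l2, l_out):
    """Quadratic non-linearity: outer product followed by CG projection."""
    outer = np.outer(u, v)                              # (2l1+1, 2l2+1)
    return np.einsum('oab,ab->o', cg_matrix(l1, l2, l_out), outer)

u = np.array([1.0, 0.0, -1.0])   # a degree-1 vector (2*1+1 = 3 components)
v = np.array([0.5, 2.0, 0.0])
w = cg_nonlinearity(u, v, 1, 1, 2)   # degree-2 output, length 5
```

Because the CG projection of a product of degree-l1 and degree-l2 vectors transforms exactly as a degree-l_out vector, the output rotates with the same Wigner-D action as the rest of the network.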
Generalized Spherical CNN
Using the equivariant linear transform and the Clebsch‑Gordan‑based non‑linearity, a generalized spherical CNN (Cobb et al. 2021) is built. The main architectural difference from a conventional CNN stack is that the outer‑product non‑linearity is applied before the linear (convolution) step, so that the linear map can immediately shrink the representation; otherwise the number of vectors per degree grows quadratically, with overall cost scaling up to the fifth power of the maximum degree. A more flexible ordering (linear → non‑linear → linear) also subsumes the spherical CNN of Cohen et al. (2018), and hybrid models can combine both approaches.
Efficient Generalized Spherical CNN
Quadratic growth of vectors per degree after the outer‑product non‑linearity is reduced by channel‑wise factorisation (Cobb et al. 2021). The collection of vectors is split into separate channels; the outer‑product non‑linearity is applied independently per channel and the results are recombined, yielding a speed‑up proportional to the number of channels.
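The speed-up is simple arithmetic, sketched below with hypothetical sizes (counting all ordered vector pairs, including self-pairs):

```python
# Number of outer-product pairs before and after channelisation.
def pair_count(n):
    return n * n            # all ordered pairs of n vectors

n_vectors, n_channels = 24, 6
full = pair_count(n_vectors)                          # one big collection: 576 pairs
per_channel = n_vectors // n_channels                 # 4 vectors per channel
channelised = n_channels * pair_count(per_channel)    # 6 channels * 16 = 96 pairs

assert full // channelised == n_channels              # speed-up ~ number of channels
```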
After channelisation, convolution is decomposed into three steps: (i) a “shrink” step that reduces the number of vectors per degree within each channel, (ii) an intra‑channel linear mix of the remaining vectors, and (iii) an inter‑channel linear mix that combines information across channels. This factorisation enables expressive feature learning at lower cost.
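The three steps can be sketched for a single degree with `einsum`; all weights act only on the channel and vector indices, never on the harmonic index, which is what preserves equivariance. (The exact parameterisation in Cobb et al. 2021 may differ; this is an illustrative sketch.)

```python
import numpy as np

rng = np.random.default_rng(2)
n_ch, n_vec, dim = 6, 4, 5            # 6 channels, 4 degree-2 vectors each (2*2+1 = 5)
x = rng.standard_normal((n_ch, n_vec, dim))

# (i) "shrink": reduce the number of vectors per channel (weights shared across channels)
W_shrink = rng.standard_normal((2, n_vec))
x = np.einsum('sv,cvd->csd', W_shrink, x)        # -> (6, 2, 5)

# (ii) intra-channel mix: a separate mixing matrix per channel
W_intra = rng.standard_normal((n_ch, 2, 2))
x = np.einsum('cuv,cvd->cud', W_intra, x)        # -> (6, 2, 5)

# (iii) inter-channel mix: combine channels, leaving the harmonic axis untouched
W_inter = rng.standard_normal((8, n_ch))
x = np.einsum('oc,cvd->ovd', W_inter, x)         # -> (8, 2, 5)

assert x.shape == (8, 2, 5)
```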
Instead of computing all possible outer‑product pairs, a graph‑based method selects a minimal‑cost subset by constructing a degree‑mixing graph and extracting its minimum spanning tree.
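A minimal version of the graph step can be written with SciPy; the edge-cost model below, (2l1+1)(2l2+1), is a hypothetical stand-in, as the actual cost function used by Cobb et al. (2021) is not given in this summary:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

L = 4                                   # degrees 0..L are the graph nodes
# Hypothetical cost of mixing degrees l1 and l2 via an outer product.
cost = np.zeros((L + 1, L + 1))
for l1 in range(L + 1):
    for l2 in range(l1 + 1, L + 1):
        cost[l1, l2] = (2 * l1 + 1) * (2 * l2 + 1)

mst = minimum_spanning_tree(cost)       # sparse matrix of the retained edges
assert mst.nnz == L                     # a spanning tree on L+1 nodes keeps L edges
```

The spanning tree guarantees every degree participates in at least one product while keeping the total mixing cost minimal.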
Optimal sampling schemes for the sphere and rotation group (McEwen & Wiaux 2012; McEwen et al. 2015) are used to reduce the cost of sample‑based operations.
Experiments
3D Shape Classification
Meshes are projected onto a sphere by ray‑casting from a bounding sphere, so that rotating the shape simply rotates the spherical signal. With the hybrid architecture, the model ranks in the top three on five SHREC’17 benchmark metrics while using far fewer parameters than competing models.
Atomic Energy Prediction
Each atom’s directional charge distribution is treated as a spherical signal, preserving molecular rotational symmetry. The hybrid spherical CNN reduces root‑mean‑square error from 5.96 to 3.16 with substantially fewer parameters.
Conclusion
Converting arbitrary data into spherical representations and applying spherical CNNs exploits rotational symmetry even for non‑spherical problems. Generalized signals allow strict equivariant non‑linearity via Clebsch‑Gordan‑based outer products, and channel factorisation together with graph‑based pair selection makes the approach computationally practical.
References
Cobb, Wallis, Mavor‑Parker, Marignier, Price, d’Avezac, McEwen. Efficient Generalised Spherical CNNs. ICLR 2021. arXiv:2010.11661
Cohen, Geiger, Koehler, Welling. Spherical CNNs. ICLR 2018. arXiv:1801.10130
Esteves, Allen‑Blanchette, Makadia, Daniilidis. Learning SO(3) Equivariant Representations with Spherical CNNs. ECCV 2018. arXiv:1711.06721
Kondor, Lin, Trivedi. Clebsch‑Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network. NeurIPS 2018. arXiv:1806.09231
McEwen & Wiaux. A novel sampling theorem on the sphere. IEEE TSP 2012. arXiv:1110.6298
McEwen, Büttner, Leistedt, Peiris, Wiaux. A novel sampling theorem on the rotation group. IEEE SPL 2015. arXiv:1508.03101
