SSAN: A Novel Dual‑Stream Network for Domain‑Generalized Face Anti‑Spoofing
This paper proposes SSAN (Shuffled Style Assembly Network), a dual-stream network that separates content and style features for domain-generalized face anti-spoofing. It aligns content features across domains with adversarial learning, applies contrastive learning to the reassembled style features, and introduces a large-scale evaluation protocol spanning twelve public datasets, on which it achieves state-of-the-art performance.
Background and Motivation Face anti‑spoofing has become increasingly important as presentation attacks evolve. Existing methods mainly rely on holistic image representations for domain generalization, but statistical differences between content (semantic, physical) and style (texture, domain‑specific) features can be exploited for better generalization.
The proposed study separates image representations into content features (extracted via Batch Normalization‑based structures) and style features (extracted via Instance Normalization‑based structures). By treating these two types of features differently, the method aims to combine them effectively for robust anti‑spoofing.
Figure 1 illustrates the style‑transfer based approach where real faces serve as content inputs and fake faces as style inputs.
Method Overview The SSAN architecture consists of a dual‑stream network that extracts content and style features separately. A style‑recombination module then merges these features, followed by contrastive learning on the recombined space to suppress domain‑specific style information while enhancing live‑face‑related style cues. The final loss trains the whole network.
Figure 2 shows the overall network framework.
a) Content and Style Aggregation Content features are assumed to have small distribution shifts across domains because facial regions share a common semantic space and similar physical attributes. An adversarial generator-discriminator pair, connected through a gradient reversal layer (GRL), aligns content features across domains. Style features are aggregated from multiple layers to capture both global style cues (e.g., background illumination) and local ones (e.g., material texture).
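The gradient reversal trick behind the adversarial alignment can be sketched without any deep-learning framework: the layer acts as the identity in the forward pass and negates (and scales) the incoming gradient in the backward pass. A minimal pure-Python sketch, where the scalar feature and the reversal strength `lambda_` are illustrative assumptions, not the paper's actual implementation:

```python
def grl_forward(x):
    """Forward pass: the gradient reversal layer is the identity."""
    return x

def grl_backward(upstream_grad, lambda_=1.0):
    """Backward pass: negate and scale the gradient flowing back to the
    feature generator, so it is pushed toward domain-invariant content
    features while the domain discriminator minimizes its own loss."""
    return -lambda_ * upstream_grad

# Toy check: the feature passes through unchanged, but the generator
# receives a gradient that points against the discriminator's objective.
feature = 0.7
forward_out = grl_forward(feature)     # 0.7
reversed_grad = grl_backward(2.0, lambda_=0.5)  # -1.0
```

In a real framework this would be a custom autograd operation; the sketch only shows the sign flip that makes the min-max game work.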
b) Re-composition of Features Content feature f_c and style feature f_s are combined by a Style Assembly Layer (SAL), an AdaIN-style operation followed by convolution. (The original formula images did not survive extraction; consistent with AdaIN, the assembly can be written as SAL(f_c, f_s) = γ(f_s) · (f_c − μ(f_c)) / σ(f_c) + β(f_s), where μ and σ are the channel-wise mean and standard deviation of the content feature, and γ, β are scale and shift parameters generated from the style feature.) When content and style come from the same image, the resulting recomposed feature is called a "self-composed" feature; shuffling styles across samples yields "shuffled-composition" features.
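As a concrete illustration of the AdaIN-style recombination, here is a minimal pure-Python sketch over a single feature channel. The scalar `gamma` and `beta` stand in for the scale and shift that SSAN generates from the style feature; the real layer uses convolutions and learned parameters, which this sketch deliberately abstracts away:

```python
import math

def adain_assemble(content, gamma, beta, eps=1e-5):
    """AdaIN-style assembly for one channel: normalize the content
    feature to zero mean / unit variance, then re-scale and re-shift
    with style-derived statistics."""
    mu = sum(content) / len(content)
    var = sum((x - mu) ** 2 for x in content) / len(content)
    sigma = math.sqrt(var + eps)
    return [gamma * (x - mu) / sigma + beta for x in content]

channel = [1.0, 2.0, 3.0, 4.0]
out = adain_assemble(channel, gamma=2.0, beta=0.5)
mean_out = sum(out) / len(out)
# After assembly the channel mean equals beta and its standard
# deviation is (approximately) gamma: the content's spatial pattern
# is preserved while its first- and second-order statistics are
# replaced by the style's.
```

The point of the exercise: swapping (γ, β) between samples changes only the feature statistics, which is exactly what makes style shuffling cheap.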
c) Contrastive Learning in the Re‑composed Space To prevent domain‑related style features from overwhelming live‑face cues, a contrastive loss is applied. Self‑composed features act as anchors; shuffled‑composition features are pushed closer or farther based on live‑face labels, while gradients are stopped for the anchors.
The contrastive loss compares the live-face labels of the anchor and the shuffled-composition features. (The original formula images did not survive extraction; a formulation consistent with the description is a signed cosine similarity, L_contra = −ℓ · cos(stopgrad(a), s), where a is the self-composed anchor, s a shuffled-composition feature, and ℓ = +1 if the two share the same live/spoof label and ℓ = −1 otherwise.)
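A sketch of that signed-similarity idea in pure Python (my reconstruction from the description above, not the paper's exact equation); stop-gradient is implicit here because only the scalar loss value is computed:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def contrastive_loss(anchor, shuffled, same_live_label):
    """Pull a shuffled-composition feature toward the (gradient-stopped)
    self-composed anchor when their live/spoof labels match, and push
    it away otherwise."""
    sign = 1.0 if same_live_label else -1.0
    return -sign * cosine(anchor, shuffled)

a = [1.0, 0.0]   # self-composed anchor
s = [0.8, 0.6]   # shuffled-composition feature; cos(a, s) = 0.8
loss_match = contrastive_loss(a, s, same_live_label=True)    # -0.8
loss_mismatch = contrastive_loss(a, s, same_live_label=False)  # 0.8
```

Minimizing the loss therefore increases similarity for matching labels and decreases it for mismatched ones, which suppresses domain-specific style while keeping liveness-related style.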
Large‑Scale Evaluation Protocol To bridge the gap between academia and industry, the authors merge twelve public face anti‑spoofing datasets and define intra‑dataset (protocol 1) and cross‑dataset (protocol 2) testing schemes. Protocol 2 splits the datasets into two groups (P1 and P2) and evaluates cross‑domain generalization.
Experimental Results Under Leave-One-Out evaluation across OULU-NPU, CASIA-MFSD, Replay-Attack, and MSU-MFSD, SSAN-R (the ResNet-18 variant) achieves the best performance on most cross-domain protocols, surpassing prior state-of-the-art methods. Ablation studies confirm the contribution of each module.
Feature Visualization t‑SNE plots show that content features form compact clusters across domains, while style features separate live and spoof samples, demonstrating that contrastive learning effectively enhances live‑related style cues and suppresses domain‑related ones.
Conclusion SSAN introduces a content‑style separation strategy combined with adversarial and contrastive learning to achieve domain‑generalized face anti‑spoofing. The large‑scale protocol further validates its superiority over existing methods, narrowing the gap between academic research and real‑world deployment.
Kuaishou Tech
Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.