iCartoonFace: A Large-Scale Cartoon Face Recognition Dataset and Multi‑Task Learning Framework
The paper presents iCartoonFace, the largest manually annotated cartoon‑face dataset with 5,013 identities and 389,678 images, and a multi‑task learning framework that jointly trains on cartoon and real faces using classification, unknown‑identity rejection, and domain‑adaptation losses, achieving state‑of‑the‑art recognition despite pose, occlusion, and illumination challenges.
The paper introduces iCartoonFace, a benchmark dataset for cartoon face recognition, and proposes a multi‑task learning framework that jointly handles cartoon and real‑person faces. The dataset contains 5,013 cartoon identities and 389,678 images collected from 1,302 cartoon albums, making it the largest manually annotated cartoon face dataset to date.
Dataset Construction
A semi‑automatic pipeline reduces annotation effort. It has three stages: (1) hierarchical data collection, which moves from cartoon albums to character names and then to images; (2) data filtering, which uses cartoon face detection, feature extraction, and clustering to remove noisy samples; (3) a Q/A verification step, in which annotators use the clustering results to confirm whether two images show the same character.
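The filtering stage can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the summary does not say which clustering algorithm or similarity threshold was used, so the greedy seed-based clustering, the `threshold` value, and the function names `greedy_clusters` and `filter_noisy` are all assumptions made for this example. Feature vectors are assumed to come from some face-embedding model applied to detected faces.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def greedy_clusters(features, threshold=0.8):
    """Assign each feature to the first cluster whose seed is similar enough.

    Hypothetical stand-in for the clustering step; the threshold is an
    assumed value, not one reported in the source.
    """
    clusters = []  # each cluster is a list of indices; index 0 is the seed
    for i, feat in enumerate(features):
        for cluster in clusters:
            if cosine(features[cluster[0]], feat) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def filter_noisy(features, threshold=0.8):
    """Keep the largest cluster for a character; flag the rest as noisy."""
    if not features:
        return [], []
    clusters = greedy_clusters(features, threshold)
    keep = max(clusters, key=len)
    noisy = sorted(set(range(len(features))) - set(keep))
    return keep, noisy


# Toy embeddings for images crawled under one character name.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.95, 0.05]]
keep, noisy = filter_noisy(features)
```

In a setup like this, the indices in `noisy` would be dropped or sent to annotators, while pairs drawn from `keep` would feed the Q/A verification step.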
Dataset Statistics
The dataset is large‑scale and high‑quality: over 65% of images have a resolution higher than 200×200 pixels, and the annotation error rate is kept below 5% through cross‑validation. It is also diverse and challenging, with intra‑class variation caused by pose, occlusion, and lighting.
Challenges
Four representative challenges are identified: (a) high inter‑class similarity, (b) pose variation, (c) occlusion, and (d) illumination changes. These require robust recognition algorithms.
Proposed Method
The authors present a cartoon‑real multi‑task training framework with three loss components: (1) a classification loss applied to both cartoon and real faces; (2) an unknown‑identity rejection loss that acts as an unsupervised regularizer across the two domains; (3) a domain‑adaptation loss that narrows the gap between the cartoon and real domains. The architecture is illustrated in Figure 4.
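One way to picture how three such terms combine into a single training objective is the sketch below. It is illustrative only: the summary does not give the exact form of any loss or their weights, so the weighted sum, the weights `w_reject` and `w_domain`, and the helper `mean_feature_gap` (a simple mean-matching stand-in for a domain-adaptation term) are assumptions. The rejection term is passed in as a precomputed number because its definition is not stated here.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mean_feature_gap(cartoon_feats, real_feats):
    """Squared distance between the mean cartoon and mean real embeddings.

    Assumed stand-in for a domain-adaptation loss; the paper's actual
    formulation may differ.
    """
    dim = len(cartoon_feats[0])
    mc = [sum(f[d] for f in cartoon_feats) / len(cartoon_feats) for d in range(dim)]
    mr = [sum(f[d] for f in real_feats) / len(real_feats) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mc, mr))


def multitask_loss(cls_cartoon, cls_real, rejection, domain,
                   w_reject=0.1, w_domain=0.1):
    """Weighted sum of the loss terms; the weights are hypothetical."""
    return cls_cartoon + cls_real + w_reject * rejection + w_domain * domain


# Toy numbers: one cartoon sample, one real sample, small feature batches.
loss_cartoon = cross_entropy([2.0, 0.5, 0.1], label=0)
loss_real = cross_entropy([0.2, 1.5], label=1)
gap = mean_feature_gap([[1.0, 0.0], [1.0, 0.0]], [[0.0, 0.0]])
total = multitask_loss(loss_cartoon, loss_real, rejection=0.5, domain=gap)
```

The point of the weighted sum is that the two supervised classification terms drive identity learning, while the rejection and domain terms act as regularizers whose influence is tuned through their weights.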
Experimental Analysis
The authors evaluate several loss functions (SoftMax, SphereFace, CosFace, ArcFace, ArcFace+FL) and find that ArcFace+FL achieves the best rank‑1 accuracy. They also demonstrate that incorporating real‑face data improves cartoon detection and recognition performance (see Figures 5 and 6). Adding contextual information around the cartoon face further boosts accuracy, as shown in Figure 7.
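For readers unfamiliar with the margin-based losses compared above, the sketch below shows how ArcFace modifies the logits before softmax: features and class weights are L2-normalized, and an additive angular margin is applied to the target class only. The `scale` and `margin` defaults are values commonly used with ArcFace, not settings confirmed by this summary, and the function names are chosen for this example. The FL variant is not shown.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def arcface_logits(feature, class_weights, label, scale=64.0, margin=0.5):
    """Return scaled cosine logits with an angular margin on the target class."""
    f = l2_normalize(feature)
    logits = []
    for j, w in enumerate(class_weights):
        cos_t = sum(a * b for a, b in zip(f, l2_normalize(w)))
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
        if j == label:
            # Plain form of the margin; practical implementations usually
            # add a guard for the case theta + margin > pi.
            cos_t = math.cos(math.acos(cos_t) + margin)
        logits.append(scale * cos_t)
    return logits


# A feature perfectly aligned with class 0 still gets a penalized target logit.
logits = arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], label=0)
```

Because the target logit is lowered by the margin, the network has to pull same-identity features closer to their class center to reach the same loss, which is the motivation for using such losses when identities look very similar.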
Conclusion and Outlook
iCartoonFace provides a strong foundation for advancing cartoon face recognition research. The multi‑task learning framework effectively leverages both cartoon and real‑person data, achieving superior performance. Future work includes designing more robust algorithms to handle occlusion, side views, blur, and transformation, and expanding applications such as video understanding, intelligent editing, image search, and content moderation.
