UniOD: A Single Model for Zero‑Training Cross‑Domain Anomaly Detection

UniOD is a universal outlier detection model: a single deep graph-neural-network-based detector trained on labeled historical datasets that enables plug-and-play anomaly detection on unseen domains without any retraining, backed by theoretical guarantees and extensive cross-domain experiments.

Introduction

Outliers are observations that deviate markedly from the normal data distribution and often indicate critical events such as fraud, security breaches, or system failures. Detecting such anomalies across diverse domains has become a key problem in data‑driven applications.

Method

We propose UniOD, a universal anomaly-detection model that trains a single deep network on heterogeneous, labeled historical datasets and can then detect anomalies on any new, unseen dataset without additional training. To unify feature spaces, each dataset is first transformed into a sample-level similarity matrix (a graph) by computing Gaussian-kernel similarities with multiple bandwidths. Singular-value decomposition (SVD) is then applied to the similarity matrix to obtain unified node features that are comparable across datasets. For classification, the resulting graph is fed to a parallel Graph Isomorphism Network (GIN) and Transformer architecture, which exploits the relational information in the similarity graph.
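
The feature-unification step is easy to sketch. The snippet below is a minimal illustration in Python, not the authors' implementation: the bandwidth values, the embedding dimension k, and the function name unified_node_features are assumptions made for the example.

# Hypothetical sketch of UniOD-style unified node features (names and constants are assumptions).
import numpy as np
from scipy.spatial.distance import cdist

def unified_node_features(X, bandwidths=(0.5, 1.0, 2.0), k=32):
    """Map a dataset X (n_samples x n_features) to an (n_samples x k) matrix of
    node features that are comparable across datasets of different dimensionality."""
    # Pairwise squared Euclidean distances between samples.
    d2 = cdist(X, X, metric="sqeuclidean")
    # Average Gaussian-kernel similarities over several bandwidths so that no
    # single scale dominates (the paper's exact bandwidth schedule may differ).
    S = np.mean([np.exp(-d2 / (2.0 * b ** 2)) for b in bandwidths], axis=0)
    # Truncated SVD of the similarity matrix gives a fixed-size embedding per sample.
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    k = min(k, U.shape[1])
    return U[:, :k] * s[:k]  # unified k-dimensional node features

# Two datasets with different raw dimensions end up with features of the same width.
Z_a = unified_node_features(np.random.randn(200, 10))   # shape (200, 32)
Z_b = unified_node_features(np.random.randn(150, 47))   # shape (150, 32)

The detector itself can be pictured as two branches reading the same graph. The sketch below assumes a dense-adjacency GIN update and a standard PyTorch TransformerEncoder; the layer sizes, the concatenation of the two branches, and the linear scoring head are illustrative choices rather than details taken from the paper.

# Minimal sketch of a parallel GIN + Transformer scorer over the similarity graph
# (architecture details here are assumptions, not the paper's exact configuration).
import torch
import torch.nn as nn

class DenseGINLayer(nn.Module):
    """GIN update h' = MLP((1 + eps) * h + A @ h) on a dense adjacency/similarity matrix."""
    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h, A):
        return self.mlp((1 + self.eps) * h + A @ h)

class ParallelGINTransformer(nn.Module):
    def __init__(self, dim=32, n_layers=2, n_heads=4):
        super().__init__()
        self.gin_layers = nn.ModuleList([DenseGINLayer(dim) for _ in range(n_layers)])
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, num_layers=n_layers)
        self.score = nn.Linear(2 * dim, 1)  # per-sample anomaly logit

    def forward(self, Z, A):
        # GIN branch: local message passing over the similarity graph.
        h = Z
        for layer in self.gin_layers:
            h = torch.relu(layer(h, A))
        # Transformer branch: global attention across all samples of the dataset.
        t = self.transformer(Z.unsqueeze(0)).squeeze(0)
        # Concatenate the two views and score each sample.
        return self.score(torch.cat([h, t], dim=-1)).squeeze(-1)

# Usage with unified features Z (n x 32) and a similarity matrix A (n x n),
# e.g. the averaged Gaussian kernel computed above.
model = ParallelGINTransformer()
scores = model(torch.randn(200, 32), torch.rand(200, 200))

In this reading, the GIN branch captures local neighborhood structure in the similarity graph while the Transformer branch attends globally across all samples of a dataset; concatenating the two views is one plausible way to combine them before scoring.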

Theoretical Analysis

We present Theorem 4.1, which bounds the expected generalization error by the average training error. The bound tightens as the number of training datasets grows, indicating smaller generalization error with more historical data. The analysis also shows that increasing the number of GIN and Transformer layers reduces training error and improves test accuracy, while overly deep models can degrade generalization ability.
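
As a rough illustration only (the precise assumptions and constants are those of Theorem 4.1 in the paper; the $O(\sqrt{C/N})$ rate below is an assumption of this summary, not the theorem's exact form), the bound can be pictured as

$$\mathbb{E}\big[\mathcal{L}_{\mathrm{test}}(f)\big] \;\le\; \frac{1}{N}\sum_{i=1}^{N}\mathcal{L}_{\mathrm{train}}^{(i)}(f) \;+\; O\!\left(\sqrt{C/N}\right),$$

where $N$ is the number of historical training datasets and $C$ stands in for model complexity (e.g., the depth of the GIN and Transformer stacks). The averaged first term reflects why more historical datasets tighten the bound, and the complexity term is one way to picture why overly deep models can hurt generalization.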

Experiments

We evaluate UniOD on 57 heterogeneous datasets from the ADBench benchmark (including tabular, image, and text data) and compare against 17 baseline methods using AUROC and AUPRC. UniOD outperforms the baselines on most datasets and achieves higher average performance than all of them. Additional tests on 27 datasets from modalities not seen during training show that UniOD trained only on tabular data generalizes to other modalities. In domain-robustness experiments, we remove the historical data drawn from the same domain as the test set and observe negligible performance loss, which we attribute to the similarity-matrix-based unified features. Ablation studies vary the number of historical datasets (1, 3, 5, 10, 15) and the number of Gaussian bandwidths, showing that more historical data and more bandwidths reduce information loss and improve generalization, matching the theoretical predictions.

Conclusion

UniOD provides a novel, efficient universal outlier detection method. By converting each dataset into a graph structure and generating dimension‑unified node features, a single model can handle heterogeneous datasets without retraining. Theoretical analysis and extensive empirical results validate its effectiveness and low computational overhead, and the approach can be extended from transductive to inductive anomaly‑detection scenarios.

Keywords: anomaly detection, graph neural network, cross-domain, outlier detection, theoretical analysis, UniOD