Hierarchical Latent Factor Deep Generative Model for Time‑Series Anomaly Detection
The article presents DGHL, a deep generative model that uses a ConvNet generator and alternating back‑propagation to learn hierarchical latent factors for online detection of point and subsequence anomalies in multivariate time‑series, handling missing data and achieving state‑of‑the‑art F1 scores on several benchmark datasets.
Anomalies are common in data and especially critical in time‑series across many industries; they can degrade model accuracy and reliability, making effective detection methods essential.
DGHL Model Overview
The proposed Deep Generative model with Hierarchical Latent factors (DGHL) replaces encoder‑decoder structures with a top‑down ConvNet generator. The generator maps a latent vector \(Z\) to a time‑series window, expanding the temporal dimension at each layer while halving the number of filters. Each convolutional layer is followed by batch normalization and ReLU activation.
import numpy as np
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, window_size=32, hidden_multiplier=32, latent_size=100,
                 n_channels=3, max_filters=256, kernel_multiplier=1):
        super(Generator, self).__init__()
        n_layers = int(np.log2(window_size))
        layers = []
        filters_list = []
        # First layer: project the latent vector to a length-4 feature map
        filters = min(max_filters, hidden_multiplier * (2**(n_layers-2)))
        layers.append(nn.ConvTranspose1d(in_channels=latent_size, out_channels=filters,
                                         kernel_size=4, stride=1, padding=0, bias=False))
        layers.append(nn.BatchNorm1d(filters))
        layers.append(nn.ReLU())
        filters_list.append(filters)
        # Hidden layers: double the temporal length and halve the filters at each step
        for i in reversed(range(1, n_layers-1)):
            filters = min(max_filters, hidden_multiplier * (2**(i-1)))
            layers.append(nn.ConvTranspose1d(in_channels=filters_list[-1], out_channels=filters,
                                             kernel_size=4*kernel_multiplier, stride=2,
                                             padding=1 + (kernel_multiplier-1)*2, bias=False))
            layers.append(nn.BatchNorm1d(filters))
            layers.append(nn.ReLU())
            filters_list.append(filters)
        # Output layer: map to the observed channels, preserving the temporal length
        layers.append(nn.ConvTranspose1d(in_channels=filters_list[-1], out_channels=n_channels,
                                         kernel_size=3, stride=1, padding=1))
        self.layers = nn.Sequential(*layers)

    def forward(self, x, m=None):
        x = x[:, :, 0, :]       # drop the singleton axis: (B, latent, 1, 1) -> (B, latent, 1)
        x = self.layers(x)
        x = x[:, :, None, :]    # restore it: (B, channels, length) -> (B, channels, 1, length)
        if m is not None:
            x = x * m           # zero out positions masked as unobserved
        return x
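As a quick sanity check (a minimal sketch, not from the original article), the generator can be driven with a random latent tensor; with the defaults above, a latent vector of shape (batch, latent_size, 1, 1) expands to a window of shape (batch, n_channels, 1, window_size):

import torch

gen = Generator(window_size=32, latent_size=100, n_channels=3)
z = torch.randn(8, 100, 1, 1)   # one latent vector per window in the batch
y = gen(z)
print(y.shape)                  # torch.Size([8, 3, 1, 32])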
Hierarchical Latent Factors
DGHL learns a hierarchical latent space (e.g., \(a=[1,3,6]\)) that captures long‑term dynamics while allowing flexible adjustment of the hierarchy's hyper‑parameters. The lowest‑level state vectors control fine‑grained details, whereas higher‑level vectors modulate the overall dynamics, enabling the model to adapt to evolving time‑series patterns.
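The mechanics can be pictured with a small sketch (an illustrative construction, not the authors' code): each level holds a different number of latent vectors, each vector is repeated over its span of sub‑windows, and the levels are concatenated into a single conditioning tensor. The level sizes a=(1, 3, 6) and the dimensions below are assumptions for illustration only.

import torch

def build_hierarchical_z(batch, a=(1, 3, 6), dim_per_level=32, n_subwindows=6):
    # One tensor of latent vectors per level; level k holds a[k] vectors.
    levels = [torch.randn(batch, n_vectors, dim_per_level) for n_vectors in a]
    tiled = []
    for z in levels:
        # Repeat each vector so every level covers all sub-windows: the single
        # top-level vector spans the whole window (slow, long-term dynamics),
        # while the lowest level changes at every sub-window (fine details).
        reps = n_subwindows // z.shape[1]
        tiled.append(z.repeat_interleave(reps, dim=1))
    # Concatenate levels along the feature axis: (batch, n_subwindows, sum of dims)
    return torch.cat(tiled, dim=2)

z = build_hierarchical_z(batch=8)
print(z.shape)  # torch.Size([8, 6, 96])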
Training with Alternating Back‑Propagation (ABP)
Instead of auxiliary encoder or discriminator networks, DGHL maximizes the observed data likelihood directly using an alternating back‑propagation algorithm combined with short‑run Markov chain Monte Carlo (MCMC). The ABP procedure consists of two steps per mini‑batch: (1) inference back‑propagation, which infers the latent \(Z\) via Langevin dynamics, and (2) learning back‑propagation, which updates the model parameters \(\Theta\) by ascending the log‑likelihood gradient.
Figure 4 illustrates the ABP workflow.
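The two steps can be sketched as follows (an illustrative implementation, not the authors' exact code; the step size, noise scale sigma, and chain length are assumed hyper‑parameters):

import torch

def abp_step(gen, opt, y, z, n_langevin=16, step_size=0.1, sigma=1.0):
    # Step 1: inference back-propagation -- short-run Langevin dynamics on Z,
    # sampling from p(Z | Y) under the current generator parameters.
    for _ in range(n_langevin):
        z = z.detach().requires_grad_(True)
        recon = gen(z)
        # log p(Y, Z) up to constants: Gaussian reconstruction + standard-normal prior
        log_joint = -0.5 * ((y - recon) ** 2).sum() / sigma**2 - 0.5 * (z ** 2).sum()
        grad = torch.autograd.grad(log_joint, z)[0]
        z = z + 0.5 * step_size**2 * grad + step_size * torch.randn_like(z)
    # Step 2: learning back-propagation -- ascend the log-likelihood gradient in Theta
    z = z.detach()
    opt.zero_grad()
    loss = 0.5 * ((y - gen(z)) ** 2).sum() / sigma**2
    loss.backward()
    opt.step()
    return z

Returning the updated \(Z\) lets the caller carry each window's latent state over to the next epoch, the usual warm‑starting trick that keeps short MCMC chains sufficient in ABP training.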
Online Anomaly Detection
During inference, DGHL reconstructs each incoming window from the test stream \(Y^{test}\) and computes an anomaly score based on reconstruction error. The test stream is segmented into windows with the same parameters as training (window size \(S_w\) and stride \(S\)). For each window, the model evaluates a score that reflects deviation from learned normal patterns.
def anomaly_score(self, X, mask, z_iters):
    # predict(...) infers Z with z_iters Langevin steps and returns NumPy arrays
    x, x_hat, z, mask = self.predict(X=X, mask=mask, z_iters=z_iters)
    x_hat = x_hat * mask                  # hide non-available data
    x_flatten = x.squeeze(2)
    x_hat_flatten = x_hat.squeeze(2)
    mask_flatten = mask.squeeze(2)
    z = z.squeeze((2, 3))
    # Per-timestamp squared reconstruction error, averaged over observed entries
    ts_score = np.square(x_flatten - x_hat_flatten)
    score = np.average(ts_score, axis=1, weights=mask_flatten)
    return score, ts_score, x, x_hat, z, mask

The algorithm can also recover missing data points, which is crucial when anomalies must be detected on partially observed streams.
Handling Missing Data
When data contain gaps, the first ABP step uses Langevin dynamics to infer latent \(Z\) based only on observed signals \(Y_{obs}\). The inferred \(Z\) is then used to reconstruct the full window, effectively imputing missing values. Experiments on the SMD dataset show that DGHL reconstructs masked segments accurately, outperforming VAE and GAN baselines.
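A minimal sketch of the idea, under the same assumptions as the ABP sketch above (the binary mask m and all names here are illustrative): the Langevin gradient is driven only by observed entries, and a final generator pass without the mask fills in the gaps.

import torch

def impute(gen, y, m, n_langevin=100, step_size=0.1, sigma=1.0):
    # Infer Z from Y_obs only: the mask zeroes the reconstruction error on gaps,
    # so Langevin dynamics never "sees" the missing values.
    z = torch.randn(y.shape[0], 100, 1, 1)   # latent size assumed to be 100
    for _ in range(n_langevin):
        z = z.detach().requires_grad_(True)
        err = (m * (y - gen(z, m))) ** 2
        log_joint = -0.5 * err.sum() / sigma**2 - 0.5 * (z ** 2).sum()
        grad = torch.autograd.grad(log_joint, z)[0]
        z = z + 0.5 * step_size**2 * grad + step_size * torch.randn_like(z)
    # Generate the full window (no mask) and splice reconstructions into the gaps.
    y_hat = gen(z.detach())
    return m * y + (1 - m) * y_hat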
Experiments
Four public multivariate time‑series datasets are used: Server Machine Dataset (SMD), Soil Moisture Active Passive (SMAP), Mars Science Laboratory (MSL), and Secure Water Treatment (SWaT). DGHL is trained on each dataset and evaluated with a single threshold to compute F1 scores.
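As a rough illustration of this evaluation (a sketch with assumed names; the paper's exact protocol, including any point‑adjustment of predictions, should be checked against the source), per‑timestamp scores are thresholded once and compared with ground‑truth labels:

import numpy as np

def f1_at_threshold(scores, labels, threshold):
    # Flag every timestamp whose anomaly score exceeds the single threshold
    preds = (scores >= threshold).astype(int)
    tp = np.sum((preds == 1) & (labels == 1))
    fp = np.sum((preds == 1) & (labels == 0))
    fn = np.sum((preds == 0) & (labels == 1))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)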
Figure 6 shows that DGHL achieves competitive or superior F1 scores compared with a range of recent baselines.
Table 1 lists the exact F1 scores on each benchmark; DGHL ranks among the top performers while using a simpler model where each window has an independent latent vector.
References
Challu, C., et al., "Deep Generative Model with Hierarchical Latent Factors for Time Series Anomaly Detection," arXiv preprint arXiv:2202.07586, 2022.
