How Frequency‑Refined Augmentation Boosts Contrastive Learning for Time‑Series Classification

FreRA introduces a lightweight, plug‑in frequency‑refined augmentation that adaptively refines spectral components to preserve global semantics while injecting variance, dramatically improving contrastive learning performance on time‑series classification, anomaly detection, and transfer learning across multiple benchmark datasets.

Data Party THU

Background

Contrastive learning is an effective unsupervised representation technique for time‑series classification, but its performance depends heavily on the choice of data augmentation. Most existing augmentations are borrowed from computer vision and operate in the time domain, which can introduce patterns that conflict with the intrinsic semantics of time‑series data.

Frequency‑Domain Advantages

Globality: Frequency components capture global information, analogous to a global convolution in the time domain.

Independence: Orthogonal Fourier bases allow each component to be manipulated independently, preventing interference between critical and non‑critical information.

Compactness: A small subset of frequency components can retain the semantic content, enabling concise representations.
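As a quick illustration of the compactness property, the NumPy sketch below builds a hypothetical two‑tone signal, keeps only the few largest‑magnitude frequency components, and reconstructs the signal almost perfectly; the signal and the choice of k are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical signal: a sum of two sinusoids on a 256-point grid.
t = np.linspace(0, 1, 256, endpoint=False)
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# Move to the frequency domain with the real FFT.
spec = np.fft.rfft(x)

# Keep only the k components with the largest magnitude.
k = 4
keep = np.argsort(np.abs(spec))[-k:]
compact = np.zeros_like(spec)
compact[keep] = spec[keep]

# Reconstruct the signal from the compact spectrum.
x_hat = np.fft.irfft(compact, n=len(x))
rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(rel_err)  # tiny: a handful of components carry nearly all the signal
```

Because the two tones fall exactly on FFT bins, the top components capture essentially the entire signal, which is the sense in which the spectral representation is compact.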

FreRA: Frequency‑Refined Augmentation

FreRA is a lightweight, plug‑in module for contrastive learning on time‑series classification. It can be inserted into any contrastive framework and is jointly optimized with the encoder.

1. Importance Scoring

A trainable vector s = [s_1, s_2, ..., s_n] assigns an importance score to each of the n frequency components. Positive scores indicate semantic relevance; negative scores suggest noise.

2. Semantic‑Aware Identity Modification

Critical components are protected by a binary mask W_{crit} sampled via the Gumbel‑Softmax re‑parameterization trick, making the operation differentiable. When the temperature approaches zero, W_{crit} behaves like a Bernoulli sample, effectively preserving important frequencies.
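A minimal sketch of how such a differentiable, near‑binary mask can be drawn, assuming a Gumbel‑sigmoid (binary concrete) relaxation; the paper's exact parameterization may differ, and the importance scores below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def gumbel_sigmoid_mask(scores, temperature):
    """Relaxed Bernoulli mask over importance scores.

    As temperature -> 0 the sigmoid sharpens and each entry behaves
    like a Bernoulli sample, while staying differentiable in scores.
    """
    # Logistic noise (the difference of two Gumbel samples).
    u = rng.uniform(1e-9, 1 - 1e-9, size=np.shape(scores))
    noise = np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-(scores + noise) / temperature))

# Hypothetical importance scores: positive = semantically relevant.
scores = np.array([3.0, -3.0, 2.5, -2.0])
mask = gumbel_sigmoid_mask(scores, temperature=0.1)
print(np.round(mask, 3))  # entries pushed toward 0 or 1
```

The low temperature makes the mask nearly binary, so critical frequencies pass through almost unchanged while gradients still flow back into the score vector s.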

3. Semantic‑Agnostic Distortion

Non‑critical components are altered using a non‑negative vector W_{dist} that injects controlled distortion. An adaptive threshold derived from s automatically selects the set of non‑important frequencies, and a scalar controls the overall distortion magnitude. A stop‑gradient on W_{dist} prevents gradient flow through the distortion path, stabilizing training.
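One way this step could look in code. The adaptive threshold used here (the mean of the scores) and the distortion weights are illustrative assumptions rather than the paper's exact formulation, and the stop‑gradient is only indicated in a comment since this NumPy sketch has no autograd.

```python
import numpy as np

rng = np.random.default_rng(2)

def distort_noncritical(spec, scores, alpha=0.5):
    """Scale non-critical frequency components by non-negative weights.

    Components whose score falls below an adaptive threshold are
    treated as non-critical and multiplied by random weights W_dist.
    """
    threshold = scores.mean()          # assumed threshold rule
    noncritical = scores < threshold
    # Non-negative distortion weights; in training, W_dist sits behind
    # a stop-gradient so no gradient flows through the distortion path.
    w_dist = np.abs(rng.normal(1.0, alpha, size=spec.shape))
    out = spec.copy()
    out[noncritical] = spec[noncritical] * w_dist[noncritical]
    return out

# Hypothetical spectrum and scores.
x = np.sin(2 * np.pi * 3 * np.linspace(0, 1, 64, endpoint=False))
spec = np.fft.rfft(x)
scores = rng.normal(size=spec.shape)
aug = distort_noncritical(spec, scores)
```

Critical components (scores above the threshold) are left untouched, so the augmented view injects variance only where the scores suggest the content is semantically unimportant.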

4. Optimization Objective

The total loss combines a contrastive term and a regularization term:

Contrastive loss: Standard InfoNCE loss that pulls together the original sample and its FreRA‑augmented view while pushing apart negative pairs.

Regularization loss: L1 penalty on W_{crit} to encourage sparsity and avoid the trivial solution where all components are marked critical.

Total loss: L = L_{InfoNCE} + \lambda \cdot L_{L1}, where \lambda balances the contrastive and regularization terms.
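A toy NumPy sketch of the combined objective. The embeddings, batch size, and mask probabilities are hypothetical, and the InfoNCE implementation is a generic one rather than the paper's exact code.

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """InfoNCE over a batch where z1[i] and z2[i] are positive pairs."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                    # pairwise cosine similarities
    # Cross-entropy with the diagonal (the true pair) as the target class.
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return np.mean(logsumexp - np.diag(sim))

rng = np.random.default_rng(3)
z1 = rng.normal(size=(8, 16))                # anchor embeddings
z2 = z1 + 0.05 * rng.normal(size=(8, 16))    # views close to their anchors
w_crit = rng.uniform(size=32)                # hypothetical mask probabilities

lam = 0.1                                    # the lambda trade-off weight
total = info_nce(z1, z2) + lam * np.abs(w_crit).sum()
print(total)
```

The L1 term grows with the number of frequencies marked critical, so minimizing the total loss pushes the model toward a sparse, compact set of semantic components.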

Advantages Summary

Globality: Frequency‑domain operations act as global convolutions, preserving overall semantics.

Independence: Orthogonal Fourier bases enable independent modification of each component, avoiding semantic loss.

Compactness: L1 regularization drives the model to concentrate semantic information in a few frequencies.

Experimental Evaluation

Time‑Series Classification

FreRA achieves the highest accuracy on three benchmark datasets (UCIHAR, MS, WISDM). Paired t‑tests confirm its superiority over both time‑domain and handcrafted frequency‑domain augmentations.

Anomaly Detection (Fault Diagnosis)

On a fault‑diagnosis dataset, FreRA outperforms baselines in accuracy and Macro‑F1 under a domain‑generalization setting, demonstrating robustness to distribution shifts and class imbalance.

Transfer Learning

When pre‑training on the SHAR dataset, encoders enhanced with FreRA consistently deliver the best performance in both low‑resource (3 source domains) and high‑resource (19 source domains) scenarios.

Ablation Studies

Removing any of the three components—semantic‑aware identity modification, semantic‑agnostic distortion, or L1 regularization—significantly degrades performance, confirming their necessity. Performance remains stable across a wide range of the hyper‑parameter \lambda.

Compatibility with Different Frameworks

FreRA was integrated into multiple contrastive learning backbones (TS2Vec, TS‑TCC, SoftCLT, SimCLR, BYOL). In every case, replacing the original augmentation with FreRA yielded consistent performance gains, proving its framework‑agnostic nature.

Conclusion

FreRA introduces a frequency‑domain perspective for automatic view generation in time‑series contrastive learning. By exploiting the global, independent, and compact properties of the spectral domain, it provides a lightweight, plug‑in augmentation that improves classification, anomaly detection, and transfer learning while preserving semantic integrity.

Source code: https://github.com/Tian0426/FreRA

Paper: https://arxiv.org/abs/2505.23181

FreRA architecture diagram
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Data Augmentation, contrastive learning, transfer learning, time series, representation learning, frequency domain
Written by Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.