A Survey of Time Series Forecasting Augmentation: Frequency Domain, Decomposition, and Patch Methods
The article reviews why classic classification augmentations fail for forecasting, outlines a taxonomy of effective time‑series augmentation techniques—including frequency‑domain, decomposition, and patch‑based methods—details the Temporal Patch Shuffle (TPS) pipeline, and presents extensive experiments showing TPS achieves state‑of‑the‑art improvements across long‑term, short‑term, and classification tasks.
Why Classification‑Oriented Augmentation Fails in Forecasting
Techniques such as jittering, scaling, window warping, permutation, and rotation were designed for classification, where the label is discrete and invariant to these transforms. Applied to forecasting, they disrupt input‑target continuity, breaking the relationship between the look‑back window and the prediction horizon and degrading performance.
Data‑Label Consistency: A Necessary Condition
For a look‑back window x and target y, the unit of augmentation is the concatenated sequence s = x ∥ y. Augmentation must be applied to s before splitting, so that the augmented input \tilde{x} and target \tilde{y} remain temporally aligned.
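A minimal sketch of this consistency rule, assuming a univariate NumPy series and a generic sequence-level `augment_fn` (both names are illustrative):

```python
import numpy as np

def augment_pair(x, y, augment_fn):
    """Apply an augmentation jointly to look-back window x and target y.

    Concatenating first guarantees the augmented input and target stay
    temporally aligned; augment_fn is any sequence-level augmentation.
    """
    s = np.concatenate([x, y])               # s = x ‖ y
    s_aug = augment_fn(s)                    # augment the full sequence
    return s_aug[:len(x)], s_aug[len(x):]    # split back into (x̃, ỹ)

# Example: jitter the whole sequence, then split.
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 96))
y = np.sin(np.linspace(6, 8, 24))
x_aug, y_aug = augment_pair(x, y, lambda s: s + 0.01 * rng.normal(size=s.shape))
```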
Taxonomy of Forecast Augmentation Methods
Frequency‑based: RobustTAD, FreqMask, FreqMix, WaveMask, WaveMix, Dominant Shuffle
Decomposition‑based: STAug
Other: wDBA, MBB, Upsample
Patch‑based: TPS
RobustTAD
Applies discrete Fourier transform to the concatenated sequence, perturbs selected frequency bands (amplitude or phase) with a Gaussian‑controlled intensity, then inverse‑transforms back to the time domain.
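A hedged sketch of the idea (amplitude variant), assuming a univariate NumPy series and a single randomly chosen frequency band; the band width and noise scale are illustrative defaults, not the paper's settings:

```python
import numpy as np

def robusttad_freq_perturb(s, band_width=4, amp_sigma=0.1, rng=None):
    """RobustTAD-style sketch: perturb the amplitudes of one frequency
    band with Gaussian noise, then inverse-transform to the time domain."""
    rng = rng or np.random.default_rng()
    S = np.fft.rfft(s)
    amp, phase = np.abs(S), np.angle(S)
    start = rng.integers(0, max(1, len(S) - band_width))   # random band
    amp[start:start + band_width] *= 1 + amp_sigma * rng.normal(size=band_width)
    return np.fft.irfft(amp * np.exp(1j * phase), n=len(s))
```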
FreqMask and FreqMix
Both start with s = x ∥ y and compute its real FFT S = rFFT(s). FreqMask zeros out selected frequencies using a binary mask M ( S̃ = M ⊙ S), while FreqMix blends spectra from two sequences: S̃ = M ⊙ S₁ + (1−M) ⊙ S₂. The inverse FFT yields the augmented signal.
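The two operations can be sketched as follows, assuming univariate NumPy arrays and a Bernoulli keep-mask; the mask rate is an illustrative parameter:

```python
import numpy as np

def freq_mask(s, mask_rate=0.2, rng=None):
    """FreqMask: zero out a random subset of rFFT bins (S̃ = M ⊙ S)."""
    rng = rng or np.random.default_rng()
    S = np.fft.rfft(s)
    M = rng.random(S.shape) >= mask_rate       # binary keep-mask
    return np.fft.irfft(S * M, n=len(s))

def freq_mix(s1, s2, mix_rate=0.2, rng=None):
    """FreqMix: S̃ = M ⊙ S₁ + (1 − M) ⊙ S₂ for two equal-length sequences."""
    rng = rng or np.random.default_rng()
    S1, S2 = np.fft.rfft(s1), np.fft.rfft(s2)
    M = rng.random(S1.shape) >= mix_rate
    return np.fft.irfft(np.where(M, S1, S2), n=len(s1))
```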
WaveMask and WaveMix
Use the discrete wavelet transform (DWT) to decompose s into multi‑level coefficients W^{(l)}. WaveMask applies a mask per level (\tilde{W}^{(l)} = M^{(l)} ⊙ W^{(l)}), while WaveMix mixes coefficients from two sequences (\tilde{W}^{(l)} = M^{(l)} ⊙ W₁^{(l)} + (1−M^{(l)}) ⊙ W₂^{(l)}). The inverse DWT reconstructs the augmented series.
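A simplified WaveMask sketch, assuming a hand-rolled single-level Haar transform (the actual methods use a multi-level DWT, typically via a wavelet library) and an even-length series; the mask rate is illustrative:

```python
import numpy as np

def haar_dwt(s):
    a = (s[0::2] + s[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (s[0::2] - s[1::2]) / np.sqrt(2)   # detail coefficients
    return a, d

def haar_idwt(a, d):
    s = np.empty(2 * len(a))
    s[0::2] = (a + d) / np.sqrt(2)
    s[1::2] = (a - d) / np.sqrt(2)
    return s

def wave_mask(s, mask_rate=0.2, rng=None):
    """WaveMask sketch: randomly mask detail coefficients, here at a
    single Haar level instead of the paper's multi-level DWT."""
    rng = rng or np.random.default_rng()
    a, d = haar_dwt(s)
    d = d * (rng.random(d.shape) >= mask_rate)   # zero a fraction of details
    return haar_idwt(a, d)
```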
Dominant Shuffle
Selects the top‑k dominant frequencies from the FFT of s and shuffles only those components before inverse transforming, avoiding wholesale spectral distortion.
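A sketch of the idea in NumPy; the exact shuffling rule among the dominant bins is an assumption here:

```python
import numpy as np

def dominant_shuffle(s, k=3, rng=None):
    """Dominant Shuffle sketch: permute the k largest-magnitude non-DC
    rFFT components among themselves; the rest of the spectrum is untouched."""
    rng = rng or np.random.default_rng()
    S = np.fft.rfft(s)
    dom = np.argsort(np.abs(S[1:]))[-k:] + 1   # top-k bins, skipping DC
    S[dom] = S[rng.permutation(dom)]           # shuffle only dominant components
    return np.fft.irfft(S, n=len(s))
```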
STAug
Applies Empirical Mode Decomposition (EMD) to two sequences, obtains intrinsic mode functions (IMFs), then recombines them using mixup‑style interpolation weights sampled from a uniform distribution. High memory consumption limits its scalability.
Other Non‑Frequency Methods
wDBA aligns series via DTW and averages them; MBB decomposes each series with STL and bootstraps blocks of the residual component; Upsample linearly stretches a contiguous segment back to the original length, providing a strong non‑frequency baseline.
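A minimal sketch of the Upsample baseline, assuming a univariate NumPy series; the segment fraction is an illustrative parameter:

```python
import numpy as np

def upsample_aug(s, seg_frac=0.5, rng=None):
    """Upsample sketch: pick a contiguous segment covering seg_frac of the
    series and linearly interpolate it back to the original length."""
    rng = rng or np.random.default_rng()
    n = len(s)
    seg_len = max(2, int(n * seg_frac))
    start = rng.integers(0, n - seg_len + 1)
    seg = s[start:start + seg_len]
    # Stretch the segment over n evenly spaced sample points.
    return np.interp(np.linspace(0, seg_len - 1, n), np.arange(seg_len), seg)
```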
From Image Patches to Temporal Patches
Patch‑based augmentation works in vision because spatial redundancy tolerates local shuffling. In time series, naive non‑overlapping patches create hard boundaries and break input‑target alignment, so patch operations must be re‑designed for the temporal domain.
Temporal Patch Shuffle (TPS)
The TPS pipeline concatenates look‑back and horizon, extracts overlapping patches (length p, stride s), scores each patch by variance, selects the lowest‑variance fraction α for random shuffling, then reconstructs the series by averaging overlapping regions and finally splits back into augmented input and target.
Algorithm Details
Concatenate look‑back window and horizon to enforce data‑label consistency.
Extract overlapping patches using length p and stride s.
Compute variance of each patch (across channels) in normalized space; low variance patches are deemed safe to shuffle.
Randomly permute the selected α proportion of patches while leaving others unchanged.
Reconstruct the series by placing each patch back (averaging overlaps) to smooth discontinuities.
Split the reconstructed series back into augmented input and target.
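The steps above can be sketched end to end for a univariate series; the patch length, stride, and shuffle-ratio defaults are assumptions, and any tail not covered by a patch is copied from the original sequence:

```python
import numpy as np

def temporal_patch_shuffle(x, y, p=16, stride=8, alpha=0.5, rng=None):
    """TPS sketch following the six steps above (univariate case)."""
    rng = rng or np.random.default_rng()
    s = np.concatenate([x, y])                        # 1. data-label consistency
    starts = np.arange(0, len(s) - p + 1, stride)
    patches = np.stack([s[i:i + p] for i in starts])  # 2. overlapping patches
    var = patches.var(axis=1)                         # 3. variance per patch
    n_shuf = max(2, int(alpha * len(patches)))
    low = np.argsort(var)[:n_shuf]                    # lowest-variance patches
    patches[low] = patches[rng.permutation(low)]      # 4. shuffle only those
    out = np.zeros(len(s))
    cnt = np.zeros(len(s))
    for patch, st in zip(patches, starts):            # 5. overlap-average
        out[st:st + p] += patch
        cnt[st:st + p] += 1
    covered = cnt > 0
    out[covered] /= cnt[covered]
    out[~covered] = s[~covered]                       # tail beyond last patch
    return out[:len(x)], out[len(x):]                 # 6. split into (x̃, ỹ)
```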
Ablation Findings
Joint augmentation of input and target is decisive; augmenting only the input causes the largest performance drop.
Overlapping patches are crucial; non‑overlapping patches degrade results noticeably.
Variance‑based ranking yields modest gains; its benefit disappears when all patches are shuffled (α = 1.0).
Operating directly in the time domain outperforms FFT‑based variants.
Higher shuffle ratios (0.7–1.0) consistently deliver stronger performance.
Long‑Term Forecasting Results
Evaluated on nine long‑term datasets with five backbones (TSMixer, DLinear, PatchTST, TiDE, LightTS). TPS achieved the best average MSE on every backbone, improving the strongest competitor by 2.08%–10.51% (10.51% on LightTS).
Short‑Term Traffic Forecasting
On four PeMS traffic datasets (03, 04, 07, 08) using PatchTST, TPS again delivered the strongest augmentation gains, with MSE improvements ranging from 0.00% to 7.14%.
Extension to Time‑Series Classification
For classification, TPS skips concatenation and shuffles at the sample level. It achieves the highest average accuracy among compared augmentations on 30 univariate UCR datasets (≈+0.50% accuracy) and 10 multivariate UEA datasets (≈+1.10%).
Conclusion
TPS’s uniqueness stems from avoiding costly decomposition, refraining from indiscriminate spectral perturbation, and preserving input‑target alignment. By applying controlled randomness—overlapping patches, variance‑aware shuffling, and strict data‑label consistency—it consistently improves forecasting and classification across diverse models and datasets, achieving SOTA‑level augmentation performance.
DeepHub IMBA
