From Bayesian to LLMs: A Comprehensive Survey of Recent Temporal Point Process Advances

This article reviews the rapid evolution of Temporal Point Processes, covering Bayesian non‑parametric models, neural architectures—including RNN, Transformer, and ODE‑based designs—and the emerging LLM‑driven approaches, while discussing training methods, benchmarks, applications, and open research challenges.

Machine Heart

Jun 16, 2026

Sequence models in machine learning typically assume regularly spaced observations, but many real‑world events occur at irregular timestamps and carry rich contextual information. Temporal Point Processes (TPPs) model such data by describing how event times are generated in continuous time.

Why revisit TPP?

Classic models such as Poisson, Hawkes, and self‑correcting processes have long been used for phone calls, earthquakes, financial trades, neural spikes, and social diffusion. In recent years the research focus has shifted, for three reasons:

Parametric models are interpretable but lack expressive power for nonlinear, non‑stationary, multi‑type events with complex context.

Deep learning introduces flexible representation learning, allowing RNN, LSTM, Transformer, ODE/SDE, and diffusion models to capture intricate dynamics.

Large language models (LLMs) expand the scope from pure time‑type pairs to multimodal event histories, enabling semantic understanding of event streams.

Three main research strands

Bayesian TPP: Emphasizes uncertainty quantification and principled inference. Non‑parametric Bayesian approaches treat the intensity function as an infinite‑dimensional object, placing Gaussian‑process priors on the intensity of Poisson processes and extending the idea to Hawkes processes. Inference is doubly intractable, since both the likelihood (which contains an integral of the intensity) and the posterior normalizer lack closed forms; this has motivated MCMC, Laplace approximations, variational inference, and Pólya‑Gamma augmentation. These methods offer interpretability and calibrated uncertainty but are computationally intensive on large datasets.
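
To see where the double intractability comes from, it helps to write out one common construction. The form below is an assumption chosen for illustration, not necessarily the exact model the survey uses: a Gaussian‑process prior is passed through a sigmoid link σ and scaled by an upper bound λ* to give a non‑negative Poisson intensity on an observation window [0, T], with k a covariance kernel.

```latex
\lambda(t) = \lambda^{*}\,\sigma\bigl(g(t)\bigr), \qquad g \sim \mathcal{GP}(0, k),
\qquad
p\bigl(\{t_i\}_{i=1}^{N} \mid g\bigr)
  = \exp\!\Bigl(-\int_{0}^{T} \lambda(s)\,ds\Bigr)\prod_{i=1}^{N} \lambda(t_i).
```

The integral of a random function g inside the likelihood has no closed form, and the posterior over g additionally needs a normalizing constant that is itself intractable, which is why sampling, approximation, or augmentation schemes are required.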

Neural TPP: Focuses on expressive power and end‑to‑end prediction. Main architectures include:

Recurrent neural TPPs (RNN/LSTM) that compress history into hidden states for fast online prediction but suffer from limited training parallelism and weak modeling of long‑range dependencies.

Autoregressive neural TPPs, especially Transformer‑based models, which use self‑attention to capture long‑range effects and support parallel training; however, their time and memory costs grow quadratically with sequence length.

ODE/SDE‑based neural TPPs that evolve hidden states continuously between events, providing a more natural representation of intensity dynamics at the cost of slower training and sampling.

Diffusion‑based TPPs that generate entire event sequences via iterative denoising, offering new perspectives for long‑horizon prediction but with high computational overhead.

The survey also highlights the emerging trend of combining efficient sequence models such as RWKV, S4, and Mamba with TPPs to improve scalability.

LLM‑based TPP: Divided into two categories.

LLM‑inspired TPPs augment existing neural TPPs with prompt learning or reasoning (e.g., PromptTPP, LAMP) to improve adaptability and interpretability.

Direct LLM‑TPP integration treats the event sequence as textual input, injects temporal embeddings, and fine‑tunes large language models (e.g., TPP‑LLM, Language‑TPP) to handle multimodal descriptions and semantic reasoning.

These approaches broaden TPP tasks to include retrieval, question answering, and multimodal inference, though their advantage on pure time‑prediction benchmarks remains unclear.

Core concept: Conditional intensity

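The conditional intensity function λ(t|H_t) is the instantaneous event rate given the full history H_t: for a small window dt, λ(t|H_t) dt is approximately the probability that an event occurs in [t, t + dt). In a Poisson process the intensity does not depend on past events, whereas Hawkes processes add history‑dependent triggering functions through which each past event raises the future intensity. The fitted triggering structure between event types is what enables causal discovery via Granger‑type analysis.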
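
As a minimal sketch of the difference, the snippet below evaluates both intensities on a small hypothetical history. The exponential triggering kernel and the parameter names `mu`, `alpha`, and `beta` are assumptions for illustration; the survey covers more general kernels.

```python
import math

def poisson_intensity(t, history, mu=0.5):
    """Homogeneous Poisson process: the rate ignores the history entirely."""
    return mu

def hawkes_intensity(t, history, mu=0.5, alpha=0.8, beta=1.0):
    """Univariate Hawkes process with an (assumed) exponential triggering kernel.

    lambda(t | H_t) = mu + sum over past events t_i < t of
                      alpha * exp(-beta * (t - t_i))
    """
    excitation = sum(alpha * math.exp(-beta * (t - ti))
                     for ti in history if ti < t)
    return mu + excitation

history = [1.0, 1.5, 3.0]  # hypothetical past event times
for t in (0.5, 1.6, 3.1, 10.0):
    print(f"t={t:5.1f}  poisson={poisson_intensity(t, history):.3f}  "
          f"hawkes={hawkes_intensity(t, history):.3f}")
```

The Hawkes intensity jumps right after each event and decays back toward the baseline `mu`, while the Poisson intensity stays flat regardless of what has happened.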

Parameterization choices

Neural TPPs traditionally parameterize the intensity, so maximum‑likelihood training requires integrating it over the observation window, usually numerically. Recent "intensity‑free" work avoids this cost by directly parameterizing the conditional density of the next inter‑event time (for example with log‑normal mixtures) or the cumulative intensity (for example with monotonic neural networks).
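
The trade‑off can be illustrated with a small sketch. The first function computes the intensity‑based log‑likelihood, sum of log λ(t_i) minus the integral of λ over [0, T], using a midpoint rule; an exponential‑kernel Hawkes intensity stands in for a neural intensity only because it has a closed‑form integral to check against. The second part evaluates a log‑normal mixture density in closed form. All function names and parameter values are hypothetical, and in a real intensity‑free model the mixture parameters would be produced by a network conditioned on the history.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    # Intensity just before time t (events at exactly t are excluded).
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

def loglik_numeric(events, T, mu, alpha, beta, n_grid=20000):
    """Intensity-based log-likelihood with the integral done by the midpoint rule."""
    log_term = sum(math.log(hawkes_intensity(ti, events, mu, alpha, beta))
                   for ti in events)
    h = T / n_grid
    integral = h * sum(hawkes_intensity((k + 0.5) * h, events, mu, alpha, beta)
                       for k in range(n_grid))
    return log_term - integral

def loglik_closed_form(events, T, mu, alpha, beta):
    """Same quantity, using the analytic integral of the exponential kernel."""
    log_term = sum(math.log(hawkes_intensity(ti, events, mu, alpha, beta))
                   for ti in events)
    integral = mu * T + sum(alpha / beta * (1.0 - math.exp(-beta * (T - ti)))
                            for ti in events)
    return log_term - integral

def lognormal_mixture_logpdf(tau, weights, means, stds):
    """log p(tau) for a mixture of log-normals; no integration needed."""
    comps = []
    for w, m, s in zip(weights, means, stds):
        z = (math.log(tau) - m) / s
        comps.append(math.log(w)
                     - math.log(tau * s * math.sqrt(2.0 * math.pi))
                     - 0.5 * z * z)
    top = max(comps)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))

events = [0.5, 1.2, 3.0]  # hypothetical event times on [0, 5]
print(loglik_numeric(events, 5.0, 0.5, 0.8, 1.0))
print(loglik_closed_form(events, 5.0, 0.5, 0.8, 1.0))
print(lognormal_mixture_logpdf(0.7, [0.6, 0.4], [-0.5, 0.5], [0.5, 1.0]))
```

The numeric version needs thousands of intensity evaluations per sequence to approximate what the closed form gives directly, which is the cost that intensity‑free parameterizations are designed to remove.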

Datasets, benchmarks, and evaluation

The field suffers from fragmented datasets and inconsistent preprocessing. Tools like EasyTPP aim to provide unified benchmarks, standardizing data splits and evaluation scripts. Common tasks include next‑event prediction, long‑horizon prediction, semantic/multimodal tasks, and causal discovery.

Applications

TPP applications fall into event prediction (e.g., social reposts, epidemic spread, aftershock forecasting, financial order flow, recommendation clicks) and causal discovery (e.g., neural connectivity, financial market impact, AIOps root‑cause analysis, epidemiology, cybersecurity).

Future challenges

Standardization of data formats and model evaluation.

Interpretability of neural and LLM‑based TPPs, especially for causal inference.

Scalability to millions of timestamps and multi‑type interactions.

Efficient sampling methods beyond thinning and inverse‑transform, including diffusion, flow‑based, and speculative decoding techniques.

Multimodal modeling that aligns textual, visual, and sensor modalities with continuous time.
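
For context on the sampling challenge listed above, the thinning baseline can be sketched as follows. This is a minimal version of Ogata‑style thinning for a univariate Hawkes process with an exponential kernel; the kernel, the function name, and the parameters are assumptions for illustration. It relies on the fact that this intensity only decays between events, so its value just after the current time is a valid upper bound until the next event.

```python
import math
import random

def sample_hawkes_thinning(mu, alpha, beta, T, seed=0):
    """Draw event times on [0, T) by thinning (exponential-kernel Hawkes)."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Upper bound: intensity immediately after time t, counting any
        # event that has just been accepted at t with its full jump alpha.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        # Propose the next candidate from a Poisson process with rate lam_bar.
        t += rng.expovariate(lam_bar)
        if t >= T:
            break
        # True intensity at the candidate time (all accepted events precede it).
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        # Accept with probability lam_t / lam_bar, otherwise discard (thin).
        if rng.random() * lam_bar <= lam_t:
            events.append(t)
    return events

sample = sample_hawkes_thinning(mu=0.5, alpha=0.8, beta=1.0, T=20.0, seed=42)
print(len(sample), "events; first few:", [round(x, 3) for x in sample[:5]])
```

Each candidate requires re‑evaluating the intensity over the whole history, and rejected candidates are wasted work, which is part of why faster samplers remain an open problem.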

Conclusion

The surveyed TMLR paper "Advances in Temporal Point Processes: Bayesian, Neural, and LLM Approaches" (Zhou et al., 2025; https://openreview.net/forum?id=SXgGKkShhT) argues that TPPs are converging: statistical foundations provide calibrated intensity and causal tools, deep learning supplies expressive representations, and LLMs inject semantic and multimodal reasoning. The next generation of TPPs is expected to move beyond predicting "when the next event occurs" toward a unified framework that both predicts and understands complex continuous‑time event streams.
