Tagged articles

VQ-VAE

6 articles · Page 1 of 1
Alimama Tech
May 28, 2026 · Artificial Intelligence

TAR: Multi‑Scale Trajectory Model Fixes Granularity Mismatch, Raising CTR >12%

The paper introduces the Trajectory Auto‑Regressive (TAR) model, which combines multi‑scale trajectory generation, VQ‑VAE latent compression, and a state‑action fusion architecture to address the granularity mismatch between fine‑grained decision steps and coarse‑grained feedback in online advertising. It achieves a CTR lift of over 12%, smoother budget pacing, and faster inference than prior baselines.

Budget Pacing · Multi-Scale Generation · Online Advertising
0 likes · 18 min read
Bighead's Algorithm Notes
Apr 9, 2026 · Artificial Intelligence

WSDM 2026 Quantitative Research Papers: Summaries and Insights

This article presents concise summaries of three recent AI‑driven finance papers—Diffolio’s diffusion‑based risk‑aware portfolio optimization, STORM’s dual‑vector‑quantized VAE factor model, and AutoHypo‑Fin’s autonomous web‑mined hypothesis generation—highlighting their motivations, methods, and experimental gains.

AI for finance · Diffusion Models · VQ-VAE
0 likes · 9 min read
Bighead's Algorithm Notes
Apr 6, 2026 · Artificial Intelligence

STORM: A Bidirectional Spatiotemporal Factor Model Achieving Sharpe Ratio >1

STORM introduces a bidirectional VQ‑VAE‑based spatiotemporal factor model that extracts fine‑grained time‑series and cross‑sectional features, uses discrete codebooks for orthogonal, diverse factor embeddings, and outperforms nine baselines on portfolio management and algorithmic trading tasks, delivering Sharpe ratios exceeding 1.

Algorithmic Trading · Portfolio Management · Transformer
0 likes · 17 min read
DevOps
Apr 13, 2025 · Artificial Intelligence

The Amazing Magic of GPT‑4o and a Speculative Technical Roadmap

This article reviews the breakthrough image‑generation capabilities of GPT‑4o, showcases diverse examples, and offers a detailed speculation on its underlying autoregressive architecture, tokenization methods, VQ‑VAE/GAN advances, and training strategies that could explain its performance.

AI research · GPT-4o · Tokenization
0 likes · 16 min read
Tencent Cloud Developer
Apr 10, 2025 · Artificial Intelligence

The Magic of GPT‑4o: Technical Overview and Speculated Architecture

GPT‑4o combines extremely long‑form text generation, high‑quality image creation, and interactive editing, likely by using an autoregressive multimodal transformer that tokenizes visuals via VQ‑VAE/GAN pipelines, is trained on massive data, and is refined through fine‑tuning and RLHF. The result is a unified model for generation, editing, and understanding.

GPT-4o · Multimodal AI · VQ-VAE
0 likes · 17 min read
Xiaohongshu Tech REDtech
Aug 10, 2022 · Artificial Intelligence

Multi-Stage Multi-Codebook VQ-VAE for High-Performance Neural Text-to-Speech (MSMC‑TTS)

The MSMC‑TTS system, a neural text‑to‑speech solution based on a multi‑stage multi‑codebook VQ‑VAE, delivers near‑human audio quality (MOS 4.41) with a compact 3.12 MB acoustic model, substantially surpassing Mel‑spectrogram FastSpeech baselines in naturalness and efficiency.

Compact Representation · Multi-Stage Modeling · Speech synthesis
0 likes · 10 min read