Tag

constrained optimization


DataFunSummit
Dec 27, 2023 · Artificial Intelligence

Two-Stage Constrained Actor-Critic for Short‑Video Recommendation and a Reinforcement‑Learning Multi‑Task Framework

This article presents a two‑stage constrained actor‑critic (TSCAC) algorithm that models short‑video recommendation as a constrained reinforcement‑learning problem, details its theoretical formulation and optimization loss, and evaluates it in extensive offline and online experiments, where it outperforms the compared methods. It then introduces a multi‑task reinforcement‑learning framework (RMTL) that further improves multi‑objective recommendation performance.

Recommendation systems · constrained optimization · multi-task learning
16 min read
Sohu Tech Products
Nov 8, 2023 · Artificial Intelligence

Two‑Stage Constrained Actor‑Critic for Short‑Video Recommendation and a Reinforcement‑Learning Multi‑Task Recommendation Framework

The presentation introduces a two‑stage constrained actor‑critic algorithm that learns auxiliary policies for interaction signals before optimizing watch‑time under KL constraints, and a reinforcement‑learning multi‑task learning framework that models session‑level dynamics with adaptive multi‑critic weighting, both achieving significant offline and online gains in short‑video recommendation.

Recommendation systems · actor-critic · constrained optimization
16 min read
DataFunTalk
Nov 6, 2023 · Artificial Intelligence

Two‑Stage Constrained Actor‑Critic Reinforcement Learning for Short‑Video Recommendation and a Multi‑Task RL Framework

This article presents a two‑stage constrained actor‑critic reinforcement learning algorithm for short‑video recommendation. It models the problem as a constrained MDP, details the algorithm’s stages, and reports extensive offline and online experiments showing superior watch‑time and interaction metrics. It closes with a multi‑task RL framework and its evaluation.

Recommendation systems · constrained optimization · multi-task learning
16 min read
Kuaishou Tech
Apr 27, 2023 · Artificial Intelligence

Two-Stage Constrained Actor‑Critic (TSCAC) for Short‑Video Recommendation

The paper models short‑video recommendation as a constrained Markov decision process and introduces a two‑stage constrained actor‑critic algorithm that maximizes watch time while satisfying multiple interaction constraints, demonstrating superior performance offline on the KuaiRand dataset and online in the Kuaishou app.

Recommendation systems · actor-critic · constrained optimization
7 min read
Xiaohongshu Tech REDtech
Oct 19, 2022 · Artificial Intelligence

Modeling and Optimizing Real‑Time Bidding for Xiaohongshu "Fries" Advertising

Xiaohongshu’s commercial team modeled the real‑time bidding process for its “Fries” ad product, derived the optimal bid formula from a linear‑programming formulation, and implemented a simple two‑parameter, PID‑controlled scheme that, together with practical heuristics, meets client pacing and delivery guarantees as well as platform profit goals.

Real-Time Bidding · advertising optimization · algorithmic strategy
12 min read
Alimama Tech
Sep 29, 2021 · Artificial Intelligence

Unified Solution to Constrained Bidding in Online Display Advertising (USCB)

The paper proposes a unified solution for real‑time bidding in online display ads that formulates advertiser budget and KPI limits as a constrained linear program, derives a closed‑form optimal bidding function with m+1 parameters, and uses model‑free reinforcement learning to dynamically adjust those parameters, achieving superior traffic‑value capture in large‑scale deployment on Alibaba’s Taobao platform.

Real-Time Bidding · advertising · constrained optimization
11 min read