Tagged articles
20 articles
AI Algorithm Path
Jun 28, 2025 · Artificial Intelligence

Implementing Greedy and Beam Decoding for Large Language Models from Scratch

This article walks through the mechanics of greedy search and beam search in large language models, demonstrates both methods with GPT‑2 on the prompt "I have a dream", visualizes the decoding trees, compares their scores, and discusses the trade‑offs between efficiency and output quality.

Beam Search · GPT-2 · Greedy Search
0 likes · 16 min read
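As a rough illustration of the two decoding methods this article covers, the sketch below runs greedy search and beam search over a tiny hand-written next-token table instead of GPT-2. The table `TOY_MODEL`, its probabilities, and the function names are all hypothetical and chosen only so that the two methods disagree; this is not the article's own code.

```python
import math

# Hypothetical toy "language model": maps a prefix (tuple of tokens) to
# next-token probabilities. The numbers are made up for illustration.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_log_probs(prefix):
    """Return log-probabilities of each possible next token after `prefix`."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(prefix)].items()}


def greedy_search(steps):
    """At every step, keep only the single most probable next token."""
    seq, score = [], 0.0
    for _ in range(steps):
        log_probs = next_log_probs(seq)
        tok = max(log_probs, key=log_probs.get)
        seq.append(tok)
        score += log_probs[tok]
    return seq, score


def beam_search(steps, num_beams):
    """Keep the `num_beams` highest-scoring partial sequences at every step."""
    beams = [([], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # Sort by cumulative log-probability and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]


greedy_seq, greedy_score = greedy_search(steps=2)
beam_seq, beam_score = beam_search(steps=2, num_beams=2)
print(greedy_seq, math.exp(greedy_score))
print(beam_seq, math.exp(beam_score))
```

With this table, greedy search commits to "a" at the first step and cannot recover, while a beam width of 2 keeps "b" alive long enough to find the higher-probability sequence. The extra work per step is the efficiency cost that the summary refers to.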
AI Algorithm Path
Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM
0 likes · 9 min read
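The effect of the three sampling parameters named in this summary can be sketched in a few lines of plain Python. The logits and helper names below are hypothetical, and real inference libraries may differ in detail (for example in tie handling or in the order the filters are applied), so treat this as a minimal sketch rather than any library's implementation.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    keep = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
base = softmax_with_temperature(logits, temperature=1.0)
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print(base, sharp, flat)
print(top_k_filter(base, k=2))
print(top_p_filter(base, p=0.9))
```

A sampler would then draw the next token from the filtered distribution, for instance with `random.choices`. Temperature reshapes the whole distribution, whereas top-k and top-p cut off its tail.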
Tencent Advertising Technology
Jan 9, 2025 · Artificial Intelligence

Applying Large Language Models to Search Advertising: End‑to‑End Generative Recall and System Optimizations

This report details how large language models (LLMs) were integrated into Tencent's search advertising pipeline—from early extraction‑distillation experiments in 2023 to a 2024 end‑to‑end generative recall architecture—showing significant improvements in relevance, diversity, and revenue through knowledge injection, supervised fine‑tuning, constrained beam‑search decoding, and high‑performance inference services.

AI · Beam Search · LLM
0 likes · 11 min read
Huawei Cloud Developer Alliance
Nov 30, 2023 · Artificial Intelligence

Mastering LLM Text Generation: Decoding Methods Explained

This review of the recent MindSpore NLP public class walks through the fundamentals of large language model text generation, detailing deterministic decoding such as greedy and beam search, stochastic sampling techniques like temperature, top‑k and top‑p, and advanced methods including constrained beam, contrastive, and assisted search, with illustrative examples.

Beam Search · Greedy Search · LLM
0 likes · 5 min read
Baidu Geek Talk
Aug 21, 2023 · Artificial Intelligence

Decoding Strategies for Generative Models: Top‑k, Top‑p, Contrastive Search, Beam Search, and Sampling

The article explains how generative models use deterministic methods like greedy and beam search and stochastic techniques such as top‑k, top‑p, contrastive search and sampling, describing their mechanisms, temperature control, repetition penalties, and practical trade‑offs for balancing fluency, diversity and coherence.

AI · Beam Search · Sampling
0 likes · 9 min read
Architect
Jul 1, 2023 · Artificial Intelligence

Comprehensive Guide to Text Generation Decoding Strategies with HuggingFace Transformers

This tutorial explores various text generation decoding methods—including greedy search, beam search, top‑k/top‑p sampling, sample‑and‑rank, and group beam search—explaining their principles, providing detailed Python code examples, and comparing their use in modern large language models.

Beam Search · Greedy Search · Sampling
0 likes · 59 min read
Tencent Cloud Developer
Jun 1, 2023 · Artificial Intelligence

A Comprehensive Guide to Decoding Strategies for Text Generation with HuggingFace Transformers

This guide thoroughly explains the major decoding strategies for neural text generation in HuggingFace Transformers—including greedy, beam, diverse beam, sampling, top‑k, top‑p, sample‑and‑rank, beam sampling, and group beam search—detailing their principles, Python implementations with LogitsProcessor components, workflow diagrams, comparative analysis, and references to original research.

Beam Search · Sampling · Text Generation
0 likes · 60 min read
DaTaobao Tech
Apr 7, 2023 · Artificial Intelligence

Two‑Level Store Recommendation and Experience Optimization in Taobao’s Daily Good Store

Taobao’s Daily Good Store tackles a two‑level recommendation challenge by jointly ranking shops and their items through a dual‑link system enhanced with a novel scatter‑score metric, personalized category scattering via Earth Mover’s Distance, beam‑search optimization, and UI upgrades, delivering higher efficiency, relevance, diversity, and ecosystem health.

Beam Search · User experience · recommendation system
0 likes · 11 min read
Kuaishou Tech
Oct 21, 2022 · Artificial Intelligence

Real-time Short Video Recommendation on Mobile Devices: System Design, Model Architecture, and Experimental Evaluation

The paper presents a lightweight on‑device re‑ranking system for short‑video recommendation that leverages real‑time user feedback and context‑aware generative ranking, detailing its architecture, feature engineering, beam‑search optimization, and both offline and online experimental results showing significant performance gains.

Beam Search · Context-Aware · feature engineering
0 likes · 12 min read
Zuoyebang Tech Team
Jul 14, 2022 · Artificial Intelligence

Enhancing Speech Keyword Detection Using Prefix Automaton Beam Search

This article presents a method to improve keyword detection in large‑scale speech recognition by integrating a prefix automaton into the beam‑search decoding of seq2seq models, enabling real‑time addition of new terms while reducing computational overhead compared to traditional approaches.

Beam Search · Seq2Seq · keyword detection
0 likes · 12 min read
Youku Technology
Feb 28, 2022 · Artificial Intelligence

Seq2Path: Generating Sentiment Tuples as Paths of a Tree

Seq2Path treats each sentiment tuple as an independent tree path, training with average path loss and decoding via constrained beam search with a discriminative token, achieving state‑of‑the‑art results on five aspect‑based sentiment analysis datasets and deployment in Alibaba Entertainment AI Brain.

ACL · Beam Search · Information Extraction
0 likes · 3 min read
DataFunTalk
Nov 10, 2021 · Artificial Intelligence

Learnable Index Structures for Large‑Scale Retrieval: Deep Retrieval Model and Training Methods

This article introduces ByteDance's Deep Retrieval (DR) framework, describing its learnable index structure that aligns embedding training with retrieval objectives, detailing the core model, structure‑loss training via EM and online EM algorithms, beam‑search serving, multi‑task learning, and practical insights from Q&A.

Beam Search · EM algorithm · Recommendation Systems
0 likes · 11 min read
Alimama Tech
Sep 8, 2021 · Artificial Intelligence

Engineering Optimizations for Large‑Scale Advertising Recall Models: Full‑Cache Scoring and Index Flattening

Alimama’s advertising platform modernized its Tree‑based Deep Model along two lines: a dual‑tower full‑library DNN with aggressive pre‑filtering and custom GPU TopK kernels, and a flattened‑tree model that retains beam search with multi‑head attention. Memory‑aware tricks such as attention swapping, softmax approximation, tiled‑matmul splitting, TensorCore batching, INT8 quantization, and cache‑resident ad vectors enable multi‑fold latency reductions with minimal recall loss.

Beam Search · GPU Acceleration · Model Optimization
0 likes · 15 min read
DataFunTalk
Feb 15, 2021 · Artificial Intelligence

Deep Tree Matching (TDM): Evolution and Practice in Large-Scale Retrieval at Alibaba

This article explains Alibaba's Deep Tree Matching (TDM) technology, covering the challenges of large‑scale match retrieval, the progression from classic two‑stage recall to tree‑based indexing, max‑heap tree modeling, beam‑search retrieval, and the joint model‑index learning across TDM 1.0, 2.0, and 3.0, highlighting significant offline and online performance gains and future research directions.

Alibaba · Beam Search · Deep Learning
0 likes · 15 min read
New Oriental Technology
Feb 1, 2021 · Artificial Intelligence

Neural Machine Translation: Seq2Seq, Beam Search, BLEU, Attention Mechanisms, and GNMT Improvements

This article explains key concepts of neural machine translation, covering Seq2Seq encoder‑decoder models, beam search strategies, BLEU evaluation, various attention mechanisms, and the enhancements introduced in Google's Neural Machine Translation system to improve speed, OOV handling, and translation quality.

BLEU · Beam Search · GNMT
0 likes · 11 min read
Didi Tech
Oct 10, 2020 · Artificial Intelligence

Deep Reinforcement Learning for Route Planning in DiDi Ride‑Hailing

DiDi’s route engine, handling over 40 billion daily requests, replaces static graph algorithms with a deep‑reinforcement‑learning system that first learns intersection decisions via behavior‑cloning LSTM models and then refines them through self‑play Q‑learning, using beam‑search decoding to produce globally optimal, low‑deviation routes for ride‑hailing.

AI · Beam Search · Route Planning
0 likes · 12 min read
DataFunTalk
Sep 4, 2020 · Artificial Intelligence

Beam Search Aware Training for Optimal Tree-Based Retrieval Models

This article presents a comprehensive study of tree-based deep models for large-scale matching, introduces the theoretical framework of optimal tree models, proposes a Beam Search aware training algorithm (BSAT/OTM) to address training-test mismatch, and demonstrates significant recall improvements on Amazon Books and UserBehavior datasets.

Beam Search · Deep Learning · large-scale matching
0 likes · 23 min read
DataFunTalk
Dec 20, 2019 · Artificial Intelligence

AutoCross: Automatic Feature Crossing for Tabular Data in Real-World Applications

The article presents AutoCross, a system that automatically generates and selects high‑order feature crossings for tabular data using multi‑granularity discretization, beam search, field‑wise logistic regression and successive mini‑batch gradient descent, achieving superior accuracy and efficiency in large‑scale recommendation scenarios.

AutoCross · Beam Search · Recommendation Systems
0 likes · 10 min read
Hulu Beijing
Dec 14, 2017 · Artificial Intelligence

Understanding Seq2Seq: Framework, Advantages, and Decoding Techniques

This article explains the Seq2Seq encoder‑decoder framework, its benefits for various sequence modeling tasks, and compares common decoding strategies such as greedy search and beam search, while also introducing attention and other enhancements for improved performance.

Beam Search · Encoder-Decoder · attention
0 likes · 9 min read