
From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

MARKERGEN is a plug‑and‑play framework that decomposes length‑controllable text generation into four sub‑abilities—identifying, counting, planning, and aligning—integrates external tokenizers with dynamically inserted length markers, and achieves substantially lower length error and higher output quality across diverse models, tasks, and languages.

Xiaohongshu Tech REDtech

Length‑controllable text generation (LCTG) remains a bottleneck for large language models (LLMs). Existing end‑to‑end methods lack fine‑grained supervision of the sub‑abilities required for precise length control, leading to poor generalization across tasks, scales, and languages.

The authors propose a bottom‑up decomposition of LCTG into four sub‑abilities: Identifying, Counting, Planning, and Aligning. Detailed error analyses reveal that Identifying and Counting errors dominate overall performance.
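To make the Counting diagnosis concrete, here is a minimal sketch of how a counting‑error probe might look: the model's self‑reported length is compared against an external count. The whitespace tokenizer and the `counting_error` helper are illustrative assumptions, not the paper's actual diagnostic code.

```python
# Illustrative probe for the "Counting" sub-ability: compare a model's
# self-reported word count against an external counter's count.
# The helper name and whitespace tokenizer are assumptions for illustration.

def counting_error(text: str, model_reported: int) -> float:
    """Relative error between the model's claimed length and the true length."""
    actual = len(text.split())  # external counter: simple whitespace tokenizer
    return abs(model_reported - actual) / max(actual, 1)

sample = "MarkerGen separates semantic generation from length control"
print(counting_error(sample, model_reported=10))  # 8 claimed vs. 7 actual words: ~0.429
```

A probe like this, run over many samples, is the kind of measurement that lets identifying and counting errors be separated from planning and aligning errors.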

Based on this analysis, the MARKERGEN framework is introduced. It augments LLMs with external tokenizers and counters to compensate for deficiencies in basic length modeling, and dynamically inserts explicit length markers during generation. A three‑stage decoupled generation paradigm—Planning, Semantic Focusing, and Length Alignment—separates semantic generation from length constraints.
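The core mechanism—interleaving explicit remaining‑budget markers into the output stream—can be sketched as follows. The marker format, the fixed interval, and the whitespace tokenizer are illustrative assumptions, not MARKERGEN's exact scheme.

```python
# Minimal sketch of explicit length-marker insertion, assuming a whitespace
# tokenizer as the external counter and a fixed insertion interval.
# Marker format "[N words left]" is an illustrative choice.

def insert_markers(words: list[str], target: int, interval: int = 10) -> str:
    out = []
    for i, w in enumerate(words, start=1):
        out.append(w)
        if i % interval == 0 and i < target:
            out.append(f"[{target - i} words left]")  # explicit remaining budget
    return " ".join(out)

text = "one two three four five six seven eight nine ten eleven twelve".split()
print(insert_markers(text, target=12, interval=5))
```

In the actual framework the counting is delegated to an external tool rather than the model, which is what compensates for the counting deficiency diagnosed above.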

Extensive experiments across multiple LLMs (e.g., Qwen2.5, Llama‑3.1) and tasks (summarization, storytelling, QA) show that MARKERGEN reduces average absolute length error from 18.32% to 5.57% (a 12.75‑point absolute reduction) while improving quality scores and consuming only ~64% of the token budget.
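The error figures above are percentages, so the underlying metric is presumably the mean relative deviation from the target length; the exact definition below is our assumption, sketched for clarity.

```python
# Assumed definition of average absolute length error, as a percentage:
# mean over samples of |actual - target| / target.

def avg_abs_length_error(pairs: list[tuple[int, int]]) -> float:
    """pairs: (actual_length, target_length) for each generated sample."""
    return 100 * sum(abs(a - t) / t for a, t in pairs) / len(pairs)

# e.g. three samples against 100-token targets
print(avg_abs_length_error([(95, 100), (110, 100), (100, 100)]))  # 5.0
```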

Further evaluations demonstrate strong generalization across models, tasks, length scales (18–1450 tokens), constraint granularities (exact vs. interval), and languages (including a Chinese GAOKAO benchmark), consistently keeping violation rates below 3%.

Ablation studies on TruthfulQA with Qwen2.5‑32B‑Instruct confirm the importance of external tool calls, the decaying‑interval marker insertion strategy, and the three‑stage decoupled generation, each contributing to lower error and higher quality.

Attention‑map analysis on Llama‑3.1‑8B‑Instruct shows that shallow layers focus on the inserted length markers for explicit length modeling, while deeper layers shift attention to semantic content, illustrating the two‑phase workflow of MARKERGEN.

Overall, MARKERGEN provides a plug‑and‑play, efficient solution that bridges the gap between sub‑ability diagnosis and human‑aligned generation, setting a new benchmark for industrial‑grade length‑controllable text generation.

Tags: LLM · text generation · Length-Controlled Generation · MarkerGen · Sub-Ability Diagnosis
Written by

Xiaohongshu Tech REDtech

Official account of the Xiaohongshu tech team, sharing technical innovations and engineering insights.
