Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models

This paper introduces Parrot, a system that enhances large language models' (LLMs) multi-turn instruction following capabilities through context-aware preference optimization (CaPO) and synthetic data generation, achieving significant performance improvements with limited training data.

Kuaishou Tech
Kuaishou Tech
Kuaishou Tech
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models

This research presents Parrot, a framework designed to improve large language models' (LLMs) ability to follow multi-turn instructions. The system introduces context-aware preference optimization (CaPO) to address challenges in handling complex dialogue contexts, such as references and omissions. By training a model (Parrot-Ask) to generate human-like multi-turn dialogues and using this data for fine-tuning, Parrot achieves substantial performance gains. The approach combines data synthesis with preference optimization to enhance context understanding, outperforming baseline models like Vicuna and achieving up to 7% absolute improvement with only 40k training samples.

The paper also develops MT-Bench++, an evaluation benchmark for multi-turn instruction following, which includes 8 rounds of dialogue. Experimental results show that Parrot-Chat, the optimized model, surpasses existing open-source models in both MT-Bench and MT-Bench++ evaluations. The CaPO strategy, which generates negative examples simulating context-related errors, further boosts performance by 2.4% when combined with multiple error scenarios.

Key contributions include a new data collection methodology using Parrot-Ask to generate high-quality multi-turn instructions and a systematic evaluation framework. While limited by dataset size and reliance on ChatGPT for data generation, the work advances LLM capabilities in real-world dialogue scenarios.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

large language modelsNLPmulti-turn dialoguedata synthesisCaPOinstruction following
Kuaishou Tech
Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.