Tagged articles
8 articles
Fun with Large Models
Nov 14, 2025 · Artificial Intelligence

Can GPT‑5.1’s Core Features Set a New Benchmark for Model Performance?

The article provides an in‑depth analysis of GPT‑5.1, highlighting its enhanced emotional conversation, stronger instruction‑following, superior code generation and physics simulation, and the new adaptive reasoning mechanism with two model variants, while comparing concrete test results against GPT‑5.

GPT-5.1 · adaptive reasoning · conversation
0 likes · 9 min read
Baidu Tech Salon
Oct 24, 2025 · Artificial Intelligence

How Wenxin X1.1 Tops China’s LLMs on the New SuperCLUE-CPIF Benchmark

The recently released SuperCLUE-CPIF benchmark shows Baidu’s Wenxin X1.1 achieving the highest score among Chinese large language models, surpassing competitors such as DeepSeek‑V3.2‑Exp‑Thinking and Hunyuan‑T1, with notable advantages in precise instruction following and complex task handling.

AI Evaluation · Benchmark · Wenxin X1.1
0 likes · 4 min read
Meituan Technology Team
Aug 28, 2025 · Artificial Intelligence

How Meeseeks Redefines LLM Instruction-Following Evaluation

Meeseeks, a new benchmark released by Meituan’s M17 team, systematically evaluates large language models’ instruction‑following ability with a three‑tier framework, multi‑round self‑correction, and extensive real‑world data, revealing performance gaps among models such as OpenAI o‑series, Claude, DeepSeek and Qwen2.5.

AI · Benchmark · LLM evaluation
0 likes · 13 min read
AntTech
Jun 4, 2025 · Artificial Intelligence

LLaDA and LLaDA‑V: Large Language Diffusion Models and Their Multimodal Extensions

This article presents the LLaDA series of diffusion‑based large language models, explains how their generative‑modeling principle yields language intelligence comparable to autoregressive models, and details the multimodal LLaDA‑V architecture, training methods, experimental results, and broader implications for AI research.

Generative Modeling · Multimodal AI · diffusion models
0 likes · 10 min read
Ops Development & AI Practice
Apr 5, 2025 · Artificial Intelligence

Why Do LLMs Follow Instructions So Well? Unpacking the Secrets

This article explains the concept of instruction‑following in large language models, compares early and modern LLMs, details the training techniques that enable it, highlights its importance, offers practical prompting tips, and discusses current challenges and future directions.

AI · LLM · Prompt engineering
0 likes · 10 min read
Kuaishou Tech
Jul 23, 2024 · Artificial Intelligence

Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models

This paper introduces Parrot, a system that improves the multi-turn instruction-following capabilities of large language models (LLMs) through context-aware preference optimization (CaPO) and synthetic data generation, achieving significant performance improvements with limited training data.

CaPO · NLP · data synthesis
0 likes · 9 min read