Tagged articles

Large Language Models

1206 articles · Page 5 of 13

Feb 16, 2026 · Artificial Intelligence

Three Years of AI Evolution: From Incremental Gains to Unlimited Capability Frontiers

The article analyzes how, over the past three years, rapid growth in compute, data, and model architecture has turned incremental advances in large language models into qualitative leaps—spanning emergent abilities, world‑model video generation, and agentic AI—suggesting an effectively unbounded frontier for AI capabilities.

AI agents · AI capability boundaries · Large Language Models

0 likes · 18 min read

Design Hub

Feb 16, 2026 · Industry Insights

Three AI Industry Shifts in Feb 2026: Open‑Source, Talent, and Infrastructure

In February 2026 three pivotal AI developments—OpenAI hiring OpenClaw founder Peter Steinberger, Alibaba unveiling the trillion‑parameter Qwen3‑Max‑Thinking model, and Cloudflare launching Markdown for Agents—illustrate how open‑source collaboration, talent mobility, and AI‑native infrastructure are reshaping the sector.

AI Infrastructure · AI agents · Cloudflare

0 likes · 14 min read

Old Zhang's AI Learning

Feb 16, 2026 · Artificial Intelligence

A New Extreme Quantization Tool for Large Models: AngelSlim’s 2‑Bit Compression

AngelSlim introduces a full‑stack large‑model compression suite that uses quantization‑aware training to shrink a 1.8B LLM to 2‑bit precision, achieving less than 4% accuracy loss, supporting a wide range of models, speculative decoding, and providing end‑to‑end deployment instructions for MacBook M4 and server environments.

AngelSlim · GGUF · Large Language Models

0 likes · 13 min read

Black & White Path

Feb 15, 2026 · Artificial Intelligence

Microsoft Unveils Lightweight Tool to Scan Large Language Models for Hidden Backdoors

Microsoft's AI security team introduced a lightweight scanner that detects backdoors in open‑weight large language models by leveraging three observable signals, offering a low‑false‑positive solution; the article covers the tool's methodology, limitations, and role in extending Microsoft's AI‑focused Secure Development Lifecycle.

AI safety · LLM security · Large Language Models

0 likes · 6 min read

Top Architect

Feb 14, 2026 · Artificial Intelligence

Why Test‑Time Compute Is the Next Breakthrough for Large Language Models

The article explains how inference‑oriented large language models shift the focus from training‑time resources to test‑time computation, detailing scaling laws, verification techniques, reinforcement‑learning pipelines such as DeepSeek‑R1, and methods for distilling reasoning abilities into smaller, consumer‑grade models.

Large Language Models · Prompt Engineering · Scaling Laws

0 likes · 19 min read

Machine Learning Algorithms & Natural Language Processing

Feb 11, 2026 · Artificial Intelligence

Breaking the Data Ceiling: UltraData’s 2.4 TB Tiered Dataset with the Largest L3 Math Library

UltraData presents a five‑level tiered data‑management system (L0‑L4) for large‑language‑model training, releases the world’s largest open L3 mathematics dataset (2.4 TB), validates the approach with extensive MiniCPM‑1.2B experiments showing consistent performance gains across web, multilingual, math and code domains, and opens a suite of governance tools and a community portal.

Data Governance · Large Language Models · Mathematics Dataset

0 likes · 15 min read

Machine Learning Algorithms & Natural Language Processing

Feb 11, 2026 · Artificial Intelligence

Can TI‑DPO Fix DPO’s Blind Spot? Token‑Importance Guided Direct Preference Optimization for Better LLM Alignment

TI‑DPO introduces a hybrid weighting scheme and a triplet‑loss objective that weight tokens by gradient attribution and a Gaussian prior, enabling precise identification of critical tokens and yielding consistent performance gains over DPO, SimPO, and GRPO on Llama‑3, Mistral‑7B, and downstream benchmarks such as IFEval, TruthfulQA, and HumanEval.

Direct Preference Optimization · Large Language Models · RLHF

0 likes · 8 min read

Qborfy AI

Feb 11, 2026 · Artificial Intelligence

What Is an AI Agent? From Passive Models to Autonomous Digital Assistants

This article explains AI agents as autonomous systems that perceive environments, set goals, and act, contrasting them with traditional AI, detailing their core definition, architecture, key components, practical applications, implementation steps, classification, technology stack, case studies, emerging trends, challenges, and future directions.

AI Agent · AutoGPT · Autonomous Systems

0 likes · 11 min read

PaperAgent

Feb 11, 2026 · Industry Insights

Is DeepSeek’s New V4 Model Redefining the AI Landscape?

DeepSeek has quietly released a new large‑language model—likely V4—featuring a May 2025 knowledge cutoff, a 1 million‑token context window, and pure‑text capabilities, while industry trends in 2026 shift focus toward agentic AI systems that coordinate multiple specialized models.

AI models · DeepSeek · Industry Trends

0 likes · 3 min read

PaperAgent

Feb 11, 2026 · Artificial Intelligence

Unlocking Agentic Reasoning: A Deep Dive into the New LLM Paradigm

This comprehensive review dissects the emerging Agentic Reasoning paradigm for large language models, outlining its three‑layer architecture, core capabilities, optimization modes, benchmark suites, and real‑world applications across mathematics, science, embodied AI, healthcare, and autonomous web exploration.

AI benchmarks · Artificial Intelligence · Autonomous Agents

0 likes · 10 min read

Software Engineering 3.0 Era

Feb 11, 2026 · Artificial Intelligence

2025 Large Model Service Performance Report: Near‑100% Success, Rising Throughput, and Falling Prices

The 2025 monitoring report by AIIA and the China Academy of Information and Communications Technology evaluates 42 large‑model services across 13 MaaS platforms, revealing near‑100% call success rates, significant TPS growth, sub‑second latency, increasing open‑source model adoption, and a gradual decline in service pricing.

Large Language Models · Latency · MaaS

0 likes · 11 min read

Machine Learning Algorithms & Natural Language Processing

Feb 10, 2026 · Artificial Intelligence

Why Self‑Distillation Is the 2026 Keyword for Continual Learning in Large Models

At the start of 2026, self‑distillation dominates the most cited LLM papers, offering a teacher‑free way for large models to continually acquire new knowledge while preserving existing capabilities.

Continual Learning · Large Language Models · Self‑Distillation

0 likes · 9 min read

Old Zhang's AI Learning

Feb 9, 2026 · Artificial Intelligence

Qwen 3.5 Emerges; ByteDance and DeepSeek Set to Release Flagship LLMs for Spring Festival

The LMSYS Chatbot Arena now shows Qwen 3.5 (codenamed Karp-001/002) alongside ByteDance's Pisces‑llm models and DeepSeek‑V4, with new Transformers configs and hints of an Active‑3B MoE architecture, suggesting a fresh wave of flagship large language models arriving for the Spring Festival.

ByteDance · DeepSeek · Large Language Models

0 likes · 4 min read

AI2ML AI to Machine Learning

Feb 7, 2026 · Artificial Intelligence

Why the ‘Skills’ Approach Is the Third Major Compromise Shaping Enterprise AI in 2026

The article argues that embracing the Skills paradigm, a lightweight, low‑cost alternative to large‑scale model training, represents the third major compromise in the large‑model era, balancing reduced emergence and planning hallucinations against increased stability and engineering efficiency for enterprise AI deployments.

Enterprise AI · Large Language Models · Mixture of Experts

0 likes · 8 min read

Baidu Intelligent Cloud Tech Hub

Feb 6, 2026 · Artificial Intelligence

Accelerating GLM‑4.x Inference on Kunlun XPU with SGLang & vLLM

Baidu’s Baige team successfully adapted the GLM‑4.x series language models to the Kunlun XPU platform by leveraging SGLang and the vLLM‑Kunlun plugin, employing agile adaptation, precision alignment with torch_xray, and extensive performance tuning to achieve GPU‑level accuracy and superior inference speed.

AI · Large Language Models · XPU

0 likes · 6 min read

AI Software Product Manager

Feb 4, 2026 · Artificial Intelligence

Mastering Agent Skills: A Systematic Guide to Large Model Capabilities

This article traces the evolution of large‑model capabilities from early plugins to the standardized Agent Skills framework, explains the core concepts, technical composition, and progressive disclosure mechanism, and provides a step‑by‑step practical guide for building, configuring, and deploying Skills across ecosystems.

AI Architecture · AI Operations · Agent Skills

0 likes · 11 min read

AI Engineering

Feb 3, 2026 · Artificial Intelligence

Anthropic Study Reveals AI Errors Are ‘Hot Chaos’ Rather Than Goal‑Driven Misbehaviour

Anthropic researchers measured AI mistakes by separating systematic bias from random variance, finding that longer inference times and larger models increase chaotic behavior, that language models act as dynamic systems rather than optimizers, and that AI risk should be managed as complex‑system failure rather than malicious intent.

AI safety · Anthropic · Large Language Models

0 likes · 6 min read

PaperAgent

Feb 3, 2026 · Artificial Intelligence

Why Today's LLMs Still Struggle with “Learn‑and‑Apply” Tasks: Insights from the CL‑Bench Study

The CL‑Bench benchmark reveals that current large language models fail to learn and apply new, long‑context knowledge, exposing critical gaps in context learning, scoring design, and error patterns across ten cutting‑edge models.

AI research · Context Learning · LLM evaluation

0 likes · 7 min read

AI Architecture Hub

Feb 3, 2026 · Artificial Intelligence

How AI-Powered Programming Is Redefining the Developer’s Role

The article explains how large‑model programming shifts developers from writing code to defining clear documentation, outlines a three‑stage document‑driven workflow, offers practical prompt‑engineering tips, model‑selection guidance, safety checklists, and highlights the new core competencies programmers need in the AI era.

AI programming · Document-driven development · Large Language Models

0 likes · 9 min read

Tencent Technical Engineering

Feb 2, 2026 · Artificial Intelligence

Why Neural Networks Are the Hidden Engine Behind Modern AI: From Basics to Large Language Models

This comprehensive guide walks through the fundamentals of neural networks, activation functions, training methods, and how they power large language models, while also covering tokenization, self‑attention, transformer architectures, AI infrastructure, and practical usage through agents and retrieval‑augmented generation.

Agent systems · Artificial Intelligence · GPU infrastructure

0 likes · 75 min read

Sohu Tech Products

Jan 28, 2026 · Artificial Intelligence

How OnePiece Brings Context Engineering and Implicit Reasoning to Industrial Ranking

This article details the OnePiece framework, which integrates context engineering, anchor item sequences, and progressive implicit reasoning into generative recommendation systems, achieving significant offline and online performance gains on Shopee Search by enhancing model inference, personalization, and computational efficiency.

Context Engineering · Implicit Reasoning · Large Language Models

0 likes · 13 min read

Woodpecker Software Testing

Jan 28, 2026 · Artificial Intelligence

How Large Language Models Overcome Traditional Software Testing Pain Points

Large language models can dramatically reshape software testing by automating test case generation, understanding requirements, predicting failures, and streamlining result analysis, as demonstrated through detailed workflow diagrams, pseudocode, Python implementations, and real‑world case studies in finance, e‑commerce, and IoT domains.

AI test generation · Large Language Models · Prompt Engineering

0 likes · 10 min read

Data STUDIO

Jan 27, 2026 · Artificial Intelligence

How Python RAG Architectures Can Tame Large‑Model Hallucinations: A Complete Guide to 9 Designs

This article explains why large‑language‑model hallucinations are risky, introduces Retrieval‑Augmented Generation (RAG) as a remedy, and walks through nine Python‑based RAG architectures—standard, conversational, corrective, adaptive, fusion, HyDE, self‑RAG, agentic, and graph RAG—detailing their workflows, code examples, strengths, weaknesses, and a decision‑making map for selecting the right design.

AI hallucination · LangChain · Large Language Models

0 likes · 29 min read

PaperAgent

Jan 25, 2026 · Industry Insights

Top 10 Chinese Large Models to Watch: Features, Benchmarks, and Download Links

This roundup highlights ten cutting‑edge Chinese AI models—including Qwen3‑TTS, LongCat‑Flash‑Thinking‑2601, GLM‑4.7‑Flash, STEP3‑VL‑10B, Baichuan‑M3, and Youtu‑LLM—detailing their multilingual capabilities, architecture innovations, performance claims, and providing direct repository links for researchers and developers.

AI research · Chinese AI · Large Language Models

0 likes · 7 min read

dbaplus Community

Jan 21, 2026 · Information Security

How Large Language Models Transform Data Security: Frameworks, Challenges, and Real-World Practices

This article reviews the current state, feasibility, industry adoption, concrete deployment scenarios, and future directions of applying large language models to data security, covering technical challenges, architectural designs, prompt engineering, privacy‑preserving techniques, and practical case studies.

AI Applications · Data Security · Information Security

0 likes · 21 min read

Tencent Cloud Developer

Jan 20, 2026 · Artificial Intelligence

From Transformers to Agents: A Complete Timeline of Large Language Model Evolution

This article traces the evolution of large language models from the 2017 Transformer breakthrough through successive milestones such as BERT, GPT‑3, RLHF alignment, multimodal extensions, open‑source alternatives, and the rise of retrieval‑augmented generation, AI agents, and emerging protocols that shape modern AI applications.

Large Language Models · Prompt Engineering · RAG

0 likes · 44 min read

Architect's Guide

Jan 19, 2026 · Artificial Intelligence

Mastering Prompt Engineering: From Blind Prompting to Reliable LLM Solutions

This article explains how to treat prompt engineering as a systematic, experiment‑driven practice—distinguishing it from blind prompting—by defining problems, building demo sets, crafting and testing prompt candidates, evaluating accuracy versus cost, and establishing verification loops for reliable large language model applications.

LLM testing · Large Language Models · Prompt Engineering

0 likes · 16 min read

Old Meng AI Explorer

Jan 18, 2026 · Artificial Intelligence

How BabelDOC Preserves PDF Layout While Translating & OneAIFW Shields Your Data

Two open‑source projects—BabelDOC, a Python‑based PDF translator that retains original formatting using AI models, and OneAIFW, a Zig‑and‑Rust local AI firewall that anonymizes sensitive data before LLM queries—offer practical, privacy‑preserving solutions for researchers and developers.

AI privacy · Data Protection · Document processing

0 likes · 8 min read

Fun with Large Models

Jan 18, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Deploying Large Language Models Locally with vLLM and Ollama

This article walks through two mainstream local deployment solutions—high‑performance vLLM for production Linux servers and lightweight Ollama for personal Windows machines—covering environment setup, model download, server launch, API testing, key configuration parameters, and the quantization technique that makes Ollama models compact.

GPU Optimization · Large Language Models · Model Quantization

0 likes · 18 min read

Baobao Algorithm Notes

Jan 16, 2026 · Artificial Intelligence

From PPO to SAPO: Evolution of Large‑Model Reinforcement Learning Algorithms

This article systematically reviews the main reinforcement‑learning algorithms—PPO, GRPO, DAPO, GSPO, and SAPO—used for fine‑tuning large language models, explaining why supervised fine‑tuning precedes RL, how each method improves training efficiency and stability, and what trade‑offs they entail.

GRPO · Large Language Models · PPO

0 likes · 15 min read

AsiaInfo Technology: New Tech Exploration

Jan 16, 2026 · Artificial Intelligence

How to Evaluate Ontology Quality: Metrics, Methods, and Tools

This article surveys ontology quality evaluation by outlining key metrics such as consistency, completeness, and coverage, and reviewing five major assessment approaches—including corpus‑based, gold‑standard, metric‑driven, rule‑based, and application‑driven methods—while highlighting representative tools, open‑source implementations, and future research challenges.

Knowledge Engineering · Large Language Models · Semantic Web

0 likes · 20 min read

PaperAgent

Jan 16, 2026 · Artificial Intelligence

Do Large Language Models Really Have Self‑Awareness? Inside Anthropic’s Introspective Experiments

This article reviews Anthropic’s recent paper on emergent introspective awareness in large language models, detailing a novel concept‑injection method, four key findings about AI’s ability to detect, distinguish, and control internal thoughts, and a cross‑model performance comparison.

AI Introspection · Anthropic · Artificial Intelligence Research

0 likes · 7 min read

AI Info Trend

Jan 14, 2026 · Industry Insights

2026 AI Model Leaderboards: Google Dominates, Anthropic Surprises, OpenAI’s New Champion

The 2026 AI model leaderboards across Text, Web Development, Vision, and Text-to-Image arenas reveal Google’s Gemini series leading in text and vision, Anthropic’s Claude Opus unexpectedly topping web‑dev rankings, and OpenAI’s GPT‑Image‑1.5 clinching the top spot in creative image generation, highlighting an increasingly competitive AI landscape.

AI · Anthropic · Google

0 likes · 8 min read

AI Insight Log

Jan 13, 2026 · Artificial Intelligence

Why Bigger LLMs Still Forget Facts – DeepSeek’s Engram Memory Module Explained

This article analyzes DeepSeek’s new Engram module, showing how conditional memory reduces large language models' reliance on computation alone and improves knowledge retrieval, reasoning, long‑context handling, and system efficiency while staying within strict parameter and FLOP budgets.

AI Architecture · DeepSeek · Engram

0 likes · 15 min read

DataFunTalk

Jan 13, 2026 · Artificial Intelligence

How Conditional Memory (Engram) Boosts Large Language Models Beyond MoE

DeepSeek's new paper introduces a conditional memory mechanism called Engram that complements Mixture‑of‑Experts, providing O(1) lookup, improving knowledge retrieval, reasoning, and long‑context performance while scaling efficiently on the same FLOPs budget.

Engram · Large Language Models · Memory Retrieval

0 likes · 18 min read

PaperAgent

Jan 13, 2026 · Artificial Intelligence

How Engram’s Conditional Memory Redefines Sparsity in Large Language Models

DeepSeek’s newly released Engram module introduces a conditional memory mechanism that leverages O(1) N‑gram lookup to create a new sparsity axis for large language models, reducing early‑layer compute, improving inference efficiency, and delivering notable performance gains across reasoning and knowledge tasks, as demonstrated by extensive experiments on 27‑billion‑parameter models.

Efficient Inference · Engram · LLM Sparsity

0 likes · 8 min read

BirdNest Tech Talk

Jan 11, 2026 · Artificial Intelligence

How AI Agents Overcome Context Window Limits: Gemini vs Manus Deep Research

The article analyzes the context‑window bottleneck of large language models, compares two architectural strategies—strengthening the model (Gemini Deep Research) and parallel agent decomposition (Manus Wide Research)—and details a wind‑power investment case study, technical implementation, and future directions.

AI research · Large Language Models · ReAct

0 likes · 16 min read

Old Meng AI Explorer

Jan 10, 2026 · Artificial Intelligence

Run Large Language Models on a Laptop: How ktransformers Breaks the GPU Barrier

ktransformers is an open‑source AI model optimization framework that uses dynamic quantization, layer fusion and memory reuse to cut memory usage by up to 50%, double loading speed and reduce inference cost, enabling 7B‑13B models to run smoothly on ordinary CPUs or low‑end GPUs.

KTransformers · Large Language Models · Model Optimization

0 likes · 11 min read

PMTalk Product Manager Community

Jan 9, 2026 · Product Management

How AI Product Managers Build Conversational Analytics with Large Language Models

The article examines how traditional BI tools waste minutes on manual clicks, then details a step‑by‑step framework for selecting large models, designing memory‑aware architectures, mitigating security risks, and rolling out conversational analytics products that cut analysis time from days to minutes.

AI risk · Data Visualization · Large Language Models

0 likes · 11 min read

HyperAI Super Neural

Jan 9, 2026 · Artificial Intelligence

How HY-MT1.5 Achieves 1 GB Mobile Translation with a 1.8B Model

The article explains how Tencent's open‑source HY‑MT1.5 tackles the high‑cost, large‑parameter barrier of neural machine translation by offering a 1.8 B‑parameter model that runs on roughly 1 GB of RAM, processes 50 tokens in 0.18 s, supports 33 languages, and uses on‑policy distillation to retain top‑tier accuracy, while providing a step‑by‑step online demo and free compute credits for new users.

HY-MT1.5 · Large Language Models · Machine Translation

0 likes · 5 min read

PMTalk Product Manager Community

Jan 8, 2026 · Artificial Intelligence

Understanding Fine‑Tuning: A Primer for AI Product Managers

This article explains how large language models are first pre‑trained on massive text corpora and then fine‑tuned with smaller, task‑specific datasets, covering the fine‑tuning process, types such as full‑parameter and PEFT, practical benefits, real‑world analogies, and key challenges like data quality and catastrophic forgetting.

AI product management · Large Language Models · Model Adaptation

0 likes · 6 min read

AI Frontier Lectures

Jan 5, 2026 · Artificial Intelligence

Why WeDLM Outpaces AR Models: Diffusion Decoding Meets KV Cache for 10× Faster Inference

Tencent WeChat AI introduces WeDLM, a diffusion language model that works with standard causal attention and KV caching, achieving up to ten‑fold speedups over autoregressive models while maintaining or improving generation quality across math reasoning and open‑ended tasks.

Diffusion Language Model · KV-Cache · Large Language Models

0 likes · 8 min read

Tencent Technical Engineering

Jan 5, 2026 · Artificial Intelligence

How ReAct Turns Large Language Models into Explainable, Actionable Agents

This article explains the ReAct framework, which augments large language models with explicit reasoning, tool use, and observation loops to overcome hallucinations, improve transparency, and enable dynamic interaction with external environments across diverse tasks.

AI agents · Large Language Models · ReAct

0 likes · 31 min read

Smart Era Software Development

Jan 5, 2026 · Artificial Intelligence

Understanding Vibe Coding: The Cognitive Architecture Behind AI-Powered Programming

The article defines Vibe Coding, contrasts it with traditional Spec‑driven development, traces the evolution of human‑AI collaboration from simple resource retrieval to multi‑agent workflows, analyzes current limitations, shares practical tool setups, and forecasts future trends in AI‑augmented software engineering.

AI coding · Human-AI Collaboration · Large Language Models

0 likes · 20 min read

DataFunSummit

Jan 4, 2026 · Artificial Intelligence

How Ant Group’s DeepInsight Boosted Text‑to‑SQL Accuracy by 46% with an AI‑Driven Evaluation Framework

This article details Ant Group’s DeepInsight intelligent evaluation system for Chinese Text‑to‑SQL, describing the AI‑BI background, challenges of existing benchmarks, a feature‑annotated evaluation design, automated dataset generation, experimental results showing a 46% accuracy gain and 71% reduction in failure rate, and future research directions.

AI · Large Language Models · Text-to-SQL

0 likes · 13 min read

DataFunTalk

Jan 4, 2026 · Artificial Intelligence

How Agentic RAG and Generative Ranking Are Redefining AI Search and Recommendation

This article summarizes three cutting‑edge AI techniques—Alibaba Cloud's Agentic RAG architecture for multimodal search, Huawei Noah's large‑model‑driven recommendation system evolution, and Baidu's generative ranking (GRAB) model for ads—detailing their challenges, designs, performance gains, and practical deployment insights.

AI Search · Generative Ranking · Large Language Models

0 likes · 7 min read

Network Intelligence Research Center (NIRC)

Dec 31, 2025 · Artificial Intelligence

Why AI Inference Is Slow and How Cutting‑Edge Tech Boosts It in Industrial Settings

The article analyzes the severe inference bottlenecks of large language models, CNNs, and recommendation systems and presents a suite of research‑driven accelerations—including token‑level pipeline parallelism (HPipe), KV‑cache clustering (ClusterAttn), quantization (QoKV), heterogeneous edge frameworks (DeepZoning, PICO), delay‑aware edge‑cloud scheduling (DECC), and operator choreography (RACE)—validated on real‑world industrial workloads.

AI inference · Large Language Models · Recommendation Systems

0 likes · 16 min read

PaperAgent

Dec 29, 2025 · Artificial Intelligence

Unveiling Bottom‑up Policy Optimization: Boosting LLM Reasoning with Internal Strategies

This article introduces Bottom‑up Policy Optimization (BuPO), a novel reinforcement‑learning framework that treats large language models as collections of internal layer and modular policies, revealing distinct inference entropy patterns in Llama and Qwen‑3 and demonstrating superior performance on complex mathematical reasoning benchmarks.

AI research · Bottom-up Optimization · Internal Policy

0 likes · 10 min read

AI Insight Log

Dec 29, 2025 · Industry Insights

Why Even Top AI Leaders Feel Outpaced: The Rise of AI‑Native Programming

OpenAI co‑founder Andrej Karpathy admits he feels left behind as the share of code written directly by programmers shrinks, sparking a deep industry discussion about AI‑driven tools, the shift from manual coding to AI‑orchestrated workflows, and how newcomers may outpace seasoned engineers.

AI · Claude · Large Language Models

0 likes · 6 min read

AI2ML AI to Machine Learning

Dec 27, 2025 · Artificial Intelligence

Why Jeff Dean Champions Speculative Decoding: The Underlying Ideas

Jeff Dean highlighted speculative decoding as a lossless inference acceleration technique that can boost large language model throughput by 2–3×, and the article breaks down its core concepts—including parallel token verification, draft‑target model collaboration, rejection sampling theory, and practical optimizations such as continuous batching and tree‑based verification.

Continuous Batching · Draft-Target Model · KV-Cache

0 likes · 8 min read

Fighter's World

Dec 26, 2025 · Industry Insights

Where Is AI Heading in 2026 After the 2025 Sprint?

The article analyzes the rapid weekly turnover of leading LLM benchmarks in 2025, declining compute costs, the shift from chatbots to multi‑step agents, the widening pilot‑to‑production gap, and predicts that 2026 will be defined by infrastructure constraints, AI‑first product design, and accelerated enterprise adoption.

AI Infrastructure · AI product strategy · AI trends

0 likes · 25 min read

PaperAgent

Dec 26, 2025 · Artificial Intelligence

What Google’s 2025 AI Breakthroughs Reveal About the Future of Intelligent Agents

Google’s 2025 research recap highlights eight major breakthroughs—from the Gemini 3 series achieving unprecedented multimodal reasoning and efficiency, to AI‑driven advances in scientific discovery, creative generation, quantum computing, climate resilience, and responsible AI safety—showcasing how intelligent agents are reshaping products, research, and global challenges.

AI research · AI safety · Large Language Models

0 likes · 10 min read

Old Meng AI Explorer

Dec 25, 2025 · Artificial Intelligence

Run 100B LLM on a Laptop: BitNet’s 1‑Bit Quantization Enables CPU‑Only AI

BitNet, Microsoft’s open‑source 1‑bit quantization framework, shrinks model size by up to ten‑fold and lets ordinary CPUs—including i7 laptops and ARM tablets—run 2B‑100B language models at usable speeds while cutting power consumption dramatically, offering a practical, GPU‑free solution for local AI.

BitNet · CPU inference · LLM Quantization

0 likes · 9 min read

Efficient Ops

Dec 24, 2025 · Artificial Intelligence

From AI+ Era to Enterprise AI Agents: Evolution, Technologies, and Practical Guidance

The talk outlines the AI+ era's digital ecosystem, traces the evolution from traditional AI to Agentic AI, examines emerging AI Agent technologies, and shares concrete enterprise‑level development practices, frameworks, and governance strategies for financial industry deployments.

AI agents · Enterprise Architecture · Governance

0 likes · 18 min read

DevOps Coach

Dec 24, 2025 · Artificial Intelligence

Unlock AI Creativity with Verbalized Sampling: The 8‑Word Prompt Trick

A recent Stanford‑led study reveals that asking large language models for multiple responses with associated probabilities—using just eight words—restores lost creativity caused by post‑training alignment, and the article explains why it works and how to apply it.

AI alignment · Creativity · Large Language Models

0 likes · 11 min read

Alibaba Cloud Big Data AI Platform

Dec 23, 2025 · Artificial Intelligence

How Skrull Boosts Long-Context Fine‑Tuning Speed Up to 7.5×

The Skrull system, accepted at NeurIPS 2025, dynamically schedules long and short sequences during each training iteration, overlapping communication and computation to achieve up to 7.54× speedup for long‑context fine‑tuning of large language models while maintaining stability through load‑balancing and rollback mechanisms.

Dynamic Data Scheduling · Large Language Models · Long Context Fine-Tuning

0 likes · 8 min read

Alibaba Cloud Developer

Dec 23, 2025 · Artificial Intelligence

How Hybrid Transformer‑Mamba Architectures Overcome KVCache Challenges in Large‑Model Inference

This article explains how SGLang’s hybrid model design combines Transformer attention with Mamba state‑space layers, introduces a dual‑pool memory architecture and elastic allocation, and presents specialized prefix‑cache and speculative‑decoding techniques that together enable efficient, scalable inference for long‑context large language models.

Inference Optimization · KVCache · Large Language Models

0 likes · 22 min read

Network Intelligence Research Center (NIRC)

Dec 23, 2025 · Artificial Intelligence

ClusterAttn: Compressing KV Cache with Intrinsic Attention Clustering

ClusterAttn tackles the KV‑cache bottleneck of large language models by exploiting the natural clustering of attention scores, achieving up to 92% compression without accuracy loss, boosting throughput 2.6–4.8×, handling 128K‑token sequences on a single GPU, and outperforming existing training‑free compression methods.

KV cache compression · Large Language Models · attention clustering

0 likes · 8 min read

Baobao Algorithm Notes

Dec 22, 2025 · Artificial Intelligence

Which Agentic RL Framework Wins? A Deep Dive into AReal, Seer, Slime & verl

This article analyzes the training‑efficiency challenges of multi‑turn agentic reinforcement learning and compares four recent open‑source frameworks, AReal (Ant), Seer (Moonshot), Slime (Zhipu) and verl (ByteDance), examining their asynchronous inference designs, rollout‑train separation, long‑context handling, off‑policy mitigation, and system‑level optimizations to guide framework selection.

Agentic RL · Asynchronous Inference · Large Language Models

0 likes · 18 min read

PaperAgent

Dec 19, 2025 · Artificial Intelligence

Can We Trust AI? Inside GPT‑5.2‑Codex’s Monitorability Breakthrough

OpenAI’s new GPT‑5.2‑Codex model achieves state‑of‑the‑art performance on SWE‑Bench Pro and Terminal‑Bench 2.0, and a 90‑page technical report introduces the concept of monitorability, defining metrics, benchmark suites, and key findings about chain‑of‑thought length, RL training, and model size.

AI safety · Chain-of-Thought · GPT-5.2

0 likes · 10 min read

HyperAI Super Neural

Dec 19, 2025 · Artificial Intelligence

Weekly AI Paper Digest: Open-Source LLMs, Agent Systems, and Long-Context Reasoning

This week’s AI paper roundup reviews six recent research works—including RecGPT‑V2, Nemotron 3 Nano, FrontierScience benchmark, AutoGLM, Deeper‑GXX, and QwenLong‑L1.5—highlighting advances in large‑language‑model‑driven recommendation, Mixture‑of‑Experts models, expert‑level scientific reasoning, GUI‑based foundation agents, graph neural network deepening, and ultra‑long‑context inference.

AI research · Agent systems · Large Language Models

0 likes · 6 min read

HyperAI Super Neural

Dec 18, 2025 · Artificial Intelligence

GPT-5 Leads as OpenAI Unveils FrontierScience: Dual‑Track Reasoning and Research Benchmark

OpenAI's FrontierScience benchmark, released on Dec 16, 2025, evaluates expert‑level scientific reasoning and research tasks, showing GPT‑5.2 scoring 77% on Olympiad and 25% on Research, outperforming other models while highlighting strengths in closed‑form problems and gaps in open‑ended research tasks.

AI evaluation · FrontierScience · GPT-5

0 likes · 10 min read

Zhuanzhuan Tech

Dec 17, 2025 · Artificial Intelligence

How AI Powers Automatic Security Tagging in Large‑Scale Data Governance

This article details how a Chinese e‑commerce platform leverages large‑language‑model AI, the open‑source Dify platform, and engineered workflows to automate security tagging of massive data assets, covering data‑governance fundamentals, AI‑driven tagging advantages, technical architecture, prompt engineering, optimization cases, and future roadmap.

AI · Data Governance · Large Language Models

0 likes · 25 min read

Instant Consumer Technology Team

Dec 16, 2025 · Artificial Intelligence

How Mind Lab Trained a Trillion‑Parameter Agentic Memory with Only 10% GPU Power

This article explains how the Mind Lab team tackled the challenges of training a 1‑trillion‑parameter mixture‑of‑experts model for agentic memory using reinforcement learning, LoRA, and a custom Megatron‑Bridge architecture, achieving a ten‑fold speedup while consuming just a fraction of the usual GPU resources.

AI · Agentic Apps · Large Language Models

0 likes · 9 min read

DataFunSummit

Dec 14, 2025 · Artificial Intelligence

How Sina Weibo Scaled Enterprise AI with a Unified Multi‑Agent Platform

Sina Weibo’s engineering team tackled the high technical barriers, low reuse, and long cycles of large‑model AI deployment by building a unified AI application platform that combines a layered architecture, low‑code workflow, multi‑agent orchestration, and knowledge‑base integration, enabling rapid, reliable AI solutions across the company.

AI platform · Enterprise AI · Knowledge Base

0 likes · 26 min read

Bighead's Algorithm Notes

Dec 13, 2025 · Artificial Intelligence

Key Quantitative Finance Papers (Dec 6‑12 2025) – AI‑Driven Insights

This article summarizes ten recent arXiv papers (Dec 6‑12 2025) that explore AI‑driven techniques—from neural‑network ranking and reinforcement learning to quantum models and LLM agents—for quantitative finance and investment decision‑making.

Cryptocurrency · Large Language Models · Machine Learning

0 likes · 18 min read

PaperAgent

Dec 12, 2025 · Artificial Intelligence

What Makes GPT‑5.2 and Gemini‑3‑Pro So Fast? Inside Their Key Features and Real‑World Tests

Gemini‑3‑Pro’s surprise debut and OpenAI’s emergency release of GPT‑5.2 highlight a shift toward faster inference, deeper reasoning, and lower hallucination rates, with detailed performance metrics, three‑tier model options, extended context windows, and mixed community test results that reveal both strengths and shortcomings.

AI model performance · GPT-5.2 · Gemini 3 Pro

0 likes · 4 min read

Amap Tech

Dec 11, 2025 · Artificial Intelligence

How ACoder Achieved Up to 24× Faster Multi‑Platform Development with AI

The ACoder platform combines multi‑model AI, a panoramic code‑understanding engine, and a layered knowledge‑management system to automate the entire software‑development lifecycle, delivering 5‑20× overall efficiency gains, up to 24× speed‑up for cross‑platform code migration, and dramatically higher code‑recall accuracy.

AI coding · Knowledge Management · Large Language Models

0 likes · 19 min read

Xiaomi Tech

Dec 11, 2025 · Artificial Intelligence

Open‑Source AI Evolution: From Zipformer to Zapformer and Smart Automotive Quality

The MEET 2026 conference showcased Daniel Povey’s analogy of AI evolution to biological evolution, Xiaomi’s open‑source AI breakthroughs such as Zipformer and Zapformer, and the company’s multi‑agent automotive quality engine that leverages large‑scale models, data‑driven diagnostics, and open collaboration to accelerate intelligent technology across industries.

Artificial Intelligence · Automotive Quality · Large Language Models

0 likes · 12 min read

Smart Era Software Development

Dec 11, 2025 · Artificial Intelligence

From Scale Race to Efficiency Breakthrough: How Architecture Innovation Will Shape 2026 Large Models and Agents

The article analyzes how architecture innovation—through sparse, multimodal, and dynamic designs—will break the compute bottleneck of large models, reshape pre‑training hierarchies, and drive three distinct 2026 pathways for both model efficiency and agent competition.

2026 predictions · AI agents · Large Language Models

0 likes · 12 min read

Wu Shixiong's Large Model Academy

Dec 10, 2025 · Artificial Intelligence

Why RLHF Success Relies on Data Engineering, Not Just Model Tricks

The article explains that the real difficulty of RLHF lies in designing and curating high‑quality preference data, building robust reward models through bad‑case rewriting, human‑in‑the‑loop labeling, and inference‑based reward modeling, while algorithmic details like PPO are secondary concerns.

Data Engineering · GRPO · Large Language Models

0 likes · 9 min read

AI Frontier Lectures

Dec 9, 2025 · Artificial Intelligence

Can Token‑Level Surrogates Stabilize RL for Large Language Models? A Deep Dive

This article analyzes why optimizing sequence‑level rewards for LLMs with token‑level surrogate objectives can improve reinforcement‑learning stability, explains the theoretical conditions required, introduces Routing Replay for MoE models, and presents extensive experiments validating the approach.

Importance Sampling · Large Language Models · Mixture of Experts

0 likes · 12 min read

Tencent Cloud Developer

Dec 9, 2025 · Artificial Intelligence

How Do Large Language Models Turn Text into Math? A Deep Dive into Transformers

This article walks through the complete workflow of AI large language models, from turning user queries into token matrices via tokenization and embedding, through the Transformer’s self‑attention and multi‑head mechanisms, to decoding logits into human‑readable text, while also covering position encoding, long‑context strategies, generation parameters, and practical engineering tips.

Inference Optimization · Large Language Models · Self-Attention

0 likes · 29 min read

PaperAgent

Dec 6, 2025 · Artificial Intelligence

How Titans and MIRAS Enable AI Models to Remember 1 Million Tokens

Google's Titans architecture and the MIRAS theoretical framework introduce a deep neural memory that lets large language models learn in real time, retain surprising information, and handle context windows of up to two million tokens, outperforming existing Transformers and linear RNNs on a range of benchmarks.

AI memory · Large Language Models · MIRAS framework

0 likes · 10 min read

HyperAI Super Neural

Dec 6, 2025 · Artificial Intelligence

Quick Look at This Week’s Frontier AI Papers: DeepSeekMath‑V2, MedSAM‑3, SAM 3D, Qwen3‑VL, and M²

This roundup surveys five cutting‑edge AI papers—DeepSeekMath‑V2’s self‑verifiable mathematical reasoning, MedSAM‑3’s promptable medical image and video segmentation, SAM 3D’s single‑image 3D reconstruction, Qwen3‑VL’s high‑capacity vision‑language model, and the M² memory‑mesh transformer for image captioning—highlighting their key methods, benchmarks, and code links.

3D reconstruction · Image Captioning · Large Language Models

0 likes · 6 min read

PMTalk Product Manager Community

Dec 4, 2025 · Industry Insights

Three Chinese AI Giants, Three Strategies: Doubao, DeepSeek, and Qwen in 2025

In 2025 China's AI large‑model arena is sharply fragmenting, with ByteDance's Doubao leading user activity, DeepSeek dominating technical and international influence, and Alibaba's Qwen carving a unique full‑stack strategic edge, each pursuing distinct paths in technology, product and ecosystem competition.

AI · China · DeepSeek

0 likes · 11 min read

JD Retail Technology

Dec 4, 2025 · Artificial Intelligence

Twin Networks Reveal How to Optimize Data Mixtures for Large Language Models

This article presents TANDEM, a bi‑level data‑mixture optimization framework that uses twin networks to automatically adjust domain‑specific training data ratios, offering theoretical guarantees, broader applicability, and significant performance gains across pre‑training, fine‑tuning, and e‑commerce product‑understanding tasks.

Large Language Models · NeurIPS · bi-level optimization

0 likes · 6 min read

Tencent Cloud Developer

Dec 4, 2025 · Artificial Intelligence

From Tapestry to LLMs: 30+ Years of Recommender System Evolution

This article traces the three‑decade evolution of recommender systems—from early collaborative‑filtering prototypes like Tapestry, through the Netflix Prize era and deep‑learning breakthroughs such as Wide&Deep and DIN, to the current generative‑AI wave driven by large language models—highlighting key milestones, technical shifts, industrial deployments, and future challenges.

Industrial Deployment · Large Language Models · collaborative filtering

0 likes · 38 min read

PaperAgent

Dec 4, 2025 · Artificial Intelligence

From Code Foundations to AI Agents: A Deep Dive into Code LLMs and Their Applications

This article reviews a comprehensive 303‑page survey on code foundation models, tracing the evolution of code‑focused large language models from 2021 to 2025, comparing general‑purpose and specialized LLMs, and presenting extensive experiments on prompting, fine‑tuning, reinforcement learning, and autonomous coding agents.

AI coding · Code LLM · Large Language Models

0 likes · 5 min read

AI2ML AI to Machine Learning

Dec 3, 2025 · Artificial Intelligence

2026 Forecast: How Large‑Model AI Will Evolve After 2025 Breakthroughs

The article reviews the major 2025 breakthroughs in multimodal, open‑source, and deployment technologies for large models and outlines four 2026 trends—including ToC vs. ToB service split, dual‑hand data generation, MoE routing advances, and AI4Science breakthroughs—that will shape the next wave of AI development.

AI Deployment · AI4Science · Large Language Models

0 likes · 6 min read

Baidu MEUX

Dec 3, 2025 · User Experience Design

Boost User Research with AI: Automating Short Feedback Classification & Long‑Form Insight Extraction

This article explains how AI large‑language models can automate short user‑feedback classification and extract insights from long interview texts, offering practical prompting tips, fine‑tuning strategies, and Retrieval‑Augmented Generation workflows to make user research faster, more accurate, and less labor‑intensive.

AI · Feedback Classification · Large Language Models

0 likes · 11 min read

ShiZhen AI

Dec 2, 2025 · Artificial Intelligence

What Is a Prompt? Mastering Question Techniques for Better AI Results

Episode 4 of the Comic‑AI series explains that a prompt is the art of formulating precise questions to guide large language models, covering content and format constraints, positive and negative prompting, and showing how specific instructions lead to more predictable AI behavior.

AI · AI interaction · Large Language Models

0 likes · 3 min read

ShiZhen AI

Dec 1, 2025 · Artificial Intelligence

AI Comic Episode 3: What Exactly Is a Token?

This episode explains that a token is the smallest text chunk an LLM processes—ranging from characters to subwords—covers why subword tokenization avoids vocabulary explosion, compares token counts across languages, describes the computational cost of sequential generation, and introduces visual tokens for multimodal models.

AI Fundamentals · Large Language Models · multimodal

0 likes · 7 min read

JD Tech

Nov 28, 2025 · Artificial Intelligence

How JD Ads Uses Large Language Models to Transform Advertising

This article details JD Advertising's shift from generic to domain‑specific large models, the design of AI‑driven ad agents, the end‑to‑end GRAM retrieval‑alignment system, CTR‑guided AIGC for creatives, ultra‑low‑latency inference techniques, and ARM‑based optimizations that together reshape modern ad marketing.

CTR optimization · Intelligent agents · Large Language Models

0 likes · 19 min read

Meituan Technology Team

Nov 27, 2025 · Artificial Intelligence

AMO‑Bench: A New High‑Difficulty, Original Math Reasoning Benchmark for LLMs

AMO‑Bench, released by Meituan's LongCat team, is a 50‑question, IMO‑level math reasoning benchmark that combines original, high‑difficulty problems with automated scoring. It exposes the current limits of top large language models, whose best accuracy hovers around 52%, and offers a more discriminative evaluation tool for future model improvements.

AI evaluation · AMO-Bench · Large Language Models

0 likes · 12 min read

DataFunTalk

Nov 25, 2025 · Artificial Intelligence

Unlocking Agentic RAG and Generative Ranking: AI Search & Recommendation Breakthroughs

This article summarizes cutting‑edge techniques from Alibaba Cloud AI Search’s Agentic RAG architecture, Huawei Noah’s LLM‑enhanced recommendation evolution, and Baidu’s GRAB generative ranking model, detailing multi‑agent retrieval, multimodal data handling, scaling laws, causal attention, and performance gains demonstrated through benchmarks and real‑world deployments.

AI Search · Agentic RAG · Generative Ranking

0 likes · 8 min read

ITPUB

Nov 24, 2025 · Artificial Intelligence

Why Memory, Not Size, Is the Next Bottleneck for Large Language Models

In a detailed interview, the CTO of Memory Tensor (Shanghai) explains how limited memory capacity hampers large models, outlines the MemOS memory operating system, discusses information‑theoretic metrics, multimodal extensions, and reinforcement‑learning strategies for scalable, secure, and explainable AI memory management.

AI Architecture · Large Language Models · Multimodal AI

0 likes · 23 min read

DataFunSummit

Nov 23, 2025 · Artificial Intelligence

How Large Language Models Are Revolutionizing Banking Data Integration

This article examines the challenges of traditional banking data, explains how large language models can fuse structured and unstructured information, outlines a new data‑centric infrastructure and governance approach, and describes the Dify platform’s AI‑agent and DataOps capabilities for agile, non‑intrusive integration with core banking systems.

AI agents · Big Data · Data Governance

0 likes · 16 min read

Kuaishou Tech

Nov 20, 2025 · Artificial Intelligence

How UniDex and UniSearch Redefine Video Search with Semantic Indexing and Generative Models

This article explains how Kuaishou’s UniDex replaces traditional term‑based inverted indexes with model‑driven semantic posting lists and how the end‑to‑end UniSearch framework generates video IDs directly from queries, delivering higher relevance, lower latency, and significant online performance gains.

AI · Large Language Models · Search

0 likes · 17 min read

360 Zhihui Cloud Developer

Nov 20, 2025 · Artificial Intelligence

How DeepAgent Redefines AI Agents with Memory Folding and ToolPO

This article breaks down the DeepAgent paper, explaining its novel "main model + auxiliary model" architecture, the memory‑folding mechanism that compresses long‑context reasoning, and the ToolPO reinforcement strategy that enables efficient tool discovery and usage.

AI agents · Large Language Models · ToolPO

0 likes · 8 min read

Tencent Advertising Technology

Nov 20, 2025 · Artificial Intelligence

CoderRec: Latent Reasoning Boosts Sequential Recommendation

CoderRec, a new sequential recommendation framework jointly developed by Tencent Advertising Technology and Tsinghua University, combines domain‑specific latent reasoning with cross‑scale model collaboration to capture implicit user intent and fuse large‑language‑model semantics with traditional recommender signals, achieving state‑of‑the‑art performance on multiple Amazon datasets.

Artificial Intelligence · Large Language Models · cross-scale collaboration

0 likes · 17 min read

Baobao Algorithm Notes

Nov 18, 2025 · Artificial Intelligence

How LightReasoner Lets Small Models Teach Large Models to Reason Efficiently

The LightReasoner paper from the University of Hong Kong shows that small language models can guide large models on critical reasoning steps, achieving up to 90% faster inference and significant accuracy gains across multiple math benchmarks.

Contrastive Decoding · KL divergence · Large Language Models

0 likes · 9 min read

Alibaba Cloud Developer

Nov 18, 2025 · Artificial Intelligence

How ReAct and Reflexion Boost Large Language Models for Complex, Real‑World Tasks

The article explains the limitations of large language models on multi‑step reasoning, real‑time information retrieval, and planning, then introduces the ReAct (Reasoning + Acting) framework and its Reflexion extension, detailing their mechanisms, examples, performance gains, practical applications, and future research directions.

LLM reasoning · Large Language Models · Prompt Engineering

0 likes · 16 min read

AI Tech Publishing

Nov 17, 2025 · Artificial Intelligence

Frontier AI Models in RL Environments Reveal an Agent Capability Hierarchy

The article evaluates nine cutting‑edge AI models on 150 simulated workplace tasks, showing that even the strongest models complete fewer than 40% of tasks, and uses these results to propose a hierarchical framework of agentic capabilities ranging from tool use to common‑sense reasoning.

AI model evaluation · Large Language Models · Tool Use

0 likes · 19 min read

Software Engineering 3.0 Era

Nov 16, 2025 · Industry Insights

Inside the 2025 AI+ R&D Survey: How Chinese Teams Are Transforming Software Development with Large Models

The 2025 AI+ R&D survey reveals that 89.2% of Chinese software teams have embraced large language models, with 62.8% actively using them, driving vertical adoption, significant cost and productivity gains, while also highlighting eight key challenges and a shift toward AI agents.

AI · China · Industry Survey

0 likes · 16 min read

Data Thinking Notes

Nov 16, 2025 · Artificial Intelligence

How AI Agents Transform Automation: Architecture, Challenges & Future Trends

This comprehensive overview examines AI agents powered by large language models, detailing their definition, core components, architectural patterns, key technologies such as prompt engineering and retrieval‑augmented generation, diverse application domains, current challenges, security solutions, and emerging research directions.

Large Language Models · Multi-Agent Systems · Prompt Engineering

0 likes · 81 min read

Liangxu Linux

Nov 12, 2025 · Artificial Intelligence

Top Open‑Source AI‑Powered Tools to Boost Your Workflow (2024)

The article introduces several open‑source projects: MarkItDown for document‑to‑Markdown conversion, the Codebuff AI coding assistant, Twitter’s recommendation algorithm, mlx‑lm for running LLMs on Apple silicon, the Perplexica AI search engine, and the ChinaTextbook dataset, highlighting their features, usage, and GitHub links.

AI · Large Language Models · Search Engine

0 likes · 6 min read

AntTech

Nov 11, 2025 · Artificial Intelligence

Breaking the Efficiency Wall: Ant Group’s Bailing Model Paves the Way to AGI

At CNCC 2025, Ant Group’s Vice President Zhou Jun outlined the Bailing large‑model’s five‑layer architecture, hybrid linear attention, Ling Scaling Law, and novel training algorithms that dramatically cut costs and latency, achieving state‑of‑the‑art performance on math and code benchmarks while promoting open‑source collaboration toward AGI.

AGI · Large Language Models · Mixture of Experts

0 likes · 8 min read

Alimama Tech

Nov 11, 2025 · Artificial Intelligence

Accelerating LLM RL with Async Training, Mini‑Critics, and Attention Rewards

This article introduces the 3A collaborative framework—Async architecture, Asymmetric PPO mini‑critics, and an attention‑based reasoning rhythm—demonstrating how decoupled, fine‑grained parallel training and structure‑aware reward allocation dramatically improve efficiency, scalability, and interpretability of reinforcement learning for large language models.

Asynchronous Training · Large Language Models · attention mechanisms

0 likes · 23 min read

Network Intelligence Research Center (NIRC)

Nov 11, 2025 · Artificial Intelligence

What Is Mechanistic Interpretability and Why It Matters for Large Language Models

The article defines mechanistic interpretability as reverse‑engineering LLMs to reveal how they represent knowledge and make decisions, explains its importance for transparency, risk mitigation, and model improvement, and surveys key techniques such as causal tracing, zero ablation, noising, and logit‑lens methods with illustrative examples.

Large Language Models · causal tracing · logit lens

0 likes · 8 min read
