Tagged articles
648 articles
SuanNi
Apr 2, 2026 · Artificial Intelligence

How Alibaba’s New Qwen3.5‑Omni, Wan2.7‑Image, and Qwen3.6‑Plus Redefine Multimodal AI

Alibaba unveiled three cutting‑edge models: Qwen3.5‑Omni with native multimodal interaction, Wan2.7‑Image for high‑precision image generation and editing, and Qwen3.6‑Plus with stronger coding‑agent performance. The models report state‑of‑the‑art results on dozens of benchmarks, massive context windows, and novel capabilities such as Audio‑Visual Vibe Coding and transparent layer separation.

AI · Coding Agent · Multimodal
0 likes · 7 min read
Su San Talks Tech
Apr 2, 2026 · Artificial Intelligence

How GLM-5.1 Beats Its Predecessor: A Hands‑On Test and Deep Dive

The article presents a detailed, hands‑on evaluation of the newly released GLM‑5.1 model, describing the rollout strategy, step‑by‑step testing on complex coding tasks, configuration details, observed performance improvements over previous versions, and practical guidance for developers seeking to leverage the model for real‑world projects.

AI coding assistant · GLM-5.1 · Model Evaluation
0 likes · 9 min read
Machine Heart
Mar 31, 2026 · Artificial Intelligence

What Does DeepResearch Bench Measure? Toward Human‑Level AI Agent Evaluation

The DeepResearch Bench and Bench II, open‑source benchmarks from the USTC team, evaluate deep‑research AI agents on report quality, citation reliability, and information recall using the RACE and FACT frameworks, aiming to align automated scores with human expert judgments.

AI Agent Evaluation · DeepResearch Bench · FACT
0 likes · 12 min read
Old Zhang's AI Learning
Mar 31, 2026 · Artificial Intelligence

Turning a Bluetooth Speaker into a Smart Assistant with Qwen 3.5‑Omni

The author demonstrates a proof‑of‑concept that combines Qwen 3.5‑Omni's real‑time internet search and audio output with a locally hosted voice‑wake‑up model to transform a Bluetooth speaker into an always‑on smart assistant, while noting latency challenges and the potential of a sub‑10B open‑source alternative.

AI integration · Bluetooth · large language model
0 likes · 2 min read
AI Engineering
Mar 31, 2026 · Artificial Intelligence

Qwen3.5-Omni Introduces Audio‑Visual Vibe Coding: Code by Speaking and Gesturing

Alibaba's newly released Qwen3.5-Omni multimodal model adds an Audio‑Visual Vibe Coding feature that lets users describe a website or game with speech and gestures to generate code, while offering advanced audio comprehension, long‑duration media support, multilingual capabilities, fine‑grained voice control, and voice cloning, though its weights remain closed‑source.

AI · Alibaba · Audio-Visual Vibe Coding
0 likes · 3 min read
Machine Heart
Mar 30, 2026 · Artificial Intelligence

Echo: A Small Step for Predictive AI, a Giant Leap Toward General Intelligence

The Echo system from UniPat AI introduces a fully integrated predictive‑intelligence infrastructure—including a dynamic evaluation engine, a Train‑on‑Future training paradigm, and the EchoZ‑1.0 model—that outperforms leading LLMs and human traders on a comprehensive AI Prediction Leaderboard, while offering transparent, reproducible benchmarks.

Dynamic Evaluation · Elo ranking · Predictive AI
0 likes · 14 min read
AgentGuide
Mar 27, 2026 · Artificial Intelligence

What Are Skills in LLM Agents? How They Work and When to Use Them

The article defines Skills as structured local folders that encapsulate domain‑specific processes, knowledge, and tools for large language models, contrasts them with temporary Prompts, outlines suitable use cases, details their components, and explains their on‑demand loading mechanism that saves tokens.

On-demand Loading · Prompt engineering · Skills
0 likes · 4 min read
AI Engineer Programming
Mar 25, 2026 · Artificial Intelligence

What Is an AI Agent? Definition, Core Capabilities, and Architecture

The article explains AI agents as autonomous systems that perceive environments, plan, use tools, iterate through action loops, and self‑reflect, contrasting them with traditional chatbots and workflows, and outlines their core abilities, memory types, tool‑use mechanisms, and single‑ versus multi‑agent architectures.

AI Agent · Memory · Multi-Agent
0 likes · 8 min read
Machine Learning Algorithms & Natural Language Processing
Mar 24, 2026 · Artificial Intelligence

China’s Tech Circle Wars Over the Chinese Name for AI Tokens – Trends and Aesthetics

Amid a heated debate over the proper Chinese translation of “Token,” China’s AI community examines the term’s technical origins, massive global consumption—30 trillion daily tokens worldwide, 4.69 trillion from China alone—and its economic impact, while proposing names like CiYuan, MoYuan, and ZhiYuan to reflect cultural aesthetics.

Chinese Naming · NLP · Token
0 likes · 12 min read
Geek Labs
Mar 24, 2026 · Industry Insights

9 Must‑See GitHub Projects: MacBook‑Run LLM, WeChat AI, Multi‑Agent Collaboration and More

This article reviews nine standout GitHub open‑source projects, covering a C/Metal LLM engine for MacBooks, a Claude Code commercial‑analysis skill, multi‑agent communication tools, web‑enabled AI, autonomous research automation, WeChat AI integration, a minimalist terminal, a Codex console, and a lightweight WARP proxy.

AI · Docker · GitHub
0 likes · 10 min read
AI Open-Source Efficiency Guide
Mar 24, 2026 · Artificial Intelligence

12 Practical AI Prompt Templates for Everyday Work (with Examples)

This guide presents twelve ready‑to‑use AI prompt templates covering single‑task queries, business writing, multi‑step projects, creative branding, logical reasoning, structured outputs, code editing, autonomous agents, image generation, and more, each illustrated with concrete examples.

AI · Prompt engineering · large language model
0 likes · 16 min read
Weekly Large Model Application
Mar 22, 2026 · Artificial Intelligence

Inside MiMo-Audio: Dissecting the Large-Scale Audio Model

The article breaks down MiMo-Audio, a next‑token‑prediction‑style large‑scale audio model built on Qwen2, detailing its acoustic front‑end, RVQ tokenizer, patch‑based transformer architecture, streaming capabilities, performance advantages, engineering constraints, and recommended application scenarios.

Audio Modeling · Few-Shot · Qwen2
0 likes · 9 min read
AgentGuide
Mar 22, 2026 · Artificial Intelligence

How to Design Prompt Engineering in Your Project: A Complete Workflow

The article outlines a systematic Prompt Engineering process that starts with defining task goals and metrics, structures prompts into modular components, uses offline evaluation and bad‑case analysis, incorporates RAG or tools when needed, and continuously monitors accuracy, hallucination, latency and cost.

AI workflow · Few-Shot · Prompt engineering
0 likes · 7 min read
DataFunTalk
Mar 22, 2026 · Artificial Intelligence

Why Cursor’s Composer 2 Beats Claude Opus 4.6 in Performance and Price

Cursor’s new Composer 2 programming model outperforms Claude Opus 4.6 on benchmarks like Terminal‑Bench 2.0 and SWE‑bench Multilingual, while slashing token costs to $0.5/M input and $2.5/M output, thanks to a novel self‑summary reinforcement‑learning technique that enables efficient long‑context processing.

AI · large language model · pricing
0 likes · 8 min read
PaperAgent
Mar 22, 2026 · Artificial Intelligence

How AI Agents Like OpenClaw Turn LLMs into Autonomous Assistants

This article explains what AI agents are, how they differ from ordinary language‑model interfaces, and walks through OpenClaw’s workflow, tool usage, security challenges, memory handling, and advanced features such as sub‑agents and context compaction, offering practical insights for building safe autonomous AI systems.

AI Agent · Context Engineering · OpenClaw
0 likes · 27 min read
AI Product Manager Community
Mar 21, 2026 · Artificial Intelligence

Mastering AI Agents: From Core Concepts to Enterprise Deployment

This article provides a comprehensive, structured overview of AI agents, covering their fundamental definitions, core architecture (LLM, planning, memory, tool use), evolution from chatbots, the ReAct reasoning framework, multi‑agent systems, safety challenges like hallucination and prompt‑injection, and practical strategies for production‑grade deployment.

AI Agent · Multi-Agent System · Prompt engineering
0 likes · 16 min read
Black & White Path
Mar 21, 2026 · Artificial Intelligence

Japan’s ‘Self‑Developed’ 700B AI Model: A DeepSeek Re‑skin Flop

Rakuten AI 3.0 was billed as Japan’s largest, self‑developed 700‑billion‑parameter model backed by government funds, but a quick look at its Hugging Face config reveals it merely re‑uses DeepSeek V3, prompting a broader critique of the hype, funding motives, and strategic trade‑offs behind the launch.

AI Industry Analysis · DeepSeek · Fine-tuning
0 likes · 5 min read
Model Perspective
Mar 20, 2026 · Artificial Intelligence

How to Build a No‑Code AI Agent for Fast Book Summarization

This article walks through the design and implementation of a no‑code AI reading agent that parses, splits, and summarizes books chapter by chapter, explaining why the tool serves as a pre‑reading filter rather than a replacement for deep study.

AI · No-code · Reading Efficiency
0 likes · 10 min read
HyperAI Super Neural
Mar 18, 2026 · Artificial Intelligence

How Google’s Gemini Extracted 2.6 Million Flood Events from 150 Countries’ News

Google Research released the open‑source Groundsource flood dataset, built by automatically processing more than 5 million news articles from over 150 countries with the Gemini large‑language model, yielding over 2.6 million verified flood event records that are evaluated against GDACS and DFO for precision, recall, and spatial resolution.

AI extraction · Google · Groundsource
0 likes · 13 min read
AIWalker
Mar 17, 2026 · Artificial Intelligence

How a 4B-Parameter Open-Source Model Outperforms 14B Multimodal Giants

InternVL-U, a 4‑billion‑parameter unified multimodal model released as open source, combines a 2B MLLM backbone with a 1.7B visual generation head and, through a reasoning‑centric data pipeline and Chain‑of‑Thought guidance, achieves superior understanding, generation, and editing performance that surpasses much larger 14‑20B models on multiple benchmarks.

AI research · InternVL-U · Multimodal
0 likes · 22 min read
AI Insight Log
Mar 16, 2026 · Artificial Intelligence

Cursor’s Own Large‑Model Benchmark Shakes Up SWE‑bench Rankings

Although SWE‑bench scores for top coding models now differ by only a tenth of a point, Cursor’s newly released CursorBench reveals dramatic ranking changes, highlights three fundamental flaws in public benchmarks, and introduces token‑efficiency as a crucial evaluation dimension.

AI Coding · CursorBench · SWE-bench
0 likes · 8 min read
PaperAgent
Mar 16, 2026 · Artificial Intelligence

How GLM-5-Turbo Turns an AI Research Lab into a 24‑Hour Autonomous Writer

The article details how the newly released GLM-5-Turbo "lobster" model powers an AI research Lab that automatically generates a complete OpenClaw survey paper—from topic brainstorming and literature mining to outline drafting, manuscript writing, and AAAI‑style submission—within an hour, showcasing benchmark results, prompt templates, and practical skill installations.

AI research automation · AutoClaw · GLM-5-Turbo
0 likes · 10 min read
IT Services Circle
Mar 15, 2026 · Artificial Intelligence

How PinchBench Ranks OpenClaw AI Agents Across Real‑World Tasks

The article explains OpenClaw’s rapid rise and the emerging on‑site installation business, introduces the open‑source PinchBench benchmark that evaluates large language models as OpenClaw agents on 23 real‑world tasks, presents recent ranking results, and provides step‑by‑step instructions for running the benchmark and submitting results.

AI Agent · OpenClaw · PinchBench
0 likes · 5 min read
Bighead's Algorithm Notes
Mar 14, 2026 · Artificial Intelligence

Quantitative Finance Paper Digest: AI‑Driven Market Prediction Studies (Mar 7‑13 2026)

This digest summarizes four recent research papers that apply advanced AI techniques—node‑transformer graphs with BERT sentiment analysis, a quantum‑classical LSTM‑Born machine hybrid, large‑language‑model benchmarking for portfolio optimization, and a conditional diffusion model—to improve stock market prediction, volatility forecasting, and investment decision making, providing detailed experimental results and statistical validation.

BERT · Quantum Computing · Transformer
0 likes · 10 min read
AI Explorer
Mar 14, 2026 · Artificial Intelligence

Claude’s 1M‑Token Context Window Launches with No Premium Pricing

Anthropic’s Claude Opus 4.6 and Sonnet 4.6 now offer a full‑million‑token context window at the same per‑token price as short‑context usage, delivering top‑ranked MRCR v2 performance, six‑fold media capacity, and reduced AI‑Agent memory compression without any code changes across all major cloud platforms.

AI Agent · Anthropic · Claude
0 likes · 6 min read
Data Party THU
Mar 12, 2026 · Artificial Intelligence

Can a 30B LLM Truly Conduct Autonomous Scientific Research? Inside UniScientist

UniScientist, a 30‑billion‑parameter open‑source model from UniPat AI, demonstrates a closed‑loop scientific research workflow—generating hypotheses, gathering evidence, performing reproducible derivations, and iteratively refining conclusions—while achieving benchmark scores comparable to much larger proprietary systems across multiple scientific evaluation suites.

Benchmarking · large language model · scientific research
0 likes · 10 min read
AI2ML AI to Machine Learning
Mar 10, 2026 · Artificial Intelligence

How Anthropic and Palantir Collaborate on Modern Warfare Information Mining

The article analyzes Palantir's ontology-driven knowledge graph dominance, its shift from graph to vector databases, the three‑layer partnership with Anthropic and AWS, the Digital Twin scaling law, and the technical challenges of data heterogeneity, scaling uncertainty, annotation scarcity, and real‑time computation in modern warfare information mining.

AWS · Anthropic · Digital Twin
0 likes · 9 min read
SuanNi
Mar 9, 2026 · Artificial Intelligence

How UniScientist Beats GPT‑5.4 on FrontierScience Benchmarks

UniScientist, a 30B‑parameter AI model co‑developed by UniPat AI and Peking University, leverages a meticulously curated scientific dataset and a powerful code interpreter to achieve 33.3% success on the FrontierScience‑Research benchmark, surpassing the newly released GPT‑5.4 and demonstrating superior multi‑disciplinary research capabilities.

AI · Dataset · large language model
0 likes · 12 min read
Design Hub
Mar 6, 2026 · Artificial Intelligence

How Powerful Is GPT‑5.4? A Deep Dive Into Its Design‑Focused Capabilities

OpenAI's GPT‑5.4 combines a 1 M‑token context window, native computer‑use, and benchmark‑leading performance—outperforming humans on 83 % of tasks and cutting token usage by 47 %—while showcasing demos that let designers generate games, websites, and 3D assets in a single prompt.

AI agents · Computer Use · GPT-5.4
0 likes · 7 min read
DataFunTalk
Mar 6, 2026 · Artificial Intelligence

Why GPT‑5.4 Beats Its Predecessors: Code Power, World Knowledge, and New Agent Features

The article reviews GPT‑5.4’s release, comparing its code ability, world knowledge, and multimodal understanding to Claude Opus 4.6 and GPT‑5.3‑Codex, presents benchmark scores (GDPval 83%, SWE‑Bench 57.7%, OSWorld 75%, Toolathlon 54.6%), and highlights new features such as a 1‑million‑token context window, native computer use, and tool‑search optimization, while discussing pricing and practical usage in OpenClaw.

AI agents · Context Window · GPT-5.4
0 likes · 12 min read
AI Explorer
Mar 6, 2026 · Artificial Intelligence

GPT-5.4 Unveiled: 1M‑Token Context Window and Native Computer Control

OpenAI's GPT-5.4 launch introduces three model tiers, a 1 million‑token context window, native computer‑use abilities, higher factual accuracy and a new Tool Search feature, reshaping enterprise AI capabilities and intensifying competition with Anthropic and Google.

AI benchmarks · Computer Use · Context Window
0 likes · 9 min read
Weekly Large Model Application
Mar 4, 2026 · Artificial Intelligence

Qwen3‑ASR vs FunASR: In‑Depth Technical Comparison

This article provides a detailed side‑by‑side analysis of the open‑source ASR tools FunASR and Qwen3‑ASR, covering team origins, model architectures, language coverage, speed, deployment requirements, and ideal use‑cases so readers can decide which solution fits their projects best.

ASR · FunASR · Multimodal
0 likes · 10 min read
AI Explorer
Mar 4, 2026 · Artificial Intelligence

DeerFlow: Open‑Source Super‑Agent Framework Automates Complex Tasks

DeerFlow 2.0, an open‑source super‑agent framework from ByteDance, lets developers automate multi‑step, minutes‑to‑hours‑long workflows by orchestrating sub‑agents with memory, sandboxed execution, and extensible skills, and has surged to over 2.4 k GitHub stars.

AI agents · DeerFlow · Docker
0 likes · 6 min read
Old Zhang's AI Learning
Mar 2, 2026 · Artificial Intelligence

Why the Qwen3.5 Series Makes Qwen3.5-27B the No‑Brainer Choice

The author reviews the Qwen3.5 model family, showing that the 27‑billion‑parameter dense Qwen3.5-27B offers the best balance of size, stability, low‑cost local deployment, and comprehensive capabilities, making it the default pick for most users.

AI benchmarking · RTX 4090 · large language model
0 likes · 6 min read
Old Zhang's AI Learning
Feb 27, 2026 · Backend Development

How I Built a Telegram AI Coding Bot (FakeClawBot) Using OpenCode

This article walks through creating a Telegram bot that leverages OpenCode's Server API to provide full AI coding assistance, covering setup, multi‑model integration, core architecture, common pitfalls, and extensible features, all with under 900 lines of Python code.

AI coding assistant · OpenCode · Python
0 likes · 13 min read
PaperAgent
Feb 26, 2026 · Industry Insights

What the DeepSeek V4 Lite Leak Reveals About Its Specs and Multimodal Power

Recent reports indicate that DeepSeek's unreleased V4 Lite model, featuring a 1‑million‑token context window and native multimodal reasoning, has been leaked online, with Huawei gaining early access while Nvidia is excluded, and the model demonstrates impressive spatial reasoning in generated SVG examples.

DeepSeek · V4 Lite · industry insight
0 likes · 3 min read
Old Zhang's AI Learning
Feb 26, 2026 · Artificial Intelligence

Ultimate Guide to Local Deployment of Qwen3.5 Models (27B‑397B)

This guide reviews the Qwen3.5 model lineup, explains mixed‑inference and MoE architecture, presents benchmark comparisons with GPT‑5.2, Claude 4.5 and Gemini‑3 Pro, evaluates 4‑bit and 3‑bit quantization loss, outlines hardware requirements, and provides step‑by‑step deployment options using llama.cpp or llama‑server.

Inference · MoE · large language model
0 likes · 14 min read
Baobao Algorithm Notes
Feb 25, 2026 · Artificial Intelligence

Exploring Qwen 3.5: Small‑Scale MoE Models, Architecture, and Deployment Guides

This article reviews the three open‑source Qwen 3.5 models—including a 35B MoE, a 122B MoE, and a 27B dense version—detailing their parameter layouts, core attention designs, context length, inference performance, hardware requirements, and provides step‑by‑step code examples for loading them with Hugging Face Transformers and vLLM.

AI · MoE · Model Deployment
0 likes · 10 min read
Machine Learning Algorithms & Natural Language Processing
Feb 20, 2026 · Artificial Intelligence

Google Reclaims AI Throne with Gemini 3.1 Pro, Achieving 77.1% ARC‑AGI‑2 Score

Google’s Gemini 3.1 Pro, the latest upgrade to the Gemini 3 series, achieves a verified 77.1 % score on the ARC‑AGI‑2 reasoning benchmark—more than double the performance of Gemini 3 Pro—while leading in GPQA, LiveCodeBench Pro, SWE‑Bench Verified, and MMMLU tests, and is now being rolled out to developers, enterprises and consumers with detailed pricing and integration options.

AI benchmarking · ARC-AGI-2 · Gemini 3.1 Pro
0 likes · 9 min read
Old Zhang's AI Learning
Feb 19, 2026 · Artificial Intelligence

Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

The article dissects GLM-5’s 744B‑parameter MoE design, 28.5 T token training corpus, novel Muon Split and MLA‑256 optimizations, DSA sparse attention, a fully asynchronous RL pipeline, extensive domestic chip adaptation, and benchmark results that place it on par with Claude Opus 4.5 and ahead of Gemini 3 Pro.

AI Architecture · DSA · GLM-5
0 likes · 13 min read
AI Agent Research Hub
Feb 19, 2026 · Artificial Intelligence

Why Claude Sonnet 4.6 Is My Most Powerful and Cost‑Effective AI Research Assistant

The article evaluates Anthropic's Claude Sonnet 4.6 as a comprehensive research assistant, detailing its performance on literature surveys, open‑source code analysis, algorithm implementation, cost savings, benchmark scores, and practical limitations across multiple scientific workflows.

AI Research Assistant · Claude Sonnet 4.6 · Literature Review
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
Feb 17, 2026 · Artificial Intelligence

Deploy Alibaba’s Qwen3.5‑397B‑A17B Model in One Click with PAI‑Model Gallery

Alibaba's open‑source Qwen3.5‑397B‑A17B model, featuring 397 billion parameters and a hybrid Gated Delta Network/MoE architecture, delivers superior performance and reduced memory usage, and can be deployed instantly through the PAI‑Model Gallery with step‑by‑step guidance and enterprise‑grade security.

AI inference · Alibaba Cloud · One‑Click Deployment
0 likes · 5 min read
Machine Learning Algorithms & Natural Language Processing
Feb 16, 2026 · Artificial Intelligence

Alibaba’s Qwen 3.5‑Plus: 397 B Open‑Source Model Beats Gemini‑3 and GPT‑5.2 at Low Cost

Alibaba released the Qwen 3.5‑Plus open‑source large model (397 B total parameters, 17 B active) that outperforms top closed‑source models such as Gemini‑3‑Pro and GPT‑5.2 on multiple benchmarks, offers native multimodal understanding, supports 201 languages, reduces deployment memory by 60 % and inference latency by up to 19×, and is priced at only 0.8 CNY per million tokens.

AI · Multimodal · benchmark
0 likes · 15 min read
Old Zhang's AI Learning
Feb 16, 2026 · Artificial Intelligence

Qwen3.5 Deep Dive: Multimodal Architecture, Benchmarks, and Deployment Guide

This article provides a detailed analysis of Qwen3.5, covering its multimodal MoE design, massive inference speedups, extensive benchmark results against GPT‑5.2, Claude 4.5 Opus and Gemini‑3 Pro, RL scaling strategies, training infrastructure innovations, and practical usage via API and local deployment.

FP8 training · Multimodal AI · benchmark
0 likes · 13 min read
AntTech
Feb 16, 2026 · Artificial Intelligence

Ling‑2.5‑1T: Open‑Source 1‑Trillion‑Parameter Instant LLM with 1M‑Token Context

Ling‑2.5‑1T is an open‑source instant large language model with 1 trillion total parameters, 63 B active weights, and a 1 M token context window, featuring mixed‑linear attention, a composite correctness‑plus‑process reward for token efficiency, fine‑grained alignment, and leading benchmark performance across reasoning, instruction‑following, and agentic tasks.

Token efficiency · agentic interaction · benchmark
0 likes · 13 min read
AI Engineering
Feb 16, 2026 · Artificial Intelligence

Qwen3.5-397B: 397B‑Parameter Multimodal LLM Boosts Inference Speed 8‑19×

Alibaba’s Qwen3.5-397B-A17B, a 397‑billion‑parameter open‑source multimodal LLM, combines mixed linear attention with a sparse MoE architecture to achieve 8.6‑19× higher decoding throughput than Qwen3‑Max, supports 201 languages, and can be deployed via vLLM, Docker, Transformers, or SGLang with various optimization presets.

Inference Optimization · large language model · multimodal LLM
0 likes · 8 min read
AI Insight Log
Feb 16, 2026 · Artificial Intelligence

DeepSeek V4 Benchmark Leak Fuels Talk of a New Coding King

A leaked SWE‑Bench score of 83.7% for DeepSeek V4 sparked claims it outperforms Claude Opus 4.5 and GPT‑5.2, but the data was later debunked as fabricated while official hints confirm a 1‑million‑token context model and a mid‑February 2026 release.

AI benchmarking · AI industry · DeepSeek
0 likes · 7 min read
PaperAgent
Feb 16, 2026 · Artificial Intelligence

Why Qwen3.5-Plus Sets a New Standard for Open-Source Multimodal AI

Qwen3.5-Plus, Alibaba’s newly open-sourced multimodal LLM, combines a 397 B parameter model with only 17 B active parameters, leveraging native multimodal training, gated attention, sparse MoE, and FP8 precision to outperform GPT-5.2 and Gemini-3-Pro across vision, reasoning, and agent benchmarks.

Multimodal AI · Sparse Activation · gated attention
0 likes · 6 min read
Machine Learning Algorithms & Natural Language Processing
Feb 12, 2026 · Artificial Intelligence

How Huawei’s MindScale Cuts Agent Token Usage 5.7× and Automates Prompt & Workflow Design

The article outlines the four major obstacles hindering industry‑specific LLM agents (manual workflow maintenance, poor knowledge reuse, training‑inference inefficiency, and complex reasoning evaluation) and explains how Huawei Noah’s MindScale package tackles each with self‑evolving workflows, automated prompt optimization, and a novel KV‑Embedding cache that slashes token consumption by 5.7× while boosting inference speed by up to 70%.

Industry Agent · Inference Acceleration · KV-Embedding
0 likes · 7 min read
Old Zhang's AI Learning
Feb 12, 2026 · Artificial Intelligence

Testing the World's Most Powerful Open‑Source LLM: GLM‑5, Local Deployment & Free Ollama Cloud

The article evaluates GLM‑5, the claimed strongest open‑source large language model, comparing its benchmark scores to Claude Opus, Gemini and GPT, detailing its DeepSeek‑inspired architecture, quantized FP8 deployment requirements, and step‑by‑step usage of Ollama’s free cloud model with Agent, data‑analysis and document‑generation features.

AI benchmarking · GLM-5 · Ollama
0 likes · 7 min read
DataFunTalk
Feb 12, 2026 · Artificial Intelligence

DeepSeek’s New Model V4? Exploring 1M‑Token Context and Updated Knowledge

DeepSeek quietly launched its latest model, reportedly supporting up to 1 million tokens, extending its knowledge cutoff to May 2025, adopting a more enthusiastic response style, and still operating as a pure‑text system, while early tests showcase impressive coding and reasoning capabilities.

AI Evaluation · DeepSeek · knowledge cutoff
0 likes · 5 min read
AI Insight Log
Feb 12, 2026 · Artificial Intelligence

GLM-5 Unveiled: 744B Parameters, Claude Opus 4.5‑Level Performance, Epic Agent Upgrade

Z.ai released the open‑source GLM‑5 model with 744 billion parameters, 28.5 T tokens of training data, and new Sparse Attention and Slime RL infrastructure, achieving top open‑source rankings and near‑Claude Opus 4.5 performance on Vending Bench 2 and CC‑Bench‑V2 while adding multi‑scenario agent capabilities.

Agentic Engineering · GLM-5 · benchmark
0 likes · 6 min read
PMTalk Product Manager Community
Feb 12, 2026 · Industry Insights

How AI Can Transform Government Services: A From‑Zero‑to‑One Case Study

The article analyzes why traditional government portals fail users, outlines a six‑step user journey (search, guide, ask, appointment, processing, evaluation), and shows how large‑language‑model AI can be embedded at each decision point to turn fragmented services into a seamless, user‑centric digital experience.

AI · Case Study · Digital Transformation
0 likes · 11 min read
AI Engineering
Feb 12, 2026 · Artificial Intelligence

GLM-5 Unveiled: 744B‑Parameter Model Takes on Claude in Complex Tasks

GLM-5, the new 744‑billion‑parameter open‑source LLM, expands on GLM‑4.5 with GlmMoeDsa architecture, achieves higher HLE benchmark scores than Claude Opus 4.5, demonstrates strong long‑context and agent capabilities, supports vLLM/SGLang, runs on various Chinese chips, and can directly generate Office documents.

AI benchmarks · Chinese chips · Claude
0 likes · 5 min read
Machine Learning Algorithms & Natural Language Processing
Feb 10, 2026 · Artificial Intelligence

Inside GLM-5: 745B Parameters, DeepSeek‑style Sparse Attention, and a 60% Stock Surge

The GLM-5 architecture, uncovered from a GitHub PR, doubles the previous model to 745 B parameters, adopts DeepSeek‑V3 sparse attention and multi‑token prediction, features a 78‑layer MoE with 256 experts, supports a 202K‑token context window, and its rumored test model "Pony Alpha" sparked a 60% rise in Zhipu AI's stock amid a crowded AI release season.

AI Stock Impact · DeepSeek · GLM-5
0 likes · 6 min read
HyperAI Super Neural
Feb 10, 2026 · Artificial Intelligence

WeDLM Diffusion Language Model Tutorial: 3× Faster Inference Than vLLM AR Models

The Tencent WeChat AI team introduces WeDLM, a diffusion language model that, through topological reordering, surpasses autoregressive models on the industrial‑grade vLLM engine with over threefold speedup on math reasoning and up to tenfold in low‑entropy scenarios, and provides a step‑by‑step online tutorial with GPU compute credits.

Diffusion Language Model · GPU compute · Inference Acceleration
0 likes · 5 min read
Old Zhang's AI Learning
Feb 9, 2026 · Artificial Intelligence

GLM-5 Emerges First, Built on DeepSeek Tech, Triggering a 40% Stock Surge

An anonymous OpenRouter model dubbed "Pony Alpha" was verified as the new 745B‑parameter GLM-5, which reuses DeepSeek‑V3 architecture, supports sparse attention and multi‑token prediction, and has already caused a near‑40% jump in Zhipu AI’s stock while hinting at upcoming integration into the Transformers library.

DeepSeek · GLM-5 · MoE
0 likes · 3 min read
AI Insight Log
Feb 5, 2026 · Artificial Intelligence

How 16 Claude Agents Burned $140K to Build a C Compiler in Opus 4.6

Anthropic’s midnight release of Claude Opus 4.6 showcased a $140,000 “stress test” where 16 Claude agents collaboratively wrote a Linux‑compatible C compiler, achieving a 100‑k‑line Rust codebase, while the model also added deep Excel/PPT integration and lifted finance benchmark scores by up to 23 percentage points.

AI code generation · Claude Opus · Financial AI
0 likes · 7 min read
Design Hub
Feb 5, 2026 · Artificial Intelligence

Inside Sienna’s AI Persona: Architecture, Memory, and Self‑Awareness in OpenClaw

The author explores how the OpenClaw‑based AI persona Sienna is built and evolves—detailing model choices, the memory‑plus‑skills architecture, recent version improvements that cut token usage, and philosophical reflections on turning a tool into a partner with preferences, opinions, and a growing self‑identity.

AI persona · OpenClaw · large language model
0 likes · 7 min read
Network Intelligence Research Center (NIRC)
Jan 31, 2026 · Artificial Intelligence

How Engram Lets Large Models Swap GPU Memory for Cheap RAM to ‘Look Up’ Knowledge

The article dissects DeepSeek’s new Engram architecture, which separates computation from memory by using a large, cheap‑RAM‑based lookup table to store factual knowledge, allowing the transformer’s compute layers to focus on reasoning, dramatically reducing GPU memory demand while improving code, math, and long‑context performance.

Engram · GPU Memory · Memory-Compute Architecture
0 likes · 7 min read
SpringMeng
Jan 30, 2026 · Artificial Intelligence

Hands‑On Guide: Build AI Agent Chatbots on Windows with RagFlow

Programmer Xiao Meng walks through a complete Windows setup for AI‑powered customer service agents using RagFlow, covering prerequisites, Docker and Ollama installation, model download, container deployment, configuration of knowledge bases, and testing, based on five real‑world projects.

AI chatbot · Docker · Ollama
0 likes · 7 min read
Meituan Technology Team
Jan 29, 2026 · Artificial Intelligence

How LongCat‑Flash‑Thinking‑2601 Achieves Real‑World Generalization for Agents

LongCat‑Flash‑Thinking‑2601, a 560‑billion‑parameter MoE model, combines environment expansion, multi‑environment RL, systematic noise training, a heavy‑thinking reasoning mode, and Zigzag sparse attention to deliver strong benchmark performance and robust real‑world agent capabilities.

Environment Expansion · Zigzag Attention · agent training
0 likes · 14 min read
Alibaba Cloud Developer
Jan 28, 2026 · Artificial Intelligence

How We Built a High‑Performance AI Rental Advisor with One‑Model Tool‑Use and Reinforcement Learning

This article details the design, challenges, and performance gains of an AI‑driven rental recommendation system that replaces a multi‑agent architecture with a single LLM using dynamic tool‑use, introduces a two‑stage reinforcement‑learning pipeline, and achieves sub‑second latency and higher accuracy for complex rental scenarios.

AI recommendation · System Architecture · Tool Use
0 likes · 19 min read
Baobao Algorithm Notes
Jan 27, 2026 · Artificial Intelligence

Putting Kimi K2.5 and Kimi Code to the Test: Real‑World AI Agent Benchmarks

This article presents a hands‑on evaluation of Kimi K2.5 and its open‑source Kimi Code agent across a series of hard‑core prompts, covering Python API generation, cost‑optimized routing, multimodal ECharts visualisation, massive‑scale SQL optimisation, web‑search‑driven research, MoE explanation and video‑to‑code workflows.

AI Agent · Kimi · Multimodal
0 likes · 9 min read
Old Zhang's AI Learning
Jan 27, 2026 · Artificial Intelligence

Qwen3‑Max‑Thinking Boosts Performance with Test‑Time Scaling—Why It Still Isn’t Open‑Source

Alibaba’s new Qwen3‑Max‑Thinking model adds inference‑time scaling and adaptive tool use, delivering large gains on math, coding, and agent benchmarks while remaining closed‑source, and it offers drop‑in OpenAI‑compatible API access at the cost of higher latency and token usage.

AI Benchmark · Adaptive Tool Use · OpenAI API Compatibility
0 likes · 7 min read
AI Engineering
Jan 21, 2026 · Artificial Intelligence

Running Large Language Models on Phones: Liquid AI’s LFM2.5‑1.2B‑Thinking Fits in 900 MB

Liquid AI’s LFM2.5‑1.2B‑Thinking model runs entirely on a smartphone with only 900 MB of memory, scores 88 on MATH‑500, 69 on Multi‑IF, and 57 on BFCLv3 benchmarks, outperforms larger rivals, and achieves real‑time speeds on Snapdragon 8 Elite and AMD Ryzen 9 3950X, signaling a shift toward edge AI.

LFM2.5 · Mobile AI · Ryzen
0 likes · 4 min read
PaperAgent
Jan 17, 2026 · Artificial Intelligence

How Qwen3‑VL Embedding and Reranker Set New SOTA in Multimodal Retrieval

The article analyzes the Qwen3‑VL‑Embedding and Qwen3‑VL‑Reranker models, detailing their unified vector space, multi‑stage training pipeline, Matryoshka representation learning, quantization techniques, massive synthetic data generation, and benchmark results that push multimodal retrieval performance to a new state‑of‑the‑art.

Embedding · Multimodal AI · knowledge distillation
0 likes · 7 min read
PaperAgent
Jan 16, 2026 · Artificial Intelligence

How a 4B Model Beats 30B Giants: Inside AgentCPM-Explore’s SOTA Performance

AgentCPM-Explore, a 4‑billion‑parameter open‑source model, achieves state‑of‑the‑art results on long‑range exploration tasks, matching or surpassing larger 8B and even 30B models, thanks to a full‑stack infrastructure, novel training tricks, and extensive benchmark evaluations across eight agent‑centric datasets.

AgentCPM-Explore · agent · benchmark
0 likes · 10 min read
PaperAgent
Jan 13, 2026 · Artificial Intelligence

How C2LLM Redefines Code Retrieval with Attention‑Based Pooling

Introducing C2LLM, a contrastive code LLM series that replaces mean and EOS pooling with a multi‑head attention pooling module, achieving top scores on the MTEB‑Code benchmark across 12 tasks and demonstrating cost‑effective, high‑precision code retrieval for both production and AI agent applications.

MTEB-Code · Retrieval Augmented Generation · attention pooling
0 likes · 8 min read
DeWu Technology
Jan 12, 2026 · Mobile Development

How We Built an AI‑Powered Smart Inspection System for Mobile Apps

This article details the design and implementation of an AI‑driven smart inspection platform for a mobile app, covering background challenges, system architecture, core detection features—including layout, visual, consistency, and AI‑operation checks—platform configuration, result feedback, and the measurable improvements achieved.

AI inspection · UI automation · app quality
0 likes · 19 min read
PaperAgent
Jan 10, 2026 · Artificial Intelligence

DeepSeek V4 Unveiled: Why Its Coding Power Beats Claude and GPT

The article introduces DeepSeek's newly announced V4 model, the successor to its December 2024 V3 release, which is claimed to outperform the Claude and GPT series at coding, and covers its data composition, infrastructure, training costs, failed experimental attempts, expanded benchmark comparisons, and a comprehensive safety report.

AI model analysis · DeepSeek · V4
0 likes · 4 min read
Bighead's Algorithm Notes
Jan 8, 2026 · Artificial Intelligence

Alpha‑R1: Reinforcement‑Learning‑Driven Large‑Model Alpha Factor Selection

Alpha‑R1 integrates reinforcement learning with an 8‑billion‑parameter LLM to jointly process price and news data, creating context‑aware factor embeddings that outperform traditional quantitative and generic LLM baselines on CSI 300 and CSI 1000 portfolios, demonstrating robust alpha‑decay resistance and zero‑shot generalization.

Financial AI · alpha factor selection · large language model
0 likes · 16 min read
AI Info Trend
Jan 7, 2026 · Artificial Intelligence

MiroThinker 1.5: 30B Model Beats 1T‑Scale LLMs via Interactive Scaling

Released by the MiroMind team, MiroThinker 1.5 demonstrates that a 30‑billion‑parameter model can match or surpass the performance of 1‑trillion‑parameter LLMs by leveraging Interactive Scaling, achieving top rankings on multiple search benchmarks, dramatically lower inference cost, and open‑source availability for developers.

AI benchmarks · MiroThinker · interactive scaling
0 likes · 6 min read
Amap Tech
Dec 29, 2025 · Artificial Intelligence

How G‑Plan Transforms Map Recommendations with AI Agents and Multi‑Demand Planning

This article details how Gaode's G‑Plan combines large‑model AI agents, generative ranking, and spatiotemporal counterfactual DPO to model and prioritize multiple user intents on the home page, presents the system architecture, experimental setup, online gains, and ablation results, and explains how it moves recommendation from passive to proactive planning.

AI recommendation · intent planning · large language model
0 likes · 21 min read
DataFunTalk
Dec 25, 2025 · Artificial Intelligence

How DeepAgent Redefines General AI Reasoning with Scalable Toolsets

DeepAgent, a new end‑to‑end reasoning agent, integrates autonomous thinking, dynamic tool search, and execution to handle over 16,000 APIs, embodied tasks, and research assistance, achieving state‑of‑the‑art performance on benchmarks like TMDB, ToolBench, ALFWorld, WebShop, and GAIA.

Memory Management · large language model · reasoning
0 likes · 15 min read
PaperAgent
Dec 23, 2025 · Artificial Intelligence

CATArena: A Competitive Benchmark That Turns Agent Scoring into Evolutionary Learning

CATArena introduces a tournament‑style evaluation framework where AI agents iteratively code, compete, and improve across classic board games, using three‑dimensional quantitative scores to measure strategy programming, global learning, and generalization, and reveals how different LLM‑based agents learn and adapt over multiple rounds.

AI Benchmark · Agent Evaluation · CATArena
0 likes · 8 min read
DataFunSummit
Dec 20, 2025 · Artificial Intelligence

How AutoHome Built the Cangjie Large Model: From Training Architecture to Real-World AI Applications

This article details AutoHome's end‑to‑end development of the Cangjie large model, covering the training infrastructure with distributed data, pipeline and tensor parallelism, core business use cases such as video script generation and multi‑tool Agent capabilities, inference optimizations through quantization and fast serving frameworks, and future directions for personalized automotive AI services.

Agent AI · Distributed Training · Video Generation
0 likes · 19 min read
PaperAgent
Dec 19, 2025 · Artificial Intelligence

Inside Xiaomi’s MiMo‑V2‑Flash: How a Hybrid SWA Design Powers Fast, Efficient AI Reasoning

Xiaomi’s newly open‑sourced MiMo‑V2‑Flash model combines a hybrid sliding‑window/attention architecture with a 309B‑parameter MoE design, delivering top‑tier reasoning, coding and agent performance while introducing the efficient MOPD post‑training paradigm that dramatically reduces RL compute costs.

Hybrid SWA · MOPD · MiMo-V2-Flash
0 likes · 5 min read
AI Insight Log
Dec 18, 2025 · Artificial Intelligence

Xiaomi’s New MiMo‑V2‑Flash LLM Rivals DeepSeek‑V3.2 and Near‑GPT‑5 High

Xiaomi’s MiMo‑V2‑Flash, a 309B‑parameter MoE LLM with only 15B active parameters, uses Hybrid SWA, Multi‑Token Prediction and Multi‑Teacher On‑Policy Distillation to shrink the KV cache roughly sixfold, speed up inference by 2.6×, and achieve performance comparable to DeepSeek‑V3.2 and Kimi‑K2 and close to GPT‑5 High, including a 73.4% SWE‑Bench code‑agent score.

Hybrid SWA · MOPD · MTP
0 likes · 7 min read
AI Insight Log
Dec 17, 2025 · Artificial Intelligence

Google Unveils Gemini 3 Flash: Free, Lightning‑Fast, and Outperforms Its Predecessor

Google released Gemini 3 Flash without warning, offering Pro‑level intelligence at Flash‑speed, costing just $0.5 per million input tokens and $3 per million output tokens, delivering three‑times faster inference than Gemini 2.5 Pro and surpassing it on benchmarks such as GPQA Diamond (90.4%), SWE‑bench (78.0%) and MMMU‑Pro (81.2%), while being freely accessible to all users and developers via the Gemini app, AI Studio, or API.

Gemini 3 Flash · Google AI · Multimodal
0 likes · 5 min read
DataFunTalk
Dec 17, 2025 · Artificial Intelligence

How Large Language Models Unlock Field‑Level Data Lineage at Scale

This talk explains how a data platform tackled massive, heterogeneous enterprise data by using large language models and prompt engineering to automatically extract field‑level lineage from SQL scripts, achieve over 80% coverage, and raise accuracy above 95%, dramatically cutting impact‑analysis time.

AI for data engineering · Big Data · Data Lineage
0 likes · 6 min read
Design Hub
Dec 12, 2025 · Artificial Intelligence

GPT-5.2 Unveiled: A Cutting-Edge AI Super-Assistant Built for Real-World Work

OpenAI's newly released GPT-5.2 claims to outperform human experts on about 70% of real tasks, achieve a perfect score on the AIME 2025 competition, and deliver dramatic efficiency gains—up to 390× cost reduction—while showcasing impressive examples such as one‑shot ocean shader generation, a full 3D engine built in a single file, and visual‑perception scores rivaling top models.

AI benchmarks · Agent AI · Design Automation
0 likes · 8 min read
AI Insight Log
Dec 11, 2025 · Artificial Intelligence

GPT-5.2 Released: How It Outperforms Claude 4.5 and Gemini 3 Pro

OpenAI’s GPT‑5.2 launch introduces three specialized modes, achieves a record 55.6% score on SWE‑Bench Pro, demonstrates strong front‑end generation, adds a /compact API for long‑context efficiency, offers tiered pricing with cache discounts, and improves safety for younger users.

AI Safety · AI benchmarking · GPT-5.2
0 likes · 6 min read
Data Party THU
Dec 10, 2025 · Artificial Intelligence

How DeepSeek‑V3.2 Cuts Inference Cost and Boosts Agent Skills with Sparse Attention

DeepSeek's V3.2 release introduces a dual‑model lineup, a Sparse Attention architecture that halves long‑context inference cost, a post‑training reinforcement‑learning pipeline that exceeds 10% of pre‑training compute, and a revamped agent framework that dramatically improves tool‑use and reasoning performance across benchmarks.

Agentic AI · DeepSeek · Model Optimization
0 likes · 11 min read
Alibaba Cloud Developer
Dec 9, 2025 · Artificial Intelligence

Building Human‑in‑the‑Loop Agent Workflows with MCP on OpenLM

This article explains how to design and implement Human‑in‑the‑Loop (HITL) interactions for large‑model agents on Alibaba's OpenLM platform, covering the challenges of server‑side execution, MCP transport extensions, tool‑calling patterns, timeout handling, and UI rendering strategies across multiple client devices.

Human-in-the-Loop · MCP · Prompt engineering
0 likes · 39 min read
Amap Tech
Dec 3, 2025 · Artificial Intelligence

How Gaode’s G‑Action Uses Generative AI to Predict Users’ Next Move

Gaode’s G‑Action framework combines large‑language‑model pre‑training with fine‑tuned generative recommendation to predict a user’s immediate action and destination, transforming static map services into a dynamic, context‑aware experience and delivering measurable gains in click‑through and engagement metrics.

AI · Map Services · large language model
0 likes · 15 min read
DataFunTalk
Dec 2, 2025 · Artificial Intelligence

How Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking Are Redefining AI Search

This article reviews three cutting‑edge AI search and recommendation techniques—Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's GRAB generative ranking model—detailing their design challenges, multi‑modal retrieval strategies, performance gains, and real‑world deployment results.

AI agents · AI search · Generative Ranking
0 likes · 8 min read
Frontend AI Walk
Dec 2, 2025 · Artificial Intelligence

Understanding LLMs: A Frontend Developer’s Primer on Large Language Models

The article demystifies large language models for frontend developers by likening token prediction to autocomplete, explaining tokens, context windows, temperature, the two-stage training process, and the critical role of prompts, using concrete code examples and analogies to familiar frontend concepts.

Fine-tuning · Frontend Analogy · LLM
0 likes · 10 min read
Wuming AI
Nov 30, 2025 · Artificial Intelligence

What Exactly Is a Large Language Model? A Simple Guide to AI, Transformers, and How They Work

This article explains the relationship between AI, machine learning, deep learning, and large language models, detailing their evolution, training stages, transformer architecture, attention mechanisms, inference APIs, and practical usage examples, while demystifying common misconceptions about LLM capabilities.

AI fundamentals · Deep Learning · RLHF
0 likes · 10 min read