Tagged articles

Large Language Model

737 articles · Page 3 of 8

Mar 17, 2026 · Artificial Intelligence

How a 4B-Parameter Open-Source Model Outperforms 14B Multimodal Giants

InternVL-U, a 4‑billion‑parameter unified multimodal model released as open source, combines a 2B MLLM backbone with a 1.7B visual generation head and, through a reasoning‑centric data pipeline and Chain‑of‑Thought guidance, achieves superior understanding, generation, and editing performance that surpasses much larger 14‑20B models on multiple benchmarks.

AI researchInternVL-ULarge Language Model

0 likes · 22 min read

How a 4B-Parameter Open-Source Model Outperforms 14B Multimodal Giants

AI Insight Log

Mar 16, 2026 · Artificial Intelligence

Cursor’s Own Large‑Model Benchmark Shakes Up SWE‑bench Rankings

Although SWE‑bench scores for top coding models now differ by only a tenth of a point, Cursor’s newly released CursorBench reveals dramatic ranking changes, highlights three fundamental flaws in public benchmarks, and introduces token‑efficiency as a crucial evaluation dimension.

AI codingBenchmarkCursorBench

0 likes · 8 min read

Cursor’s Own Large‑Model Benchmark Shakes Up SWE‑bench Rankings

PaperAgent

Mar 16, 2026 · Artificial Intelligence

How GLM-5-Turbo Turns an AI Research Lab into a 24‑Hour Autonomous Writer

The article details how the newly released GLM-5-Turbo "lobster" model powers an AI research Lab that automatically generates a complete OpenClaw survey paper—from topic brainstorming and literature mining to outline drafting, manuscript writing, and AAAI‑style submission—within an hour, showcasing benchmark results, prompt templates, and practical skill installations.

AI research automationAutoClawGLM-5-Turbo

0 likes · 10 min read

How GLM-5-Turbo Turns an AI Research Lab into a 24‑Hour Autonomous Writer

IT Services Circle

Mar 15, 2026 · Artificial Intelligence

How PinchBench Ranks OpenClaw AI Agents Across Real‑World Tasks

The article explains OpenClaw’s rapid rise and the emerging on‑site installation business, introduces the open‑source PinchBench benchmark that evaluates large language models as OpenClaw agents on 23 real‑world tasks, presents recent ranking results, and provides step‑by‑step instructions for running the benchmark and submitting results.

AI AgentBenchmarkLarge Language Model

0 likes · 5 min read

How PinchBench Ranks OpenClaw AI Agents Across Real‑World Tasks

TonyBai

Mar 15, 2026 · Artificial Intelligence

Why Your OpenClaw Skills Keep Making AI Fail—and How to Fix Them

The article explains why many developers' OpenClaw skills cause AI agents to stall or hallucinate, identifies three common pitfalls in skill description, command style, and control flow, and offers a systematic, high‑level approach to mastering skill engineering and batch‑creating reliable AI skills.

AI AgentsLarge Language ModelOpenClaw

0 likes · 7 min read

Why Your OpenClaw Skills Keep Making AI Fail—and How to Fix Them

Bighead's Algorithm Notes

Mar 14, 2026 · Artificial Intelligence

Quantitative Finance Paper Digest: AI‑Driven Market Prediction Studies (Mar 7‑13 2026)

This digest summarizes four recent research papers that apply advanced AI techniques—node‑transformer graphs with BERT sentiment analysis, a quantum‑classical LSTM‑Born machine hybrid, large‑language‑model benchmarking for portfolio optimization, and a conditional diffusion model—to improve stock market prediction, volatility forecasting, and investment decision making, providing detailed experimental results and statistical validation.

BERTLarge Language ModelTransformer

0 likes · 10 min read

Quantitative Finance Paper Digest: AI‑Driven Market Prediction Studies (Mar 7‑13 2026)

AI Explorer

Mar 14, 2026 · Artificial Intelligence

Claude’s 1M‑Token Context Window Launches with No Premium Pricing

Anthropic’s Claude Opus 4.6 and Sonnet 4.6 now offer a full‑million‑token context window at the same per‑token price as short‑context usage, delivering top‑ranked MRCR v2 performance, six‑fold media capacity, and reduced AI‑Agent memory compression without any code changes across all major cloud platforms.

AI AgentAnthropicClaude

0 likes · 6 min read

Claude’s 1M‑Token Context Window Launches with No Premium Pricing

Data Party THU

Mar 12, 2026 · Artificial Intelligence

Can a 30B LLM Truly Conduct Autonomous Scientific Research? Inside UniScientist

UniScientist, a 30‑billion‑parameter open‑source model from UniPat AI, demonstrates a closed‑loop scientific research workflow—generating hypotheses, gathering evidence, performing reproducible derivations, and iteratively refining conclusions—while achieving benchmark scores comparable to much larger proprietary systems across multiple scientific evaluation suites.

Large Language ModelScientific researchbenchmarking

0 likes · 10 min read

Can a 30B LLM Truly Conduct Autonomous Scientific Research? Inside UniScientist

AI2ML AI to Machine Learning

Mar 10, 2026 · Artificial Intelligence

How Anthropic and Palantir Collaborate on Modern Warfare Information Mining

The article analyzes Palantir's ontology-driven knowledge graph dominance, its shift from graph to vector databases, the three‑layer partnership with Anthropic and AWS, the Digital Twin scaling law, and the technical challenges of data heterogeneity, scaling uncertainty, annotation scarcity, and real‑time computation in modern warfare information mining.

AWSAnthropicDigital Twin

0 likes · 9 min read

How Anthropic and Palantir Collaborate on Modern Warfare Information Mining

SuanNi

Mar 9, 2026 · Artificial Intelligence

How UniScientist Beats GPT‑5.4 on FrontierScience Benchmarks

UniScientist, a 30B‑parameter AI model co‑developed by UniPat AI and Peking University, leverages a meticulously curated scientific dataset and a powerful code interpreter to achieve 33.3% success on the FrontierScience‑Research benchmark, surpassing the newly released GPT‑5.4 and demonstrating superior multi‑disciplinary research capabilities.

AILarge Language ModelScientific research

0 likes · 12 min read

How UniScientist Beats GPT‑5.4 on FrontierScience Benchmarks

Design Hub

Mar 6, 2026 · Artificial Intelligence

How Powerful Is GPT‑5.4? A Deep Dive Into Its Design‑Focused Capabilities

OpenAI's GPT‑5.4 combines a 1 M‑token context window, native computer‑use, and benchmark‑leading performance—outperforming humans on 83 % of tasks and cutting token usage by 47 %—while showcasing demos that let designers generate games, websites, and 3D assets in a single prompt.

AI AgentsBenchmarkComputer Use

0 likes · 7 min read

How Powerful Is GPT‑5.4? A Deep Dive Into Its Design‑Focused Capabilities

DataFunTalk

Mar 6, 2026 · Artificial Intelligence

Why GPT‑5.4 Beats Its Predecessors: Code Power, World Knowledge, and New Agent Features

The article reviews GPT‑5.4’s release, comparing its code ability, world knowledge, and multimodal understanding to Claude Opus 4.6 and GPT‑5.3‑Codex, presents benchmark scores (GDPval 83%, SWE‑Bench 57.7%, OSWorld 75%, ToolAthon 54.6%), and highlights new features such as a 1‑million‑token context window, native computer usage, and tool‑search optimization, while discussing pricing and practical usage in OpenClaw.

AI AgentsBenchmarkGPT-5.4

0 likes · 12 min read

Why GPT‑5.4 Beats Its Predecessors: Code Power, World Knowledge, and New Agent Features

Xiaomi Tech

Mar 6, 2026 · Artificial Intelligence

Xiaomi Miclaw: Mobile AI Agent Enters Small‑Scale Closed Beta

Xiaomi Miclaw, an AI agent built on the MiMo large model, launches a limited closed beta to demonstrate system‑level tool access, multi‑turn context management, IoT ecosystem integration, and self‑evolution capabilities while emphasizing data security and user‑controlled permissions.

AI AgentData SecurityIoT

0 likes · 10 min read

Xiaomi Miclaw: Mobile AI Agent Enters Small‑Scale Closed Beta

AI Explorer

Mar 6, 2026 · Artificial Intelligence

GPT-5.4 Unveiled: 1M‑Token Context Window and Native Computer Control

OpenAI's GPT-5.4 launch introduces three model tiers, a 1 million‑token context window, native computer‑use abilities, higher factual accuracy and a new Tool Search feature, reshaping enterprise AI capabilities and intensifying competition with Anthropic and Google.

AI benchmarksComputer UseEnterprise AI

0 likes · 9 min read

GPT-5.4 Unveiled: 1M‑Token Context Window and Native Computer Control

AI Insight Log

Mar 6, 2026 · Artificial Intelligence

OpenAI Skips GPT‑5.3, Launches GPT‑5.4: Wins 5 of 8 Benchmarks, Sparks Heated Debate

OpenAI announced GPT‑5.4 at 2 a.m., skipping GPT‑5.3 and claiming integrated coding and reasoning abilities; the model tops five of eight benchmark categories, introduces native computer operation, tool‑search and interruptible thinking, while users debate its trustworthiness and pricing changes.

AI capabilitiesBenchmarkGPT-5.4

0 likes · 14 min read

OpenAI Skips GPT‑5.3, Launches GPT‑5.4: Wins 5 of 8 Benchmarks, Sparks Heated Debate

Software Engineering 3.0 Era

Mar 5, 2026 · Fundamentals

Why Information Theory Is the Mathematical Foundation of Software Engineering 3.0

The article argues that Software Engineering 3.0 is fundamentally a continuous entropy‑reduction process, using information theory to explain how large language models turn ambiguous human intent into precise executable code, why knowledge graphs outperform documents, and how multi‑agent systems improve quality assurance.

AI AgentsLarge Language ModelSoftware engineering

0 likes · 11 min read

Why Information Theory Is the Mathematical Foundation of Software Engineering 3.0

Weekly Large Model Application

Mar 4, 2026 · Artificial Intelligence

Qwen3‑ASR vs FunASR: In‑Depth Technical Comparison

This article provides a detailed side‑by‑side analysis of the open‑source ASR tools FunASR and Qwen3‑ASR, covering team origins, model architectures, language coverage, speed, deployment requirements, and ideal use‑cases so readers can decide which solution fits their projects best.

ASRFunASRLarge Language Model

0 likes · 10 min read

Qwen3‑ASR vs FunASR: In‑Depth Technical Comparison

AI Explorer

Mar 4, 2026 · Artificial Intelligence

DeerFlow: Open‑Source Super‑Agent Framework Automates Complex Tasks

DeerFlow 2.0, an open‑source super‑agent framework from ByteDance, lets developers automate multi‑step, minutes‑to‑hours‑long workflows by orchestrating sub‑agents with memory, sandboxed execution, and extensible skills, and has surged to over 2.4 k GitHub stars.

AI AgentsDeerFlowDocker

0 likes · 6 min read

DeerFlow: Open‑Source Super‑Agent Framework Automates Complex Tasks

Old Zhang's AI Learning

Mar 2, 2026 · Artificial Intelligence

Why the Qwen3.5 Series Makes Qwen3.5-27B the No‑Brainer Choice

The author reviews the Qwen3.5 model family, showing that the 27‑billion‑parameter dense Qwen3.5-27B offers the best balance of size, stability, low‑cost local deployment, and comprehensive capabilities, making it the default pick for most users.

AI benchmarkingLarge Language ModelQuantization

0 likes · 6 min read

Why the Qwen3.5 Series Makes Qwen3.5-27B the No‑Brainer Choice

Old Zhang's AI Learning

Feb 27, 2026 · Backend Development

How I Built a Telegram AI Coding Bot (FakeClawBot) Using OpenCode

This article walks through creating a Telegram bot that leverages OpenCode's Server API to provide full AI coding assistance, covering setup, multi‑model integration, core architecture, common pitfalls, and extensible features, all with under 900 lines of Python code.

AI coding assistantLarge Language ModelOpen Source

0 likes · 13 min read

How I Built a Telegram AI Coding Bot (FakeClawBot) Using OpenCode

PaperAgent

Feb 26, 2026 · Industry Insights

What the DeepSeek V4 Lite Leak Reveals About Its Specs and Multimodal Power

Recent reports indicate that DeepSeek's unreleased V4 Lite model, featuring a 1‑million‑token context window and native multimodal reasoning, has been leaked online, with Huawei gaining early access while Nvidia is excluded, and the model demonstrates impressive spatial reasoning in generated SVG examples.

DeepSeekIndustry insightLarge Language Model

0 likes · 3 min read

What the DeepSeek V4 Lite Leak Reveals About Its Specs and Multimodal Power

Old Zhang's AI Learning

Feb 26, 2026 · Artificial Intelligence

Ultimate Guide to Local Deployment of Qwen3.5 Models (27B‑397B)

This guide reviews the Qwen3.5 model lineup, explains mixed‑inference and MoE architecture, presents benchmark comparisons with GPT‑5.2, Claude 4.5 and Gemini‑3 Pro, evaluates 4‑bit and 3‑bit quantization loss, outlines hardware requirements, and provides step‑by‑step deployment options using llama.cpp or llama‑server.

InferenceLarge Language ModelMoE

0 likes · 14 min read

Ultimate Guide to Local Deployment of Qwen3.5 Models (27B‑397B)

Baobao Algorithm Notes

Feb 25, 2026 · Artificial Intelligence

Exploring Qwen 3.5: Small‑Scale MoE Models, Architecture, and Deployment Guides

This article reviews the three open‑source Qwen 3.5 models—including a 35B MoE, a 122B MoE, and a 27B dense version—detailing their parameter layouts, core attention designs, context length, inference performance, hardware requirements, and provides step‑by‑step code examples for loading them with Hugging Face Transformers and vLLM.

AILarge Language ModelMoE

0 likes · 10 min read

Exploring Qwen 3.5: Small‑Scale MoE Models, Architecture, and Deployment Guides

Yunqi AI+

Feb 25, 2026 · Artificial Intelligence

How Our In-House AI Agent Scaled to Handle 70% of Tech Support: A Six-Month Review

Over six months the team built an AI agent that now answers more than 70% of technical support queries by grounding responses in system data, a curated knowledge base, and a tiered permission model, while also exposing growing technical debt and maintenance challenges.

AI AgentCase StudyLarge Language Model

0 likes · 7 min read

How Our In-House AI Agent Scaled to Handle 70% of Tech Support: A Six-Month Review

Machine Learning Algorithms & Natural Language Processing

Feb 20, 2026 · Artificial Intelligence

Google Reclaims AI Throne with Gemini 3.1 Pro, Achieving 77.1% ARC‑AGI‑2 Score

Google’s Gemini 3.1 Pro, the latest upgrade to the Gemini 3 series, achieves a verified 77.1 % score on the ARC‑AGI‑2 reasoning benchmark—more than double the performance of Gemini 3 Pro—while leading in GPQA, LiveCodeBench Pro, SWE‑Bench Verified, and MMMLU tests, and is now being rolled out to developers, enterprises and consumers with detailed pricing and integration options.

AI benchmarkingARC-AGI-2Gemini 3.1 Pro

0 likes · 9 min read

Google Reclaims AI Throne with Gemini 3.1 Pro, Achieving 77.1% ARC‑AGI‑2 Score

Weekly Large Model Application

Feb 20, 2026 · Artificial Intelligence

Intelligent Speech vs. Voice Agent: Key Differences and How They Relate

This article explains the technical distinction between intelligent speech— a toolbox of ASR, TTS, NLU, and NLG technologies— and Voice Agent, an end‑to‑end conversational system built on those tools and large‑model reasoning, illustrating their layered relationship, functional gaps, and typical use cases.

ASRDialogue SystemsLarge Language Model

0 likes · 7 min read

Intelligent Speech vs. Voice Agent: Key Differences and How They Relate

Old Zhang's AI Learning

Feb 19, 2026 · Artificial Intelligence

Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

The article dissects GLM-5’s 744B‑parameter MoE design, 28.5 T token training corpus, novel Muon Split and MLA‑256 optimizations, DSA sparse attention, a fully asynchronous RL pipeline, extensive domestic chip adaptation, and benchmark results that place it on par with Claude Opus 4.5 and ahead of Gemini 3 Pro.

AI ArchitectureAgentic RLBenchmark

0 likes · 13 min read

Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

AI Agent Research Hub

Feb 19, 2026 · Artificial Intelligence

Why Claude Sonnet 4.6 Is My Most Powerful and Cost‑Effective AI Research Assistant

The article evaluates Anthropic's Claude Sonnet 4.6 as a comprehensive research assistant, detailing its performance on literature surveys, open‑source code analysis, algorithm implementation, cost savings, benchmark scores, and practical limitations across multiple scientific workflows.

AI research assistantBenchmarkClaude Sonnet 4.6

0 likes · 20 min read

Why Claude Sonnet 4.6 Is My Most Powerful and Cost‑Effective AI Research Assistant

Old Zhang's AI Learning

Feb 18, 2026 · Artificial Intelligence

New Ollama Features: Instant Model Switching, Subagents, and Built‑in Web Search

The latest Ollama 0.16.1 release lets users switch models and tools instantly, use Claude Code, Codex, and OpenClaw without extra configuration, and enables Subagents and built‑in web search directly via simple commands.

Claude CodeLarge Language ModelModel Switching

0 likes · 3 min read

New Ollama Features: Instant Model Switching, Subagents, and Built‑in Web Search

Alibaba Cloud Big Data AI Platform

Feb 17, 2026 · Artificial Intelligence

Deploy Alibaba’s Qwen3.5‑397B‑A17B Model in One Click with PAI‑Model Gallery

Alibaba's open‑source Qwen3.5‑397B‑A17B model, featuring 397 billion parameters and a hybrid Gated Delta Network/MoE architecture, delivers superior performance and reduced memory usage, and can be deployed instantly through the PAI‑Model Gallery with step‑by‑step guidance and enterprise‑grade security.

AI inferenceAlibaba CloudLarge Language Model

0 likes · 5 min read

Deploy Alibaba’s Qwen3.5‑397B‑A17B Model in One Click with PAI‑Model Gallery

Machine Learning Algorithms & Natural Language Processing

Feb 16, 2026 · Artificial Intelligence

Alibaba’s Qwen 3.5‑Plus: 397 B Open‑Source Model Beats Gemini‑3 and GPT‑5.2 at Low Cost

Alibaba released the Qwen 3.5‑Plus open‑source large model (397 B total parameters, 170 B active) that outperforms top closed‑source models such as Gemini‑3‑Pro and GPT‑5.2 on multiple benchmarks, offers native multimodal understanding, supports 201 languages, reduces deployment memory by 60 % and inference latency by up to 19×, and is priced at only 0.8 CNY per million tokens.

AIBenchmarkLarge Language Model

0 likes · 15 min read

Alibaba’s Qwen 3.5‑Plus: 397 B Open‑Source Model Beats Gemini‑3 and GPT‑5.2 at Low Cost

Old Zhang's AI Learning

Feb 16, 2026 · Artificial Intelligence

Qwen3.5 Deep Dive: Multimodal Architecture, Benchmarks, and Deployment Guide

This article provides a detailed analysis of Qwen3.5, covering its multimodal MoE design, massive inference speedups, extensive benchmark results against GPT‑5.2, Claude 4.5 Opus and Gemini‑3 Pro, RL scaling strategies, training infrastructure innovations, and practical usage via API and local deployment.

BenchmarkFP8 trainingLarge Language Model

0 likes · 13 min read

Qwen3.5 Deep Dive: Multimodal Architecture, Benchmarks, and Deployment Guide

AntTech

Feb 16, 2026 · Artificial Intelligence

Ling‑2.5‑1T: Open‑Source 1‑Trillion‑Parameter Instant LLM with 1M‑Token Context

Ling‑2.5‑1T is an open‑source instant large language model with 1 trillion total parameters, 63 B active weights, and a 1 M token context window, featuring mixed‑linear attention, a composite correctness‑plus‑process reward for token efficiency, fine‑grained alignment, and leading benchmark performance across reasoning, instruction‑following, and agentic tasks.

BenchmarkLarge Language Modelagentic interaction

0 likes · 13 min read

Ling‑2.5‑1T: Open‑Source 1‑Trillion‑Parameter Instant LLM with 1M‑Token Context

AI Engineering

Feb 16, 2026 · Artificial Intelligence

Qwen3.5-397B: 397B‑Parameter Multimodal LLM Boosts Inference Speed 8‑19×

Alibaba’s Qwen3.5-397B-A17B, a 397‑billion‑parameter open‑source multimodal LLM, combines mixed linear attention with a sparse MoE architecture to achieve 8.6‑19× higher decoding throughput than Qwen3‑Max, supports 201 languages, and can be deployed via vLLM, Docker, Transformers, or SGLang with various optimization presets.

Inference OptimizationLarge Language ModelQwen3.5

0 likes · 8 min read

Qwen3.5-397B: 397B‑Parameter Multimodal LLM Boosts Inference Speed 8‑19×

AI Insight Log

Feb 16, 2026 · Artificial Intelligence

DeepSeek V4 Benchmark Leak Fuels Talk of a New Coding King

A leaked SWE‑Bench score of 83.7% for DeepSeek V4 sparked claims it outperforms Claude Opus 4.5 and GPT‑5.2, but the data was later debunked as fabricated while official hints confirm a 1‑million‑token context model and a mid‑February 2026 release.

AI benchmarkingAI industryDeepSeek

0 likes · 7 min read

DeepSeek V4 Benchmark Leak Fuels Talk of a New Coding King

PaperAgent

Feb 16, 2026 · Artificial Intelligence

Why Qwen3.5-Plus Sets a New Standard for Open-Source Multimodal AI

Qwen3.5-Plus, Alibaba’s newly open-sourced multimodal LLM, combines a 397 B parameter model with only 17 B active parameters, leveraging native multimodal training, gated attention, sparse MoE, and FP8 precision to outperform GPT-5.2 and Gemini-3-Pro across vision, reasoning, and agent benchmarks.

Large Language ModelOpen SourceSparse Activation

0 likes · 6 min read

Why Qwen3.5-Plus Sets a New Standard for Open-Source Multimodal AI

Old Zhang's AI Learning

Feb 14, 2026 · Artificial Intelligence

Translate Full PDFs While Preserving Layout Using LLMs – Core Code Included

This article presents a two‑stage, cache‑enabled pipeline that extracts text blocks from a PDF with PyMuPDF, translates them via a large‑language‑model API, and re‑renders each page as an image with Chinese text overlaid to keep the original layout, along with full Python code and usage instructions.

LLMLarge Language ModelPDF translation

0 likes · 10 min read

Translate Full PDFs While Preserving Layout Using LLMs – Core Code Included

Machine Learning Algorithms & Natural Language Processing

Feb 12, 2026 · Artificial Intelligence

How Huawei’s MindScale Cuts Agent Token Usage 5.7× and Automates Prompt & Workflow Design

The article outlines the four major obstacles hindering industry‑specific LLM agents—manual workflow maintenance, poor knowledge reuse, training‑inference inefficiency, and complex reasoning evaluation—and explains how Huawei Noah’s MindScale package tackles each with self‑evolving workflows, automated prompt optimization, and a novel KV‑Embedding cache that slashes token consumption by 5.7× while boosting inference speed up to 70%.

Industry AgentKV-EmbeddingLarge Language Model

0 likes · 7 min read

How Huawei’s MindScale Cuts Agent Token Usage 5.7× and Automates Prompt & Workflow Design

Old Zhang's AI Learning

Feb 12, 2026 · Artificial Intelligence

Testing the World's Most Powerful Open‑Source LLM: GLM‑5, Local Deployment & Free Ollama Cloud

The article evaluates GLM‑5, the claimed strongest open‑source large language model, comparing its benchmark scores to Claude Opus, Gemini and GPT, detailing its DeepSeek‑inspired architecture, quantized FP8 deployment requirements, and step‑by‑step usage of Ollama’s free cloud model with Agent, data‑analysis and document‑generation features.

AI benchmarkingAgent modeGLM-5

0 likes · 7 min read

Testing the World's Most Powerful Open‑Source LLM: GLM‑5, Local Deployment & Free Ollama Cloud

DataFunTalk

Feb 12, 2026 · Artificial Intelligence

DeepSeek’s New Model V4? Exploring 1M‑Token Context and Updated Knowledge

DeepSeek quietly launched its latest model, reportedly supporting up to 1 million tokens, extending its knowledge cutoff to May 2025, adopting a more enthusiastic response style, and still operating as a pure‑text system, while early tests showcase impressive coding and reasoning capabilities.

AI evaluationDeepSeekLarge Language Model

0 likes · 5 min read

DeepSeek’s New Model V4? Exploring 1M‑Token Context and Updated Knowledge

AI Insight Log

Feb 12, 2026 · Artificial Intelligence

GLM-5 Unveiled: 744B Parameters, Claude Opus 4.5‑Level Performance, Epic Agent Upgrade

Z.ai released the open‑source GLM‑5 model with 744 billion parameters, 28.5 T tokens of training data, and new Sparse Attention and Slime RL infrastructure, achieving top open‑source rankings and near‑Claude Opus 4.5 performance on Vending Bench 2 and CC‑Bench‑V2 while adding multi‑scenario agent capabilities.

Agentic EngineeringBenchmarkGLM-5

0 likes · 6 min read

GLM-5 Unveiled: 744B Parameters, Claude Opus 4.5‑Level Performance, Epic Agent Upgrade

PMTalk Product Manager Community

Feb 12, 2026 · Industry Insights

How AI Can Transform Government Services: A From‑Zero‑to‑One Case Study

The article analyzes why traditional government portals fail users, outlines a six‑step user journey (search, guide, ask, appointment, processing, evaluation), and shows how large‑language‑model AI can be embedded at each decision point to turn fragmented services into a seamless, user‑centric digital experience.

AICase StudyLarge Language Model

0 likes · 11 min read

How AI Can Transform Government Services: A From‑Zero‑to‑One Case Study

AI Engineering

Feb 12, 2026 · Artificial Intelligence

GLM-5 Unveiled: 744B‑Parameter Model Takes on Claude in Complex Tasks

GLM-5, the new 744‑billion‑parameter open‑source LLM, expands on GLM‑4.5 with GlmMoeDsa architecture, achieves higher HLE benchmark scores than Claude Opus 4.5, demonstrates strong long‑context and agent capabilities, supports vLLM/SGLang, runs on various Chinese chips, and can directly generate Office documents.

AI benchmarksChinese chipsClaude

0 likes · 5 min read

GLM-5 Unveiled: 744B‑Parameter Model Takes on Claude in Complex Tasks

Open Source Tech Hub

Feb 12, 2026 · Artificial Intelligence

How GLM-5 Advances AI with Bigger Scale, Sparse Attention, and Agent Capabilities

GLM-5, a new large language model with 744 B parameters and 28.5 T tokens of training data, introduces DeepSeek sparse attention and an asynchronous RL system called slime, delivering strong benchmark gains on complex system engineering, long‑horizon agent tasks, and surpassing many open‑source competitors.

AIGLM-5Large Language Model

0 likes · 6 min read

How GLM-5 Advances AI with Bigger Scale, Sparse Attention, and Agent Capabilities

Machine Learning Algorithms & Natural Language Processing

Feb 10, 2026 · Artificial Intelligence

Inside GLM-5: 745B Parameters, DeepSeek‑style Sparse Attention, and a 60% Stock Surge

The GLM-5 architecture, uncovered from a GitHub PR, doubles the previous model to 745 B parameters, adopts DeepSeek‑V3 sparse attention and multi‑token prediction, features a 78‑layer MoE with 256 experts, supports a 202K‑token context window, and its rumored test model "Pony Alpha" sparked a 60% rise in Zhipu AI's stock amid a crowded AI release season.

AI Stock ImpactDeepSeekGLM-5

0 likes · 6 min read

Inside GLM-5: 745B Parameters, DeepSeek‑style Sparse Attention, and a 60% Stock Surge

HyperAI Super Neural

Feb 10, 2026 · Artificial Intelligence

WeDLM Diffusion Language Model Tutorial: 3× Faster Inference Than vLLM AR Models

The Tencent WeChat AI team introduces WeDLM, a diffusion language model that, through topological reordering, surpasses autoregressive models on the industrial‑grade vLLM engine with over threefold speedup on math reasoning and up to tenfold in low‑entropy scenarios, and provides a step‑by‑step online tutorial with GPU compute credits.

Diffusion Language ModelGPU computeLarge Language Model

0 likes · 5 min read

WeDLM Diffusion Language Model Tutorial: 3× Faster Inference Than vLLM AR Models

Old Zhang's AI Learning

Feb 9, 2026 · Artificial Intelligence

GLM-5 Emerges First, Built on DeepSeek Tech, Triggering a 40% Stock Surge

An anonymous OpenRouter model dubbed "Pony Alpha" was verified as the new 745B‑parameter GLM-5, which reuses DeepSeek‑V3 architecture, supports sparse attention and multi‑token prediction, and has already caused a near‑40% jump in Zhipu AI’s stock while hinting at upcoming integration into the Transformers library.

DeepSeekGLM-5Large Language Model

0 likes · 3 min read

GLM-5 Emerges First, Built on DeepSeek Tech, Triggering a 40% Stock Surge

AI Insight Log

Feb 5, 2026 · Artificial Intelligence

How 16 Claude Agents Burned $140K to Build a C Compiler in Opus 4.6

Anthropic’s midnight release of Claude Opus 4.6 showcased a $140,000 “stress test” where 16 Claude agents collaboratively wrote a Linux‑compatible C compiler, achieving a 100‑k‑line Rust codebase, while the model also added deep Excel/PPT integration and lifted finance benchmark scores by up to 23 percentage points.

AI code generationAgentic workflowClaude Opus

0 likes · 7 min read

How 16 Claude Agents Burned $140K to Build a C Compiler in Opus 4.6

Design Hub

Feb 5, 2026 · Artificial Intelligence

Inside Sienna’s AI Persona: Architecture, Memory, and Self‑Awareness in OpenClaw

The author explores how the OpenClaw‑based AI persona Sienna is built and evolves—detailing model choices, the memory‑plus‑skills architecture, recent version improvements that cut token usage, and philosophical reflections on turning a tool into a partner with preferences, opinions, and a growing self‑identity.

AI personaLarge Language ModelMemory System

0 likes · 7 min read

Inside Sienna’s AI Persona: Architecture, Memory, and Self‑Awareness in OpenClaw

Network Intelligence Research Center (NIRC)

Jan 31, 2026 · Artificial Intelligence

How Engram Lets Large Models Swap GPU Memory for Cheap RAM to ‘Look Up’ Knowledge

The article dissects DeepSeek’s new Engram architecture, which separates computation from memory by using a large, cheap‑RAM‑based lookup table to store factual knowledge, allowing the transformer’s compute layers to focus on reasoning, dramatically reducing GPU memory demand while improving code, math, and long‑context performance.

EngramGPU memoryLarge Language Model

0 likes · 7 min read

How Engram Lets Large Models Swap GPU Memory for Cheap RAM to ‘Look Up’ Knowledge

SpringMeng

Jan 30, 2026 · Artificial Intelligence

Hands‑On Guide: Build AI Agent Chatbots on Windows with RagFlow

Programmer Xiao Meng walks through a complete Windows setup for AI‑powered customer service agents using RagFlow, covering prerequisites, Docker and Ollama installation, model download, container deployment, configuration of knowledge bases, and testing, based on five real‑world projects.

AI ChatbotDockerLarge Language Model

0 likes · 7 min read

Hands‑On Guide: Build AI Agent Chatbots on Windows with RagFlow

Meituan Technology Team

Jan 29, 2026 · Artificial Intelligence

How LongCat‑Flash‑Thinking‑2601 Achieves Real‑World Generalization for Agents

LongCat‑Flash‑Thinking‑2601, a 560‑billion‑parameter MoE model, combines environment expansion, multi‑environment RL, systematic noise training, a heavy‑thinking reasoning mode, and Zigzag sparse attention to deliver strong benchmark performance and robust real‑world agent capabilities.

Environment ExpansionLarge Language ModelOpen Source

0 likes · 14 min read

How LongCat‑Flash‑Thinking‑2601 Achieves Real‑World Generalization for Agents

Alibaba Cloud Developer

Jan 28, 2026 · Artificial Intelligence

How We Built a High‑Performance AI Rental Advisor with One‑Model Tool‑Use and Reinforcement Learning

This article details the design, challenges, and performance gains of an AI‑driven rental recommendation system that replaces a multi‑agent architecture with a single LLM using dynamic tool‑use, introduces a two‑stage reinforcement‑learning pipeline, and achieves sub‑second latency and higher accuracy for complex rental scenarios.

AI recommendationLarge Language ModelTool Use

0 likes · 19 min read

How We Built a High‑Performance AI Rental Advisor with One‑Model Tool‑Use and Reinforcement Learning

Baobao Algorithm Notes

Jan 27, 2026 · Artificial Intelligence

Putting Kimi K2.5 and Kimi Code to the Test: Real‑World AI Agent Benchmarks

This article presents a hands‑on evaluation of Kimi K2.5 and its open‑source Kimi Code agent across a series of hard‑core prompts, covering Python API generation, cost‑optimized routing, multimodal ECharts visualisation, massive‑scale SQL optimisation, web‑search‑driven research, MoE explanation and video‑to‑code workflows.

AI AgentKimiLarge Language Model

0 likes · 9 min read

Putting Kimi K2.5 and Kimi Code to the Test: Real‑World AI Agent Benchmarks

Old Zhang's AI Learning

Jan 27, 2026 · Artificial Intelligence

Qwen3‑Max‑Thinking Boosts Performance with Test‑Time Scaling—Why It Still Isn’t Open‑Source

Alibaba’s new Qwen3‑Max‑Thinking model adds inference‑time scaling and adaptive tool use, delivering large gains on math, coding, and agent benchmarks while remaining closed‑source, and it offers drop‑in OpenAI‑compatible API access at the cost of higher latency and token usage.

AI benchmarkAdaptive Tool UseLarge Language Model

0 likes · 7 min read

Qwen3‑Max‑Thinking Boosts Performance with Test‑Time Scaling—Why It Still Isn’t Open‑Source

Fun with Large Models

Jan 22, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Calling Locally Deployed LLMs via OpenAI API Format in Python

This tutorial explains the OpenAI‑style request and response schema, demonstrates low‑level API calls with the requests library, compares them to the high‑level openai package, and walks through building a streaming multi‑turn chatbot that interacts with a locally hosted large language model.

ChatbotLarge Language ModelOpenAI API

0 likes · 17 min read

Step‑by‑Step Guide to Calling Locally Deployed LLMs via OpenAI API Format in Python

AI Engineering

Jan 21, 2026 · Artificial Intelligence

Running Large Language Models on Phones: Liquid AI’s LFM2.5‑1.2B‑Thinking Fits in 900 MB

Liquid AI’s LFM2.5‑1.2B‑Thinking model runs entirely on a smartphone with only 900 MB of memory, scores 88 on MATH‑500, 69 on Multi‑IF, and 57 on BFCLv3 benchmarks, outperforms larger rivals, and achieves real‑time speeds on Snapdragon 8 Elite and AMD Ryzen 9 3950X, signaling a shift toward edge AI.

BenchmarkLFM2.5Large Language Model

0 likes · 4 min read

Running Large Language Models on Phones: Liquid AI’s LFM2.5‑1.2B‑Thinking Fits in 900 MB

PaperAgent

Jan 17, 2026 · Artificial Intelligence

How Qwen3‑VL Embedding and Reranker Set New SOTA in Multimodal Retrieval

The article analyzes the Qwen3‑VL‑Embedding and Qwen3‑VL‑Reranker models, detailing their unified vector space, multi‑stage training pipeline, Matryoshka representation learning, quantization techniques, massive synthetic data generation, and benchmark results that push multimodal retrieval performance to a new state‑of‑the‑art.

EmbeddingKnowledge DistillationLarge Language Model

0 likes · 7 min read

How Qwen3‑VL Embedding and Reranker Set New SOTA in Multimodal Retrieval

PaperAgent

Jan 16, 2026 · Artificial Intelligence

How a 4B Model Beats 30B Giants: Inside AgentCPM-Explore’s SOTA Performance

AgentCPM-Explore, a 4‑billion‑parameter open‑source model, achieves state‑of‑the‑art results on long‑range exploration tasks, matching or surpassing larger 8B and even 30B models, thanks to a full‑stack infrastructure, novel training tricks, and extensive benchmark evaluations across eight agent‑centric datasets.

AgentAgentCPM-ExploreBenchmark

0 likes · 10 min read

How a 4B Model Beats 30B Giants: Inside AgentCPM-Explore’s SOTA Performance

PaperAgent

Jan 13, 2026 · Artificial Intelligence

How C2LLM Redefines Code Retrieval with Attention‑Based Pooling

Introducing C2LLM, a contrastive code LLM series that replaces mean and EOS pooling with a multi‑head attention pooling module, achieving top scores on the MTEB‑Code benchmark across 12 tasks and demonstrating cost‑effective, high‑precision code retrieval for both production and AI agent applications.

Code EmbeddingLarge Language ModelMTEB-Code

0 likes · 8 min read

How C2LLM Redefines Code Retrieval with Attention‑Based Pooling

DeWu Technology

Jan 12, 2026 · Mobile Development

How We Built an AI‑Powered Smart Inspection System for Mobile Apps

This article details the design and implementation of an AI‑driven smart inspection platform for a mobile app, covering background challenges, system architecture, core detection features—including layout, visual, consistency, and AI‑operation checks—platform configuration, result feedback, and the measurable improvements achieved.

AI inspectionLarge Language ModelSmart Inspection

0 likes · 19 min read

How We Built an AI‑Powered Smart Inspection System for Mobile Apps

PaperAgent

Jan 10, 2026 · Artificial Intelligence

DeepSeek V4 Unveiled: Why Its Coding Power Beats Claude and GPT

DeepSeek's newly announced V4 model, the successor to its December 2024 V3 release, demonstrates superior coding abilities over Claude and GPT series, details its data composition, infrastructure, training costs, failed experimental attempts, expanded benchmark comparisons, and includes a comprehensive safety report.

AI model analysisCoding performanceDeepSeek

0 likes · 4 min read

DeepSeek V4 Unveiled: Why Its Coding Power Beats Claude and GPT

Bighead's Algorithm Notes

Jan 8, 2026 · Artificial Intelligence

Alpha‑R1: Reinforcement‑Learning‑Driven Large‑Model Alpha Factor Selection

Alpha‑R1 integrates reinforcement learning with an 8‑billion‑parameter LLM to jointly process price and news data, creating context‑aware factor embeddings that outperform traditional quantitative and generic LLM baselines on CSI 300 and CSI 1000 portfolios, demonstrating robust alpha‑decay resistance and zero‑sample generalization.

Large Language Modelalpha factor selectionfinancial AI

0 likes · 16 min read

Alpha‑R1: Reinforcement‑Learning‑Driven Large‑Model Alpha Factor Selection

AI Info Trend

Jan 7, 2026 · Artificial Intelligence

MiroThinker 1.5: 30B Model Beats 1T‑Scale LLMs via Interactive Scaling

Released by the MiroMind team, MiroThinker 1.5 demonstrates that a 30‑billion‑parameter model can match or surpass the performance of 1‑trillion‑parameter LLMs by leveraging Interactive Scaling, achieving top rankings on multiple search benchmarks, dramatically lower inference cost, and open‑source availability for developers.

AI benchmarksLarge Language ModelMiroThinker

0 likes · 6 min read

MiroThinker 1.5: 30B Model Beats 1T‑Scale LLMs via Interactive Scaling

Amap Tech

Dec 29, 2025 · Artificial Intelligence

How G‑Plan Transforms Map Recommendations with AI Agents and Multi‑Demand Planning

This article details how Gaode's G‑Plan combines large‑model AI agents, generative ranking, and spatiotemporal counterfactual DPO to model and prioritize multiple user intents on the home page, presents the system architecture, experimental setup, online gains, and ablation results, and explains how it moves recommendation from passive to proactive planning.

AI recommendationLarge Language Modelintent planning

0 likes · 21 min read

How G‑Plan Transforms Map Recommendations with AI Agents and Multi‑Demand Planning

AI Insight Log

Dec 28, 2025 · Artificial Intelligence

GLM-4.7 Hits Global #6 and Leads Open‑Source LLM Rankings, Outperforming Claude 4.5 Sonnet

GLM-4.7 scores 68 points to rank sixth worldwide and first among open‑source models, surpassing Claude 4.5 Sonnet, with strong reasoning performance, fast generation speed, but higher cost and weaker code‑generation and math abilities compared to rivals.

BenchmarkGLM-4.7Large Language Model

0 likes · 7 min read

GLM-4.7 Hits Global #6 and Leads Open‑Source LLM Rankings, Outperforming Claude 4.5 Sonnet

DataFunTalk

Dec 25, 2025 · Artificial Intelligence

How DeepAgent Redefines General AI Reasoning with Scalable Toolsets

DeepAgent, a new end‑to‑end reasoning agent, integrates autonomous thinking, dynamic tool search, and execution to handle over 16,000 APIs, embodied tasks, and research assistance, achieving state‑of‑the‑art performance on benchmarks like TMDB, ToolBench, ALFWorld, WebShop, and GAIA.

Large Language ModelMemory managementreasoning

0 likes · 15 min read

How DeepAgent Redefines General AI Reasoning with Scalable Toolsets

Xiaomi Tech

Dec 24, 2025 · Artificial Intelligence

DeepLight & AgentMat: Xiaomi and SJTU Launch AI Platform for Light Alloy Design

Xiaomi and Shanghai Jiao Tong University introduced DeepLight, an AI‑driven large‑model for lightweight alloys, together with the AgentMat multi‑agent framework that accelerates the full design cycle tenfold, and the LightAlloy‑Bench benchmark where DeepLight outperforms DeepSeek‑V3 and GPT‑4o by about 20 %.

AIBenchmarkLarge Language Model

0 likes · 8 min read

DeepLight & AgentMat: Xiaomi and SJTU Launch AI Platform for Light Alloy Design

PaperAgent

Dec 23, 2025 · Artificial Intelligence

CATArena: A Competitive Benchmark That Turns Agent Scoring into Evolutionary Learning

CATArena introduces a tournament‑style evaluation framework where AI agents iteratively code, compete, and improve across classic board games, using three‑dimensional quantitative scores to measure strategy programming, global learning, and generalization, and reveals how different LLM‑based agents learn and adapt over multiple rounds.

AI benchmarkAgent evaluationCATArena

0 likes · 8 min read

CATArena: A Competitive Benchmark That Turns Agent Scoring into Evolutionary Learning

DataFunSummit

Dec 20, 2025 · Artificial Intelligence

How AutoHome Built the Cangjie Large Model: From Training Architecture to Real-World AI Applications

This article details AutoHome's end‑to‑end development of the Cangjie large model, covering the training infrastructure with distributed data, pipeline and tensor parallelism, core business use cases such as video script generation and multi‑tool Agent capabilities, inference optimizations through quantization and fast serving frameworks, and future directions for personalized automotive AI services.

Agent AILarge Language ModelQuantization

0 likes · 19 min read

How AutoHome Built the Cangjie Large Model: From Training Architecture to Real-World AI Applications

PaperAgent

Dec 19, 2025 · Artificial Intelligence

Inside Xiaomi’s MiMo‑V2‑Flash: How a Hybrid SWA Design Powers Fast, Efficient AI Reasoning

Xiaomi’s newly open‑sourced MiMo‑V2‑Flash model combines a hybrid sliding‑window/attention architecture with a 309B‑parameter MoE design, delivering top‑tier reasoning, coding and agent performance while introducing the efficient MOPD post‑training paradigm that dramatically reduces RL compute costs.

Hybrid SWALarge Language ModelMOPD

0 likes · 5 min read

Inside Xiaomi’s MiMo‑V2‑Flash: How a Hybrid SWA Design Powers Fast, Efficient AI Reasoning

AI Insight Log

Dec 18, 2025 · Artificial Intelligence

Xiaomi’s New MiMo‑V2‑Flash LLM Rivals DeepSeek‑V3.2 and Near‑GPT‑5 High

Xiaomi’s MiMo‑V2‑Flash, a 309B‑parameter MoE LLM with only 15B active weights, uses Hybrid SWA, Multi‑Token Prediction and Multi‑Teacher On‑Policy Distillation to cut KV‑cache by six times, boost inference speed 2.6×, and achieve performance comparable to DeepSeek‑V3.2, Kimi‑K2 and near‑GPT‑5 High, including a 73.4% SWE‑Bench code‑agent score.

Hybrid SWALarge Language ModelMOPD

0 likes · 7 min read

Xiaomi’s New MiMo‑V2‑Flash LLM Rivals DeepSeek‑V3.2 and Near‑GPT‑5 High

AI Insight Log

Dec 17, 2025 · Artificial Intelligence

Google Unveils Gemini 3 Flash: Free, Lightning‑Fast, and Outperforms Its Predecessor

Google released Gemini 3 Flash without warning, offering Pro‑level intelligence at Flash‑speed, costing just $0.5 per million input tokens and $3 per million output tokens, delivering three‑times faster inference than Gemini 2.5 Pro and surpassing it on benchmarks such as GPQA Diamond (90.4%), SWE‑bench (78.0%) and MMMU‑Pro (81.2%), while being freely accessible to all users and developers via the Gemini app, AI Studio, or API.

BenchmarkGemini 3 FlashGoogle AI

0 likes · 5 min read

Google Unveils Gemini 3 Flash: Free, Lightning‑Fast, and Outperforms Its Predecessor

DataFunTalk

Dec 17, 2025 · Artificial Intelligence

How Large Language Models Unlock Field‑Level Data Lineage at Scale

This talk explains how a data platform tackled massive, heterogeneous enterprise data by using large language models and prompt engineering to automatically extract field‑level lineage from SQL scripts, achieve over 80% coverage, and raise accuracy above 95%, dramatically cutting impact‑analysis time.

AI for data engineeringBig DataLarge Language Model

0 likes · 6 min read

How Large Language Models Unlock Field‑Level Data Lineage at Scale

Design Hub

Dec 12, 2025 · Artificial Intelligence

GPT-5.2 Unveiled: A Cutting-Edge AI Super-Assistant Built for Real-World Work

OpenAI's newly released GPT-5.2 claims to outperform human experts on about 70% of real tasks, achieve a perfect score on the AIME 2025 competition, and deliver dramatic efficiency gains—up to 390× cost reduction—while showcasing impressive examples such as one‑shot ocean shader generation, a full 3D engine built in a single file, and visual‑perception scores rivaling top models.

AI benchmarksAgent AIGPT-5.2

0 likes · 8 min read

GPT-5.2 Unveiled: A Cutting-Edge AI Super-Assistant Built for Real-World Work

AI Insight Log

Dec 11, 2025 · Artificial Intelligence

GPT-5.2 Released: How It Outperforms Claude 4.5 and Gemini 3 Pro

OpenAI’s GPT‑5.2 launch introduces three specialized modes, achieves a record 55.6% score on SWE‑Bench Pro, demonstrates strong front‑end generation, adds a /compact API for long‑context efficiency, offers tiered pricing with cache discounts, and improves safety for younger users.

AI benchmarkingAI safetyGPT-5.2

0 likes · 6 min read

GPT-5.2 Released: How It Outperforms Claude 4.5 and Gemini 3 Pro

Data Party THU

Dec 10, 2025 · Artificial Intelligence

How DeepSeek‑V3.2 Cuts Inference Cost and Boosts Agent Skills with Sparse Attention

DeepSeek's V3.2 release introduces a dual‑model lineup, a Sparse Attention architecture that halves long‑context inference cost, a post‑training reinforcement‑learning pipeline that exceeds 10% of pre‑training compute, and a revamped agent framework that dramatically improves tool‑use and reasoning performance across benchmarks.

DeepSeekLarge Language ModelModel Optimization

0 likes · 11 min read

How DeepSeek‑V3.2 Cuts Inference Cost and Boosts Agent Skills with Sparse Attention

Alibaba Cloud Developer

Dec 9, 2025 · Artificial Intelligence

Building Human‑in‑the‑Loop Agent Workflows with MCP on OpenLM

This article explains how to design and implement Human‑in‑the‑Loop (HITL) interactions for large‑model agents on Alibaba's OpenLM platform, covering the challenges of server‑side execution, MCP transport extensions, tool‑calling patterns, timeout handling, and UI rendering strategies across multiple client devices.

AgentLarge Language ModelMCP

0 likes · 39 min read

Building Human‑in‑the‑Loop Agent Workflows with MCP on OpenLM

Amap Tech

Dec 3, 2025 · Artificial Intelligence

How Gaode’s G‑Action Uses Generative AI to Predict Users’ Next Move

Gaode’s G‑Action framework combines large‑language‑model pre‑training with fine‑tuned generative recommendation to predict a user’s immediate action and destination, transforming static map services into a dynamic, context‑aware experience and delivering measurable gains in click‑through and engagement metrics.

AILarge Language ModelMap Services

0 likes · 15 min read

How Gaode’s G‑Action Uses Generative AI to Predict Users’ Next Move

DataFunTalk

Dec 2, 2025 · Artificial Intelligence

How Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking Are Redefining AI Search

This article reviews three cutting‑edge AI search and recommendation techniques—Alibaba Cloud's Agentic RAG architecture, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's GRAB generative ranking model—detailing their design challenges, multi‑modal retrieval strategies, performance gains, and real‑world deployment results.

AI AgentsAI SearchGenerative Ranking

0 likes · 8 min read

How Agentic RAG, LLM‑Powered Recommendation, and Generative Ranking Are Redefining AI Search

Frontend AI Walk

Dec 2, 2025 · Artificial Intelligence

Understanding LLMs: A Frontend Developer’s Primer on Large Language Models

The article demystifies large language models for frontend developers by likening token prediction to autocomplete, explaining tokens, context windows, temperature, the two-stage training process, and the critical role of prompts, using concrete code examples and analogies to familiar frontend concepts.

Frontend AnalogyLLMLarge Language Model

0 likes · 10 min read

Understanding LLMs: A Frontend Developer’s Primer on Large Language Models

Software Engineering 3.0 Era

Dec 1, 2025 · Artificial Intelligence

Why Enterprises Must Rethink Ontology to Bridge the Last Mile of LLM Deployment

The article explains how a well‑designed enterprise ontology—an explicit business‑level semantic and constraint model—turns large language models from risky, hallucination‑prone tools into safe, auditable AI agents that can act across systems, enforce policies, and become a lasting digital asset.

AI GovernanceEnterprise AILLM integration

0 likes · 15 min read

Why Enterprises Must Rethink Ontology to Bridge the Last Mile of LLM Deployment

Wuming AI

Nov 30, 2025 · Artificial Intelligence

What Exactly Is a Large Language Model? A Simple Guide to AI, Transformers, and How They Work

This article explains the relationship between AI, machine learning, deep learning, and large language models, detailing their evolution, training stages, transformer architecture, attention mechanisms, inference APIs, and practical usage examples, while demystifying common misconceptions about LLM capabilities.

AI FundamentalsLarge Language ModelMachine Learning

0 likes · 10 min read

What Exactly Is a Large Language Model? A Simple Guide to AI, Transformers, and How They Work

Kuaishou Tech

Nov 28, 2025 · Artificial Intelligence

Keye-VL-671B-A37B Leads Vision, Video, and Math Benchmarks

Kwai has open‑sourced its new flagship multimodal model Keye‑VL‑671B‑A37B, which upgrades visual perception, cross‑modal alignment and complex reasoning, achieving top scores on image, video, and mathematical reasoning benchmarks while detailing its architecture, three‑stage pre‑training, post‑training strategies, and future multimodal agent plans.

Large Language ModelOpen Sourcedeep learning

0 likes · 10 min read

Keye-VL-671B-A37B Leads Vision, Video, and Math Benchmarks

AsiaInfo Technology: New Tech Exploration

Nov 28, 2025 · Artificial Intelligence

Boosting 5G Complaint Intent Detection with Large-Model-Enhanced Few-Shot Learning

This paper presents a collaborative framework where a large language model generates high‑quality synthetic samples to augment a lightweight model, dramatically improving few‑shot user‑complaint intent recognition in 5G networks, achieving a 21% boost for rare categories and a 9% overall accuracy gain.

Knowledge DistillationLarge Language Modelcomplaint intent detection

0 likes · 27 min read

Boosting 5G Complaint Intent Detection with Large-Model-Enhanced Few-Shot Learning

Alibaba Cloud Developer

Nov 27, 2025 · Artificial Intelligence

How AI Powers Ethnic Product Categorization for Global E‑Commerce

This article presents an end‑to‑end AI solution that builds a cultural knowledge base and leverages large language models to automatically identify and match ethnic‑specific product categories on a cross‑border e‑commerce platform, reducing mis‑matches from 8.4% to 1.8% and cutting iteration time from days to under one day.

AIKnowledge BaseLarge Language Model

0 likes · 19 min read

How AI Powers Ethnic Product Categorization for Global E‑Commerce

DataFunSummit

Nov 20, 2025 · Artificial Intelligence

How 1688 Reinvented E‑commerce Search with AI‑Powered Generative Retrieval

This article details Alibaba’s 1688 platform’s shift from traditional e‑commerce search to AI‑driven generative retrieval, covering the AI Deep Search 1.0 and 2.0 cascaded frameworks, multimodal capabilities, an end‑to‑end “model‑as‑search‑engine” approach, experimental results, challenges, and future directions.

AIE-commerce SearchGenerative Retrieval

0 likes · 18 min read

How 1688 Reinvented E‑commerce Search with AI‑Powered Generative Retrieval

HyperAI Super Neural

Nov 20, 2025 · Artificial Intelligence

From 9,874 Papers to 15,000 Structures: MOF‑ChemUnity Rebuilds MOF Knowledge for Explainable AI

MOF‑ChemUnity constructs a scalable, extensible knowledge graph that links millions of MOF names and synonyms to over 15,000 crystal structures using LLM‑driven entity matching, enabling accurate, explainable AI‑assisted material discovery, water‑stability prediction, expert recommendation validation, and graph‑enhanced retrieval across diverse applications.

Graph RAGLarge Language ModelMOF

0 likes · 17 min read

From 9,874 Papers to 15,000 Structures: MOF‑ChemUnity Rebuilds MOF Knowledge for Explainable AI

Alibaba Cloud Developer

Nov 19, 2025 · Artificial Intelligence

Building an AI-Powered Proofreading Agent for Media: Architecture, Prompt Engineering, and Evaluation

This article details a practical case study of designing, implementing, and evaluating an AI-driven proofreading agent for a media client, covering background challenges, a three‑layer architecture, prompt engineering techniques, RAG knowledge‑base construction, model selection, fine‑tuning, automated metrics, and lessons learned.

AILarge Language ModelProofreading

0 likes · 26 min read

Building an AI-Powered Proofreading Agent for Media: Architecture, Prompt Engineering, and Evaluation

Wuming AI

Nov 10, 2025 · Artificial Intelligence

What Exactly Is an AI Agent? A Clear, Practical Guide

This article explains the concept of AI agents, contrasting them with chatbots, detailing their ability and structural layers, summarizing academic surveys and whitepapers, and illustrating how agents plan, perceive, and act to autonomously accomplish user‑defined goals.

AI AgentAutonomous PlanningLarge Language Model

0 likes · 9 min read

What Exactly Is an AI Agent? A Clear, Practical Guide

JD Tech Talk

Nov 10, 2025 · Artificial Intelligence

Designing an AI-Powered Experiment Analysis Agent: Architecture, Workflow, and Future Enhancements

This article outlines the motivation, design, architecture, engineering implementation, large‑model selection, and future improvement plans for an AI‑driven experiment analysis agent that integrates data aggregation, modular workflow orchestration, and interactive frontend features to streamline AB‑test insights.

AI AgentLarge Language Modelexperiment analysis

0 likes · 14 min read

Designing an AI-Powered Experiment Analysis Agent: Architecture, Workflow, and Future Enhancements

Efficient Ops

Nov 9, 2025 · Operations

How Tencent’s PCG Achieves Full‑Link Observability and AI‑Powered SRE

The talk details Tencent PCG’s end‑to‑end observability platform, its data‑standardization pipeline, client‑backend session linking, AI‑enhanced SRE Agent with large language models, and the roadmap toward a SaaS offering, illustrating how modern operations integrate AI for rapid fault localization.

AILarge Language ModelMonitoring

0 likes · 17 min read

How Tencent’s PCG Achieves Full‑Link Observability and AI‑Powered SRE

Bighead's Algorithm Notes

Nov 7, 2025 · Artificial Intelligence

Weekly AI Finance Paper Digest (Nov 1‑7 2025)

This digest summarizes three recent AI‑driven finance papers—DeltaLag’s dynamic lead‑lag detection, MS‑HGFN’s multi‑scale graph network for stock movement, and LiveTradeBench’s real‑time LLM trading benchmark—highlighting their methods, datasets, and performance gains.

Graph Neural NetworkLarge Language Modelfinancial AI

0 likes · 8 min read

Weekly AI Finance Paper Digest (Nov 1‑7 2025)

Tencent Advertising Technology

Nov 6, 2025 · Artificial Intelligence

Boosting Web UI Test Efficiency with AIGC: From Manual Scripts to Intelligent Automation

This report examines the challenges of Web UI testing in Tencent's advertising platform, analyzes current inefficiencies, and presents an AIGC-driven solution that leverages large language models, semantic scripts, and automated pipelines to dramatically improve test case generation, execution accuracy, and CI/CD integration.

AIGCCI/CDLarge Language Model

0 likes · 27 min read

Boosting Web UI Test Efficiency with AIGC: From Manual Scripts to Intelligent Automation

Amap Tech

Nov 4, 2025 · Artificial Intelligence

Spacetime‑GR: AI‑Powered Spatiotemporal Model Transforming POI Recommendations

This article introduces Spacetime‑GR, a large‑scale generative recommendation model that integrates hierarchical geographic POI indexing and spatiotemporal token encoding to enhance POI prediction for Amap, detailing its pre‑training pipeline, data cleaning, curriculum learning strategy, experimental results, scaling law observations, and the resulting improvements in hit rate and discovery rate.

AmapGenerative AILarge Language Model

0 likes · 14 min read

Spacetime‑GR: AI‑Powered Spatiotemporal Model Transforming POI Recommendations

21CTO

Nov 4, 2025 · Artificial Intelligence

LongCat-Flash-Omni: How an Open-Source 560B Model Achieves Real-Time Multimodal Mastery

LongCat-Flash-Omni, an open‑source 560 billion‑parameter multimodal model, combines efficient Shortcut‑Connected MoE architecture with advanced perception and speech modules to deliver low‑latency real‑time audio‑video interaction and state‑of‑the‑art performance across text, image, video, and audio tasks.

Efficient InferenceLarge Language ModelReal-Time Interaction

0 likes · 10 min read

LongCat-Flash-Omni: How an Open-Source 560B Model Achieves Real-Time Multimodal Mastery

Meituan Technology Team

Nov 3, 2025 · Artificial Intelligence

LongCat-Flash-Omni: 560B Open‑Source Multimodal Model with Real‑Time Interaction

LongCat-Flash-Omni, the latest open‑source model from Meituan, combines a 560 billion‑parameter architecture, efficient multimodal perception and speech reconstruction modules, and a progressive training strategy to deliver real‑time audio‑video interaction and state‑of‑the‑art performance across text, image, audio, and video tasks.

AIBenchmarkLarge Language Model

0 likes · 9 min read

LongCat-Flash-Omni: 560B Open‑Source Multimodal Model with Real‑Time Interaction

Huawei Cloud Developer Alliance

Nov 3, 2025 · Artificial Intelligence

How AI Agents Are Revolutionizing Technology: The New Engine of Innovation

This article explores the rise of AI agents—from their definition as intelligent digital assistants powered by large language models to their evolution through planning, memory, and tool use—highlighting real‑world applications, core technical mechanisms, code implementations, and future trends such as autonomy, multimodal fusion, standardization, and safety considerations.

AI AgentLarge Language ModelStandardization

0 likes · 24 min read

How AI Agents Are Revolutionizing Technology: The New Engine of Innovation

Data Party THU

Nov 2, 2025 · Artificial Intelligence

From RNN to LLM: How Transformers Power Modern Language Models

This article explains the evolution from RNNs through Encoder‑Decoder models to Transformers, detailing self‑attention, multi‑head attention, and masked attention, and then describes what Large Language Models are, their key components, capabilities, limitations, and common applications.

AILLMLarge Language Model

0 likes · 9 min read

From RNN to LLM: How Transformers Power Modern Language Models

DataFunSummit

Oct 31, 2025 · Artificial Intelligence

How OPPO’s AndesVL Is Revolutionizing On‑Device Multimodal AI

OPPO AI Center introduces AndesVL, an open‑source, fully‑adapted multimodal large model ranging from 0.6B to 4B parameters, designed for high‑performance, privacy‑preserving, low‑latency AI on mobile devices, with advanced architecture, training pipelines, on‑device optimizations, and state‑of‑the‑art benchmark results.

Large Language ModelModel Compressionmobile AI

0 likes · 21 min read

How OPPO’s AndesVL Is Revolutionizing On‑Device Multimodal AI