Tagged articles
27 articles
Old Zhang's AI Learning
Apr 29, 2026 · Artificial Intelligence

Top 10 Open‑Source LLM Benchmarks: Scores, Rankings, and What They Test

This article walks through ten mainstream open‑source large‑model benchmarks—SWE‑bench Verified and Pro, MMLU‑Pro, GPQA Diamond, HLE, AIME, HMMT, olmOCR‑bench, Terminal‑Bench 2.0, and EvasionBench—explaining their data, evaluation metrics, current leading models, and the capability dimensions they reveal.

AI Evaluation · LLM benchmarks · MMLU-Pro
0 likes · 20 min read
Machine Heart
Apr 25, 2026 · Artificial Intelligence

Open‑Source Models Dominate 21 Scientific Discovery Tasks with SimpleTES

The SimpleTES framework decomposes trial‑and‑error into three scalable dimensions—Concurrency, Length, and Candidates—enabling test‑time scaling that lets open‑source models outperform closed‑source rivals across 21 diverse scientific benchmarks, from LASSO regression to quantum circuit compilation.

AI for Science · Open-source models · Scientific Discovery
0 likes · 13 min read
Old Zhang's AI Learning
Apr 15, 2026 · Industry Insights

Is the Era of Commercial‑Ready Chinese Open‑Source LLMs Ending? MiniMax M2.7 License Update

The MiniMax M2.7 model's open‑source license was changed to forbid commercial use, igniting a heated debate over what counts as commercial activity. A community clarification confirmed that self‑hosted coding remains free, and a revised license now explicitly permits personal, academic, and non‑profit use. The episode also highlights how market pressure from cloud providers is reshaping the open‑source LLM ecosystem.

AI industry · Kimi · LLM licensing
0 likes · 10 min read
Old Meng AI Explorer
Apr 8, 2026 · Industry Insights

April 2026 AI Explosion: GPT‑6 Launch, Gemma 4 Open‑Source, and the Rise of Intelligent Agents

April 2026 saw an unprecedented wave of AI announcements—including OpenAI's $122 billion financing and upcoming GPT‑6 release, Google's open‑source Gemma 4 model, Microsoft's vertical AI suite, major Chinese model breakthroughs, a massive Claude Code leak, and emerging trends toward multimodal agents and embodied robotics—shaping the industry's future direction for developers and users alike.

AI · GPT-6 · Google
0 likes · 18 min read
Architect's Journey
Mar 26, 2026 · Artificial Intelligence

How Cursor’s $30B AI Coding Tool Secretly Leverages China’s Kimi K2.5 Model

An API interception revealed that Cursor’s highly valued AI programming platform relies on Moonshot AI’s Kimi K2.5 model, a trillion‑parameter MoE system, and uses a novel self‑summarization technique to compress context, achieving superior benchmark scores and exposing why Western open‑source models fall short.

AI programming · Agent AI · Cursor
0 likes · 10 min read
AI Info Trend
Mar 18, 2026 · Industry Insights

How AI Is Reshaping Financial Services by 2026: Trends, ROI, and Future Outlook

A recent Nvidia‑backed report, drawing on a survey of over 800 financial‑services professionals, finds that AI adoption has surged to 65%, generative AI use is up 52%, and open‑source models and agentic AI are becoming core drivers, delivering measurable revenue growth and cost reductions while shaping investment priorities for 2026.

AI · Agentic AI · Financial Services
0 likes · 8 min read
AI Explorer
Mar 11, 2026 · Industry Insights

Why AI Is Humanity’s Largest Infrastructure Project, Not Just an App

Jensen Huang argues that AI is a five‑layer infrastructure—from energy and chips to data centers, models and applications—forming the biggest construction effort in human history, reshaping jobs, demanding new technical talent, and accelerating growth through open‑source models.

AI Infrastructure · AI ecosystem · Data Centers
0 likes · 10 min read
Weekly Large Model Application
Feb 22, 2026 · Artificial Intelligence

2026 Guide: Pure‑CPU Open‑Source Chinese TTS Models Optimized for Performance

This article reviews the most capable open‑source Chinese text‑to‑speech models that run entirely on CPU in 2026, compares their quantization and speed features, recommends acceleration engines, outlines five hard‑won optimization rules, and provides a concise selection guide for various deployment scenarios.

CPU inference · Chinese TTS · ONNX Runtime
0 likes · 6 min read
HyperAI Super Neural
Jan 23, 2026 · Artificial Intelligence

Embodied AI Resources: Datasets, Modeling, Papers (Nvidia, ByteDance, Xiaomi)

This article compiles a comprehensive set of embodied AI resources, including large‑scale robot learning datasets such as BC‑Z (32 GB) and DexGraspVLA (7 GB), interactive world‑modeling frameworks like HY‑World 1.5, open‑source LLM deployments, and recent research papers from Nvidia, ByteDance, Xiaomi and leading universities, each with download links and brief summaries.

AI research papers · Embodied AI · Open-source models
0 likes · 14 min read
Tencent Cloud Developer
Jan 20, 2026 · Artificial Intelligence

From Transformers to Agents: A Complete Timeline of Large Language Model Evolution

This article traces the evolution of large language models from the 2017 Transformer breakthrough through successive milestones such as BERT, GPT‑3, RLHF alignment, multimodal extensions, open‑source alternatives, and the rise of retrieval‑augmented generation, AI agents, and emerging protocols that shape modern AI applications.

Open-source models · Prompt engineering · RAG
0 likes · 44 min read
21CTO
Dec 11, 2025 · Artificial Intelligence

Why DeepSeek’s Founder Made Nature’s 2025 Top‑10 Scientists List

Nature’s 2025 “Nature’s 10” list highlighted DeepSeek founder Liang Wenfeng for his breakthrough in AI transparency, noting his open‑weight model’s impact on researchers, while also detailing the model’s low‑cost performance and the other distinguished scientists honored that year.

DeepSeek · Liang Wenfeng · Nature's 10
0 likes · 3 min read
PaperAgent
Dec 7, 2025 · Industry Insights

What 1,000 Trillion Tokens Reveal About the Rise of Open‑Source LLMs

A massive 1,000‑trillion‑token study by a16z and OpenRouter shows open‑source models now hold a third of the LLM market, programming tasks have surged to over 50% of usage, role‑play scenarios dominate open‑source traffic, and price elasticity is surprisingly low, reshaping the competitive landscape.

AI Market · Industry analysis · LLM
0 likes · 6 min read
AI2ML AI to Machine Learning
Dec 3, 2025 · Artificial Intelligence

2026 Forecast: How Large‑Model AI Will Evolve After 2025 Breakthroughs

The article reviews the major 2025 breakthroughs in multimodal, open‑source, and deployment technologies for large models and outlines four 2026 trends—including ToC vs. ToB service split, dual‑hand data generation, MoE routing advances, and AI4Science breakthroughs—that will shape the next wave of AI development.

AI deployment · AI4Science · Mixture of Experts
0 likes · 6 min read
Baobao Algorithm Notes
Nov 11, 2025 · Artificial Intelligence

Why Redesign the Training Stack? Inside Olmo‑Thinking’s Open‑Source RL Journey

This article provides a detailed technical analysis of the Olmo‑Thinking project, covering why a new open‑source LLM was built, the challenges of reinforcement learning at scale, data‑mix optimization, architectural bottlenecks such as missing GQA and QK‑Norm, and the post‑training techniques used to improve reasoning and long‑context capabilities.

Open-source models · RLVR · data selection
0 likes · 20 min read
Instant Consumer Technology Team
Nov 5, 2025 · Artificial Intelligence

Why AI Agents Fail: 70% Failure Rate & How Interleaved Thinking Improves Reliability

Recent CMU and Salesforce studies reveal that top‑tier AI agents like Gemini 2.5 Pro, Claude 3.7 Sonnet and GPT‑4o fail in 69‑70% of multi‑step tasks, but MiniMax‑M2’s Interleaved Thinking reduces failure dramatically, highlighting that execution mechanisms, not model size, are key to reliable AI agents.

Benchmark · Open-source models · OpenAI API
0 likes · 17 min read
Continuous Delivery 2.0
Sep 11, 2025 · Artificial Intelligence

Building Scalable Enterprise RAG: Lessons, Pitfalls, and Proven Solutions

This article shares practical lessons from building a large‑scale enterprise RAG system, covering imperfect data, document quality scoring, hierarchical chunking, metadata design, semantic‑search failures, open‑source model choices, and table handling to achieve reliable AI‑driven search.

Enterprise AI · Open-source models · RAG
0 likes · 13 min read
Architects' Tech Alliance
Jun 11, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The 2017‑2025 Evolution of Large Language Models

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, ChatGPT, multimodal systems like GPT‑4V and GPT‑4o, and the recent cost‑efficient DeepSeek‑R1, highlighting key architectural innovations, scaling trends, alignment techniques, and their transformative impact on AI research and industry.

AI Alignment · BERT · Cost‑Efficient Inference
0 likes · 26 min read
21CTO
May 15, 2025 · Artificial Intelligence

Can AI Soon Write Its Own Software? Insights from OpenAI’s Chief Scientist

In a May 14 interview, OpenAI chief scientist Jakub Pachocki discusses how reasoning models will evolve over the next five years, the growing role of reinforcement learning, whether AI truly “thinks,” upcoming open‑source model releases, and his shifting timeline for achieving artificial general intelligence.

AGI · Open-source models · OpenAI
0 likes · 6 min read
Architects' Tech Alliance
Mar 31, 2025 · Artificial Intelligence

A Comprehensive History of Large Language Models from the Transformer Era (2017) to DeepSeek‑R1 (2025)

This article reviews the evolution of large language models from the 2017 Transformer breakthrough through BERT, GPT series, alignment techniques, multimodal extensions, open‑weight releases, and the cost‑efficient DeepSeek‑R1 in 2025, highlighting key technical advances, scaling trends, and their societal impact.

AI Alignment · LLM evolution · Multimodal AI
0 likes · 26 min read
Fighter's World
Mar 29, 2025 · Industry Insights

A Year in AI: Key Insights from the Unsupervised Learning & Latent Space Podcast

The podcast recap dissects a year of rapid AI change, highlighting surprisingly fast open‑source model releases, shifting foundation‑model dynamics, the rise of GPT wrappers, over‑hyped agents, undervalued memory, product‑market fit debates, infrastructure opportunities, and lingering mysteries like RL in non‑verifiable domains.

AI Infrastructure · AI trends · GPT wrappers
0 likes · 22 min read
ZhongAn Tech Team
Mar 17, 2025 · Artificial Intelligence

Weekly Tech Digest: AI Model Advancements, Strategic Infrastructure Deals, and Industry Insights on AI Agents

This weekly technology digest highlights significant advancements in artificial intelligence, including OpenAI's Python-enabled o1 model, Google's open-source Gemma 3, and Alibaba's AI-driven Quark application, alongside major industry partnerships, expert forecasts on AI agent proliferation, and emerging developments in robotics and wearable technology.

AI agents · Open-source models · Robotics
0 likes · 7 min read
Ops Development & AI Practice
Sep 16, 2024 · Industry Insights

Why Mistral AI Is Shaping the Future of Open‑Source Large Language Models

Mistral AI, a French startup founded in 2023, leverages open‑source large language models, efficient architecture, and multimodal research to offer scalable AI solutions across enterprises, content creation, and healthcare, while pursuing a community‑driven strategy that positions it as a rising force in the competitive AI landscape.

AI industry · Mistral AI · Multimodal AI
0 likes · 9 min read
21CTO
Dec 15, 2023 · Artificial Intelligence

Why 2024 Will Be the Year of AI Engineers and LLM‑Driven Apps

The article outlines five major AI engineering trends for 2024—including the rise of AI engineers, evolving LLM tech stacks, open‑source large models, vector databases, and AI agents—highlighting how these shifts will reshape application development and industry competition.

2024 trends · AI Engineering · AI agents
0 likes · 9 min read
DataFunSummit
May 4, 2023 · Artificial Intelligence

LLM Ranking Arena: Elo‑Based Competitive Evaluation of Open‑Source Chatbots

A recent study by the LMSYS organization introduces an Elo‑rated, 1v1 battle arena for large language models, ranking open‑source chatbots like Vicuna, Koala, and ChatGLM, while discussing the limitations of traditional benchmarks and the advantages of crowd‑sourced, scalable evaluation.

AI benchmarking · Chatbot Arena · Elo Rating
0 likes · 7 min read
IT Architects Alliance
Apr 20, 2023 · Artificial Intelligence

Overview of Prominent Large Language Models and Instruction‑Finetuned Variants

This article provides a comprehensive overview of major large language models—including GPT series, T5, LaMDA, LLaMA, BLOOM, and others—detailing their architectures, parameter scales, open‑source status, and the evolution of instruction‑fine‑tuning techniques that improve zero‑shot and few‑shot performance.

AI research · Instruction Tuning · LLM comparison
0 likes · 24 min read