Tagged articles

AI Architecture

130 articles · Page 1 of 2

Jun 28, 2026 · Artificial Intelligence

7 Essential Things to Know About MCP AI (Multi‑Context Prompting)

MCP AI, a multi‑context prompting approach, replaces linear chat interactions by maintaining several active contexts that the model can switch between, solving context‑window limits, improving coherence, and enabling system‑level workflows, while requiring proper role definition, rules, and feedback loops.

AI Architecture · Claude · CrewAI

0 likes · 7 min read

Code Mala Tang

Jun 26, 2026 · Artificial Intelligence

Which Layer Should Your Self‑Learning Agent Evolve? A Three‑Layer Breakdown

The article dissects self‑learning agents into model, harness, and context layers, evaluates real‑world approaches from Anthropic, Karpathy, DeepMind, Microsoft, and others, and argues that the most valuable learning signal comes from capturing genuine user feedback that most teams overlook.

AI Architecture · CopilotKit · context layer

0 likes · 15 min read

Data Party THU

Jun 15, 2026 · Artificial Intelligence

Beyond Single-Model Limits: How Collaborative Multi-Agent Architecture Drives AI Evolution

The article examines the shortcomings of single-agent AI systems—such as context overload, lack of specialization, and poor scalability—and explains how multi‑agent architectures with coordinated, specialized agents, shared memory, and parallel execution overcome these issues, offering a roadmap for the next generation of AI platforms.

AI Architecture · Agent communication · Multi-Agent Systems

0 likes · 8 min read

AI Engineer Programming

Jun 14, 2026 · Artificial Intelligence

10 RAG Architectures Every AI Engineer Should Master

The article debunks the claim that Retrieval‑Augmented Generation is obsolete, explains why huge context windows are impractical, and systematically presents ten RAG patterns—from basic Naïve RAG to advanced Graph and Multimodal RAG—detailing their trade‑offs, costs, and suitable use cases.

AI Architecture · Embedding Models · RAG

0 likes · 16 min read

DeepHub IMBA

Jun 2, 2026 · Artificial Intelligence

Multi-Agent Systems: Coordinators, Specialized Agents, and Communication Mechanisms

The article explains why single-agent AI architectures struggle with complex tasks and argues that future AI will rely on multi‑agent systems featuring a coordinator, specialized research, planning, critic, and execution agents, shared memory or message‑passing communication, and hierarchical or decentralized coordination for scalability and robustness.

AI Architecture · Coordinator · Multi-Agent Systems

0 likes · 8 min read

AI Engineering

Jun 2, 2026 · Artificial Intelligence

Why Your Enterprise AI Looks Impressive Yet Produces Garbage Results

Even with the world’s best large language models, chaotic internal notes, calls, and processes turn enterprise AI output into junk; a five‑layer architecture—capture, retrieval, source‑truth, permission, and feedback—plus a six‑question test can turn a noisy "company brain" into a useful tool, as shown by Single Grain’s dramatic time‑saving results.

AI Architecture · Automation · Enterprise AI

0 likes · 7 min read

DaTaobao Tech

Jun 1, 2026 · Artificial Intelligence

Designing LLM‑Friendly Architecture: What Truly Makes an AI‑Friendly System?

The article analyzes how traditional deterministic engineering architectures clash with the probabilistic, semantic, and dynamic nature of LLM‑driven AI, proposing three paradigm shifts and detailing an AI‑Friendly stack—including Multi‑Agent, Context Engineering, and observability—that achieved 95.7% audit accuracy and over 80% efficiency gains in real‑world marketing scenarios.

AI Architecture · LLM · Observability

0 likes · 25 min read

DeepHub IMBA

May 26, 2026 · Artificial Intelligence

Agentic AI Design Patterns: Pros, Cons, and Use Cases of Six Architectures

The article breaks down six common agentic AI design patterns—Single Agent, Sequential Agents, Parallel Agents, Loop & Critic, Coordinator & Sub‑agents, and Sub‑Agents as Tools—detailing their implementation structures, strengths, weaknesses, and ideal application scenarios, helping practitioners choose the right architecture for scalable LLM workflows.

AI Architecture · Agentic AI · LLM orchestration

0 likes · 9 min read

PaperAgent

May 25, 2026 · Artificial Intelligence

DeepSeek’s Harness: How Agent Harness Engineering Is Shaping the Next LLM Agent Era

The article surveys DeepSeek’s Harness initiative, presenting the Binding‑Constraint Thesis, three‑stage evolution from prompt to harness engineering, the ETCLOVG seven‑layer architecture, and concrete benchmark evidence that harness‑only improvements far outweigh model upgrades, while detailing security, observability, and governance considerations for reliable LLM agents.

AI Architecture · Agent Harness Engineering · Agent evaluation

0 likes · 12 min read

Spring Full-Stack Practical Cases

May 20, 2026 · Artificial Intelligence

RAG vs. LLM Wiki vs. GBrain: Which Architecture Best Powers Agent Memory?

The article analyzes why AI agents forget, then compares three memory architectures—RAG, LLM Wiki, and GBrain—detailing their strengths, weaknesses, scalability, latency, compounding knowledge, and autonomy, and offers guidance on choosing the right approach for different use cases.

AI Architecture · Agent Memory · LLM Wiki

0 likes · 20 min read

DataFunTalk

May 19, 2026 · Artificial Intelligence

How Knora’s Ontology‑Enhanced AI Tackles Hallucinations and Execution Gaps in Enterprise Deployments

The article explains how Knora 4.0 combines enterprise‑level ontologies with large‑model capabilities to overcome six common AI challenges (hallucination, instability, weak planning, poor responsiveness, data integration, and long cold‑start cycles), enabling autonomous, auditable execution, illustrated by an LED production‑line case that achieved a 70‑fold efficiency boost.

AI Architecture · Autonomous Agents · Enterprise AI

0 likes · 16 min read

FunTester

May 19, 2026 · Artificial Intelligence

How Memory Layering Makes AI Agents Smarter Over Time

The article explains why default agent memory is fleeting, proposes a two‑layer design of session and long‑term memory with a post‑session “dreaming” integration step, and shows how selective persistence and shared long‑term storage keep agents continuously improving.

AI Architecture · Agent Memory · Dream Integration

0 likes · 8 min read

Architect

May 12, 2026 · Artificial Intelligence

Why Does Past Information Influence Future Decisions? Analyzing Agent Memory Architecture

The article dissects Agent Memory, explaining how past observations are written, managed, and read to affect future tasks, highlighting challenges such as relevance, decay, conflict, security, and offering practical design guidelines and architectural options for production‑grade AI agents.

AI Architecture · Agent Memory · LLM Agents

0 likes · 31 min read

DataFunTalk

May 5, 2026 · Artificial Intelligence

How Knora’s Ontology‑Enhanced AI Tackles Hallucinations and Execution Gaps in Enterprise Deployments

The article analyzes Knora 4.0, an ontology‑enhanced AI platform that combines large‑model capabilities with a structured knowledge graph to overcome hallucinations and execution gaps in enterprise deployments, detailing its architecture, autonomous agent Knora Claw, real‑world case studies, and a three‑year roadmap.

AI Architecture · Autonomous Agents · Business Automation

0 likes · 18 min read

Spring Full-Stack Practical Cases

May 3, 2026 · Artificial Intelligence

9 Advanced Retrieval‑Augmented Generation (RAG) Architectures Explained

This article introduces Retrieval‑Augmented Generation (RAG) and systematically details nine distinct RAG architectures—standard, conversational with memory, corrective (CRAG), adaptive, self‑RAG, fusion, HyDE, agentic, and Graph RAG—highlighting their workflows, real‑world examples, advantages, and trade‑offs.

AI Architecture · GraphRAG · LLM

0 likes · 17 min read

DataFunSummit

Apr 28, 2026 · Artificial Intelligence

How Knora’s Ontology‑Enhanced Large Model Solves Hallucination and Execution Gaps in Enterprise AI

The article explains how Knora 4.0 combines enterprise ontologies with large‑model AI to create a unified, autonomous execution loop, addressing six common AI‑deployment challenges, detailing the platform’s architecture, autonomous agents, real‑world case studies, roadmap, and expert round‑table insights.

AI Architecture · Autonomous Agents · Enterprise AI

0 likes · 17 min read

Linyb Geek Road

Apr 28, 2026 · Artificial Intelligence

Why Just-in-Time Context Is the Secret to Efficient AI Agents

The article argues that loading prompts, skills, and configuration only when they are needed—just-in-time context—dramatically reduces token consumption, improves precision, and turns AI agents from wasteful code generators into lean, production‑grade assistants.

AI Agents · AI Architecture · Just-in-Time Context

0 likes · 12 min read

Architect

Apr 27, 2026 · Artificial Intelligence

Sub-Agent vs Agent Team: Designing Multi-Agent Architectures Around Context Boundaries

The article explains how to choose between Sub‑Agent and Agent Team structures for multi‑agent systems by evaluating whether sub‑tasks share context, need isolation, compression, parallelism, or continuous collaboration, and provides practical guidelines, pitfalls, and a decision framework to avoid over‑engineering.

AI Architecture · Agent Team · Context Boundaries

0 likes · 18 min read

Architect's Tech Stack

Apr 25, 2026 · Artificial Intelligence

DeepSeek‑V4 Launch: 1.6 T Parameters, 1 M‑Token Context, Programming Skills Lead Open‑Source Rankings

DeepSeek released the V4 series—V4‑Pro (1.6 T total, 49 B active) and V4‑Flash (284 B total, 13 B active)—featuring three architectural upgrades, three inference modes, mixed‑precision FP4/FP8 weights, and benchmark results that place its programming ability at the top of open‑source models while supporting a million‑token context window.

AI Architecture · DeepSeek · Large Language Model

0 likes · 5 min read

IT Services Circle

Apr 25, 2026 · Artificial Intelligence

Understanding AI Core Concepts: Agent, Skills, Tools, and MCP

The article explains the four core AI components—Agent, Tools, Skills, and MCP—detailing their definitions, roles, the problems they address, and how they interoperate within the Cursor platform to transform a conversational model into a functional digital worker.

AI Architecture · Agent · MCP

0 likes · 13 min read

PaperAgent

Apr 24, 2026 · Artificial Intelligence

DeepSeek‑V4 Open‑Sources Its Million‑Token Architecture and Calls Out Claude Opus 4.6

DeepSeek‑V4’s open‑source report reveals a hybrid CSA/HCA attention design, manifold‑constrained residuals and the Muon optimizer that cut per‑token FLOPs to 27 % and KV‑Cache to 10 % at 1 M tokens, while benchmark results show it outperforms Claude Opus 4.6 on most tasks yet still lags on complex instruction following and multi‑turn dialogue.

AI Architecture · Claude Opus · DeepSeek-V4

0 likes · 11 min read

Architect's Must-Have

Apr 23, 2026 · Artificial Intelligence

OpenAI Images 2.0 Deep Dive: How AI Image Generation Enters the “Thinking Era”

The article provides a comprehensive technical analysis of OpenAI's ChatGPT Images 2.0 (gpt‑image‑2), detailing its strategic launch, new autoregressive architecture, integrated reasoning and web‑search capabilities, multi‑image consistency, pricing model, competitive landscape, limitations, and future impact on visual AI workflows.

AI Architecture · GPT Image 2 · Multimodal AI

0 likes · 28 min read

Machine Learning Algorithms & Natural Language Processing

Apr 21, 2026 · Artificial Intelligence

How a 22‑Year‑Old Reverse‑Engineered Mythos into OpenMythos Using MoE and DeepSeek‑Inspired Attention

OpenMythos re‑creates the Claude Mythos architecture as a Recurrent‑Depth Transformer with MoE routing, achieving comparable performance to larger Transformers while using roughly half the parameters, and demonstrates systematic generalization and depth extrapolation through looped inference in latent space.

AI Architecture · Looped Language Models · Mixture of Experts

0 likes · 6 min read

Old Zhang's AI Learning

Apr 21, 2026 · Artificial Intelligence

Is DeepSeek V4 Really Launching Next Week? Inside Its Core Architecture

Analyzing the credibility of Yifan Zhang’s brief “V4, next week” tweet, the article examines five supporting signals, details three newly revealed architecture components—Sparse MQA, Fused MoE Mega Kernel, and Manifold‑Constrained Hyper‑Connections—and summarizes V4’s rumored specifications, pricing, and strategic implications.

AI Architecture · DeepSeek · Fused MoE

0 likes · 7 min read

PaperAgent

Apr 21, 2026 · Artificial Intelligence

OpenMythos: Rebuilding Claude Mythos with Recursive Transformers and MoE

OpenMythos is an open‑source PyTorch reimplementation of Anthropic's Claude Mythos that uses a recurrent Transformer with Mixture‑of‑Experts (MoE) routing, introduces Recursive Depth Transformers, Multi‑Latent Attention, and several stability mechanisms, and demonstrates parameter‑efficient scaling backed by empirical studies.

AI Architecture · Claude Mythos · MoE

0 likes · 6 min read

AI Architect Hub

Apr 20, 2026 · Artificial Intelligence

Why LLMs Need RAG: Overcoming Core Limitations and Building Scalable AI Solutions

This article analyzes the fundamental shortcomings of large language models for enterprise use, explains how Retrieval‑Augmented Generation (RAG) bridges those gaps through a detailed offline‑online workflow, and explores emerging trends that will shape the next generation of intelligent AI architectures.

AI Architecture · Enterprise AI · Future AI

0 likes · 10 min read

Tech Freedom Circle

Apr 20, 2026 · Artificial Intelligence

Harness Architecture Meets LangChain and LangGraph: The Underlying Integration Logic

The article systematically dissects how Harness’s enterprise‑grade Super Agent architecture leverages LangChain’s component library and LangGraph’s execution engine, detailing dependency relationships, source‑level integration, and a real‑world multimodal customer‑service agent case.

AI Architecture · DeerFlow · Harness

0 likes · 16 min read

Code Mala Tang

Apr 19, 2026 · Artificial Intelligence

Why Real‑World Constraints Define the Success of Claude Code Agents

The analysis of the arXiv paper “Dive into Claude Code” reveals that beyond model loops, the decisive factors for coding agents are practical system design issues such as permission control, context compression, safety, user intervention, and reliable execution in real environments.

AI Architecture · Claude Code · Context Management

0 likes · 5 min read

Architect

Apr 18, 2026 · Artificial Intelligence

Why Multi‑Agent Systems Need More Than Role‑Playing: 5 Coordination Patterns Explained

Anthropic’s recent analysis reveals five multi‑agent coordination patterns—Generator‑Verifier, Orchestrator‑Subagent, Agent Teams, Message Bus, and Shared State—highlighting that the real challenges lie in context boundaries, information flow, verification standards, and termination conditions rather than merely assigning roles.

AI Architecture · Coordination Patterns · Information Flow

0 likes · 30 min read

Machine Heart

Apr 17, 2026 · Artificial Intelligence

Combining Transformers and RNNs: Google’s Memory Caching Unlocks Ultra‑Long Context

Google Research introduces Memory Caching (MC), a technique that gives RNNs growing memory capacity, bridging the gap with Transformers to enable ultra‑long context processing while reducing memory demands, and demonstrates its effectiveness through extensive language‑modeling and recall experiments.

AI Architecture · Google Research · Long Context

0 likes · 7 min read

Qborfy AI

Apr 15, 2026 · Artificial Intelligence

Why Three AI Agents Beat One: Planner‑Generator‑Evaluator Architecture Explained

The article analyzes why a single AI struggles to self‑evaluate, presents Anthropic’s three‑agent (Planner, Generator, Evaluator) architecture with concrete DAW‑building examples, sprint contracts, cost‑benefit tables, and step‑by‑step processes that show how each role solves specific problems and improves overall quality.

AI Architecture · Evaluator · cost analysis

0 likes · 24 min read

FunTester

Apr 14, 2026 · Artificial Intelligence

Why Long-Term Memory Is the Next Frontier for Large Language Models

The article examines how the evolution of large‑language‑model memory is shifting from expanding context windows to building controllable, auditable long‑term memory systems, comparing strategies of OpenAI, Anthropic, Google, Microsoft and Meta, and outlining future trends such as automatic memory policies, multimodal storage, agent‑shared memory, and memory‑reasoning integration.

AI Architecture · future AI trends · large language models

0 likes · 8 min read

AI Explorer

Apr 14, 2026 · Artificial Intelligence

OpenAI Launches Spud to Counter Anthropic’s Claude Mythos on Blackwell

OpenAI’s newly announced Spud model directly targets Anthropic’s Claude Mythos, leveraging Nvidia’s Blackwell architecture to shift the AI race from sheer scale toward hardware efficiency, signalling a strategic pivot where performance per compute unit becomes the next competitive benchmark.

AI Architecture · Anthropic · Blackwell

0 likes · 6 min read

AI Engineering

Apr 14, 2026 · Artificial Intelligence

Anthropic’s Multi‑Agent Coordination Guide: 5 Architectures and When to Use Them

When a single AI agent can’t finish a task, Anthropic’s new guide outlines five proven multi‑agent coordination patterns—generate‑validate, orchestrate‑sub‑agent, team, message‑bus, and shared‑state—detailing suitable scenarios, common pitfalls, and a recommendation to start simple and scale only as needed.

AI Architecture · Anthropic · Coordination Patterns

0 likes · 4 min read

Software Engineering 3.0 Era

Apr 14, 2026 · Artificial Intelligence

The First Principle of Context Engineering: Mastering the “Just‑Right” Art for AGI

The article explains that as large language models approach their capacity limits, performance is now bounded by the quality of the supplied context, advocating a “just‑right” approach that balances over‑ and under‑feeding through a three‑layer architecture, dynamic context agents, and a central router to enable scalable multi‑agent AI systems.

AI Architecture · Harness Engineering · LLM

0 likes · 9 min read

AndroidPub

Apr 9, 2026 · Artificial Intelligence

Beyond Prompting: Mastering Harness Engineering to Build Reliable LLM Applications

This article examines the evolution from Prompt Engineering to Context Engineering and finally to Harness Engineering, presenting a six‑layer architecture and practical modules that turn large language models into robust, observable, and maintainable AI systems.

AI Architecture · Harness Engineering · LLM

0 likes · 28 min read

AI Step-by-Step

Apr 6, 2026 · Artificial Intelligence

Why Single Agents Fail: Embracing Multi‑Agent Microservice Architecture

When a single AI agent’s logic hits bottlenecks, the article explains how breaking responsibilities into bounded microservice agents, using pipelines for deterministic steps and supervisors for dynamic routing, yields clearer contracts, shared state, easier debugging, and more stable, scalable task execution.

AI Architecture · Microservices · Orchestration

0 likes · 12 min read

Architecture and Beyond

Apr 4, 2026 · Artificial Intelligence

How Claude Code Structures Its Memory: A Deep Dive into Multi‑Layered Agent Memory Design

This article dissects Claude Code's memory architecture, explaining its four distinct memory layers, file‑based long‑term storage, dynamic retrieval without embeddings, multi‑stage write paths, and session‑compression strategies, while highlighting design trade‑offs and practical takeaways for building robust AI agents.

AI Architecture · Agent Memory · Claude Code

0 likes · 20 min read

Advanced AI Application Practice

Apr 3, 2026 · Industry Insights

In-Depth Breakdown of the AI Business Architect Role and Interview Strategies

This article dissects the AI Business Architect position, detailing its true responsibilities, core competency formula, key role personas, supply‑demand matching scenarios, end‑to‑end technical architecture (including RAG and multi‑agent design), evaluation metrics, and provides concrete interview questions with model answers to help candidates prepare effectively.

AI Architecture · Agent systems · Interview Prep

0 likes · 18 min read

Architect

Apr 1, 2026 · Artificial Intelligence

Inside Claude Code: How Anthropic Built a Secure, Scalable Local Agent Runtime

This article dissects Claude Code’s open‑source repository, revealing how its startup sequence, context assembly, main loop, tool contracts, permission pipeline, and long‑task handling are engineered layer by layer to create a performant, secure local AI agent runtime.

AI Architecture · Agent Runtime · Claude Code

0 likes · 24 min read

Ray's Galactic Tech

Mar 31, 2026 · Artificial Intelligence

From Single-Node RAG to Scalable Go AI Services: A Hands‑On Architecture Blueprint

This comprehensive guide walks Go engineers through the evolution from a prototype Retrieval‑Augmented Generation (RAG) service to a production‑grade, distributed AI platform, covering architecture, component boundaries, caching strategies, async indexing, observability, security, and step‑by‑step deployment.

AI Architecture · Backend Development · Go

0 likes · 42 min read

Data STUDIO

Mar 30, 2026 · Artificial Intelligence

Why a Single AI Falls Short: Building a Multi‑Agent Expert Team for Superior Reports

The article demonstrates how a monolithic LLM struggles with multi‑dimensional market analysis and shows, through step‑by‑step code, how assembling specialized AI agents for news, technical and financial analysis yields clearer structure, deeper insight, and higher evaluation scores.

AI Architecture · LLM evaluation · LangChain

0 likes · 17 min read

AI Large-Model Wave and Transformation Guide

Mar 28, 2026 · Artificial Intelligence

From RNNs to Multimodal Agents: A Decade of Transformer Evolution

This article traces the evolution of sequence models from early RNN/LSTM designs through the breakthrough Transformer, its major branches, dense scaling, efficiency‑focused variants, next‑generation linear‑complexity SSMs, and finally multimodal agent architectures, highlighting each stage's strengths, weaknesses, and typical use cases.

AI Architecture · Efficient Attention · LLM

0 likes · 12 min read

Wu Shixiong's Large Model Academy

Mar 28, 2026 · Artificial Intelligence

Mastering Multi‑Agent Systems: Design, Parallel Execution, and Interview Strategies

This article dissects the shortcomings of single‑agent LLM pipelines, introduces the Supervisor‑based Multi‑Agent architecture with LangGraph, demonstrates parallel task execution, robust error handling, and result merging, and provides concrete interview guidance backed by real performance data.

AI Architecture · Error handling · LLM

0 likes · 19 min read

AI Explorer

Mar 27, 2026 · Artificial Intelligence

Why Tsinghua’s Multi‑Intelligence DeepSeek‑R1 Shifts AI from Depth to Width

Tsinghua University and WuWen XinQiong unveil DeepSeek‑R1, a multi‑model AI architecture that prioritizes width over depth, enabling parallel expert models to tackle complex, multi‑format data, addressing single‑model limitations while attracting significant industry investment and posing new engineering challenges.

AI Architecture · DeepSeek-R1 · Multi-Model

0 likes · 7 min read

AI Info Trend

Mar 24, 2026 · Artificial Intelligence

How OpenClaw 2.0 Turns AI from Chatbot to Actionable Agent – A Deep Dive

The OpenClaw 2.0 research report maps the evolution from simple chatbots to fully‑actionable AI agents, detailing its market surge, four‑layer memory architecture, zero‑code deployment options, cost‑saving token optimization, and a roadmap that predicts AI agents will reshape personal productivity and enterprise workflows.

AI Agent · AI Architecture · AI trends

0 likes · 6 min read

Architect

Mar 22, 2026 · Artificial Intelligence

Can Frozen LLMs Keep Learning? Inside Memento‑Skills' Deployment‑Time Learning

The article analyses the Memento‑Skills paper and its open‑source implementation, showing how a frozen large language model can continuously improve by treating skills as external memory, using a five‑step Observe‑Read‑Act‑Feedback‑Write loop, advanced routing, and modular architecture to achieve significant gains on GAIA and HLE benchmarks.

AI Architecture · Agent · Deployment-Time Learning

0 likes · 21 min read

SuanNi

Mar 21, 2026 · Artificial Intelligence

Can AI Achieve Human‑Like Autonomous Learning? A Blueprint from Top Researchers

The article analyzes a groundbreaking AI research blueprint proposed by Yann LeCun, Emmanuel Dupoux, and Jitendra Malik, outlining three interacting systems—observation, action, and meta‑control—to enable machines to learn autonomously like infants, while highlighting technical and ethical challenges.

AI Architecture · Meta Learning · autonomous learning

0 likes · 13 min read

Data Party THU

Mar 21, 2026 · Artificial Intelligence

Why Bigger Context Windows Hurt LLMs and How RAG Still Wins

The article explains that expanding LLM context windows leads to attention dilution and retrieval collapse, degrading answer quality, and argues that Retrieval‑Augmented Generation remains essential because it preserves signal density through focused retrieval and selective prompting.

AI Architecture · Attention Dilution · LLM

0 likes · 8 min read

Alibaba Cloud Developer

Mar 20, 2026 · Artificial Intelligence

Mastering Multi‑Agent Patterns with AgentScope and Spring AI Alibaba

This article analyzes the evolution of enterprise AI from single‑model chat to scalable multi‑agent workflows, explains seven core multi‑agent patterns—including Pipeline, Routing, Skills, Subagents, Supervisor, Handoffs, and Custom Workflow—provides detailed implementation guidance with Java code, and shows how Spring AI Alibaba now natively supports AgentScope orchestration for robust, observable AI applications.

AI Architecture · AgentScope · Java

0 likes · 23 min read

Coder Circle

Mar 19, 2026 · Artificial Intelligence

OpenAI’s GPT‑5.4 mini and nano usher in the AI Execution‑Layer era

OpenAI’s March 17 release of GPT‑5.4 mini and nano marks a shift from single‑large‑model AI to a layered architecture with a control plane for complex reasoning and a data plane for high‑frequency tasks, delivering near‑flagship performance at a fraction of the cost and paving the way for hybrid agent systems and micro‑service‑style AI infrastructure.

AI Architecture · Control Plane · Data Plane

0 likes · 8 min read

Tech Freedom Circle

Mar 19, 2026 · Artificial Intelligence

Failed Alibaba Interview: The 4 RAG Modules and 6 Design Principles You Need

The article dissects a failed Alibaba second‑round interview where the candidate answered only “vector‑search‑enhanced” for a RAG design, and then presents a systematic, four‑module RAG architecture together with six design principles, detailed indexing, query understanding, multi‑path recall, and context generation techniques to help candidates demonstrate comprehensive technical depth.

AI Architecture · Knowledge Graph · Multi‑Path Recall

0 likes · 22 min read

DeepHub IMBA

Mar 13, 2026 · Artificial Intelligence

Why Bigger Context Windows Make RAG Essential, Not Redundant

Although expanding LLM context windows seems to eliminate the need for Retrieval‑Augmented Generation, in practice larger windows dilute attention and cause retrieval failures, so RAG remains crucial for filtering high‑signal content and maintaining answer quality.

AI Architecture · Attention Dilution · LLM

0 likes · 7 min read

AI Waka

Mar 13, 2026 · Artificial Intelligence

How to Map Enterprise Workflows to Agentic AI Execution Graphs

This article explores the evolution of Agentic AI, outlines a full lifecycle for designing, deploying, and governing AI agents, presents a reference architecture, and demonstrates a practical case study of automating a customer service desk using agentified workflows.

AI Architecture · Agentic AI · Enterprise Automation

0 likes · 15 min read

AI Explorer

Mar 12, 2026 · Artificial Intelligence

Nvidia’s Open‑Source Nemotron 3 Super: Hybrid Mamba‑MoE Architecture Boosts Performance and Efficiency

Nvidia’s newly released open‑source 120‑billion‑parameter Nemotron 3 Super uses a hybrid Mamba‑MoE architecture that activates only a fraction of its parameters during inference, delivering up to 300% faster inference while cutting costs. The open‑source release aims to set new AI standards, influence ecosystem adoption, and spark a competition between architectural innovation and data quality.

AI Architecture · Mamba-MoE · NVIDIA

0 likes · 6 min read

SuanNi

Mar 7, 2026 · Artificial Intelligence

How HY‑WU Enables Real‑Time Dynamic Parameters for Large‑Scale AI Models

Tencent's HY‑WU architecture introduces functional memory that generates task‑specific parameters on the fly, overcoming catastrophic forgetting and static‑weight limitations, and demonstrates superior performance in image‑editing benchmarks compared to leading open‑source and closed‑source models.

AI Architecture · Tencent · dynamic-parameters

0 likes · 12 min read

Architect

Mar 4, 2026 · Artificial Intelligence

What Makes a Real Agent Skill Efficient? A Deep Dive into Anthropic’s frontend‑design Skill

This article dissects a real Skill from Anthropic’s open‑source repository, explains the design principles behind its 42‑line SKILL.md, compares Anthropic’s Claude Code and OpenAI’s Codex ecosystems, and extracts practical lessons for building robust, version‑controlled AI Skills.

AI Architecture · Agent Skills · Claude

0 likes · 18 min read

JD Tech

Feb 27, 2026 · Artificial Intelligence

Why Agent Skills and MCP Should Work Together, Not Compete

This article clarifies the distinct roles of Agent Skills and Model Context Protocol (MCP), compares their core features, shows how they complement each other through design philosophy and real‑world scenarios, and provides a decision framework for choosing the right tool in AI agent architectures.

AI Architecture · Agent Skills · Agentic AI

0 likes · 26 min read

SuanNi

Feb 26, 2026 · Artificial Intelligence

How Alibaba’s Qwen3.5 Series Redefines Efficient Large‑Model Design

Alibaba’s newly released Qwen3.5 series—spanning 27B, 35B, and 122B parameter models—demonstrates how hybrid compute, high‑quality data, and reinforcement‑learning can boost multimodal understanding, ultra‑long‑context handling, and multilingual support while drastically lowering hardware requirements, marking a shift from pure scaling to efficient AI evolution.

AI Architecture · Long Context · Multimodal AI

0 likes · 7 min read

PaperAgent

Feb 23, 2026 · Industry Insights

Why Enterprise AI Fails and How Unified Context Layers Can Unlock True Autonomy

Enterprise AI projects are failing at alarming rates because fragmented context and lack of governance prevent autonomous agents from making decisions, and the Unified Context Layer (UCL) architecture offers a comprehensive solution that operationalizes context graphs, integrates existing systems, and enables truly autonomous, production‑grade AI.

AI Architecture · Autonomous Agents · Enterprise AI

0 likes · 15 min read

Old Zhang's AI Learning

Feb 19, 2026 · Artificial Intelligence

Inside GLM-5: Training Techniques, Architecture Innovations, and Benchmark Performance

The article dissects GLM-5’s 744B‑parameter MoE design, 28.5T‑token training corpus, novel Muon Split and MLA‑256 optimizations, DSA sparse attention, a fully asynchronous RL pipeline, extensive domestic chip adaptation, and benchmark results that place it on par with Claude Opus 4.5 and ahead of Gemini 3 Pro.

AI Architecture · Agentic RL · DSA

0 likes · 13 min read

Alibaba Cloud Developer

Feb 14, 2026 · Artificial Intelligence

Revamping AliGo’s AI Travel Assistant: Multi‑Agent Architecture & Prompt Engineering

The AliGo travel platform upgraded its AI assistant by replacing a single‑agent workflow with a modular multi‑agent system, introducing dynamic prompt generation, real‑time reasoning chains, context sharing, observability, and a knowledge base, which dramatically improved accuracy, stability, and user experience.

AI Architecture · AgentScope · Knowledge Base

0 likes · 19 min read

PMTalk Product Manager Community

Feb 13, 2026 · Artificial Intelligence

From Zero to One: Building a Deployable RAG System for Intelligent Customer Service

This article walks product managers through the end‑to‑end design of a Retrieval‑Augmented Generation (RAG) intelligent‑customer‑service system, covering business value, knowledge‑base preparation, hybrid retrieval, prompt‑driven generation, deployment choices, monitoring metrics, and common methodological pitfalls.

AI Architecture · Intelligent Customer Service · Prompt Engineering

0 likes · 11 min read

AI Software Product Manager

Feb 4, 2026 · Artificial Intelligence

Mastering Agent Skills: A Systematic Guide to Large Model Capabilities

This article traces the evolution of large‑model capabilities from early plugins to the standardized Agent Skills framework, explains the core concepts, technical composition, and progressive disclosure mechanism, and provides a step‑by‑step practical guide for building, configuring, and deploying Skills across ecosystems.

AI Architecture · AI Operations · Agent Skills

0 likes · 11 min read

大转转FE

Feb 2, 2026 · Artificial Intelligence

Inside Moltbot’s Core Architecture, AI Memory Systems, and ToolRL Advances

This edition of the ZuanZuan Frontend Weekly curates five in‑depth articles covering Moltbot’s underlying gateway architecture, the explosive growth of Moltbook AI agents, practical integration of Alibaba Cloud RDS AI assistants, the design of short‑ and long‑term AI Agent memory systems, and a two‑stage ToolRL approach that dramatically improves AI‑driven recommendation performance.

AI Architecture · AI Ops · Agent Memory

0 likes · 7 min read

Ops Development Stories

Jan 28, 2026 · Artificial Intelligence

Understanding MCP, Agent, Skill, and Rule: How LLMs Differ from Traditional APIs

This article systematically explains the concepts of MCP, Agent, Skill, and Rule from an engineering viewpoint, highlighting their roles, differences from traditional API calls, and how they enable large language models to safely and autonomously interact with external tools.

AI Architecture · Agent · LLM

0 likes · 8 min read

PaperAgent

Jan 21, 2026 · Artificial Intelligence

Inside DeepSeek’s FlashMLA Update: What’s New in the MODEL1 Architecture

DeepSeek’s recent FlashMLA update introduces the new MODEL1, featuring a tighter KV-Cache layout, an extra two-stage cache, and a fixed 512×512 head dimension, with four code changes detailed in a public GitHub commit and illustrated by comparative diagrams.

AI Architecture · DeepSeek · FlashMLA

0 likes · 3 min read

AI Insight Log

Jan 13, 2026 · Artificial Intelligence

Why Bigger LLMs Still Forget Facts – DeepSeek’s Engram Memory Module Explained

This article analyzes DeepSeek’s new Engram module, showing how conditional memory reduces reliance on the compute‑only approach of large language models and improves knowledge retrieval, reasoning, long‑context handling, and system efficiency while maintaining strict parameter and FLOP budgets.

AI Architecture · DeepSeek · Engram

0 likes · 15 min read

Architect

Jan 1, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost Large Model Training Efficiency

DeepSeek’s new paper introduces mHC, a manifold‑constrained version of Hyper‑Connections that stabilizes gradient flow, adds only 6.7% training overhead, and enables reliable training of 27‑billion‑parameter models while improving benchmark performance by about 2%.

AI Architecture · Large‑Scale Training · Manifold-Constrained

0 likes · 7 min read

PaperAgent

Jan 1, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost Large-Scale Model Training Efficiency

The article introduces mHC, a Manifold‑Constrained Hyper‑Connections technique that replaces standard residual links with multiple learned pathways, using doubly stochastic matrices to lock gradients, achieving stable training of 27‑billion‑parameter models with only 6.7% extra compute and superior performance across eight downstream benchmarks.

AI Architecture · Efficient Implementation · Manifold-Constrained

0 likes · 6 min read

PaperAgent

Dec 31, 2025 · Artificial Intelligence

World Models Meet Embodied AI: The Next Leap for Agentic Systems

The article surveys the rise of agentic AI in 2025, highlights 2026’s shift toward world models combined with embodied intelligence, explains the concept and benefits of world models, and compares three architectural paradigms—modular, sequential, and unified—offering guidance for selecting the best approach.

AI Architecture · Agentic AI · Embodied Intelligence

0 likes · 8 min read

Tencent Cloud Developer

Dec 24, 2025 · Backend Development

How IMA Scaled Its AI Knowledge Base from Monolith to Micro‑services

This article walks through the end‑to‑end design of IMA's AI‑driven knowledge base, covering its definition, core business flow, architecture evolution, data ingestion pipelines, management challenges, asynchronous processing, permission modeling, and the business value demonstrated by the prototype.

AI Architecture · Access Control · Data Consistency

0 likes · 14 min read

Baobao Algorithm Notes

Dec 20, 2025 · Artificial Intelligence

How General‑Purpose Agents Are Converging on Claude Code and Deep Agent Designs

The article analyzes the 2025 shift toward a unified "general‑type" agent architecture exemplified by Claude Code and Deep Agent, detailing industry adoption, core technical features, skill‑based extensions, long‑running capabilities, and practical steps for building domain‑specific agents.

AI Architecture · Agent Skills · Claude Code

0 likes · 25 min read

ShiZhen AI

Dec 5, 2025 · Artificial Intelligence

Can AI Achieve Human‑Like Long‑Term Memory? Inside Google’s Titans Architecture

Google’s newly unveiled Titans architecture tackles AI’s “forgetfulness” by embedding a Neural Long‑Term Memory (LMM) module that updates model weights during inference using a test‑time training approach and a MIRAS surprise metric, enabling over 2 million‑token context with linear O(N) computation and superior benchmark results versus GPT‑4 RAG.

AI Architecture · Google Titans · MIRAS

0 likes · 5 min read

ITPUB

Nov 24, 2025 · Artificial Intelligence

Why Memory, Not Size, Is the Next Bottleneck for Large Language Models

In a detailed interview, the CTO of Memory Tensor (Shanghai) explains how limited memory capacity hampers large models, outlines the MemOS memory operating system, discusses information‑theoretic metrics, multimodal extensions, and reinforcement‑learning strategies for scalable, secure, and explainable AI memory management.

AI Architecture · Multimodal AI · information theory

0 likes · 23 min read

Data Party THU

Nov 21, 2025 · Artificial Intelligence

Unlocking 2025 Multi-Agent AI: Core Tech, Frameworks, and Emerging Trends

This article analyzes the technical foundations, development frameworks, real‑time inference optimizations, typical industry deployments, and future research directions of multi‑agent systems in 2025, highlighting protocols like FIPA‑ACL and MCP, tools such as LangGraph and ADP3.0, and edge‑computing breakthroughs.

AI Architecture · Distributed Computing · Model Quantization

0 likes · 16 min read

AI Large Model Application Practice

Oct 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Next Evolution Beyond Prompt Engineering

The article explains how traditional prompt engineering is giving way to Context Engineering and the Agentic Context Engineering (ACE) framework, which lets large language model agents continuously learn and improve through evolving, well‑structured context without fine‑tuning.

AI Architecture · Agentic AI · LLM

0 likes · 12 min read

AI2ML AI to Machine Learning

Oct 13, 2025 · Artificial Intelligence

How Large‑and‑Small Language Model Collaboration Is Shaping the Future

The article argues that combining large, high‑capacity models with lightweight, fine‑tuned small models can cut costs, lower latency, enable specialized vertical tasks, and shift development from chasing ever‑bigger models toward optimal system architectures, outlining key techniques such as state‑space models, knowledge distillation, and staged fine‑tuning.

AI Architecture · Efficiency · Large Language Model

0 likes · 3 min read

Fun with Large Models

Sep 30, 2025 · Artificial Intelligence

DeepSeek-V3.2 Architecture Breakthrough: A 5‑Minute Guide to Its Core Features

The article introduces DeepSeek-V3.2, highlighting its new DeepSeek Sparse Attention (DSA) that boosts training and inference efficiency by up to 50%, cuts model usage costs dramatically, explains the updated API endpoints, and details the four‑stage post‑training pipeline that underpins the model’s performance improvements.

AI Architecture · DSA · DeepSeek-V3.2

0 likes · 8 min read

Data Party THU

Sep 28, 2025 · Artificial Intelligence

Can the OaK Architecture Unlock General AI? A Deep Dive into Continuous Learning and Planning

The article presents Richard Sutton’s OaK architecture—a domain‑general, empirical, open‑ended framework that equips agents with continuously learnable components, meta‑learned step‑sizes, and a five‑stage FC‑STOMP pipeline to build world models, generate sub‑problems, learn options, and plan at run‑time.

AI Architecture · Continual Learning · meta‑learning

0 likes · 22 min read

Tech Freedom Circle

Sep 25, 2025 · Artificial Intelligence

Inside RAGFlow: How Its Microservice Architecture Powers an Enterprise‑Grade Retrieval‑Augmented Generation Platform

This article provides a detailed technical walkthrough of RAGFlow's architecture, covering its microservice design, directory layout, layered structure, cloud‑native deployment, core modules such as DeepDoc, RAG engine, Agent system, and web UI, as well as multi‑tenant isolation, streaming responses, asynchronous task handling, concurrency controls, scalability strategies, and a complete request‑lifecycle example for document upload.

AI Architecture · DeepDoc · Docker Compose

0 likes · 26 min read

IT Architects Alliance

Sep 17, 2025 · Artificial Intelligence

How Distributed Scheduling Redefines AI Large-Model Training Architecture

The article examines how the explosive compute, storage, network, and fault‑tolerance demands of AI large‑model training force a fundamental redesign of system architecture, covering layered storage, optimized All‑Reduce communication, elastic resource orchestration, observability, and cost‑saving strategies.

AI Architecture · Compute Scheduling · Storage Hierarchy

0 likes · 9 min read

IT Architects Alliance

Sep 10, 2025 · Cloud Native

How AI, Cloud‑Native, and Platform Engineering Redefine System Architecture in 2024

Amid rapid AI breakthroughs, mature cloud‑native infrastructure, and rising edge computing, architects must adopt platform engineering, event‑driven and composable architectures, and AI‑native designs, while evolving technical and soft skills to meet escalating business complexity and guide technology selection over the next five years.

AI Architecture · Platform Engineering · cloud-native

0 likes · 12 min read

Architect's Alchemy Furnace

Sep 10, 2025 · Artificial Intelligence

Mastering RAG: Classic Architecture, Challenges, and Evolution Explained

This article outlines the fundamental RAG workflow—from data indexing and querying to advanced modular designs—highlights key challenges such as retrieval accuracy, model robustness, context limits, and performance, and traces the evolution from naive to modular RAG systems.

AI Architecture · RAG · Vector Indexing

0 likes · 12 min read

Baobao Algorithm Notes

Sep 10, 2025 · Artificial Intelligence

Qwen3-Next Unveiled: Sparse MoE, Hybrid Attention & Multi‑Token Prediction

A recent Hugging Face pull request reveals Alibaba’s upcoming Qwen3‑Next series, highlighting its extreme‑context, parameter‑efficient design that combines a 1:50 high‑sparsity MoE, a hybrid attention architecture mixing gated attention with Gated DeltaNet, and a Multi‑Token Prediction technique, promising ten‑fold throughput gains for 32K‑plus token contexts.

AI Architecture · Hybrid Attention · Qwen3-Next

0 likes · 8 min read

Architects Research Society

Sep 9, 2025 · Artificial Intelligence

Unlocking AI Autonomy: How Agentic Workflows Transform Complex Processes

Agentic Workflows introduce a dynamic, multi‑step AI orchestration framework that externalizes decision points, embeds observability, and supports branching, looping, and human intervention, enabling autonomous agents to automate intricate workflows across domains such as threat detection, fraud handling, and research assistance.

AI · AI Architecture · Automation

0 likes · 3 min read

Architects Research Society

Sep 6, 2025 · Artificial Intelligence

From Hype to Engineered AI: The Core Architecture Behind Modern AI Apps

This article breaks down the essential components of production‑grade AI applications, covering the intelligent core (model, orchestration, memory), enterprise‑level supporting infrastructure, and critical governance, security, and data‑integrity measures required for reliable AI systems.

AI Architecture · AI Ops · LLM orchestration

0 likes · 4 min read

Instant Consumer Technology Team

Sep 3, 2025 · Artificial Intelligence

Why Context Modeling Could Replace RAG – Insights from DeepVista CEO Jing Conan Wang

In a two‑hour interview, DeepVista CEO Jing Conan Wang explains how his new "context modeling" paradigm addresses the rigidity, lack of personalization, and performance limits of current RAG‑based AI agents, proposing a dual‑model architecture that learns and adapts context dynamically for faster, more accurate results.

AI Architecture · LLM Optimization · Personalized AI

0 likes · 15 min read

AI Algorithm Path

Aug 8, 2025 · Artificial Intelligence

GPT‑5 Is Here: In‑Depth Technical Walkthrough of Architecture, Features, and Benchmarks

OpenAI’s GPT‑5, released on August 7, 2025, introduces a unified system with real‑time routing, context windows of up to 400K tokens, multiple model families, refined safety mechanisms, new API controls, and benchmark results that show it surpasses GPT‑4 across intelligence, coding, instruction following, function calling, and multimodal tasks.

AI Architecture · API · GPT-5

0 likes · 9 min read

Alibaba Cloud Developer

Jul 29, 2025 · Artificial Intelligence

How to Transform Chaotic AI Prompts into Robust System Designs

This article examines the pitfalls of rule‑heavy prompt engineering, introduces a systematic four‑layer architecture for AI prompts, outlines six practical compilation principles, and demonstrates how to rewrite a tangled prompt into a clear, maintainable, and scalable system blueprint.

AI Architecture · LLM · Prompt Engineering

0 likes · 84 min read

AI Frontier Lectures

Jul 24, 2025 · Artificial Intelligence

State Space Models vs Transformers: Uncovering the Real Trade‑offs in Sequence Modeling

This article analyzes the fundamental differences between state space models (SSM) and Transformer architectures, highlighting their three core components, training efficiency, memory handling, tokenization impact, and empirical performance trade‑offs, and argues why SSMs can outperform Transformers on many sequence tasks.

AI Architecture · Tokenization · Transformers

0 likes · 19 min read

Architect

Jul 11, 2025 · Artificial Intelligence

How OpenAI’s Zero‑Vector Agentic RAG Redefines AI Knowledge Retrieval

OpenAI’s new non‑vectorized Agentic RAG approach replaces traditional vector search with a hierarchical, multi‑round content selection process, leveraging large‑context models like GPT‑4.1‑mini for efficient document loading, dynamic navigation, and accurate answer generation, while outlining model selection strategies, cost trade‑offs, and production considerations.

AI Architecture · Agentic Retrieval · RAG

0 likes · 15 min read

Data Thinking Notes

Jun 24, 2025 · Artificial Intelligence

Anthropic’s Multi‑Agent Research System: Architecture, Lessons & 90% Performance Boost

Anthropic’s detailed post explains how its new Research feature uses a multi‑agent architecture with a lead coordinator and parallel sub‑agents, covering design principles, prompt engineering tricks, evaluation methods, production reliability challenges, and the substantial performance gains achieved over single‑agent baselines.

AI Architecture · LLM research · Multi-Agent Systems

0 likes · 21 min read

Alibaba Cloud Developer

Jun 10, 2025 · Artificial Intelligence

How AI Application Architectures Evolve: From Simple LLM Calls to Guardrails, Routing, and Agents

This article traces the evolution of AI application architectures—from the earliest minimal user‑LLM interaction to advanced designs featuring context enhancement, input/output guardrails, intent routing, model gateways, caching strategies, agent capabilities, monitoring, and inference performance optimizations—providing practical insights and references for developers.

AI Architecture · Agent · Caching

0 likes · 21 min read

ITFLY8 Architecture Home

Jun 9, 2025 · Artificial Intelligence

What Are Foundation Agents? A Deep Dive into Next‑Gen AI Architectures

This article reviews the 2025 "Advances and Challenges in Foundation Agents" paper, defining the Foundation Agent concept, detailing its seven core components, exploring self‑evolution, multi‑agent collaboration, and the safety and alignment challenges required to build trustworthy, autonomous AI systems.

AI Architecture · Foundation Agents · Multi-Agent Systems

0 likes · 16 min read

ITFLY8 Architecture Home

Jun 5, 2025 · Artificial Intelligence

Why Large Models Are Redefining Software: The Four AI Tech Drivers

The article explains how rapid AI advances and the AI Agent architecture are reshaping software development, outlines four key technical drivers (embedding, Transformer scaling laws, scenario Moore's law, and LLM OS), and discusses the security, professionalism, and responsibility challenges enterprises face when deploying AI‑native applications.

AI Architecture · Embedding · Enterprise AI

0 likes · 6 min read

Java Web Project

Jun 4, 2025 · Artificial Intelligence

Why DeepSeek V3 Stands Out: Architecture, Performance, and Open‑Source Edge

The article analyzes DeepSeek's rapid adoption, detailing its seven core models, the third‑generation MoE architecture, FP8 mixed‑precision training, 128K context window, benchmark superiority on MMLU/HumanEval/CMMLU, low training cost, and fully open‑source release, while also introducing a companion guide for developers.

AI Architecture · DeepSeek · FP8 training

0 likes · 9 min read

DataFunSummit

Jun 2, 2025 · Artificial Intelligence

Enterprise Knowledge Brain Powered by Large Models and Knowledge Graphs

This article explains how the rapid development of large language models and knowledge graph technologies creates new opportunities for enterprise knowledge management, outlines the challenges of massive unstructured data, describes the architecture and core data flow of a corporate knowledge brain, and showcases key technologies and real‑world applications.

AI Architecture · Data Integration · Enterprise AI

0 likes · 13 min read

Data Thinking Notes

Apr 20, 2025 · Artificial Intelligence

How Anthropic’s Model Context Protocol (MCP) Enables Seamless AI Integration

This article introduces Anthropic’s open Model Context Protocol (MCP), explaining its basic concepts, motivations, core architecture, components, and workflow, and shows how it standardizes and secures LLM interactions with external data sources, tools, and services.

AI Architecture · Anthropic · LLM integration

0 likes · 7 min read

Tencent Technical Engineering

Apr 14, 2025 · Artificial Intelligence

MCP Protocol: Technical Principles and Business Applications

The article examines the Model Context Protocol (MCP), detailing its microkernel‑based technical architecture, development timeline from Anthropic’s 2024 release to industry adoption, hands‑on implementation examples, and business use cases such as multi‑agent QQ robots, highlighting MCP’s potential to standardize AI tool integration across industries.

AI Applications · AI Architecture · Business Implementation

0 likes · 14 min read
