Tagged articles
186 articles
Page 1 of 2
DataFunSummit
May 18, 2026 · Artificial Intelligence

How Palantir’s Ontology‑Based Semantic Network Drove 85% Growth and Zero Churn

Palantir’s Q1 2026 revenue jumped 85% while many AI firms saw valuations collapse, and the company attributes its success to replacing cheap‑token LLM wrappers with a deep ontology‑driven semantic network that secures high‑risk AI deployments, creates a durable moat, and delivers unprecedented net‑retention.

AI Infrastructure · Competitive Landscape · Enterprise AI
0 likes · 10 min read
Architects' Tech Alliance
May 14, 2026 · Artificial Intelligence

Jensen Huang’s China Visit: Could It Revive GPU Prospects? Inside Nvidia’s DGX H200 Cluster Design

The article reviews the US‑approved export of Nvidia's DGX H200, the lack of deliveries, Jensen Huang’s surprise China trip that may speed approvals, and then provides a detailed technical breakdown of the DGX H200 cluster’s compute and storage networking, topology, optical link choices, and cable count estimates.

AI Infrastructure · DGX H200 · Data Center Networking
0 likes · 8 min read
21CTO
May 13, 2026 · Artificial Intelligence

Is AI Entering a Self‑Evolving Era? Baidu’s Robin Li Introduces the Daily Active Agents (DAA) Metric

Robin Li, CEO of Baidu, proposes Daily Active Agents (DAA) as the new AI‑era metric, arguing that it reflects platform value better than token counts or DAU because it counts how many agents deliver results, and outlines a three‑layer evolution of agents, individuals, and organizations supported by full‑stack AI infrastructure.

AI Infrastructure · AI ecosystem · AI evolution
0 likes · 10 min read
DataFunSummit
May 13, 2026 · Artificial Intelligence

From RAG to Ontology: Palantir’s Semantic Network Drives 85% Growth and Zero Churn

Amid rapidly commoditized large‑model capabilities, Palantir achieved an 85% YoY revenue surge and zero churn by replacing generic RAG approaches with a deep enterprise ontology that unifies business semantics, creating a durable infrastructure moat while other AI firms see valuation collapse.

AI Infrastructure · Enterprise AI · Ontology
0 likes · 11 min read
Machine Heart
May 8, 2026 · Industry Insights

How SGLang’s $100M Seed Funding Powers the Next‑Gen Open AI Infrastructure

RadixArk raised a $100 million seed round backed by top hardware and AI investors to turn the open‑source SGLang inference engine and the Miles RL framework into day‑0 standards, aiming to democratize AI infrastructure and eliminate bottlenecks from training to inference.

AI Infrastructure · DeepSeek-V4 · Hardware‑agnostic AI
0 likes · 10 min read
Machine Heart
May 7, 2026 · Industry Insights

Elon Musk Disbands xAI and Allocates 220,000 GPUs to Anthropic

Elon Musk announced the dissolution of xAI, merging its Grok model and X‑related assets into a new SpaceXAI division, while simultaneously granting Anthropic access to over 220,000 Nvidia GPUs and more than 300 MW of compute to boost Claude’s performance and limits.

AI Infrastructure · Anthropic · Claude
0 likes · 6 min read
ZhiKe AI
May 6, 2026 · Industry Insights

How WorldClaw Enables AI Agents to Pay On-Chain with Stablecoins

WorldClaw's new WorldRouter lets AI agents settle model‑calling fees on Solana or BNB Chain using the USD1 stablecoin, offering a unified gateway to 300+ models at 30% lower cost while introducing programmable wallets and on‑chain auditability to solve the agent‑authorization bottleneck.

AI Infrastructure · WLFI · WorldClaw
0 likes · 11 min read
Machine Heart
May 5, 2026 · Artificial Intelligence

Musk’s 550K Nvidia GPUs Achieve Only 11% Utilization – Like Running 60K GPUs

xAI’s massive fleet of roughly 550,000 Nvidia H100 and H200 GPUs in its Memphis and Colossus data centers is operating at a mere 11% model FLOPs utilization, highlighting how scaling to hundreds of thousands of GPUs creates coordination, network, and scheduling bottlenecks that waste most of the hardware’s compute power.

AI Infrastructure · GPU utilization · Nvidia H100
0 likes · 5 min read
AI Engineering
May 4, 2026 · Artificial Intelligence

Why the Big‑Model Race Is Over: Where Real Value Lies in AI Infrastructure

The article argues that the competition over which large language model will dominate is outdated, explaining that true value now comes from building multi‑model routing, context engineering, standardized tool protocols, intelligent orchestration, and robust evaluation layers that turn models into reliable AI infrastructure.

AI Infrastructure · MCP · Orchestration
0 likes · 6 min read
AI Explorer
May 2, 2026 · Backend Development

Building a High‑Concurrency DeepSeek Middleware with Go

The ds2api project, written in Go, offers a high‑concurrency, plugin‑based middleware that standardizes and converts various AI model APIs into DeepSeek‑compatible requests, delivering tens of thousands of conversions per second with millisecond latency and a simple three‑step setup.

AI Infrastructure · DeepSeek · Go
0 likes · 6 min read
High Availability Architecture
Apr 30, 2026 · Artificial Intelligence

Redefining the Backend: How Workers, Triggers, and Functions Turn Agents into First-Class Workers

The article argues that the traditional separation between AI agent harnesses and back‑ends creates debugging complexity, and proposes redefining the backend with three primitives—worker, trigger, and function—so that agents become equivalent to services or queues, enabling real‑time discovery, scalable extensibility, and unified observability across heterogeneous components.

AI Infrastructure · Agent Architecture · backend primitives
0 likes · 18 min read
AI Explorer
Apr 29, 2026 · Industry Insights

SenseTime’s ‘Big Device’ Powers the Leap of Chinese AI from Usable to Practical

The article explains how DeepSeek V4’s delayed launch was a strategic move to fully adapt to Huawei’s Ascend chips, with SenseTime’s ‘Big Device’ acting as middleware that fine‑tunes hardware‑level scheduling, enabling million‑token contexts and bringing Chinese AI performance closer to Nvidia‑based systems, while noting remaining throughput challenges.

AI Infrastructure · Chinese AI · DeepSeek-V4
0 likes · 7 min read
Java Tech Enthusiast
Apr 27, 2026 · Operations

Earn 30K CNY/month Guarding DeepSeek’s Data Center on the Mongolian Grasslands

DeepSeek is hiring senior data‑center operations and delivery managers to run its new facility in Ulanqab, Inner Mongolia, offering a monthly salary of 30K CNY and emphasizing a strategy that shifts from algorithmic innovation to low‑cost, high‑efficiency physical infrastructure to support its upcoming V4 trillion‑parameter model.

AI Infrastructure · Data center · DeepSeek
0 likes · 5 min read
DataFunSummit
Apr 25, 2026 · Big Data

AI‑Era Multimodal Data Lake Infrastructure: TBDS Design, Storage, Compute, and Governance

The article analyzes how Tencent Cloud's TBDS platform tackles the AI era's multimodal data lake challenges through a native storage format (Lance), elastic Ray‑based compute, standardized metadata with Gravitino, and automated governance via Lakekeeper, citing architecture details, performance numbers, and real‑world deployments.

AI Infrastructure · Big Data · Gravitino
0 likes · 13 min read
DevOps in Software Development
Apr 21, 2026 · Industry Insights

Can Chinese Tokens Power a Self‑Sufficient AI Ecosystem?

The article argues that China’s AI future depends on a three‑part formula—Chinese models, Chinese GPUs, and Chinese green power—to build an open, distributed infrastructure that reduces reliance on Western super‑brain clouds and creates a sustainable, cost‑effective AI supply chain.

AI Infrastructure · AI ecosystem · Chinese Tokens
0 likes · 9 min read
IT Services Circle
Apr 19, 2026 · Industry Insights

Why DeepSeek Is Moving Its AI Heart to the Mongolian Grasslands

DeepSeek’s latest hiring push reveals a strategic shift from algorithmic research to building and operating a high‑efficiency data center in Inner Mongolia’s Ulanqab, leveraging low‑temperature climate and existing cloud infrastructure to cut TCO, while gearing up for the upcoming V4 trillion‑parameter model.

AI Infrastructure · Data center · DeepSeek
0 likes · 5 min read
Machine Heart
Apr 18, 2026 · Industry Insights

DeepSeek’s First Fundraise: $100B Valuation and $300M Target Amid Talent Exodus

DeepSeek, the Chinese AI startup behind the high‑efficiency DeepSeek‑R1 model, is reportedly seeking at least $300 million at a $100 billion valuation, while shifting to building its own data‑center infrastructure and seeing key researchers depart for rivals, signaling a new financing and operational phase for the company.

AI Infrastructure · AI financing · DeepSeek
0 likes · 6 min read
DataFunSummit
Apr 15, 2026 · Artificial Intelligence

How Relax Powers Scalable Multi‑Modal RL Training with Full Asynchrony

Relax, an open‑source RL training engine built on Megatron‑LM and SGLang, tackles data heterogeneity, system fragility, and role coupling by using a service‑oriented fault‑tolerant architecture, asynchronous pipelines, and multimodal‑native support, achieving up to 76% end‑to‑end speedup over veRL.

AI Infrastructure · Distributed Systems · RL training
0 likes · 11 min read
Machine Heart
Apr 11, 2026 · Industry Insights

OpenAI’s Stargate Project Faces Leadership Exodus and Security Incident

After a Molotov cocktail was thrown at Sam Altman's home, OpenAI’s Stargate initiative saw a wave of senior executive departures, a strategic pivot from self‑built data centers to partner‑driven cloud resources, massive funding commitments, and the suspension of its UK expansion, underscoring the turmoil in the AI infrastructure race.

AI Infrastructure · Data Centers · Leadership turnover
0 likes · 10 min read
SuanNi
Apr 10, 2026 · Artificial Intelligence

How Claude Managed Agents Remove the Infrastructure Burden for Enterprise AI

Claude Managed Agents provide pre‑built sandbox, orchestration, and session layers that let developers launch production‑grade AI agents in days instead of months, cutting costs and boosting success rates; the article also presents real‑world enterprise case studies.

AI Infrastructure · Automation · Claude
0 likes · 8 min read
Top Architecture Tech Stack
Apr 10, 2026 · Artificial Intelligence

How Claude Managed Agents Slash Agent Development Costs by 500×

Claude Managed Agents, Anthropic's new hosted execution layer, eliminates the infrastructure headaches of building AI agents by providing sandboxing, state persistence, error recovery, and orchestration, enabling developers to create complex, long‑running agents with dramatically lower cost and effort.

AI Infrastructure · Anthropic · Claude
0 likes · 12 min read
AI Architecture Hub
Apr 10, 2026 · Artificial Intelligence

How Claude Managed Agents Turn AI Assistants into Production-Ready Cloud Workers

Claude Managed Agents, Anthropic's cloud‑hosted AI agent service, lets enterprises embed autonomous bug‑fixing, code‑writing, and reporting bots without building heavy infrastructure, offering managed runtimes, scalable sessions, and API integration while highlighting use‑case categories, architectural design, limitations, and industry impact.

AI Agents · AI Infrastructure · Anthropic
0 likes · 11 min read
Big Data Tech Team
Apr 9, 2026 · Industry Insights

Why Data Engineers Are the New AI Powerhouses: 4 Core Reasons & Actionable Tips

The article analyzes why data development engineers are becoming more valuable in the AI era, outlining four core reasons—including data‑driven AI limits, the rise of RAG architectures, heightened data compliance, and a talent shortage—while offering concrete advice on mastering real‑time pipelines, unstructured data, and AI infrastructure.

AI Infrastructure · Big Data · RAG
0 likes · 8 min read
Design Hub
Mar 28, 2026 · Artificial Intelligence

Why Harness Engineering Is Emerging as a New Kind of Company

The AI community is shifting its focus from model performance to building runnable, observable, and scalable agent systems, a trend illustrated by the rise of Harness Engineering, Open Agents Company, and Agent Matrix across X discussions, GitHub projects, and developer meetups.

AI Agents · AI Infrastructure · Agent Matrix
0 likes · 14 min read
Machine Learning Algorithms & Natural Language Processing
Mar 28, 2026 · Artificial Intelligence

Junyang Lin’s 10k‑Word Review: From Reasoning to Agentic Thinking in Large Models

In a detailed post‑departure analysis, Junyang Lin reviews two years of large‑model evolution, explains how o1 and DeepSeek‑R1 highlighted the limits of pure reasoning, and argues that the next breakthrough lies in agentic thinking that integrates environment interaction, tool use, and robust reinforcement‑learning infrastructure.

AI Infrastructure · Model Evaluation · agentic thinking
0 likes · 18 min read
Alibaba Cloud Big Data AI Platform
Mar 24, 2026 · Artificial Intelligence

How Hologres + Mem0 Deliver Low‑Cost, High‑Performance Long‑Memory for LLMs

This article explains how the combination of Hologres, a unified real‑time data warehouse, and Mem0, an open‑source LLM memory framework, overcomes the limited context window of large language models by providing scalable, low‑latency, and cost‑effective long‑term memory for AI applications.

AI Infrastructure · Hologres · LLM
0 likes · 11 min read
AI Explorer
Mar 19, 2026 · Industry Insights

Nvidia Unveils Physical AI Infrastructure: Turning Virtual Thinkers into Real-World Actors

At GTC 2026, Nvidia introduced a comprehensive physical AI platform built on the upgraded Omniverse, aiming to bridge virtual simulations with real-world robotics, industrial automation, and autonomous vehicles, positioning the company as a systemic infrastructure provider for the emerging AI‑driven manufacturing era.

AI Infrastructure · Digital Twin · Industrial Robotics
0 likes · 5 min read
AI Explorer
Mar 16, 2026 · Artificial Intelligence

HyperOffload: A New Storage Paradigm Aiming to Break the AI Memory Wall

HyperOffload, a joint effort by Shanghai Jiao Tong University and Huawei’s MindSpore team, proposes a dynamic tensor offloading system that moves data between GPU memory, CPU RAM, and SSDs, aiming to overcome the “memory wall” that limits trillion‑parameter AI model training and deployment.

AI Infrastructure · AI memory wall · GPU Memory Management
0 likes · 6 min read
JD Cloud Developers
Mar 16, 2026 · Operations

Why Traditional Monitoring Fails for AI Supercomputing and How to Build Next‑Gen Intelligent Monitoring

In the era of hundred‑thousand‑GPU clusters and trillion‑parameter models, conventional monitoring can no longer rely on simple alerts; it must become an observability system that quantifies training and inference performance, breaks data silos across data centers, servers, and networks, and provides business‑aware insights for AI infrastructure.

AI Infrastructure · large models
0 likes · 10 min read
Black & White Path
Mar 13, 2026 · Information Security

Beware: Generative AI as a New Cybercrime Ally—13 Enterprise Attack Vectors

The article analyzes how generative AI is transforming cybercrime by enabling 13 distinct attack methods—from highly personalized phishing emails and AI‑assisted malware creation to automated vulnerability hunting, deep‑fake social engineering, malicious LLMs, and attacks on AI infrastructure—highlighting recent research data and real‑world examples that illustrate the heightened speed, stealth, and accessibility of modern threats.

AI Infrastructure · LLM Security · cybercrime
0 likes · 13 min read
AI Explorer
Mar 12, 2026 · Industry Insights

Nvidia’s $26B Bet on Open‑Source AI Models: Redefining the Industry’s Foundations

Nvidia is committing $26 billion to open‑source AI models, shifting from a pure hardware supplier to shaping the entire AI stack—from chips and system software to frameworks and applications—while raising questions about ecosystem lock‑in, competition with newcomers like DeepSeek, and the future of AI infrastructure.

AI Infrastructure · AI ecosystem · AI strategy
0 likes · 7 min read
AI Explorer
Mar 11, 2026 · Industry Insights

Why AI Is Humanity’s Largest Infrastructure Project, Not Just an App

Jensen Huang argues that AI is a five‑layer infrastructure—from energy and chips to data centers, models and applications—forming the biggest construction effort in human history, reshaping jobs, demanding new technical talent, and accelerating growth through open‑source models.

AI Infrastructure · AI ecosystem · Data Centers
0 likes · 10 min read
Didi Tech
Mar 11, 2026 · Cloud Native

How Huatuo Now Monitors MetaX GPUs for Cloud‑Native AI Workloads

Huatuo, the open‑source deep‑observability platform backed by Didi, now supports real‑time monitoring of MetaX GPUs, offering detailed hardware metrics via Docker or Kubernetes deployments and exposing them through a /metrics endpoint for cloud‑native AI and operations use cases.

AI Infrastructure · Cloud Native · GPU monitoring
0 likes · 4 min read
AI Info Trend
Mar 6, 2026 · Industry Insights

Why AI Is Becoming the New Utility: Key Insights from Deloitte’s 2026 Tech Trends

Deloitte’s 2026 Technology Trends report reveals AI’s shift from experimental labs to essential infrastructure, outlines five major trends—including physical AI, AI agents, hybrid AI infrastructure, AI‑native organizations, and AI‑driven security—and offers actionable steps for enterprises to seize the emerging growth window.

AI · AI Infrastructure · Digital Transformation
0 likes · 8 min read
PaperAgent
Mar 5, 2026 · Artificial Intelligence

Bridging Agent Runtime and RL: Inside the Claw‑R1 Training Framework

Claw‑R1, a new reinforcement‑learning framework from the USTC Cognitive Intelligence Lab, integrates the OpenClaw Agent Runtime with RL training to enable agents to learn directly in real environments, addressing the gap between simulated tasks and true tool‑calling, multi‑step reasoning, and stable long‑task execution.

AI Infrastructure · Claw-R1 · OpenClaw
0 likes · 10 min read
SuanNi
Feb 27, 2026 · Artificial Intelligence

How Dual‑Channel Loading Doubles LLM Inference Throughput

The article analyzes the storage‑bandwidth bottleneck of agent‑style large language models, explains why traditional pre‑fill and decode architectures underutilize network resources, and details a dual‑channel loading and smart scheduling design that unlocks idle bandwidth, achieving up to 1.9× higher throughput in both offline and online inference workloads.

AI Infrastructure · Dual-Channel Loading · Inference Optimization
0 likes · 14 min read
Machine Learning Algorithms & Natural Language Processing
Feb 27, 2026 · Artificial Intelligence

Can DeepSeek’s DualPath Break GPU Bottlenecks and Ignite an Agentic AI Surge?

DeepSeek’s new DualPath inference framework, co‑developed with leading Chinese universities, decouples compute from KV‑Cache memory access to eliminate I/O stalls in multi‑round agentic workloads, delivering up to nearly 2× higher throughput and dramatically reducing job‑completion time across several large‑scale LLMs.

AI Infrastructure · Agentic Inference · DeepSeek
0 likes · 13 min read
Tencent Technical Engineering
Feb 27, 2026 · Artificial Intelligence

What Will AI Look Like in 2026? Insights from 8 Tech Giants

This article compiles and analyzes 2026 AI trend reports from eight leading technology companies, highlighting key themes such as AI agents, infrastructure, application scenarios, safety regulations, quantitative metrics, and shared consensus points to forecast the next phase of AI development.

2026 predictions · AI Agents · AI Governance
0 likes · 14 min read
Black & White Path
Feb 26, 2026 · Information Security

13 Ways Attackers Leverage Generative AI to Exploit Systems

The article outlines thirteen distinct techniques by which cybercriminals exploit generative AI—from hyper‑personalized phishing and AI‑driven malware creation to AI‑coordinated espionage, deep‑fake social engineering, and attacks on AI infrastructure—backed by expert quotes, research findings, and concrete case studies.

AI Agents · AI Infrastructure · attack vectors
0 likes · 14 min read
Design Hub
Feb 16, 2026 · Industry Insights

Three AI Industry Shifts in Feb 2026: Open‑Source, Talent, and Infrastructure

In February 2026 three pivotal AI developments—OpenAI hiring OpenClaw founder Peter Steinberger, Alibaba unveiling the trillion‑parameter Qwen3‑Max‑Thinking model, and Cloudflare launching Markdown for Agents—illustrate how open‑source collaboration, talent mobility, and AI‑native infrastructure are reshaping the sector.

AI Agents · AI Infrastructure · Cloudflare
0 likes · 14 min read
JD Tech Talk
Jan 30, 2026 · Artificial Intelligence

How JD’s 9N‑LLM Engine Powers Scalable Generative Recommendation at Billion‑Scale

This article details JD Retail’s 9N‑LLM unified training engine, explaining the background of generative recommendation, the challenges of massive sparse and dense parameters, and the multi‑framework, multi‑hardware solutions—including efficient sample processing, large‑scale sparse embedding, dense scaling, UniAttention acceleration, and reinforcement‑learning integration—that enable industrial‑scale deployment.

AI Infrastructure · Generative Recommendation · Large-Scale Training
0 likes · 26 min read
Tencent Technical Engineering
Jan 23, 2026 · Artificial Intelligence

Unlocking AI Infra: Distributed Inference, PD Separation, TileLang, and Next‑Gen Agent Infrastructure

This article surveys the 2025 AI infrastructure landscape, covering distributed inference with PD‑separation, dynamic DOPD scheduling, AFD attention‑FFN disaggregation, high‑bandwidth cross‑machine communication libraries, the TileLang programming model, RL train‑inference decoupling via SeamlessFlow, and secure, low‑latency agent infra designs for future large‑scale models.

AI Infrastructure · Agent Systems · Distributed inference
0 likes · 27 min read
AI Engineering
Jan 23, 2026 · Industry Insights

vLLM Core Team Launches Inferact, Secures $150M Seed Funding

The vLLM core maintainers have founded Inferact, raised a $150 million seed round led by Andreessen Horowitz and Lightspeed, and highlighted escalating inference challenges, the project's ecosystem dominance, and a continued commitment to open‑source development.

AI Infrastructure · Inferact · LLM inference
0 likes · 3 min read
Alibaba Cloud Developer
Jan 6, 2026 · Artificial Intelligence

How Tair‑KVCache‑HiSim Simulates LLM Inference 390 000× Faster with <5% Error

This article explains the design, challenges, and high‑fidelity architecture of Tair‑KVCache‑HiSim, a simulation tool that models multi‑level KV‑Cache behavior for large‑language‑model inference, predicts latency, throughput and cost under SLO constraints, and validates its predictions against real GPU deployments with sub‑5% error.

AI Infrastructure · Cost Optimization · KVCache
0 likes · 32 min read
Baidu Intelligent Cloud Tech Hub
Jan 5, 2026 · Artificial Intelligence

How Baidu Tianchi Supernodes Supercharge Large‑Model Inference: Architecture, Deployment, and Optimization

This article details Baidu's Tianchi supernode design and software tuning—covering hardware scale‑up, deployment planning, Prefill and Decode stage optimizations, quantization strategies, and communication schemes—to dramatically boost large‑model inference throughput and latency while lowering token‑cost.

AI Infrastructure · Parallelism · Performance Optimization
0 likes · 20 min read
Fighter's World
Jan 2, 2026 · Artificial Intelligence

How AI Agents Are Redefining Systems of Record into Decision‑Making Engines

The article argues that AI agents will transform traditional Systems of Record, which only store outcomes, into next‑generation decision‑capturing Systems of Action by introducing event‑driven Context Graphs, addressing blind spots, technical challenges, and outlining strategic business paths for this paradigm shift.

AI Agents · AI Infrastructure · Context Graph
0 likes · 30 min read
Fighter's World
Dec 26, 2025 · Industry Insights

Where Is AI Heading in 2026 After the 2025 Sprint?

The article analyzes the rapid weekly turnover of leading LLM benchmarks in 2025, declining compute costs, the shift from chatbots to multi‑step agents, the widening pilot‑to‑production gap, and predicts that 2026 will be defined by infrastructure constraints, AI‑first product design, and accelerated enterprise adoption.

AI Infrastructure · AI product strategy · AI trends
0 likes · 25 min read
Alibaba Cloud Developer
Dec 24, 2025 · Artificial Intelligence

Boosting LLM Inference: RoleBasedGroup & Mooncake for Stable, High‑Performance Service

Large language model inference faces memory pressure, but by externalizing KVCache with Mooncake and orchestrating roles via the Kubernetes‑native RoleBasedGroup (RBG), developers can achieve stable, high‑throughput, cost‑effective serving with seamless in‑place upgrades and topology‑aware performance.

AI Infrastructure · KVCache · Kubernetes
0 likes · 21 min read
Baidu Intelligent Cloud Tech Hub
Dec 24, 2025 · Artificial Intelligence

How Context Parallelism Slashes LLM First‑Token Latency by 80% for 128K Tokens

The article explains how the newly merged Context Parallelism (CP) technique in SGLang, combined with DeepSeek V3.2's Sparse Attention architecture, reduces first‑token latency by up to 80% and alleviates memory pressure for ultra‑long 128K‑token sequences, detailing both algorithmic innovations and engineering solutions.

AI Infrastructure · Context Parallelism · Distributed inference
0 likes · 10 min read
Fighter's World
Nov 28, 2025 · Artificial Intelligence

Is Gemini 3 Pro Google’s New Starting Point? An In‑Depth Technical and Market Analysis

The article examines Google’s Gemini 3 Pro launch, highlighting its full‑stack vertical integration, advanced System 2 reasoning, dynamic compute budgeting, native multimodal architecture, TPU cost advantages, the Antigravity IDE platform, generative UI capabilities, and the strategic implications for Google’s AI ecosystem and competitive positioning.

AI Infrastructure · Antigravity · Gemini 3 Pro
0 likes · 32 min read
Data Party THU
Nov 25, 2025 · Artificial Intelligence

What $47,000 Taught Us About Deploying Multi‑Agent AI Systems

After spending $47,000 running four LangChain agents in production, we reveal the hidden costs of A2A communication and Anthropic’s MCP, expose seven common deployment pitfalls, and argue that dedicated AI infrastructure is essential for scalable multi‑agent systems.

A2A communication · AI Infrastructure · Cost Optimization
0 likes · 13 min read
Baidu Intelligent Cloud Tech Hub
Nov 25, 2025 · Artificial Intelligence

Why DeepSeek‑V3.2‑Exp Lost Performance and How a Simple RoPE Fix Restored It

The Baidu Baige team discovered that DeepSeek‑V3.2‑Exp’s long‑context performance lagged behind the official report, traced the issue to a subtle RoPE layout mismatch in the open‑source inference demo, collaborated with DeepSeek to fix it, and verified that the model’s speed and accuracy fully recovered across multiple benchmarks.

AI Infrastructure · DeepSeek · LLM inference
0 likes · 9 min read
Baidu Intelligent Cloud Tech Hub
Nov 20, 2025 · Artificial Intelligence

Boost Multimodal Model Training Efficiency with Offline Sequence Packing and Mixed‑Modality Data

Baidu's Baige team introduces an extended multimodal data loader, automated ShareGPT format conversion, and offline sequence packing techniques that together double token throughput, speed up SFT training by up to six times, and improve GPU utilization and stability for large vision‑language models.

AI Infrastructure · AIAK · GPU efficiency
0 likes · 7 min read
Kuaishou Tech
Nov 12, 2025 · Artificial Intelligence

How KaiFG Lets Python Feature Engineering Run at C++ Speed

KaiFG, Kuaishou's self‑built AI Feature Generator, unifies fragmented feature extraction frameworks, replaces slow C++ compilation cycles with Python‑level development, and achieves near‑C++ performance through Codon‑based compilation, reference‑counted memory management, and aggressive LLVM optimizations, dramatically shortening iteration time.

AI Infrastructure · High‑performance computing · feature engineering
0 likes · 14 min read
Baidu Intelligent Cloud Tech Hub
Nov 7, 2025 · Artificial Intelligence

From Big Data to 30,000‑GPU Clusters: The Evolution of China’s AI Infrastructure

In a deep interview, Baidu AI Computing chief scientist Wang Yanpeng and host Koji trace China's internet infrastructure from the early big‑data era through cloud computing to today's AI boom, highlighting the pivotal role of compute power, GPU acceleration, data scaling, and Baidu's Baige platform in shaping the AI arms race.

AI Infrastructure · Baidu Baige · GPU computing
0 likes · 26 min read
21CTO
Nov 4, 2025 · Cloud Computing

How OpenAI’s New Alliance with AWS Will Transform AI Computing

On November 3, OpenAI announced a strategic partnership with Amazon Web Services, committing $38 billion to run its AI workloads on AWS’s optimized infrastructure, including EC2 UltraServer GPU clusters, with plans to reach full capacity by the end of 2026, marking a shift from its previous Microsoft‑centric collaborations.

AI Infrastructure · AWS · NVIDIA GPUs
0 likes · 3 min read
DataFunTalk
Nov 4, 2025 · Cloud Computing

How OpenAI’s $38B Deal with AWS Will Transform AI Cloud Computing

OpenAI announced a multi‑year strategic partnership with Amazon Web Services, worth $38 billion, granting OpenAI access to AWS’s massive GPU‑powered EC2 UltraServers and scalable CPU resources to accelerate its generative AI workloads, while leveraging AWS’s security, performance, and cost advantages.

AI Infrastructure · AWS · OpenAI
0 likes · 5 min read
Alibaba Cloud Infrastructure
Oct 29, 2025 · Cloud Native

How Alibaba Cloud’s Container Stack Evolves for the AI Era

Alibaba Cloud’s container experts unveiled a comprehensive, AI‑focused upgrade across its cloud‑native stack—introducing AMD compute, dynamic scaling, AI‑native scheduling, secure execution environments, and advanced GPU profiling—to make containers the native foundation for AI workloads and accelerate enterprise AI adoption.

AI Infrastructure · GPU scheduling · container computing
0 likes · 9 min read
Architects' Tech Alliance
Oct 27, 2025 · Artificial Intelligence

How AI Super Nodes Are Redefining Scalable AI Infrastructure

The article examines the emerging AI Super Node ecosystem, detailing its core concepts, four‑layer architecture, key enabling technologies, current challenges such as compatibility and energy consumption, and future directions like quantum‑classical hybrids and green low‑carbon designs, illustrating how it overcomes scaling bottlenecks in modern AI deployments.

AI Infrastructure · Edge Computing · Secure AI
0 likes · 13 min read
Fighter's World
Oct 26, 2025 · Industry Insights

How Bitcoin Miners Are Turning Into AI Infrastructure Providers: An IREN Case Study

The article offers a comprehensive analysis of IREN's shift from Bitcoin mining to AI cloud services, detailing its dual‑engine business model, vertical integration advantages, ambitious 2025‑2028 roadmap, and the key supply‑chain, regulatory, execution, financial, and competitive risks it faces.

AI Infrastructure · Bitcoin mining · Data center engineering
0 likes · 23 min read
BirdNest Tech Talk
Oct 24, 2025 · Backend Development

Bridging Go and Python with pyproc: Ultra‑Low‑Latency Interprocess Calls

This article introduces pyproc, a library that lets Go applications invoke Python functions via Unix Domain Sockets with sub‑45 µs latency, explaining the problem of mixing Go and Python ecosystems, the architecture, performance benefits, suitable use cases, and a step‑by‑step quick‑start guide with full code examples.

AI Infrastructure · Go · Interprocess Communication
0 likes · 7 min read
DataFunTalk
Oct 15, 2025 · Artificial Intelligence

Why OpenAI’s Massive AI Infrastructure Bet Could Redefine Computing

The article analyzes OpenAI’s recent strategic partnerships and massive AI infrastructure investments, detailing multi‑gigawatt data‑center plans, chip collaborations, soaring energy demands, and the broader implications for AI as the next global infrastructure platform.

AI Infrastructure · AI chips · Compute
0 likes · 9 min read
DataFunSummit
Oct 8, 2025 · Artificial Intelligence

How EasyRec Boosts Recommendation Training and Inference Performance

This article explains the EasyRec recommendation system’s training and inference architecture, detailing optimization techniques such as embedding parallelism, CPU/GPU placement, XLA and TRT fusion, online learning pipelines, network compression, and real‑world deployment results that dramatically improve throughput and latency.

AI Infrastructure · EasyRec · Inference Optimization
0 likes · 15 min read
Fighter's World
Oct 7, 2025 · Industry Insights

How Many Digital Workers Could Future AI Deploy?

The article analyzes Epoch AI's token‑based framework for estimating AI‑generated digital workers, critiques its static assumptions, and proposes a dynamic, multi‑factor model that incorporates compute supply, hardware constraints, inference efficiency, task reliability, and economic value to forecast a wide range of possible future digital‑worker counts.

AI · AI Infrastructure · AI scaling
0 likes · 27 min read
Alibaba Cloud Infrastructure
Sep 26, 2025 · Artificial Intelligence

How Alibaba’s UPN512 Redefines AI Scale‑Up Networking with Optical Interconnects

The UPN512 whitepaper details Alibaba Cloud's next‑generation AI infrastructure network, explaining the shift from dense to MoE models, the rise of train‑and‑inference integration, xPU scale‑up challenges, and how high‑radix Ethernet with LPO/NPO optical interconnects delivers ultra‑high bandwidth, low latency, cost‑effective, and reliable large‑scale AI compute clusters.

AI Infrastructure · High‑performance computing · UPN512
0 likes · 34 min read
DevOps Cloud Academy
Sep 25, 2025 · Artificial Intelligence

How to Build Scalable MLOps Infrastructure for Enterprise AI Success

This article explains what MLOps is, why a robust MLOps framework is essential for businesses, outlines its core components, compares MLOps with AIOps, details the benefits of investing in MLOps, and provides a step‑by‑step guide to designing enterprise‑grade AI MLOps infrastructure.

AI Infrastructure · MLOps · Machine Learning Operations
0 likes · 17 min read
DataFunTalk
Sep 24, 2025 · Artificial Intelligence

How OpenAI’s Quest for a Compute Empire Is Reshaping the AI Landscape

Within a single week, OpenAI secured a $300 billion Oracle cloud deal, loosened its exclusive tie‑up with Microsoft, announced massive AI infrastructure projects, and revealed its own chip development, highlighting a strategic shift toward building an independent compute empire amid mounting financial and competitive pressures.

AI Infrastructure · AI compute · Industry analysis
0 likes · 22 min read
DataFunSummit
Sep 18, 2025 · Artificial Intelligence

How We Scaled WeChat AI Services with Ray: Lessons from Million‑Node Deployments

This article examines how Tencent's WeChat team leveraged the Ray distributed computing framework within the Astra platform to tackle massive AI workloads, addressing challenges of scale, GPU diversity, operational complexity, and cost while outlining their architecture and practical insights.

AI Infrastructure · Astra Platform · Ray
0 likes · 6 min read
Architects' Tech Alliance
Sep 18, 2025 · Artificial Intelligence

How AI Model Training Is Redefining Data Center Scaling Strategies

Large‑scale AI model training now demands unprecedented bandwidth and latency performance, forcing data centers to adopt three scaling approaches—Scale‑up, Scale‑out, and Scale‑Across—while leveraging optical I/O, CPO, and optical circuit switching to overcome power, distance, and bandwidth limits.

AI Infrastructure · Scale‑Up · data center scaling
0 likes · 11 min read
Baidu Intelligent Cloud Tech Hub
Sep 9, 2025 · Artificial Intelligence

How Baidu Built a 32,000‑Card AI Super‑Compute Cluster and Boosted Efficiency by 50%

This article details Baidu Intelligent Cloud's journey in designing, constructing, and operating a 32,000‑card hybrid AI compute cluster, covering challenges in power, cooling, networking, multi‑cluster scheduling, and security, and explains how innovative hardware, software, and operational strategies improved model FLOPs utilization (MFU) by more than 50% and set industry‑first performance records.

AI Infrastructure · GPU clusters · hybrid cloud
0 likes · 15 min read
Alibaba Cloud Infrastructure
Aug 22, 2025 · Artificial Intelligence

Building Scalable AI Infrastructure: Insights from Alibaba Cloud’s AI Tech Day

The AI Infra Solutions and Best Practices salon held by Alibaba Cloud in Beijing gathered technical leaders from leading AI companies to share comprehensive strategies on network, compute, and storage architectures that enable high‑efficiency, low‑latency, and elastic AI infrastructure for modern enterprise workloads.

AI Infrastructure · AI Ops · Storage Solutions
0 likes · 7 min read
Architects' Tech Alliance
Aug 18, 2025 · Artificial Intelligence

How Large Model Training Dominates Compute and What New Techniques Can Change It

This article explains why pre‑training large AI models consumes 90‑99% of total compute, describes the full training and inference pipelines, introduces resource‑saving strategies such as prefill‑decode (PD) separation, and reviews market trends and infrastructure challenges shaping the next generation of AI systems.

AI Infrastructure · AI training · GPU architecture
0 likes · 13 min read
Baobao Algorithm Notes
Aug 11, 2025 · Industry Insights

Why AI Infrastructure Must Be Close to Models and Hardware – Insights from Zhu Yibo

In a WAIC 2025 interview, Zhu Yibo, co‑founder of StepFun (Jieyue Xingchen), shares deep insights on AI infrastructure, covering its evolution, the need for tight model‑hardware co‑design, cost‑efficiency metrics, industry challenges, and future directions for large‑scale AI systems.

AI Infrastructure · Hardware Optimization · industry insights
0 likes · 36 min read
Architects' Tech Alliance
Aug 2, 2025 · Artificial Intelligence

How China’s Computing‑Power Strategy Is Powering the AI Future

China’s computing‑power industry is rapidly maturing as national policies, massive infrastructure investments, and domestic chip development converge to create a strategic high‑ground that fuels AI, data centers, and digital‑economy transformation, with clear upstream, mid‑stream, and downstream value chains.

AI Infrastructure · China policy · Data Centers
0 likes · 9 min read
DataFunTalk
Jul 25, 2025 · Artificial Intelligence

How the U.S. AI Action Plan Aims to Lead the Global AI Race

The U.S. AI Action Plan outlines a three‑pillar strategy—accelerating AI innovation, building robust AI infrastructure, and asserting leadership in international AI diplomacy and security—to secure America’s technological dominance, protect national interests, and ensure AI benefits American workers and society.

AI Governance · AI Infrastructure · AI competition
0 likes · 44 min read
AI Info Trend
Jul 24, 2025 · Industry Insights

What’s Driving AI Adoption in 2025? Six Key Trends Uncovered

The AI Adoption Survey H1 2025 reveals that nearly half of organizations have deployed AI in production, engineering and R&D lead usage, Chinese LLMs gain overseas interest, and cost, reliability and intelligence remain the top challenges, while tool preferences and multimodal trends reshape the market.

AI Infrastructure · AI adoption · AI trends
0 likes · 7 min read
Architects' Tech Alliance
Jul 22, 2025 · Artificial Intelligence

Will AI Backend Networks Exceed $100 B in Spending by 2029? The Ethernet Surge Explained

Driven by exploding AI workloads, the data‑center networking landscape is shifting toward four distinct networks—Compute Fabric, Backend, Front‑end, and DCI—with forecasts showing AI backend network spend surpassing $100 billion by 2029, Ethernet outpacing InfiniBand, and massive port‑speed upgrades reshaping the market.

AI · AI Infrastructure · Backend
0 likes · 9 min read
Tencent Technical Engineering
Jul 18, 2025 · Artificial Intelligence

From CPUs to GPUs: How Traditional Backend Skills Power Modern AI Infrastructure

This article traces the evolution of AI infrastructure, compares it with traditional backend systems, and shows how established backend methodologies apply to the shift to GPU-centric hardware, software adaptations such as deep learning frameworks, and the engineering challenges of model training and inference.

AI Infrastructure · Deep Learning · GPU computing
0 likes · 19 min read
Volcano Engine Developer Services
Jul 17, 2025 · Artificial Intelligence

How Distributed KVCache (EIC) Revolutionizes Large‑Model Inference Performance

This article examines how Volcano Engine's Elastic Instant Cache (EIC) tackles the memory bottleneck, high‑concurrency latency, and cross‑node coordination challenges of large language model inference by decoupling storage and computation, pooling resources, and applying layered optimizations, ultimately boosting AI inference efficiency, scalability, and cost‑effectiveness across various deployment scenarios.

AI Infrastructure · KVCache · LLM inference
0 likes · 30 min read
Tencent Cloud Developer
Jul 17, 2025 · Artificial Intelligence

Why GPUs Are the New CPUs: Unpacking AI Infrastructure Challenges

This article explores how AI infrastructure has shifted from CPU‑centric designs to GPU‑driven architectures, detailing hardware evolution, software changes, and the engineering challenges of large‑model training and inference, while offering practical insights for traditional backend engineers transitioning to AI systems.

AI Infrastructure · Deep Learning · GPU computing
0 likes · 16 min read
DataFunTalk
Jul 15, 2025 · Artificial Intelligence

Inside Scale AI: How a Data‑Labeling Startup Became a $29 B AI Powerhouse

This investigative article traces Scale AI’s evolution from an MIT dropout’s data‑annotation startup to a $29 billion AI infrastructure leader, detailing its founder Alexandr Wang, core products, government contracts, competitive advantages, and the strategic shift toward defense‑focused AI solutions.

AI Infrastructure · Scale AI · Tech Industry
0 likes · 15 min read
Alibaba Cloud Infrastructure
Jul 11, 2025 · Cloud Native

How Alibaba Cloud’s AI Infra Innovations Are Transforming Kubernetes Workloads

This article summarizes Alibaba Cloud’s key technical contributions at KubeCon China 2025, covering AI‑focused Kubernetes optimizations, Argo Workflows enhancements, storage strategies for large models, Fluid’s data orchestration, multi‑tenant security, and the RoleBasedGroup framework for AI inference with prefill‑decode (PD) separation.

AI Infrastructure · Argo Workflows · Fluid
0 likes · 20 min read
Architects' Tech Alliance
Jun 29, 2025 · Artificial Intelligence

Scale-Up vs Scale-Out: Balancing Performance and Flexibility in AI Infrastructure

This article explains the technical definitions, core differences, and practical use cases of Scale‑Up and Scale‑Out networking in AI systems, highlighting how they impact latency, bandwidth, and cost, and illustrates their combined application through NVIDIA's NVL72 supernode case study.

AI Infrastructure · GPU networking · High‑performance computing
0 likes · 14 min read
IT Services Circle
Jun 23, 2025 · Artificial Intelligence

How the Emerging Computing Power Internet Will Transform AI and Data Services

The article explains the concept, background, definition, challenges, roadmap, and key application scenarios of China's Computing Power Internet, highlighting its role in unifying fragmented compute resources, enabling on‑demand AI services, and driving nationwide digital transformation.

AI Infrastructure · cloud computing · computing power internet
0 likes · 11 min read
AntTech
Jun 18, 2025 · Artificial Intelligence

How Ant Group’s Bailing Models Push Toward AGI with MoE and Multimodal Innovations

In a detailed AICon talk, Ant Group’s Bailing team leader Zhou Jun outlines their latest large‑model training techniques, MoE architecture optimizations, multimodal breakthroughs, open‑source releases, and the strategic roadmap needed to make AI an everyday assistant as ubiquitous as scanning a QR code.

AI Infrastructure · Mixture of Experts · Multimodal AI
0 likes · 25 min read
DataFunTalk
Jun 15, 2025 · Artificial Intelligence

Sam Altman Reveals the ‘Stargate’ AI Infrastructure Blueprint and Its $500B Future

In a Bloomberg Originals interview, OpenAI CEO Sam Altman discusses the massive “Stargate” infrastructure project, exploding demand for AI compute, multi‑partner collaborations, a projected $500 billion investment, GPU bottlenecks, and his vision for AI’s role in science, employment and humanity’s future.

AI Infrastructure · AI funding · AI future
0 likes · 25 min read
Baidu Intelligent Cloud Tech Hub
May 23, 2025 · Artificial Intelligence

How Baidu’s Kunlun Supernode Redefines AI Compute Density and Performance

This article explains how Baidu’s Kunlun supernode, built on high‑density liquid‑cooled cabinets and a modular 1U 4‑card design, breaks traditional 8‑card limits, boosts compute density four‑fold, improves power and cooling efficiency, and provides a scalable foundation for large‑model AI training and inference.

AI Infrastructure · GPU cluster · High‑performance computing
0 likes · 13 min read
AntData
May 20, 2025 · Artificial Intelligence

How Vector Retrieval Powers AI: Challenges, Solutions, and VSAG’s Open‑Source Breakthrough

The article examines the rapid growth of unstructured data, explains the fundamentals and resource‑intensive nature of vector retrieval, presents Ant Group’s engineering practices—including hybrid HNSW‑DiskANN indexing, performance tricks like BSA pruning and memory prefetching, sparse‑vector and feedback‑driven recall improvements—and outlines the open‑source VSAG roadmap and ecosystem integrations.

AI Infrastructure · Performance Optimization · Vector Retrieval
0 likes · 18 min read
AI Product Manager Community
May 20, 2025 · Industry Insights

How Nvidia Is Shaping the Future of AI Infrastructure and Physical AI

At Computex 2025 in Taipei, Nvidia CEO Jensen Huang outlined the company's shift from a chipmaker to an AI infrastructure leader, introduced the concept of physical AI, and detailed upcoming hardware, software, and strategic initiatives that could reshape data centers, robotics, and autonomous driving.

AI Infrastructure · Nvidia · Physical AI
0 likes · 7 min read
Architects' Tech Alliance
May 11, 2025 · Industry Insights

Why China’s Intelligent Computing Centers Are Poised for Explosive Growth

The article analyzes China’s rapid expansion of intelligent computing centers, covering market forecasts, government policies, regional deployment patterns, major telecom operators' strategies, the industry value chain, and future trends that together signal a sustained surge in AI‑focused compute infrastructure across the nation.

AI Infrastructure · China · HPC
0 likes · 11 min read
Fighter's World
May 2, 2025 · Industry Insights

Token Economics Reveals Nvidia’s New AI Factory Narrative

The article analyses Nvidia’s shift from a chip supplier to a full‑stack AI infrastructure provider called AI Factory, explains the token‑economics framework that measures intelligent output, details the hardware‑software stack and network fabric, quantifies token consumption of advanced agents, and evaluates the strategic opportunities and risks for Nvidia.

AI Factory · AI Infrastructure · Agentic AI
0 likes · 29 min read