The Single‑Agent Era Ends – Kimi K2.6 Scales to 300 Agents for Complex Tasks
This week's tech roundup covers the launch of Kimi K2.6, with a 300-agent swarm capability and major performance gains; DeepSeek V4's new sparse-attention architecture and pricing; Meshy's AI-3D partnership; a $455 M funding round for an AI-brain venture; Honor's record-breaking robot; M-Flow's cone-graph memory engine; and Vision Banana's unified visual model. Each item is backed by benchmark data and industry commentary.
Kimi released the open-source model K2.6 less than three months after K2.5, positioning it as the strongest code-generation and agent model in the open-source space. The core upgrade expands the previous Agent Swarm to support up to 300 parallel sub-agents and 4,000 collaborative steps, enabling end-to-end tasks such as building a complete hotel-booking website with front end, back end, authentication, and data persistence. In a local test the model ran for over 12 hours, invoked tools more than 4,000 times, and iterated through 14 rounds, raising throughput from ~15 tokens/s to ~193 tokens/s, roughly a 13-fold gain. The improvement of about 20 % over K2.5 that is quoted alongside these figures cannot describe the same pair of numbers and presumably uses a different baseline. Benchmarks on Humanity's Last Exam, SWE-Bench Pro and DeepSearchQA place K2.6 ahead of most closed-source competitors, including GPT-5.4, Claude Opus 4.6 and Gemini 3.1 Pro (Artificial Analysis). The model also makes more precise API calls and runs more stably over long sessions in frameworks such as OpenClaw and Hermes.
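The two swarm limits quoted above (up to 300 parallel sub-agents and 4,000 collaborative steps) can be pictured as a bounded fan-out. The following is a minimal sketch under stated assumptions, not Kimi's implementation: `run_subagent`, `run_swarm`, and the task names are hypothetical, and the sub-agent body is a stand-in that only yields control once per step.

```python
import asyncio

MAX_PARALLEL_AGENTS = 300   # parallel sub-agent cap quoted in the article
MAX_TOTAL_STEPS = 4000      # collaborative step budget quoted in the article


async def run_subagent(task: str, steps: int) -> str:
    """Hypothetical sub-agent: stands in for a real model/tool loop."""
    for _ in range(steps):
        await asyncio.sleep(0)  # yield control, simulating one step
    return f"done: {task}"


async def run_swarm(tasks: list[str], steps_per_task: int) -> list[str]:
    """Fan tasks out under a concurrency cap and a shared step budget.

    Assumes steps_per_task is a positive integer.
    """
    affordable = MAX_TOTAL_STEPS // steps_per_task  # tasks the budget covers
    accepted = tasks[:affordable]                   # the rest stay unscheduled
    gate = asyncio.Semaphore(MAX_PARALLEL_AGENTS)

    async def guarded(task: str) -> str:
        async with gate:  # at most MAX_PARALLEL_AGENTS run concurrently
            return await run_subagent(task, steps_per_task)

    return await asyncio.gather(*(guarded(t) for t in accepted))


if __name__ == "__main__":
    subtasks = [f"subtask-{i}" for i in range(500)]
    results = asyncio.run(run_swarm(subtasks, steps_per_task=10))
    print(len(results))  # 400: a 4,000-step budget covers 400 tasks of 10 steps
```

A semaphore is only one way to cap concurrency; a real orchestrator would also need result merging, retries, and dependency ordering between sub-agents, none of which the article describes.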
DeepSeek announced V4 on 24 April 2026, releasing two versions: V4-Pro (1.6 T total parameters, 49 B activated parameters, 33 T pre-training tokens) and V4-Flash (284 B total, 13 B activated). Both support a 1 M-token context window and three reasoning-effort levels (Non-think / Think-High / Think-Max). The architecture introduces a DSA sparse-attention mechanism that cuts per-token FLOPs to 27 % of V3.2's and reduces KV-cache memory by 90 % (Mechanical Heart). Flash is priced at ¥0.2 per million input tokens and ¥2 per million output tokens; Pro is priced at ¥12 and ¥24 respectively (Quantum Bit).
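To make the quoted figures concrete, the sketch below computes the cost of a single request at the list prices above and the share of parameters activated per token. It uses only the numbers stated in this article; the function names are illustrative, and any discounts or cache-hit pricing (not mentioned here) are ignored.

```python
# Prices in CNY per million tokens, as quoted in the article.
PRICES = {
    "V4-Flash": {"input": 0.2, "output": 2.0},
    "V4-Pro": {"input": 12.0, "output": 24.0},
}

# Parameter counts in billions, as quoted in the article.
PARAMS = {
    "V4-Flash": {"total": 284, "activated": 13},
    "V4-Pro": {"total": 1600, "activated": 49},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in CNY of one request at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def activated_fraction(model: str) -> float:
    """Share of total parameters that are activated per token."""
    p = PARAMS[model]
    return p["activated"] / p["total"]


if __name__ == "__main__":
    # A request that fills the 1 M-token context and returns 10,000 tokens.
    for model in PRICES:
        cost = request_cost(model, 1_000_000, 10_000)
        share = activated_fraction(model) * 100
        print(f"{model}: ¥{cost:.2f} per request, {share:.1f}% of parameters activated")
```

At these prices a full-context request costs about ¥0.22 on Flash and ¥12.24 on Pro, and each model activates under 5 % of its parameters per token (about 4.6 % for Flash and 3.1 % for Pro).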
AI-3D startup Meshy partnered with 3D-printing platform Tuozhu, embedding its generation engine into the MakerLab workflow. Meshy reports $40 M ARR, 85 % gross margin, LTV/CAC > 4, and monthly revenue growth of 20-30 %. According to the company, the service turns a typical two-week, $1,000 modeling job into a two-minute, $1 process; it has halved modeling time for a major gaming client and delivered efficiency gains above 90 % for a beverage brand (Silicon Star).
A joint pre-Series A round of $455 M (Red Swan + Hillhouse) funded the AI-brain venture It Stone, underscoring the shift in embodied intelligence from hardware-centric to brain-centric competition. It Stone's AWE 3.0 multimodal model triples task success and cuts jitter by 45 % in robotic assembly, while its WIYH dataset is described as the first large-scale real-world multimodal dataset for embodied AI (Tencent Tech).
Honor's "Lightning" robot dominated the Beijing Yizhuang humanoid half-marathon, winning overall with a net time of 50:26, faster than the human half-marathon record, and sweeping the top six places. Technical advantages include a liquid-cooling loop delivering a flow of more than 4 L/min, a chassis of shield-tunnel steel and carbon fiber, four joints with 400 N·m peak torque, and a hybrid air-plus-liquid cooling system that enables sustained high-power operation (Robot Global).
In the memory-engine space, the open-source project M-Flow (developed by a team of 19-year-olds from Ivy League schools) introduced a four-layer "Cone Graph" architecture that replaces flat vector retrieval with a directed graph of entities, facets, semantic edges and episodes. According to the report, this design yields millisecond-level response times, multi-hop reasoning, and a one-line Docker deployment, and it outperforms Mem0, Zep and Graphiti on LoCoMo, LongMemEval and EvolvingEvents (New Intelligence).
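The difference between flat vector lookup and graph-based recall is easiest to see in a toy example. The sketch below is not M-Flow's schema or API: the `MemoryGraph` class, the node kinds, and the edge labels are hypothetical, and how the four layers actually connect is an assumption made for illustration. It shows only the general idea of reaching stored episodes through multi-hop traversal of a directed graph.

```python
from collections import defaultdict, deque


class MemoryGraph:
    """Toy directed memory graph (hypothetical, not M-Flow's actual design)."""

    def __init__(self) -> None:
        self.kind: dict[str, str] = {}  # node id -> "entity" | "facet" | "episode"
        self.edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_node(self, node_id: str, kind: str) -> None:
        self.kind[node_id] = kind

    def add_edge(self, src: str, label: str, dst: str) -> None:
        self.edges[src].append((label, dst))  # labelled, directed edge

    def episodes_within(self, start: str, max_hops: int) -> list[str]:
        """Breadth-first walk from `start`, collecting episodes in hop order."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if self.kind.get(node) == "episode":
                found.append(node)
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for _label, nxt in self.edges[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
        return found


def build_example() -> MemoryGraph:
    g = MemoryGraph()
    for node, kind in [("alice", "entity"), ("bob", "entity"),
                       ("alice/travel", "facet"), ("bob/work", "facet"),
                       ("ep1", "episode"), ("ep2", "episode")]:
        g.add_node(node, kind)
    g.add_edge("alice", "has_facet", "alice/travel")
    g.add_edge("alice/travel", "mentioned_in", "ep1")
    g.add_edge("alice", "knows", "bob")
    g.add_edge("bob", "has_facet", "bob/work")
    g.add_edge("bob/work", "mentioned_in", "ep2")
    return g


if __name__ == "__main__":
    g = build_example()
    print(g.episodes_within("alice", max_hops=2))  # ['ep1']
    print(g.episodes_within("alice", max_hops=3))  # ['ep1', 'ep2']
```

Raising the hop limit from 2 to 3 surfaces `ep2`, which is linked to `alice` only through `bob`; a single nearest-neighbour lookup over flat embeddings has no equivalent of that explicit path.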
DeepMind’s Vision Banana research demonstrated that a generative image model can serve as a universal visual learner. Built on the Nano Banana Pro backbone, Vision Banana achieved SOTA zero‑shot results such as 0.699 mIoU on Cityscapes segmentation and 0.882 δ₁ on monocular depth, by parametrising all task outputs as RGB images and using lightweight prompt tuning (Mechanical Heart). The work argues that generation is the natural interface for visual tasks and foreshadows a paradigm shift toward unified generative‑vision models (Mechanical Heart).
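Two ideas in this item lend themselves to a small worked example: expressing a task output as an RGB image, and the metrics behind the quoted scores. The sketch below is an illustration under stated assumptions, not Vision Banana's method: the three-colour `PALETTE` and the nearest-colour decoding are hypothetical choices, since the article does not describe the actual encoding. The `mean_iou` and `delta1` functions follow the standard definitions of mIoU and the δ₁ depth accuracy (share of pixels whose prediction-to-truth ratio, taken in whichever direction is larger, is below 1.25).

```python
# Hypothetical palette: one RGB colour per class id (illustrative assumption).
PALETTE = {0: (128, 64, 128), 1: (220, 20, 60), 2: (0, 0, 142)}


def encode_labels(labels):
    """Render a 2-D grid of class ids as a grid of RGB triples."""
    return [[PALETTE[c] for c in row] for row in labels]


def decode_rgb(image):
    """Map each pixel back to the class whose palette colour is nearest."""
    def nearest(pixel):
        return min(
            PALETTE,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[c])),
        )
    return [[nearest(p) for p in row] for row in image]


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for prow, trow in zip(pred, target):
            for p, t in zip(prow, trow):
                inter += (p == c and t == c)
                union += (p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


def delta1(pred_depth, true_depth, threshold=1.25):
    """Fraction of pixels where max(pred/true, true/pred) < threshold."""
    pairs = [(p, t) for prow, trow in zip(pred_depth, true_depth)
             for p, t in zip(prow, trow)]
    good = sum(max(p / t, t / p) < threshold for p, t in pairs)
    return good / len(pairs)


if __name__ == "__main__":
    target = [[0, 0, 1], [2, 2, 1]]
    image = encode_labels(target)
    image[0][0] = (130, 60, 125)          # slightly off-palette "generated" pixel
    print(decode_rgb(image) == target)    # True: nearest-colour decoding recovers it

    pred = [[0, 0, 1], [2, 1, 1]]
    print(round(mean_iou(pred, target, num_classes=3), 4))  # 0.7222

    print(delta1([[1.0, 2.0], [3.0, 9.0]], [[1.1, 2.0], [4.0, 8.0]]))  # 0.75
```

Nearest-colour decoding is what makes an image-shaped output usable as a label map: a generated pixel need not hit the palette exactly, it only has to land closer to the right colour than to any other.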
Additional industry notes include the launch of DeepSeek‑V4‑Pro and Flash, the emergence of AI‑driven video generation platforms (e.g., Zhixiang Future) shifting from consumer toys to enterprise tools, and the “software‑as‑daily‑disposable” thesis presented by DingTalk CEO Chen Hang, which predicts a future where software is generated on‑demand and retired after use, reshaping enterprise organization and decision‑making (CSDN).
ZhongAn Tech Team
China's first online insurer. Through tech innovation we make insurance simpler, warmer, and more valuable. Powered by technology, we support 50 billion RMB of policies and serve 600 million users with smart, personalized solutions. ZhongAn's hardcore tech and article shares are here.
