Author

SuanNi

A community for AI developers that aggregates large-model development services, models, and compute power.

Latest from SuanNi

100 recent articles max

SuanNi

Jun 8, 2026 · Artificial Intelligence

Agent Harness Model Achieves Frontier Performance at <1% Compute Cost – Introducing Macaron‑V1‑Preview

A 30‑person lab trained Macaron‑V1‑Preview, a 749B‑parameter Agent model, on fewer than 300 GPUs at less than 1% of the compute cost of comparable models, while matching state‑of‑the‑art performance on real‑world Agent benchmarks such as LivingBench, VitaBench, A2UI, and PinchBench.

AI · Agent · Efficient Training

0 likes · 15 min read


SuanNi

Jun 8, 2026 · Artificial Intelligence

First Enterprise IT Ops Agent Benchmark Shows Claude Leads with Just 47% Score

The ITBench-AA benchmark, the first evaluation specifically for enterprise IT operations agents, tests 59 SRE scenarios and reveals that even top models like Claude Opus 4.7 achieve only a 47% score, highlighting both the difficulty of the tasks and the cost‑effectiveness gap between proprietary and open‑source agents.

AI Agent · Claude · Cost Efficiency

0 likes · 11 min read


SuanNi

Jun 7, 2026 · Artificial Intelligence

How OpenAI’s Codex Is Driving a 3× Surge in Knowledge‑Work Productivity

OpenAI’s “The Next Era of Knowledge Work” report shows Codex powering over five million weekly active users with more than six‑fold growth, reshaping knowledge‑intensive tasks by tackling search, coordination and approval frictions, enabling parallel workflows, and prompting policy recommendations for broader AI adoption.

AI productivity · Codex · OpenAI

0 likes · 12 min read


SuanNi

Jun 7, 2026 · Artificial Intelligence

NVIDIA’s Physical AI Agent Skills Streamline Autonomous Driving, Robotics, and Vision AI

NVIDIA unveiled a suite of Physical AI Agent Skills at CVPR that connects data generation, simulation, policy training, and evaluation into a unified workflow, leveraging the Cosmos 3 multimodal model and tools such as InstantNuRec, AlpaGym, OmniDreams, and Alpamayo 2 Super to accelerate research in autonomous driving, vision AI, and robotics.

Agent skills · Cosmos 3 · Nvidia

0 likes · 11 min read


SuanNi

Jun 6, 2026 · Artificial Intelligence

How JoyAI‑Echo Overcomes Forgetting in Minute‑Long Video Generation

JoyAI‑Echo introduces a cross‑modal audio‑visual memory bank, a three‑stage post‑training pipeline, and a Director Agent to enable consistent, high‑quality, real‑time generation of minute‑level videos, achieving up to 7.5× inference speedup and state‑of‑the‑art benchmark scores.

JoyAI-Echo · audio-visual AI · cross-modal memory

0 likes · 13 min read


SuanNi

Jun 6, 2026 · Artificial Intelligence

Demystifying Harness, Scaffold, and Other Tricky AI Agent Terms

This article breaks down the core terminology of AI agents—Model, Scaffold, Harness, Context Engineering, Policy, Tool Use, Skills, Sub‑agents, and the training‑side concepts of RL Environment, Trainer, Rollout, and Reward—explaining their roles, differences, and how they combine to form functional agents.

AI Agent · Context Engineering · Harness

0 likes · 12 min read


SuanNi

Jun 5, 2026 · Artificial Intelligence

How PaddleOCR‑VL‑1.6’s 0.9B Model Achieved 96.33% SOTA on OmniDocBench v1.6

PaddleOCR‑VL‑1.6, a compact 0.9B visual‑language model, diagnoses three types of weak regions, enriches targeted data, and applies a three‑stage CPT‑SFT‑RL training pipeline to reach a 96.33% overall score on OmniDocBench v1.6, surpassing much larger models across all document‑parsing tasks.

OmniDocBench · PaddleOCR-VL-1.6 · SOTA

0 likes · 10 min read


SuanNi

Jun 5, 2026 · Artificial Intelligence

AI Is Accelerating AI: Anthropic’s Pause Proposal and Three Future Scenarios

Anthropic’s internal data shows AI models are rapidly self‑improving—Claude now writes over 80% of its code, boosts engineer productivity several‑fold, and speeds up tasks dramatically—prompting a pause proposal and three possible future trajectories for AI development.

AI acceleration · AI safety · Anthropic

0 likes · 16 min read


SuanNi

Jun 5, 2026 · Artificial Intelligence

How Google’s Gemma 4 12B Packs Multimodal Power into a Laptop‑Friendly Model

Google’s Gemma 4 12B delivers near‑26B performance with half the memory, runs on a 16 GB laptop GPU, and uses a novel encoder‑free unified architecture that natively handles vision, audio, and text, making high‑quality multimodal AI truly local.

Gemma 4 12B · audio-visual integration · encoder-free architecture

0 likes · 6 min read


SuanNi

Jun 4, 2026 · Artificial Intelligence

Bernini: An Open‑Source AI Model that Masterfully Handles Diverse Video Editing Tasks

Bernini combines a multimodal large language model with a diffusion renderer, uses a semantic planner‑renderer architecture, segment‑aware 3D position encoding, and chain‑of‑thought reasoning, and achieves state‑of‑the‑art results on a 300‑case benchmark, outperforming closed‑source competitors.

Bernini · LLM · benchmark

0 likes · 11 min read
