SuanNi
Author

A community for AI developers that aggregates large-model development services, models, and compute power.

247 Articles · 0 Likes · 428 Views · 0 Comments
Recent Articles

Latest from SuanNi

100 recent articles max
SuanNi
Jun 8, 2026 · Artificial Intelligence

Agent Harness Model Achieves Frontier Performance at <1% Compute Cost – Introducing Macaron‑V1‑Preview

A 30‑person lab trained Macaron‑V1‑Preview, a 749B‑parameter Agent model, on fewer than 300 GPUs at less than 1% of the compute cost of comparable models, while matching state‑of‑the‑art performance on real‑world Agent benchmarks such as LivingBench, VitaBench, A2UI, and PinchBench.

AI · Agent · Efficient Training
0 likes · 15 min read
SuanNi
Jun 8, 2026 · Artificial Intelligence

First Enterprise IT Ops Agent Benchmark Shows Claude Leads with Just 47% Score

The ITBench-AA benchmark, the first evaluation designed specifically for enterprise IT operations agents, tests 59 SRE scenarios and reveals that even the leading model, Claude Opus 4.7, scores only 47%, highlighting both the difficulty of the tasks and the cost‑effectiveness gap between proprietary and open‑source agents.

AI Agent · Claude · Cost Efficiency
0 likes · 11 min read
SuanNi
Jun 7, 2026 · Artificial Intelligence

How OpenAI’s Codex Is Driving a 3× Surge in Knowledge‑Work Productivity

OpenAI’s “The Next Era of Knowledge Work” report shows Codex powering over five million weekly active users with more than six‑fold growth, reshaping knowledge‑intensive tasks by tackling search, coordination and approval frictions, enabling parallel workflows, and prompting policy recommendations for broader AI adoption.

AI productivity · Codex · OpenAI
0 likes · 12 min read
SuanNi
Jun 7, 2026 · Artificial Intelligence

NVIDIA’s Physical AI Agent Skills Streamline Autonomous Driving, Robotics, and Vision AI

NVIDIA unveiled a suite of Physical AI Agent Skills at CVPR that connects data generation, simulation, policy training, and evaluation into a unified workflow, leveraging the Cosmos 3 multimodal model and tools such as InstantNuRec, AlpaGym, OmniDreams, and Alpamayo 2 Super to accelerate research in autonomous driving, vision AI, and robotics.

Agent skills · Cosmos 3 · Nvidia
0 likes · 11 min read
SuanNi
Jun 6, 2026 · Artificial Intelligence

How JoyAI‑Echo Overcomes Forgetting in Minute‑Long Video Generation

JoyAI‑Echo introduces a cross‑modal audio‑visual memory bank, a three‑stage post‑training pipeline, and a Director Agent to enable consistent, high‑quality, real‑time generation of minute‑level videos, achieving up to 7.5× inference speedup and state‑of‑the‑art benchmark scores.

JoyAI-Echo · audio-visual AI · cross-modal memory
0 likes · 13 min read
SuanNi
Jun 6, 2026 · Artificial Intelligence

Demystifying Harness, Scaffold, and Other Tricky AI Agent Terms

This article breaks down the core terminology of AI agents—Model, Scaffold, Harness, Context Engineering, Policy, Tool Use, Skills, Sub‑agents, and the training‑side concepts of RL Environment, Trainer, Rollout, and Reward—explaining their roles, differences, and how they combine to form functional agents.

AI Agent · Context Engineering · Harness
0 likes · 12 min read
SuanNi
Jun 5, 2026 · Artificial Intelligence

How PaddleOCR‑VL‑1.6’s 0.9B Model Achieved 96.33% SOTA on OmniDocBench v1.6

PaddleOCR‑VL‑1.6, a compact 0.9B visual‑language model, diagnoses three types of weak regions, enriches targeted data, and applies a three‑stage CPT‑SFT‑RL training pipeline to reach a 96.33% overall score on OmniDocBench v1.6, surpassing much larger models across all document‑parsing tasks.

OmniDocBench · PaddleOCR-VL-1.6 · SOTA
0 likes · 10 min read
SuanNi
Jun 5, 2026 · Artificial Intelligence

AI Is Accelerating AI: Anthropic’s Pause Proposal and Three Future Scenarios

Anthropic’s internal data shows AI models are rapidly self‑improving—Claude now writes over 80% of its code, boosts engineer productivity several‑fold, and speeds up tasks dramatically—prompting a pause proposal and three possible future trajectories for AI development.

AI acceleration · AI safety · Anthropic
0 likes · 16 min read
SuanNi
Jun 5, 2026 · Artificial Intelligence

How Google’s Gemma 4 12B Packs Multimodal Power into a Laptop‑Friendly Model

Google’s Gemma 4 12B delivers near‑26B performance with half the memory, runs on a 16 GB laptop GPU, and uses a novel encoder‑free unified architecture that natively handles vision, audio, and text, making high‑quality multimodal AI truly local.

Gemma 4 12B · audio-visual integration · encoder-free architecture
0 likes · 6 min read
SuanNi
Jun 4, 2026 · Artificial Intelligence

Bernini: An Open‑Source AI Model that Masterfully Handles Diverse Video Editing Tasks

Bernini combines a multimodal large language model with a diffusion renderer, using a semantic planner‑renderer architecture, segment‑aware 3D position encoding, and chain‑of‑thought reasoning. It achieves state‑of‑the‑art results on a 300‑case benchmark, outperforming closed‑source competitors.

Bernini · LLM · benchmark
0 likes · 11 min read