Tagged articles
4 articles
SuanNi
Jun 12, 2026 · Artificial Intelligence

Kimi K2.7 Code Goes Open: 30% Token Savings and Major Coding Performance Boost

Kimi K2.7 Code, now open-source on HuggingFace, cuts token consumption by about 30% and raises coding benchmark scores: Kimi Code Bench v2 climbs from 50.9 to 62.0, Program-Bench from 48.3 to 53.6, and MLS Bench Lite from 26.7 to 35.1. The gains narrow the gap with GPT-5.5 and Claude Opus. The model is built on a 1-trillion-parameter MoE architecture with INT4 quantization and a 256K-token context.

Code Generation · HuggingFace · INT4 quantization
6 min read
Old Zhang's AI Learning
Jun 10, 2026 · Artificial Intelligence

Anthropic’s Claude Fable 5 and Mythos 5: Twin Models with a Shockingly Low Price and New Safety Switches

Anthropic released Claude Fable 5 and Mythos 5 as twin large-language-model variants that share the same base and differ only in their safety-classifier settings. Both offer a 1M-token context, 128K-token output, a halved price, and a three-layer real-time safety system that routes risky requests to Claude Opus 4.8.

AI safety · Anthropic · Claude Fable 5
12 min read
Machine Heart
Jun 9, 2026 · Artificial Intelligence

Claude Fable 5 Unveiled: Record-Breaking Performance and New Pricing

Anthropic has launched Claude Fable 5, its most powerful LLM to date, claiming top-tier results across software engineering, knowledge work, vision, and scientific benchmarks. The model also offers higher token efficiency, new safety layers, and pricing of $10 per million input tokens and $50 per million output tokens.

AI safety · Anthropic · Claude Fable 5
7 min read
Old Zhang's AI Learning
Apr 29, 2026 · Artificial Intelligence

Top 10 Open‑Source LLM Benchmarks: Scores, Rankings, and What They Test

This article walks through ten mainstream open-source large-model benchmarks (SWE-bench Verified and Pro, MMLU-Pro, GPQA Diamond, HLE, AIME, HMMT, olmOCR-bench, Terminal-Bench 2.0, and EvasionBench), explaining their data, evaluation metrics, current leading models, and the capability dimensions they reveal.

AI evaluation · LLM benchmarks · MMLU-Pro
20 min read