Tagged articles
15 articles
Geek Labs
May 13, 2026 · Artificial Intelligence

Two LLM Inference Acceleration Projects: A Mac‑Local Engine vs a Data‑Center Engine

This article compares two recent LLM inference engines from GitHub: ds4.c, a Metal‑optimized engine for DeepSeek V4 Flash on Apple Silicon Macs, and TokenSpeed, a Python/C++ data‑center‑grade engine for GPU clusters. It covers their design choices, performance numbers, usage instructions, and suitable scenarios.

DeepSeek, GPU, Inference
0 likes · 8 min read
Node.js Tech Stack
May 9, 2026 · Artificial Intelligence

Redis Founder Crafts DeepSeek V4 AI Inference Engine, Node.js Star Applauds

Redis creator Salvatore Sanfilippo (antirez) released DS4, a Metal‑only C inference engine tailored for DeepSeek V4 Flash on high‑end Macs, featuring narrow model focus, 2‑bit quantization, disk‑based KV cache, benchmark speeds around 26 tokens/s, and a dual OpenAI/Anthropic compatible server.

2-bit quantization, AI inference engine, DeepSeek-V4
0 likes · 13 min read
Machine Learning Algorithms & Natural Language Processing
May 3, 2026 · Artificial Intelligence

Running a 400B Mixture‑of‑Experts LLM on iPhone 17 Pro: Inside Flash‑MoE

The article details how the open‑source Flash‑MoE engine streams a 400‑billion‑parameter Mixture‑of‑Experts language model on an iPhone 17 Pro, achieving interactive‑level token throughput by eliminating Python dependencies, crafting a custom Metal pipeline, and streaming weights directly from SSD.

Apple Silicon, Flash-MoE, GCD
0 likes · 7 min read
Machine Heart
May 1, 2026 · Artificial Intelligence

How a 400B Mixture‑of‑Experts Model Runs on the iPhone 17 Pro

The article details the Flash‑MoE project, which streams the roughly 400‑billion‑parameter Qwen3.5‑397B‑A17B mixture‑of‑experts model on an iPhone 17 Pro. It reaches up to 0.6 tokens per second using a custom Metal GPU pipeline with no Python code and SSD‑backed weight streaming that keeps only 5.5 GB in RAM.

Flash-MoE, LLM, Metal
0 likes · 7 min read
ByteDance Terminal Technology
Aug 24, 2022 · Mobile Development

Impeller Rendering Engine: Background, Metal Shader Compilation, Vector Rendering, and Flutter DisplayList

This article provides an in‑depth technical overview of Flutter's Impeller rendering engine, covering its origin, jank classification, the evolution of Metal shader compilation, vector rendering fundamentals, the DisplayList architecture, Impeller's rendering pipeline, and the ImpellerC shader compiler, with code examples and performance insights.

DisplayList, Flutter, Impeller
0 likes · 31 min read
Youku Technology
Jun 8, 2022 · Mobile Development

How Youku Achieves Real-Time Bullet‑Screen Pass‑Through on Mobile

This article details Youku's technical approach to rendering bullet‑screen pass‑through on mobile devices, covering cloud‑based and on‑device segmentation pipelines, GPU‑accelerated rendering steps, performance optimizations, and engineering challenges to deliver seamless immersive viewing.

GPU, Metal, OpenGL
0 likes · 11 min read
Xianyu Technology
Oct 21, 2021 · Mobile Development

Flutter iOS GPU Background Crash Analysis and Solution

The article analyzes why Flutter crashes on iOS when accessing the GPU in the background, explains the official SyncSwitch fix for ImageDecoder, and details Xianyu’s additional patches for MultipleFrameCodec, EncodeImage, and Rasterizer::DrawToSurface that together, via PR #28383, fully resolve the GPU‑background crash.

Crash, Flutter, GPU
0 likes · 11 min read