Tagged articles
15 articles
SuanNi
May 14, 2026 · Artificial Intelligence

How a Pure‑Software Framework Boosts On‑Device AI Agents by 1.6×

KAIST researchers introduced Agent‑X, a pure‑software acceleration framework that eliminates prefill and decode bottlenecks on mobile devices, achieving a 1.61× end‑to‑end speedup for on‑device AI agents without any loss in task accuracy.

Agent-X · KAIST · decode bottleneck
9 min read
Geek Labs
Apr 11, 2026 · Mobile Development

PhoneClaw and PokeClaw: Turning Your Phone into a Private AI Agent

PhoneClaw and PokeClaw are open‑source, on‑device AI agents for iOS and Android that run Gemma 4 locally, offering offline privacy, zero‑cost operation, and native tool calling through iOS APIs or Android Accessibility services.

Android · Gemma · Mobile AI
11 min read
AI Explorer
Apr 10, 2026 · Artificial Intelligence

Google AI Edge Gallery: Offline Mobile AI Model Playground

Google’s open‑source AI Edge Gallery lets Android and iOS devices run large language models such as Gemma 4 entirely offline, eliminating network latency and privacy concerns; the app showcases six modular AI features, offers a simple install path, and signals Google’s push toward a standardized edge‑AI ecosystem.

Gemma 4 · Google AI Edge Gallery · Kotlin
8 min read
AndroidPub
Apr 2, 2026 · Artificial Intelligence

How to Build Offline, Privacy‑First AI with On‑Device Retrieval‑Augmented Generation

This article explains how to implement on‑device Retrieval‑Augmented Generation (RAG) for large language models, covering embedding, vector indexing, model selection, quantization, data chunking, incremental updates, hybrid search, and agentic RAG to deliver fast, private, and personalized AI experiences on mobile devices.

Embedding · LLM · RAG
18 min read
DataFunSummit
Oct 31, 2025 · Artificial Intelligence

How OPPO’s AndesVL Is Revolutionizing On‑Device Multimodal AI

OPPO AI Center introduces AndesVL, an open‑source, fully‑adapted multimodal large model ranging from 0.6B to 4B parameters, designed for high‑performance, privacy‑preserving, low‑latency AI on mobile devices, with advanced architecture, training pipelines, on‑device optimizations, and state‑of‑the‑art benchmark results.

Mobile AI · large language model · model compression
21 min read
AndroidPub
Sep 26, 2025 · Mobile Development

How to Add On‑Device AI Scanning to Your Android App with ML Kit

This article walks through the practical steps of integrating Google ML Kit into an Android app, covering its privacy‑first, zero‑learning‑curve advantages and providing complete code examples for barcode scanning, OCR, error handling, CameraX setup, and performance tuning.

Android · Barcode Scanning · CameraX
14 min read
DataFunTalk
Sep 7, 2025 · Artificial Intelligence

Why Apple’s FastVLM Is 85× Faster and What It Means for On‑Device AI

Apple recently open‑sourced its FastVLM and MobileCLIP2 models, showcasing a multimodal vision‑language system that runs up to 85 times faster than comparable models, enabling real‑time AI on iPhones and other edge devices while illustrating Apple’s broader “Plan B” strategy of small on‑device models.

Apple · FastVLM · Vision-Language Model
15 min read
vivo Internet Technology
Sep 3, 2025 · Artificial Intelligence

How to Enable On‑Device AI in WeChat Mini‑Programs with TensorFlow.js and Native Inference

This article details a complete engineering solution for bringing on‑device AI to WeChat mini‑programs, comparing TensorFlow.js and WeChat native inference, covering model conversion, package‑size optimization, integration steps, performance metrics, and a hybrid strategy that boosts recommendation click‑through rates by 30%.

Mini Program · TensorFlow.js · WeChat
13 min read
Fighter's World
Aug 29, 2025 · Artificial Intelligence

How Pixel 10 Reveals Google’s Decade‑Long On‑Device AI Strategy

The article analyzes Google’s Made by Google 2025 event, showing how the Pixel 10 lineup, the Tensor G5 chip, Gemini Nano, and a full‑stack AI infrastructure—including custom TPUs, AI Hypercomputer, and Vertex AI—form a coordinated on‑device AI strategy that challenges Apple and builds a long‑term economic moat.

AI strategy · Gemini · Google
25 min read
Rare Earth Juejin Tech Community
Aug 31, 2024 · Artificial Intelligence

Apple Intelligence and the Scaling Landscape of Large Language Models: Trends, Costs, and Deployment Considerations

An in‑depth analysis of Apple Intelligence and the broader LLM ecosystem, covering recent model scaling breakthroughs, data and compute requirements, pricing dynamics, hardware trends, on‑device versus cloud deployment, and strategic implications for developers, product managers, and AI practitioners.

AI hardware · Apple Intelligence · LLM scaling
58 min read
Rare Earth Juejin Tech Community
May 9, 2024 · Artificial Intelligence

On‑Device AI and Federated Learning: Era Background, Theory, and Practical Applications

This article outlines the evolution from 1G to 6G communications, explains the third AI wave driven by big data, theory, and compute, introduces federated learning (horizontal, vertical, transfer), and details on‑device AI architectures, decision tree and neural network models, and real‑world use cases such as video preloading and autonomous driving.

Big Data · Edge Computing · Federated Learning
13 min read
HelloTech
Aug 9, 2023 · Artificial Intelligence

Device Intelligence: Concepts, Architecture, and Applications

Device intelligence brings on-device reasoning and real-time inference to smartphones and IoT gateways, delivering low-latency, privacy-preserving, personalized services such as AR/VR enhancements and recommendation re-ranking, while confronting challenges of hardware fragmentation and model size, and complementing cloud AI through architectures like Hala’s MNN-based pipeline.

Device Intelligence · Edge Computing · Mobile AI
10 min read
Baidu App Technology
May 10, 2021 · Mobile Development

LiteKit: Baidu's Mobile AI Deployment Framework for Fast AI Capability Integration

LiteKit, Baidu’s mobile AI deployment framework built on Paddle Lite, delivers out‑of‑the‑box video super‑resolution, human segmentation and gesture‑recognition SDKs that reduce integration complexity to three simple steps across Objective‑C, Java and C++, achieving real‑time performance (25 FPS) while lowering development effort and platform barriers.

LiteKit · Mobile AI Deployment · Paddle-Lite
14 min read
Alibaba Cloud Developer
Jan 9, 2020 · Artificial Intelligence

How Edge Computing Transforms Real-Time Recommendation Systems

This article examines the limitations of cloud‑based recommendation pipelines, explains how edge computing can provide localized user perception and rapid re‑ranking, describes the EdgeRec on‑device model architecture—including heterogeneous behavior sequence modeling and behavior‑aware attention reranking—and presents offline and online experimental results that demonstrate significant gains in click‑through and conversion rates.

behavior modeling · on-device AI · real-time personalization
16 min read
Alibaba Cloud Developer
Mar 20, 2018 · Mobile Development

How Alibaba’s xMedia SDK Is Shaping the Future of Intelligent Mobile Terminals

This article examines the evolution of smart terminals, outlines the sensor and computing trends driving new mobile experiences, and details Alibaba’s xMedia SDK—including its rich‑media foundation, on‑device deep‑learning engine (xNN), SLAM positioning (xSLAM), 3D rendering (xAnt3D), and cross‑platform capabilities—showcasing how these technologies enable more intelligent, decentralized user interactions.

3D rendering · Rich Media · SLAM
19 min read