Tagged articles
21 articles
Machine Learning Algorithms & Natural Language Processing
Mar 14, 2026 · Artificial Intelligence

OpenClaw 3.12 Unveiled: UI Overhaul, Faster Model Engine, and Strengthened Security

OpenClaw 3.12 brings a complete UI redesign, a first‑class Ollama onboarding flow, a Fast Mode API switch, Gemini‑based multimodal memory, a new sessions_yield routing feature, robust failover mechanisms and dozens of security patches, all aimed at improving stability and performance for both casual users and large‑scale automation workloads.

AI agents · OpenClaw · UI redesign
0 likes · 7 min read
AIWalker
Mar 10, 2026 · Artificial Intelligence

MIGM-Shortcut: Learning Controlled Latent Dynamics to Speed Up Masked Image Generation

The paper introduces MIGM-Shortcut, a self‑supervised method that learns controlled latent‑state dynamics to bypass redundant bidirectional attention in Masked Image Generation Models, achieving over 4× speed‑up on state‑of‑the‑art multimodal diffusion models like Lumina‑DiMOO while preserving image quality.

AI · MIGM · diffusion models
0 likes · 8 min read
Network Intelligence Research Center (NIRC)
Dec 31, 2025 · Artificial Intelligence

Why AI Inference Is Slow and How Cutting‑Edge Tech Boosts It in Industrial Settings

The article analyzes the severe inference bottlenecks of large language models, CNNs, and recommendation systems and presents a suite of research‑driven acceleration techniques: token‑level pipeline parallelism (HPipe), KV‑cache clustering (ClusterAttn), quantization (QoKV), heterogeneous edge frameworks (DeepZoning, PICO), delay‑aware edge‑cloud scheduling (DECC), and operator choreography (RACE), all validated on real‑world industrial workloads.

AI inference · Recommendation Systems · edge AI
0 likes · 16 min read
AntTech
Sep 25, 2025 · Artificial Intelligence

ICCV Spotlight: Pixel Tracing for Copy Detection and Skip-Vision Model Acceleration

The ICCV 2025 live session will take a deep dive into two papers: PixTrace with CopyNCE for precise image copy detection, and Skip‑Vision for dramatically faster training and inference of vision‑language models. It will cover their methods, results, and real‑world impact.

Computer Vision · ICCV 2025 · Vision-Language Models
0 likes · 5 min read
Data Party THU
Aug 22, 2025 · Artificial Intelligence

TwigVLM: How Tiny Branches Accelerate Large Vision‑Language Models

TwigVLM introduces a lightweight “twig” module that prunes visual tokens early and enables self‑speculative decoding, achieving up to 154% speedup on long‑text generation while preserving 96% of original LVLM accuracy, as demonstrated on LLaVA‑1.5‑7B and other benchmarks.

LVLM · Multimodal AI · Token Pruning
0 likes · 14 min read
Baidu Geek Talk
Aug 11, 2025 · Artificial Intelligence

FLUX-Lightning Slashes Diffusion Inference to 4 Steps, Doubling Speed

FLUX-Lightning, introduced by PaddleMIX, combines phased consistency distillation, adversarial learning, distribution‑matching distillation, and reflow loss to reduce diffusion model inference to just four steps while preserving image quality, and leverages the CINN compiler to achieve over 30% speed gains on A800 GPUs, surpassing existing SOTA acceleration methods.

AI inference · CINN · Distillation
0 likes · 21 min read
AI Frontier Lectures
Apr 1, 2025 · Artificial Intelligence

Can SpargeAttn Accelerate Any Model Without Training? A Deep Dive

This article reviews the SpargeAttn paper, describing how a training‑free sparse attention mechanism achieves 4‑7× inference speedup across language, video, and image models while preserving end‑to‑end accuracy, and outlines its challenges, algorithmic solutions, implementation details, and experimental results.

GPU Optimization · Quantized Inference · SpargeAttn
0 likes · 7 min read
Baidu Geek Talk
May 15, 2024 · Artificial Intelligence

Accelerating Large Model Training and Inference with Baidu Baige AIAK‑LLM: Challenges, Techniques, and Optimizations

The talk outlines how Baidu’s Baige AIAK‑LLM suite tackles the exploding compute demands of trillion‑parameter models by raising Model FLOPs Utilization (MFU) through advanced parallelism, memory‑saving recomputation, ZeRO offloading, adaptive scheduling, and cross‑chip orchestration, delivering 30‑60% training and inference speedups and a unified cloud product.

AI Infrastructure · Baidu · Inference Optimization
0 likes · 25 min read
DaTaobao Tech
Apr 26, 2024 · Artificial Intelligence

Accelerating Stable Diffusion Models: Evaluation of FlashAttention2, OneFlow, DeepCache, Stable-Fast, and LCM-LoRA

Our benchmark of FlashAttention2, OneFlow, DeepCache, Stable‑Fast, and LCM‑LoRA on Stable Diffusion models shows that DeepCache combined with PyTorch 2.2 consistently cuts inference time by 40‑50% with minimal code changes, while OneFlow offers 20‑40% speedups when compatible, making DeepCache the recommended default acceleration.

DeepCache · FlashAttention2 · LCM-LoRA
0 likes · 10 min read
Rare Earth Juejin Tech Community
Apr 22, 2024 · Artificial Intelligence

PP-LCNet: A Lightweight CPU-Optimized Convolutional Neural Network

PP-LCNet is a lightweight convolutional neural network designed for Intel CPUs that leverages MKLDNN acceleration, H‑Swish activation, selective SE modules, larger kernels, and expanded fully‑connected layers to achieve higher accuracy without increasing inference latency across image classification, detection, and segmentation tasks.

CPU optimization · MKLDNN · lightweight CNN
0 likes · 25 min read
Alimama Tech
Dec 14, 2023 · Artificial Intelligence

AI-Driven Content Risk Control: System Evolution and Optimization at Alibaba

Alimama’s AI‑driven content risk platform has evolved from simple rule‑matching to a data‑centric, serverless architecture that integrates large‑model acceleration, decision‑tree compilation, high‑throughput vector retrieval and elastic word‑matching, delivering sub‑100 ms text and sub‑1 s image moderation while remaining stable during peak promotional traffic.

AI · DevOps · content moderation
0 likes · 25 min read
360 Smart Cloud
Nov 20, 2023 · Artificial Intelligence

Overview of Recent Open‑Source AI Models and Tools (November 2023)

This article summarizes a collection of newly released open‑source AI projects covering natural‑language processing, multimodal processing, intelligent agents, recommendation systems, and model training acceleration, providing brief descriptions, key capabilities, and links to their repositories.

AI · Recommendation Systems · large language models
0 likes · 9 min read
Huolala Tech
Jul 28, 2023 · Artificial Intelligence

How HuoLala Leverages AI to Revolutionize Service Quality Inspection

This article details HuoLala's AI‑driven intelligent quality inspection system, covering its NLP‑based semantic understanding pipeline, data denoising, confidence learning, contrastive learning, model acceleration techniques (pruning, knowledge distillation, and quantization), and interpretability methods, all aimed at improving coverage, recall, and risk detection.

NLP · contrastive learning · data denoising
0 likes · 23 min read
DataFunTalk
Oct 12, 2021 · Artificial Intelligence

PaddleNLP v2.1 Release: Taskflow One‑Click NLP, Few‑Shot Learning Enhancements, and 28× Text Generation Acceleration

PaddleNLP v2.1 introduces an industrial‑grade Taskflow for eight NLP scenarios, a three‑line few‑shot learning paradigm that boosts small‑sample performance, and a FasterTransformer‑based inference engine that delivers up to 28‑fold speedup for text generation, all backed by extensive model and algorithm integrations.

Few‑Shot Learning · NLP · PaddleNLP
0 likes · 7 min read
Suning Technology
Oct 29, 2020 · Artificial Intelligence

Accelerating Deep Learning for Retail: Model Compression, Speed & Energy

This lecture outlines the key challenges of deep learning in retail (growing model size, speed, and energy consumption) and presents a comprehensive acceleration framework covering algorithmic optimizations such as network design and pruning as well as hardware acceleration, with practical examples such as MobileNet, model compression, and edge deployment.

Deep Learning · Hardware Optimization · model acceleration
0 likes · 15 min read
Tencent Tech
Feb 27, 2020 · Artificial Intelligence

How to Speed Up Deep Learning Models: Cutting-Edge Acceleration Techniques

Deep learning models often suffer from slow training and deployment due to their size, but a range of acceleration methods, including model architecture optimization, pruning, quantization, knowledge distillation, and distributed training techniques, can dramatically improve speed and efficiency while maintaining performance.

Deep Learning · Distributed Training · knowledge distillation
0 likes · 14 min read
Hulu Beijing
Apr 30, 2019 · Artificial Intelligence

How Can Deep Neural Networks Be Accelerated and Compressed? Key Techniques Explained

This article reviews why deep neural networks are over‑parameterized, outlines the challenges of deploying them on mobile and embedded devices, and presents six major strategies for accelerating and compressing models while preserving performance: pruning, low‑rank approximation, filter selection, quantization, knowledge distillation, and novel architecture design.

Deep Learning · knowledge distillation · model acceleration
0 likes · 11 min read