Tag

model acceleration


Baidu Geek Talk
May 15, 2024 · Artificial Intelligence

Accelerating Large Model Training and Inference with Baidu Baige AIAK‑LLM: Challenges, Techniques, and Optimizations

The talk outlines how Baidu's Baige AIAK‑LLM suite tackles the exploding compute demands of trillion‑parameter models by boosting Model FLOPS Utilization (MFU) through advanced parallelism, memory‑saving recomputation, zero‑offload, adaptive scheduling, and cross‑chip orchestration, delivering 30‑60% speedups in training and inference and packaging the stack as a unified cloud product.

AI infrastructure · Baidu · MFU
25 min read
DaTaobao Tech
Apr 26, 2024 · Artificial Intelligence

Accelerating Stable Diffusion Models: Evaluation of FlashAttention2, OneFlow, DeepCache, Stable-Fast, and LCM-LoRA

Our benchmark of FlashAttention2, OneFlow, DeepCache, Stable‑Fast, and LCM‑LoRA on Stable Diffusion models shows that DeepCache combined with PyTorch 2.2 consistently cuts inference time by 40‑50% with minimal code changes, while OneFlow offers 20‑40% speedups when compatible, making DeepCache the recommended default acceleration.

DeepCache · FlashAttention2 · LCM-LoRA
10 min read
Rare Earth Juejin Tech Community
Apr 22, 2024 · Artificial Intelligence

PP-LCNet: A Lightweight CPU-Optimized Convolutional Neural Network

PP-LCNet is a lightweight convolutional neural network designed for Intel CPUs that leverages MKLDNN acceleration, H‑Swish activation, selective SE modules, larger kernels, and expanded fully‑connected layers to achieve higher accuracy without increasing inference latency across image classification, detection, and segmentation tasks.

CPU optimization · MKLDNN · deep learning
25 min read
Alimama Tech
Dec 14, 2023 · Artificial Intelligence

AI-Driven Content Risk Control: System Evolution and Optimization at Alibaba

Alimama's AI‑driven content risk platform has evolved from simple rule matching to a data‑centric, serverless architecture that integrates large‑model acceleration, decision‑tree compilation, high‑throughput vector retrieval, and elastic word matching, delivering sub‑100 ms text moderation and sub‑1 s image moderation while remaining stable under peak promotional traffic.

AI · DevOps · content moderation
25 min read
360 Smart Cloud
Nov 20, 2023 · Artificial Intelligence

Overview of Recent Open‑Source AI Models and Tools (November 2023)

This article summarizes a collection of newly released open‑source AI projects covering natural‑language processing, multimodal processing, intelligent agents, recommendation systems, and model training acceleration, providing brief descriptions, key capabilities, and links to their repositories.

AI · Recommendation systems · large language models
9 min read
DataFunTalk
Oct 12, 2021 · Artificial Intelligence

PaddleNLP v2.1 Release: Taskflow One‑Click NLP, Few‑Shot Learning Enhancements, and 28× Text Generation Acceleration

PaddleNLP v2.1 introduces an industrial‑grade Taskflow covering eight NLP scenarios with one‑click usage, a three‑line few‑shot learning paradigm that boosts small‑sample performance, and a FasterTransformer‑based inference engine that delivers up to a 28‑fold speedup for text generation, all backed by extensive model and algorithm integrations.

Artificial Intelligence · NLP · PaddleNLP
7 min read
Tencent Tech
Feb 27, 2020 · Artificial Intelligence

How to Speed Up Deep Learning Models: Cutting-Edge Acceleration Techniques

Deep learning models are often slow to train and deploy because of their size, but a range of acceleration methods — model architecture optimization, pruning, quantization, knowledge distillation, and distributed training — can dramatically improve speed and efficiency while preserving accuracy.

deep learning · distributed training · knowledge distillation
14 min read