Tagged articles

ONNX

17 articles

May 7, 2026 · Artificial Intelligence

Running AI Inference Directly in the Browser with WebNN

WebNN brings hardware‑accelerated AI inference to web pages, letting developers run millisecond‑level face detection, real‑time filters, and semantic segmentation locally without cloud calls, while improving latency, privacy, and cost through a unified JavaScript API that maps to CPUs, GPUs or NPUs.

AI inference · Edge · GPU

0 likes · 16 min read

Woodpecker Software Testing

Mar 23, 2026 · Artificial Intelligence

Practical Guide to Optimizing AI Testing Tool Performance

This article analyzes why AI‑driven testing tools often become performance bottlenecks, identifies I/O and serialization as the main culprits, and presents concrete optimizations—including headless browser flags, mmap, gRPC streaming, model lightweighting, multi‑level caching, and Kubernetes‑based co‑scheduling—that together reduce latency by up to 90% and boost throughput severalfold.

AI testing · Caching · Kubernetes

0 likes · 7 min read

Sohu Tech Products

Dec 17, 2025 · Artificial Intelligence

How We Cut Vision Transformer Inference Latency from 53 ms to 8 ms

Facing 53.64 ms per‑image latency in a Flask‑served Vision Transformer classifier, we iteratively optimized the pipeline—switching to ONNX Runtime, leveraging TensorRT, replacing Pillow with OpenCV, eliminating URL downloads, and finally batching requests—reducing average server‑side processing to 8.34 ms, a 6.4× speedup.

Batching · Flask · ONNX

0 likes · 28 min read

Programmer DD

Oct 13, 2025 · Artificial Intelligence

Running ONNX AI Inference Natively in Java Without Python

This article explains how enterprise architects can integrate ONNX‑based machine‑learning inference directly into Java applications, covering tokenizer integration, GPU acceleration, deployment patterns, and lifecycle management to achieve secure, scalable, and observable AI services without relying on Python runtimes.

AI inference · Enterprise Architecture · GPU

0 likes · 16 min read

Open Source Tech Hub

Sep 30, 2025 · Artificial Intelligence

Boost PHP Performance with High‑Speed Tensor Computing Using PHP‑ORT

PHP‑ORT is a high‑performance PHP extension that brings SIMD‑accelerated tensor operations and optional ONNX Runtime integration to PHP, offering multi‑core parallelism, extensive type support, and memory‑efficient processing for machine‑learning, scientific, and data‑intensive applications.

Extension · ONNX · PHP

0 likes · 6 min read

Network Intelligence Research Center (NIRC)

Jul 2, 2025 · Artificial Intelligence

Optimizing Deep Learning Inference with TensorRT: A Practical Toolchain Walkthrough

This article walks through TensorRT's core optimization features, auxiliary debugging tools, and a step‑by‑step SMPLer‑X case study, showing how graph simplification, mixed‑precision, and engine generation cut inference latency to roughly 22‑29% of the original runtime.

GPU inference · ONNX · Polygraphy

0 likes · 6 min read

Network Intelligence Research Center (NIRC)

Apr 28, 2025 · Artificial Intelligence

Export a PyTorch Model to ONNX and Extract Intermediate Layer Features

This article walks through exporting a PyTorch image‑classification model to ONNX, identifying a hidden layer node, adding it as an output, and using ONNX Runtime to retrieve both the final prediction and the intermediate feature tensor.

ONNX · ONNX Runtime · PyTorch

0 likes · 5 min read

Smart Era Software Development

Mar 31, 2025 · Artificial Intelligence

Common AI Model Formats Developers Use: GGUF, PyTorch, Safetensors, and ONNX

This article compares four common AI model formats: GGUF, PyTorch (.pt/.pth), Safetensors, and ONNX. It covers each format's structure, advantages, drawbacks, and hardware support, and analyzes metadata organization, quantization options, security considerations, and suitability for different deployment scenarios.

AI model formats · GGUF · ONNX

0 likes · 15 min read

Network Intelligence Research Center (NIRC)

Mar 19, 2025 · Game Development

Quickly Build a MetaXR Interaction Lab in Unity

This guide walks through setting up the Meta XR SDK in Unity, using Building Blocks to add camera rigs, hand tracking, and passthrough, binding interaction events, accessing hand‑tracking data via OVRSkeleton/OVRHand, and integrating ONNX machine‑learning models for XR experiments.

BuildingBlocks · HandTracking · MetaXR

0 likes · 7 min read

Ops Development & AI Practice

Feb 14, 2025 · Artificial Intelligence

Large Model Format Showdown: Hugging Face, TensorFlow, ONNX, TorchScript, GGUF

This comprehensive guide examines the leading large‑model storage formats—including Hugging Face Transformers, TensorFlow SavedModel, ONNX, TorchScript, and GGUF—detailing their file structures, serialization methods, strengths, weaknesses, and typical use‑cases, helping developers and researchers select the optimal format for their specific AI workloads.

AI Deployment · GGUF · Model Formats

0 likes · 21 min read

Alibaba Cloud Developer

Nov 22, 2024 · Artificial Intelligence

Master YOLOv8: End-to-End Guide to Object Detection, Training, and Deployment

This comprehensive tutorial walks you through YOLOv8 object detection—from environment setup and dataset preparation to model training, validation, testing, and conversion to ONNX and TensorRT—providing clear commands, code snippets, and visual results for each step.

Model Training · ONNX · TensorRT

0 likes · 8 min read

Open Source Tech Hub

Aug 22, 2024 · Artificial Intelligence

Unlock AI Power in PHP: A Hands‑On Guide to TransformersPHP

TransformersPHP brings Hugging Face’s Transformer models to PHP, enabling developers to run thousands of pre‑trained NLP models locally for tasks like text generation, summarisation, and translation, with simple installation, ONNX‑based execution, and a Python‑like pipeline API.

AI · NLP · ONNX

0 likes · 8 min read

Rare Earth Juejin Tech Community

Apr 24, 2024 · Artificial Intelligence

Training MNIST with Burn on wgpu: From PyTorch to Rust Backend

This tutorial demonstrates how to train an MNIST digit‑recognition model using the Rust‑based Burn framework on top of the cross‑platform wgpu API, covering model export from PyTorch to ONNX, code generation, data loading, training loops, and performance comparison across CPU, GPU, and other backends.

Burn · GPU · MNIST

0 likes · 13 min read

Zuoyebang Tech Team

Jul 15, 2022 · Artificial Intelligence

How AI Scores Poetry Recitation: Inside Real-Time Speech Evaluation Tech

This article explains how the Zuoyebang homework‑help platform uses computer‑assisted language learning and neural network models to automatically evaluate spoken poetry, detailing the evaluation dimensions, reliability metrics such as Pearson correlation and kappa, data‑driven feature extraction, ONNX deployment, and continuous model improvement through patented automatic data feedback.

AI · ONNX · __call__

0 likes · 3 min read

Python Programming Learning Circle

Nov 8, 2021 · Artificial Intelligence

YOLOv5 Tutorial: From YOLOv3 to YOLOv5, Code Walkthrough, Model Export (JIT & ONNX) and Usage

This article provides a comprehensive guide to YOLOv5, covering its evolution from YOLOv3, detailed code analysis of the model architecture, step‑by‑step instructions for running detect.py, configuring yolov5s.yaml, exporting the model to TorchScript JIT and ONNX formats, and practical inference examples using PyTorch and ONNX Runtime.

JIT · ONNX · PyTorch

0 likes · 16 min read

WeChat Backend Team

Jun 7, 2021 · Artificial Intelligence

How WeChat’s TFCC Boosts Deep Learning Inference Performance Across Platforms

The TFCC framework, developed by WeChat's backend team, delivers high‑performance, easy‑to‑use, and universal deep‑learning inference by supporting numerous ONNX and TensorFlow operations, optimizing model structures, constants, and operators, and providing a versatile runtime and math library for both CPU and GPU platforms.

ONNX · TFCC · TensorFlow

0 likes · 8 min read

DataFunSummit

Mar 28, 2021 · Artificial Intelligence

Deploying Scikit‑learn and HMMlearn Models as High‑Performance Online Prediction Services Using ONNX

This article demonstrates how to convert traditional scikit‑learn and hmmlearn machine‑learning models into ONNX format and integrate them into a C++ gRPC service for fast online inference, covering environment setup, model conversion, custom operators, performance testing, and end‑to‑end pipeline construction.

C# · Model Deployment · ONNX

0 likes · 22 min read
