Tagged articles
6 articles
360 Tech Engineering
May 10, 2019 · Artificial Intelligence

Distributed Training with MXNet: Data Parallel on Single and Multi‑Node GPUs and Integration with Kubeflow

This article explains how MXNet supports data‑parallel training on single‑machine multi‑GPU and multi‑machine multi‑GPU setups, describes KVStore modes, outlines the worker‑server‑scheduler architecture, and shows how to launch large‑scale distributed training using Kubeflow and the mxnet‑operator.

Data Parallel · Deep Learning · Distributed Training
11 min read
360 Zhihui Cloud Developer
May 9, 2019 · Artificial Intelligence

Master Distributed MXNet Training with Kubeflow: A Step‑by‑Step Guide

Learn how to perform single‑machine multi‑GPU and multi‑node multi‑GPU training with MXNet, understand KVStore modes, configure workers, servers, and schedulers, and deploy large‑scale distributed training on Kubernetes using Kubeflow, including operator installation, task creation, and performance considerations.

Distributed Training · GPU · Kubeflow
11 min read
Meitu Technology
May 29, 2018 · Artificial Intelligence

Boost MXNet Video Training Speed by Up to 18× with Rec‑Format I/O

This article analyzes MXNet's lack of native video I/O, compares existing image iterators, introduces a Rec‑format based video iterator, and demonstrates through single‑GPU and multi‑GPU experiments that the new approach can accelerate training by up to eighteen times.

Deep Learning · ImageRecordIter · MXNet
9 min read
ITPUB
Sep 21, 2016 · Artificial Intelligence

Deep Learning Platforms Unveiled: From DistBelief to TensorFlow and Real‑World Uses

The article reviews the evolution and challenges of deep learning, outlines major commercial platforms such as DistBelief, COTS, and Adam, compares open‑source frameworks like MXNet, TensorFlow and Petuum, and highlights their architectures, performance metrics, and diverse applications ranging from image recognition to recommendation systems.

Deep Learning · MXNet · TensorFlow
11 min read
ITPUB
Sep 6, 2016 · Artificial Intelligence

Deep Learning Platforms: From Google’s DistBelief to Open‑Source MXNet and TensorFlow

The article reviews the evolution and challenges of deep learning, surveys commercial and open‑source platforms (DistBelief, COTS, Adam, MXNet, TensorFlow, and Petuum), and highlights real‑world applications such as image recognition, recommendation, sentiment analysis, and crowd monitoring.

AI applications · Distributed Training · GPU Acceleration
10 min read