Getting Started with Huawei Ascend AI Accelerators

This guide walks through the fundamentals of Huawei Ascend NPU hardware, the CANN software stack, driver and firmware installation, Kubernetes integration via Docker runtime and device plugin, and a complete ResNet‑50 inference demo on Ascend 310P.

Infra Learning Club

Overview

Huawei Ascend is an AI‑focused NPU series developed by HiSilicon. The Ascend 310 targets edge workloads with up to 16 TOPS (INT8) or 8 TFLOPS (FP16), while the Ascend 910 delivers up to 640 TOPS (INT8) or 320 TFLOPS (FP16). The chips are built on different process nodes (12 nm for the 310, N7+ for the 910) and consume about 8 W and 310 W respectively.

Software Stack (CANN)

The Compute Architecture for Neural Networks (CANN) is Huawei’s heterogeneous AI computing framework, analogous to Nvidia’s CUDA. It provides drivers, firmware, and a set of libraries that let AI frameworks target the Ascend processors. The community edition can be downloaded freely for non‑commercial use; a commercial edition is required for production deployments and must be obtained through an application process.
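The toolkit portion of CANN is distributed as a self‑extracting .run package, installed much like the driver and firmware below. A sketch of the typical flow follows; the package filename is a placeholder (download the real file from the Ascend community site first), and every step falls back to a message when the package is absent:

```shell
# Make the downloaded toolkit package executable and install it.
# Filename/version is a placeholder -- an assumption, not the exact name.
chmod +x Ascend-cann-toolkit_*.run 2>/dev/null \
  || echo "toolkit package not present, skipping chmod"
./Ascend-cann-toolkit_*.run --install 2>/dev/null \
  || echo "toolkit package not present, skipping install"

# The toolkit ships an environment script; source it in every shell
# that uses ATC or the runtime libraries.
if [ -f /usr/local/Ascend/ascend-toolkit/set_env.sh ]; then
  . /usr/local/Ascend/ascend-toolkit/set_env.sh
else
  echo "set_env.sh not found (toolkit not installed on this machine)"
fi
```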

Kubernetes Integration

To run Ascend NPU workloads on Kubernetes, the following steps are required:

Verify the NPU hardware with lspci or npu‑smi info.

Download the appropriate driver (A300t‑9000‑npu‑driver_*.run) and firmware (A300t‑9000‑npu‑firmware_*.run) packages from the community download page.

Install the driver first, then the firmware (or the reverse order for a full reinstall over an existing installation).

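The hardware verification step can be sketched as follows; the grep pattern is an assumption (vendor strings vary by card), and both commands fall back to a message when the hardware or tools are missing:

```shell
# Look for the NPU on the PCI bus.
lspci 2>/dev/null | grep -iE "ascend|huawei" \
  || echo "no Ascend device visible via lspci"

# npu-smi ships with the NPU driver and reports chip health,
# temperature, and memory usage per device.
npu-smi info 2>/dev/null \
  || echo "npu-smi not available (driver not installed yet?)"
```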
Install the Ascend Docker Runtime:

$ wget -c https://mindx.obs.cn-south-1.myhuaweicloud.com/OpenSource/MindX/MindX%205.0.RC2/MindX%20DL%205.0.RC2/Ascend-docker-runtime_5.0.RC2_linux-x86_64.run
$ chmod u+x Ascend-docker-runtime_5.0.RC2_linux-x86_64.run
$ ./Ascend-docker-runtime_5.0.RC2_linux-x86_64.run --install
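After the installer finishes, a quick sanity check confirms Docker sees the new runtime. This sketch assumes the installer registers itself in /etc/docker/daemon.json (behavior inferred from the package, not verified here); both commands fall back to a message when Docker is absent:

```shell
# Check which runtimes Docker reports.
docker info 2>/dev/null | grep -i runtime \
  || echo "docker not available on this host"

# Inspect the daemon config the installer is expected to modify.
cat /etc/docker/daemon.json 2>/dev/null \
  || echo "/etc/docker/daemon.json not found"
```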

Configure containerd to use the Ascend runtime by editing /etc/containerd/config.toml and pointing the runtime binary at /usr/local/Ascend/Ascend-Docker-Runtime/ascend-docker-runtime.
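A minimal sketch of that config.toml change, assuming containerd's v2 config layout with the CRI plugin (section names vary across containerd versions, so compare against your existing file before applying):

```toml
# /etc/containerd/config.toml -- sketch, not a drop-in file.
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "ascend"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.ascend]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.ascend.options]
  BinaryName = "/usr/local/Ascend/Ascend-Docker-Runtime/ascend-docker-runtime"
```

Restart containerd (systemctl restart containerd) after editing so the new runtime takes effect.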

Deploy the device plugin (MindX DL) either by building the image from source:

$ wget -c https://mindx.obs.cn-south-1.myhuaweicloud.com/OpenSource/MindX/MindX%205.0.RC2/MindX%20DL%205.0.RC2.1/Ascend-mindxdl-device-plugin_5.0.RC2.1_linux-x86_64.zip
$ unzip Ascend-mindxdl-device-plugin_5.0.RC2.1_linux-x86_64.zip
$ docker build -t ascend-k8sdeviceplugin:v5.0.RC2 .

or by pulling the pre‑built image (requires an enterprise AscendHub account):

$ docker pull ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin:v5.0.RC2

Apply the plugin manifest:

$ kubectl apply -f device-plugin-310-v5.0.RC2.yaml
$ kubectl label nodes {node-name} accelerator=huawei-Ascend310

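Once the plugin is running, the node should advertise the NPU as an allocatable extended resource (huawei.com/Ascend310 for the 310, huawei.com/Ascend310P for the 310P). A quick check, with a fallback message when no cluster is reachable:

```shell
# Grep the node description for the NPU extended resource.
kubectl describe nodes 2>/dev/null | grep -i "huawei.com/Ascend" \
  || echo "no NPU resource registered (or kubectl/cluster unavailable)"
```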
If the NPU is not detected in certain virtualized environments, two workarounds are known to help: mount dmidecode or systemd-detect-virt from the host into the plugin container, or add apt-get install -y systemd to the plugin Dockerfile.
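The Dockerfile workaround can be sketched as follows; the base image tag matches the build step above, and an apt-based (Debian/Ubuntu) base image is assumed:

```dockerfile
# Extend the device-plugin image so systemd (and with it
# systemd-detect-virt) is available inside the container.
FROM ascend-k8sdeviceplugin:v5.0.RC2
RUN apt-get update \
 && apt-get install -y systemd \
 && rm -rf /var/lib/apt/lists/*
```

Rebuild and redeploy the plugin DaemonSet with the resulting image.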

Demo: ResNet‑50 Inference on Ascend 310P

Download the sample code from the Ascend GitHub repository, fetch the Caffe model files, and convert them to an Ascend‑compatible .om model using the ATC tool:

$ atc --model=caffe_model/resnet50.prototxt \
    --weight=caffe_model/resnet50.caffemodel \
    --framework=0 \
    --output=model/resnet50 \
    --soc_version=Ascend310P3 \
    --input_format=NCHW \
    --input_fp16_nodes=data \
    --output_type=FP32 \
    --out_nodes=prob:0

The conversion produces resnet50.om. Deploy a Kubernetes Deployment that requests one Ascend310P device:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: ascend-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ascend-test
  template:
    metadata:
      labels:
        app: ascend-test
    spec:
      containers:
      - name: container-1
        image: ascendhub.huawei.com/public-ascendhub/ascend-pytorch:23.0.RC1-centos7.6
        command: ["top", "-b"]
        resources:
          limits:
            cpu: "2"
            huawei.com/Ascend310P: "1"
            memory: 16Gi
          requests:
            cpu: "2"
            huawei.com/Ascend310P: "1"
            memory: 16Gi

Inside the pod, run the inference script:

$ python3 ./src/acl_net.py

The output reports class 161 (basset) with about 76 % confidence.
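The same run can be triggered from outside the cluster via kubectl exec; the deployment name comes from the manifest above and the script path from the sample repository, with a fallback message when no cluster is reachable:

```shell
# Execute the inference script inside the deployment's pod.
kubectl exec deploy/ascend-test -- python3 ./src/acl_net.py 2>/dev/null \
  || echo "kubectl exec failed (no cluster reachable from here)"
```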

CCE (Cloud Container Engine) Usage

After creating a CCE cluster, install the CCE AI suite from the plugin marketplace, add NPU nodes, and configure NPU quotas in the workload definition. The console displays monitoring metrics such as compute utilization, memory usage, and memory occupancy for each node.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
