Tensor Indexing in PaddlePaddle: Concepts, Operations, and Practical Examples
This article explains PaddlePaddle tensor indexing, covering basic slicing, integer and boolean advanced indexing, ellipsis and newaxis usage, assignment in dynamic and static graphs, automatic gradient propagation, and demonstrates practical applications such as semantic segmentation, object detection, and NLP sequence masking.
In deep learning, tensors are the core data structures. This article, contributed by Lu Chang from the PaddlePaddle PFCC community, introduces tensor indexing in the PaddlePaddle framework, covering basic concepts, various indexing techniques, assignment, gradient propagation, and real‑world use cases.
Basic concepts: Tensor indexing refers to selecting subsets of a multi-dimensional array. It is essential for data access, model construction, and automatic differentiation.
Basic indexing includes:
Single integer or 0‑D Tensor indexing to select a specific row or element.
import paddle
a = paddle.arange(6).reshape((2,3))
print(a)
# Tensor Output:
# [[0, 1, 2],
#  [3, 4, 5]]
b = a[1] # select second row
print(b)
# Tensor Output: [3, 4, 5]
c = a[-1] # select last row
print(c)
# Tensor Output: [3, 4, 5]
index = paddle.to_tensor(1, dtype='int32')
print(a[index])
# Tensor Output: [3, 4, 5]

Python slice objects (start:end:step) for range selection, including negative indices and steps.
import paddle
a = paddle.arange(10).reshape((2,5))
print(a)
# Tensor Output:
# [[0, 1, 2, 3, 4],
#  [5, 6, 7, 8, 9]]
b = a[0, 1:4] # first row, elements 1‑3
print(b) # [1, 2, 3]
c = a[:, ::2] # every second column
print(c) # [[0, 2, 4], [5, 7, 9]]
d = a[:, ::-1] # reverse each row
print(d) # [[4, 3, 2, 1, 0], [9, 8, 7, 6, 5]]
g = a[:, 1:4:2]
print(g) # [[1, 3], [6, 8]]
h = a[0, ::2] # every second element of the first row
print(h) # [0, 2, 4]

Ellipsis ( ... ) as a shorthand for a full slice over all remaining dimensions.
import paddle
a = paddle.arange(24).reshape((2,3,4))
print(a)
# ... selects all remaining axes
b = a[...]
print(b) # identical to a
c = a[1, ...]
print(c) # the second 3×4 sub-tensor, shape (3, 4)
d = a[1, ..., 0]
print(d) # elements at index 0 of the last dimension

Using None (or np.newaxis ) to insert a new axis of size 1.
import paddle
import numpy as np
a = paddle.arange(6).reshape((2,3))
print(a)
b = a[None, :]
print(b) # shape (1,2,3)
e = a[np.newaxis, :, np.newaxis, :]
print(e) # shape (1,2,1,3)
f = paddle.arange(3)[:, None]
print(f) # shape (3,1)
g = paddle.arange(2)[None, :]
print(g) # shape (1,2)
h = f @ g # (3,1) @ (1,2) -> outer product of shape (3,2)
print(h)

Advanced indexing:
Integer array indexing (Python list, NumPy array, or Paddle Tensor) for arbitrary element selection and repetition.
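The result of a single integer-array index has the shape of the index array followed by the source tensor's remaining dimensions. Paddle follows NumPy semantics here, so the rule can be sketched with NumPy (the shapes in this sketch are illustrative):

```python
import numpy as np

a = np.arange(8).reshape(4, 2)   # source tensor, shape (4, 2)
idx = np.array([[1], [2]])       # integer index, shape (2, 1)
out = a[idx]                     # result shape = idx.shape + a.shape[1:]
print(out.shape)                 # (2, 1, 2)
```

The same mechanism allows repetition: a[[0, 1, 0]] returns row 0 twice.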
import paddle
import numpy as np

a = paddle.arange(8).reshape((4,2))
b = a[[0,2,1]]                      # rows 0, 2, 1 in that order
c = a[np.array([0,1,0])]            # NumPy array index; row 0 appears twice
index = paddle.to_tensor([[1],[2]])
d = a[index]                        # index shape (2,1) -> result shape (2,1,2)
e = a[[2,0,3],[1,0,0]]              # element-wise pairs: a[2,1], a[0,0], a[3,0]
print(b, c, d, e, sep='\n')

Boolean array indexing (mask) to filter elements that satisfy a condition.
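A boolean mask index returns the matching elements flattened in row-major order, and is equivalent to gathering at the coordinates where the mask is True (Paddle also exposes this operation as paddle.masked_select). A NumPy sketch of the equivalence, since Paddle mirrors NumPy semantics here:

```python
import numpy as np

a = np.arange(8).reshape(4, 2)
mask = a > 4
# mask indexing flattens the hits in row-major order ...
print(a[mask])                # [5 6 7]
# ... and equals a gather at the True coordinates
print(a[np.nonzero(mask)])    # [5 6 7]
```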
import paddle

a = paddle.arange(8).reshape((4,2))
mask = a > 4                        # boolean Tensor of shape (4,2)
b = a[mask]                         # matching elements, flattened
print(b) # [5, 6, 7]
c = a[[True, False, True, False]]   # boolean list selects rows 0 and 2
print(c) # [[0, 1], [4, 5]]

Combined (mixed) indexing that mixes basic and advanced indices.
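Paddle applies NumPy's mixed-indexing rules: all advanced indices are broadcast to a common shape; if they sit next to each other, the broadcast dimensions replace them in place, while if a slice separates them, the broadcast dimensions move to the front of the result. A NumPy sketch of both cases:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
# adjacent advanced indices: broadcast shape (3,) stays in place
b = a[:, [0, 0, 1], [1, 2, 0]]
print(b.shape)   # (2, 3)
# advanced indices separated by a slice: broadcast shape leads
c = a[[0, 1, 1], :, [3, 0, 2]]
print(c.shape)   # (3, 3)
```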
import paddle
a = paddle.arange(24).reshape((2,3,4))
# basic index then advanced index
b = a[0, [1,2], 2]
print(b) # [6,10]
# a single advanced index between slices keeps its position
c = a[:, [0, 0, 1], :]
print(c.shape) # [2, 3, 4]
# advanced indices separated by a slice: the broadcast index
# dimension moves to the front of the result
d = a[[1], :, [2, 1, 0]]
print(d.shape) # [3, 3]

Index assignment:
In dynamic graph mode, assignment through __setitem__ works directly on the Tensor. In static graph mode, paddle.static.setitem must be used instead, because program variables follow a single-assignment rule.
import paddle

paddle.enable_static()
with paddle.static.program_guard(paddle.static.Program()):
    a = paddle.ones((2,3,4), dtype='float32')
    b = paddle.static.setitem(a, 0, 10)
    # ... execute program ...
    print(b)

# slice assignment example
with paddle.static.program_guard(paddle.static.Program()):
    a = paddle.ones((2,3,4), dtype='float32')
    b = paddle.static.setitem(a, (slice(None), 1, 2), 10)
    # ... execute program ...
    print(b)

Gradient propagation:
Tensor indexing is differentiable; gradients are automatically back‑propagated to the original tensor positions.
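Concretely, the backward of b = a[index] is a scatter-add: each upstream gradient slice accumulates into the position it was gathered from, so repeated indices sum their contributions. A NumPy sketch of that accumulation (shapes chosen for illustration):

```python
import numpy as np

da = np.zeros((4, 3))          # gradient buffer for the source tensor
index = np.array([1, 3, 1])    # forward pass gathered rows 1, 3, 1
d_out = np.ones((3, 3))        # upstream gradient, one slice per gather
np.add.at(da, index, d_out)    # accumulates even where indices repeat
print(da[1])                   # row 1 was gathered twice -> [2. 2. 2.]
```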
# schematic backward of b = a[index]: gradients scatter-add into da
# (a, index, dOut and the loop bounds X, Y are illustrative)
da = paddle.zeros_like(a)
for i in range(X):
    for j in range(Y):
        da[index[i,j]] += dOut[i,j,:]

Practical cases:
Semantic segmentation – map class indices to RGB colors using indexing.
import paddle
pred = paddle.randint(0,5,shape=[512,512],dtype='int32')
colors = paddle.to_tensor([[0,0,0],[255,0,0],[0,255,0],[0,0,255],[255,255,0]])
pred_color = colors[pred].numpy().astype('uint8') # (512, 512, 3) color image

Object detection – filter bounding boxes by confidence score.
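Score filtering is boolean indexing again; the nonzero-then-gather pattern used below is interchangeable with masking the rows directly, as a NumPy sketch shows:

```python
import numpy as np

scores = np.array([0.9, 0.3, 0.85, 0.1])
boxes = np.arange(16).reshape(4, 4)   # one 4-number box per score
keep = scores > 0.8                   # boolean mask over the 4 boxes
# direct mask indexing equals gathering at the surviving indices
assert (boxes[keep] == boxes[np.nonzero(keep)[0]]).all()
print(boxes[keep].shape)              # (2, 4)
```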
import paddle
outputs = {'bbox': paddle.randn([100, 4]),
           'score': paddle.rand([100]),
           'class': paddle.randint(0, 80, [100])}
conf_thr = 0.8
selected = paddle.nonzero(outputs['score'] > conf_thr).squeeze(1)
selected_boxes = outputs['bbox'][selected]
selected_classes = outputs['class'][selected]
print(selected_boxes, selected_classes)

NLP – mask and select sequences of varying lengths.
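The key step in the example below is building the validity mask by broadcasting a position ramp against the per-sequence lengths; in NumPy terms:

```python
import numpy as np

seq_len = 6
lengths = np.array([2, 4])                    # valid length of each sequence
# (seq_len,) compared with (2, 1) broadcasts to a (2, seq_len) mask
mask = np.arange(seq_len) < lengths[:, None]
print(mask.astype(int))
# [[1 1 0 0 0 0]
#  [1 1 1 1 0 0]]
```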
import paddle
batch_size, seq_len, dim = 32, 100, 768
x = paddle.randn((batch_size, seq_len, dim))
lengths = paddle.randint(10,100,shape=[batch_size])
mask = paddle.arange(seq_len) < lengths.unsqueeze(-1)
masked = paddle.where(mask.unsqueeze(-1), x, paddle.to_tensor(0.))
long_idx = paddle.nonzero(lengths > 50).flatten()
selected = masked[long_idx]
mean_feat = paddle.mean(selected, axis=1)
print(mean_feat.shape)

Overall, the article provides a comprehensive guide to tensor indexing in PaddlePaddle, from elementary slicing to advanced indexing, assignment, and gradient handling, and demonstrates how these techniques are applied in computer vision and NLP tasks.
Baidu Geek Talk