Mastering NumPy: Manual Array Manipulation with as_strided and ndarray
This tutorial explores advanced NumPy techniques for manually manipulating array metadata such as strides and shape, using functions like as_strided() and ndarray() to create custom views, achieve memory‑efficient operations, and avoid common pitfalls like unintended data corruption.
In the final article of this series we take full control of NumPy's internal mechanisms. You will learn how to manually manipulate array metadata (stride and shape) to create powerful, memory‑efficient views, explore advanced tools such as as_strided() and the ndarray() constructor, and build custom array transformations without unnecessary data copies.
We will demonstrate practical examples such as extracting diagonals, performing efficient broadcasting, and implementing a 2‑D sliding window for convolution‑style operations.
Manual NumPy Array Manipulation
A view is a new array object that shares the same data buffer as the original array but has a different shape and/or stride. Views let you operate on the original data without copying, which saves memory but can produce non‑contiguous data and performance issues if used carelessly.
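As a quick illustration of the definition above, the following sketch checks that a view and its source share one buffer. The names base and view are illustrative, and the integer size is pinned with dtype=np.int64 so the stride value is predictable.

```python
import numpy as np

base = np.arange(6, dtype=np.int64)   # owns its data buffer
view = base[::2]                      # basic slicing returns a view, not a copy

print(np.shares_memory(base, view))   # True: both use the same buffer
print(view.strides)                   # (16,): every other 8-byte element

view[0] = 100                         # writing through the view...
print(base[0])                        # ...changes the original: 100
```

Only the metadata (shape and strides) differs between the two objects; the bytes are stored once.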
We start by directly modifying the strides and shape metadata, then introduce the as_strided() function and the ndarray constructor. The third part of the article works through three examples that use these tools to build custom array operations.
Directly Manipulating Strides and Shape
Previously we used functions like flip() and rot90() to create new views, but we can also modify the metadata ourselves. Let’s start with a simple 1‑D array:
import numpy as np
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
# strides: (8,) shape: (12,), assuming 8-byte integers
You can manually set a new shape, for example (4, 3):
x.shape = (4, 3)
print(x)
# array([[ 0,  1,  2],
#        [ 3,  4,  5],
#        [ 6,  7,  8],
#        [ 9, 10, 11]])
print(x.strides)
# (24, 8)
Note that the strides update automatically to match the new shape. The reverse order does not work: the strides tuple must have the same length as the current shape, so 2‑D strides cannot be assigned while the array is still 1‑D.
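The default strides follow directly from the shape and the item size. As a rough sketch of that relationship, the hypothetical helper c_order_strides below (not a NumPy function) recomputes the row-major strides by hand and compares them with what NumPy reports.

```python
import numpy as np

def c_order_strides(shape, itemsize):
    """Byte strides of a C-contiguous array (hypothetical helper)."""
    strides = []
    step = itemsize
    # Walk the axes from last to first: the last axis moves by one item,
    # each earlier axis moves by the full extent of the axes after it.
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))

a = np.arange(12, dtype=np.int64)
print(c_order_strides((12,), a.itemsize))    # (8,)
print(c_order_strides((4, 3), a.itemsize))   # (24, 8)
print(a.reshape(4, 3).strides)               # (24, 8)
```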
You can also change the stride after reshaping:
x.strides = (16, 16)
# array([[ 0,  2,  4],
#        [ 2,  4,  6],
#        [ 4,  6,  8],
#        [ 6,  8, 10]])
If the new strides are incompatible with the shape or the memory buffer, NumPy raises a ValueError:
x.strides = (24, 16)
# ValueError: strides is not compatible with available memory
x.strides = (24,)
# ValueError: strides must be same length as shape (2)
To manipulate strides and shape in a single call we have two options: as_strided() and ndarray(), each with its own characteristics.
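Before moving on, the two ValueError cases above can be reproduced with a small bounds calculation. The helper fits_in_buffer is hypothetical and only approximates the idea behind the check; NumPy's internal validation may differ in detail. It also accepts a starting byte offset, which matters for negative strides.

```python
import numpy as np

def fits_in_buffer(shape, strides, itemsize, nbytes, offset=0):
    """Return True if every element addressed by (shape, strides), starting
    at byte `offset`, lies inside a buffer of `nbytes` bytes.
    Hypothetical helper, not part of NumPy."""
    if len(shape) != len(strides):
        return False
    if any(n == 0 for n in shape):
        return True  # an empty view touches no memory
    lowest = highest = offset
    for n, s in zip(shape, strides):
        if s >= 0:
            highest += (n - 1) * s   # farthest reach in the forward direction
        else:
            lowest += (n - 1) * s    # farthest reach in the backward direction
    return lowest >= 0 and highest + itemsize <= nbytes

x = np.arange(12, dtype=np.int64)  # 96-byte buffer
print(fits_in_buffer((4, 3), (16, 16), x.itemsize, x.nbytes))          # True
print(fits_in_buffer((4, 3), (24, 16), x.itemsize, x.nbytes))          # False
print(fits_in_buffer((4, 3), (24,), x.itemsize, x.nbytes))             # False
print(fits_in_buffer((12,), (-8,), x.itemsize, x.nbytes))              # False
print(fits_in_buffer((12,), (-8,), x.itemsize, x.nbytes, offset=88))   # True
```

For strides (24, 16) the last element would start at byte 3 * 24 + 2 * 16 = 104, which is already past the 96-byte buffer.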
Using as_strided() for Array Operations
The low‑level function as_strided() creates a new view with arbitrary shape and stride. For example, a 3‑element sliding window with stride 2 can be built as follows:
from numpy.lib.stride_tricks import as_strided
as_strided(x, shape=(4, 3), strides=(16, 16))
# array([[ 0,  2,  4],
#        [ 2,  4,  6],
#        [ 4,  6,  8],
#        [ 6,  8, 10]])
as_strided() does not check whether the new strides stay within the allocated buffer, so you must ensure validity yourself. Invalid strides can read memory outside the array, producing garbage values or causing crashes:
as_strided(x, shape=(4, 3), strides=(24, 16))
# array([[ 0,  2,  4],
#        [ 3,  5,  7],
#        [ 6,  8, 10],
#        [ 9, 11, 785]])
Negative strides can be used to reverse an array, but the view must then start at the last element. as_strided() has no offset parameter, so the call below starts at the first element and reads memory before the buffer:
as_strided(x, shape=(12,), strides=(-8,))
# array([0, 113, 140037802748368, -1, -1, 33, 8876032, 140037802747904, 2314885530818453536, 723436171118780460, 3628529247127085088, 2314885530818453536])
Using the ndarray Constructor for Array Operations
The ndarray constructor behaves similarly to as_strided() but performs stride‑compatibility checks and allows you to specify an offset for the starting pointer.
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
np.ndarray(buffer=x.data, shape=(4, 3), strides=(16, 16), dtype=x.dtype)
# array([[ 0,  2,  4],
#        [ 2,  4,  6],
#        [ 4,  6,  8],
#        [ 6,  8, 10]])
Providing invalid strides now raises a ValueError:
np.ndarray(buffer=x.data, shape=(4, 3), strides=(24, 16), dtype=x.dtype)
# ValueError: strides is incompatible with shape of requested array and size of buffer
To reverse the array safely we add an offset equal to the last element's byte position:
np.ndarray(buffer=x.data, shape=(12,), strides=(-8,), dtype=x.dtype, offset=x.nbytes - x.itemsize)
# array([11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
Practical Examples of Custom Operations
We now illustrate three practical examples that use the techniques above: extracting diagonals from an N×N matrix, efficient memory broadcasting, and implementing a 2‑D sliding window for convolution‑style filtering.
Example 1: Extract Diagonal from an N×N Matrix
Note: NumPy provides np.diagonal(), but we set it aside here to demonstrate the low‑level approach.
import numpy as np
N = 4
x = np.arange(N * N).reshape(N, N)
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15]])
# Main diagonal: consecutive elements are N + 1 items apart.
diag = np.ndarray(buffer=x.data, shape=(N,), dtype=x.dtype,
                  strides=((N + 1) * x.itemsize,))
# array([ 0,  5, 10, 15])
# Anti-diagonal: start at the last item of row 0, then step N - 1 items.
anti_diag = np.ndarray(buffer=x.data, shape=(N,), dtype=x.dtype,
                       strides=((N - 1) * x.itemsize,),
                       offset=(N - 1) * x.itemsize)
# array([ 3,  6,  9, 12])
Example 2: Efficient Memory Broadcasting
Adding a row vector of shape (1, 4) to a column vector of shape (6, 1) triggers NumPy broadcasting: the result has shape (6, 4), and neither input is first expanded into a full (6, 4) copy.
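To see how broadcasting avoids expanding the inputs, the sketch below uses NumPy's own np.broadcast_to and inspects the strides of the views it returns. The names row and col are illustrative, and dtype=np.int64 is an assumption made so the stride values are predictable.

```python
import numpy as np

row = np.array([[0, 1, 2, 3]], dtype=np.int64)                     # shape (1, 4)
col = np.array([[6], [7], [8], [9], [10], [11]], dtype=np.int64)   # shape (6, 1)

row_b = np.broadcast_to(row, (6, 4))
col_b = np.broadcast_to(col, (6, 4))

print(row_b.strides)   # (0, 8): moving along axis 0 re-reads the same 4 items
print(col_b.strides)   # (8, 0): moving along axis 1 re-reads the same item
print(np.shares_memory(row_b, row), np.shares_memory(col_b, col))  # True True
print(row_b.flags.writeable)   # False: broadcast views are read-only
```

A stride of zero means that stepping along that axis does not move through memory at all, which is how a small buffer can be presented as a larger array.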
x = np.array([[0, 1, 2, 3]]) # shape (1,4)
y = np.array([[6], [7], [8], [9], [10], [11]]) # shape (6,1)
print(x + y)
# array([[ 6,  7,  8,  9],
#        [ 7,  8,  9, 10],
#        [ 8,  9, 10, 11],
#        [ 9, 10, 11, 12],
#        [10, 11, 12, 13],
#        [11, 12, 13, 14]])
We can mimic broadcasting by creating views with a stride of zero along the broadcast axis:
x_broadcast = np.ndarray(buffer=x.data, shape=(6, 4), strides=(0, 8), dtype=x.dtype)
y_broadcast = np.ndarray(buffer=y.data, shape=(6, 4), strides=(8, 0), dtype=y.dtype)
print(x_broadcast + y_broadcast)
# same result as above
Example 3: Implement a 2‑D Sliding Window
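A 2‑D sliding window is common in image processing. We define a (2, 2) kernel and slide it over a 12‑element buffer interpreted as a 4×3 array, creating a 4‑D view of shape (3, 2, 2, 2): 3 window rows, 2 window columns, kernel height, kernel width. The strides (24, 8, 24, 8) repeat the row and column steps of the 4×3 layout, once for moving the window and once for moving inside it.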
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
windows = np.ndarray(buffer=x.data, dtype=x.dtype,
                     shape=(3, 2, 2, 2),
                     strides=(24, 8, 24, 8))
# windows.shape == (3, 2, 2, 2)
# windows[0, 0] is [[0, 1], [3, 4]]
We then apply a simple kernel that sums the diagonal elements of each window:
kernel = np.array([[1, 0], [0, 1]])
# Option 1: explicit loops
result = np.zeros((3, 2))
for row in range(windows.shape[0]):
    for col in range(windows.shape[1]):
        result[row, col] = np.sum(windows[row, col] * kernel)
# Option 2: broadcasting
result = np.sum(windows * kernel[None, None, ...], axis=(2, 3))
# Option 3: einsum
result = np.einsum('ijkl,kl->ij', windows, kernel)
print(result)
# array([[ 4,  6],
#        [10, 12],
#        [16, 18]])
Precautions: Memory Consumption and Data Integrity
Overwriting Original Data with Views
Modifying a view also modifies the original array because they share the same buffer. For example, using as_strided() to create a view that repeats elements can unintentionally change the source data:
y = np.array([2, 5])
b = as_strided(y, shape=(2, 8), strides=(8, 0), writeable=True)
b[0, 4] = 99
print(y)  # array([99, 5])
print(b)  # view shows the change in all repeated positions
Crossing Array Boundaries
as_strided() performs no bounds checking. Supplying strides that point outside the allocated memory can read garbage or cause crashes:
y = np.array([2, 5])
b = as_strided(y, shape=(2, 3), strides=(16, 0))
# May produce nonsensical values without raising an error
Misleading nbytes Reporting
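A view's nbytes attribute is computed as the number of elements times itemsize, even when the view shares a tiny underlying buffer, so it can greatly overstate real memory use. sys.getsizeof() counts the data only for arrays that own their buffer, so on a view it reports just the small object overhead; compare against the source array's nbytes or use a memory profiler for a realistic estimate.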
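The gap between reported and actual size is easy to see with a zero-stride view. This is a minimal sketch; the shape (2, 1000) is chosen only to exaggerate the effect, and dtype=np.int64 is an assumption so the numbers are fixed.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

y = np.array([2, 5], dtype=np.int64)
# Each row repeats one element 1000 times without storing it 1000 times.
b = as_strided(y, shape=(2, 1000), strides=(y.itemsize, 0), writeable=False)

print(y.nbytes)   # 16: the buffer that actually exists
print(b.nbytes)   # 16000: 2 * 1000 * 8, as if every element were stored
```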
Writable Views Are the Default
By default as_strided() creates writable views. If you only need read‑only access, set writeable=False to avoid accidental modifications:
b = as_strided(y, shape=(2, 8), strides=(8, 0), writeable=False)
# b.flags.writeable is False; attempts to assign raise ValueError
Summary
These techniques can yield significant performance gains, but they trade safety for speed. Always verify that your custom views stay within allocated memory, that you are not unintentionally mutating shared data, and that the added complexity is justified.
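One way to act on this advice for windowed views is to let NumPy build them. The sketch below assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available. It derives the window strides from the input array and returns a read-only view by default. The name grid is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy 1.20+

grid = np.arange(12, dtype=np.int64).reshape(4, 3)
windows = sliding_window_view(grid, (2, 2))   # all 2x2 windows of the 4x3 grid

print(windows.shape)             # (3, 2, 2, 2)
print(windows.strides)           # (24, 8, 24, 8)
print(windows.flags.writeable)   # False

kernel = np.array([[1, 0], [0, 1]])
print(np.einsum('ijkl,kl->ij', windows, kernel))
# [[ 4  6]
#  [10 12]
#  [16 18]]
```

The shape and strides match the hand-built view from Example 3, so the manual version remains useful for understanding what the helper does underneath.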
If you are unsure, test, profile, and document your approach. The power of manual stride manipulation comes with responsibility.
Code Mala Tang
Read source code together, write articles together, and enjoy spicy hot pot together.