Comprehensive Guide to Essential NumPy Functions for Array Creation, Manipulation, and Analysis
This tutorial presents a detailed overview of fifty core NumPy functions, covering array creation, reshaping, arithmetic, statistical analysis, set operations, splitting, stacking, printing, and data persistence, with a short explanation and a code example for each operation.
Why NumPy Is One of the Most Useful Tools in Python
NumPy is a fundamental library for handling large datasets efficiently in Python, offering a rich set of functions for array manipulation that are essential for data science and scientific computing.
Creating Arrays
1. array
Creates one‑dimensional or multi‑dimensional arrays.
<code>numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)</code>
Example:
<code>import numpy as np
np.array([1, 2, 3, 4, 5])
# → array([1, 2, 3, 4, 5])</code>
Convert a pandas Series or DataFrame to a NumPy array:
<code>import pandas as pd
sex = pd.Series(['Male', 'Male', 'Female'])
np.array(sex)
# → array(['Male', 'Male', 'Female'], dtype=object)</code>
2. linspace
Creates an array of evenly spaced floating‑point numbers.
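The effect of the endpoint and retstep parameters can be seen in a small sketch. The interval from 0 to 1 and the variable names are illustrative choices, not taken from the original example.

```python
import numpy as np

# Default: both endpoints are included, so 5 samples give a step of 0.25.
closed, step = np.linspace(0.0, 1.0, num=5, retstep=True)

# endpoint=False leaves out the stop value, so the step becomes 1/5.
half_open = np.linspace(0.0, 1.0, num=5, endpoint=False)

print(closed, step)
print(half_open)
```

With retstep=True the function returns a tuple of (samples, step), which is handy when the spacing is needed later.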
<code>numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)</code>
Example:
<code>np.linspace(10, 100, 10)
# → array([10., 20., 30., 40., 50., 60., 70., 80., 90., 100.])</code>
3. arange
Returns values within a given interval with a specified step.
<code>numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)</code>
Example:
<code>np.arange(5, 10, 2)
# → array([5, 7, 9])</code>
4. uniform
Generates random samples from a uniform distribution.
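The random examples in this section produce different values on every run. One way to make such draws reproducible is NumPy's Generator interface, sketched below; this is a swapped-in alternative to the legacy np.random functions shown in this section, and the seed value 42 is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for this sketch

floats = rng.uniform(5, 10, size=4)   # floats in [5, 10)
ints = rng.integers(5, 10, size=10)   # integers in [5, 10), like randint
unit = rng.random(3)                  # floats in [0, 1)

# A fresh generator with the same seed reproduces the same first draw.
again = np.random.default_rng(seed=42).uniform(5, 10, size=4)
print(np.array_equal(floats, again))
```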
<code>numpy.random.uniform(low=0.0, high=1.0, size=None)</code>
Example:
<code>np.random.uniform(5, 10, size=4)
# → array([6.47445571, 5.60725873, 8.82192327, 7.47674099])  (values differ on each run)</code>
5. random.randint
Generates random integers from low (inclusive) to high (exclusive); size sets how many are drawn.
<code>numpy.random.randint(low, high=None, size=None, dtype=int)</code>
Example:
<code>np.random.randint(5, 10, 10)
# → array([6, 8, 9, 9, 7, 6, 9, 8, 5, 9])  (values differ on each run)</code>
6. random.random
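Generates random floating‑point numbers in the half‑open interval [0.0, 1.0); size sets how many are drawn.
<code>numpy.random.random(size=None)</code>
Example:
<code>np.random.random(3)
# → array([0.87656396, 0.24706716, 0.98950278])  (values differ on each run)</code>
7. logspace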
Generates numbers spaced evenly on a log scale.
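Put differently, logspace raises the base to a set of evenly spaced exponents. The sketch below checks that relationship for the same arguments used in this section's example; the variable names are illustrative.

```python
import numpy as np

# logspace(start, stop, num, base=b) matches b ** linspace(start, stop, num).
via_logspace = np.logspace(0, 10, 5, base=2)
via_linspace = 2.0 ** np.linspace(0, 10, 5)

print(np.allclose(via_logspace, via_linspace))
```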
<code>numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)</code>
Example:
<code>np.logspace(0, 10, 5, base=2)
# → array([1.00000000e+00, 5.65685425e+00, 3.20000000e+01, 1.81019336e+02, 1.02400000e+03])</code>
8. zeros
Creates an array filled with zeros.
<code>numpy.zeros(shape, dtype=float, order='C', *, like=None)</code>
Example:
<code>np.zeros((2, 3), dtype='int')
# → array([[0, 0, 0],
#          [0, 0, 0]])</code>
9. ones
Creates an array filled with ones.
<code>numpy.ones(shape, dtype=None, order='C', *, like=None)</code>
Example:
<code>np.ones((3, 4))
# → array([[1., 1., 1., 1.],
#          [1., 1., 1., 1.],
#          [1., 1., 1., 1.]])</code>
10. full
Creates an n‑dimensional array filled with a single value.
<code>numpy.full(shape, fill_value, dtype=None, order='C', *, like=None)</code>
Example:
<code>np.full((2, 4), fill_value=2)
# → array([[2, 2, 2, 2],
#          [2, 2, 2, 2]])</code>
11. identity
Creates an identity matrix of a given size.
<code>numpy.identity(n, dtype=None, *, like=None)</code>
Example:
<code>np.identity(4)
# → array([[1., 0., 0., 0.],
#          [0., 1., 0., 0.],
#          [0., 0., 1., 0.],
#          [0., 0., 0., 1.]])</code>
Array Operations
12. min
Returns the minimum value in an array.
<code>numpy.min(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)</code>
Example:
<code>arr = np.array([1, 1, 2, 3, 3, 4, 5, 6, 6, 2])
np.min(arr)
# → 1</code>
13. max
Returns the maximum value in an array.
<code>numpy.max(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)</code>
Example:
<code>np.max(arr)
# → 6</code>
14. unique
Returns the sorted unique elements of an array.
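Besides counts, unique can return index arrays that map between the original and the deduplicated data. The sketch below uses the same 1‑D array as this section's examples; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 1, 2, 3, 3, 4, 5, 6, 6, 2])

# return_index: position of the first occurrence of each unique value.
# return_inverse: for each element of arr, its position in `values`.
values, first_idx, inverse = np.unique(arr, return_index=True, return_inverse=True)

rebuilt = values[inverse]  # indexing with the inverse reconstructs arr
print(values, first_idx)
print(np.array_equal(rebuilt, arr))
```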
<code>numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None, *, equal_nan=True)</code>
Example:
<code>np.unique(arr, return_counts=True)
# → (array([1, 2, 3, 4, 5, 6]), array([2, 2, 2, 1, 1, 2]))</code>
15. mean
Computes the average of array elements.
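<code>numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>, *, where=<no value>)</code>
Example:
<code>np.mean(arr, dtype='int')
# → 3  (the exact mean is 3.3; dtype='int' truncates the result)</code>
16. median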
Returns the median of the array.
<code>numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)</code>
Example:
<code>arr2 = np.array([[1, 2, 3], [5, 8, 4]])
np.median(arr2)
# → 3.5</code>
17. digitize
Returns the indices of the bins to which each value in input array belongs.
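Which bin a value lands in when it sits exactly on an edge depends on the right parameter. The sketch below contrasts the two settings; the sample values are illustrative and were chosen so that several fall exactly on a bin edge.

```python
import numpy as np

x = np.array([0.0, 0.5, 1.0, 2.5, 3.0, 3.5])
bins = np.array([0, 1, 2, 3])

# right=False (default): index i satisfies bins[i-1] <= x < bins[i]
left_closed = np.digitize(x, bins)

# right=True: index i satisfies bins[i-1] < x <= bins[i]
right_closed = np.digitize(x, bins, right=True)

print(left_closed)   # [1 1 2 3 4 4]
print(right_closed)  # [0 1 1 3 3 4]
```

An index of 0 means the value lies below the first edge, and len(bins) means it lies beyond the last one.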
<code>numpy.digitize(x, bins, right=False)</code>
Example:
<code>a = np.array([-0.9, 0.5, 0.9, 1, 1.2, 1.4, 3.6, 4.7, 5.3])
bins = np.array([0, 1, 2, 3])
np.digitize(a, bins)
# → array([0, 1, 1, 2, 2, 2, 4, 4, 4])</code>
18. reshape
Gives a new shape to an array without changing its data.
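Three details are worth seeing in code: a dimension given as -1 is inferred, the result usually shares memory with the original, and an incompatible shape raises an error. The sketch below assumes a small contiguous integer array; the names are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4).
B = A.reshape(3, -1)

# For a contiguous array such as A, reshape returns a view, not a copy,
# so writing through B is visible in A.
B[0, 0] = 99
shares = np.shares_memory(A, B)

# A size mismatch raises ValueError rather than silently truncating.
try:
    A.reshape(5, 5)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(B.shape, A[0, 0], shares)
```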
<code>numpy.reshape(a, newshape, order='C')</code>
Example:
<code>A = np.random.randint(15, size=(4, 3))
A.reshape(3, 4)
# → array of shape (3, 4) holding the same 12 values</code>
19. expand_dims
Expands the dimensions of an array.
<code>numpy.expand_dims(a, axis)</code>
Example:
<code>np.expand_dims(A, axis=0)
# → array of shape (1, 4, 3): a new leading dimension is added</code>
20. squeeze
Removes single‑dimensional entries from the shape of an array.
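Since squeeze undoes what expand_dims adds, the two are easiest to understand together. The sketch below is illustrative; the array and variable names are not from the original article.

```python
import numpy as np

row = np.array([1, 2, 3])                  # shape (3,)

batched = np.expand_dims(row, axis=0)      # shape (1, 3)
column = np.expand_dims(row, axis=1)       # shape (3, 1)

restored = np.squeeze(batched)             # back to shape (3,)
only_last = np.squeeze(column[np.newaxis], axis=0)  # (1, 3, 1) -> (3, 1)

print(batched.shape, column.shape, restored.shape, only_last.shape)
```

Passing axis restricts squeeze to that axis, which avoids accidentally dropping a length‑1 dimension that carries meaning.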
<code>numpy.squeeze(a, axis=None)</code>
Example:
<code>x = np.array([[[1], [2], [3]]])  # shape (1, 3, 1)
np.squeeze(x)
# → array([1, 2, 3])</code>
21. count_nonzero
Counts the number of non‑zero elements.
<code>numpy.count_nonzero(a, axis=None, *, keepdims=False)</code>
Example:
<code>a = np.array([0, 0, 1, 1, 1, 0])
np.count_nonzero(a)
# → 3</code>
22. argwhere
Finds the indices of non‑zero elements.
<code>numpy.argwhere(a)</code>
Example:
<code>np.argwhere(a)
# → array([[2], [3], [4]])</code>
23. argmax & argmin
argmax returns the index of the maximum element; argmin returns the index of the minimum element.
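When the extreme value occurs more than once, the first occurrence is reported, and for multi‑dimensional input the default result is an index into the flattened array. The sketch below illustrates both points; the 2‑D matrix M is an illustrative assumption.

```python
import numpy as np

arr = np.array([1, 1, 2, 3, 3, 4, 5, 6, 6, 2])
first_max = np.argmax(arr)   # 7: the first of the two 6s
first_min = np.argmin(arr)   # 0: the first of the two 1s

M = np.array([[1, 9, 3],
              [7, 2, 8]])
per_row = np.argmax(M, axis=1)          # column of the maximum in each row
flat = np.argmax(M)                     # index into the flattened array
row_col = np.unravel_index(flat, M.shape)  # convert back to (row, column)

print(first_max, first_min, per_row, flat, row_col)
```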
<code>numpy.argmax(a, axis=None, out=None, *, keepdims=<no value>)</code>
Example:
<code>np.argmax(arr)
# → 7  (index of the first 6)</code>
<code>numpy.argmin(a, axis=None, out=None, *, keepdims=<no value>)</code>
Example:
<code>np.argmin(arr)
# → 0  (index of the first 1)</code>
24. sort
Sorts an array.
<code>numpy.sort(a, axis=-1, kind=None, order=None)</code>
Example:
<code>np.sort(arr)
# → array([1, 1, 2, 2, 3, 3, 4, 5, 6, 6])</code>
25. absolute (abs)
Computes the absolute value element‑wise.
<code>numpy.absolute(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])</code>
Example:
<code>A = np.array([[1, -3, 4], [-2, -4, 3]])
np.abs(A)
# → array([[1, 3, 4],
#          [2, 4, 3]])</code>
26. round
Rounds floating‑point values to the specified number of decimals.
<code>numpy.around(a, decimals=0, out=None)</code>
Example (np.round is an alias of np.around):
<code>a = np.random.random((3, 4))
np.round(a, decimals=0)
# → 3x4 array in which every value is rounded to 0. or 1.</code>
27. clip
Clips (limits) the values in an array.
<code>numpy.clip(a, a_min, a_max, out=None, **kwargs)</code>
Example:
<code>arr.clip(0, 5)
# → array([1, 1, 2, 3, 3, 4, 5, 5, 5, 2])</code>
Replacing Values in Arrays
28. where
Returns elements chosen from x or y depending on condition.
<code>numpy.where(condition, [x, y])</code>
Example:
<code>a = np.arange(12).reshape(4, 3)
np.where(a > 5)
# → (array([2, 2, 2, 3, 3, 3]), array([0, 1, 2, 0, 1, 2]))  row and column indices of elements > 5
np.where(a > 5, a, -1)
# → array([[-1, -1, -1],
#          [-1, -1, -1],
#          [ 6,  7,  8],
#          [ 9, 10, 11]])</code>
29. put
Replaces specified elements of an array with given values.
<code>numpy.put(a, ind, v, mode='raise')</code>
Example:
<code>arr = np.array([1, 2, 3, 4, 5, 6])
np.put(arr, [1, 2], [6, 7])  # modifies arr in place and returns None
arr
# → array([1, 6, 7, 4, 5, 6])</code>
30. copyto
Copies values from one array to another.
<code>numpy.copyto(dst, src, casting='same_kind', where=True)</code>
Example:
<code>arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
np.copyto(arr1, arr2)
# arr1 is now array([4, 5, 6])</code>
Set Operations
31. intersect1d
Returns the sorted, unique values that are in both of the input arrays.
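<code>numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False)</code>
Example:
<code># illustrative inputs; the original example did not define them
ar1 = np.array([1, 2, 3, 4, 5])
ar2 = np.array([1, 3, 4, 5, 7])
np.intersect1d(ar1, ar2)
# → array([1, 3, 4, 5])</code>
32. setdiff1d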
Returns the sorted, unique values in ar1 that are not in ar2.
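<code>numpy.setdiff1d(ar1, ar2, assume_unique=False)</code>
Example:
<code># illustrative inputs; the original example did not define them
a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 5, 6])
np.setdiff1d(a, b)
# → array([1, 3, 4])</code>
33. setxor1d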
Returns the sorted, unique values that are in only one of the input arrays.
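<code>numpy.setxor1d(ar1, ar2, assume_unique=False)</code>
Example:
<code># illustrative inputs; the original example did not define them
a = np.array([1, 2, 3, 4])
b = np.array([1, 4, 6, 9, 36])
np.setxor1d(a, b)
# → array([2, 3, 6, 9, 36])</code>
34. union1d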
Combines two arrays and returns the unique sorted union.
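The four set functions in this section are easiest to compare when applied to one pair of inputs. The sketch below does that; the two arrays are illustrative assumptions, and the duplicate 5 in a shows that the results are deduplicated.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 5])
b = np.array([4, 5, 6, 7])

common = np.intersect1d(a, b)   # in both
only_a = np.setdiff1d(a, b)     # in a but not in b
either = np.setxor1d(a, b)      # in exactly one of the two
merged = np.union1d(a, b)       # in at least one of the two

print(common, only_a, either, merged)
```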
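<code>numpy.union1d(ar1, ar2)</code>
Example:
<code># illustrative inputs; the original example did not define them
a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 4, 36])
np.union1d(a, b)
# → array([1, 2, 3, 4, 5, 36])</code>
Array Splitting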
35. hsplit
Splits an array horizontally (column‑wise).
<code>numpy.hsplit(ary, indices_or_sections)</code>
Example:
<code>A = np.array([[3, 4, 5, 2], [6, 7, 2, 6]])
np.hsplit(A, 2)
# → [array([[3, 4], [6, 7]]), array([[5, 2], [2, 6]])]</code>
36. vsplit
Splits an array vertically (row‑wise).
<code>numpy.vsplit(ary, indices_or_sections)</code>
Example:
<code>np.vsplit(A, 2)
# → [array([[3, 4, 5, 2]]), array([[6, 7, 2, 6]])]</code>
Array Stacking
37. hstack
Stacks arrays horizontally (column‑wise).
<code>numpy.hstack(tup)</code>
Example:
<code>a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 4, 9, 16, 25])
np.hstack((a, b))
# → array([1, 2, 3, 4, 5, 1, 4, 9, 16, 25])</code>
38. vstack
Stacks arrays vertically (row‑wise).
<code>numpy.vstack(tup)</code>
Example:
<code>np.vstack((a, b))
# → array([[ 1,  2,  3,  4,  5],
#          [ 1,  4,  9, 16, 25]])</code>
Array Comparison
39. allclose
Checks if two arrays are element‑wise equal within a tolerance.
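The tolerance test is asymmetric: each pair passes when |a - b| <= atol + rtol * |b|. The sketch below reproduces that check by hand for the same arrays used in this section's example; the variable names are illustrative.

```python
import numpy as np

a = np.array([0.25, 0.4, 0.6, 0.32])
b = np.array([0.26, 0.3, 0.7, 0.32])
rtol, atol = 0.5, 1e-08

# Hand-written version of the documented tolerance formula.
manual = bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))
builtin = np.allclose(a, b, rtol=rtol, atol=atol)

# isclose applies the same test element by element.
per_element = np.isclose(a, b, rtol=0.1)

print(manual, builtin, per_element)
```

Inspecting the isclose result is a quick way to find which elements cause allclose to return False.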
<code>numpy.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)</code>
Example:
<code>a = np.array([0.25, 0.4, 0.6, 0.32])
b = np.array([0.26, 0.3, 0.7, 0.32])
np.allclose(a, b, rtol=0.1)
# → False
np.allclose(a, b, rtol=0.5)
# → True</code>
40. equal
Element‑wise comparison of two arrays.
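<code>numpy.equal(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])</code>
Example:
<code># illustrative inputs; the original example did not define six-element arrays
arr1 = np.array([1, 2, 3, 4, 5, 6])
arr2 = np.array([1, 2, 3, 0, 5, 6])
np.equal(arr1, arr2)
# → array([ True,  True,  True, False,  True,  True])</code>
Repeating Array Elements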
41. repeat
Repeats each element of an array n times.
<code>numpy.repeat(a, repeats, axis=None)</code>
Example:
<code>np.repeat('2017', 3)
# → array(['2017', '2017', '2017'], dtype='<U4')</code>
Example with a pandas DataFrame:
<code>import pandas as pd
fruits = pd.DataFrame([['Mango', 40], ['Apple', 90], ['Banana', 130]], columns=['Product', 'ContainerSales'])
fruits['year'] = np.repeat(2020, fruits.shape[0])
# fruits now has a 'year' column with the value 2020 in every row</code>
42. tile
Constructs an array by repeating A the specified number of times.
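The difference between repeat and tile is easiest to see on an array with more than one element: repeat duplicates each element in place, while tile repeats the whole array. The sketch below is illustrative; the array x is an assumption.

```python
import numpy as np

x = np.array([1, 2, 3])

repeated = np.repeat(x, 2)      # each element twice: 1 1 2 2 3 3
tiled = np.tile(x, 2)           # the whole array twice: 1 2 3 1 2 3
tiled_rows = np.tile(x, (2, 1)) # two rows, each a copy of x

print(repeated)
print(tiled)
print(tiled_rows)
```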
<code>numpy.tile(A, reps)</code>
Example:
<code>np.tile('Ram', 5)
# → array(['Ram', 'Ram', 'Ram', 'Ram', 'Ram'], dtype='<U3')
np.tile(3, (2, 3))
# → array([[3, 3, 3],
#          [3, 3, 3]])</code>
Einstein Summation
43. einsum
Evaluates the Einstein summation convention on the provided operands.
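Each subscript string corresponds to a familiar operation, which gives a convenient way to check an einsum expression. The sketch below compares several strings with their conventional NumPy equivalents, using the same 3x3 matrices as this section's example.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)

# Repeated index j is summed over: matrix multiplication.
matmul_ok = np.array_equal(np.einsum('ij,jk', a, b), a @ b)

# 'ii->i' keeps the repeated index: the diagonal.
diag_ok = np.array_equal(np.einsum('ii->i', a), np.diag(a))

# 'ii' with no output index sums the diagonal: the trace.
trace_ok = np.einsum('ii', a) == np.trace(a)

# Swapping the output indices transposes the matrix.
transpose_ok = np.array_equal(np.einsum('ij->ji', a), a.T)

print(matmul_ok, diag_ok, trace_ok, transpose_ok)
```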
<code>numpy.einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)</code>
Example:
<code>a = np.arange(1, 10).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)
np.einsum('ii->i', a)
# → array([1, 5, 9])  (the diagonal)
np.einsum('ij,jk', a, b)
# → array([[150, 156, 162],
#          [366, 381, 396],
#          [582, 606, 630]])  (matrix product of a and b)
np.einsum('ii', a)
# → 15  (the trace)</code>
Statistical Analysis
44. histogram
Computes the histogram of a dataset.
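<code>numpy.histogram(a, bins=10, range=None, density=None, weights=None)</code>
Example:
<code>A = np.array([[3, 4, 5, 2], [6, 7, 2, 6]])
np.histogram(A)
# → (array([2, 0, 1, 0, 1, 0, 1, 0, 2, 1]),
#    array([2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ]))
# i.e. (counts per bin, bin edges)</code>
45. percentile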
Computes the q‑th percentile of the data along the specified axis.
<code>numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False, *, interpolation=None)</code>
Example:
<code>a = np.array([[2, 4, 6], [4, 8, 12]])
np.percentile(a, 50)
# → 5.0
np.percentile(a, 10)
# → 3.0</code>
46. std
Computes the standard deviation along the specified axis.
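By default std and var divide by N (ddof=0), which gives the population statistic; passing ddof=1 divides by N - 1 for the sample estimate. The sketch below shows the difference on the same array used in this section's examples.

```python
import numpy as np

a = np.array([[2, 4, 6], [4, 8, 12]])

population = np.std(a, axis=1)        # divides by N = 3
sample = np.std(a, axis=1, ddof=1)    # divides by N - 1 = 2

# The variance is the square of the standard deviation for the same ddof.
variance = np.var(a, axis=1)

print(population, sample, variance)
```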
<code>numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)</code>
Example:
<code>np.std(a, axis=1)
# → array([1.63299316, 3.26598632])
np.std(a, axis=0)
# → array([1., 2., 3.])</code>
47. var
Computes the variance along the specified axis.
<code>numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)</code>
Example:
<code>np.var(a, axis=1)
# → array([ 2.66666667, 10.66666667])
np.var(a, axis=0)
# → array([1., 4., 9.])</code>
Array Printing
48. set_printoptions (precision)
Sets printing options such as precision.
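Because set_printoptions changes the settings globally, a temporary change is often more convenient. The sketch below uses the np.printoptions context manager as an alternative; it is not part of the original list of functions, and the variable names are illustrative.

```python
import numpy as np

a = np.array([12.23456, 32.34535])

# Inside the block, floats are shown with 2 digits after the decimal point.
with np.printoptions(precision=2):
    short = str(a)

# Outside the block the previous settings apply again.
full = str(a)

print(short)
print(full)
```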
<code>numpy.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, suppress=None, nanstr=None, infstr=None, formatter=None, sign=None, floatmode=None, *, legacy=None)</code>
Example:
<code>np.set_printoptions(precision=2)
a = np.array([12.23456, 32.34535])
print(a)
# → [12.23 32.35]</code>
Other options set the maximum number of printed elements, the line width, and so on:
<code>np.set_printoptions(threshold=np.inf)  # show all elements
np.set_printoptions(linewidth=100)     # allow more characters per line</code>
Saving and Loading Data
49. savetxt
Saves an array to a text file.
<code>numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)</code>
Example:
<code>arr = np.linspace(10, 100, 500).reshape(25, 20)
np.savetxt('array.txt', arr)</code>
50. loadtxt
Loads data from a text file.
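A round trip through savetxt and loadtxt shows how the two functions fit together. The sketch below writes to a temporary directory so that it leaves no files behind; the file name, the comma delimiter, and the header text are illustrative assumptions.

```python
import os
import tempfile

import numpy as np

arr = np.linspace(10, 100, 500).reshape(25, 20)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "array.csv")  # hypothetical file name
    # The header is written as a '# ' comment line, which loadtxt skips.
    np.savetxt(path, arr, delimiter=",", header="round-trip demo")
    loaded = np.loadtxt(path, delimiter=",")

print(loaded.shape, np.allclose(arr, loaded))
```

The same delimiter has to be passed to both calls; otherwise loadtxt cannot split the columns.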
<code>numpy.loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, quotechar=None, like=None)</code>
Example:
<code>np.loadtxt('array.txt')</code>
These fifty NumPy functions form a solid foundation for efficient numerical computing, data manipulation, and scientific analysis in Python.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.