Comprehensive Guide to Essential NumPy Functions for Array Creation, Manipulation, and Analysis

This tutorial walks through fifty core NumPy functions, covering array creation, reshaping, arithmetic, statistical analysis, set operations, splitting, stacking, printing, and data persistence, with a short explanation and a code example for each.

Python Programming Learning Circle

Jul 30, 2024

Why NumPy Is One of the Most Useful Tools in Python

NumPy is a fundamental library for handling large datasets efficiently in Python, offering a rich set of functions for array manipulation that are essential for data science and scientific computing.

Creating Arrays

1. array

Creates one‑dimensional or multi‑dimensional arrays.

numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)

Example:

import numpy as np
np.array([1,2,3,4,5])
# → array([1, 2, 3, 4, 5])

Convert a pandas Series or DataFrame to a NumPy array:

import pandas as pd
sex = pd.Series(['Male','Male','Female'])
np.array(sex)
# → array(['Male', 'Male', 'Female'], dtype=object)

2. linspace

Creates an array of evenly spaced floating‑point numbers.

numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)

Example:

np.linspace(10, 100, 10)
# → array([10., 20., 30., 40., 50., 60., 70., 80., 90., 100.])

3. arange

Returns evenly spaced values within the half‑open interval [start, stop), using the given step.

numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)

np.arange(5,10,2)
# → array([5, 7, 9])
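
The difference between arange and linspace is easiest to see side by side. The following is a minimal sketch (the variable names stepped and counted are illustrative, not from the article): arange takes a step and leaves out the stop value, while linspace takes a number of points and includes the stop value by default.

```python
import numpy as np

# arange: fixed step, stop value excluded.
stepped = np.arange(0, 1, 0.25)

# linspace: fixed number of points, stop value included by default.
counted = np.linspace(0, 1, 5)

print(stepped.tolist())  # [0.0, 0.25, 0.5, 0.75]
print(counted.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```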

4. uniform

Generates random samples from a uniform distribution.

numpy.random.uniform(low=0.0, high=1.0, size=None)

np.random.uniform(5,10,size=4)
# → array([6.47445571, 5.60725873, 8.82192327, 7.47674099])

5. random.randint

Generates random integers from low (inclusive) to high (exclusive); size sets how many are drawn.

numpy.random.randint(low, high=None, size=None, dtype=int)

np.random.randint(5,10,10)
# → array([6, 8, 9, 9, 7, 6, 9, 8, 5, 9])

6. random.random

Generates random floating‑point numbers in the half‑open interval [0.0, 1.0); size sets how many are drawn.

numpy.random.random(size=None)

np.random.random(3)
# → array([0.87656396, 0.24706716, 0.98950278])
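
The outputs of the three random functions above change on every run. As a small sketch of how to make them repeatable, the example below fixes the global seed with np.random.seed (a call the article does not use; the seed value 0 and the variable names are arbitrary choices) and checks the documented value ranges.

```python
import numpy as np

np.random.seed(0)                       # fix the global seed
first = np.random.randint(5, 10, 10)    # integers from 5 to 9
np.random.seed(0)                       # reset to the same seed
second = np.random.randint(5, 10, 10)   # the same draws again

print(np.array_equal(first, second))    # True

floats = np.random.uniform(5, 10, size=4)  # floats in [5, 10)
unit = np.random.random(3)                 # floats in [0, 1)
```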

7. logspace

Generates numbers spaced evenly on a log scale.

numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)

np.logspace(0,10,5,base=2)
# → array([1.00000000e+00, 5.65685425e+00, 3.20000000e+01, 1.81019336e+02, 1.02400000e+03])

8. zeros

Creates an array filled with zeros.

numpy.zeros(shape, dtype=float, order='C', *, like=None)

np.zeros((2,3), dtype='int')
# → array([[0, 0, 0],[0, 0, 0]])

9. ones

Creates an array filled with ones.

numpy.ones(shape, dtype=None, order='C', *, like=None)

np.ones((3,4))
# → array([[1.,1.,1.,1.],[1.,1.,1.,1.],[1.,1.,1.,1.]])

10. full

Creates an n‑dimensional array filled with a single value.

numpy.full(shape, fill_value, dtype=None, order='C', *, like=None)

np.full((2,4), fill_value=2)
# → array([[2,2,2,2],[2,2,2,2]])

11. identity

Creates an identity matrix of a given size.

numpy.identity(n, dtype=None, *, like=None)

np.identity(4)
# → array([[1.,0.,0.,0.],[0.,1.,0.,0.],[0.,0.,1.,0.],[0.,0.,0.,1.]])

Array Operations

12. min

Returns the minimum value in an array.

numpy.min(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)

arr = np.array([1,1,2,3,3,4,5,6,6,2])
np.min(arr)
# → 1

13. max

Returns the maximum value in an array.

numpy.max(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)

np.max(arr)
# → 6
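
Both functions also accept an axis argument, which the examples above leave at its default. A brief sketch on a small 2‑D array (the array m is made up for illustration):

```python
import numpy as np

m = np.array([[1, 5, 3],
              [4, 2, 6]])

col_min = np.min(m, axis=0)                     # minimum of each column
row_max = np.max(m, axis=1)                     # maximum of each row
row_max_2d = np.max(m, axis=1, keepdims=True)   # keep the reduced axis as size 1

print(col_min.tolist())   # [1, 2, 3]
print(row_max.tolist())   # [5, 6]
print(row_max_2d.shape)   # (2, 1)
```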

14. unique

Returns the sorted unique elements of an array.

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None, *, equal_nan=True)

np.unique(arr, return_counts=True)
# → (array([1,2,3,4,5,6]), array([2,2,2,1,1,2]))

15. mean

Computes the average of array elements.

numpy.mean(a, axis=None, dtype=None, out=None)

np.mean(arr)
# → 3.3
np.mean(arr, dtype='int')
# → 3 (the integer dtype truncates the result)

16. median

Returns the median of the array.

numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)

arr2 = np.array([[1,2,3],[5,8,4]])
np.median(arr2)
# → 3.5

17. digitize

Returns the indices of the bins to which each value in input array belongs.

numpy.digitize(x, bins, right=False)

a = np.array([-0.9,0.5,0.9,1,1.2,1.4,3.6,4.7,5.3])
bins = np.array([0,1,2,3])
np.digitize(a,bins)
# → array([0,1,1,2,2,2,4,4,4])
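
The right parameter decides which bin edge is inclusive, which matters for values that fall exactly on an edge. A minimal sketch with illustrative inputs:

```python
import numpy as np

x = np.array([0.5, 1.0, 2.5])
bins = np.array([0, 1, 2, 3])

left_closed = np.digitize(x, bins)               # bins[i-1] <= x < bins[i]
right_closed = np.digitize(x, bins, right=True)  # bins[i-1] < x <= bins[i]

print(left_closed.tolist())    # [1, 2, 3]
print(right_closed.tolist())   # [1, 1, 3]
```

Only the value 1.0, which sits on an edge, changes bin between the two calls.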

18. reshape

Gives a new shape to an array without changing its data.

numpy.reshape(a, shape, order='C'), or as a method: a.reshape(shape)

A = np.random.randint(15, size=(4,3))
A.reshape(3,4)
# → reshaped array
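
As a short sketch of two reshape details not shown above (the names flat and grid are illustrative): one dimension may be given as -1 so that NumPy infers it, and for a contiguous array such as this one the result is a view that shares data with the original.

```python
import numpy as np

flat = np.arange(12)
grid = flat.reshape(3, -1)   # -1 lets NumPy infer the second dimension

print(grid.shape)            # (3, 4)

# For a contiguous array like this one, reshape returns a view,
# so writing through the reshaped array changes the original too.
grid[0, 0] = 99
print(int(flat[0]))          # 99
```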

19. expand_dims

Expands the dimensions of an array.

numpy.expand_dims(a, axis)

np.expand_dims(A, axis=0)
# → array with a new leading dimension

20. squeeze

Removes single‑dimensional entries from the shape of an array.

numpy.squeeze(a, axis=None)

np.squeeze(np.array([[1],[2],[3]]))
# → 1‑D array
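
expand_dims and squeeze are inverses of each other when only length‑1 axes are involved. The sketch below, using an illustrative 4×3 array, prints the shapes at each step:

```python
import numpy as np

A = np.arange(12).reshape(4, 3)

batched = np.expand_dims(A, axis=0)     # add a leading axis
trailing = np.expand_dims(A, axis=-1)   # add a trailing axis
restored = np.squeeze(batched)          # drop all length-1 axes

print(batched.shape)    # (1, 4, 3)
print(trailing.shape)   # (4, 3, 1)
print(restored.shape)   # (4, 3)
```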

21. count_nonzero

Counts the number of non‑zero elements.

numpy.count_nonzero(a, axis=None, *, keepdims=False)

a = np.array([0,0,1,1,1,0])
np.count_nonzero(a)
# → 3

22. argwhere

Finds the indices of non‑zero elements.

numpy.argwhere(a)

np.argwhere(a)
# → array([[2],[3],[4]])

23. argmax & argmin

argmax returns the index of the first occurrence of the maximum element; argmin returns the index of the first occurrence of the minimum element.

numpy.argmax(a, axis=None, out=None, *, keepdims=<no value>)

# arr is still np.array([1,1,2,3,3,4,5,6,6,2]) from the min example
np.argmax(arr)
# → 7

numpy.argmin(a, axis=None, out=None, *, keepdims=<no value>)

np.argmin(arr)
# → 0
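
On a multi‑dimensional array, argmax and argmin without an axis return an index into the flattened array. The following sketch shows one way to turn that into a (row, column) position with np.unravel_index, a helper the article does not cover; the array m is made up for illustration.

```python
import numpy as np

m = np.array([[1, 9, 3],
              [7, 2, 8]])

flat_index = np.argmax(m)                          # index into the flattened array
row, col = np.unravel_index(flat_index, m.shape)   # convert to 2-D coordinates
per_row = np.argmax(m, axis=1)                     # column of the maximum in each row
per_col = np.argmin(m, axis=0)                     # row of the minimum in each column

print(int(flat_index))       # 1
print(int(row), int(col))    # 0 1
print(per_row.tolist())      # [1, 2]
print(per_col.tolist())      # [0, 1, 0]
```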

24. sort

Sorts an array.

numpy.sort(a, axis=-1, kind=None, order=None)

np.sort(arr)
# → array([1, 1, 2, 2, 3, 3, 4, 5, 6, 6])
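
np.sort returns a sorted copy and leaves its input untouched. A brief sketch, with an illustrative array x, that also shows a descending sort and np.argsort (which returns the indices that would sort the array):

```python
import numpy as np

x = np.array([3, 1, 2])

ascending = np.sort(x)          # sorted copy; x itself is unchanged
descending = np.sort(x)[::-1]   # reverse the sorted copy
order = np.argsort(x)           # indices that would sort x

print(x.tolist())            # [3, 1, 2]
print(ascending.tolist())    # [1, 2, 3]
print(descending.tolist())   # [3, 2, 1]
print(order.tolist())        # [1, 2, 0]
```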

25. absolute (abs)

Computes the absolute value element‑wise.

numpy.absolute(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

A = np.array([[1,-3,4],[-2,-4,3]])
np.abs(A)
# → array([[1,3,4],[2,4,3]])

26. round

Rounds floating‑point values to the specified number of decimals.

numpy.round(a, decimals=0, out=None)

a = np.random.random((3,4))
np.round(a, decimals=0)
# → array with values rounded to 0 decimals

27. clip

Clips (limits) the values in an array.

numpy.clip(a, a_min, a_max, out=None, **kwargs)

arr.clip(0,5)
# → array([1, 1, 2, 3, 3, 4, 5, 5, 5, 2])  (the two 6s are limited to 5)

Replacing Values in Arrays

28. where

Returns elements chosen from x or y depending on condition.

numpy.where(condition, [x, y])

a = np.arange(12).reshape(4,3)
np.where(a>5)
# → indices of elements greater than 5
np.where(a>5, a, -1)
# → array with values >5 kept, others set to -1
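
To make the two call forms concrete, here is a sketch using the same array a as above (the names rows, cols, selected, and replaced are illustrative). With only a condition, where returns one index array per dimension; with all three arguments it returns a new array of the same shape.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

rows, cols = np.where(a > 5)        # one index array per dimension
selected = a[rows, cols]            # the matching values themselves
replaced = np.where(a > 5, a, -1)   # keep values > 5, use -1 elsewhere

print(rows.tolist())       # [2, 2, 2, 3, 3, 3]
print(cols.tolist())       # [0, 1, 2, 0, 1, 2]
print(selected.tolist())   # [6, 7, 8, 9, 10, 11]
print(replaced.tolist())   # [[-1, -1, -1], [-1, -1, -1], [6, 7, 8], [9, 10, 11]]
```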

29. put

Replaces specified elements of an array with given values.

numpy.put(a, ind, v, mode='raise')

arr = np.array([1,2,3,4,5,6])
np.put(arr, [1,2], [6,7])
# put modifies arr in place and returns None; arr is now array([1, 6, 7, 4, 5, 6])

30. copyto

Copies values from one array to another.

numpy.copyto(dst, src, casting='same_kind', where=True)

arr1 = np.array([1,2,3])
arr2 = np.array([4,5,6])
np.copyto(arr1, arr2)
# arr1 becomes [4,5,6]

Set Operations

31. intersect1d

Returns the sorted, unique values that are in both of the input arrays.

numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False)

np.intersect1d([1,2,3,4,5], [1,3,4,5,6])
# → array([1, 3, 4, 5])

32. setdiff1d

Returns the sorted, unique values in ar1 that are not in ar2.

numpy.setdiff1d(ar1, ar2, assume_unique=False)

np.setdiff1d([1,2,3,4,5], [2,5,36])
# → array([1, 3, 4])

33. setxor1d

Returns the sorted, unique values that are in only one of the input arrays.

numpy.setxor1d(ar1, ar2, assume_unique=False)

np.setxor1d([1,2,3,4,5], [1,4,5,6,9,36])
# → array([ 2,  3,  6,  9, 36])

34. union1d

Combines two arrays and returns the unique sorted union.

numpy.union1d(ar1, ar2)

np.union1d([1,2,3,4,5], [2,5,36])
# → array([ 1,  2,  3,  4,  5, 36])
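
The four set functions are easier to compare when applied to a single pair of inputs. The sketch below uses two illustrative arrays, a and b, and runs all four operations on them:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 4, 9, 16, 25])

common = np.intersect1d(a, b)   # in both a and b
only_a = np.setdiff1d(a, b)     # in a but not in b
either = np.setxor1d(a, b)      # in exactly one of a or b
combined = np.union1d(a, b)     # in a or b

print(common.tolist())     # [1, 4]
print(only_a.tolist())     # [2, 3, 5]
print(either.tolist())     # [2, 3, 5, 9, 16, 25]
print(combined.tolist())   # [1, 2, 3, 4, 5, 9, 16, 25]
```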

Array Splitting

35. hsplit

Splits an array horizontally (column‑wise).

numpy.hsplit(ary, indices_or_sections)

A = np.array([[3,4,5,2],[6,7,2,6]])
np.hsplit(A, 2)
# → [array([[3,4],[6,7]]), array([[5,2],[2,6]])]

36. vsplit

Splits an array vertically (row‑wise).

numpy.vsplit(ary, indices_or_sections)

np.vsplit(A, 2)
# → [array([[3,4,5,2]]), array([[6,7,2,6]])]

Array Stacking

37. hstack

Stacks arrays horizontally (column‑wise).

numpy.hstack(tup)

a = np.array([1,2,3,4,5])
b = np.array([1,4,9,16,25])
np.hstack((a,b))
# → array([1,2,3,4,5,1,4,9,16,25])

38. vstack

Stacks arrays vertically (row‑wise).

numpy.vstack(tup)

np.vstack((a,b))
# → array([[1,2,3,4,5],[1,4,9,16,25]])
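
Splitting and stacking undo each other. As a quick sketch, the example below splits the 2×4 array A from the splitting section and then reassembles it (the names left, right, top, and bottom are illustrative):

```python
import numpy as np

A = np.array([[3, 4, 5, 2],
              [6, 7, 2, 6]])

left, right = np.hsplit(A, 2)   # two 2x2 column blocks
top, bottom = np.vsplit(A, 2)   # two 1x4 row blocks

print(np.array_equal(np.hstack((left, right)), A))   # True
print(np.array_equal(np.vstack((top, bottom)), A))   # True
```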

Array Comparison

39. allclose

Checks if two arrays are element‑wise equal within a tolerance.

numpy.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

a = np.array([0.25,0.4,0.6,0.32])
b = np.array([0.26,0.3,0.7,0.32])
np.allclose(a,b,0.1)
# → False
np.allclose(a,b,0.5)
# → True
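
In the calls above, the third positional argument is rtol. allclose passes when every element satisfies |a - b| <= atol + rtol * |b|. The sketch below evaluates that condition by hand for rtol=0.1, using the same arrays (the name within is illustrative):

```python
import numpy as np

a = np.array([0.25, 0.4, 0.6, 0.32])
b = np.array([0.26, 0.3, 0.7, 0.32])

rtol, atol = 0.1, 1e-08
# Element-wise condition applied by allclose: |a - b| <= atol + rtol * |b|
within = np.abs(a - b) <= atol + rtol * np.abs(b)

print(within.tolist())                # [True, False, False, True]
print(bool(within.all()))             # False
print(np.allclose(a, b, rtol=rtol))   # False
```

The second and third elements differ by 0.1, which exceeds the allowed 0.03 and 0.07, so the overall result is False.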

40. equal

Element‑wise comparison of two arrays.

numpy.equal(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

np.equal(arr1, arr2)
# → array([ True,  True,  True])  (arr1 and arr2 hold the same values after the copyto call above)

Repeating Array Elements

41. repeat

Repeats each element of an array n times.

numpy.repeat(a, repeats, axis=None)

np.repeat('2017',3)
# → array(['2017','2017','2017'])

Example with a pandas DataFrame:

fruits = pd.DataFrame([['Mango',40],['Apple',90],['Banana',130]], columns=['Product','ContainerSales'])
fruits['year'] = np.repeat(2020, fruits.shape[0])
# DataFrame now has a 'year' column with value 2020 for all rows

42. tile

Constructs an array by repeating A the specified number of times.

numpy.tile(A, reps)

np.tile('Ram',5)
# → array(['Ram','Ram','Ram','Ram','Ram'])
np.tile(3,(2,3))
# → array([[3,3,3],[3,3,3]])
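
With a scalar input, repeat and tile give the same result, so the difference only shows on an array. A minimal sketch with an illustrative array x: repeat duplicates each element in place, while tile repeats the whole array.

```python
import numpy as np

x = np.array([1, 2, 3])

repeated = np.repeat(x, 2)   # each element twice, in order
tiled = np.tile(x, 2)        # the whole array twice

print(repeated.tolist())   # [1, 1, 2, 2, 3, 3]
print(tiled.tolist())      # [1, 2, 3, 1, 2, 3]
```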

Einstein Summation

43. einsum

Evaluates the Einstein summation convention on the provided operands.

numpy.einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)

a = np.arange(1,10).reshape(3,3)
b = np.arange(21,30).reshape(3,3)
np.einsum('ii->i', a)
# → array([1,5,9])
np.einsum('ij,jk', a, b)
# → matrix multiplication result
np.einsum('ii', a)
# → 15
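
Each subscript string above corresponds to a familiar operation. As a sanity‑check sketch, the example below compares the einsum results with the @ operator, np.trace, and the transpose on the same a and b:

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)

product = np.einsum('ij,jk', a, b)   # sum over the shared index j

print(product[0].tolist())                            # [150, 156, 162]
print(np.array_equal(product, a @ b))                 # True
print(int(np.einsum('ii', a)) == int(np.trace(a)))    # True
print(np.array_equal(np.einsum('ij->ji', a), a.T))    # True
```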

Statistical Analysis

44. histogram

Computes the histogram of a dataset.

numpy.histogram(a, bins=10, range=None, density=None, weights=None)

A = np.array([[3,4,5,2],[6,7,2,6]])
np.histogram(A)
# → (counts array, bin edges array)
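
The default of ten bins is hard to read on eight values. The sketch below passes explicit bin edges instead (the edges 2, 4, 6, 8 are chosen for illustration); every bin is half‑open except the last, which includes its right edge.

```python
import numpy as np

A = np.array([[3, 4, 5, 2], [6, 7, 2, 6]])

# Bins: [2, 4), [4, 6), [6, 8]
counts, edges = np.histogram(A, bins=[2, 4, 6, 8])

print(counts.tolist())   # [3, 2, 3]
print(edges.tolist())    # [2, 4, 6, 8]
```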

45. percentile

Computes the q‑th percentile of the data along the specified axis.

numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False, *, interpolation=None)

a = np.array([[2,4,6],[4,8,12]])
np.percentile(a, 50)
# → 5.0
np.percentile(a, 10)
# → 3.0
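
The value 3.0 for the 10th percentile comes from linear interpolation between the two smallest values. The following sketch reproduces it by hand under the default method='linear' (the variable names are illustrative):

```python
import numpy as np

a = np.array([[2, 4, 6], [4, 8, 12]])
values = np.sort(a, axis=None)            # flattened and sorted: [2, 4, 4, 6, 8, 12]

q = 10
position = (values.size - 1) * q / 100    # fractional index: 0.5
lower = int(np.floor(position))
upper = int(np.ceil(position))
# Interpolate between the two neighbouring sorted values.
manual = values[lower] + (position - lower) * (values[upper] - values[lower])

print(float(manual))                  # 3.0
print(float(np.percentile(a, q)))     # 3.0
```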

46. std

Computes the standard deviation along the specified axis.

numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)

np.std(a, axis=1)
# → array([1.63299316, 3.26598632])
np.std(a, axis=0)
# → array([1.,2.,3.])

47. var

Computes the variance along the specified axis.

numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)

np.var(a, axis=1)
# → array([2.66666667,10.66666667])
np.var(a, axis=0)
# → array([1.,4.,9.])
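
Both std and var default to ddof=0, the population form that divides by N. As a short sketch on the same array, passing ddof=1 gives the sample form that divides by N - 1, and squaring the standard deviation recovers the variance:

```python
import numpy as np

a = np.array([[2, 4, 6], [4, 8, 12]])

population_var = np.var(a, axis=1)       # divides by N
sample_var = np.var(a, axis=1, ddof=1)   # divides by N - 1
std = np.std(a, axis=1)

print(sample_var.tolist())                    # [4.0, 16.0]
print(np.allclose(std ** 2, population_var))  # True
```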

Array Printing

48. set_printoptions (precision)

Sets printing options such as precision.

numpy.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, suppress=None, nanstr=None, infstr=None, formatter=None, sign=None, floatmode=None, *, legacy=None)

np.set_printoptions(precision=2)
a = np.array([12.23456, 32.34535])
print(a)
# → [12.23 32.35]

Other options: set maximum printed elements, line width, etc.

np.set_printoptions(threshold=np.inf)  # show all elements
np.set_printoptions(linewidth=100)   # allow up to 100 characters per line
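
set_printoptions changes the global state for the rest of the session. Where a temporary change is enough, one option is the np.printoptions context manager, which the article does not mention; the sketch below restores the previous settings on exit.

```python
import numpy as np

a = np.array([12.23456, 32.34535])

# The precision applies only inside the with block.
with np.printoptions(precision=2):
    short = str(a)

full = str(a)   # back to the settings in effect before the block

print(short)   # [12.23 32.35]
print(full)
```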

Saving and Loading Data

49. savetxt

Saves an array to a text file.

numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)

arr = np.linspace(10,100,500).reshape(25,20)
np.savetxt('array.txt', arr)

50. loadtxt

Loads data from a text file.

numpy.loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, quotechar=None, like=None)

np.loadtxt('array.txt')
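
Putting the two calls together, the sketch below writes the same array to a temporary directory and reads it back. The use of tempfile and the names tmp_dir, path, and loaded are choices made for this example, not part of the article.

```python
import os
import tempfile

import numpy as np

arr = np.linspace(10, 100, 500).reshape(25, 20)

# Write to a throwaway directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'array.txt')
    np.savetxt(path, arr)       # one row per line, space-delimited
    loaded = np.loadtxt(path)   # parsed back as float64

print(loaded.shape)               # (25, 20)
print(np.allclose(arr, loaded))   # True
```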

These fifty NumPy functions form a solid foundation for efficient numerical computing, data manipulation, and scientific analysis in Python.
