
A Comprehensive Overview of Popular Python Libraries for Artificial Intelligence and Data Science

This article introduces and demonstrates more than twenty widely used Python libraries for artificial intelligence, computer vision, natural language processing, and data analysis, providing concise explanations and runnable code snippets that illustrate each library's core functionality and typical use cases.

Python Programming Learning Circle

The article begins with a brief motivation for learning common Python AI libraries and then presents a series of concise introductions, performance comparisons, and example code for each library.

1. NumPy – Shows how NumPy’s C‑based array operations dramatically outperform pure Python loops when computing sine values, with a timing comparison.

<code>import numpy as np
import math
import time

# Pure Python
start = time.time()
for i in range(10):
    list_1 = list(range(1,10000))
    for j in range(len(list_1)):
        list_1[j] = math.sin(list_1[j])
print("Pure Python time: {}s".format(time.time()-start))

# NumPy
start = time.time()
for i in range(10):
    list_1 = np.array(np.arange(1,10000))
    list_1 = np.sin(list_1)
print("NumPy time: {}s".format(time.time()-start))
</code>

2. OpenCV – Demonstrates basic image smoothing (averaging, Gaussian blur, and bilateral filtering) and visualizes the results with Matplotlib.

<code>import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('h89817032p0.png')
kernel = np.ones((5,5),np.float32)/25
dst = cv.filter2D(img,-1,kernel)
blur_1 = cv.GaussianBlur(img,(5,5),0)
blur_2 = cv.bilateralFilter(img,9,75,75)
plt.figure(figsize=(10,10))
plt.subplot(221); plt.imshow(img[:,:,::-1]); plt.title('Original')
plt.subplot(222); plt.imshow(dst[:,:,::-1]); plt.title('Averaging')
plt.subplot(223); plt.imshow(blur_1[:,:,::-1]); plt.title('Gaussian')
plt.subplot(224); plt.imshow(blur_2[:,:,::-1]); plt.title('Bilateral')
plt.show()
</code>

3. scikit‑image – Uses rescale, resize and downscale_local_mean to manipulate an image and displays the transformations.

<code>from skimage import data, color, io
from skimage.transform import rescale, resize, downscale_local_mean
import matplotlib.pyplot as plt

image = color.rgb2gray(io.imread('h89817032p0.png'))
image_rescaled = rescale(image, 0.25, anti_aliasing=False)
image_resized = resize(image, (image.shape[0]//4, image.shape[1]//4), anti_aliasing=True)
image_downscaled = downscale_local_mean(image, (4,3))
plt.figure(figsize=(20,20))
plt.subplot(221); plt.imshow(image, cmap='gray'); plt.title('Original')
plt.subplot(222); plt.imshow(image_rescaled, cmap='gray'); plt.title('Rescaled')
plt.subplot(223); plt.imshow(image_resized, cmap='gray'); plt.title('Resized')
plt.subplot(224); plt.imshow(image_downscaled, cmap='gray'); plt.title('Downscaled')
plt.show()
</code>

4‑5. PIL / Pillow – Shows how to generate a simple captcha image with random characters, colors and blur.

<code>from PIL import Image, ImageDraw, ImageFont, ImageFilter
import random

def rndChar():
    return chr(random.randint(65,90))

def rndColor():
    return (random.randint(64,255), random.randint(64,255), random.randint(64,255))

def rndColor2():
    return (random.randint(32,127), random.randint(32,127), random.randint(32,127))

width, height = 60*6, 60*6
image = Image.new('RGB', (width, height), (255,255,255))
# The font path is system-specific; point this at any TrueType font available locally
font = ImageFont.truetype('/usr/share/fonts/wps-office/simhei.ttf', 60)
draw = ImageDraw.Draw(image)
for x in range(width):
    for y in range(height):
        draw.point((x,y), fill=rndColor())
for t in range(6):
    draw.text((60*t+10,150), rndChar(), font=font, fill=rndColor2())
image = image.filter(ImageFilter.BLUR)
image.save('code.jpg','jpeg')
</code>

6. SimpleCV – Briefly mentions that SimpleCV wraps OpenCV for easier use but has poor Python 3 support, showing a typical usage snippet and the resulting SyntaxError.

<code>from SimpleCV import Image, Color
img = Image('http://i.imgur.com/lfAeZ4n.png')
feats = img.findKeypoints()
feats.draw(color=Color.RED)
img.show()
</code>

7. Mahotas – Loads an image, computes a simple statistic and displays it.

<code>import numpy as np, mahotas, matplotlib.pyplot as plt
f = mahotas.demos.load('lena', as_grey=True)
f = f[128:,128:]
print('Zero fraction:', np.mean(f==0))
plt.imshow(f, cmap='gray')
plt.show()
</code>

8. Ilastik – Described as an interactive machine‑learning tool for bio‑image analysis (no code needed).

9. Scikit‑learn – Implements KMeans and MiniBatchKMeans clustering on synthetic blobs and visualizes the clusters.

<code>import time, numpy as np, matplotlib.pyplot as plt
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.datasets import make_blobs

centers = [[1,1],[-1,-1],[1,-1]]
X, _ = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7)

k_means = KMeans(n_clusters=3, n_init=10)
k_means.fit(X)
mbk = MiniBatchKMeans(n_clusters=3, batch_size=45, n_init=10)
mbk.fit(X)
# Plotting omitted for brevity
</code>
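To complete the snippet above, one way to sanity-check both clusterings is to compare the recovered centers against the true blob centers; this is a minimal sketch, with random_state added for reproducibility (not in the original):

```python
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, _ = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7, random_state=0)

# Fit both the full-batch and mini-batch variants
k_means = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
mbk = MiniBatchKMeans(n_clusters=3, batch_size=45, n_init=10, random_state=0).fit(X)

# Both sets of recovered centers should land near the true blob centers
print(np.round(k_means.cluster_centers_, 2))
print(np.round(mbk.cluster_centers_, 2))
```

MiniBatchKMeans trades a small amount of accuracy for much lower training time on large datasets, which is the point of the article's comparison.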

10. SciPy – Shows how to use special functions (e.g., Bessel, elliptic) to plot a 3‑D drumhead surface.

<code>from scipy import special
import numpy as np, matplotlib.pyplot as plt

def drumhead_height(n,k,d,a,t):
    kth_zero = special.jn_zeros(n,k)[-1]
    return np.cos(t)*np.cos(n*a)*special.jv(n, d*kth_zero)  # jv: Bessel function of the first kind (jn is a deprecated alias)

# Meshgrid creation and 3‑D plot omitted for brevity
</code>
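The omitted meshgrid step can be sketched as follows; the grid resolution is illustrative, and special.jv is used in place of the deprecated jn alias. The resulting x, y, z arrays feed directly into Matplotlib's plot_surface:

```python
from scipy import special
import numpy as np

def drumhead_height(n, k, d, a, t):
    # jv is the Bessel function of the first kind
    kth_zero = special.jn_zeros(n, k)[-1]
    return np.cos(t) * np.cos(n * a) * special.jv(n, d * kth_zero)

# Polar grid over the unit disk
theta = np.r_[0:2*np.pi:50j]
radius = np.r_[0:1:50j]
x = np.array([r * np.cos(theta) for r in radius])
y = np.array([r * np.sin(theta) for r in radius])
z = np.array([drumhead_height(1, 1, r, theta, 0.5) for r in radius])
# x, y, z can now be passed to mpl_toolkits.mplot3d's Axes3D.plot_surface
```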

11. NLTK – Tokenizes a sentence, tags parts of speech, extracts named entities and displays a parse tree.

<code>import nltk
sentence = "At eight o'clock on Thursday morning Arthur didn't feel very good."
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
entities = nltk.chunk.ne_chunk(tagged)
</code>

12. spaCy – Loads an English model and extracts named entities from a list of example sentences.

<code>import spacy
nlp = spacy.load('en_core_web_sm')
texts = ["Net income was $9.4 million...", "Revenue exceeded twelve billion dollars..."]
for doc in nlp.pipe(texts, disable=["tok2vec","tagger","parser","attribute_ruler","lemmatizer"]):
    print([(ent.text, ent.label_) for ent in doc.ents])
</code>

13. LibROSA – Demonstrates beat tracking on an example audio file.

<code>import librosa
filename = librosa.example('nutcracker')
y, sr = librosa.load(filename)
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
print('Estimated tempo: {:.2f} BPM'.format(tempo))
</code>

14. Pandas – Generates a random time series, computes a cumulative sum, creates a DataFrame and plots it.

<code>import pandas as pd, numpy as np, matplotlib.pyplot as plt
ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000)).cumsum()
df = pd.DataFrame(np.random.randn(1000,4), index=ts.index, columns=list('ABCD')).cumsum()
df.plot()
plt.show()
</code>

15‑16. Matplotlib & Seaborn – Shows how to plot multiple curves with Matplotlib and how to create a pair‑plot of the penguins dataset with Seaborn.

<code># Matplotlib example
import numpy as np, matplotlib.pyplot as plt
x = np.linspace(0.1, 2*np.pi, 100)
plt.plot(x, x)
plt.plot(x, np.square(x))
plt.plot(x, np.log(x))
plt.plot(x, np.sin(x))
plt.show()

# Seaborn example
import seaborn as sns
sns.set_theme(style='ticks')
df = sns.load_dataset('penguins')
sns.pairplot(df, hue='species')
plt.show()
</code>

17. Orange – Briefly mentions installation via pip install orange3 and launching the GUI with orange-canvas.

18. PyBrain – Shows how to build a simple feed‑forward network, add layers and connections, and sort modules.

<code>from pybrain.structure import FeedForwardNetwork, LinearLayer, SigmoidLayer, FullConnection
net = FeedForwardNetwork()
inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)
net.addInputModule(inLayer)
net.addModule(hiddenLayer)
net.addOutputModule(outLayer)
net.addConnection(FullConnection(inLayer, hiddenLayer))
net.addConnection(FullConnection(hiddenLayer, outLayer))
net.sortModules()
</code>

19. Milk – Demonstrates training a binary classifier on synthetic data.

<code>import numpy as np, milk
features = np.random.rand(100,10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1
learner = milk.defaultclassifier()
model = learner.train(features, labels)
print(model.apply(np.random.rand(10)))
</code>

20. TensorFlow – Builds and trains a simple CNN on CIFAR‑10 using the high‑level Keras‑style API.

<code>import tensorflow as tf
from tensorflow.keras import datasets, layers, models
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images/255.0, test_images/255.0
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10, validation_data=(test_images, test_labels))
</code>

21. PyTorch – Defines a simple fully‑connected network, loss function and optimizer, and outlines a training loop.

<code>import torch, torch.nn as nn
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.stack = nn.Sequential(
            nn.Linear(28*28, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, 10)  # raw logits; CrossEntropyLoss applies log-softmax internally
        )
    def forward(self, x):
        x = self.flatten(x)
        return self.stack(x)
model = NeuralNetwork().to('cuda' if torch.cuda.is_available() else 'cpu')
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
# Training loop omitted for brevity
</code>
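The omitted training loop can be sketched as follows, using a hypothetical synthetic batch in place of real data (the tensor sizes assume MNIST-style 28x28 inputs with 10 classes, and the small Sequential model here is illustrative):

```python
import torch
import torch.nn as nn

# Synthetic batch standing in for real data
X = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))

model = nn.Sequential(nn.Flatten(), nn.Linear(28*28, 512), nn.ReLU(), nn.Linear(512, 10))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for epoch in range(3):
    optimizer.zero_grad()           # clear gradients from the previous step
    loss = loss_fn(model(X), y)     # forward pass + loss
    loss.backward()                 # backpropagation
    optimizer.step()                # parameter update
    print(f"epoch {epoch}: loss = {loss.item():.4f}")
```

In practice the batch would come from a DataLoader and the loop would iterate over it each epoch; the four steps per batch (zero gradients, forward, backward, step) are the same.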

22. Theano – Computes the Jacobian of a vector‑valued function using theano.scan .

<code>import theano, theano.tensor as T
x = T.dvector('x')
y = x**2
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x), sequences=T.arange(y.shape[0]), non_sequences=[y, x])
f = theano.function([x], J, updates=updates)
print(f([4,4]))
</code>

23. Keras – Builds a small dense network, compiles it and starts training.

<code>from keras.models import Sequential
from keras.layers import Dense
model = Sequential([Dense(64, activation='relu', input_dim=100), Dense(10, activation='softmax')])
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)
</code>

24‑27. Caffe, MXNet, PaddlePaddle, CNTK – Briefly mentions each framework and provides minimal example code (e.g., an MXNet digit recognizer, a PaddlePaddle LeNet definition, a CNTK NDL network description) to illustrate its usage.

Overall, the article serves as a quick‑reference guide for Python developers who want to explore the ecosystem of AI‑related libraries, offering short explanations and ready‑to‑run code snippets for each tool.

Artificial Intelligence · Python · TensorFlow · data science · PyTorch · NumPy
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
