Introduction to Common Machine Learning Algorithms with Python Implementations
This article introduces the three main categories of machine learning—supervised, unsupervised, and reinforcement learning—detailing common algorithms such as Linear Regression, Logistic Regression, Naive Bayes, K‑Nearest Neighbors, Decision Trees, Random Forests, SVM, K‑Means, and PCA, and provides concise Python code examples using scikit‑learn for each.
Machine learning is typically divided into three major categories: supervised learning, unsupervised learning, and reinforcement learning.
Supervised Learning
Supervised algorithms use labeled training data to learn a mapping from inputs to outputs. Common supervised methods include regression (e.g., Linear Regression) and classification (e.g., Logistic Regression, Naive Bayes, K‑Nearest Neighbors, Decision Tree, Random Forest, Support Vector Machine).
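All of the classifiers named above share scikit‑learn's `fit` / `predict` / `score` interface, so they can be compared in one loop. The following is a minimal sketch and not part of the original article: the dataset parameters, the `random_state` values, and the `classifiers` / `scores` names are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic labeled data (parameters chosen for illustration only)
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}

# Every estimator is trained and scored through the same three calls
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)  # mean accuracy on the test set
    print(f"{name}: {scores[name]:.3f}")
```

The accuracies depend on the synthetic data and the split, so they should be read as a smoke test rather than a ranking of the algorithms.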
Linear Regression (one‑dimensional example)
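Linear regression fits the line that minimizes the squared error between predictions and targets. As a rough cross-check on what `LinearRegression` computes, this sketch (not from the original article; the sample size, `random_state`, and variable names are assumptions) solves the same least-squares problem with `np.linalg.lstsq` and compares the result with scikit‑learn's fitted parameters.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# One-feature synthetic regression data, similar in spirit to the snippet below
X, y = make_regression(n_samples=200, n_features=1, noise=2, random_state=0)

model = LinearRegression().fit(X, y)

# Ordinary least squares on the design matrix [x, 1]
A = np.hstack([X, np.ones((X.shape[0], 1))])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print("sklearn:", model.coef_[0], model.intercept_)
print("lstsq:  ", slope, intercept)
```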
<span>import matplotlib.pyplot as plt</span>
<span>import numpy as np</span>
<span>from sklearn import datasets</span>
<span>from sklearn.model_selection import train_test_split</span>
<span>from sklearn import linear_model</span>
<span>from sklearn.metrics import mean_squared_error</span>
<span># 1. Generate synthetic data</span>
<span>lr_X_data, lr_y_data = datasets.make_regression(n_samples=500, n_features=1, noise=2)</span>
<span># 2. Split into train/test</span>
<span>lr_X_train, lr_X_test, lr_y_train, lr_y_test = train_test_split(lr_X_data, lr_y_data, test_size=0.3)</span>
<span># 3. Train model</span>
<span>lr_model = linear_model.LinearRegression()</span>
<span>lr_model.fit(lr_X_train, lr_y_train)</span>
<span># 4. Predict</span>
<span>lr_y_pred = lr_model.predict(lr_X_test)</span>
<span># 5. Evaluate</span>
<span>lr_mse = mean_squared_error(lr_y_test, lr_y_pred)</span>
<span>print("mse:", lr_mse)</span>
<span># 6. Visualize</span>
<span>plt.figure('Linear Regression')</span>
<span>plt.title('Linear Regression')</span>
<span>plt.scatter(lr_X_test, lr_y_test, color='lavender')</span>
<span>plt.plot(lr_X_test, lr_y_pred, color='pink', linewidth=3)</span>
<span>plt.show()</span>
Logistic Regression (binary classification)
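For a binary problem, logistic regression passes a linear score through the sigmoid function to obtain the probability of the positive class. The sketch here is an illustration, not the article's code: it assumes one‑feature synthetic data like the snippet that follows, and the `sigmoid` helper and variable names are introduced only for this example. It checks that `predict_proba` agrees with a hand-computed sigmoid of `coef_ * x + intercept_`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One feature; the label is 1 when the feature is positive
rng = np.random.default_rng(123)
x = rng.normal(size=500)
y = (x > 0).astype(int)
X = x[:, np.newaxis]

model = LogisticRegression().fit(X, y)

def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Linear score followed by the sigmoid, computed by hand
z = X @ model.coef_.ravel() + model.intercept_[0]
manual_proba = sigmoid(z)

# Probability of class 1 as reported by scikit-learn
sklearn_proba = model.predict_proba(X)[:, 1]
print(np.allclose(manual_proba, sklearn_proba))
```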
<span>import matplotlib.pyplot as plt</span>
<span>import numpy as np</span>
<span>from sklearn.model_selection import train_test_split</span>
<span>from sklearn import linear_model</span>
<span># 1. Prepare synthetic data</span>
<span>np.random.seed(123)</span>
<span>logit_X = np.random.normal(size=1000)</span>
<span>logit_y = (logit_X > 0).astype(float)</span>
<span>logit_X = logit_X[:, np.newaxis]</span>
<span># 2. Split</span>
<span>logit_X_train, logit_X_test, logit_y_train, logit_y_test = train_test_split(logit_X, logit_y, test_size=0.3)</span>
<span># 3. Train</span>
<span>logit_model = linear_model.LogisticRegression(C=1e4)</span>
<span>logit_model.fit(logit_X_train, logit_y_train)</span>
<span># 4. Predict</span>
<span>logit_y_pred = logit_model.predict(logit_X_test)</span>
<span># 5. Accuracy</span>
<span>accuracy = logit_model.score(logit_X_test, logit_y_test)</span>
<span>print("accuracy:", accuracy)</span>
<span># 6. Visualize the test points and their true labels</span>
<span>plt.figure('Logistic Regression')</span>
<span>plt.title('Logistic Regression')</span>
<span>plt.scatter(logit_X_test.ravel(), logit_y_test, color='lavender')</span>
<span>plt.show()</span>
Naive Bayes
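Gaussian Naive Bayes treats each feature within a class as an independent normal distribution and combines those likelihoods with the class priors through Bayes' rule. As a small illustration (an assumption-laden sketch, not the article's code; the `random_state` and variable names are arbitrary), the fitted per-class means and priors can be inspected directly:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, n_classes=4,
                           n_clusters_per_class=1, random_state=1)

model = GaussianNB().fit(X, y)

class_means = model.theta_       # per-class feature means, shape (4, 2)
priors = model.class_prior_      # fraction of training samples in each class
proba = model.predict_proba(X[:5])  # posterior over the 4 classes, one row per sample

print(class_means)
print(priors)
print(proba.sum(axis=1))  # each row of posteriors sums to 1
```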
<span>import matplotlib.pyplot as plt</span>
<span>import numpy as np</span>
<span>from sklearn.datasets import make_classification</span>
<span>from sklearn.model_selection import train_test_split</span>
<span>import sklearn.naive_bayes as nb</span>
<span># 1. Generate data</span>
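<span># n_clusters_per_class=1 is required: with the default of 2, four classes would need 8 clusters, more than 2 informative features allow</span>
<span>X, y = make_classification(n_features=2, n_redundant=0, n_informative=2, random_state=1, n_classes=4, n_clusters_per_class=1)</span>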
<span># 2. Split</span>
<span>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)</span>
<span># 3. Train</span>
<span>model = nb.GaussianNB()</span>
<span>model.fit(X_train, y_train)</span>
<span># 4. Predict</span>
<span>y_pred = model.predict(X_test)</span>
<span># 5. Visualize decision regions</span>
<span>plt.figure('Naive Bayes')</span>
<span>plt.title('Naive Bayes')</span>
<span># (visualization code omitted for brevity)</span>
<span>plt.show()</span>
K‑Nearest Neighbors
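The plotting step is omitted in the Naive Bayes snippet above and in the K‑Nearest Neighbors snippet that follows. One possible way to draw decision regions for any fitted two‑feature classifier is sketched here with KNN. This is not the original visualization code: the helper name `plot_decision_regions`, the grid resolution, and the styling are all assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

def plot_decision_regions(clf, X, y, title, steps=200):
    """Hypothetical helper: shade the class predicted at each point of a 2-D grid."""
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, steps),
                         np.linspace(y_min, y_max, steps))
    # Predict a label for every grid point, then restore the grid shape
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    fig, ax = plt.subplots()
    ax.contourf(xx, yy, zz, alpha=0.3)                       # predicted regions
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k", s=20)  # actual samples
    ax.set_title(title)
    return fig, zz

X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           n_classes=4, n_clusters_per_class=1, random_state=1)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
fig, grid_labels = plot_decision_regions(knn, X, y, "K-Nearest Neighbors")
```

Calling `plt.show()` afterwards displays the figure. The same helper works with the other two‑feature classifiers in this article, since it relies only on `predict`.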
<span>import matplotlib.pyplot as plt</span>
<span>import numpy as np</span>
<span>from sklearn.datasets import make_classification</span>
<span>from sklearn.model_selection import train_test_split</span>
<span>from sklearn.neighbors import KNeighborsClassifier</span>
<span># 1. Data</span>
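<span># n_clusters_per_class=1 keeps n_classes * n_clusters_per_class within 2**n_informative</span>
<span>X, y = make_classification(n_features=2, n_redundant=0, n_informative=2, random_state=1, n_classes=4, n_clusters_per_class=1)</span>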
<span># 2. Split</span>
<span>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)</span>
<span># 3. Train</span>
<span>knn = KNeighborsClassifier(n_neighbors=5)</span>
<span>knn.fit(X_train, y_train)</span>
<span># 4. Predict</span>
<span>y_pred = knn.predict(X_test)</span>
<span># 5. Visualize (omitted)</span>
<span>plt.show()</span>
Decision Tree
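A decision tree classifies a sample by walking through a hierarchy of threshold tests on its features. To make that concrete, this sketch (an illustration, not the article's code; the shallow `max_depth=2` and the feature names `x0` / `x1` are assumptions chosen to keep the output short) prints the learned rules with `export_text`.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           n_classes=4, n_clusters_per_class=1, random_state=1)

# A shallow tree so the printed rules stay readable
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Text rendering of the split thresholds and the class predicted at each leaf
rules = export_text(tree, feature_names=["x0", "x1"])
print(rules)
```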
<span>from sklearn.tree import DecisionTreeClassifier</span>
<span>from sklearn.datasets import make_classification</span>
<span>from sklearn.model_selection import train_test_split</span>
<span># Data</span>
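<span># n_clusters_per_class=1 keeps n_classes * n_clusters_per_class within 2**n_informative</span>
<span>X, y = make_classification(n_features=2, n_redundant=0, n_informative=2, random_state=1, n_classes=4, n_clusters_per_class=1)</span>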
<span># Split</span>
<span>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)</span>
<span># Train</span>
<span>tree = DecisionTreeClassifier(max_depth=4)</span>
<span>tree.fit(X_train, y_train)</span>
<span># Predict</span>
<span>y_pred = tree.predict(X_test)</span>
Random Forest
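A random forest trains many decision trees on bootstrap samples of the data and combines their predictions. The sketch here is illustrative only: the sample size, the 5‑fold cross‑validation, and the `random_state` values are assumptions. It compares a single tree with a forest and confirms that the forest holds one fitted tree per estimator.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=2, n_redundant=0,
                           n_informative=2, n_classes=4,
                           n_clusters_per_class=1, random_state=1)

# 5-fold cross-validated accuracy for a single tree and for a forest
tree_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest_scores = cross_val_score(forest, X, y, cv=5)

print("single tree:", tree_scores.mean())
print("forest:     ", forest_scores.mean())

# After fitting, the individual trees are available in estimators_
forest.fit(X, y)
print("trees in forest:", len(forest.estimators_))
print("feature importances:", forest.feature_importances_)
```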
<span>from sklearn.ensemble import RandomForestClassifier</span>
<span>from sklearn.datasets import make_classification</span>
<span>from sklearn.model_selection import train_test_split</span>
<span># Data</span>
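<span># n_clusters_per_class=1 keeps n_classes * n_clusters_per_class within 2**n_informative</span>
<span>X, y = make_classification(n_features=2, n_redundant=0, n_informative=2, random_state=1, n_classes=4, n_clusters_per_class=1)</span>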
<span># Split</span>
<span>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)</span>
<span># Train</span>
<span>rf = RandomForestClassifier(n_estimators=100, max_depth=4)</span>
<span>rf.fit(X_train, y_train)</span>
<span># Predict</span>
<span>y_pred = rf.predict(X_test)</span>
Support Vector Machine (SVM)
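In `SVC`, the parameter `C` sets the penalty for margin violations, so smaller values mean stronger regularization. A very small value such as the `C=0.0001` used in the snippet that follows may therefore underfit. The sketch here (not from the article; the candidate values and the fixed `random_state` are assumptions) fits the same RBF kernel with several values of `C` so the effect can be checked on a given dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           n_classes=4, n_clusters_per_class=1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Same kernel and gamma each time; only the regularization strength changes
accuracy_by_C = {}
for C in (0.0001, 1.0, 100.0):
    model = SVC(kernel="rbf", gamma=1, C=C).fit(X_train, y_train)
    accuracy_by_C[C] = model.score(X_test, y_test)
    print(f"C={C}: test accuracy {accuracy_by_C[C]:.3f}")
```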
<span>from sklearn import svm</span>
<span>from sklearn.datasets import make_classification</span>
<span>from sklearn.model_selection import train_test_split</span>
<span># Data</span>
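<span># n_clusters_per_class=1 keeps n_classes * n_clusters_per_class within 2**n_informative</span>
<span>X, y = make_classification(n_features=2, n_redundant=0, n_informative=2, random_state=1, n_classes=4, n_clusters_per_class=1)</span>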
<span># Split</span>
<span>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)</span>
<span># Train</span>
<span>svm_model = svm.SVC(kernel='rbf', gamma=1, C=0.0001)</span>
<span>svm_model.fit(X_train, y_train)</span>
<span># Predict</span>
<span>y_pred = svm_model.predict(X_test)</span>
Unsupervised Learning
Unsupervised algorithms discover hidden structure in unlabeled data. Typical tasks are association rule mining (e.g., Apriori), clustering (e.g., K‑Means), and dimensionality reduction (e.g., PCA). The scikit‑learn examples in this section cover K‑Means and PCA.
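Apriori is named above without a snippet. As a rough illustration of the idea, here is a plain‑Python sketch that finds frequent itemsets level by level, using the fact that every subset of a frequent itemset must itself be frequent. The function name `apriori`, the toy transactions, and the support threshold are all hypothetical, and rule generation is left out.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Toy shopping baskets, invented for illustration
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```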
K‑Means Clustering
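K‑Means assigns each point to its nearest centroid and tries to minimize the within‑cluster sum of squared distances, which scikit‑learn exposes as `inertia_`. This sketch (an illustration under assumed settings; `n_init=10`, `random_state=0`, and the variable names are not from the article) recomputes the assignments and the objective by hand from the fitted centroids.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=5, cluster_std=0.6, random_state=0)
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

# Distance from every point to every centroid, shape (500, 5)
dists = np.linalg.norm(
    X[:, None, :] - kmeans.cluster_centers_[None, :, :], axis=2)

nearest = dists.argmin(axis=1)                   # index of the closest centroid
manual_inertia = (dists.min(axis=1) ** 2).sum()  # sum of squared distances

print(np.array_equal(nearest, kmeans.predict(X)))
print(manual_inertia, kmeans.inertia_)
```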
<span>from sklearn.datasets import make_blobs</span>
<span>from sklearn.cluster import KMeans</span>
<span>import matplotlib.pyplot as plt</span>
<span># Generate synthetic blobs</span>
<span>X, _ = make_blobs(n_samples=500, centers=5, cluster_std=0.6, random_state=0)</span>
<span># Fit K‑Means</span>
<span>kmeans = KMeans(n_clusters=5)</span>
<span>kmeans.fit(X)</span>
<span># Predict cluster labels</span>
<span>labels = kmeans.predict(X)</span>
<span># Visualize</span>
<span>plt.scatter(X[:,0], X[:,1], c=labels, cmap='viridis')</span>
<span>plt.scatter(kmeans.cluster_centers_[:,0], kmeans.cluster_centers_[:,1], c='red', marker='x')</span>
<span>plt.show()</span>
Principal Component Analysis (PCA)
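PCA projects the data onto the directions of greatest variance, so a natural question is how much variance the retained components keep. This sketch is an illustration rather than the article's code, and its variable names are assumptions. It reads `explained_variance_ratio_` for the Iris data and cross-checks it against the eigenvalues of the covariance matrix.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

pca = PCA(n_components=2).fit(X)
ratio = pca.explained_variance_ratio_  # variance share of each kept component

# Manual check: eigenvalues of the covariance matrix, largest first
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
manual_ratio = eigvals[:2] / eigvals.sum()

print("sklearn:", ratio, "total:", ratio.sum())
print("manual: ", manual_ratio)
```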
<span>from sklearn.decomposition import PCA</span>
<span>from sklearn.datasets import load_iris</span>
<span>import matplotlib.pyplot as plt</span>
<span># Load Iris data</span>
<span>data = load_iris()</span>
<span>X = data.data</span>
<span>y = data.target</span>
<span># Reduce to 2 dimensions</span>
<span>pca = PCA(n_components=2)</span>
<span>X_reduced = pca.fit_transform(X)</span>
<span># Plot</span>
<span>plt.scatter(X_reduced[y==0,0], X_reduced[y==0,1], c='r')</span>
<span>plt.scatter(X_reduced[y==1,0], X_reduced[y==1,1], c='g')</span>
<span>plt.scatter(X_reduced[y==2,0], X_reduced[y==2,1], c='b')</span>
<span>plt.show()</span>
Reinforcement Learning
Reinforcement learning trains an agent to take actions in an environment to maximize cumulative reward. The article mentions the concept but does not provide code examples.
Conclusion
The tutorial covers nine widely used machine learning algorithms, explains their core ideas, and supplies ready‑to‑run Python snippets based on scikit‑learn, enabling readers to experiment and deepen their understanding through hands‑on practice.
DataFunTalk
