Template Notebook for Building Machine Learning Models with Scikit-learn

This notebook provides ready‑to‑use Python code templates for ten common machine‑learning algorithms—including linear regression, logistic regression, decision trees, Naïve Bayes, SVM, K‑Nearest Neighbors, K‑Means, Random Forest, PCA, and Gradient Boosting—showing how to import, train, evaluate, and predict with scikit‑learn.

Python Programming Learning Circle

Jun 19, 2021

This notebook contains code templates for creating the main machine‑learning algorithms using scikit‑learn. By adjusting parameters, supplying data, training the model, and making predictions, users can quickly build and evaluate models.

1. Linear Regression

Import the linear_model module, create training and test subsets, instantiate a LinearRegression object, fit the model, evaluate its score, print coefficients and intercept, and make predictions.

# Import modules
from sklearn import linear_model

# Create training and test subsets
# (the right-hand names are placeholders: substitute your own arrays or DataFrames)
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model with training data and check the score
linear.fit(x_train, y_train)
print('Score (R^2): ', linear.score(x_train, y_train))

# Collect coefficients
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Make predictions
predicted_values = linear.predict(x_test)
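To see the template run end to end, here is a minimal sketch on synthetic data. The data-generating rule (y = 3*x0 - 2*x1 + 5 plus small noise), the 75/25 split, and the random seed are assumptions made for illustration and are not part of the original notebook. It also scores the model on a held-out split, since the training score alone tends to be optimistic.

```python
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): y = 3*x0 - 2*x1 + 5 plus small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Hold out 25% of the rows for evaluation
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)

# score() returns R^2 for regressors
print('Train R^2:', linear.score(x_train, y_train))
print('Test R^2:', linear.score(x_test, y_test))
print('Coefficients:', linear.coef_)
print('Intercept:', linear.intercept_)

predicted_values = linear.predict(x_test)
```

If the fit is sound, the printed coefficients should land close to 3 and -2, and the intercept close to 5.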

2. Logistic Regression

Replace LinearRegression with LogisticRegression, then fit, score, and predict similarly.

# Import modules
from sklearn.linear_model import LogisticRegression

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create logistic regression object
model = LogisticRegression()

# Train the model with training data and check the score (mean accuracy)
model.fit(x_train, y_train)
print('Score: ', model.score(x_train, y_train))

# Collect coefficients
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)

# Make predictions
predicted_values = model.predict(x_test)
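As a runnable illustration of the same steps, the sketch below fits a logistic regression on a generated two-class problem. The dataset from `make_classification` and all of its settings are assumptions chosen so the example is self-contained. It also calls `predict_proba`, which the template does not use but which is often what you want from a logistic model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Generated binary classification data (assumed for illustration)
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression()
model.fit(x_train, y_train)

# score() returns mean accuracy for classifiers
print('Test accuracy:', model.score(x_test, y_test))

# Hard class labels and per-class probabilities for the test rows
predicted_values = model.predict(x_test)
probabilities = model.predict_proba(x_test)
print(probabilities[:5])
```

Each row of `probabilities` sums to 1, with one column per class in the order given by `model.classes_`.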

3. Decision Tree

Switch to DecisionTreeRegressor or DecisionTreeClassifier, fit, score, and predict.

# Import modules
from sklearn import tree

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create a Decision Tree Regressor object (for a continuous target) ...
# model = tree.DecisionTreeRegressor()

# ... or a Decision Tree Classifier object (for class labels). Keep only one of the two.
model = tree.DecisionTreeClassifier()

# Train the model with training data and check the score
# (mean accuracy for the classifier, R^2 for the regressor)
model.fit(x_train, y_train)
print('Score: ', model.score(x_train, y_train))

# Make predictions
predicted_values = model.predict(x_test)
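The following sketch runs the classifier variant on the Iris dataset bundled with scikit-learn. The dataset, the `max_depth=3` limit, and the split are assumptions chosen for illustration. An unconstrained tree often reaches a perfect training score, so comparing it with a held-out score is a quick check for overfitting.

```python
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# Class labels call for the classifier; a continuous target would use DecisionTreeRegressor
model = tree.DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(x_train, y_train)

print('Train accuracy:', model.score(x_train, y_train))
print('Test accuracy:', model.score(x_test, y_test))

# Print the learned splitting rules as plain text
print(tree.export_text(model, feature_names=iris.feature_names))

predicted_values = model.predict(x_test)
```

`export_text` is handy in a notebook because it shows which features the tree actually splits on.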

4. Naïve Bayes

Use GaussianNB for classification.

# Import modules
from sklearn.naive_bayes import GaussianNB

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create GaussianNB object
model = GaussianNB()

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
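Here is a small sketch of GaussianNB on generated data. The three cluster centers passed to `make_blobs` are hypothetical values picked so the classes are well separated. The example also adds a held-out accuracy check, which the template leaves out.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Three well-separated Gaussian clusters (centers are assumed for illustration)
X, y = make_blobs(
    n_samples=300, centers=[[0, 0], [5, 5], [0, 5]],
    cluster_std=1.0, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GaussianNB()
model.fit(x_train, y_train)

print('Test accuracy:', model.score(x_test, y_test))

# theta_ holds the per-class mean of each feature estimated during fit
print('Per-class feature means:\n', model.theta_)

# Class probabilities for the first five test rows
class_probabilities = model.predict_proba(x_test[:5])
print(class_probabilities)
```

The rows of `model.theta_` should sit near the three centers used to generate the data, which is a quick sanity check that the model has learned one Gaussian per class.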

5. Support Vector Machine

Instantiate an SVC (or SVR) object, fit, score, and predict.

# Import modules
from sklearn import svm

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create SVM Classifier object
model = svm.SVC()  # use svm.SVR() for a regression problem

# Train the model with training data and check the score (mean accuracy)
model.fit(x_train, y_train)
print('Score: ', model.score(x_train, y_train))

# Make predictions
predicted_values = model.predict(x_test)
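SVMs are sensitive to feature scale, so in practice the classifier is commonly wrapped in a pipeline with a scaler. The sketch below shows one way to do that. The breast cancer dataset, the RBF kernel, and `C=1.0` are assumptions for illustration rather than recommendations from the original notebook.

```python
from sklearn import svm
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The scaler is fitted on the training data only, then reused for the test data
model = make_pipeline(StandardScaler(), svm.SVC(kernel='rbf', C=1.0))
model.fit(x_train, y_train)

print('Test accuracy:', model.score(x_test, y_test))

predicted_values = model.predict(x_test)
```

Because the scaler lives inside the pipeline, `fit`, `score`, and `predict` keep the same shape as in the template.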

6. K‑Nearest Neighbors

Adjust the n_neighbors hyper‑parameter, fit, and predict.

# Import modules
from sklearn.neighbors import KNeighborsClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create KNeighbors Classifier object
model = KNeighborsClassifier(n_neighbors = 6) # default value = 5

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
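One simple way to adjust `n_neighbors` is to compare a few candidate values with cross-validation. The sketch below does this on the Iris dataset. The dataset, the candidate list, and the 5-fold setting are all assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Mean 5-fold cross-validated accuracy for each candidate n_neighbors
scores = {}
for k in (1, 3, 5, 7, 9):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(scores)
print('Best n_neighbors:', best_k)
```

Small values of `n_neighbors` follow the training data closely, while larger values smooth the decision boundary, so the best setting depends on the dataset.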

7. K‑Means Clustering

Define the number of clusters, fit on the training data, and predict cluster assignments.

# Import modules
from sklearn.cluster import KMeans

# Create training and test subsets
# (K-Means is unsupervised, so no target variable is needed)
x_train = train_dataset_predictor_variables

x_test  = test_dataset_predictor_variables

# Create KMeans object
model = KMeans(n_clusters = 3, random_state = 0)

# Train the model with training data
model.fit(x_train)

# Make predictions
predicted_values = model.predict(x_test)
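The sketch below clusters generated points and then assigns new points to the learned clusters. The blob centers and `n_init=10` are assumptions for illustration. Note that no labels are passed to `fit`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated blobs (centers are assumed for illustration); labels are discarded
centers = [[0, 0], [6, 6], [0, 6]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.8, random_state=0)

model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X)

print('Cluster centers:\n', model.cluster_centers_)
print('Inertia (within-cluster sum of squares):', model.inertia_)

# Assign new points to the nearest learned center
new_points = np.array(centers, dtype=float)
predicted_values = model.predict(new_points)
print(predicted_values)
```

Cluster ids are arbitrary integers, so they should be compared only with each other and not with any original class labels.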

8. Random Forest

Instantiate RandomForestClassifier, fit on training data, and predict.

# Import modules
from sklearn.ensemble import RandomForestClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create Random Forest Classifier objects
model = RandomForestClassifier()

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
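A fitted random forest also exposes `feature_importances_`, which is often the next thing to inspect after the predictions. The sketch below uses the Wine dataset bundled with scikit-learn. The dataset, `n_estimators=100`, and the split are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

wine = load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=0, stratify=wine.target
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(x_train, y_train)

print('Test accuracy:', model.score(x_test, y_test))

# Impurity-based importances, sorted from most to least important
order = np.argsort(model.feature_importances_)[::-1]
for i in order[:5]:
    print(wine.feature_names[i], round(model.feature_importances_[i], 3))
```

The importances sum to 1 across all features, so each value can be read as a share of the total.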

9. Dimensionality Reduction

Use PCA or FactorAnalysis to reduce the feature space before training.

# Import modules
from sklearn import decomposition

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Create PCA decomposition object
# (k is a placeholder: the number of components to keep, at most min(n_samples, n_features))
pca = decomposition.PCA(n_components = k)

# Create Factor Analysis decomposition object (an alternative to PCA)
fa = decomposition.FactorAnalysis()

# Reduce the size of the training set using PCA
reduced_train = pca.fit_transform(x_train)

# Reduce the size of the test set using the PCA fitted on the training set
reduced_test = pca.transform(x_test)
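The sketch below reduces the four Iris features to two components. The dataset, the choice of two components, and the standardization step are assumptions for illustration. PCA is driven by variance, so features are usually put on a common scale first.

```python
from sklearn import decomposition
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit the scaler and PCA on the training data only, then apply both to the test data
scaler = StandardScaler().fit(x_train)
pca = decomposition.PCA(n_components=2)
reduced_train = pca.fit_transform(scaler.transform(x_train))
reduced_test = pca.transform(scaler.transform(x_test))

print(reduced_train.shape, reduced_test.shape)
print('Explained variance ratio:', pca.explained_variance_ratio_)
```

`explained_variance_ratio_` shows how much of the variance each component keeps, which helps in choosing the number of components.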

10. Gradient Boosting and AdaBoost

Instantiate GradientBoostingClassifier (or AdaBoostClassifier), fit, and predict.

# Import modules
from sklearn.ensemble import GradientBoostingClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test  = test_dataset_predictor_variables

# Creating Gradient Boosting Classifier object
model = GradientBoostingClassifier(n_estimators = 100, learning_rate = 1.0, max_depth = 1, random_state = 0)

# Training the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
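The template only shows gradient boosting, so here is a sketch that fits both boosting variants side by side. The breast cancer dataset, the split, and the AdaBoost settings are assumptions for illustration. The gradient boosting parameters are the ones from the template.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    'GradientBoosting': GradientBoostingClassifier(
        n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0
    ),
    'AdaBoost': AdaBoostClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its held-out accuracy
test_scores = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    test_scores[name] = model.score(x_test, y_test)
    print(name, 'test accuracy:', test_scores[name])
```

Both classes share the same `fit`, `score`, and `predict` interface, so swapping one for the other is a one-line change.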

The workflow for each algorithm involves defining a business problem, preprocessing data, training the model, tuning hyper‑parameters, validating results, and iterating until satisfactory accuracy is achieved.
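The preprocessing, tuning, and validation steps of that workflow can be sketched with a pipeline and a grid search. The Wine dataset, the SVC model, and the parameter grid below are hypothetical choices made for illustration. Any of the estimators above could take their place.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Keep a test set aside that the tuning step never sees
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model in one estimator, so scaling is refitted inside each CV fold
pipeline = Pipeline([('scale', StandardScaler()), ('clf', SVC())])

# Candidate hyper-parameters (assumed values), addressed as <step name>__<parameter>
param_grid = {'clf__C': [0.1, 1, 10], 'clf__gamma': ['scale', 0.01]}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(x_train, y_train)

print('Best parameters:', search.best_params_)
print('Cross-validated accuracy:', search.best_score_)
print('Held-out test accuracy:', search.score(x_test, y_test))
```

The cross-validated score guides the tuning, while the held-out test score is looked at only once at the end as the final validation.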
