When to Use Logistic Regression, SVM, Decision Trees, and More? A Practical Frequency Guide

This article analyzes how often common machine‑learning algorithms such as k‑NN, Naïve Bayes, decision trees, SVM, logistic regression, and neural networks are used in industry, explains their typical scenarios, highlights strengths and weaknesses, and shows how non‑linearity and feature engineering affect their suitability.

Baobao Algorithm Notes

Jan 5, 2022

Introduction

By 2021, neural networks had become indispensable in many domains, yet a wide range of classic algorithms still play crucial roles. This discussion focuses on supervised methods, which dominate real‑world applications, and aims to show the key differences that determine when each algorithm is appropriate.

Logistic Regression (LR)

For two input features x and y, logistic regression models the class probability as sigmoid(ax + by + c) and learns the three parameters a, b, and c. Historically, LR was heavily used in large‑scale advertising systems (e.g., with billions of features). Its main limitation is linearity; to handle non‑linear patterns you can either:

Introduce a kernel‑style term, e.g., sigmoid(ax + by + k·x·y^(-2) + c), which expands the feature space with a non‑linear interaction between the two features.

Perform feature engineering, such as computing BMI = weight / height², which captures the underlying relationship and may even replace the raw height and weight features.
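The feature-engineering option can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the height/weight grid, the BMI ≥ 25 labelling rule, the learning rate, and the step count are all invented for the example and do not come from any real dataset or from the original article.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(column):
    # Rescale one feature column to zero mean and unit variance.
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

def fit_logistic(rows, labels, lr=0.5, steps=2000):
    # Full-batch gradient descent on the mean log-loss.
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(rows, labels, w, b):
    # Fraction of rows whose predicted class matches the label.
    hits = 0
    for x, y in zip(rows, labels):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
        hits += int(pred == y)
    return hits / len(rows)

# Hypothetical synthetic data: a grid of (weight in kg, height in m) pairs.
heights = [1.50 + 0.05 * i for i in range(10)]
weights = [45.0 + 5.0 * j for j in range(14)]
people = [(wt, ht) for ht in heights for wt in weights]
# Assumed labelling rule for the sketch: class 1 when BMI >= 25.
labels = [1 if wt / ht ** 2 >= 25.0 else 0 for wt, ht in people]

# Engineered feature: BMI alone, in place of raw weight and height.
bmi_rows = [[v] for v in standardize([wt / ht ** 2 for wt, ht in people])]
w_bmi, b_bmi = fit_logistic(bmi_rows, labels)
acc_bmi = accuracy(bmi_rows, labels, w_bmi, b_bmi)
print("accuracy with BMI as the only feature:", acc_bmi)
```

Because the label here is defined through BMI, a single linear threshold on the engineered feature is enough, which is the sense in which BMI can stand in for the raw height and weight columns.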

Naïve Bayes

Often introduced via text‑classification (spam detection), Naïve Bayes applies Bayes’ theorem:

p(class|features) = p(features|class) * p(class) / p(features)

It assumes that features are conditionally independent given the class, which makes training a simple counting operation and inference extremely fast. However, this strong independence assumption limits its applicability to very simple problems.
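To make the "training is counting" point concrete, here is a minimal multinomial Naïve Bayes sketch for spam detection. The six training sentences, the function names, and the add-one (Laplace) smoothing constant are assumptions chosen for illustration; p(features) is dropped because it is the same for every class and does not change which class scores highest.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    # Training is pure counting: documents per class and words per class.
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(model, text, alpha=1.0):
    # Score each class with log p(class) + sum of log p(word | class).
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][word]
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy corpus.
docs = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("win free prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
    ("project meeting tomorrow", "ham"),
]
model = train_nb(docs)
print(predict_nb(model, "free money"))        # spam
print(predict_nb(model, "meeting tomorrow"))  # ham
```

The per-word probabilities are multiplied (added in log space) as if words were independent given the class, which is exactly the assumption described above.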

Support Vector Machine (SVM)

SVM is, at its core, a linear classifier that seeks the maximum‑margin hyperplane separating the classes. It offers clear geometric interpretability but struggles with data that are not linearly separable. The kernel trick implicitly maps the data into a higher‑dimensional space where a separating hyperplane may exist, similar to the kernel approach described for LR. In the body‑type example (height and weight as raw features), adding a BMI feature makes the problem linearly separable, while without BMI a kernel is required.
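The idea of lifting data into a higher-dimensional space can be checked numerically. The sketch below uses a degree-2 polynomial kernel, which is an assumed choice for illustration (the article does not name a specific kernel), and shows that the kernel value equals an ordinary dot product in an explicit three-dimensional feature space.

```python
import math

def poly_kernel(x, z):
    # Degree-2 polynomial kernel on 2-D inputs: K(x, z) = (x . z)^2.
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    # Explicit lift of a 2-D point into the 3-D space implied by the kernel.
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, 4.0)
implicit = poly_kernel(x, z)                   # computed in the original 2-D space
explicit = dot(feature_map(x), feature_map(z)) # computed in the lifted 3-D space
print(implicit, explicit)
```

Both numbers agree (up to floating-point rounding), so a linear separator in the lifted space can be trained while only ever evaluating the kernel on the original inputs.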

Decision Tree

Decision trees address two major issues of linear models: collinearity and noise. Their advantages include:

Robustness to outliers and missing values; they can learn directly from such data.

Automatic handling of moderate non‑linearity through greedy splitting, with sampling‑based regularization (for example, the row and feature subsampling used in tree ensembles) to limit over‑fitting.

Excellent interpretability, providing feature‑importance scores that guide further feature engineering.

Their limitations are restricted non‑linear capacity on high‑dimensional dense data (e.g., images) and diminishing returns as data volume grows.
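Greedy splitting and robustness to outliers can both be seen in a one-feature decision stump. The following is a minimal sketch: Gini impurity is an assumed split criterion (the article does not name one), and the data values are made up for the example.

```python
def gini(labels):
    # Gini impurity for binary labels (0 or 1).
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(xs, ys):
    # Greedy search: try every midpoint between sorted neighbours and keep
    # the threshold with the lowest weighted impurity.
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_score, best_threshold = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_score, best_threshold = score, threshold
    return best_threshold, best_score

ys = [0, 0, 0, 1, 1, 1]
clean = best_split([1, 2, 3, 10, 11, 12], ys)
with_outlier = best_split([1, 2, 3, 10, 11, 1000], ys)  # 12 replaced by an extreme value
print(clean, with_outlier)
```

The chosen threshold depends only on the ordering of the values, so replacing 12 with 1000 leaves the split unchanged, whereas a linear model fitted to the same feature would be pulled toward the extreme point.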

Neural Network (NN)

Neural networks excel at automatic feature learning and express powerful non‑linear mappings, especially on dense data such as text and images. Their strengths are:

End‑to‑end feature extraction without manual engineering.

Large model capacity that benefits from abundant data.

Drawbacks include sensitivity to outliers, lack of interpretability, and a higher risk of over‑fitting when data are noisy.
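The non-linear capacity mentioned above can be illustrated with the smallest possible case: XOR, which no single linear boundary can separate. In this sketch the weights are set by hand rather than learned, purely to show that one hidden layer of two ReLU units is enough to represent the mapping; in practice the weights would be found by training.

```python
def relu(z):
    return max(0.0, z)

def tiny_network(x1, x2):
    # Hidden layer: two ReLU units with hand-picked weights and biases.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a linear combination of the hidden activations.
    return 1.0 * h1 - 2.0 * h2

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [tiny_network(a, b) for a, b in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The hidden activations h1 and h2 play the role of features the network builds for itself; the output layer is linear in them even though it is non-linear in the raw inputs.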

Algorithm Comparison

LR and similar linear models: Very fast and highly interpretable, but lack non‑linearity and rely on handcrafted features or kernels.

Decision Tree: Robust to anomalies and missing values, moderate non‑linearity, good interpretability, but limited on large‑scale dense data.

Neural Network: Automatic feature learning and high capacity, suitable for text/image data, but less interpretable and more sensitive to noise.

Horizontal Comparison

The accompanying chart plots the difficulty of feature engineering (how hard it is to craft useful features) on the y‑axis and data non‑linearity on the x‑axis. As non‑linearity increases, linear models lose effectiveness, while tree‑based models and neural networks become dominant. The area each algorithm covers in the chart roughly matches its real‑world usage frequency: neural networks dominate a large portion of the space, decision trees cover most of the remaining area, and linear models occupy only a small niche.

Written by

Baobao Algorithm Notes

Author of the BaiMian large model, offering technology and industry insights.
