KDnuggets 2016 Poll: Top Algorithms Used by Data Scientists – Usage Trends and Industry vs. Academia Analysis

The KDnuggets 2016 poll of 844 data scientists reveals the most popular algorithms, shifts since 2011, differences in usage across employment sectors, regional participation, and an industry‑academic affinity metric, highlighting a rise in boosting, text mining, visualization, and deep learning while noting declines in association rules and uplift modeling.

Architects Research Society

Sep 27, 2016

The latest KDnuggets poll asked 844 data scientists which methods or algorithms they used in the past 12 months for real data‑science applications. The results show the top 10 algorithms by share of voters and a notable increase in the average number of algorithms used per respondent (8.1).

Fig. 1: Top 10 algorithms used by Data Scientists.

Compared with the 2011 poll, the core methods (Regression, Clustering, Decision Trees/Rules, Visualization) remain the most widely used. The biggest relative gains since 2011 are:

Boosting, up 40% to 32.8% share in 2016

Text Mining, up 30% to 35.9%

Visualization, up 27% to 48.7%

Time series/Sequence analysis, up 25% to 37.0%

Anomaly/Deviation detection, up 19% to 19.5%

Ensemble methods, up 19% to 33.6%

SVM, up 18% to 33.6%

Regression, up 16% to 67.1%

The most popular of the options newly added to the poll in 2016 are K‑nearest neighbors (46%), PCA (43%), Random Forests (38%), Optimization (24%), Neural networks – Deep Learning (19%), and Singular Value Decomposition (16%).

The largest relative declines since 2011 are:

Association rules declined 47% to 15.3%

Uplift modeling declined 36% to 3.1%

Factor Analysis declined 24% to 14.2%

Survival Analysis declined 15% to 7.9%
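The gains and declines listed above read as relative changes in usage share rather than percentage-point differences. The following is a minimal sketch under that assumption (the formula `share_2016 / share_2011 - 1` is inferred here, not stated in the text); the function name `relative_change` is hypothetical, and the 2011 shares are the rounded figures from Table 3, so the results only approximate the published percentages.

```python
def relative_change(share_2016: float, share_2011: float) -> float:
    """Relative change in usage share from 2011 to 2016, as a fraction."""
    if share_2011 <= 0:
        raise ValueError("2011 share must be positive")
    return share_2016 / share_2011 - 1.0


# (2016 share, 2011 share) in percent. The 2011 figures are the rounded
# values from the summary table, so the results are approximate.
shares = {
    "Regression": (67.1, 58.0),
    "Association rules": (15.3, 29.0),
}

changes = {name: relative_change(*pair) for name, pair in shares.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
```

With these inputs the sketch gives roughly +16% for Regression and -47% for Association rules, in line with the figures quoted above.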

Table 1: Algorithm usage by Employment Type

| Employment Type | % Voters | Avg Num Algorithms Used | % Used Supervised | % Used Unsupervised | % Used Meta | % Used Other Methods |
| --- | --- | --- | --- | --- | --- | --- |
| Industry | 59% | 8.4 | 94% | 81% | 55% | 83% |
| Government/Non‑profit | 4.1% | 9.5 | 91% | 89% | 49% | 89% |
| Student | 16% | 8.1 | 94% | 76% | 47% | 77% |
| Academia | 12% | 7.2 | 95% | 81% | 44% | 77% |
| All | | 8.3 | 94% | 82% | 48% | 81% |

Almost everyone uses supervised learning algorithms. Industry data scientists use more algorithms on average (8.4) than students (8.1) or academics (7.2) and are the most likely to use meta‑algorithms (55%). Government/non‑profit data scientists favor visualization, PCA, and time series. Academic researchers lean toward PCA and deep learning, and students do more text mining and deep learning.

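Table 2: Top 10 Algorithms + Deep Learning usage by Employment Type

| Algorithm | Industry | Government/Non‑profit | Academia | Student | All |
| --- | --- | --- | --- | --- | --- |
| Regression | 71% | 63% | 51% | 64% | 67% |
| Clustering | 58% | 63% | 51% | 58% | 57% |
| Decision Trees/Rules | 59% | 63% | 38% | 57% | 55% |
| Visualization | 55% | 71% | 28% | 47% | 49% |
| K‑NN | 46% | 54% | 48% | 47% | 46% |
| PCA | 43% | 57% | 48% | 40% | 43% |
| Statistics | 47% | 49% | 37% | 36% | 43% |
| Time series | 42% | 54% | 26% | 24% | 37% |
| Text Mining | 36% | 40% | 33% | 38% | 36% |

Two rows are incomplete, so their values are listed in the order they appear rather than assigned to columns:

Random Forests: 40%, 29%, 36%, 38%

Deep Learning: 18%, 24%, 19%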

Algorithm bias for a specific employment type is computed as Bias(Alg, Type) = Usage(Alg, Type) / Usage(Alg, All) - 1. The bias plot (Fig. 2) shows that industry data scientists are more likely to use Regression, Visualization, Statistics, Random Forests, and Time Series, while academia leans toward PCA and Deep Learning.

Fig. 2: Algorithm usage bias by Employment.
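The bias formula can be checked against a few rows of Table 2. The sketch below is illustrative only: the function name `usage_bias` and the dictionary layout are hypothetical, and the inputs are the rounded percentages from the table, so the computed values may differ slightly from those plotted in Fig. 2.

```python
def usage_bias(usage_type: float, usage_all: float) -> float:
    """Bias(Alg, Type) = Usage(Alg, Type) / Usage(Alg, All) - 1."""
    if usage_all <= 0:
        raise ValueError("overall usage must be positive")
    return usage_type / usage_all - 1.0


# Rounded usage shares (%) for three algorithms, copied from Table 2.
usage = {
    "Regression": {"Industry": 71, "Academia": 51, "All": 67},
    "Visualization": {"Industry": 55, "Academia": 28, "All": 49},
    "PCA": {"Industry": 43, "Academia": 48, "All": 43},
}

# Bias of each employment type relative to all respondents.
bias = {
    alg: {
        group: usage_bias(share, row["All"])
        for group, share in row.items()
        if group != "All"
    }
    for alg, row in usage.items()
}

for alg, row in bias.items():
    cells = ", ".join(f"{group} {value:+.2f}" for group, value in row.items())
    print(f"{alg}: {cells}")
```

A positive value means the group uses the algorithm more than respondents overall. In this sample, Regression and Visualization come out positive for Industry and negative for Academia, while PCA is positive for Academia.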

Regional participation mirrors overall KDnuggets traffic: US/Canada (40%), Europe (32%), Asia (18%), Latin America (5%), Africa/Middle East (3.4%), Australia/NZ (2.2%).

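Affinity of an algorithm to Industry/Government versus Academia/Students is calculated as (N(Alg, Ind_Gov) / N(Alg, Aca_Stu)) / (N(Ind_Gov) / N(Aca_Stu)) - 1, where N(Alg, Group) is the number of respondents in a group who used the algorithm and N(Group) is the total number of respondents in that group. Values near 0 indicate equal use; positive values denote “industrial” algorithms, negative values denote “academic” ones.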
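The affinity calculation can be sketched in a few lines. The poll's raw respondent counts are not given here, so every count below is hypothetical and chosen only to show how the sign of the result behaves; the function name `industry_affinity` is likewise an assumption.

```python
def industry_affinity(
    n_alg_ind_gov: int,
    n_alg_aca_stu: int,
    n_ind_gov: int,
    n_aca_stu: int,
) -> float:
    """(N(Alg,Ind_Gov) / N(Alg,Aca_Stu)) / (N(Ind_Gov) / N(Aca_Stu)) - 1."""
    if min(n_alg_aca_stu, n_ind_gov, n_aca_stu) <= 0:
        raise ValueError("denominator counts must be positive")
    return (n_alg_ind_gov / n_alg_aca_stu) / (n_ind_gov / n_aca_stu) - 1.0


# Hypothetical group sizes, NOT the poll's actual counts.
N_IND_GOV = 500
N_ACA_STU = 250

# Hypothetical (industry/government users, academia/student users) pairs.
examples = {
    "used equally": (100, 50),
    "industrial": (150, 25),
    "academic": (50, 50),
}

affinity = {
    label: industry_affinity(ind, aca, N_IND_GOV, N_ACA_STU)
    for label, (ind, aca) in examples.items()
}

for label, value in affinity.items():
    print(f"{label}: {value:+.2f}")
```

Dividing by the ratio of group sizes keeps the larger Industry/Government group from looking "industrial" by headcount alone: an algorithm used by the same fraction of both groups scores 0.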

The most “industrial” algorithms are Uplift modeling (2.01), Anomaly Detection (1.61), Survival Analysis (1.39), Factor Analysis (0.83), Time series/Sequences (0.69), and Association Rules (0.5). Despite its high industrial affinity, uplift modeling is used by only 3.1% of respondents.

The most “academic” algorithms are Neural networks – regular (-0.35), Naive Bayes (-0.35), SVM (-0.24), Deep Learning (-0.19), and EM (-0.17).

Fig. 3: KDnuggets Poll: Top Algorithms used by Data Scientists – Industry vs Academia

Table 3: KDnuggets 2016 Poll – Algorithms Used by Data Scientists (summary)

| Algorithm | Type | 2016 % used | 2011 % used | % Change | Industry Affinity |
| --- | --- | --- | --- | --- | --- |
| Regression | | 67% | 58% | 16% | 0.21 |
| Clustering | | 57% | 52% | 8.7% | 0.05 |
| Decision Trees/Rules | | 55% | 60% | -7.3% | 0.21 |
| Visualization | | 49% | 38% | 27% | 0.44 |
| K‑nearest neighbors | | 46% | | | 0.32 |
| PCA | | 43% | | | 0.02 |
| Statistics | | 43% | 48% | -11.0% | 1.39 |
| Random Forests | | 38% | | | 0.22 |
| Time series/Sequence analysis | | 37% | 30% | 25.0% | 0.69 |
| Text Mining | | 36% | 28% | 29.8% | 0.01 |
| Ensemble methods | | 34% | 28% | 18.9% | -0.17 |
| SVM | | 34% | 29% | 17.6% | -0.24 |
| Boosting | | 33% | 23% | 40% | 0.24 |
| Neural networks – regular | | 24% | 27% | -10.5% | -0.35 |
| Optimization | | 24% | | | 0.07 |
| Naive Bayes | | 24% | 22% | 8.9% | -0.02 |
| Bagging | | 22% | 20% | 8.8% | 0.02 |
| Anomaly/Deviation detection | | 20% | 16% | 19% | 1.61 |
| Neural networks – Deep Learning | | 19% | | | -0.35 |
| Singular Value Decomposition | | 16% | | | 0.29 |
| Association rules | | 15% | 29% | -47% | 0.50 |
| Graph / Link / Social Network Analysis | | 15% | 14% | 8.0% | -0.08 |
| Factor Analysis | | 14% | 19% | -23.8% | 0.14 |
| Bayesian networks | | 13% | | | -0.10 |
| Genetic algorithms | | 8.8% | 9.3% | -6.0% | 0.83 |
| Survival Analysis | | 7.9% | 9.3% | -14.9% | -0.15 |
| | | 6.6% | | | -0.19 |
| Other methods | | 4.6% | | | -0.06 |
| Uplift modeling | | 3.1% | 4.8% | -36.1% | 2.01 |

This comprehensive poll provides a snapshot of current data‑science practice, showing a shift toward more advanced techniques such as boosting and deep learning, while traditional association‑rule mining continues to decline.
