Tagged articles
26 articles
Data Party THU
Oct 16, 2025 · Fundamentals

Mastering Anomaly vs Novelty Detection with Distribution Fitting in Python

This article explains the fundamental differences between anomaly and novelty detection, outlines how to model univariate outliers using probability distribution fitting with the distfit library, and demonstrates the workflow on synthetic height data and real natural‑gas price data, including model selection, visualization, and prediction.

Python · Statistical Modeling · anomaly detection
0 likes · 17 min read
Data Party THU
Oct 5, 2025 · Fundamentals

Which Probability Distribution Fits Your Data? A Practical Guide to 8 Core Models

This article presents eight essential probability distributions for everyday data‑science tasks, explains when to use each, provides concise Python code for fitting and sampling, and shares practical tips and a real‑world case study to help you choose the right model quickly.

Statistical Modeling · data analysis · probability distribution
0 likes · 11 min read
DataFunSummit
Jul 26, 2024 · Big Data

Understanding Power Law Distributions in Content Ecosystems: Data Science Insights and Applications

This article explores how power‑law and other heavy‑tailed distributions appear in content ecosystems, explains their statistical foundations, discusses why they are common, and presents data‑driven strategies—including integer programming, graph‑based creator analysis, and causal inference—to optimize content production, recommendation, and settlement policies.

Big Data · Data Science · Power Law
0 likes · 18 min read
Model Perspective
Jul 12, 2024 · Fundamentals

Why Lognormal Distribution Is Key to Modeling Rainfall and Financial Data

The lognormal distribution, in which a variable's logarithm follows a normal distribution, models positive, right‑skewed quantities such as rainfall, river flow, asset prices, and biological sizes. This article explains its definition and properties and works through a practical rainfall‑modeling case study.

Environment · Statistical Modeling · finance
0 likes · 5 min read
Model Perspective
Apr 28, 2024 · Fundamentals

Why Simple Linear Regression Falls Short and How Hierarchical Models Solve It

Linear regression often fails to capture nested data structures, but hierarchical (multilevel) linear models address this limitation by modeling both within‑group and between‑group variation, enabling nuanced analysis of factors like school type on student performance and extending to fields such as ecology and health.

Statistical Modeling · educational statistics · hierarchical linear model
0 likes · 11 min read
Model Perspective
Sep 17, 2023 · Fundamentals

Why Correlation Isn’t Causation: Methods to Reveal True Relationships in Data

This article explains the difference between correlation and causation, illustrates common misconceptions with real‑world examples, and introduces statistical tools such as randomized experiments, instrumental variables, propensity score matching, and difference‑in‑differences that help researchers uncover genuine causal effects in mathematical modeling.

Statistical Modeling · causality · correlation
0 likes · 9 min read
Python Programming Learning Circle
May 26, 2023 · Fundamentals

Introduction to Statsmodels: Installation, Data Loading, and Basic Statistical Analysis with Python

This article introduces the Python Statsmodels library, explains its key features such as linear regression, GLM, time‑series and robust methods, and shows how to install it, load data with pandas, and perform descriptive statistics, visualization, hypothesis testing, and simple and multiple linear regression.

Python · Statistical Modeling · Statsmodels
0 likes · 6 min read
DataFunSummit
May 8, 2023 · Fundamentals

Understanding Data Distributions: Normal vs. Power Law in Content Ecosystems

This article explores how data in content ecosystems is distributed, contrasting the classic normal distribution with heavy‑tailed power‑law patterns, explains why power‑law appears frequently, discusses its statistical properties and risks, and presents practical optimization and causal‑inference methods applied to creator incentives and platform strategies.

Power Law · Statistical Modeling · content ecosystem
0 likes · 20 min read
Model Perspective
Mar 31, 2023 · Big Data

How to Model Used Sailboat Prices and Rethink the Future of the Olympics

These COMAP MCM problem statements challenge teams to develop statistical models for pricing used sailboats using a large 2023 dataset and to propose innovative strategies for the Olympic Games, evaluating regional effects, data sources, and policy recommendations for sustainable hosting.

Olympics · Statistical Modeling · data modeling
0 likes · 10 min read
Model Perspective
Nov 8, 2022 · Fundamentals

Mastering Multiple Linear Regression: Theory, Estimation, and Prediction

This article explains the fundamentals of multiple linear regression, covering model formulation, least‑squares estimation of coefficients, statistical tests for significance, and how to use the fitted equation for accurate predictions and confidence intervals.

Least Squares · Multiple Linear Regression · Prediction
0 likes · 5 min read
Model Perspective
Sep 14, 2022 · Fundamentals

Mastering Grouped and Dummy Variable Regression: Weighted Models Explained

This article explains how regression can handle grouped (aggregated) data using weighted least squares, illustrates the impact of heteroskedasticity, and shows how dummy variables encode categorical factors for flexible, non‑parametric modeling of treatment effects.

Dummy Variables · Statistical Modeling · grouped data
0 likes · 12 min read
Model Perspective
Aug 2, 2022 · Fundamentals

How ARMA Models Enable Accurate Time Series Forecasting

This article explains the recursive forecasting formulas for ARMA and MA(q) time‑series models, showing how forecasts depend only on past observations, how model invertibility ensures stability, and how estimated parameters are used in practical prediction.

ARMA · MA(q) · Statistical Modeling
0 likes · 2 min read
Model Perspective
Jul 12, 2022 · Fundamentals

How Simple Linear Regression Uncovers Hidden Relationships in Data

This article explains the theory and practice of simple linear regression, covering deterministic vs. stochastic relationships, the least‑squares estimation of coefficients, goodness‑of‑fit measures such as R², hypothesis testing for linearity, and a real‑world case linking wine consumption to heart‑disease mortality.

Least Squares · R-squared · Statistical Modeling
0 likes · 8 min read
Meituan Technology Team
Mar 24, 2022 · Artificial Intelligence

Cyclic Generative Adversarial Networks for Probability Density Estimation – Academic Salon by Tsinghua University & Meituan Digital Life

The Tsinghua‑Meituan Digital Life Joint Research Institute’s academic salon will feature Associate Professor Jiang Rui presenting a cyclic generative adversarial network for probability density estimation, demonstrating how merging statistical models with deep‑learning techniques can solve core statistical problems and foster industry‑academia innovation.

Deep Learning · Generative Adversarial Networks · Probability Density Estimation
0 likes · 4 min read
DataFunTalk
Jan 2, 2022 · Fundamentals

Survival Analysis for User Churn: Concepts, Data Preparation, and Quantitative Modeling

This article introduces survival analysis, explains how to model user churn by defining purchase and cancellation times as birth and death events, describes data formatting, presents descriptive Kaplan‑Meier results, and shows how Cox regression quantifies the impact of factors such as membership and activity on user survival.

Statistical Modeling · cox regression · data analysis
0 likes · 7 min read
DeWu Technology
Mar 4, 2021 · Fundamentals

Dominance Analysis for Attribution in Data Analytics

The article explains that attribution analysis of metric declines requires a quantitative approach and introduces Dominance Analysis, an econometric technique that decomposes regression R² into variable‑specific contributions by fitting all subset models, averaging each variable's marginal contribution, and ranking the factors. It also provides a Python implementation using the dominance‑analysis package, illustrated on the Boston Housing dataset.

Data Analytics · Statistical Modeling · attribution
0 likes · 7 min read
Ctrip Technology
Sep 24, 2020 · Artificial Intelligence

Time Series Analysis and ARIMA Modeling Practice with Python

This article introduces time series fundamentals, classification, and challenges for internet businesses, then provides a step‑by‑step Python tutorial on ARIMA modeling—including data loading, stationarity testing, differencing, ACF/PACF analysis, AIC‑based order selection, model training, prediction, error evaluation, exogenous variable integration, and diagnostic checks.

ARIMA · Python · Statistical Modeling
0 likes · 11 min read
Efficient Ops
May 26, 2020 · Information Security

5 Correlation Analysis Models Every Security Engineer Should Know

This article explores five primary correlation analysis models (rule‑based, statistical, threat‑intelligence‑based, context‑based, and big‑data‑driven), detailing their principles and typical use cases such as single‑log alerts, event‑count thresholds, multi‑value detections, and temporal sequences, and explaining how accurate log parsing underpins effective security analytics.

Statistical Modeling · correlation analysis · rule-based detection
0 likes · 15 min read
dbaplus Community
Feb 8, 2018 · Artificial Intelligence

Unlocking Data Value: A Practical Guide to Bayesian Theorem and Its Applications

This article explains the fundamentals of Bayes' theorem, shows how to compute prior, likelihood, and posterior probabilities, demonstrates Bayesian A/B testing with Python code, introduces Bayesian networks for causal inference, and discusses the role of Bayesian methods in machine learning and data‑driven decision making.

AB testing · Bayesian · Statistical Modeling
0 likes · 11 min read