Tagged articles
521 articles
DataFunSummit
Oct 25, 2021 · Big Data

Building a Multi-Dimensional Analysis System: Practice at Baixin Bank

This talk by Baixin Bank's BI leader outlines the bank's business model, multi-dimensional data analysis requirements, and the design of a laddered analysis solution, including indicator and analysis system construction, user‑product‑enterprise scenario modeling, and productization of data insights for operational decision‑making.

Banking, Big Data, Business Intelligence
0 likes · 20 min read
MaGe Linux Operations
Oct 23, 2021 · Backend Development

How to Scrape and Analyze Taobao Snack Sales Data with Python

This article walks through a real‑world Python project that uses Selenium to crawl the first ten pages of Taobao snack listings, extracts sales, price and location data, visualizes price distribution and geographic concentration, generates a word‑cloud of top user comments, and lists the top‑selling stores, providing full source code for replication.

Python, Selenium, Taobao
0 likes · 12 min read
Python Crawling & Data Mining
Oct 12, 2021 · Fundamentals

Plotly Express Quick Start: Create Stunning Interactive Visualizations with Minimal Code

This article introduces Plotly Express, a high‑level Python visualization library, covering installation, built‑in datasets, color palettes, themes, and step‑by‑step examples for bar, scatter, bubble, matrix, area, line, pie, sunburst, funnel, 3D, map, and polar charts, demonstrating how to generate dynamic, publication‑ready plots with just a few lines of code.

Interactive Charts, Plotly Express, Python visualization
0 likes · 13 min read
Python Programming Learning Circle
Oct 11, 2021 · Fundamentals

Essential Pandas Techniques for Data Analysis in Python

This article presents a comprehensive guide to essential Pandas operations, including creating Series and DataFrames, common methods for data selection, indexing, grouping, reading and writing files, handling missing values, sorting, statistical analysis, and data transformation, with practical code examples for each feature.

data analysis, data cleaning, dataframe
0 likes · 16 min read
Python Crawling & Data Mining
Sep 27, 2021 · Fundamentals

Master Pandas Text Manipulation: From Basics to Advanced String Operations

This guide walks you through handling textual data with pandas, covering basic and new string dtypes, essential string methods for formatting, alignment, counting, encoding, and advanced operations such as splitting, replacing, concatenating, matching, and extracting patterns, all illustrated with clear code examples.

Python, String Methods, data analysis
0 likes · 13 min read
ByteDance SE Lab
Sep 17, 2021 · Product Management

Why A/B Testing Matters: Cases, Architecture & Best Practices

This article explains why A/B testing is essential, illustrates real-world examples from ByteDance, details the multi-layer architecture of the Volcano Engine A/B testing system, outlines experiment design, implementation, statistical analysis, best practices, and future trends, providing a comprehensive guide for product teams.

A/B testing, data analysis, experiment design
0 likes · 18 min read
Python Programming Learning Circle
Sep 16, 2021 · Artificial Intelligence

10 Creative Python Project Ideas to Boost Your Skills

This article presents ten diverse Python project concepts—including voice‑controlled GUIs, AI betting and trading bots, a virtual assistant, concert monitoring, automatic SSL renewal, facial recognition, contact tracing, file organization, and YouTube career‑path aggregation—to help developers practice and expand their programming and artificial‑intelligence expertise.

artificial intelligence, data analysis, project ideas
0 likes · 11 min read
Python Crawling & Data Mining
Sep 14, 2021 · Fundamentals

What TV Fans Say: Analyzing 97,331 Danmu Comments with Python

Using Python and pandas, this article collects and analyzes 97,331 danmu comments from the first episode of Mango TV’s “Brother” show, presenting data previews, word clouds, top‑liked remarks, super‑active users, and favorite performers, while also sharing the data‑scraping script.

Danmu, data analysis, pandas
0 likes · 8 min read
DeWu Technology
Sep 12, 2021 · Operations

How Technical Support (TS) Improves Self-Processing Rate

Technical Support teams boost self‑processing rates by building personal knowledge bases, using diagnostic tools, standardizing issue templates, managing feedback channels, creating automation tools, analyzing data trends, and conducting regular post‑mortems, thereby reducing developer support load and increasing overall efficiency.

data analysis, issue management, knowledge management
0 likes · 15 min read
Xianyu Technology
Sep 7, 2021 · Big Data

Analyzing Business Data Fluctuations and Attribution Methods

The article outlines a systematic framework for handling abnormal KPI fluctuations in daily dashboards. It first verifies data accuracy and applies period‑over‑period and 3‑sigma rules to detect anomalies, then attributes causes across product, competitor, and market scopes using MECE‑based horizontal, vertical funnel, and cross analyses. Impacts are quantified with control‑variable, slot, marginal‑effect, prior‑judgment, and difference‑in‑differences methods, supporting rapid analyst response and potential automation.

Business Intelligence, KPI monitoring, attribution
0 likes · 7 min read
Taobao Frontend Technology
Aug 20, 2021 · Frontend Development

How to Measure and Improve Front‑End Code Review Quality: Metrics, Insights, and Best Practices

This article examines the evolution of code review, defines key quality metrics such as LOC, inspection time, defect count, and derived rates, analyzes data from CodeCollaborator and Alibaba’s DEF platform, and offers actionable insights to enhance front‑end code review effectiveness.

Code review, Software Engineering, data analysis
0 likes · 15 min read
DataFunTalk
Aug 16, 2021 · Fundamentals

Common Pitfalls in Problem Identification and Analysis Thinking for Data Analysts

The article explains how analysts should responsibly identify problems and choose analysis frameworks, illustrating typical traps such as unclear metric definitions, Simpson's paradox, and false causation through three practical scenarios and offering structured thinking methods to avoid chaotic analysis.

Simpson's paradox, analysis framework, analysis pitfalls
0 likes · 7 min read
DataFunTalk
Aug 12, 2021 · Artificial Intelligence

Causal Inference and Experiment Design in Kuaishou Live Streaming

This article presents Dr. Jin Yaran’s comprehensive overview of causal inference challenges, frameworks, and practical case studies—including DID, double machine learning, causal forests, and meta‑learners—applied to Kuaishou’s live‑streaming product, and discusses complex experimental designs such as bilateral and time‑slice experiments.

A/B testing, Kuaishou, causal inference
0 likes · 15 min read
Xianyu Technology
Aug 10, 2021 · Product Management

Design of Full-Traffic AB Experiments for Seller Growth on Xianyu

The article describes a full‑traffic A/B testing framework for Xianyu that hashes seller IDs into mutually exclusive experiment and control groups, so that each seller sees only one strategy. It reports that a chat incentive for new or churned sellers boosted chat exposure by 22% and modestly improved overall buyer‑seller metrics without harming transaction efficiency.

AB testing, data analysis, experiment design
0 likes · 9 min read
Python Programming Learning Circle
Aug 6, 2021 · Fundamentals

A Comprehensive List of Commonly Used Pandas Functions Categorized by Purpose

This article presents a curated collection of 100 frequently used pandas functions, organized into six categories—statistical aggregation, data cleaning, data selection, plotting and element‑wise operations, time‑series utilities, and miscellaneous helpers—providing concise Chinese explanations for each function’s purpose.

Python, data analysis, data cleaning
0 likes · 10 min read
Python Programming Learning Circle
Aug 6, 2021 · Fundamentals

Essential Pandas Functions for Data Analysis in Python

This article introduces Python's pandas library as a powerful open‑source alternative to MATLAB for data modeling competitions, covering basic, intermediate, and advanced functions—including data I/O, inspection, logical filtering, visualization, aggregation, and integration with tqdm for progress tracking—complete with code examples.

Python, data analysis, pandas
0 likes · 7 min read
DataFunTalk
Jul 26, 2021 · Artificial Intelligence

Essential Skills for Algorithm Engineers in a Highly Competitive Landscape

The article outlines the core abilities algorithm engineers need—strong data analysis, solid coding and engineering practices, problem definition, product mindset, and continuous learning—illustrated with real project cases and practical advice for thriving in today’s increasingly competitive tech environment.

AI, algorithm engineering, career advice
0 likes · 14 min read
Python Crawling & Data Mining
Jul 24, 2021 · Fundamentals

Master Pandas: A Step‑by‑Step Guide to Data Analysis with Python

This comprehensive tutorial introduces Pandas—the powerful Python library for data manipulation and analysis—covers installation, data import, inspection, cleaning, indexing, selection, sorting, grouping, transformation, statistical functions, visualization, and exporting, all illustrated with clear code examples and visual outputs.

Data Science, Jupyter Notebook, Python
0 likes · 18 min read
Alimama Tech
Jul 14, 2021 · Big Data

A/B Testing Framework for Online Experiments: Design, Implementation, Analysis, and Decision Making

The paper presents a comprehensive A/B testing framework for online experiments that guides practitioners through four stages—designing objectives and metrics, implementing random traffic allocation with robustness checks, evaluating effects using descriptive statistics and hypothesis testing, and making rollout decisions based on multidimensional significance and attribution analyses.

A/B testing, data analysis, experimental design
0 likes · 22 min read
MaGe Linux Operations
Jul 7, 2021 · Fundamentals

Why Spyder Is the Ideal Python IDE for Scientists and Data Analysts

Spyder, a powerful native-Python scientific IDE, offers integrated editing, interactive consoles, variable browsing, documentation viewing, and development tools, plus extensibility via plugins and APIs, and can be installed easily through Anaconda or other package managers, making it a versatile choice for engineers, scientists, and data analysts.

Anaconda, Python IDE, Spyder
0 likes · 4 min read
JD Retail Technology
Jun 29, 2021 · Big Data

The Value of Data and Data Products: From Concept to Practice

This article explains how data has become a critical production resource, outlines the limitations of traditional data‑analysis workflows, defines data products and their components, describes their advantages and key characteristics, and shares practical case studies of data‑product implementations in a large e‑commerce environment.

Big Data, Data Product, Data Value
0 likes · 16 min read
21CTO
Jun 21, 2021 · Fundamentals

Discover Spyder: The Powerful Python IDE for Scientific Computing

Spyder is a native‑Python scientific IDE offering advanced editing, debugging, variable exploration, and interactive consoles, with extensible plugins and API support, and can be installed via Anaconda or other package managers, making it ideal for scientists, engineers, and data analysts.

Anaconda, IDE Features, Python IDE
0 likes · 4 min read
Python Crawling & Data Mining
Jun 19, 2021 · Fundamentals

Essential Python Data Analysis Libraries You Must Know

This article provides a concise overview of key Python data‑analysis libraries—including NumPy, pandas, matplotlib, IPython/Jupyter, SciPy, scikit‑learn, and statsmodels—explaining their core features, typical use cases, and how they interoperate to form a powerful scientific computing ecosystem.

Matplotlib, NumPy, Python
0 likes · 12 min read
MaGe Linux Operations
Jun 7, 2021 · Artificial Intelligence

How to Analyze Your WeChat Friends with Python: Gender, Avatar, Signature & Location Insights

This tutorial shows how to use Python libraries such as itchat, jieba, matplotlib, snownlp, and Tencent Youtu to fetch WeChat friend data and produce visual analyses of gender distribution, avatar face detection and tags, signature word clouds and sentiment, and geographic location, all illustrated with charts and code examples.

Face Detection, Sentiment Analysis, WeChat
0 likes · 15 min read
Python Programming Learning Circle
May 28, 2021 · Fundamentals

Top 25 pandas tricks for efficient data analysis in Python

This tutorial presents 25 practical pandas techniques, covering version checking, DataFrame creation, column renaming, row and column reversal, dtype selection, type conversion, memory optimization, reading and concatenating multiple files, handling missing values, string splitting, series expansion, aggregation, pivot tables, categorizing continuous data, DataFrame styling, and profiling, all illustrated with clear code examples.

Tips, data analysis, data manipulation
0 likes · 19 min read
Didi Tech
May 21, 2021 · Fundamentals

Introduction to Causal Inference and Its Application in Ride‑Hailing Business

The article introduces causal inference for ride‑hailing businesses, explaining the difference between causality and correlation, common misconceptions, and how randomized experiments and observational techniques like propensity‑score matching can quantify effects of actions such as coupons, driver assignments, and platform growth decisions.

Ride Hailing, business decision, causal inference
0 likes · 7 min read
Python Crawling & Data Mining
May 10, 2021 · Fundamentals

Master NumPy: Turn Math Formulas into Python Code

This article explains how to use Python's NumPy library to translate common mathematical formulas—such as powers, roots, absolute values, vector and matrix operations—into concise, executable code, covering setup, basic operations, and practical examples for data analysis and machine learning.

NumPy, Python, data analysis
0 likes · 11 min read
NiuNiu MaTe
May 2, 2021 · Fundamentals

How to Master Python Quickly: A Complete Learning Roadmap for 2024

This guide explains why Python is essential, presents a step‑by‑step learning roadmap covering beginner basics, backend web development, web crawling, data analysis, and machine learning, and provides curated resources and project links to help learners progress efficiently.

Backend Development, Web Scraping, data analysis
0 likes · 8 min read
Python Crawling & Data Mining
Apr 24, 2021 · Fundamentals

Discover 140+ Must‑Know Python Libraries for Data Science & AI

The article presents a comprehensive guide to Python's built‑in functions, standard libraries, and third‑party packages across file I/O, web scraping, databases, data cleaning, statistical analysis, machine learning, visualization, and more, rating each with stars and offering a free e‑book collection for readers.

Python, Web Scraping, data analysis
0 likes · 32 min read
MaGe Linux Operations
Apr 17, 2021 · Backend Development

Boost Your Python Projects: 7 Essential Efficiency Tools You Must Try

Discover seven powerful Python tools—including Pandas, Selenium, Flask, Scrapy, Requests, Faker, and Pillow—that streamline data analysis, web testing, API calls, web development, web scraping, mock data generation, and image processing, complete with installation steps and code examples to boost your development efficiency.

Python, data analysis, productivity tools
0 likes · 6 min read
MaGe Linux Operations
Apr 16, 2021 · Fundamentals

Master Data Analysis with Python: From Excel/SQL to Pandas in 10 Steps

This tutorial walks data analysts through transitioning from Excel and SQL to Python, covering environment setup, data import with pandas, web scraping, cleaning, renaming, type conversion, filtering, grouping, merging, and visualization using Jupyter Notebook and popular libraries.

Data visualization, Jupyter Notebook, Python
0 likes · 13 min read
Liangxu Linux
Apr 13, 2021 · Fundamentals

Master awk: From Basics to Advanced Text Processing on Linux

This comprehensive guide introduces awk as a powerful Linux text‑analysis tool, explains its underlying record‑field model, demonstrates practical one‑liners and scripts for reporting, filtering, formatting, file splitting, and shows advanced features like built‑in variables, conditionals, arrays, and string functions.

Linux, Shell scripting, Unix
0 likes · 11 min read
JD.com Experience Design Center
Apr 7, 2021 · Artificial Intelligence

How to Use CHAID Decision Trees in SPSS for Market Segmentation

This article explains why simple single‑feature analysis can miss important user groups, introduces decision trees—especially the CHAID algorithm—as a way to uncover multi‑attribute segments, and provides step‑by‑step instructions for building descriptive and predictive trees in SPSS, including how to interpret tree visuals and benefit tables.

CHAID, SPSS, data analysis
0 likes · 11 min read
DataFunTalk
Apr 4, 2021 · Big Data

User Profiling: Concepts, Practices, and Data‑Driven E‑Commerce Case Study

This article introduces the fundamentals of user profiling, explains tag types and their business value, and demonstrates a data‑driven e‑commerce case study that analyzes gender, age, region, marital status, education, profession, product preferences, purchase timing, and price sensitivity to guide targeted promotion strategies.

Marketing, Python, data analysis
0 likes · 16 min read
Efficient Ops
Mar 30, 2021 · Fundamentals

Mastering Awk: Powerful Text Processing for Linux with Real‑World Examples

This tutorial introduces awk as a versatile Linux text‑analysis tool, explains its execution model (BEGIN, body, END), demonstrates practical commands for reporting, filtering, formatting, and advanced scripting, and provides numerous code snippets and visual examples to help readers quickly apply awk in real‑world scenarios.

Scripting, awk, command-line
0 likes · 12 min read
Python Programming Learning Circle
Mar 29, 2021 · Databases

User Retention, Funnel, and Session Analysis in ClickHouse Using Bitmap and Retention Functions

The article explains how to perform efficient user retention, funnel, and session analysis on large ClickHouse datasets by replacing costly multi‑table joins with bitmap compression, the built‑in retention function, windowFunnel, and high‑order array functions, providing practical SQL examples and performance insights.

Bitmap, ClickHouse, SQL
0 likes · 18 min read
Python Programming Learning Circle
Mar 17, 2021 · Big Data

Eight Python Techniques for Efficient Data Analysis

This article presents eight Python data analysis techniques—including list comprehensions, lambda expressions, map/filter, NumPy arange and linspace, pandas axis handling, and DataFrame concatenation, merging, joining, applying, and pivot tables—to improve code efficiency, readability, and analytical capabilities.

NumPy, Python, data analysis
0 likes · 7 min read
Python Crawling & Data Mining
Mar 3, 2021 · Fundamentals

Uncovering Tianjin’s Bus Network: From Raw GPS Data to Complex Network Insights

This article walks through acquiring Tianjin bus line data via the Gaode Map API, cleaning and converting geographic coordinates, visualizing station distributions with matplotlib and Baidu maps, and then applying complex‑network analysis to reveal degree distributions, clustering coefficients, and small‑world characteristics of the city’s public‑transport system.

Complex Networks, Network Metrics, Python
0 likes · 19 min read
Xianyu Technology
Feb 23, 2021 · Artificial Intelligence

Pricing Guidance System for Xianyu Secondhand Marketplace

The Xianyu pricing guidance system blends new‑product market values with depreciation factors derived from usage, condition and category attributes—extracted via real‑time text mining and image analysis—to recommend dynamic price ranges adjusted for supply‑demand and seller urgency, currently covering 60% of listings with over 65% overall accuracy.

data analysis, e‑commerce, machine learning
0 likes · 6 min read
DevOps
Feb 21, 2021 · Big Data

GitHub 2020 Digital Insight Report: Data‑Driven Analysis of the Global Open‑Source Ecosystem

The GitHub 2020 Digital Insight Report, produced by X‑lab and multiple research institutions, analyzes 860 million event logs, 54.21 million active repositories and 14.54 million developers to reveal growth trends, activity metrics, regional distributions, project influence networks (OpenGalaxy), and monthly‑star highlights, offering actionable insights for open‑source governance and community management.

2020, GitHub, OpenGalaxy
0 likes · 18 min read
DataFunTalk
Feb 5, 2021 · R&D Management

Three Stages of Technical Colleagues and How to Drive Business

The article outlines three developmental stages for engineers—from merely implementing PRD specifications, to understanding business and selecting appropriate technical solutions, and finally proactively contributing business ideas—while describing practical methods for demand exploration, project initiation, management, and data‑driven iteration within a mobile development context.

Mobile Development, Project Management, R&D management
0 likes · 11 min read
Java Captain
Jan 26, 2021 · Big Data

Five Open-Source Stock Trading Tools for Developers

This article introduces five open-source stock trading tools—funds, ZVT, QUANTAXIS, StockAnalysisSystem, and match-trade—detailing their authors, star counts, features, and repository links, offering developers practical resources to build or enhance their own trading applications.

data analysis, finance, open-source
0 likes · 5 min read
Python Crawling & Data Mining
Jan 24, 2021 · Fundamentals

Master Python Data Analysis: From Reading Files to Visualization

This guide walks you through the complete Python data‑analysis workflow—reading and writing data, processing with NumPy and pandas, modeling with statsmodels and scikit‑learn, and visualizing results with Matplotlib—while highlighting the key tools and learning path for beginners and busy professionals alike.

NumPy, Python, data analysis
0 likes · 6 min read
Python Crawling & Data Mining
Dec 27, 2020 · Fundamentals

What Do China’s Divorce Rates Reveal? A Decade of Data Visualization and Analysis

Using publicly available statistics from the National Bureau of Statistics, this article visualizes and analyzes China’s divorce rates over the past decade, comparing national trends, provincial variations, and city-specific patterns such as Beijing versus Shanghai, revealing a steady increase and regional disparities.

Altair, Pyecharts, Python
0 likes · 8 min read
Ctrip Technology
Dec 17, 2020 · Artificial Intelligence

Time Series Forecasting: Tools, Models, and Lessons from Ctrip

This article outlines Ctrip's approach to time series forecasting, covering background, common tools such as factor‑based models, traditional statistical methods like ARIMA, and machine‑learning techniques including tree and neural networks, and shares practical experiences on data splitting, feature engineering, model stability, and evaluation.

ARIMA, Ctrip, Time Series
0 likes · 13 min read
Baobao Algorithm Notes
Dec 11, 2020 · Industry Insights

How to Cultivate Data Sensitivity: The Core Skill Behind Algorithm Engineers

This article explores the concept of data sensitivity for algorithm engineers, defines its meaning, discusses how to measure it, offers practical steps to develop the skill through data analysis and feature engineering, and reveals the hidden pattern in a label‑prediction example that illustrates its importance.

algorithm engineering, data analysis, data sensitivity
0 likes · 6 min read
MaGe Linux Operations
Nov 29, 2020 · Big Data

How Vaex Enables Billion‑Row Data Analysis on a Laptop

This article explains how Vaex, an open‑source DataFrame library, lets data scientists efficiently explore, visualize, and analyze massive datasets—such as the over‑billion‑row NYC taxi records—using memory‑mapping and virtual columns, all on a standard notebook without costly cloud resources.

NYC taxi dataset, Python, Vaex
0 likes · 11 min read
Yuewen Technology
Nov 10, 2020 · Artificial Intelligence

Modeling Web Novel Popularity with Predictive Ranking and Statistical Fusion

This article explains how a binary‑classification model combining estimated future behavior and statistical data is used to compute a unified popularity score for web novels, improving both recall and ranking in search and library scenarios while addressing challenges of cold‑start and long‑tail items.

LambdaMART, Learning-to-Rank, LightGBM
0 likes · 9 min read
58UXD
Oct 19, 2020 · Product Management

Turning Used‑Car Search into a Smart Recommendation Engine

This article analyzes why the used‑car search conversion is low, reconstructs user search scenarios from query data, categorizes search intents, identifies pain points across the search funnel, and proposes product‑level redesigns and recommendation strategies to educate vague users and deliver more precise results.

User experience, data analysis, e‑commerce
0 likes · 9 min read
JD Retail Technology
Oct 16, 2020 · Industry Insights

How JD’s PLUS Membership Used Data and Algorithms to Drive Growth

This article examines JD.com’s transition from traffic‑driven acquisition to a data‑centric, algorithm‑powered membership model, detailing the construction of a robust data foundation, multi‑level analysis methods, productized dashboards, and growth‑hacking experiments that boosted PLUS member retention and revenue.

Growth Hacking, algorithm, augmented analytics
0 likes · 24 min read
DataFunTalk
Oct 10, 2020 · Product Management

Search Product Optimization: From System Architecture to User Demand and Content Strategies

This article outlines a comprehensive approach for search product managers to drive system improvements, covering overall architecture, query understanding, recall and ranking optimization, business and presentation rules, content enrichment, frontend design, and methods for uncovering user needs through data and behavior analysis.

Search, content strategy, data analysis
0 likes · 24 min read
Programmer DD
Sep 10, 2020 · Artificial Intelligence

Can You Predict Speed‑Dating Success? A Data‑Driven Exploration

This article walks through loading the Speed Dating dataset, examining its features and missing values, visualizing match rates by gender and age, performing correlation analysis, and building a logistic regression model with SMOTE oversampling to predict whether a pair will successfully match.

Python, data analysis, imbalanced data
0 likes · 11 min read
Alibaba Cloud Developer
Sep 9, 2020 · Artificial Intelligence

Can You Predict Speed‑Dating Success? A Data‑Driven AI Analysis

This article explores the classic Speed Dating dataset, performing data cleaning, exploratory analysis of match rates, gender and age effects, correlation studies, and finally building a logistic regression model with SVMSMOTE oversampling to predict matchmaking success, achieving around 83% accuracy.

Python, SVMSMOTE, data analysis
0 likes · 11 min read
MaGe Linux Operations
Aug 28, 2020 · Fundamentals

7 Powerful Jupyter Tricks to Supercharge Your Data Analysis

This guide presents seven practical techniques—from using Pandas Profiling and Cufflinks‑Plotly visualizations to mastering IPython magic commands, Jupyter formatting, keyboard shortcuts, multiple outputs, and live slide conversion with RISE—to accelerate and enrich everyday data analysis workflows.

IPython · Python · data analysis
0 likes · 9 min read
Youzan Coder
Aug 21, 2020 · Industry Insights

From PRD Translator to Business Driver: The Three Growth Stages for Engineers

The article outlines three progressive stages for technical staff—from merely following PRD specifications, to understanding business needs and selecting appropriate solutions, and finally to proactively shaping business strategy—while sharing practical methods for demand exploration, project initiation, management, and data‑driven iteration.

Career Development · Project Management · business thinking
0 likes · 13 min read
Liangxu Linux
Aug 19, 2020 · Operations

How to Quickly Analyze Beijing Residency Data with Shell Commands

This tutorial shows how to use standard Unix shell tools such as grep, cut, sort, uniq, awk, and join to extract insights—top companies, most common surnames, popular given names, age distribution, and hometown statistics—from a JSON dataset of over 6,000 Beijing residency applicants.

Big Data · JSON · Shell
0 likes · 13 min read
DevOps Coach
Aug 13, 2020 · Databases

How to Benchmark Elasticsearch Clusters with Rally: A Step‑by‑Step Guide

This article explains why large‑scale Elasticsearch deployments need rigorous performance testing, compares available testing tools, walks through installing and configuring the official Rally benchmark suite, details hardware recommendations, shows how to run tests against multiple cloud providers, and teaches you how to interpret the resulting metrics to make informed cluster‑selection decisions.

Benchmarking · Elasticsearch · Performance Testing
0 likes · 16 min read
MaGe Linux Operations
Aug 12, 2020 · Fundamentals

Why Learn Python? 6 Powerful Application Areas You Should Know

Python’s versatility makes it useful across AI, cloud computing, web development, web scraping, game creation, and data analysis, with strong job demand and well‑paid careers for IT professionals and productivity gains for non‑IT workers, as illustrated by real‑world examples and industry trends.

Python · Web Development · artificial intelligence
0 likes · 5 min read
MaGe Linux Operations
Aug 5, 2020 · Big Data

8 Must‑Know Python Tools for Data Mining and Analysis

This article introduces eight essential Python libraries—Gensim, TensorFlow, SciPy, NumPy, Matplotlib, Pandas, Scikit‑Learn, and Keras—that empower developers to clean, prepare, merge, and accurately analyze data for effective data mining.

Python · data analysis · data mining
0 likes · 4 min read
58UXD
Aug 5, 2020 · Product Management

Mastering Product Management: How to Decompose Problems and Boost Your Value 10x

This talk guides product managers and designers on breaking down complex goals—like achieving a million‑yuan sales target in three months—through structured problem‑decomposition methods such as MECE and SMART, while outlining a personal development framework that cultivates analytical, communication, and leadership skills across career stages.

Career Development · Design Thinking · MECE
0 likes · 7 min read
Alibaba Cloud Developer
Jul 29, 2020 · Fundamentals

What Is Data Analysis? Definitions, Skills, History, and Practical Steps

This comprehensive guide explains what data analysis is, the types of data, its historical evolution, the relationship between data analysis, data science and business intelligence, essential skills, why analysis matters, and a step‑by‑step framework with models and metric design for effective decision‑making.

Analytics · Business Intelligence · Metrics
0 likes · 25 min read
DataFunTalk
Jul 19, 2020 · Product Management

Stranger Social Apps: Business Insights, Data‑Driven Modeling, and Matching Algorithms

This article analyses the unique challenges of stranger‑social platforms such as Tinder and Tantan, exploring business models, user behavior, network effects, gender dynamics, data collection, algorithmic matching, risk control, and system architecture to guide product strategy and optimization.

Recommendation Systems · data analysis · matching algorithms
0 likes · 30 min read
Architects Research Society
Jul 14, 2020 · Frontend Development

Graph Visualization Ecosystem: Overview of Libraries, Toolkits, and Applications

This article, the final part of the GraphTech ecosystem series, provides a comprehensive overview of the front‑end graph visualization layer, detailing its role, benefits, and a curated list of over 70 open‑source and commercial libraries, toolkits, software, and built‑in visualizers for graph data analysis.

Graph Visualization · data analysis · frontend libraries
0 likes · 9 min read
Python Crawling & Data Mining
Jul 2, 2020 · Big Data

How to Identify Top Bilibili Creators Using the IFL Model: A Data‑Driven Guide

This article presents a complete data‑analysis workflow that scrapes Bilibili video metrics from January 2019 to March 2020, cleans and preprocesses 50,130 records, and extends the classic RFM model into an IFL framework that calculates interaction, frequency, and like rates to score and rank creators (uploaders) across multiple categories, with code and datasets provided for replication.

IFL model · Python · RFM
0 likes · 11 min read
Architect
Jun 30, 2020 · Artificial Intelligence

Analyzing TikTok's US Retention Surge: Algorithmic, Operational, and Marketing Factors

The article examines TikTok's dramatic increase in US user retention by dissecting supply‑side content growth, operational localization, marketing exposure, algorithmic matching, and external influences, and then proposes data‑driven and algorithmic interventions to sustain and amplify the platform's growth.

TikTok · User Retention · content moderation
0 likes · 17 min read
Full-Stack Internet Architecture
Jun 17, 2020 · Fundamentals

Turning User Tags into Actionable Business Insights: A VIP Fruit‑Plate Service Case Study

This article explains how to avoid idle user tags by integrating them into business processes, using a VIP fruit‑plate service example to illustrate tag basics, two promotion strategies, key performance metrics, and deeper implementation steps that turn profiling data into concrete operational value.

Business Intelligence · VIP service · data analysis
0 likes · 8 min read
DataFunTalk
Jun 15, 2020 · Artificial Intelligence

Understanding and Handling Bad Cases in E-commerce Recommendation Systems

The article explores why bad cases occur in e‑commerce recommendation and search pipelines, classifies their types, demonstrates data‑driven analysis methods, and proposes practical online and offline strategies—including rule‑based fixes, model improvements, and iterative feedback loops—to continuously improve user experience and business metrics.

badcase · data analysis · e‑commerce
0 likes · 23 min read
Fulu Network R&D Team
Jun 11, 2020 · Artificial Intelligence

Intelligent Inventory Management: Comparing Prophet and LSTM for Time‑Series Forecasting

This article presents an intelligent inventory management solution that predicts product consumption using two time‑series algorithms—Facebook's Prophet and LSTM deep learning—detailing data sources, preprocessing, model configuration, evaluation metrics, and a comparative analysis of their performance and suitability.

LSTM · Prophet · Time Series
0 likes · 16 min read
Python Programming Learning Circle
Jun 4, 2020 · Fundamentals

Superstore Sales Data Analysis: From Data Preprocessing to RFM Modeling

This article presents a comprehensive analysis of a global supermarket's four‑year sales dataset, covering data collection, preprocessing, exploratory visualizations, sales, quantity, profit, market segmentation, product performance, customer segmentation, RFM modeling, and actionable recommendations to improve revenue and customer retention.

RFM model · Retail analytics · Sales Forecasting
0 likes · 27 min read
Python Crawling & Data Mining
May 19, 2020 · Fundamentals

Master Pandas: From Import to Data Cleaning in One Comprehensive Guide

This tutorial walks through essential pandas operations—including importing modules, building a sample shopping dataset, reading and writing CSV files, inspecting data structures, and performing thorough data cleaning such as handling missing values, trimming spaces, case conversion, replacements, deletions, duplicate removal, type casting, and column renaming—complete with code snippets and visual results.

Python · Tutorial · data analysis
0 likes · 10 min read
Architect's Tech Stack
May 15, 2020 · Databases

May 2020 Programmer Salary Report and Oracle High‑Performance System Architecture Book Recommendation

The article presents May 2020 programmer salary statistics across China, analyzes data anomalies and corrective rules, highlights regional wage differences, and recommends the book “Oracle High‑Performance System Architecture Practice” for deep understanding of Oracle database performance and optimization techniques.

Oracle · book recommendation · data analysis
0 likes · 3 min read
Architects Research Society
Apr 27, 2020 · Fundamentals

What Is an IoT Platform? A Simple Non‑Technical Overview

This article explains what an IoT platform is, how it fits into a complete IoT system, the functions it provides such as connectivity, protocol handling, security, data collection and analysis, and offers guidance on when businesses should adopt one despite cost trade‑offs.

Hardware · IoT · IoT Platform
0 likes · 6 min read