Tagged articles

data mining

128 articles · Page 1 of 2

Oct 29, 2025 · Backend Development

Leveraging Industry Sectors for Smarter Value Investing with Python Scraping & FastAPI

This guide explains how to use industry sector classification to support value investing by scraping sector and stock data, parsing JSON responses, filtering by keywords, and exposing the results via FastAPI services, then displaying them in a Uniapp front‑end table for easy stock selection.

FastAPI · Python · Web Scraping

0 likes · 10 min read

DataFunSummit

Sep 14, 2025 · Artificial Intelligence

How AI is Revolutionizing Chemistry and Drug Discovery: From Data to Breakthroughs

This article explores how AI-driven models and data pipelines are transforming the chemistry and pharmaceutical sectors by accelerating drug design, improving protein‑antibody predictions, automating patent data extraction, and outlining future goals for end‑to‑end AI‑enabled scientific discovery.

AI for science · Chemistry AI · Large Language Models

0 likes · 13 min read

Kuaishou Tech

Jul 29, 2025 · Artificial Intelligence

How Kuaishou’s 8 Groundbreaking Papers Are Shaping AI at KDD 2025

Eight Kuaishou research papers covering recommendation systems, multi‑task learning, multimodal large models, large language models, and combinatorial optimization have been accepted to the premier AI data‑mining conference KDD 2025, highlighting the company’s cutting‑edge innovations and their potential impact on the field.

AI · Multimodal Learning · Recommendation Systems

0 likes · 18 min read

Python Crawling & Data Mining

Jun 4, 2025 · Big Data

How to Master Python Web Scraping with Pandas: From HTML to CSV in Minutes

This article walks through using Pandas to directly read HTML pages, extract table data, handle AJAX‑loaded JSON and CSV formats, and save results, providing concise code examples and visual steps for effective Python web scraping and data mining.

Pandas · Python · Web Scraping

0 likes · 4 min read

AI Code to Success

Mar 12, 2025 · Artificial Intelligence

Mastering K‑Means: Theory, Implementation, and Real‑World Applications

This comprehensive guide explores the K‑Means clustering algorithm, covering its mathematical foundation, step‑by‑step procedure, centroid initialization strategies, practical implementation with Python’s Scikit‑learn on the Iris dataset, evaluation metrics, optimization techniques, and diverse applications ranging from image segmentation to bioinformatics.

Clustering · K-Means · Machine Learning

0 likes · 31 min read

Python Programming Learning Circle

Jan 2, 2025 · Artificial Intelligence

A Comprehensive Guide to Dimensionality Reduction Algorithms with Python Implementations

This article introduces eleven classic dimensionality reduction techniques—including PCA, LDA, MDS, LLE, and t‑SNE—explains their principles, advantages, and limitations, and provides complete Python code examples and resources for each method, making it a valuable guide for beginners in machine learning and data mining.

PCA · data mining · dimensionality reduction

0 likes · 17 min read

JavaEdge

Oct 7, 2024 · Big Data

Master Data Analysis: From Collection to Visualization

This guide explains why data analysis is essential, breaks it into three core stages—data collection, data mining, and data visualization—offers practical tool recommendations, and presents principles for efficient learning and skill development.

Big Data · Data Visualization · Python

0 likes · 10 min read

Software Development Quality

Oct 7, 2024 · Fundamentals

8 Essential Data Analysis Techniques Every Analyst Should Master

This article introduces eight core data analysis methods—including association, comparative, clustering, cross, Pareto, quadrant, funnel, and full‑path analysis—explaining their principles, typical use cases, key metrics, and visual examples to help professionals make data‑driven decisions.

data mining · statistical methods

0 likes · 11 min read

AntTech

Sep 5, 2024 · Artificial Intelligence

Ant InTech Technology Award Announces First Ten Young Scholars and Their Research Areas

On September 5 at the 2024 Inclusion·Bund Conference, Ant InTech announced its first ten award-winning young scholars from top Chinese universities, highlighting their research in artificial intelligence, data processing, cloud computing, security, and related fields, each receiving a 200,000‑RMB grant.

Ant Group · Artificial Intelligence · InTech Award

0 likes · 4 min read

AntTech

Aug 28, 2024 · Artificial Intelligence

Ant Group’s Selected Papers at KDD2024: Abstracts and Highlights

The article presents a curated collection of Ant Group's research papers accepted at KDD2024, summarizing each paper's title, type, link, source, relevant fields, and abstract, covering topics such as graph mining, large language models, fraud detection, recommendation systems, and multimodal medical AI.

AI research · Ant Group · KDD2024

0 likes · 31 min read

Meituan Technology Team

Aug 8, 2024 · Artificial Intelligence

BlackPearl Team Wins All Three Tracks of KDD 2024 OAG‑Challenge Cup with Large‑Model Solutions

The BlackPearl team from Meituan’s Dazhong Dianping division swept all three KDD 2024 OAG‑Challenge Cup tracks (WhoIsWho, PST, and AQA) by deploying innovative large‑model techniques such as iterative text clustering, graft‑learning‑enhanced BERT RAG pipelines, and a Boosting LLM‑for‑Vector search, and has released the code publicly on GitHub.

Academic Disambiguation · KDD Cup · Large Language Model

0 likes · 4 min read

Architect

Jul 19, 2024 · Artificial Intelligence

Can Machine Learning Beat the Odds? A Deep Dive into Football Match Prediction

This article presents a data‑driven football match prediction system that extracts match features, builds machine‑learning models—including linear, SVM, random forest, and deep neural networks—and evaluates their accuracy on European league data, then analyzes betting strategies, limitations, and extensions to stock forecasting.

Artificial Intelligence · Machine Learning · data mining

0 likes · 24 min read

Tencent Cloud Developer

Jul 4, 2024 · Artificial Intelligence

Football Match Outcome Prediction and Betting Strategy Using Machine Learning

The study combines team statistics and bookmaker odds with machine‑learning models (including Poisson, regression, Bayesian, SVM, Random Forest, DNN, and LSTM) to predict football match outcomes and identify confidence‑based betting intervals that yield profit; it also suggests extensions to broader data, features, and financial trading.

Machine Learning · Random Forest · data mining

0 likes · 23 min read

Python Programming Learning Circle

Jun 21, 2024 · Artificial Intelligence

Using scikit-learn for Data Mining: Feature Engineering, Parallel Processing, Pipelines, and Model Persistence

This article demonstrates how to perform data mining with scikit-learn by detailing the full workflow—from data acquisition and feature engineering, through parallel and pipeline processing, to automated hyper‑parameter tuning and model persistence—using the Iris dataset as an example.

Scikit-learn · data mining · feature engineering

0 likes · 13 min read

DataFunSummit

Jun 2, 2024 · Artificial Intelligence

Construction and Application of a User Profile Tag System: Methods, Platforms, and Use Cases

This article presents a comprehensive overview of building a user profile tag system—including tag taxonomy, platform architecture, construction methods, update cycles, access patterns, common algorithmic tags, and real‑world applications such as marketing, metric attribution, and A/B testing—illustrated with examples and a detailed Q&A session from a data‑mining senior manager at Qunar.

AB testing · Machine Learning · causal inference

0 likes · 21 min read

Model Perspective

May 13, 2024 · Fundamentals

How to Identify and Quantify Core Variables for Better Decision‑Making

The article explains why pinpointing core variables is crucial, outlines domain‑knowledge and technical methods such as sensitivity analysis and data mining to discover them, and describes practical ways to turn those variables into quantitative indicators such as scoring systems and composite indices, illustrated with real‑world examples.

core variables · data mining · decision making

0 likes · 10 min read

Python Crawling & Data Mining

Apr 19, 2024 · Fundamentals

How to Clean and Transform Data in Python: A Step‑by‑Step Guide

This article walks through solving a Pandas data‑analysis problem raised in a Python community, showcasing multiple Python techniques—including string manipulation, zero‑width spaces, and slicing—to clean and transform data, complete with code snippets and result screenshots.

Pandas · Python · Web Scraping

0 likes · 3 min read

DataFunTalk

Mar 6, 2024 · Artificial Intelligence

Construction and Practical Application of a User Profile Tagging System

This article details the design, integration, and operational practices of a comprehensive user and item profiling tag system, covering tag taxonomy, construction methods, update cycles, access strategies, algorithmic implementations, and real‑world applications such as marketing, attribution analysis, and A/B testing.

AB testing · Machine Learning · Tagging System

0 likes · 20 min read

Test Development Learning Exchange

Jan 26, 2024 · Artificial Intelligence

Data Mining Techniques for Marketing: Customer Segmentation, Purchase Prediction, Recommendation, and More with Python

This article introduces ten data‑mining applications for marketing—including customer segmentation, purchase forecasting, market‑basket analysis, churn prediction, sentiment analysis, response modeling, recommendation systems, brand reputation, competitive analysis, and public‑opinion monitoring—each illustrated with concise Python code examples.

Customer Segmentation · Machine Learning · Marketing Analytics

0 likes · 11 min read

Test Development Learning Exchange

Jan 7, 2024 · Big Data

Association Rule Mining Applications Across Various Business Scenarios with Python Code

This article demonstrates how to apply the Apriori algorithm and association rule mining using Python's mlxtend library across ten real‑world business scenarios, providing step‑by‑step code examples for retail, e‑commerce, marketing, healthcare, security, CRM, social networks, travel, market basket, and online advertising.

Apriori · Python · association rule mining

0 likes · 9 min read

DataFunSummit

Dec 20, 2023 · Artificial Intelligence

Building and Applying an Image Tagging System: Architecture, Tag Design, Algorithms, and Business Use Cases

This presentation by senior data mining manager Zhou Yuanwei of Qunar outlines the architecture of an image tagging platform, the construction of a comprehensive tagging system, common algorithmic tags, and real-world applications such as look‑alike marketing, A/B test efficiency analysis, and business attribution, helping audiences understand tag types, design considerations, and value‑driven use cases.

AB testing · Business Analytics · data mining

0 likes · 2 min read

Meituan Technology Team

Aug 10, 2023 · Artificial Intelligence

Selected Meituan Technical Papers from KDD 2023: Summaries of Seven Research Works

The article showcases seven Meituan research papers accepted at KDD 2023—spanning feed‑stream, cross‑domain, takeaway, bonus allocation, contour‑based segmentation, living‑needs prediction, and multilingual recommendation—detailing their novel methods, real‑world deployments, and concluding with an invitation for academic collaboration.

Artificial Intelligence · KDD 2023 · Machine Learning

0 likes · 17 min read

DataFunSummit

Aug 9, 2023 · Artificial Intelligence

Applying Graph Intelligence to Anti-Money Laundering: Business Background, Process, and Case Study

This article presents Fabarta's application of graph‑based artificial intelligence to anti‑money‑laundering, detailing the business challenges, a five‑step analytical workflow, a synthetic data case study, and a Q&A that explores practical deployment and future directions.

Case Study · anti-money laundering · data mining

0 likes · 18 min read

DataFunTalk

Jul 24, 2023 · Artificial Intelligence

Session Analytics: User Path Analysis, Data Processing, and Algorithm Mining

This article introduces user path analysis and the SessionAnalytics open‑source framework, covering business scenarios, technical architecture, data integration, session segmentation, data cleaning, sampling, graph structures, NLP‑based mining, clustering, and visualization techniques for extracting insights from large‑scale user behavior data.

NLP · data mining · session analytics

0 likes · 19 min read

Python Crawling & Data Mining

May 8, 2023 · Artificial Intelligence

How to Choose the Right Features for Python Machine Learning Projects

This article explains Python machine‑learning basics, covering data splitting, feature and label concepts, key factors for feature selection, and practical tips for building predictive models, while also offering code snippets and visual illustrations to help readers apply these techniques effectively.

AI · data mining · feature selection

0 likes · 6 min read

Python Crawling & Data Mining

Mar 2, 2023 · Backend Development

Solve Encrypted Python Web Scraping Issues Using Selenium – Step‑by‑Step Guide

This article walks through a Python web‑scraping challenge involving encrypted parameters, demonstrates a Selenium‑based solution with full code, shares community tips that fix the request payload, and shows how the issue was resolved step by step.

Automation · Selenium · Web Scraping

0 likes · 4 min read

Big Data Technology & Architecture

Nov 28, 2022 · Big Data

Comprehensive Guide to Big Data Interview Topics: Log Collection, Data Synchronization, Offline Development, Real‑time Technology, Data Services, and Data Mining

This article provides an extensive overview of big‑data interview subjects, covering browser and mobile log collection methods, data synchronization techniques (batch, real‑time, sharding), offline data development platforms, streaming architectures, data service evolution, performance optimization, and data‑mining layers and applications.

Big Data · Data synchronization · Streaming

0 likes · 17 min read

Baidu Intelligent Testing

Oct 19, 2022 · Artificial Intelligence

Intelligent Test Evaluation: Risk Dimension Mining, Admission Assessment, Multi‑Dimensional Activity Data Mining, and Model‑Based Risk Evaluation

This article presents an end‑to‑end intelligent testing framework that mines development‑stage risk dimensions, conducts admission risk assessment, extracts multi‑dimensional activity data such as coverage metrics, and applies model‑based risk evaluation to guide quality‑assurance decisions and improve release safety.

Artificial Intelligence · data mining · modeling

0 likes · 11 min read

Baidu Geek Talk

Oct 18, 2022 · Artificial Intelligence

Intelligent Test Evaluation and Risk Assessment in Software Quality Assurance

The article describes an intelligent test‑evaluation framework that gathers performance data, quantifies project, personnel, and code risk dimensions, feeds them into rule‑based and logistic‑regression models to produce risk scores and risk‑driven testing plans, and demonstrates how this approach identified thousands of high‑risk projects, prevented hundreds of bugs, and saved thousands of person‑days.

data mining · risk assessment · software testing

0 likes · 9 min read

MaGe Linux Operations

Sep 8, 2022 · Artificial Intelligence

Master 10 Popular Clustering Algorithms in Python with Scikit‑Learn

This tutorial introduces unsupervised clustering, explains its purpose, and walks through installing scikit‑learn and implementing ten popular clustering algorithms—including AffinityPropagation, Agglomerative, BIRCH, DBSCAN, K‑Means, Mini‑Batch K‑Means, MeanShift, OPTICS, Spectral Clustering, and Gaussian Mixture—complete with code examples and visualizations.

Clustering · Machine Learning · Scikit-learn

0 likes · 27 min read

Model Perspective

Sep 4, 2022 · Fundamentals

Grey Relational Analysis: A Powerful Tool for Comprehensive Evaluation

The article explains the principles of grey system theory, introduces grey relational analysis as a method for handling sparse information, outlines its mathematical foundations, step‑by‑step modeling process, and demonstrates how the grey comprehensive evaluation method can rank and compare multiple alternatives without requiring large sample sizes or strict statistical assumptions.

Comprehensive Evaluation · data mining · grey-system

0 likes · 14 min read

Python Crawling & Data Mining

Sep 1, 2022 · Big Data

How to Scrape and Process Chinese Stock Flow Data with Python

This guide walks you through using Python to locate the API endpoint for Eastmoney sector capital flow, send HTTP requests, clean the returned string into proper JSON, convert it to a Pandas DataFrame, and finally save the data locally for further analysis.

data mining · financial data

0 likes · 7 min read

MaGe Linux Operations

Jul 29, 2022 · Artificial Intelligence

Master 10 Popular Clustering Algorithms in Python with Scikit‑Learn

This tutorial introduces clustering, explains why no single algorithm fits all data, and provides step‑by‑step Python examples using scikit‑learn for ten popular unsupervised learning methods, complete with code snippets and visualizations to illustrate results.

Clustering · Machine Learning · Python

0 likes · 24 min read

Model Perspective

Jun 17, 2022 · Artificial Intelligence

What Is Classification in Data Mining? Types, Models, and Key Applications

The article explains classification as a data‑analysis task that builds models to assign new observations to predefined categories, outlines its implementation steps, describes various data types (boolean, nominal, ordinal, continuous, discrete), presents common machine‑learning classifiers such as decision trees and neural networks, and highlights practical applications like crime detection, disease risk prediction, and credit assessment.

Machine Learning · classification · data mining

0 likes · 5 min read

Python Programming Learning Circle

Apr 14, 2022 · Artificial Intelligence

Top Clustering Algorithms in Python with scikit-learn: A Comprehensive Tutorial

This tutorial explains clustering as an unsupervised learning task, outlines why no single algorithm fits all data, and provides step‑by‑step Python code using scikit‑learn to install the library, generate synthetic datasets, and apply ten popular clustering algorithms with visualizations.

Clustering · Machine Learning · Python

0 likes · 21 min read

Python Crawling & Data Mining

Mar 31, 2022 · Fundamentals

How to Scrape Shenzhen Bus Data with Python: Step‑by‑Step Guide

This tutorial walks you through using Python and BeautifulSoup to crawl the 8684.cn website, extract city‑specific bus route information, parse classification pages, and collect detailed schedule and station data, while providing reusable code snippets and data‑cleaning suggestions.

Bus Data · beautifulsoup · data mining

0 likes · 8 min read

DataFunTalk

Mar 18, 2022 · Artificial Intelligence

Alternative Data Mining: From 19th‑Century Cholera Mapping to Modern AI‑Driven Risk Modeling

This talk reviews the concept of alternative data, illustrates its early use in John Snow's cholera map, explores contemporary AI‑powered systems such as IBM's Debater and satellite‑based poverty estimation, and presents the speaker's own research on using unconventional data for financial‑market risk detection and prediction.

Artificial Intelligence · Satellite Imagery · alternative data

0 likes · 14 min read

NetEase LeiHuo UX Big Data Technology

Feb 26, 2022 · Fundamentals

Applying DMAIC and the Five‑Layer UX Model to Data Product Design

The article explains how the DMAIC framework from Six Sigma and the five‑layer user‑experience model can be combined to guide the definition, measurement, analysis, improvement, and control of data products, especially in gaming contexts, emphasizing systematic design, visualization, and iterative refinement.

Data Product Design · Six Sigma · data mining

0 likes · 8 min read

Python Crawling & Data Mining

Jan 9, 2022 · Big Data

Unlocking Hidden Insights: A Beginner’s Guide to Data Mining Processes

This article explains why data mining matters, defines the discipline, outlines its five‑step workflow, and dives into core techniques such as association‑rule mining, classification, clustering, and regression, illustrated with practical examples and visual diagrams.

Big Data · Clustering · Regression

0 likes · 10 min read

Meituan Technology Team

Jan 6, 2022 · Artificial Intelligence

Multi-domain Modeling and AutoML Techniques from Kaggle/KDD Cup Championships

Drawing on seven Kaggle and KDD Cup victories, the article outlines a multi‑domain modeling optimization strategy—covering recommendation, time‑series, and AutoML problems—alongside a three‑module AutoML pipeline and a three‑stage workflow that emphasize systematic evaluation, bias‑variance balance, and robust model‑fusion for competition and industry success.

AutoML · KDD Cup · Kaggle

0 likes · 37 min read

YunZhu Net Technology Team

Dec 17, 2021 · Artificial Intelligence

Understanding Recommendation Systems for B2B Construction E‑Commerce

This article explains why recommendation systems are essential for B2B construction e‑commerce, describes the types of data they rely on, outlines multi‑channel recall methods, details collaborative‑filtering algorithms with similarity calculations, and presents the four‑stage recommendation pipeline from recall to re‑ranking.

Artificial Intelligence · B2B e-commerce · collaborative filtering

0 likes · 11 min read

Python Crawling & Data Mining

Nov 7, 2021 · Big Data

Mastering Data Mining: A Deep Dive into CRISP‑DM and SEMMA Methodologies

This article explains the two most common data‑mining frameworks—CRISP‑DM and SEMMA—detailing their six and five stages respectively, illustrating each phase with diagrams and highlighting how the iterative nature of data mining drives continuous improvement.

Analytics · Big Data · CRISP-DM

0 likes · 8 min read

DataFunSummit

Sep 29, 2021 · Artificial Intelligence

Construction and Application of Retail Product Knowledge Graph at Meituan

This article describes how Meituan builds a multi‑level, multi‑dimensional retail product knowledge graph to support new‑retail scenarios, detailing its architecture, data acquisition challenges, labeling pipelines, attribute extraction methods, efficiency improvements, human‑machine collaboration, and downstream search and recommendation applications.

AI · Meituan · Retail

0 likes · 25 min read

IT Architects Alliance

Sep 25, 2021 · Big Data

Top 10 Classic Data Mining Algorithms and Their Core Characteristics

This article introduces the ten classic data‑mining algorithms selected by IEEE ICDM—C4.5, k‑Means, SVM, Apriori, EM, PageRank, AdaBoost, k‑NN, Naive Bayes, and CART—explaining their main ideas, advantages, and typical applications for readers seeking a solid foundation in data analysis.

Clustering · Machine Learning · algorithms

0 likes · 8 min read

Meituan Technology Team

Sep 2, 2021 · Artificial Intelligence

Construction and Application of Retail Product Knowledge Graph at Meituan

The paper describes Meituan’s retail product knowledge graph—a multi‑layered, multi‑modal system that structures billions of SKUs, attributes, and user insights using hierarchical categories, graph‑enhanced NER, semi‑supervised learning, and expert‑in‑the‑loop validation, enabling precise search, ranking, recommendation, and real‑time merchant optimization.

AI · Retail · data mining

0 likes · 25 min read

Python Crawling & Data Mining

Aug 1, 2021 · Artificial Intelligence

How to Unlock Restaurant Success with Data Mining: A Step‑by‑Step Guide

This article explains the complete data‑mining workflow for the restaurant industry—from defining business goals and sampling relevant data to exploring, preprocessing, modeling, evaluating results, and selecting suitable tools—enabling intelligent dish recommendation, customer segmentation, sales forecasting, and optimal store placement.

Business Intelligence · data mining · restaurant industry

0 likes · 13 min read

DataFunTalk

Jul 31, 2021 · Artificial Intelligence

Construction and Application of Retail Product Knowledge Graph at Meituan

This article details Meituan's development of a multi‑level, multi‑dimensional retail product knowledge graph, covering its background in new retail, hierarchical design, attribute modeling, challenges, efficiency improvements, human‑machine collaboration, and its impact on search, recommendation and both C‑ and B‑side services.

Artificial Intelligence · Meituan · Retail

0 likes · 25 min read

Python Crawling & Data Mining

Jun 14, 2021 · Big Data

Why Stanford’s Data Mining Tutorial Is the Ultimate Guide to Large‑Scale Data Mining

This article introduces the third edition of Stanford’s Data Mining Tutorial, highlighting its panoramic roadmap of data‑mining techniques for massive datasets, core features, comprehensive topic coverage, target audience, and supplementary resources while noting its popularity among students and professionals.

Distributed Computing · Machine Learning · Stanford

0 likes · 11 min read

iQIYI Technical Product Team

May 21, 2021 · Big Data

Design and Implementation of iQIYI's User Feedback Analysis System

iQIYI built an in‑house user‑feedback analysis system that automatically ingests multi‑channel data, classifies and clusters issues, assesses feedback quality, localizes problems, and streamlines repair closure, boosting recall accuracy, alarm precision, closure rates and reducing cycle time across business lines to enhance user experience.

AI · Big Data · Clustering

0 likes · 15 min read

Architects Research Society

Feb 2, 2021 · Big Data

Understanding the Difference Between Data Mining and Data Analysis

This article explains the distinct concepts, applications, and techniques of data mining and data analysis, highlighting their definitions, typical tools, and how each contributes to extracting insights from large datasets in various industries.

Comparison · data analysis · data mining

0 likes · 8 min read

Amap Tech

Jan 8, 2021 · Industry Insights

How AI‑Driven Data Mining Revives POI Freshness: A Deep Dive into Expired POI Detection

This article examines the technical evolution of POI expiration detection, covering attribute‑based, behavior‑based, and human‑place relationship mining methods, their machine‑learning models, and how they collectively improve map freshness and user experience at scale.

AI · Big Data · Machine Learning

0 likes · 17 min read

Python Crawling & Data Mining

Dec 31, 2020 · Backend Development

How to Scrape Thousands of New‑House Listings in Python: A Step‑by‑Step Guide

This article demonstrates how to use Python's requests, fake_useragent, and lxml libraries to batch‑scrape nearly a thousand new‑house listings from the 惠民之家 website, extracting 41 fields such as name, price, layout, opening date, plot ratio and green ratio, while handling pagination and anti‑scraping measures.

CSV · Python · Real Estate Data

0 likes · 9 min read

Python Crawling & Data Mining

Dec 14, 2020 · Backend Development

Create a Python iQIYI Movie Scraper with GUI – Full Step‑by‑Step Guide

Learn how to build a Python web scraper that extracts iQIYI movie titles, actors, and scores, parses the data with regex or BeautifulSoup, displays results in a Tkinter GUI with a combobox, and saves the information to a file—all explained with code snippets and screenshots.

GUI · Python · Tkinter

0 likes · 7 min read

Python Crawling & Data Mining

Dec 5, 2020 · Big Data

How to Build a Python Web Scraper for Zhihu Answers and Generate Word Clouds

This article walks through the complete process of designing a Python web scraper to collect Zhihu answer data, parse author, ID and excerpt fields, store the results in CSV files, and finally visualize the text with a word‑cloud, including all necessary code snippets and explanations.

CSV · Python · Web Scraping

0 likes · 17 min read

Zhengtong Technical Team

Oct 27, 2020 · Mobile Development

Implementing Mobile Data Collection and Analytics with Countly: Architecture, Customization, and Insights

This article outlines how to design and implement a comprehensive mobile data collection and analysis system using the open‑source Countly platform, covering background requirements, solution selection, architecture, customizations for client, server and dashboard, SDK integration for Android and H5, and practical data mining insights.

Android SDK · Countly · H5 SDK

0 likes · 11 min read

Meituan Technology Team

Sep 24, 2020 · Artificial Intelligence

Multimodal Recall Solution for KDD Cup 2020: ImageBERT and LXMERT Based Approach

The second‑place team tackled KDD Cup 2020’s Multimodal Recall challenge by fine‑tuning ImageBERT and LXMERT on query‑image pairs, generating negatives, applying AMSoftmax and multi‑similarity losses, ensembling weighted predictions, and using score‑based post‑processing, boosting NDCG@5 to 0.8352 and powering Meituan’s multimodal search pipeline.

ImageBERT · KDD Cup 2020 · LXMERT

0 likes · 23 min read

Xianyu Technology

Sep 10, 2020 · Artificial Intelligence

Interest Tagging System for Xianyu: Data‑Driven User Profiling

The Xianyu interest‑tagging system profiles post‑95 users by matching expert and hot‑search keywords to product text, weighting user actions with a TF‑IDF‑based behavior‑statistics pipeline, producing over twenty tags that cover more than half the target cohort and have already doubled click‑through rates for interest‑aligned live streams.

TF-IDF · behavior analytics · data mining

0 likes · 11 min read

Python Crawling & Data Mining

Aug 10, 2020 · Artificial Intelligence

Build Smart Product Recommendations with Python’s Apriori Algorithm

This article explains how intelligent recommendation differs from generic marketing, introduces association‑rule concepts such as support, confidence, and lift, and provides a step‑by‑step Python implementation using the Apriori algorithm to generate and interpret market‑basket recommendations.

Apriori algorithm · Market Basket Analysis · Python

0 likes · 13 min read

Python Crawling & Data Mining

Aug 8, 2020 · Big Data

How Python Data Mining Uncovers Why '30 Only' Became a Summer Hit

This article uses Python to scrape and analyze Douban ratings, user comments, and Tencent video danmu for the TV drama “30 Only”, revealing the show’s explosive popularity, the most discussed characters, and audience sentiment through statistical charts and word‑cloud visualizations.

Big Data · Python · TV Drama Analysis

0 likes · 11 min read

MaGe Linux Operations

Aug 5, 2020 · Big Data

8 Must‑Know Python Tools for Data Mining and Analysis

This article introduces eight essential Python libraries—Gensim, TensorFlow, SciPy, NumPy, Matplotlib, Pandas, Scikit‑Learn, and Keras—that empower developers to clean, prepare, merge, and accurately analyze data for effective data mining.

Python · data analysis · data mining

0 likes · 4 min read

Python Crawling & Data Mining

Jul 20, 2020 · Fundamentals

How to Build a Python Stock Temperature Analyzer with Tushare

This tutorial shows how to fetch ten years of stock fundamentals using Tushare, compute a probabilistic "temperature" metric with NumPy and SciPy, and visualize the temperature alongside closing prices in Matplotlib, providing a complete end‑to‑end Python data‑analysis workflow.

Matplotlib · Python · Tushare

0 likes · 13 min read

Architects Research Society

Jul 10, 2020 · Artificial Intelligence

Core Concepts and Relationships in Data Science: Big Data, Machine Learning, Data Mining, Deep Learning, and AI

This article examines six core data‑science concepts—Big Data, Machine Learning, Data Mining, Deep Learning, Artificial Intelligence, and Data Science itself—explaining their definitions, interrelationships, and how they fit together as pieces of a larger analytical puzzle.

Artificial Intelligence · data mining · data science

0 likes · 17 min read

Python Crawling & Data Mining

Jul 9, 2020 · Big Data

How to Build a Python Web Scraper for Job Listings and Bypass Anti‑Scraping Measures

This tutorial explains how to crawl 58.com job listings with Python, extract location, company, and salary information, handle anti‑scraping defenses using realistic headers and random User‑Agents, and save the results into a text file.

Python · Web Scraping · anti-scraping

0 likes · 7 min read

Cloud Native Technology Community

Jun 5, 2020 · Artificial Intelligence

Automating a Data‑Science Workflow on Kubernetes: From GitHub Issue Mining to an MLP Bug Classifier

This article describes how to collect, clean, and analyze 90,000+ GitHub issues and pull requests from the Kubernetes repository using Kubeflow, TensorFlow, and a fully automated CI/CD pipeline, then build, train, and serve a simple MLP model that classifies release‑note texts as bugs or non‑bugs.

CI/CD · Kubeflow · Kubernetes

0 likes · 19 min read

TAL Education Technology

May 21, 2020 · Artificial Intelligence

Seven TAL Education AI Research Papers Accepted at Top International Conferences in 2020

TAL Education's AI Engineering Institute recently had seven of its machine‑learning papers selected for prestigious conferences such as AIED 2020, EDM 2020, ICASSP 2020 and WWW 2020, showcasing advances in speech recognition, data mining, multimodal learning and educational AI applications.

AI · Academic Conferences · data mining

0 likes · 10 min read

Python Crawling & Data Mining

May 9, 2020 · Artificial Intelligence

What Do Netizens Really Think of Bilibili’s “Post‑Wave” Video? Sentiment & Word‑Cloud Analysis

The article details how a data analyst scraped comments from Zhihu, Weibo, and Bilibili about the 2020 Youth Day video “后浪”, applied Baidu AI sentiment analysis and Python word‑cloud generation, and compared the differing emotional tones across platforms.

AI · Bilibili · Python

0 likes · 7 min read

Python Crawling & Data Mining

Mar 6, 2020 · Big Data

Bypass Anti‑Scraping Limits with Free Proxy IPs in Python

This tutorial explains how to obtain free proxy IPs, extract their addresses using Python's requests and BeautifulSoup, and continuously validate them to overcome anti‑scraping restrictions when crawling sites such as Baidu Baike for data mining tasks.

Python · Web Scraping · data mining

0 likes · 5 min read

37 Interactive Technology Team

Feb 20, 2020 · Artificial Intelligence

Risk Control System for Detecting Game Account Fraud Using Feature Engineering and Graph Database

The article describes a risk‑control pipeline for detecting high‑volume fraudulent game accounts, detailing data collection from game logs, extensive feature engineering and statistical tests, enrichment via a Neo4j knowledge graph, and a hybrid RandomForest‑GBDT model combined with methods to filter personal accounts.

Neo4j · data mining · feature engineering

0 likes · 8 min read

Python Programming Learning Circle

Feb 19, 2020 · Backend Development

How to Earn Extra Income with Python: Freelance Crawling, Web Development, Data Services, and More

This article outlines practical ways for individuals, especially students, to generate side income using Python by taking on web‑scraping freelance projects, building data‑driven websites, creating simple automation tools, running blogs or media channels, and even modest stock‑analysis scripts.

Python · data mining · freelance

0 likes · 7 min read

Xianyu Technology

Nov 7, 2019 · Big Data

Sequence Pattern Mining for User Behavior Analysis in Xianyu

By applying sequence pattern mining and unsupervised clustering to Xianyu’s massive event logs, the study abstracts high‑level user behaviors, discovers frequent subsequences, uncovers unknown fraudulent account patterns, expands known fraud cohorts with 99 % precision, and enables richer analyses such as PCA‑based cross‑group comparisons.

Big Data · Clustering · data mining

0 likes · 8 min read

DataFunTalk

Aug 22, 2019 · Artificial Intelligence

End‑to‑End Group Risk Perception Modeling: From Requirement Mining to Deployment

This article presents a comprehensive workflow for group risk perception, covering business requirement mining, data acquisition and understanding, feature engineering, model training and evaluation, deployment, and practical user applications, with detailed objectives, methods, and deliverables for each stage.

Machine Learning · Model Deployment · data mining

0 likes · 11 min read

Big Data Technology & Architecture

Jun 16, 2019 · Big Data

Understanding Data Warehouse Terminology: DB, DW, ODS, OLTP, OLAP, BI, and Data Mining

This article explains core data‑warehouse concepts—including DB, DW, ODS, OLTP, OLAP, BI, and the differing meanings of DM—as well as their relationships, integration examples, and why OLAP cannot replace data mining, providing a concise reference for beginners in data analytics.

BI · Big Data · Data Warehouse

0 likes · 9 min read

Mafengwo Technology

Jun 6, 2019 · Big Data

How MaFengWo Quantifies User Content Contribution with a Three‑Factor Model

This article explains how MaFengWo builds a three‑dimensional user content contribution model—combining activity, popularity, and sharing willingness—to objectively score UGC creators, improve recommendation strategies, and drive more effective travel‑related services.

RFM model · UGC analysis · User Modeling

0 likes · 12 min read

JD Retail Technology

Apr 12, 2019 · R&D Management

Balancing Business Demands and Technical Advancement: Insights from JD’s Data Knowledge Leader Li Wei

In an interview, JD data platform leader Li Wei discusses how a dynamic balance between business demands and technical improvement, together with knowledge computing and AI-driven product quality control, can drive innovation, enhance user experience, and shape future R&D management strategies.

Artificial Intelligence · Product Development · R&D Management

0 likes · 8 min read

JD Tech Talk

Mar 22, 2019 · Artificial Intelligence

Data Mining Techniques for Telemarketing: Supervised Classification, Clustering, Optimization, Anomaly Detection, and Text Mining

The article examines how telemarketing, a data‑intensive industry, leverages various data‑mining methods—including supervised classification, clustering, operations research optimization, anomaly detection, and text mining—to improve lead selection, agent allocation, churn prediction, and voice analysis, while also outlining the key data‑talent roles needed for successful implementation.

Anomaly Detection · Clustering · Optimization

0 likes · 7 min read

Python Crawling & Data Mining

Mar 17, 2019 · Artificial Intelligence

How Association Rules and Machine Learning Reveal Stock Market Industry Linkages

This report analyzes 2018 AMAC industry index data using association‑rule mining and several machine‑learning models (Apriori, KNN, Bayesian, decision tree, neural network) to uncover sector linkages, predict index and stock movements, compare model performance, and suggest future improvements.

R language · association rules · data mining

0 likes · 11 min read

MaGe Linux Operations

Mar 6, 2019 · Artificial Intelligence

How to Install and Fix WordCloud in Python for Chinese Text Visualization

This tutorial walks you through installing the WordCloud library, resolving the Microsoft Visual C++ compiler error, fixing Chinese font issues, and creating both basic and image‑masked word clouds with sample Python code.

Chinese encoding · Python · data mining

0 likes · 4 min read

MaGe Linux Operations

Mar 1, 2019 · Artificial Intelligence

Master Python Data Mining & Machine Learning: From Preprocessing to Classification

This comprehensive guide introduces data mining and machine learning concepts, walks through Python data preprocessing techniques, reviews common classification algorithms, demonstrates an Iris flower classification case, and offers practical tips for selecting the most suitable algorithm for a given problem.

Classification Algorithms · Data preprocessing · Machine Learning

0 likes · 21 min read

Python Crawling & Data Mining

Jan 25, 2019 · Backend Development

Master Web Crawlers: How Python Scrapes the Web Efficiently

As online information explodes, traditional data collection methods fall short. This article introduces Python web crawlers, which start from URLs and rely on libraries such as urllib, urllib2, and re, and explains how crawler frameworks improve efficiency to enable fast, accurate, automated extraction of web data for analysis.

data extraction · data mining · web scraper

0 likes · 5 min read

Meituan Technology Team

Dec 13, 2018 · Artificial Intelligence

Advances in Machine Learning for Real‑Time Delivery at Meituan

Meituan’s AI‑driven “Superbrain” platform combines real‑time big‑data processing, fine‑grained location perception, high‑precision ETA forecasting, multi‑rider dispatch and dynamic pricing to cut instant food‑delivery times from about an hour to roughly thirty minutes while boosting efficiency, cost savings and user experience.

AI · ETA prediction · Machine Learning

0 likes · 19 min read

Big Data and Microservices

Sep 17, 2018 · Big Data

5 Essential Data Mining Techniques Every Analyst Should Know

This article outlines five widely used data‑mining methods—association rules, classification/tagging, clustering, decision trees, and sequential pattern mining—explaining their principles, real‑world examples, and how they help organizations extract actionable insights from massive datasets.

Big Data · Clustering · Decision Trees

0 likes · 6 min read

Big Data and Microservices

Sep 3, 2018 · Big Data

From Raw Data to Business Impact: A Complete Data Analyst Skill Guide

The article outlines a comprehensive data‑analyst competency framework, covering data collection, storage, extraction, mining, analysis, visualization, and practical application, and provides concrete questions, techniques, and tool recommendations to help analysts turn raw data into actionable business insights.

Business Intelligence · Data Visualization · data analysis

0 likes · 9 min read

Big Data and Microservices

Aug 16, 2018 · Big Data

Mastering Big Data Analysis: 5 Core Aspects and 4 Key Methods

This article outlines the five fundamental aspects of big data analysis—visualization, data‑mining algorithms, predictive analytics, semantic engines, and data quality management—and explains four primary analytical approaches: descriptive, diagnostic, predictive, and prescriptive analysis.

Big Data · data analysis · data mining

0 likes · 6 min read

Model Perspective

Jun 17, 2018 · Big Data

How Tablet Usage Data Can Transform Education: Insights and Strategies

By leveraging tablet-based learning platforms, schools can collect rich usage data that, when mined, reveals student habits, sentiment, and learning patterns. These insights help educators personalize instruction, improve curricula, and guide strategic decisions; the article also highlights the need for data protection and dedicated research centers.

big data in education · data mining · educational technology

0 likes · 5 min read

AntTech

Jun 14, 2018 · Artificial Intelligence

A Local Online Learning Approach for Non-linear Data (SCW-LOL)

This paper introduces the SCW-LOL algorithm, a local online learning method based on Soft Confidence Weighted that extends a global model with multiple local classifiers, uses online K‑Means for sample assignment, provides theoretical error bounds, and demonstrates superior performance on ten benchmark datasets, especially for multi‑class classification.

Machine Learning · SCW algorithm · data mining

0 likes · 9 min read

Xianyu Technology

May 16, 2018 · Artificial Intelligence

Geographic Alias Mining and Knowledge Base Construction Using Contextual Vectors and Address Similarity

The paper presents two inexpensive techniques for extracting geographic aliases of points of interest—comparing high‑dimensional contextual vectors of nearby shipping addresses and analyzing co‑occurring words in identical addresses—to construct a knowledge base that links official names with their synonyms, improving location‑based service accuracy.

Cosine Similarity · Geographic Alias · Knowledge Base

0 likes · 9 min read

Python Crawling & Data Mining

May 9, 2018 · Backend Development

How to Scrape WeChat Moments with Python and Scrapy: Step‑by‑Step Guide

Learn how to export WeChat Moments using a third‑party service, set up a Scrapy project in Python, analyze the dynamic JSON responses, and write a crawler to extract timeline data, complete with screenshots and command‑line instructions for a fully functional scraper.

Python · WeChat · Web Scraping

0 likes · 6 min read

AntTech

Apr 9, 2018 · Artificial Intelligence

Practical Guide to Modeling Stability: Feature PSI, Model PSI, and Monitoring Techniques

This article explains the importance of modeling stability, describes how to assess feature and model stability using the Population Stability Index (PSI), provides step‑by‑step calculation methods, and shares practical monitoring practices such as rank mapping and daily SQL‑based checks.

Machine Learning · Model Monitoring · PSI

0 likes · 9 min read

MaGe Linux Operations

Apr 8, 2018 · Artificial Intelligence

Master Python Data Mining & Machine Learning: From Preprocessing to Classification

This comprehensive tutorial walks you through Python data mining and machine learning fundamentals, covering data preprocessing techniques, common classification algorithms, an Iris flower classification case study, and practical tips for selecting the right algorithm, all illustrated with clear code examples and visualizations.

Classification Algorithms · Data preprocessing · Machine Learning

0 likes · 22 min read

JD Tech

Jan 26, 2018 · Artificial Intelligence

JD Big Data R&D Department Presents Three Accepted Papers at AAAI-2018

The JD Big Data R&D team announced that three of its research papers—covering cross‑domain human parsing, multi‑view outlier detection, and orthogonal weight normalization for deep neural networks—were accepted at the prestigious AAAI‑2018 conference, highlighting the department's contributions to computer vision, data mining, and deep learning.

Artificial Intelligence · Cross‑domain Adaptation · computer vision

0 likes · 8 min read

MaGe Linux Operations

Jan 5, 2018 · Backend Development

How to Build a High‑Speed Sina Weibo Scrapy Spider that Crawls 13 Million Posts Daily

This article explains how to create a Python‑based Scrapy spider that logs into Sina Weibo using cookies, crawls user profiles, posts, followers and followees from the WAP site at speeds exceeding 13 million records per day, and stores the data in MongoDB.

MongoDB · Python · Scrapy

0 likes · 6 min read

Hulu Beijing

Dec 1, 2017 · Artificial Intelligence

How to Evaluate Unsupervised Clustering Algorithms: Metrics, Scenarios, and Insights

This article explains how to assess unsupervised clustering algorithms by describing realistic user‑watching scenarios, outlining common cluster and algorithm types, presenting five key evaluation criteria, and introducing practical metrics such as RMSSTD, R‑Square, and the improved Hubert‑Gamma statistic.

clustering evaluation · data mining · metrics

0 likes · 10 min read

Baixing.com Technical Team

Nov 30, 2017 · Artificial Intelligence

How User Profiling Powers Modern Recommendation Systems

This article explains what user profiling is, why it’s crucial for recommendation systems, outlines key dimensions such as personal attributes, status, and interests, describes algorithms like classification and autoregressive models, and details offline and real‑time computation methods, evaluation techniques, and practical examples.

Machine Learning · Recommendation Systems · algorithm

0 likes · 11 min read

StarRing Big Data Open Lab

Sep 30, 2017 · Artificial Intelligence

How to Mine Association Rules with discoverR: Apriori & FP‑Growth in R

This guide explains the fundamentals of association‑rule mining, introduces support, confidence and lift metrics, and demonstrates step‑by‑step how to use the discoverR R package with Apriori and FP‑Growth algorithms to extract and visualize recommendation rules from the classic groceries dataset.

Apriori · FP-Growth · R

0 likes · 10 min read

21CTO

Sep 27, 2017 · Artificial Intelligence

How Tagging and User Profiling Power Modern Recommendation Systems

This article explores how simple tagging and user profiling underpin modern recommendation systems, contrasting tag‑based, flexible representations with traditional hierarchical classifications, and examines practical applications such as personalized advertising, industry research, and product optimization.

Recommendation Systems · Tagging · data mining

0 likes · 13 min read

Tongcheng Travel Technology Center

Aug 29, 2017 · Big Data

How to Become a Data Mining Engineer: A Year‑Long Journey and Practical Guide

This article recounts a year-long journey to become a data mining engineer, explaining the role’s value, required skills, tools such as Excel, Tableau, SQL, Python, Scala, Spark, and machine‑learning techniques, and offers practical steps for aspiring professionals.

Big Data · Machine Learning · Python

0 likes · 11 min read

MaGe Linux Operations

Jul 30, 2017 · Artificial Intelligence

Why Python Dominates Data Mining: Clear Syntax, Rich Libraries, and Speed Trade‑offs

Python is favored for data‑mining algorithms because its clear syntax, built‑in advanced data structures, easy text handling, extensive libraries, and widespread community support outweigh its slower execution speed compared to Java or C, allowing rapid development and seamless integration with high‑performance code when needed.

Algorithm Development · Python · data mining

0 likes · 5 min read

21CTO

Jul 18, 2017 · Artificial Intelligence

What’s the Real Salary Value of Your Coding Skills? Insights from 1M Job Posts

By mining over a million computer‑related job postings with weakly supervised learning and a BiLSTM NER model, this article quantifies how programming languages, development tools, and hardware skills translate into salary value, offering data‑driven guidance for developers and new graduates.

AI · data mining · deep learning tools

0 likes · 8 min read

Alibaba Cloud Developer

Jul 10, 2017 · Artificial Intelligence

How Alibaba’s Experts Tackle Large‑Scale Matching and Deep Learning Challenges

The first Alibaba data‑mining forum in Hangzhou gathered top academics and industry leaders who discussed large‑scale online precise matching, the role of distributed versus single‑machine learning, and the benefits and limitations of deep learning in modern AI applications.

AI · Online Matching · data mining

0 likes · 11 min read

Tencent Advertising Technology

Jun 15, 2017 · Artificial Intelligence

Tencent Social Ads Data Mining Expert Q&A: Feature Engineering, Modeling, and Competition Insights

In a Q&A session, a Tencent social ads data mining expert addressed competition participants' questions on data delays, full‑set versus sliding‑window statistics, dataset authenticity, Bayesian smoothing, feature selection, handling missing values, large‑scale training, feature interactions, model stacking, online mini‑batch training, and provided reference resources.

Vowpal Wabbit · competition · data mining

0 likes · 11 min read