Tagged articles
1881 articles
StarRing Big Data Open Lab
Nov 4, 2016 · Artificial Intelligence

How Item Features Power Music Recommendations: A Hands‑On Guide

This article explains how recommendation systems can use item‑level features instead of user ratings, illustrating the approach with Pandora's music‑gene project, detailing feature selection, scoring, distance calculations, standardization, and classification techniques across music, athlete, Iris, and automobile datasets.

Recommendation Systems · classification · distance metrics
20 min read
StarRing Big Data Open Lab
Oct 20, 2016 · Artificial Intelligence

How Collaborative Filtering Powers Recommendations: From Manhattan to Cosine Similarity

This article walks through the fundamentals of recommendation systems, explaining collaborative filtering and various similarity measures—including Manhattan, Euclidean, Minkowski, Pearson correlation, and cosine similarity—while discussing their suitability for dense, sparse, or biased rating data and introducing K‑Nearest Neighbors for practical implementation.

Recommendation Systems · collaborative filtering · data mining
15 min read
Qunar Tech Salon
Oct 17, 2016 · Information Security

Design and Implementation of a Cloud‑Based Web Application Firewall at Ctrip

This article describes Ctrip's challenges with web security, evaluates hardware and commercial cloud WAF shortcomings, and presents a low‑cost, low‑risk cloud‑based WAF solution that leverages DNS redirection, closed‑loop rule management, Lua/Tengine deployment, supervised machine‑learning log analysis, and big‑data streaming for real‑time threat detection and mitigation.

Big Data · WAF · Web Security
9 min read
Alibaba Cloud Developer
Oct 8, 2016 · Artificial Intelligence

Unlocking Machine Learning Basics: From Perceptrons to Ensemble Models

An introductory guide for machine‑learning beginners that covers essential algorithms—including perceptrons, logistic regression, decision trees, LDA, and ensemble techniques like bagging and boosting—explains feature design, model training, evaluation, and practical tips for avoiding under‑ and over‑fitting.

Decision Trees · Unsupervised Learning · ensemble methods
8 min read
Alibaba Cloud Developer
Sep 26, 2016 · Artificial Intelligence

Can Machine Learning Predict China’s Car License Lottery? Secrets in 13‑Digit IDs

This article investigates whether the 13‑digit user IDs used in Chinese car‑license lotteries are truly random, revealing how the ID generation, seed‑based selection, and hidden patterns—especially the influential seventh digit—affect outcomes, and demonstrates that simple linear models can achieve an AUC of around 0.8 in predicting winners, while also discussing the system’s opacity across major cities.

ID generation · car license lottery · data analysis
17 min read
ITPUB
Sep 21, 2016 · Artificial Intelligence

Deep Learning Platforms Unveiled: From DistBelief to TensorFlow and Real‑World Uses

The article reviews the evolution and challenges of deep learning, outlines major commercial platforms such as DistBelief, COTS, and Adam, compares open‑source frameworks like MXNet, TensorFlow and Petuum, and highlights their architectures, performance metrics, and diverse applications ranging from image recognition to recommendation systems.

AI · Deep Learning · MXNet
11 min read
Ctrip Technology
Sep 19, 2016 · Artificial Intelligence

Personalized Demand Prediction and Ranking for Qunar's "Guess You Like" Feature

This article describes Qunar's personalized demand prediction system for the "Guess You Like" card, detailing how user‑demand associations are mined via rule engines, collaborative filtering, LBS and manual rules, and how ranking models evolve from subjective Bayes to RankBoost and LambdaMART, with experimental evaluation and future LSTM plans.

AI · Travel · machine learning
10 min read
360 Zhihui Cloud Developer
Sep 18, 2016 · Artificial Intelligence

How Linear Regression Can Tame Your Nighttime Alert Fatigue

This article explores how historical monitoring alerts can be analyzed and predicted using linear regression, guiding operations engineers to preprocess data, build regression models, and forecast future alert trends to reduce manual alarm handling and improve system stability.

Operations · alert prediction · linear regression
8 min read
Alibaba Cloud Developer
Sep 13, 2016 · Artificial Intelligence

How Game Theory and AI Stop Fake Reviews on E‑Commerce Platforms

This article explains how Alibaba combines big‑data analytics, machine learning, and mechanism‑design game theory to create a recommendation system that removes incentives for merchants to generate fake orders, improving fairness and user experience on e‑commerce platforms.

Game Theory · Recommendation Systems · anti-fraud
3 min read
Ctrip Technology
Sep 10, 2016 · Artificial Intelligence

Deep Learning Anti‑Scam Guide: An Informal Introduction to Neural Networks, Training, and Practical Applications

This article provides a light‑hearted yet thorough overview of deep learning, covering neural network fundamentals, layer construction, back‑propagation, ResNet shortcuts, encoder‑decoder structures, PU‑learning for unlabeled data, GPU acceleration, and practical advice on data size, frameworks, and deployment in financial scenarios.

Backpropagation · Big Data · GPU
27 min read
Qunar Tech Salon
Aug 21, 2016 · Artificial Intelligence

Hotel Search Ranking: Problem Definition, Model Construction, Feature Engineering, and Offline Evaluation

This article presents a comprehensive overview of hotel search ranking, covering problem definition, the distinction between ranking and probability estimation, handling position bias, detailed feature engineering, the AnyBoost linear boosting model, offline evaluation methods, and observed online performance improvements.

Learning-to-Rank · feature engineering · hotel ranking
7 min read
Qunar Tech Salon
Aug 20, 2016 · Artificial Intelligence

Personalized Demand Prediction and Ranking for Qunar App’s “You May Like” Card

This article describes how Qunar replaced a low‑click hot‑words card with a personalized “You May Like” recommendation card, detailing data collection, rule‑based and collaborative‑filtering association methods, learning‑to‑rank models (subjective Bayes, RankBoost, LambdaMART), training‑sample strategies, online experiments, evaluation metrics, and future plans including LSTM‑based sequence modeling.

Qunar · collaborative filtering · machine learning
14 min read
Qunar Tech Salon
Aug 19, 2016 · Artificial Intelligence

Deep Learning Anti‑Scam Guide: A Non‑Technical Overview of Neural Networks, Training, and Practical Tips

This article provides a humorous yet informative, non‑mathematical guide to deep learning, covering neural network basics, layer addition, training methods, back‑propagation, unsupervised pre‑training, regularization, ResNet shortcuts, GPU computation, framework choices, and practical advice for applying deep learning to industrial data.

AI · Deep Learning · GPU
26 min read
Qunar Tech Salon
Aug 18, 2016 · Artificial Intelligence

Automatic Ticket Classification Using SVM and word2vec at Qunar

At Qunar, the data center algorithm team developed an automatic ticket classification system that combines Support Vector Machine with word2vec embeddings to handle high‑dimensional, low‑sample text data, achieving 89% accuracy and 80% recall while outlining the full machine‑learning pipeline from feature extraction to deployment.

Qunar · Word2Vec · machine learning
7 min read
Architecture Digest
Aug 15, 2016 · Big Data

Understanding Data: Types, Systems, and Big Data Technologies

This article explains what data is, classifies it into structured, semi‑structured and unstructured forms, describes data mining, databases, data warehouses, the full data lifecycle, and surveys the big‑data ecosystem including storage, batch and real‑time processing, resource scheduling, and visualization technologies.

Lambda architecture · data engineering · data mining
22 min read
Aotu Lab
Aug 10, 2016 · Artificial Intelligence

Can AI Teach Computers to Design Fonts? A Journey into Automated Typography

The article explores the author's experiments combining artificial intelligence with typography, detailing the development of algorithms that measure font attributes, compute similarity scores, and generate rule‑based design systems, while reflecting on the challenges, inspirations, and future possibilities of AI‑driven font selection and design.

AI · algorithm · design systems
21 min read
Baidu Intelligent Testing
Jul 13, 2016 · Artificial Intelligence

Detecting Offline Merchant Service Issues Using Machine Learning and Big Data at Nuomi

The article describes how Nuomi analyzes refund and complaint data with machine‑learning and big‑data techniques, extracts features for single‑ and multi‑store scenarios, builds decision‑tree models with regional adjustments, and creates an online workflow to promptly intervene on merchants that fail to serve customers.

Big Data · customer experience · decision tree
5 min read
Qunar Tech Salon
Jul 4, 2016 · Information Security

Xiaomi Risk Control Practices: Architecture, Rule Engine, and Machine Learning

Xiaomi senior R&D engineer Deng Wenjun shares the evolution of Xiaomi's internet‑finance risk‑control system, describing early rule‑based limits, the adoption of Drools for fast rule deployment, data‑driven modeling with random‑forest classifiers, and ongoing challenges in scalability, latency, and privacy.

Drools · Random Forest · financial technology
16 min read
High Availability Architecture
Jun 24, 2016 · Information Security

Xiaomi's Internet Finance Risk Control Practices: Architecture, Rules Engine, and Machine Learning

The article details Xiaomi's evolution of internet‑finance risk control—from early limit and frequency rules that cut bad‑debt by a third, through adopting the Drools rules engine for rapid deployment and gray‑release, to leveraging random‑forest machine‑learning models and extensive user profiling that reduced fraud by roughly 40%, while addressing privacy and operational challenges.

Drools · Random Forest · Security
15 min read
21CTO
Jun 16, 2016 · Big Data

Building a Simple Open‑Source Self‑Service BI Platform with Flask & React

This article introduces dataplay2, an open‑source self‑service BI platform built with Flask, pandas, scikit‑learn on the backend and React, ECharts, D3, and other JavaScript libraries on the frontend, detailing its architecture, installation steps, core features such as data upload, visualization, classification, clustering, and future improvement ideas.

BI · data analysis · machine learning
11 min read
21CTO
Jun 11, 2016 · Artificial Intelligence

Designing System & Personalized Recommendations Using Mahout

This article outlines the design of both system-wide and personalized recommendation modules for e‑commerce platforms, explains three recommendation approaches (demographic, content‑based, collaborative filtering), details the implementation of Mahout’s collaborative‑filtering algorithm with Java code, discusses data sources, technology stack, algorithm choices, and solutions to the cold‑start problem.

Mahout · collaborative filtering · e‑commerce
14 min read
Architecture Digest
May 11, 2016 · Artificial Intelligence

Interest Feeds: From Facebook NewsFeed and EdgeRank to Pinterest Smart Feed and General Techniques

This article explains why interest‑driven feeds are essential, reviews Facebook's NewsFeed evolution and EdgeRank algorithm, details Pinterest's Smart Feed architecture and Pinnability model, and provides a comprehensive guide to building, ranking, and monitoring generic interest‑feed systems for social platforms.

Facebook · Pinterest · Social network
34 min read
Meituan Technology Team
Apr 29, 2016 · Big Data

Introduction to Spark in Big Data

Apache Spark, a versatile big‑data platform supporting batch processing, SQL queries, real‑time streaming, and machine‑learning workloads, dramatically accelerates data‑intensive jobs, as demonstrated by Meituan‑Dianping, where its high‑performance engine reduces execution times and enhances scalability across diverse analytical and operational pipelines.

Batch Processing · Big Data · Spark
1 min read
Architecture Digest
Apr 22, 2016 · Artificial Intelligence

An Introductory Overview of Recommendation Systems and Their Core Algorithms

This article introduces the basic concepts, purposes, and a range of algorithms—including popularity‑based, collaborative filtering, content‑based, model‑based, and hybrid methods—used in recommendation systems, and discusses evaluation metrics and improvement strategies for practical deployment.

AI · Recommendation Systems · collaborative filtering
15 min read
Big Data and Microservices
Apr 19, 2016 · Industry Insights

Designing a Scalable Real‑Time Stock Prediction Architecture with Open‑Source Tools

This article outlines a reference architecture for a low‑latency, horizontally scalable real‑time stock prediction system built with open‑source components such as Spring Cloud Data Flow, Apache Geode, Spark MLlib, and Hadoop, and discusses data flow steps, simplified deployment, and algorithm choices for market forecasting.

Big Data · Real-Time · Stock Prediction
7 min read
Java High-Performance Architecture
Apr 18, 2016 · Big Data

Why Spark Is Outpacing Hadoop: Speed, Real‑Time Processing, and ML Advantages

The article explains how Spark has become the leading open‑source big‑data platform, highlighting its superior speed, in‑memory processing, real‑time streaming, and built‑in machine‑learning library compared with Hadoop’s slower, disk‑based MapReduce approach and reliance on external storage and ML tools.

Big Data · Hadoop · Real-time Processing
5 min read
21CTO
Apr 14, 2016 · Big Data

How Meituan’s Data Architecture Powers Precise Mobile Marketing

This article details Meituan Dianping's data‑driven approach to precise marketing, describing the O2O marketing framework, a layered pyramid data system, profiling techniques, budget monitoring, and two real‑world case studies that together illustrate how big‑data technologies boost marketing efficiency on mobile platforms.

Big Data · Data Architecture · machine learning
12 min read
Architecture Digest
Apr 14, 2016 · Big Data

Data‑Driven Precise Marketing: Architecture and Case Studies from Meituan Dianping

This article presents Meituan Dianping's data‑driven precise marketing architecture, detailing a layered pyramid system, user profiling, budget monitoring, and two real‑world cases—potential user mining and a smart coupon engine—demonstrating how big‑data techniques improve marketing efficiency and ROI.

Data Architecture · Meituan · machine learning
12 min read
Qunar Tech Salon
Mar 29, 2016 · Fundamentals

Overview of Ten Classic Algorithms: Sorting, Searching, Graph Traversal, and Machine Learning

This article presents concise explanations and step‑by‑step procedures for ten classic algorithms—including quick sort, heap sort, merge sort, binary search, BFPRT selection, depth‑first and breadth‑first graph traversals, Dijkstra’s shortest‑path method, dynamic programming principles, and the Naive Bayes classifier—highlighting their complexities and core ideas.

Searching · Sorting · algorithm fundamentals
11 min read
Architecture Digest
Mar 29, 2016 · Artificial Intelligence

Practical Guide to Machine Learning: Problem Modeling, Data Preparation, Feature Engineering, Model Training and Optimization

This article presents a comprehensive, practical guide to applying machine learning in industry, covering problem modeling, data preparation, feature extraction, model training, and optimization, illustrated with a DEAL transaction amount forecasting case study.

Model Training · data preparation · feature engineering
17 min read
21CTO
Mar 18, 2016 · Artificial Intelligence

10 Essential Tips for Building High‑Performance Intelligent Recommendation Systems

This article outlines ten practical key points—including leveraging explicit and implicit feedback, hybridizing algorithms, handling temporal and geographic factors, exploiting social ties, solving cold‑start issues, optimizing presentation, defining clear metrics, ensuring real‑time updates, and scaling big‑data processing—to help engineers design effective intelligent recommendation systems.

cold start · data mining · evaluation
18 min read
Meitu Technology
Mar 11, 2016 · Artificial Intelligence

Meipai Personalized Recommendation Technology Journey

As Meipai’s user base exploded, the platform shifted from manual curation to sophisticated personalized recommendation algorithms—leveraging machine‑learning and data‑mining techniques, iterating through multiple generations, overcoming scalability and relevance challenges, and delivering rapid solutions that inspire future recommendation system designs.

Meipai · Recommendation Algorithm · algorithm evolution
1 min read
Architect
Mar 10, 2016 · Big Data

Analysis and Practice of a Real-Time Hadoop Data Security Solution

The article presents a detailed technical overview of Apache Eagle's real-time Hadoop data security architecture, covering distributed data collection, stream processing, metadata‑driven policy enforcement, machine‑learning‑based anomaly detection, and integration with Hadoop ecosystem components such as HBase, Kafka, and Storm.

Apache Eagle · Big Data · Hadoop
25 min read
ITPUB
Mar 1, 2016 · Artificial Intelligence

10 Essential Machine Learning Algorithms with Python and R Cheat Sheets

This article warns against abandoning machine learning near the finish line and offers a concise cheat‑sheet of the ten most commonly used algorithms, complete with ready‑to‑run Python and R code examples to help practitioners accelerate model development.

AI · R · machine learning
3 min read
21CTO
Feb 29, 2016 · Fundamentals

Master 10 Essential Algorithms: From QuickSort to Naive Bayes

This article presents concise explanations, step‑by‑step procedures, and visual illustrations for ten core algorithms—including QuickSort, HeapSort, MergeSort, Binary Search, BFPRT, DFS, BFS, Dijkstra, Dynamic Programming, and Naive Bayes—highlighting their principles, complexities, and typical use cases.

Search Algorithms · Sorting Algorithms · dynamic programming
15 min read
21CTO
Feb 27, 2016 · Artificial Intelligence

How User‑Based Collaborative Filtering Powers Modern Recommendation Systems

This article explains the fundamentals of recommendation algorithms, focusing on user‑based collaborative filtering, similarity metrics, neighbor selection, scoring methods, practical implementation with the MovieLens dataset, and common challenges such as popularity bias and dirty data.

collaborative filtering · machine learning · movie recommendation
12 min read
21CTO
Feb 17, 2016 · Big Data

How Big Data Powers Personalized Recommendations in Mother‑Baby E‑Commerce

This article explains the unique characteristics of mother‑baby e‑commerce, describes a comprehensive big‑data platform architecture—including data collection, offline and real‑time computing, and recommendation algorithms—and shows how user profiling and personalized ranking dramatically improve conversion and user experience.

e‑commerce · machine learning · personalization
11 min read
21CTO
Feb 12, 2016 · Artificial Intelligence

Can Machine Learning Reveal the True Author of Red Mansions' Final 40 Chapters?

This article uses machine learning to compare lexical patterns between the first 80 and last 40 chapters of 'Dream of the Red Chamber', demonstrating distinct stylistic differences that support the scholarly view that the final chapters were not authored by Cao Xueqin.

Red Mansions · Support Vector Machine · feature engineering
6 min read
Qunar Tech Salon
Feb 6, 2016 · Big Data

An Introduction to Data Mining Algorithms and Their Real-World Applications

This article introduces the main types of data‑mining algorithms—classification, prediction, clustering, and association—explains supervised and unsupervised learning, and illustrates each with practical examples such as spam detection, tumor cell identification, wine quality assessment, fraud detection, recommendation systems, and more.

association analysis · classification · clustering
15 min read
Qunar Tech Salon
Jan 13, 2016 · Artificial Intelligence

Ranking Learning in Mobile Taobao: Challenges, Solutions, and Improvements

This article presents a comprehensive overview of ranking learning techniques used in Mobile Taobao's recommendation system, covering problem definition, pointwise/pairwise/listwise approaches, feature engineering, online learning, industry applications, and future optimization strategies.

CTR prediction · LambdaMART · listwise
8 min read
21CTO
Jan 11, 2016 · Artificial Intelligence

How WeChat Serves Tailored Ads: Inside the Recommendation Algorithm

This article explains the content‑based recommendation technique behind WeChat Moments ads, illustrates how user behavior is matched to ad attributes, and offers practical tips for influencing the system to display high‑value ads such as BMW.

WeChat advertising · content-based filtering · machine learning
5 min read
Huawei Cloud Developer Alliance
Jan 8, 2016 · Artificial Intelligence

Can Open‑Source AI Predict the Stock Market? Inside a Real‑Time Forecasting Architecture

The article examines the suspension of China's stock‑market circuit‑breaker, then explores whether open‑source frameworks and machine‑learning algorithms can realistically forecast stock prices by leveraging massive historical data, real‑time streams, and sentiment analysis from social media and news sources.

Stock Prediction · financial time series · machine learning
9 min read
21CTO
Jan 6, 2016 · Artificial Intelligence

From Naïve Algorithms to Scalable Recommendations: Jiayuan’s Journey

This article chronicles the evolution of Jiayuan’s dating recommendation system from early item‑based kNN experiments through a feature‑engineering focused engineering year and a product‑oriented optimization phase, while also reviewing several advanced machine‑learning techniques the author explored.

Recommendation Systems · feature engineering · logistic regression
15 min read
Efficient Ops
Jan 5, 2016 · Information Security

How Apache Eagle Secures Hadoop: Real‑Time Big Data Threat Detection

Apache Eagle is an open‑source, distributed, real‑time security monitoring platform for Hadoop that combines stream‑processing, scalable policy enforcement, and machine‑learning user profiling to protect massive data assets across eBay’s production clusters.

Apache Eagle · Big Data · Hadoop
19 min read
21CTO
Jan 3, 2016 · Artificial Intelligence

How to Build a Real-Time Stock Prediction System with Open-Source AI and Big Data Tools

An open-source reference architecture for real-time stock prediction is presented, detailing a scalable, low-latency pipeline that captures live market data, stores it in memory, trains and applies machine learning models using Spring Cloud Data Flow, Apache Geode, Spark MLlib, and related big‑data components.

Big Data · Spark MLlib · Spring Cloud Data Flow
8 min read
Architect
Dec 31, 2015 · Big Data

Using Spark for Machine Learning, New Word Discovery, and Intelligent Q&A

The article explains how to leverage Apache Spark for machine‑learning tasks, large‑scale new‑word discovery, and simple intelligent question‑answering by using Spark‑Shell, Scala code, and word2vec‑based similarity, while sharing practical tips and performance considerations.

Big Data · Intelligent QA · New Word Discovery
15 min read
Qunar Tech Salon
Dec 29, 2015 · Artificial Intelligence

Technical Debt in Machine Learning Systems

The paper examines how machine‑learning systems inherit unique forms of technical debt—such as boundary erosion, entanglement, hidden feedback loops, and data‑dependency issues—and discusses mitigation strategies, measurement techniques, and cultural changes needed to maintain sustainable, reliable ML deployments.

Software Engineering · Technical Debt · data dependencies
26 min read
Efficient Ops
Dec 5, 2015 · Information Security

Cultivating Secure Development Talent, Effective Security Visualization, and the Role of Machine Learning

This article shares insights from a security‑focused discussion on nurturing security‑oriented developers, balancing leadership and analyst needs in security visualization, and evaluating whether machine‑learning techniques truly add value to internal security data processing.

DevSecOps · information security · machine learning
7 min read
Architects Research Society
Dec 3, 2015 · Artificial Intelligence

IBM Donates SystemML to Apache Incubator, Joining the Open‑Source Machine Learning Wave

IBM announced that its SystemML machine‑learning platform will become an Apache Incubator project, highlighting a broader industry trend where tech giants like Google and Facebook open‑source their AI tools to accelerate data‑driven innovation and expand enterprise‑focused machine‑learning ecosystems.

Apache SystemML · Big Data · IBM
5 min read
21CTO
Nov 20, 2015 · Artificial Intelligence

How Meituan Builds and Optimizes Its Recommendation System

This article explains Meituan's end‑to‑end recommendation system architecture, data processing pipeline, candidate generation strategies, model training and online ranking techniques, illustrating how data, algorithms, and real‑time signals are combined to improve relevance and conversion.

AI · Meituan · data engineering
19 min read
ITPUB
Nov 13, 2015 · Fundamentals

What Defines Data Science? Core Steps and Essential Book Recommendations

The article outlines data science as an interdisciplinary field centered on three key steps—pre‑processing, interpretation, and modeling—while providing concise recommendations of foundational books for R, Python, exploratory analysis, machine learning, and essential tools to guide practitioners.

Book Recommendations · Data Science · R programming
16 min read

TalkingData’s Journey to Building a Mobile Big Data Platform with Spark and YARN

This article recounts how TalkingData progressively introduced Spark into its Hadoop‑YARN based mobile big‑data platform, detailing early architectures, migration challenges, performance gains, the fully Spark‑centric redesign with Kafka and Spark Streaming, encountered pitfalls, and future plans for further optimization.

Data Platform · Hadoop · Spark
16 min read
21CTO
Oct 26, 2015 · Artificial Intelligence

How Weibo’s Recommendation Engine Evolved: From 1.0 to Platform‑Scale 3.0

This article traces the evolution of Weibo's recommendation architecture across three major phases—independent 1.0, layered 2.0, and platform‑centric 3.0—detailing the driving business and technical factors, architectural components, advantages, shortcomings, and key outcomes of each stage.

AI Engineering · Weibo · architecture evolution
0 likes · 19 min read
21CTO
Oct 24, 2015 · Artificial Intelligence

Building an Offline Recommendation System with Mahout: Practical Steps and Tips

This article walks through the end‑to‑end process of building an offline recommendation system using Mahout, covering data collection, filtering, storage, various collaborative‑filtering algorithms, similarity measures, evaluation metrics, parameter tuning, AB testing, and spam‑fighting strategies.

Mahout · collaborative filtering · machine learning
0 likes · 16 min read
Qunar Tech Salon
Oct 22, 2015 · Artificial Intelligence

Airbnb’s Dynamic Pricing System and Machine‑Learning Platform (Aerosolve)

The article describes how Airbnb built and continuously improved a machine‑learning‑driven dynamic pricing tool, backed by its Aerosolve machine‑learning platform, that extracts property features, compares similar listings, incorporates seasonal and event‑driven demand, and automatically updates nightly price suggestions to help hosts set optimal rates.

Airbnb · Data Science · Price Optimization
0 likes · 18 min read
21CTO
Oct 16, 2015 · Artificial Intelligence

Mastering Industrial Machine Learning: From Problem Modeling to Model Optimization

This article outlines a complete industrial machine‑learning workflow—starting with problem modeling, through data preparation, feature extraction, model training, and ending with model optimization—illustrated with a real‑world DEAL revenue‑prediction case and practical tips for handling data, features, and model selection.

Industrial Application · Model Training · data preparation
0 likes · 20 min read
Architects Research Society
Oct 16, 2015 · Artificial Intelligence

From RankNet to Boosted Decision Trees: Evolution of Bing’s Search Ranking Algorithms

Chris Burges recounts Microsoft’s transition from the early “Flying Dutchman” system to RankNet and finally to Boosted Decision Trees, explaining how fast experimentation, LambdaRank/MART innovations, and large‑scale data handling have dramatically improved Bing’s search ranking accuracy and efficiency.

Boosted Decision Trees · LambdaMART · RankNet
0 likes · 11 min read
Art of Distributed System Architecture Design
Oct 8, 2015 · Artificial Intelligence

Facebook AI Research (FAIR): History, Teams, Projects, and Vision

The article chronicles Facebook's evolution from a social platform into a leading AI research hub, detailing the founding of FAIR, its key personnel, ambitious goals, major projects such as memory networks, embedding world, DeepFace, language technology, and the M assistant, and highlights the open, collaborative nature of its AI work.

AI research · Deep Learning · FAIR
0 likes · 17 min read
21CTO
Sep 28, 2015 · Artificial Intelligence

How Meituan Built a Scalable AI‑Powered Recommendation Engine

This article details Meituan's end‑to‑end recommendation system, covering its four‑layer architecture, data sources, candidate‑generation strategies, fusion methods, and both linear and non‑linear re‑ranking models, while highlighting practical optimizations like AB testing and online learning.

Meituan · Online Learning · data pipelines
0 likes · 15 min read
21CTO
Sep 16, 2015 · Artificial Intelligence

Why Deep Learning Marks a Turning Point in Artificial Intelligence

The article traces humanity’s long‑standing quest for intelligent machines—from early mechanical curiosities and Turing’s seminal test to modern breakthroughs in deep learning, highlighting how hierarchical feature learning, massive data, and collaborative open‑source efforts are reshaping AI and its future impact.

AI history · Deep Learning · artificial intelligence
0 likes · 10 min read
Qunar Tech Salon
Aug 14, 2015 · Big Data

The Nine Laws of Data Mining: Principles, Processes, and Insights

This article presents nine fundamental laws of data mining—covering goals, knowledge, preparation, experimentation, patterns, insight, prediction, value, and change—explaining how business objectives and domain expertise drive each stage of the CRISP‑DM process and why technical metrics alone cannot guarantee success.

CRISP-DM · Predictive Modeling · business knowledge
0 likes · 19 min read
Model Perspective
Jul 22, 2015 · Artificial Intelligence

How Data Mining Can Transform School Learning: Insights from AI

This essay examines how data‑mining techniques underpinning artificial intelligence can be applied to school learning, proposing a framework for data collection, analysis, and interpretation to uncover deeper insights into student behavior and improve educational outcomes.

AI · Learning Analytics · education
0 likes · 6 min read
Qunar Tech Salon
Jul 12, 2015 · Big Data

Airbnb OpenAir Conference: Open‑Source Tools Airpal, Aerosolve, and Airflow

At Airbnb’s inaugural OpenAir conference, the company unveiled three open‑source big‑data tools: Airpal, a Presto‑based visual SQL query engine; Aerosolve, an interpretable machine‑learning engine for pricing recommendations; and Airflow, a platform developed internally for orchestrating and monitoring data pipelines.

Airbnb · Big Data · OpenAir
0 likes · 4 min read
Suning Technology
Jun 18, 2015 · Artificial Intelligence

How Suning Uses Naive Bayes for High‑Accuracy Product Classification

This article explains Suning's implementation of a Naive Bayes‑based product classification system, detailing its basic theory, formal definition, step‑by‑step training process, three implementation phases, evaluation results, and error analysis to improve classification accuracy.

Naive Bayes · Suning · algorithm
0 likes · 6 min read
Ctrip Technology
May 14, 2015 · Artificial Intelligence

Data‑Driven User Experience: Machine Learning Applications in Hotel Booking and Marketing at Ctrip

In his 2015 China Hotel Marketing Summit keynote, Ctrip CTO Ye Yamin explained how machine‑learning models built on purchase behavior and order data improve hotel room availability predictions, shorten confirmation times, personalize recommendations, and evaluate advertising effectiveness, illustrating a data‑driven approach to user experience and operations.

Big Data · Data Analytics · Marketing
0 likes · 14 min read
MaGe Linux Operations
Apr 22, 2015 · Artificial Intelligence

Your Complete Python Roadmap to Become a Data Scientist

This guide outlines a comprehensive, step‑by‑step Python learning path for aspiring data scientists, covering environment setup, core language fundamentals, regular expressions, scientific libraries such as NumPy, SciPy, Matplotlib, Pandas, data visualization, machine‑learning with scikit‑learn, and an introduction to deep learning, with curated resources and practice projects.

Data Science · Data visualization · Deep Learning
0 likes · 11 min read
Qunar Tech Salon
Mar 15, 2015 · Artificial Intelligence

Overview of Common Classification Algorithms in Data Mining

This article introduces the concepts of classification and prediction in data mining, outlines their workflow, and provides concise explanations of six widely used classification techniques—decision trees, K‑Nearest Neighbour, Support Vector Machine, Vector Space Model, Bayesian methods, and neural networks—highlighting their principles, advantages, and limitations.

Bayesian · data mining · decision tree
0 likes · 9 min read
Qunar Tech Salon
Mar 14, 2015 · Artificial Intelligence

Common Distance and Similarity Measures in Machine Learning and Data Mining

This article reviews the most frequently used distance and similarity formulas in machine learning and data mining, explaining their definitions, mathematical properties, practical examples, and when each metric is appropriate for measuring differences between data points or probability distributions.

Cosine Similarity · KL divergence · Mahalanobis distance
0 likes · 13 min read
Meituan Technology Team
Jan 31, 2015 · Artificial Intelligence

Meituan Recommendation System Architecture and Optimization Practices

Meituan’s recommendation platform comprises a data layer, a multi‑strategy candidate generation layer, a fusion‑and‑filtering layer, and a ranking layer that uses additive‑grove tree ensembles and online‑updated logistic regression, leveraging extensive user behavior logs, location, query, graph and real‑time signals to deliver personalized deals.

Meituan · machine learning · personalization
0 likes · 14 min read
Meituan Technology Team
Dec 18, 2014 · Artificial Intelligence

Auto-Label Missing POI Categories Using Naive Bayes and Feature Selection

This article details a step‑by‑step machine‑learning pipeline that transforms over one million calibrated POI records into feature vectors, selects discriminative terms via information‑gain and domain rules, trains a Naive Bayes classifier, and achieves 91% accuracy with 84% coverage on unseen POI data.

Chinese NLP · Naive Bayes · POI classification
0 likes · 12 min read
Baidu Tech Salon
Oct 21, 2014 · Big Data

Baidu's Big Data Intelligence: From Data to Intelligence - QCon2014 Presentation

At QCon2014, Baidu Research’s Shen Zhiyong showcased the company’s massive big‑data engine—20,000 PB storage and daily processing of up to 100 PB—highlighting open platforms like Baidu Brain and real‑world prediction projects for tourism, the World Cup, disease outbreaks, and UN collaborations, while urging industry‑wide data‑driven transformation.

Baidu · Data Intelligence · data mining
0 likes · 8 min read
Baidu Tech Salon
Aug 18, 2014 · Big Data

Big Data and Prediction: Insights from Baidu Research Lab

At Baidu’s 53rd Technology Salon, researcher Shen Zhiyong outlined the lab’s vision of an online intelligent system that unifies monitoring, anomaly detection, diagnosis and big‑data‑driven prediction—using time‑series, causal and simulation analyses—to forecast tourism crowds, predict Gaokao essay topics, and illustrate both the opportunities and challenges of processing massive, heterogeneous data for real‑time decision support.

Baidu · Prediction · Time Series Analysis
0 likes · 8 min read
Suning Design
Jul 17, 2014 · Mobile Development

What’s Next for Mobile Search? Exploring Future Input, Data, and Output Innovations

Mobile search is evolving beyond traditional keyword queries, with emerging trends in precise user profiling, crowdsourced data, voice and natural language understanding, deep linking, machine learning, and structured, intelligent result aggregation, promising a more personalized, context‑aware, and seamless search experience on smartphones.

deep linking · machine learning · mobile search
0 likes · 9 min read