Tagged articles

Clustering

154 articles · Page 2 of 2

Oct 24, 2020 · Artificial Intelligence

FrameX: An AI System for Intelligent Floorplan Analysis and Applications

FrameX is an AI-powered platform developed by Beike’s Data Intelligence Center that leverages vector floorplan data to automatically tag, score, interpret, cluster, and retrieve housing layouts, supporting numerous business scenarios through a layered architecture of data, feature, and application layers.

AI · Clustering · Floorplan Analysis

0 likes · 9 min read

Selected Java Interview Questions

Oct 17, 2020 · Databases

Redis Interview Questions and Answers: Persistence, Caching Issues, Data Types, Clustering, and More

This article provides a comprehensive overview of Redis interview topics, covering persistence mechanisms, cache avalanche and penetration problems, hot and cold data concepts, differences from Memcached, single‑thread performance, data structures, expiration policies, clustering solutions, distributed locks, transactions, and practical troubleshooting tips.

Caching · Clustering · Data Types

0 likes · 23 min read

Architect

Sep 26, 2020 · Backend Development

Understanding RabbitMQ: AMQP Fundamentals, Reliability, Consumer Flow Control, and High‑Availability Deployment

This article explains RabbitMQ’s AMQP fundamentals, exchange types, reliability mechanisms such as confirms and returns, consumer flow control, idempotency, dead‑letter handling, and various high‑availability deployment models including mirrored clusters and federation.

AMQP · Clustering · RabbitMQ

0 likes · 16 min read

Full-Stack Internet Architecture

Sep 4, 2020 · Databases

Comprehensive Guide to Redis Data Structures, Persistence, Transactions, Clustering, and Applications

This article provides an in‑depth technical overview of Redis, covering its core data structures, memory allocation strategies, eviction policies, persistence mechanisms (RDB and AOF), transaction model, sentinel and cluster architectures, Pub/Sub messaging, and multiple approaches to implementing distributed locks.

Clustering · Data Structures · Distributed Lock

0 likes · 89 min read

DataFunTalk

Jul 20, 2020 · Artificial Intelligence

Embedding Techniques in Tencent Mobile News Recommendation System

This article reviews the practical use of embedding technologies in Tencent's mobile news recommendation pipeline, covering the fundamentals of embeddings, their historical development, item and image embeddings, user embeddings, various vector‑based recall methods, clustering strategies, and recent advances and challenges.

Clustering · Embedding · Tencent

0 likes · 15 min read

Laravel Tech Community

Jul 15, 2020 · Databases

Comprehensive Redis Interview Questions and Answers

This article provides a comprehensive overview of Redis, covering its definition, advantages over memcached, supported data types, memory consumption, eviction policies, clustering options, persistence mechanisms, distributed lock implementations, cache penetration and avalanche solutions, and best-use scenarios compared to other caching systems.

Cache · Clustering · Distributed Lock

0 likes · 26 min read

Big Data Technology & Architecture

Jun 6, 2020 · Artificial Intelligence

Embedding Techniques and Practices in Tencent Mobile News Recommendation System

This article reviews the concept, history, and practical implementations of embedding—including item, image, and user embeddings—and describes various vector‑based recall strategies such as i2i, u2i, clustering, and deep‑learning models used in Tencent's mobile news recommendation platform.

Clustering · DSSM · Embedding

0 likes · 18 min read

Architects' Tech Alliance

May 21, 2020 · Fundamentals

Fundamentals of Storage: RAID, COW/ROW Snapshots, Backup, CDP, Clustering, and VTL

This article provides a comprehensive overview of storage fundamentals, covering RAID configurations, COW/ROW snapshot mechanisms, backup strategies, continuous data protection (CDP), clustering concepts, and virtual tape library (VTL) technologies, offering essential knowledge for IT professionals and system architects.

CDP · Clustering · RAID

0 likes · 1 min read

ITFLY8 Architecture Home

Apr 26, 2020 · Backend Development

How to Build a Million‑Message‑Per‑Second RabbitMQ Cluster: Lessons from Google

This article explains how to design and scale a RabbitMQ cluster capable of handling millions of messages per second, covering core concepts, Google’s large‑scale test setup, sharding and federation plugins, mirror queues, reliability mechanisms, and practical tips for high‑availability and performance optimization.

Clustering · Message Queue · RabbitMQ

0 likes · 25 min read

Yanxuan Tech Team

Apr 20, 2020 · Artificial Intelligence

How AI-Driven Clustering Boosts Smart Customer Service Knowledge Bases

This article outlines an AI-powered workflow for constructing and enriching a business knowledge base in intelligent customer service, covering preprocessing, intent detection, deep and shallow semantic feature engineering, hierarchical bucket clustering, and automated summary extraction to improve FAQ coverage and reduce manual workload.

AI · Clustering · Knowledge Base

0 likes · 15 min read

21CTO

Mar 5, 2020 · Fundamentals

How Alibaba Overcame Three Major Challenges in Code Defect Detection with PRECFIX

This article explains how Alibaba's Cloud R&D team tackled the complex business environment, limited auxiliary resources, and strict product requirements of defect detection by developing the PRECFIX method, which extracts, clusters, and templates defect‑repair pairs to improve code review and patch recommendation.

Clustering · code review · defect detection

0 likes · 17 min read

Java Backend Technology

Jan 22, 2020 · Backend Development

Why Does Redis Have 16 Databases? Uncover the Design Reason

This article explains why a Redis instance creates 16 default databases, how they function as simple namespaces rather than separate applications, how to configure their number, and why Redis clusters support only a single database.

Caching · Clustering · Redis

0 likes · 6 min read

Open Source Tech Hub

Jan 9, 2020 · Databases

Mastering Redis Configuration: Units, Security, Persistence, and Cluster Settings

This guide provides a comprehensive walkthrough of Redis configuration options, covering memory unit notation, include templates, module loading, security settings, networking, persistence mechanisms, replication, clustering, Docker deployment, monitoring, and advanced memory and performance tuning.

Clustering · Performance Tuning · Persistence

0 likes · 25 min read

360 Zhihui Cloud Developer

Dec 3, 2019 · Databases

How to Build a High‑Performance InfluxDB Cluster for Massive Time‑Series Data

This article explores InfluxDB’s time‑series strengths, compares TSDB with traditional databases, explains its TSM storage engine and shard concepts, and details the design, architecture, performance benchmarks, integration steps, and future enhancements of a high‑availability InfluxDB‑HA solution used at 360.

Clustering · HighAvailability · InfluxDB

0 likes · 9 min read

DataFunTalk

Nov 25, 2019 · Artificial Intelligence

Real-time Attention-based Look-alike Model for Recommender Systems

This talk presents a real-time attention-based look‑alike model (RALM) designed to address the long‑tail problem in recommendation systems by efficiently expanding seed users, leveraging user representation learning, attention mechanisms, and clustering to deliver timely, diverse content without retraining the model.

Clustering · Long Tail · attention

0 likes · 24 min read

Xianyu Technology

Nov 7, 2019 · Big Data

Sequence Pattern Mining for User Behavior Analysis in Xianyu

By applying sequence pattern mining and unsupervised clustering to Xianyu’s massive event logs, the study abstracts high‑level user behaviors, discovers frequent subsequences, uncovers unknown fraudulent account patterns, expands known fraud cohorts with 99 % precision, and enables richer analyses such as PCA‑based cross‑group comparisons.

Big Data · Clustering · data mining

0 likes · 8 min read

Java Captain

Apr 24, 2019 · Databases

Understanding Redis Data Structures, Clustering, and Core Operations

This article explains how Redis stores all values as byte arrays, clarifies the five primary data structures, describes cluster slot mapping and node‑key relationships, and covers single‑threaded execution, transactions, pipelines, and the Redis protocol in detail.

Clustering · Data Structures · protocol

0 likes · 14 min read

Java Captain

Apr 8, 2019 · Backend Development

RabbitMQ: Use Cases, Roles, Components, and Operational Practices

This article explains RabbitMQ's typical scenarios, key roles and components, virtual host purpose, message delivery process, durability and loss‑prevention mechanisms, broadcast types, delayed queues, clustering benefits, node types, setup considerations, and cluster shutdown order.

Clustering · RabbitMQ · architecture

0 likes · 9 min read

JD Tech Talk

Mar 22, 2019 · Artificial Intelligence

Data Mining Techniques for Telemarketing: Supervised Classification, Clustering, Optimization, Anomaly Detection, and Text Mining

The article examines how telemarketing, a data‑intensive industry, leverages various data‑mining methods—including supervised classification, clustering, operations research optimization, anomaly detection, and text mining—to improve lead selection, agent allocation, churn prediction, and voice analysis, while also outlining the key data‑talent roles needed for successful implementation.

Anomaly Detection · Clustering · Optimization

0 likes · 7 min read

MaGe Linux Operations

Mar 8, 2019 · Operations

Mastering High‑Availability Clusters: Resources, Constraints, and Failure Handling

This article explains the principles and components of high‑availability (HA) clusters, covering active/standby nodes, resource stickiness and constraints, heartbeat and quorum mechanisms, split‑brain avoidance, failure detection methods, and the minimal setup required for a reliable web‑service HA deployment.

Clustering · High Availability · Operations

0 likes · 14 min read

Efficient Ops

Feb 11, 2019 · Databases

Best Redis Cluster Options: Client Sharding, Proxy, Codis, Official

Redis, a high‑performance NoSQL database, offers multiple clustering approaches—including client‑side sharding, proxy‑based solutions like Twemproxy and Codis, and the native Redis Cluster—each with distinct trade‑offs in scalability, availability, operational complexity, and performance, guiding engineers to select the optimal architecture for their workloads.

Clustering · Codis · Redis

0 likes · 15 min read

Alibaba Cloud Developer

Jan 22, 2019 · Artificial Intelligence

How Tmall’s “Most Concerned” Feature Uses AI to Match Reviews with Consumer Questions

The article explains how Tmall’s new “Most Concerned” module leverages NLP techniques, fastText embeddings, Bi‑LSTM classifiers, and a custom clustering algorithm to filter, group, and link consumer questions with relevant product reviews, improving the shopping experience across many product categories.

AI · Clustering · NLP

0 likes · 9 min read

Alibaba Cloud Developer

Jan 2, 2019 · Artificial Intelligence

How AI Detects Screenshot Bugs: From CNN Models to Image Clustering

Leveraging TensorFlow's CNN and OCR‑LSTM models, this article details how AI can automatically spot blank pages, UI anomalies, and garbled text in app screenshots, and describes a Jenkins‑driven retraining pipeline and hierarchical clustering to de‑duplicate images and boost manual review efficiency.

AI · CNN · Clustering

0 likes · 7 min read

Xianyu Technology

Dec 12, 2018 · Big Data

Community Data Normalization Using Prefix Matching and Text Similarity

The study presents a four‑step pipeline that normalizes community data for rental platforms by clustering records using longest‑common‑prefix patterns, geographic filtering, Levenshtein similarity, and pattern‑based parent‑child assignment, achieving under 8 % false positives and 5 % false negatives.

Clustering · Geospatial · data normalization

0 likes · 10 min read

Mike Chen's Internet Architecture

Dec 10, 2018 · Databases

Comprehensive Redis Interview Guide: Data Types, Core Features, Persistence, Clustering, and Performance Tips

This article provides an extensive overview of Redis for interview preparation, covering its supported data structures, key functionalities such as Sentinel, replication, transactions, Lua scripting, persistence mechanisms, clustering options, performance characteristics, memory‑optimization strategies, and common use‑case scenarios.

Clustering · data-types

0 likes · 12 min read

360 Quality & Efficiency

Dec 7, 2018 · Artificial Intelligence

Image Feature Extraction and Clustering for Key Frame Selection in Mobile App Installation Screenshots

This article presents a technical solution for extracting representative key frames from time‑series screenshots of a mobile app installation process, covering pixel sampling, dimensionality reduction, classic feature extractors (SIFT, HOG, ORB), auto‑encoder based deep learning, and clustering methods such as KMeans and DBSCAN, along with practical results and performance analysis.

Autoencoder · Clustering · HOG

0 likes · 5 min read

ITFLY8 Architecture Home

Dec 6, 2018 · Databases

Redis vs Memcached: Which In‑Memory Cache Wins for Complex Data?

Redis and Memcached differ significantly in data structure support, memory efficiency, performance, memory management, persistence options, and clustering capabilities, with Redis offering richer data types, server‑side operations, configurable persistence, and native clustering, while Memcached provides simpler key‑value storage, slab allocation, and client‑side distribution.

Clustering · Data Structures · In-Memory Cache

0 likes · 17 min read

360 Quality & Efficiency

Nov 2, 2018 · Artificial Intelligence

Extracting Regression from Production Requests Using Clustering Algorithms

This article explains how to apply TF‑IDF weighting and the K‑means clustering algorithm in Python to identify a small set of representative regression cases from hundreds of thousands of production request records, including guidance on selecting the optimal number of clusters.

Clustering · K-Means · TF-IDF

0 likes · 5 min read

Architect's Tech Stack

Oct 23, 2018 · Databases

Redis Overview: Features, Data Types, Persistence, Clustering, and Common Interview Questions

This article provides a comprehensive introduction to Redis, covering its core concepts, advantages, data structures, persistence mechanisms, eviction policies, clustering, common usage scenarios, and typical interview questions for developers working with this high‑performance in‑memory key‑value store.

Cache · Clustering · In-Memory Database

0 likes · 24 min read

Big Data and Microservices

Sep 17, 2018 · Big Data

5 Essential Data Mining Techniques Every Analyst Should Know

This article outlines five widely used data‑mining methods—association rules, classification/tagging, clustering, decision trees, and sequential pattern mining—explaining their principles, real‑world examples, and how they help organizations extract actionable insights from massive datasets.

Big Data · Clustering · Decision Trees

0 likes · 6 min read

Meituan Technology Team

Sep 13, 2018 · Mobile Development

ARKit LBS AR Application for Meituan Dining Experience

Meituan’s dining AR app uses ARKit’s orientation‑tracking configuration and gravity‑and‑heading world alignment to place virtual restaurant cards in the camera view, rendering them with SceneKit billboards, handling overlap via tap‑to‑disperse and K‑Means clustering, and eliminating flicker by disabling depth buffering.

AR · ARKit · Clustering

0 likes · 15 min read

Java Backend Technology

Jul 14, 2018 · Databases

Redis Deep Dive: Core Concepts, Data Types, and Best Practices

This comprehensive guide explains what Redis is, its advantages over memcached, supported data structures, eviction policies, clustering options, persistence mechanisms, memory optimization techniques, and practical use‑cases such as caching, queues, leaderboards, and pub/sub, providing essential knowledge for developers and architects.

Clustering · Data Structures · In-Memory Database

0 likes · 26 min read

Programmer DD

Jun 7, 2018 · Operations

How to Build a High‑Availability RabbitMQ Cluster with Load Balancing

This guide explains the principles behind RabbitMQ clustering, shows how metadata synchronization works, compares design choices, and provides step‑by‑step instructions—including component installation, node configuration, HAProxy load‑balancing setup, and a sample architecture diagram—to create a reliable, scalable RabbitMQ cluster for production use.

Clustering · HAProxy · Operations

0 likes · 16 min read

Efficient Ops

Apr 18, 2018 · Operations

Huawei’s Triple‑Play Model: Advancing AIOps for Massive K8s and Serverless

At the 9th Global Operations Conference, Huawei Cloud’s chief architect Cai Xiaogang presented a three‑pronged AIOps strategy that combines large‑scale Kubernetes management, causal tracing in Serverless environments, multi‑source RCA analysis, and clustering‑based black‑box network packet inspection, showcasing how academia‑industry collaboration accelerates cloud‑native operations.

AIOps · Clustering · Root Cause Analysis

0 likes · 8 min read

Architects' Tech Alliance

Mar 9, 2018 · Artificial Intelligence

Master Machine Learning Basics: From PCA to KNN Explained with Visual Demos

An in‑depth, visual guide walks readers through the fundamentals of machine learning—distinguishing supervised from unsupervised approaches, explaining dimensionality reduction with PCA, detailing clustering techniques such as hierarchical clustering, K‑Means and DBSCAN, and summarizing core regression and classification algorithms including linear regression, SVM, decision trees, logistic regression, Naïve Bayes, and KNN.

Clustering · Regression · classification

0 likes · 11 min read

Architecture Digest

Feb 13, 2018 · Artificial Intelligence

Overview of Common Machine Learning Models: Characteristics, Advantages, and Disadvantages

This article provides a concise overview of fifteen widely used machine learning models—including decision trees, random forests, k‑means, KNN, EM, linear and logistic regression, Naive Bayes, Apriori, Boosting, GBDT, SVM, neural networks, HMM, and CRF—detailing their features, strengths, weaknesses, and typical application scenarios.

Clustering · Regression · classification

0 likes · 12 min read

Hulu Beijing

Feb 8, 2018 · Artificial Intelligence

How Self‑Organizing Maps Work: Key Features, Design Tips & K‑Means Comparison

This article explains the principles, biological inspiration, network structure, training process, design parameters, and practical differences of Self‑Organizing Maps (SOM), an unsupervised neural network used for clustering, visualization, and feature extraction, and compares it with methods like K‑means.

Clustering · Self-Organizing Map · dimensionality reduction

0 likes · 10 min read

MaGe Linux Operations

Jan 25, 2018 · Databases

Master Redis: Data Structures, Commands, and Performance Tuning Explained

This comprehensive guide introduces Redis fundamentals, covering its core data structures and essential commands, then delves into performance optimization, high‑availability setups with replication and Sentinel, and scaling strategies using Redis Cluster, providing practical examples and best‑practice recommendations for robust in‑memory data management.

Clustering · Data Structures · Performance Tuning

0 likes · 29 min read

Qunar Tech Salon

Jan 23, 2018 · Artificial Intelligence

Intelligent Business Zone Planning for Super Bus Service Using DBSCAN Clustering and Convex Hull

The article describes how the Super Bus platform leverages unsupervised DBSCAN clustering and a Graham‑scan convex‑hull algorithm, combined with a data‑center and distributed processing framework, to automatically generate compliant service zones that match user demand while improving efficiency and scalability.

Clustering · DBSCAN · convex hull

0 likes · 8 min read

dbaplus Community

Jan 14, 2018 · Backend Development

Mastering Tomcat: Kernel Design, Clustering, and Performance Tuning

This article provides a comprehensive technical guide to Tomcat, covering its kernel implementation principles, server models, distributed clustering strategies, production deployment parameters, JVM tuning, request processing flow, servlet mechanisms, filter chains, Comet and WebSocket modes, as well as performance monitoring and optimization techniques.

Clustering · JVM · Java

0 likes · 18 min read

Architecture Digest

Oct 26, 2017 · Databases

Redis Overview: Architecture, Master‑Slave Replication, Cluster Design, Persistence and Failure Handling

This article provides a comprehensive English overview of Redis, covering its in‑memory key‑value data model, master‑slave replication, cluster topology, installation steps, persistence mechanisms (RDB and AOF), consistent hashing, node failure detection, slave election, and the advantages and drawbacks of using Redis Cluster.

Clustering · In-Memory Database · Persistence

0 likes · 16 min read

21CTO

Jul 8, 2017 · Artificial Intelligence

Mastering Recommendation Systems: From Collaborative Filtering to Deep Learning

This article surveys major recommendation system techniques—from collaborative filtering and matrix factorization to clustering and deep‑learning approaches like YouTube’s two‑stage neural network—explaining their principles, strengths, and practical considerations for building effective personalized recommenders.

Clustering · Recommendation Systems · YouTube

0 likes · 10 min read

Tencent IMWeb Frontend Team

Jun 27, 2017 · Backend Development

Master Real-Time Messaging: From Polling to WebSocket with Socket.io and Multi-Node Clustering

This article explains the evolution of real-time message delivery—from short polling to long polling, streaming, and WebSocket—introduces Socket.io, and details a multi‑node cluster architecture using Redis, Nginx, and Node.js for scalable chat applications.

Clustering · Node.js · Redis

0 likes · 9 min read

Architects' Tech Alliance

May 24, 2017 · Big Data

Customer Segmentation: Processes, Best Practices, Common Mistakes, and an RFM Model Case Study

This article provides a comprehensive overview of customer segmentation, detailing its definition, multi‑dimensional challenges, seven implementation guidelines, five systematic steps, ten frequent pitfalls, and a practical RFM model case study using big‑data mining techniques.

Clustering · Customer Segmentation · Marketing Analytics

0 likes · 16 min read

MaGe Linux Operations

May 7, 2017 · Artificial Intelligence

Big Data & Machine Learning: Core Definitions and Essential Algorithms

This article explains what big data and machine learning are, their interrelationship, various big‑data analysis approaches, core machine‑learning concepts, and details ten fundamental algorithms—including regression, neural networks, SVM, clustering, dimensionality reduction, and recommendation—while highlighting their roles in modern data‑driven applications.

Big Data · Clustering · Regression

0 likes · 24 min read

Tencent Cloud Developer

Dec 26, 2016 · Databases

Analysis of Redis Design: Network Model, Data Structures, Memory Management, Persistence, and Clustering

The article dissects Redis’s architecture by examining its single‑threaded reactor network model, core data structures and memory‑management tactics, AOF/RDB persistence mechanisms, and master‑slave, Sentinel, and Cluster multi‑node strategies, highlighting how each design choice balances speed, memory usage, and system complexity.

Clustering · Memory Management · Persistence

0 likes · 16 min read

Architects' Tech Alliance

Nov 24, 2016 · Big Data

Data Mining Overview: Process, Techniques, and Model Evaluation

This article provides a comprehensive introduction to data mining, covering its definition, goal setting, data sampling, exploration, preprocessing, pattern discovery, model building, evaluation methods, and the main analytical techniques such as classification, regression, clustering, association rules, feature and deviation analysis, and web mining.

Clustering · association rules · classification

0 likes · 10 min read

Architecture Digest

Nov 17, 2016 · Big Data

Spam Detection on Zhihu Using Text and Behavior Clustering with Jaccard and SimHash on Spark

This article describes how Zhihu combats large‑scale spam by applying text and behavior clustering techniques—using Jaccard similarity, SimHash fingerprinting, and Spark‑based graph partitioning—to efficiently identify and group similar spammy content and actions.

Big Data · Clustering · SimHash

0 likes · 11 min read

MaGe Linux Operations

Sep 25, 2016 · Databases

Understanding Redis Clustering: Client Sharding, Proxy Sharding, and Codis Solutions

This article reviews Redis's native limitations, explains three clustering approaches—client‑side sharding, proxy sharding (e.g., Twemproxy), and the official Redis Cluster—then compares Twemproxy's drawbacks and introduces Codis as a modern, open‑source alternative with practical deployment tips.

Clustering · Codis · Redis

0 likes · 9 min read

Qunar Tech Salon

Jul 12, 2016 · Databases

Redis vs Memcached: Data Types, Memory Management, Persistence, and Clustering Comparison

This article compares Redis and Memcached across data‑type support, memory‑usage efficiency, performance, memory‑management mechanisms, persistence options, and clustering capabilities, highlighting the strengths and trade‑offs of each in‑memory data store.

Clustering · Data Structures · In-Memory Database

0 likes · 18 min read

Architect

Mar 6, 2016 · Big Data

Clustering Geolocated User Events with DBSCAN and Spark

This article explains how to apply the DBSCAN clustering algorithm to geolocated user event data and leverage Apache Spark’s distributed processing with PairRDDs to efficiently identify frequent user regions, detect outliers, and build location‑based services such as personalized recommendations and security alerts.

Big Data · Clustering · DBSCAN

0 likes · 8 min read

Qunar Tech Salon

Feb 6, 2016 · Big Data

An Introduction to Data Mining Algorithms and Their Real-World Applications

This article introduces the main types of data‑mining algorithms—classification, prediction, clustering, and association—explains supervised and unsupervised learning, and illustrates each with practical examples such as spam detection, tumor cell identification, wine quality assessment, fraud detection, recommendation systems, and more.

Clustering · association analysis · classification

0 likes · 15 min read

dbaplus Community

Dec 25, 2015 · Artificial Intelligence

Detecting Fraudulent ModemPOOL Terminals with K‑Means Clustering

This article details how telecom operators can identify fraudulent ModemPOOL (cat‑pool) terminals and predict churn using data‑driven clustering and day‑interval warning models, covering metric selection, data exploration, k‑means clustering, model deployment, and performance evaluation.

Clustering · K-Means · Model Deployment

0 likes · 18 min read

Art of Distributed System Architecture Design

Aug 8, 2015 · Operations

Automatic Anomaly Detection for Server Failures Using DBSCAN at Netflix

This article describes how Netflix’s technical operations team built an automatic anomaly‑detection system based on the DBSCAN clustering algorithm to identify subtly failing servers from time‑series error‑rate data, evaluate its effectiveness, and discuss practical deployment considerations.

Anomaly Detection · Clustering · DBSCAN

0 likes · 9 min read
