Tag

Clustering


Rare Earth Juejin Tech Community
Feb 23, 2025 · Frontend Development

AutoHue.js: Automatic Image Color Extraction for Web Banners

This article introduces AutoHue.js, a lightweight JavaScript library that uses Canvas, Lab color space conversion, and clustering algorithms to automatically extract dominant, secondary, and edge colors from images for seamless background integration in web banners, complete with usage examples and installation instructions.

AutoHue, Canvas, Clustering
0 likes · 16 min read
Code Mala Tang
Nov 20, 2024 · Backend Development

Can Node.js Power Millions of Users? Scaling Strategies Revealed

This article explores whether Node.js can handle millions of concurrent users, explains the core non‑blocking architecture, outlines challenges such as the single‑thread model and memory leaks, and provides practical scaling tactics like clustering, load balancing, caching, and database optimization.

Caching, Clustering, Node.js
0 likes · 10 min read
Sohu Tech Products
Sep 11, 2024 · Big Data

Tencent Real-time Lakehouse Intelligent Optimization Practice

Tencent’s real‑time lakehouse combines Spark, Flink, StarRocks and Presto compute layers with Iceberg‑based management and HDFS/COS storage, and its Intelligent Optimize Service—comprising Compaction, Expiration, Cleaning, Clustering, Index and Auto‑Engine modules—automatically reduces merge time, improves query performance, enables secondary indexing, and dynamically routes hot partitions, while future plans target cold/hot separation, materialized view acceleration, and AI‑driven optimizations.

Big Data, Clustering, Compaction
0 likes · 12 min read
DataFunSummit
Aug 31, 2024 · Big Data

Apache Hudi Clustering: Workflow and Layout Optimization Strategies (Part 6)

This article explains Apache Hudi's clustering service, detailing its workflow, three execution modes, and layout optimization strategies—including linear, Z‑order, and Hilbert space‑filling curves—to improve storage locality and query performance in large‑scale data lake environments.

Apache Hudi, Big Data, Clustering
0 likes · 8 min read
Python Programming Learning Circle
May 5, 2024 · Artificial Intelligence

Python Implementation of DBSCAN and KMeans for Point Cloud Clustering and Tracking with Hungarian Matching

This article presents a Python project that reads point‑cloud data from CSV files, applies DBSCAN and KMeans clustering, extracts cluster features, and uses the Hungarian algorithm to match clusters across frames for tracking, complete with full source code and result visualization.

Clustering, DBSCAN, Data Processing
0 likes · 13 min read
Wukong Talks Architecture
Apr 24, 2024 · Backend Development

Core Concepts and Common Patterns of RabbitMQ

This article explains why message queues are needed, outlines RabbitMQ's architecture, describes its basic concepts and working modes such as simple, work, fanout, direct and topic, and discusses reliability features like transactions, confirms, dead‑letter queues, TTL, clustering, ordering, and handling message backlogs.

Clustering, Dead Letter, TTL
0 likes · 24 min read
Model Perspective
Feb 1, 2024 · Fundamentals

Essential Guide to Statistical and Probabilistic Model Articles

This curated list gathers recent articles on statistical and probabilistic models, covering clustering analysis, various linear regression techniques, and causal analysis, providing convenient links for students and researchers to explore each topic in depth.

Clustering, causal analysis, linear regression
0 likes · 3 min read
DataFunTalk
Nov 16, 2023 · Product Management

User Operations: Methods for User Analysis, Segmentation, and Aha‑Moment Identification

This article provides a comprehensive guide to user operations, covering the definition of user operations, common user analysis techniques, attribute and behavior analysis, segmentation methods based on business logic and clustering algorithms, and the concept of the Aha‑moment (or magic number) for optimizing retention and value.

Aha moment, Clustering, product management
0 likes · 12 min read
Test Development Learning Exchange
Sep 12, 2023 · Artificial Intelligence

Various Anomaly Detection Techniques with Python Code Examples

This article introduces ten common anomaly detection approaches—including statistical thresholds, boxplots, clustering, isolation forest, LOF, collaborative filtering, robust covariance, NLP, computer‑vision, and time‑series methods—each accompanied by concise Python code snippets illustrating how to identify outliers in different data domains.

Anomaly Detection, Clustering, Python
0 likes · 9 min read
Rare Earth Juejin Tech Community
Jul 25, 2023 · Artificial Intelligence

Building a Reverse Image Search Engine with Geometric Distance, ResNet Feature Embeddings, Clustering, and Milvus Vector Database

This article walks through implementing a reverse image search system, starting with simple pixel‑based geometric distance, then improving accuracy using ResNet‑derived feature embeddings, accelerating queries with K‑means clustering, and finally deploying a Milvus vector database for fast, scalable similarity retrieval.

Clustering, Computer Vision, Feature Extraction
0 likes · 17 min read
Practical DevOps Architecture
May 16, 2023 · Databases

Redis Course Curriculum Overview: Distributed Locks, High Availability, Clustering, Persistence, and Advanced Projects

This article outlines a comprehensive Redis training program covering fundamentals, distributed lock implementation, high‑availability mechanisms, clustering, persistence strategies, and practical projects such as Bloom filter integration and flash‑sale systems, providing learners with the knowledge to master advanced Redis usage.

Bloom Filter, Clustering, Persistence
0 likes · 5 min read
Architects Research Society
Apr 28, 2023 · Databases

High‑Availability Cluster Solutions for PostgreSQL

This article explains high‑availability concepts for PostgreSQL, reviews standby database types, describes clustering models, and evaluates several HA solutions such as DRBD, ClusterControl, Rubyrep, Pgpool‑II, Bucardo, Postgres‑XC, Citus, and Postgres‑XL, while noting practical considerations and trade‑offs.

Clustering, HA Solutions, PostgreSQL
0 likes · 10 min read
Model Perspective
Mar 22, 2023 · Artificial Intelligence

Master DBSCAN Clustering: Theory, Python Code, and Real-World Examples

This article introduces DBSCAN, a density‑based clustering algorithm that automatically discovers arbitrarily shaped clusters and isolates noise. It explains core, border, and noise points, walks through step‑by‑step examples, provides Python implementations using scikit‑learn, and gives guidance on key parameters such as eps and min_samples.

Clustering, DBSCAN, Python
0 likes · 10 min read
DataFunTalk
Feb 28, 2023 · Artificial Intelligence

Event‑Aware Graph Extraction and Adaptive Clustering‑Gain Network for Insurance Creative Recommendation

This article presents a comprehensive study on insurance creative recommendation, introducing an event‑aware graph extractor, a heterogeneous graph construction, and an adaptive clustering‑gain network that together address data sparsity, counterfactual samples, and cross‑industry cold‑start challenges, achieving significant AUC improvements in experiments.

AI, Clustering, advertising
0 likes · 15 min read
DataFunSummit
Feb 1, 2023 · Artificial Intelligence

Clustering-Based Global LSTM Models for Large-Scale Time Series Forecasting

The paper proposes clustering thousands of related time series and training separate global LSTM models for each cluster, showing that this reduces heterogeneity, leverages shared information, and improves forecasting accuracy compared to individual models, with extensive experiments on CIF2016 and NN5 datasets.

Big Data, Clustering, LSTM
0 likes · 33 min read
Architect
Jan 16, 2023 · Databases

Redis Fundamentals: Pipelines, Pub/Sub, Expiration, Transactions, Persistence, Distributed Locks and Clustering

This article provides a comprehensive overview of Redis, covering basic concepts, pipeline optimization, publish/subscribe messaging, key expiration strategies, transaction behavior, persistence mechanisms (RDB, AOF, hybrid), distributed locking techniques including Redisson and the Redlock algorithm, and high‑availability setups using replication, Sentinel and Cluster modes.

Caching, Clustering, Distributed Locks
0 likes · 32 min read
Model Perspective
Jan 13, 2023 · Artificial Intelligence

Master Classic Modeling with Python: LP, Graphs, Clustering, PCA & More

This article presents Python implementations of classic mathematical modeling techniques, including linear programming with PuLP, shortest‑path analysis using NetworkX, K‑means and hierarchical clustering, principal component analysis, frequent‑pattern mining with FP‑Growth, linear regression, and K‑nearest‑neighbors, with code snippets, explanations, and visualizations to guide readers through each method.

Clustering, Frequent Pattern Mining, PCA
0 likes · 12 min read
Model Perspective
Jan 8, 2023 · Artificial Intelligence

Unlock Hidden Patterns: A Deep Dive into Unsupervised Learning Techniques

This article introduces unsupervised learning, covering its motivation, Jensen's inequality, key clustering methods such as EM, k‑means, hierarchical clustering, evaluation metrics, and dimensionality‑reduction techniques like PCA and ICA, providing clear explanations and illustrative diagrams.

Clustering, EM algorithm, ICA
0 likes · 8 min read
Architect's Guide
Dec 1, 2022 · Databases

Comprehensive Guide to Redis: Architecture, Data Structures, Persistence, Replication, Clustering, and Advanced Features

This article provides an in‑depth overview of Redis, covering its single‑threaded architecture, core data structures (String, Hash, List, Set, ZSet), persistence mechanisms (RDB, AOF, hybrid), replication, Sentinel and cluster designs, memory eviction policies, bitmap analytics, skiplist implementation, and strategies for ensuring data consistency.

Caching, Clustering, Database
0 likes · 27 min read