Tagged articles
Big Data Technology & Architecture
Jul 28, 2019 · Operations

Comprehensive Guide to Building an ELK Log Management Platform with Kafka and Filebeat

This article provides a detailed tutorial on designing, deploying, and operating an ELK log management platform—including Elasticsearch, Logstash, Kibana, Kafka, and Filebeat—covering architecture options, configuration files, command‑line operations, cluster setup, and best‑practice recommendations for scalable, real‑time log collection and analysis.

ELK · Elasticsearch · Filebeat
22 min read

21CTO
Jul 11, 2019 · Big Data

Boost Elasticsearch Queries on Billions of Docs: Filesystem Cache & Smart Design

Elasticsearch query performance on billions of documents can be improved dramatically by leveraging the OS filesystem cache, limiting indexed fields, separating hot and cold data, pre‑warming caches, and using scroll or search_after for pagination, while avoiding costly joins and ensuring the dataset fits in memory.

Elasticsearch · Filesystem Cache · data modeling
12 min read

Mafengwo Technology
Jul 11, 2019 · Backend Development

How We Achieved Near‑Real‑Time MySQL‑to‑Elasticsearch Sync Using Binlog and Kafka

This article explains why traditional MySQL queries no longer meet the growing e‑commerce data needs, describes the limitations of a MySQL‑to‑Elasticsearch intermediate table, and details a binlog‑driven, Kafka‑based pipeline with custom modules, upsert handling, filtering, and monitoring to ensure fast, reliable data synchronization.

Backend · Binlog · Elasticsearch
11 min read

Big Data Technology Architecture
Jul 9, 2019 · Operations

Elasticsearch Node Shutdown Process and Risks During Rolling Upgrade

During a rolling upgrade of an Elasticsearch cluster, stopping nodes—especially the master—can block write requests, cause client connection failures, trigger master re‑election, and lead to temporary data duplication, making it essential to understand the shutdown sequence and its impact on read/write operations.

Elasticsearch · Node Shutdown · Rolling Upgrade
5 min read

ITPUB
Jul 6, 2019 · Backend Development

How Elasticsearch Revolutionized Search and Logging: The ELK Stack Story

This article narrates the origin and evolution of Elasticsearch, from its Lucene roots through Compass to the modern ELK Stack, illustrating how it simplifies full‑text search, log analysis, and real‑time monitoring for developers and operations teams.

Beats · ELK · Elasticsearch
13 min read

Big Data Technology Architecture
Jul 3, 2019 · Backend Development

Step-by-Step Guide to Installing Elasticsearch 7.x (Single‑Node) and Elasticsearch‑head

This article provides a comprehensive tutorial for installing Elasticsearch 7.x in single‑node mode, configuring its key settings, deploying the Elasticsearch‑head web UI via Tomcat, and includes reference configuration files for Elasticsearch 6.x, complete with command‑line examples and code snippets.

Configuration · Elasticsearch · Elasticsearch-head
8 min read

Programmer DD
Jun 28, 2019 · Backend Development

Master Elasticsearch with Jest: A Hands‑On Java Client Guide

This tutorial walks you through using Jest, a fluent Java HTTP client for Elasticsearch, covering Maven setup, creating a reusable client, performing CRUD operations, bulk and asynchronous requests, and showcases code examples that illustrate how to manage indices and documents efficiently.

Elasticsearch · Jest · Search
12 min read

JavaEdge
Jun 26, 2019 · Backend Development

How Does Elasticsearch Write and Query Data? A Deep Dive into ES Internals

This article explains the complete workflow of Elasticsearch write, read, search, delete, and update operations, covering coordinating nodes, shard routing, buffer refresh, translog, segment files, commit/flush processes, and the underlying inverted index mechanism.

Elasticsearch · Search Architecture · near real-time
10 min read

Architecture Digest
Jun 25, 2019 · Operations

Design and Implementation of a Unified Monitoring and Alert System for MaFengWo Large Transportation Business

This article describes the motivation, architecture, key components, rule engine, alert actions, and practical lessons learned while building a unified monitoring and alarm system for MaFengWo's large‑scale transportation platform, highlighting data collection, Elasticsearch storage, scheduling, and future enhancements.

Alerting · Elasticsearch · architecture
13 min read

Architect's Tech Stack
Jun 23, 2019 · Big Data

Elasticsearch Interview Questions: Architecture, Indexing, Optimization, and Operations

This article compiles common Elasticsearch interview questions and detailed answers covering cluster architecture, inverted index fundamentals, index design, write/query optimizations, master election, document indexing flow, search process, Linux tuning, and Lucene internals, providing practical guidance for candidates.

Cluster · Elasticsearch · indexing
10 min read

macrozheng
Jun 14, 2019 · Operations

Step‑by‑Step: Deploy the Mall Application on Linux Using Docker

This guide walks you through installing Docker on CentOS 7.6 and deploying a complete mall stack—including MySQL, Redis, Nginx, RabbitMQ, Elasticsearch, MongoDB, and a SpringBoot service—inside Docker containers, configuring volumes, ports, firewalls, and testing the APIs.

Deployment · Docker · Elasticsearch
11 min read

21CTO
Jun 10, 2019 · Databases

Master Elasticsearch in Python: From Installation to Advanced Queries

This tutorial introduces Elasticsearch, explains its architecture and use cases, walks through installation, index creation, mapping, CRUD operations, and demonstrates how to integrate and query Elasticsearch from Python using both the REST API and the official client library.

Elasticsearch · NoSQL · Python
13 min read

dbaplus Community
Jun 2, 2019 · Databases

Why Sharding (Database Partitioning) Beats Partitioning and NoSQL for Massive Data

The article explains why sharding (splitting databases and tables) is the preferred solution for handling massive user, order, and transaction data in high‑traffic internet applications, comparing it with partitioning and NoSQL/NewSQL alternatives, and detailing practical middleware choices, sharding column selection, and integration with Elasticsearch and HBase.

Elasticsearch · HBase · database partitioning
14 min read

Efficient Ops
May 30, 2019 · Operations

How to Supercharge Elasticsearch for Massive Log Analytics: Real-World Optimizations

This article examines the unique characteristics of log data, outlines the challenges of using Elasticsearch at scale, and presents practical optimization techniques—including ingestion, mapping, time‑range search, metadata loading, and a custom C++ engine—to dramatically improve performance, stability, and cost efficiency.

Backend · Elasticsearch · Log Analytics
11 min read

Architecture Digest
May 28, 2019 · Backend Development

Improving Elasticsearch Query Performance for Billion‑Scale Datasets

To boost Elasticsearch query speed on billions of records, allocate sufficient filesystem cache memory, store only searchable fields, separate hot and cold data, warm up cache, avoid complex joins, and replace deep pagination with Scroll API or search_after for millisecond‑level responses.

Elasticsearch · Filesystem Cache · data modeling
10 min read

Fangduoduo Tech
May 25, 2019 · Backend Development

How Fangdd Scales Real‑Estate Search with Elasticsearch: Architecture & Lessons

This article explains how Fangdd leverages Elasticsearch to boost search performance across consumer, broker, and internal products, detailing a platformized architecture that separates indexing and querying, addresses operational challenges, and outlines design patterns for index management and incremental updates.

Elasticsearch · Microservices · Search Architecture
12 min read

Tencent Cloud Developer
May 24, 2019 · Cloud Computing

How Tencent Cloud Elasticsearch Enables Multi‑AZ Disaster Recovery

Tencent Cloud Elasticsearch now supports cross‑availability‑zone deployment, which requires an even number of data nodes, dedicated master nodes, and appropriate replica settings so that service continues when a zone fails. The article gives detailed quick‑setup steps and explains the region limitations.

Elasticsearch · Multi‑AZ · Tencent Cloud
6 min read

dbaplus Community
May 21, 2019 · Big Data

How to Supercharge Elasticsearch Queries on Billions of Records

This article explains why Elasticsearch can be slow on massive datasets, then details practical techniques (leveraging the filesystem cache, pre‑heating hot data, separating hot and cold indices, designing lean document models, and avoiding deep pagination) to achieve sub‑second query performance on billions of records.

Big Data · Elasticsearch · data modeling
11 min read

macrozheng
May 20, 2019 · Backend Development

Integrating Elasticsearch with Spring Boot for Full-Text Product Search

This guide walks through installing Elasticsearch and Kibana, configuring a Chinese analyzer, defining Spring Data Elasticsearch annotations, creating repository and service layers, building a REST controller, and testing product search functionality within a Spring Boot mall application.

Elasticsearch · Full‑Text Search · Spring Boot
14 min read

dbaplus Community
May 9, 2019 · Databases

Exporting Redis Slowlog to Elasticsearch with a Customized rsbeat

This guide explains how to overcome Redis slowlog retention limits by modifying rsbeat to collect and ship slowlog entries—including sentinel and cluster support—to Elasticsearch, where Kibana can be used for detailed analysis and visualization.

Beats · Database Monitoring · Elasticsearch
7 min read

Efficient Ops
May 9, 2019 · Operations

Master ELK Log Processing: Encoding, Multiline, Grok, and Performance Tuning

This article compiles practical ELK knowledge, covering character‑set conversion, removing unwanted log lines, Grok pattern handling for multi‑line logs, multiline plugin usage in Filebeat and Logstash, date filtering, log type classification, performance optimization, Redis buffering, and Elasticsearch node tuning.

ELK · Elasticsearch · Filebeat
16 min read

Tencent Cloud Developer
May 7, 2019 · Databases

A New Era of Cluster Coordination in Elasticsearch 7.0

Elasticsearch 7.0 replaces Zen Discovery with an automatic, quorum‑based cluster‑coordination subsystem that elects a master from the master‑eligible nodes, simplifies bootstrapping via cluster.initial_master_nodes, supports safe rolling upgrades, and provides robust fault tolerance through a consensus protocol similar to Paxos or Raft.

7.0 · Cluster Coordination · Elasticsearch
18 min read

macrozheng
May 5, 2019 · Backend Development

Essential Resources to Master the Technologies Behind a Mall Project

This guide compiles must‑read books and tutorials on Spring, Spring Boot, MyBatis, MySQL, Linux, Elasticsearch, MongoDB, Docker and related tools, helping developers quickly acquire the knowledge needed to build and deploy a complex e‑commerce mall application.

Docker · Elasticsearch · Learning Resources
5 min read

macrozheng
May 4, 2019 · Backend Development

Explore the Full‑Featured Mall E‑Commerce System Built with Spring Boot & MyBatis

The Mall project is a comprehensive e‑commerce solution featuring a front‑end storefront and back‑office management, implemented with Spring Boot, MyBatis, and a suite of modern technologies such as Redis, Elasticsearch, RabbitMQ, Docker, and more, offering modules for products, orders, marketing, and analytics.

Docker · Elasticsearch · MyBatis
4 min read

Tencent Cloud Developer
Apr 29, 2019 · Operations

Introduction to Elastic Stack and Building an Automated Log Monitoring System

This guide explains how to combine Tencent Cloud Elasticsearch with the Elastic Stack—Filebeat, Logstash, and Kibana—to automatically collect JSON‑formatted logs from development workflows, route them to dynamically created indices, and visualize status dashboards, while highlighting best‑practice tips for schema design, deduplication, and future scaling.

Beats · Elastic Stack · Elasticsearch
6 min read

Youzan Coder
Apr 27, 2019 · Big Data

Recap of Elastic Community Technical Salon: Cluster Management, Multi‑Tenant Practices, and Search Platform Engineering

On April 27, Youzan Technology and the Elastic Chinese community hosted a “starry sky” technical salon where experts from Getui, Ant Financial, Haipai Ke and Youzan presented four talks on large‑cluster proxy management, multi‑tenant ES optimization, search‑platform engineering, and the evolution of Youzan’s log platform, followed by lively Q&A and resource sharing.

Cluster Management · Elasticsearch · Log Analytics
6 min read

Efficient Ops
Apr 21, 2019 · Backend Development

Mastering Elasticsearch: From Inverted Index to Distributed Search

This article walks through the fundamentals of search engines, explaining inverted indexes, the explosion of index size, core Elasticsearch concepts, its distributed architecture, and how it powers the ELK stack for log analysis, all illustrated with clear diagrams and examples.

Backend · Distributed Systems · ELK
6 min read

JD Retail Technology
Apr 18, 2019 · Big Data

Data Heterogeneity with BinLake, Binlog, and Flink: Approaches for Order, Subscription, and Product Data

The article explains how data heterogeneity is achieved using JD's BinLake to capture MySQL binlogs, with Flink handling sequential and parallel consumption for order, subscription, and product data, discussing challenges such as ordering guarantees, idempotency, IO overhead, and the shift toward stream‑processing architectures.

Binlog · Elasticsearch · Flink
5 min read

Youzan Coder
Apr 17, 2019 · Big Data

Order Data Synchronization Architecture at YouZan: From MySQL to ES and HBase

YouZan’s order data synchronization moves changes from MySQL through Canal‑parsed binlogs into a message queue, then uses sequential SeqNo‑based optimistic locking and HBase’s column‑version timestamps to guarantee ordering for both single‑ and multi‑table updates. A Logstash‑style configurable pipeline feeds ES for search and HBase for detail queries, eliminating ordered‑queue bottlenecks and ensuring high‑throughput consistency.

Binlog · Canal · Distributed Systems
12 min read

dbaplus Community
Apr 16, 2019 · Big Data

Scaling Elasticsearch for Billions of Daily Events: Cluster Planning, Routing & Hot‑Warm Tips

This article explains how to handle a real‑time OLAP monitoring platform processing 10‑12 billion daily events and 400 billion yearly records by optimizing Elasticsearch 5.3.3 through cluster planning, storage strategies, index sharding, compression, hot‑warm architecture, routing, index templates, rollover, and cross‑cluster search, providing concrete configurations and code examples.

Big Data · Cluster Planning · Elasticsearch
23 min read

Programmer DD
Apr 16, 2019 · Big Data

Solr vs Elasticsearch: Which Full‑Text Search Engine Wins?

This article explains the fundamentals of full‑text search, compares Solr, Elasticsearch and their underlying Lucene library, discusses when to choose each engine, and provides practical guidance for developers facing unstable search services or needing scalable, distributed indexing solutions.

Elasticsearch · Full‑Text Search · Solr
18 min read

Youzan Coder
Apr 12, 2019 · Industry Insights

How Youzan Scaled Its Log Platform to Handle Billions of Daily Logs

This article details Youzan's evolution from a simple Flume‑based log collector to a multi‑tenant, Kafka‑buffered, Spark‑processed, HBase‑backed logging architecture that now handles hundreds of billions of log entries per day, highlighting challenges, design decisions, and future improvements.

Distributed Systems · Elasticsearch · HBase
10 min read

Youzan Coder
Apr 7, 2019 · Industry Insights

How Youzan Scaled Order Search: Hot‑State Indexing and AKF Expansion

This article reviews the evolution of Youzan's order search architecture over two years, detailing challenges from data growth, the creation of a hot‑state index covering half of search traffic, time‑sharded indexes, and the AKF expansion cube that guides multi‑axis scalability.

Big Data · Elasticsearch · Scalability
10 min read

Architecture Talk
Jan 8, 2019 · Big Data

Boost Elasticsearch Performance: Bulk API, Gateway & Caching Secrets

This article explains how to dramatically improve Elasticsearch throughput by using the bulk API, tuning bulk request sizes, configuring gateway settings, optimizing cluster state updates, managing caches, leveraging fielddata and doc values, and employing tools like Curator and the Profiler for efficient cluster operations.

Cluster Management · Elasticsearch · bulk API
27 min read

Didi Tech
Jan 7, 2019 · Big Data

Didi's Multi-Cluster Elasticsearch Architecture: Challenges and Practices

Didi transformed its massive single‑cluster Elasticsearch deployment into a transparent multi‑cluster architecture using TribeNode and cross‑cluster search, isolating workloads, reducing fault impact, and achieving five‑fold scale while preserving a single‑cluster appearance for services, despite added configuration complexity and stability challenges.

Didi · Distributed Search · Elasticsearch
17 min read

dbaplus Community
Jan 3, 2019 · Backend Development

Supercharging Elasticsearch for Billion-Row Queries: Practical Tips

This guide details how to optimize Elasticsearch for handling billions of daily records, covering core Lucene concepts, index and shard configuration, performance‑tuning parameters, and practical testing methods to achieve sub‑second query responses and long‑term data retention.

Big Data · Elasticsearch · Search
13 min read

MaGe Linux Operations
Dec 30, 2018 · Operations

Step‑by‑Step Guide to Building an ELK Stack on CentOS 6.7

This tutorial walks you through setting up Java, Elasticsearch 2.1.0, Logstash 2.1.1, Kibana 4.3.1, and NGINX on a CentOS 6.7 server, configuring each component, linking them together, and troubleshooting common time‑zone issues so you can visualize logs with Kibana.

CentOS · ELK · Elasticsearch
8 min read

vivo Internet Technology
Dec 28, 2018 · Big Data

Running a 400+ Node Elasticsearch Cluster: Architecture, Scaling, and Performance Tuning

Meltwater’s media‑monitoring platform runs a custom Elasticsearch 1.7.6 cluster of over 400 nodes on AWS, handling 200 TB of primary data and 3 million daily documents while serving thousands of complex queries per minute, achieved through careful shard design, master‑node configuration, extensive performance tuning, and automated provisioning.

AWS · Cluster Management · Elasticsearch
13 min read

vivo Internet Technology
Dec 28, 2018 · Cloud Native

Curated Technical Resources for Vivo Mobile Internet (Elasticsearch, Jenkins, Kubernetes, Service Mesh, Big Data, Java, Spring Cloud)

This page curates a collection of high‑quality technical articles and tutorials for developers working within the Vivo Mobile Internet ecosystem, covering Elasticsearch performance and search tuning, Jenkins CI/CD pipelines, Kubernetes scheduling and TensorFlow, service‑mesh resources, SparkSQL big‑data optimizations, Java concurrency, Quick‑App development, and Spring Cloud microservice frameworks.

Elasticsearch · Jenkins · Kubernetes
6 min read

dbaplus Community
Dec 27, 2018 · Operations

How JD Daojia Scaled Its Order Search with a Real‑Time Dual Elasticsearch Cluster

This article details how JD Daojia’s order center migrated from MySQL‑only reads to a multi‑stage Elasticsearch architecture, describing each evolution step, data‑sync strategies, performance pitfalls, and the final real‑time active‑passive cluster that ensures high availability for billions of daily queries.

Cluster Architecture · Elasticsearch · data-sync
14 min read

Didi Tech
Dec 26, 2018 · Cloud Native

Top Developer Tools of 2018: A Comprehensive Overview

The 2018 developer‑tool roundup highlights Elasticsearch for log processing, gRPC for high‑performance RPC, the CNCF ecosystem (Kubernetes, Prometheus, etc.), Python’s AI dominance, cross‑platform Mini‑Programs, VSCode’s plugin‑rich IDE, Vue.js front‑end simplicity, GraphQL’s flexible APIs, and notes a shift toward mobile, cloud‑native infrastructure and commercial open‑source licensing.

CNCF · Elasticsearch · Python
9 min read

Ops Development Stories
Dec 21, 2018 · Operations

How to Install Zabbix Server, MySQL, Nginx, PHP, and Elasticsearch on CentOS

This comprehensive tutorial walks you through adding the Zabbix repository, installing Zabbix server and web interface, setting up MySQL 5.7, configuring Nginx and PHP from source, deploying the Zabbix agent, installing Elasticsearch with the head plugin, and finally storing Zabbix history data in Elasticsearch on a CentOS system.

CentOS · Elasticsearch · PHP
20 min read

Qunar Tech Salon
Dec 18, 2018 · Big Data

Practical Insights on Deploying and Operating Elasticsearch at Scale

This article shares extensive practical experience from Qunar's large‑scale Elasticsearch deployment, covering suitable use cases, index‑type design, document ID strategies, scaling considerations for index and data volume, hardware sizing, and storage architecture recommendations to help newcomers avoid common pitfalls.

Big Data · Elasticsearch · Search
10 min read

Youzan Coder
Dec 10, 2018 · Backend Development

How Youzan Scaled Order Export to Millions with ES, HBase, and Config‑Driven Design

This article examines the challenges of Youzan's order export system, describes the migration from PHP‑based scripts to an Elasticsearch and HBase stack, and details the step‑by‑step configuration‑driven refactor—including enum field definitions, Groovy scripts, strategy patterns, plugin architecture, and quality‑assurance practices—that enabled million‑order exports with high performance and stability.

Backend Architecture · Configuration · Elasticsearch
13 min read

Sohu Tech Products
Dec 5, 2018 · Backend Development

Overview of Web Crawler Types and the Architecture of the Mole Crawler System

This article explains the evolution and classification of web crawlers, describes the design and components of the Mole distributed crawler—including scheduler, fetcher, processor, rate‑limiting, URL deduplication, and Elasticsearch storage optimization—and outlines common anti‑anti‑crawling strategies.

Elasticsearch · Web Crawler · anti‑crawling
12 min read

21CTO
Dec 3, 2018 · Operations

How JD Daojia Scaled Its Elasticsearch Cluster to Billions of Docs: Lessons and Pitfalls

This article details JD Daojia's order center Elasticsearch architecture evolution—from a chaotic initial deployment to a real‑time dual‑cluster backup—covering scaling strategies, data synchronization methods, and the practical pitfalls encountered along the way.

Cluster Architecture · Elasticsearch · data synchronization
14 min read

JD Tech
Dec 3, 2018 · Backend Development

Evolution of JD.com Order Center Elasticsearch Cluster Architecture

This article details how JD.com’s order center migrated its Elasticsearch cluster through multiple stages—from an initial unoptimized deployment to a real‑time dual‑cluster backup solution—addressing scalability, reliability, shard tuning, version upgrades, and data synchronization strategies to support billions of documents and hundreds of millions of daily queries.

Cluster Architecture · Elasticsearch · JD.com
13 min read

Dada Group Technology
Nov 30, 2018 · Big Data

Evolution of JD Daojia Order Center Elasticsearch Cluster: Architecture, Scaling, and Lessons Learned

This article details how JD Daojia's order center migrated from MySQL to a multi‑stage Elasticsearch cluster—covering initial deployment, isolation, replica tuning, primary‑secondary setup, real‑time dual‑cluster upgrades, data synchronization methods, and key pitfalls—to achieve massive scalability, high availability, and performance for billions of orders.

Cluster Architecture · Elasticsearch · Scalability
13 min read

Beike Product & Technology
Nov 23, 2018 · Backend Development

Elasticsearch Internals: Distributed Document Storage, Real‑time Search, and Translog Mechanics

This article explains the core Elasticsearch architecture—including shard routing, primary‑replica interaction, document CRUD workflows, multi‑document APIs, segment merging, translog durability, and storage file formats—providing a comprehensive view of how near‑real‑time search is achieved on large‑scale data.

Elasticsearch · Segment Merging · distributed storage
20 min read

vivo Internet Technology
Nov 16, 2018 · Artificial Intelligence

Efficient Vector Search with Deep Learning Embeddings in Elasticsearch

The article explains how to replace keyword matching with deep‑learning document embeddings in Elasticsearch by applying PCA dimensionality reduction, indexing vectors using Lucene’s KD‑tree structures via a custom plugin, and leveraging FAISS‑style nearest‑neighbour techniques to achieve fast, semantically aware similarity search.

Deep Learning · Elasticsearch · FAISS
7 min read

DataFunTalk
Nov 9, 2018 · Backend Development

From Zero to One: Building and Optimizing Search Engines with Elasticsearch – Insights and Case Studies

This article presents a comprehensive overview of constructing a search engine using Elasticsearch, covering architecture components, data read/write mechanisms, shard management, caching strategies, and real‑world case studies that illustrate performance tuning, isolation, and deployment best practices.

Distributed Systems · Elasticsearch · backend-development
14 min read

JD Tech
Nov 5, 2018 · Operations

Practical Guide to Elasticsearch Monitoring and Operations

This article provides a comprehensive, operations‑focused overview of Elasticsearch monitoring, covering tool selection, key metrics for black‑box and white‑box monitoring, common issues discovered through alerts, and practical optimization recommendations to ensure high availability of ES clusters.

Elasticsearch · SRE · tools
8 min read

360 Quality & Efficiency
Oct 12, 2018 · Backend Development

Automated Testing and Monitoring Solution for DSP Advertising Business

The article outlines a comprehensive automated testing framework for a DSP advertising platform, covering income, interface, and log layers, and detailing the use of protobuf, Pytest, Logstash, Elasticsearch, Jenkins, and Allure to achieve efficient, real‑time quality assurance and continuous integration.

Automated Testing · DSP · Elasticsearch
8 min read

21CTO
Oct 12, 2018 · Backend Development

How Elastic’s IPO Mirrors the Rise of Open‑Source Search Engines

The article chronicles Elastic’s journey from a small open‑source search tool to a NYSE‑listed company, explaining Elasticsearch’s technical foundations, its real‑world applications, and what the IPO means for developers and the broader search‑technology ecosystem.

Elasticsearch · IPO · backend-development
0 likes · 10 min read
ITPUB
Oct 8, 2018 · Big Data

From Open‑Source Search to a Billion‑Dollar IPO: The Elastic Story

This article explores Elastic's NYSE debut and rapid stock surge, the origins and technical strengths of Elasticsearch, and what the public listing means for developers and tech entrepreneurs, tracing the journey from a personal recipe‑search tool to a global data‑search platform.

Elasticsearch · IPO · elastic
0 likes · 9 min read
Programmer DD
Oct 6, 2018 · Big Data

Elastic Search IPO: What It Means for Search and Big Data

Elastic announced its IPO on the NYSE under the ticker ESTC. The article covers the company's origins, its growth to more than 5,000 customers worldwide, its FY2018 revenue of $160 million, and the Elastic Stack suite that powers search and analytics across industries, and notes the stock surge that investors celebrated.

Big Data · Elasticsearch · IPO
0 likes · 6 min read
Ops Development Stories
Oct 3, 2018 · Operations

Master Logstash: Essential Commands and Top Log Collection Plugins

This guide walks through Logstash fundamentals, from creating basic pipelines with input, filter, and output sections to using common plugins such as grok, mutate, date, geoip, multiline, and integrations with NGINX, rsyslog, Redis, and Docker‑based Logspout, providing practical configuration examples and command‑line tips.

Docker · Elasticsearch · Logstash
0 likes · 27 min read
HomeTech
Sep 25, 2018 · Operations

Design and Implementation of an Integrated Log Collection, Analysis, and Monitoring System

This article describes how a rapidly growing technical team built a unified log system that consolidates program, web access, and slow logs, introduces host‑agent and process‑agent collection, leverages Kafka, Elasticsearch, and Storm for high‑throughput processing, and provides monitoring, alerting, and reporting features to improve reliability and operational efficiency.

Big Data · Elasticsearch · Log Management
0 likes · 20 min read
Youzan Coder
Sep 14, 2018 · Big Data

Elasticsearch Optimization and Index Splitting Strategies in the Youzan Search System

The Youzan search system uses middleware‑driven Elasticsearch optimizations—segment merging, larger buffers, routing, and rollover—to cut index files and document scans, splits large indices into business‑specific or hot‑cold sub‑indices, and adds asynchronous cross‑datacenter replication with soft‑delete versioning for high‑availability and scalable performance.

Elasticsearch · Hot/Cold Isolation · Index Optimization
0 likes · 10 min read
System Architect Go
Sep 3, 2018 · Fundamentals

Understanding Elasticsearch Analyzer, Tokenizer, and Token Filters

This article explains the core components of Elasticsearch's full‑text search analysis—Analyzers, Tokenizers, and Token Filters—detailing their roles, building blocks, built‑in types, and how they combine to customize text processing for effective indexing and querying.

Elasticsearch · Full‑Text Search · Token Filter
0 likes · 5 min read
Youzan Coder
Aug 31, 2018 · Big Data

Evolution of Youzan Search Platform Architecture: From 1.0 to 4.0

The Youzan Search Platform evolved from a simple Elasticsearch cluster in 2015 to a modular, message‑driven architecture with proxy validation, caching, and management tools, and now plans a cloud‑native, Kubernetes‑based 4.0 version that automates data sync, isolates workloads, and scales elastically to support billions of records.

Data Integration · Elasticsearch · Proxy
0 likes · 14 min read
360 Tech Engineering
Aug 29, 2018 · Operations

Monitoring Elasticsearch Performance: Host‑Level System and Network Metrics, Cluster Health, and Resource Saturation

This article continues the Elasticsearch performance monitoring series by detailing host‑level system and network metrics, cluster health and node availability, resource saturation, and related errors, providing practical guidance on disk space, I/O, CPU, network throughput, file descriptors, HTTP connections, thread pools, caches, pending tasks, and failed GET requests.

Elasticsearch · Operations · Performance Monitoring
0 likes · 14 min read
System Architect Go
Aug 4, 2018 · Databases

Synchronizing MySQL Data to Elasticsearch: Methods and Practices

This article reviews various approaches for keeping MySQL data in sync with Elasticsearch, including direct business‑layer hooks, independent synchronization via plugins or custom scripts, and real‑time binlog subscription using tools like zongji, while discussing their advantages, drawbacks, and implementation details.

Binlog · Elasticsearch · Plugins
0 likes · 4 min read
Youzan Coder
Aug 4, 2018 · Databases

Key Takeaways from the 2018 Youzan PaaS Tech Meetup: TiDB, Zankv & Search

The 2018 Youzan PaaS meetup in Hangzhou featured deep dives into TiDB's distributed SQL architecture and performance gains in version 2.1, introduced the open‑source Zankv KV store built on Raft, and shared practical Elasticsearch search‑engine optimizations used at Youzan.

Elasticsearch · KV Store · Tech Meetup
0 likes · 4 min read
iQIYI Technical Product Team
Aug 3, 2018 · Backend Development

Integrating Spring Data with Elasticsearch: Features, Use Cases, and Repository Loading Mechanism

The article explains how Spring Data provides a unified, Spring‑compatible data‑access layer for relational and NoSQL stores, illustrates its features and typical use cases, and walks through a practical Spring Data Elasticsearch integration—including configuration, entity and repository definitions—and details the dynamic proxy loading mechanism behind Spring Data repositories.

Backend · Elasticsearch · NoSQL
0 likes · 14 min read
System Architect Go
Jul 29, 2018 · Databases

What Is Elasticsearch? Core Concepts and Fundamentals

Elasticsearch is an open‑source, scalable, high‑availability distributed full‑text search engine that operates in near real‑time, using clusters of nodes, indexes, documents, shards and replicas to efficiently store and retrieve large volumes of data.

Cluster · Distributed Systems · Elasticsearch
0 likes · 4 min read
Efficient Ops
Jul 22, 2018 · Operations

Essential ELK Stack Tools to Boost Your DevOps Efficiency

This guide presents a comprehensive overview of essential ELK Stack utilities—including head plugins, Kibana, ElasticHD, Cerebro, security extensions, visualization platforms, automation frameworks, and alerting solutions—complete with brief feature descriptions and direct links, helping developers and operations teams select the right tools to enhance development, monitoring, and maintenance efficiency.

ELK · Elasticsearch · Kibana
0 likes · 8 min read
MaGe Linux Operations
Jul 19, 2018 · Databases

Master Elasticsearch with Python: From Installation to Advanced Queries

This tutorial walks you through installing Elasticsearch, creating indices, inserting and updating documents, performing searches via REST API, and integrating Elasticsearch into Python applications using both raw HTTP requests and the official Python client, illustrated with practical examples and screenshots.

Elasticsearch · NoSQL · REST API
0 likes · 11 min read
Ctrip Technology
Jul 10, 2018 · Databases

Designing Real‑Time Sharding Index and Replication with Elasticsearch for High‑Performance Order Queries

This article describes how Ctrip's hotel R&D team tackled growing order‑volume challenges by sharding the database, building a real‑time Elasticsearch index, implementing a custom replication pipeline, and applying various write‑ and read‑optimizations to achieve low latency and stable performance.

Elasticsearch · Replication · SQL Server
0 likes · 10 min read
Architecture Digest
Jul 9, 2018 · Databases

Elasticsearch Index Design and Sharding Principles

This article outlines practical guidelines for designing Elasticsearch indices, comparing single versus time‑based indexes, detailing mapping settings, shard allocation strategies, and deduplication methods, while providing concrete examples and code snippets for effective search infrastructure management.

Elasticsearch · data modeling · index design
0 likes · 5 min read
Beike Product & Technology
Jun 22, 2018 · Big Data

Beike Zhaofang's 秒X Real‑Time Analytics Platform: Architecture, Implementation, and Use Cases

The article details the design and deployment of the 秒X real‑time analytics platform at Beike Zhaofang, covering its background, Spark Streaming‑based architecture, fast configuration, data processing pipeline, monitoring, visualization, practical applications, and future development plans.

Druid · Elasticsearch · Real-time analytics
0 likes · 7 min read
Architecture Digest
Jun 15, 2018 · R&D Management

Elasticsearch Team Development Constitution: Principles and Guidelines for Sustainable Software Development

The Elasticsearch Team Development Constitution outlines a comprehensive set of principles and guidelines—covering design philosophy, code quality, interaction etiquette, and organizational responsibilities—to help the rapidly growing project evolve into a reliable, secure, scalable, and user‑friendly distributed search engine.

Elasticsearch · Scalability · code quality
0 likes · 19 min read
21CTO
May 27, 2018 · Big Data

How to Install Elasticsearch 5.6.9 and Perform Advanced Data Aggregations with Java

This guide walks you through downloading and configuring Elasticsearch 5.6.9, setting system limits, creating indices, inserting and deleting documents, executing complex aggregation queries via HTTP, and integrating Elasticsearch with Java using the Transport client for powerful data analysis.

Elasticsearch · Installation · data aggregation
0 likes · 13 min read
Architecture Digest
May 27, 2018 · Big Data

Installing Elasticsearch and Performing Data Aggregation Queries

This article walks through installing Elasticsearch 5.6.9, configuring system limits, creating indices, inserting and deleting documents, executing complex aggregation queries, and integrating Elasticsearch with Java using the TransportClient, providing a practical guide for building analytics on large‑scale data.

Analytics · Big Data · Elasticsearch
0 likes · 12 min read
UCloud Tech
Apr 18, 2018 · Big Data

How Elasticsearch Powers Billion‑Record Log Analysis and Full‑Text Search

This article explains how Elasticsearch and the ELK stack address challenges of storing, securing, retrieving, and analyzing massive data volumes by providing distributed real‑time search, log collection, visualization, and even serving as a NoSQL alternative for large‑scale applications.

Big Data · ELK · Elasticsearch
0 likes · 7 min read
MaGe Linux Operations
Apr 11, 2018 · Big Data

Master ELK Stack: Install & Configure Elasticsearch, Logstash, Kibana

This step‑by‑step guide walks you through setting up the ELK stack on CentOS 6, covering Elasticsearch repository configuration, Java installation, Elasticsearch tuning, Logstash pipelines, Kibana deployment, Redis integration, and practical log collection for system, Apache, Nginx, and MySQL logs.

ELK · Elasticsearch · Installation
0 likes · 22 min read
Efficient Ops
Apr 8, 2018 · Operations

Why ELK Is the Ultimate Solution for Log Management and Monitoring

This article introduces the ELK stack—Elasticsearch, Logstash, and Kibana—explaining its core components, architecture, comparison with databases and grep, typical use cases across security, networking, and application monitoring, deployment considerations, challenges, SaaS prospects, and recommended learning resources.

ELK · Elasticsearch · Log Management
0 likes · 10 min read
Ctrip Technology
Mar 8, 2018 · Big Data

Ctrip Wireless APM Platform: Architecture, Metrics, and Technical Details

The article describes the evolution of Ctrip's wireless APM platform from the early UBT-based monitoring to a globally‑oriented, metric‑rich system that processes over 100 billion data points daily using Storm and Elasticsearch, detailing its design, key performance dimensions, data‑volume trade‑offs, and implementation choices.

APM · Big Data · Ctrip
0 likes · 12 min read