Tagged articles
297 articles
Page 3 of 3
Practical DevOps Architecture
Nov 16, 2020 · Big Data

Using curl to Perform CRUD Operations in Elasticsearch

This article introduces Elasticsearch as a Lucene‑based distributed search engine and demonstrates how to use curl commands to create, read, update, and delete documents and indices, providing step‑by‑step examples with command‑line output and screenshots.

CRUD · Elasticsearch · REST API
0 likes · 4 min read
Java Architect Essentials
Oct 30, 2020 · Databases

Elasticsearch Essentials: Quick Start, Index Management, Mapping, and Advanced Operations

The article offers a thorough, step‑by‑step guide to Elasticsearch, explaining how to check cluster health, create and manage indices, define mappings and field types, use dynamic mapping, and perform maintenance tasks such as shrink, split, rollover, and cache management, all illustrated with concrete API examples.

Elasticsearch · Index Management · Mapping
0 likes · 17 min read
php Courses
Oct 30, 2020 · Big Data

Introduction to Elasticsearch and Its Integration with Laravel

This article explains Elasticsearch's foundation on Lucene, compares its concepts to MySQL, describes inverted indexing, and provides a step‑by‑step guide for installing, configuring, and using the basemkhirat/elasticsearch Laravel plugin with code examples and tips for Chinese analysis.

Backend · Elasticsearch · Laravel
0 likes · 4 min read
Hulu Beijing
Oct 26, 2020 · Artificial Intelligence

Hulu’s AI Innovations: Graph Neural Networks, Ad Targeting & Content Embeddings

The Hulu AI Class event showcased a series of technical talks covering large‑scale graph neural network optimizations, multi‑factor video ad placement algorithms, recommendation and search engine techniques, machine‑learning‑driven video codec improvements, and advanced content‑embedding methods, highlighting practical engineering experiences from Hulu’s Beijing office.

Ad Targeting · content embedding · machine learning
0 likes · 9 min read
ITPUB
Oct 23, 2020 · Fundamentals

How General Search Engines Work: From Crawlers to Ranking

This article provides a comprehensive overview of general search engines, covering their classification, core workflow, key modules such as web crawlers, content processing, storage, user query handling, ranking strategies like TF‑IDF and PageRank, as well as anti‑cheat measures and user intent understanding.

PageRank · TF-IDF · Web Crawling
0 likes · 16 min read
Architecture Digest
Oct 1, 2020 · Big Data

Elasticsearch Overview: Architecture, Core Concepts, and Performance Optimization

This article provides a comprehensive introduction to Elasticsearch, covering data types, the role of Lucene, cluster architecture, node roles, discovery mechanisms, shard and replica management, mapping, installation, health monitoring, indexing workflow, storage internals, refresh and translog processes, segment merging, and practical performance and JVM tuning tips.

Elasticsearch · Shard · inverted index
0 likes · 35 min read
Architecture Digest
Sep 14, 2020 · Databases

Understanding the Underlying Mechanics of Elasticsearch and Lucene

This article provides a comprehensive, top‑down and bottom‑up explanation of Elasticsearch’s internal architecture, covering clusters, nodes, shards, Lucene segments, inverted indexes, stored fields, document values, caching, merging, routing, scaling, and query processing, while addressing common performance questions.

Elasticsearch · caching · lucene
0 likes · 11 min read
Top Architect
Sep 10, 2020 · Operations

Elasticsearch Performance Tuning Guide: Configuration, System, and Usage Optimizations

This article provides a comprehensive guide to improving Elasticsearch performance and stability by covering configuration file tweaks, system‑level settings, and usage‑level optimizations such as hot‑thread analysis, pending tasks, field storage, translog handling, refresh intervals, shard management, and best practices for routing and alias usage.

Cluster Configuration · System optimization · performance tuning
0 likes · 20 min read
Architecture Digest
Sep 3, 2020 · Databases

Practical Elasticsearch Performance and Stability Tuning Guide

This article consolidates practical Elasticsearch tuning techniques—including configuration file adjustments, system‑level optimizations, and usage‑level settings—to improve cluster performance, stability, and resource efficiency for production environments.

Big Data · Cluster Configuration · Elasticsearch
0 likes · 15 min read
Programmer DD
Aug 23, 2020 · Databases

What’s New in Elasticsearch 7.9.0? Key Security Fixes and Feature Updates

Elasticsearch 7.9.0 introduces critical security patches for field‑level leakage, updates script cache limits, refines field capabilities, improves snapshot restore throttling, expands thread‑pool write queue, deprecates dangling indices, and addresses known issues like mapping errors in machine‑learning indices.

Elasticsearch · search engine
0 likes · 4 min read
Sohu Tech Products
Aug 12, 2020 · Big Data

Elasticsearch Basics: Concepts, Installation, and Search Operations

This article introduces Elasticsearch as a distributed open‑source search and analytics engine, explains its core concepts and architecture, compares it with relational databases, details installation steps, configuration, indexing, analyzers, query DSL, pagination, sorting, and provides practical examples for building search functionality.

Analyzers · Elasticsearch · Installation
0 likes · 22 min read
Programmer DD
Aug 10, 2020 · Big Data

Master ElasticSearch: How Its Distributed Architecture Powers Scalable Search

Elasticsearch achieves distributed search by organizing data into indices, types, mappings, documents, and fields, and by splitting each index into primary and replica shards spread across multiple nodes. Automatic master election and shard allocation enable horizontal scaling, high availability, and better performance for large‑scale data workloads.

distributed architecture · search engine · sharding
0 likes · 7 min read
Programmer DD
Aug 8, 2020 · Artificial Intelligence

How Elasticsearch Handles Write, Read, and Search: Inside the Engine

This article explains Elasticsearch's internal mechanisms for indexing, querying, and retrieving data, covering the roles of coordinating nodes, primary and replica shards, the refresh and commit cycles, near‑real‑time search, and the underlying Lucene inverted index.

Elasticsearch · data ingestion · indexing
0 likes · 12 min read
Laravel Tech Community
Jul 29, 2020 · Backend Development

Elasticsearch 7.8.1 Release Highlights and New Features

Elasticsearch 7.8.1 introduces a range of new capabilities such as literal SUM/MIN/MAX/AVG in SQL, enhanced authorization for apm_user, updated index creation logging, composable template renaming, additional machine‑learning aggregations, snapshot/restore optimizations, and an improved update API, all aimed at boosting search and analytics performance.

7.8.1 · Elasticsearch · features
0 likes · 2 min read
Meituan Technology Team
Jul 16, 2020 · Artificial Intelligence

Augur: An Online Model Inference Framework and Poker Platform for Meituan Search

Meituan’s AI‑driven search combines the Augur online inference framework—offering stateless, distributed feature operators, transformers, and a DSL for rapid, high‑throughput model scoring—with the Poker platform for model training, versioning, and experimentation, together accelerating iteration, improving performance, and enabling advanced model‑as‑feature ensembles.

AI Platform · Model Serving · feature engineering
0 likes · 26 min read
DataFunTalk
Jul 16, 2020 · Big Data

Elasticsearch Practices and Platform Construction at 58.com

This article details 58.com’s extensive use of Elasticsearch for search, analytics, and log processing, covering cluster optimization challenges, typical issues like disk exhaustion and write slowdown, practical solutions, development standards, ELKB architecture, real‑time log and MySQL slow‑log applications, platform‑as‑a‑service construction, and future roadmap plans.

Cluster Management · Elasticsearch · Log Analytics
0 likes · 17 min read
Programmer DD
Jul 10, 2020 · Fundamentals

How Search Engines Work: Inside Document and Query Processing

This article explains the core components of a search engine—document processing, query processing, and matching—detailing each step from indexing to ranking, and discusses the document features that influence relevance, providing a comprehensive overview of information retrieval fundamentals.

Document Processing · Query Processing · information retrieval
0 likes · 20 min read
Top Architect
Jun 4, 2020 · Big Data

Elasticsearch Deployment and Use Cases in Major Chinese Companies

This article reviews how leading Chinese internet companies such as JD.com, Ctrip, Qunar, 58.com, and Didi have adopted Elasticsearch for large‑scale order search, log analysis, real‑time monitoring, and security, describing the evolution of cluster architectures, shard strategies, multi‑cluster pipelines, and performance optimizations.

Big Data · Elasticsearch · Scalability
0 likes · 12 min read
MaGe Linux Operations
Jun 1, 2020 · Backend Development

Mastering Elasticsearch Analyzers: A Deep Dive into Tokenizers and Filters

This article explains how Elasticsearch uses Analyzer components—character filters, tokenizers, and token filters—to perform text analysis, reviews the built‑in analyzers such as standard, simple, stop, whitespace, keyword, pattern, language, ICU and IK, and provides practical _analyze API examples with code snippets and result screenshots.

Elasticsearch · ICU Plugin · IK Analyzer
0 likes · 11 min read
Architect
May 22, 2020 · Databases

Performance Analysis of Elasticsearch Queries: Lucene Internals and Benchmark Results

This article examines Elasticsearch query performance by explaining Lucene's underlying data structures, describing how composite queries are merged, and presenting benchmark numbers for various query types such as term, range, and combined queries, highlighting optimization techniques and practical conclusions.

BKD-Tree · Elasticsearch · Query Performance
0 likes · 13 min read
Architect
May 16, 2020 · Big Data

Master/Slave Architecture vs P2P Ring Structure and an Overview of Elasticsearch

This article explains the differences between Master‑Slave and P2P ring architectures, introduces Elasticsearch’s core concepts, internal components, master election, shard routing, indexing and search processes, and discusses how the system avoids split‑brain scenarios and ensures high availability.

Elasticsearch · Master‑Slave · P2P
0 likes · 17 min read
Java Captain
May 8, 2020 · Big Data

Elasticsearch Adoption and Architecture Cases in Major Chinese Companies

The article surveys how leading Chinese tech firms such as JD Daojia, Ctrip, Qunar, 58.com, and Didi have adopted Elasticsearch for large‑scale search, real‑time analytics, and security, detailing their evolving cluster architectures, shard strategies, data volumes, and supporting services.

Big Data · Distributed Systems · Elasticsearch
0 likes · 11 min read
DataFunTalk
May 7, 2020 · Artificial Intelligence

Comprehensive Overview of Query Understanding in Search Engines

Query understanding (QU) involves lexical, syntactic, and semantic analysis of user queries to enable effective search recall and ranking, covering modules such as preprocessing, correction, expansion, segmentation, intent detection, term importance, and guidance, with detailed discussion of algorithms, models, and system architecture.

NLP · Query Understanding · information retrieval
0 likes · 51 min read
dbaplus Community
May 5, 2020 · Databases

Why Elasticsearch Beats Its Competitors: A Deep Technical Comparison

This article offers a detailed, experience‑driven comparison of Elasticsearch against its main rivals—Lucene, Solr, relational databases, OpenTSDB, HBase, MongoDB, ClickHouse, and Druid—highlighting where Elasticsearch excels, where it falls short, and practical guidance for choosing the right data solution.

Comparison · Elasticsearch · Solr
0 likes · 15 min read
Programmer DD
Apr 12, 2020 · Big Data

Master Elasticsearch: From Basics to SpringBoot Integration and Advanced Queries

This comprehensive guide introduces Elasticsearch fundamentals, its features and use cases, then walks through integrating it with SpringBoot, configuring Maven dependencies, performing index and document operations, and demonstrates a variety of query types and aggregations using both RESTful APIs and Java code examples.

Big Data · Elasticsearch · Full‑Text Search
0 likes · 46 min read
360 Quality & Efficiency
Mar 24, 2020 · Databases

Introduction to Elasticsearch: Core Concepts, Use Cases, and Practical Operations

This article introduces Elasticsearch by explaining its core concepts such as indices, types, documents, mappings, and Query DSL, demonstrates common use cases, and provides step‑by‑step instructions for creating, updating, viewing, and deleting indices and documents using RESTful APIs, curl commands, and Docker‑compose deployment.

CRUD · Docker · Elasticsearch
0 likes · 5 min read
Big Data Technology Architecture
Mar 7, 2020 · Operations

How to Perform a Graceful Shutdown of an Elasticsearch Node

This article outlines a step‑by‑step procedure for safely taking an Elasticsearch node offline—checking master‑eligible settings, adjusting minimum_master_nodes, excluding the node from routing, waiting for shard relocation, stopping the service, and restoring the cluster routing—ensuring no data loss or service interruption.

Cluster Management · DevOps · Elasticsearch
0 likes · 6 min read
Architects Research Society
Jan 16, 2020 · Big Data

Elasticsearch vs Solr: Choosing the Right Open‑Source Search Engine

This article compares Elasticsearch and Solr, examining their history, community, licensing, core technologies, APIs, scalability, vendor support, ecosystem, performance, management tools, and visualization options to help organizations decide which open‑source search engine best fits their big‑data and search requirements.

Big Data · Elasticsearch · Solr
0 likes · 12 min read
DevOps Cloud Academy
Jan 2, 2020 · Big Data

Introduction, Use Cases, Installation, and Basic Operations of Elasticsearch

This article introduces Elasticsearch as a distributed search and analytics engine, outlines its common application scenarios, provides step‑by‑step installation commands, explains core concepts such as documents and indices, and demonstrates basic indexing, retrieval, bulk processing, and aggregation operations.

Distributed · Elasticsearch · Log Analytics
0 likes · 4 min read
macrozheng
Dec 20, 2019 · Big Data

How to Supercharge Elasticsearch for Billion‑Row Queries: Practical Optimization Guide

This article explains the architecture of Elasticsearch and Lucene, outlines common performance bottlenecks, and provides concrete indexing and search optimization techniques—including bulk writes, shard routing, doc values tuning, and pagination strategies—to achieve sub‑second query responses on billions of records.

Big Data · Elasticsearch · lucene
0 likes · 14 min read
DataFunTalk
Nov 28, 2019 · Artificial Intelligence

Web Data Mining and Page Analysis Techniques for Search Engines

This article explains how search engines collect, analyze, and rank web pages by describing the spider system, HTML and layout tree construction, feature extraction, and machine‑learning based classification methods used to understand page content and improve result relevance.

HTML tree · feature extraction · layout tree
0 likes · 8 min read
Tencent Cloud Developer
Nov 26, 2019 · Backend Development

TurboSearch: Tencent AI Lab's Next-Generation Large-Scale Search System

TurboSearch is Tencent AI Lab's next-generation large-scale search system, delivering distributed massive indexing, high-performance parallel retrieval, multi-granularity and multi-modal vector indexing, private Docker deployment, integrated NLP query analysis, extensible plugins, and robust operations for massive data and diverse search scenarios.

NLP · Tencent AI Lab · TurboSearch
0 likes · 14 min read
HomeTech
Nov 20, 2019 · Artificial Intelligence

Query Understanding and Intent Recognition in Search: Methods, Taxonomy, and Applications

This article explains how query understanding (QP) transforms user search queries into structured semantic blocks and intent categories using rule‑based NLP, entity recognition, and post‑processing, and describes its taxonomy, implementation details, and practical impact on search engine results.

NLP · Query Understanding · intent recognition
0 likes · 16 min read
21CTO
Nov 16, 2019 · Fundamentals

From Early Crawlers to ByteDance: A History of Web Scraping

This article traces the evolution of web crawlers—from early Perl scripts to modern ByteDance agents—explaining their role in search engines, business models, anti‑crawling measures, and the impact on content creation and competition.

Web Crawling · content aggregation · data-scraping
0 likes · 6 min read
Architecture Digest
Nov 15, 2019 · Big Data

Design and Key Technologies of the 360 Search Engine for Billion‑Scale Web Retrieval

This article explains how 360 Search handles billions of daily crawls and hundred‑billion‑scale indexing by describing its overall architecture, core modules such as offline indexing and online retrieval, query analysis, relevance scoring, and the engineering techniques that enable efficient large‑scale web search.

information retrieval · large-scale indexing · ranking
0 likes · 22 min read
JD Retail Technology
Nov 6, 2019 · Artificial Intelligence

Technical Overview of JD.com Search and Recommendation Systems for the 11.11 Shopping Festival

The article details JD.com's internally developed distributed search engine and recommendation platform, their new architectures, deep‑learning‑driven ranking and recall models, component‑based deployment, extensive performance testing, and coordinated operations that powered the massive 11.11 shopping event.

Deep Learning · Operations · Performance Testing
0 likes · 5 min read
Xianyu Technology
Aug 28, 2019 · Big Data

Unified Search System Architecture and Automation for Multiple Business Scenarios

To avoid building separate search services for each Xianyu business, the team created a unified, generic search architecture based on Alibaba’s HA3 engine and a control layer that automates data dumping, indexing, query translation, and result ranking across five subsystems, enabling new services to be onboarded in minutes instead of weeks.

Big Data · automation · data pipeline
0 likes · 18 min read
21CTO
Aug 18, 2019 · Artificial Intelligence

Can ByteDance’s New Search Engine Challenge Baidu’s Dominance?

ByteDance has entered the Chinese search market with its new mobile‑first “Toutiao Search”, assembling a team from top tech firms and leveraging AI, NLP and computer‑vision technologies, sparking a fresh rivalry with Baidu’s long‑standing dominance.

AI · Baidu · ByteDance
0 likes · 5 min read
Big Data Technology Architecture
Aug 9, 2019 · Databases

Understanding Elasticsearch: Architecture, Core Concepts, and Performance Optimization

This article provides a comprehensive overview of Elasticsearch, covering its role in handling structured and unstructured data, core concepts such as Lucene, inverted indexes, clusters, shards, replicas, mapping, indexing processes, storage mechanisms, and practical performance tuning tips for deployment.

Elasticsearch · Replication · inverted index
0 likes · 35 min read
ITPUB
Jul 6, 2019 · Backend Development

How Elasticsearch Revolutionized Search and Logging: The ELK Stack Story

This article narrates the origin and evolution of Elasticsearch, from its Lucene roots through Compass to the modern ELK Stack, illustrating how it simplifies full‑text search, log analysis, and real‑time monitoring for developers and operations teams.

Beats · ELK · Elasticsearch
0 likes · 13 min read
Big Data Technology Architecture
Jul 3, 2019 · Backend Development

Step-by-Step Guide to Installing Elasticsearch 7.x (Single‑Node) and Elasticsearch‑head

This article provides a comprehensive tutorial for installing Elasticsearch 7.x in single‑node mode, configuring its key settings, deploying the Elasticsearch‑head web UI via Tomcat, and includes reference configuration files for Elasticsearch 6.x, complete with command‑line examples and code snippets.

Configuration · Elasticsearch · Elasticsearch-head
0 likes · 8 min read
58 Tech
Jun 27, 2019 · Artificial Intelligence

Spelling Correction System for 58.com Search Engine: Rule‑Based and Statistical Methods

This article describes the design and implementation of a spelling‑correction module for 58.com’s search engine, covering common query errors, rule‑based and statistical language‑model approaches, offline dictionary generation, n‑gram and Viterbi decoding, online workflow, and practical examples.

Language Model · Query Processing · Viterbi algorithm
0 likes · 15 min read
Architect's Tech Stack
Jun 23, 2019 · Big Data

Elasticsearch Interview Questions: Architecture, Indexing, Optimization, and Operations

This article compiles common Elasticsearch interview questions and detailed answers covering cluster architecture, inverted index fundamentals, index design, write/query optimizations, master election, document indexing flow, search process, Linux tuning, and Lucene internals, providing practical guidance for candidates.

Cluster · Elasticsearch · indexing
0 likes · 10 min read
Efficient Ops
May 30, 2019 · Operations

How to Supercharge Elasticsearch for Massive Log Analytics: Real-World Optimizations

This article examines the unique characteristics of log data, outlines the challenges of using Elasticsearch at scale, and presents practical optimization techniques—including ingestion, mapping, time‑range search, metadata loading, and a custom C++ engine—to dramatically improve performance, stability, and cost efficiency.

Backend · Elasticsearch · Log Analytics
0 likes · 11 min read
Efficient Ops
Apr 21, 2019 · Backend Development

Mastering Elasticsearch: From Inverted Index to Distributed Search

This article walks through the fundamentals of search engines, explaining inverted indexes, the explosion of index size, core Elasticsearch concepts, its distributed architecture, and how it powers the ELK stack for log analysis, all illustrated with clear diagrams and examples.

Backend · Distributed Systems · ELK
0 likes · 6 min read
Xianyu Technology
Mar 14, 2019 · Operations

Ensuring High Availability of Search Engine Services: A Case Study of Xianyu's Search System

The article explains how Xianyu guarantees high‑availability of its core Ha3‑based search engine through independent gateway deployment, multi‑datacenter disaster recovery, traffic isolation, comprehensive monitoring, pressure testing, gray releases, and automated/manual failover, enabling rapid issue detection, recovery, and continuous service stability.

System Architecture · disaster recovery · emergency response
0 likes · 19 min read
58 Tech
Mar 7, 2019 · Big Data

In-Memory Inverted Index Compression Algorithms: Overview and MILC Optimization for High‑Performance Search

This article reviews major in‑memory inverted index compression techniques such as PForDelta, PEF, and MILC, explains their principles and trade‑offs, and details practical optimizations applied at 58.com to achieve query performance comparable to uncompressed indexes while reducing memory usage by about 35 percent.

Big Data · MILC · algorithm
0 likes · 17 min read
JD Tech
Dec 3, 2018 · Backend Development

Evolution of JD.com Order Center Elasticsearch Cluster Architecture

This article details how JD.com’s order center migrated its Elasticsearch cluster through multiple stages—from an initial unoptimized deployment to a real‑time dual‑cluster backup solution—addressing scalability, reliability, shard tuning, version upgrades, and data synchronization strategies to support billions of documents and hundreds of millions of daily queries.

Cluster Architecture · Elasticsearch · JD.com
0 likes · 13 min read
DataFunTalk
Nov 9, 2018 · Backend Development

From Zero to One: Building and Optimizing Search Engines with Elasticsearch – Insights and Case Studies

This article presents a comprehensive overview of constructing a search engine using Elasticsearch, covering architecture components, data read/write mechanisms, shard management, caching strategies, and real‑world case studies that illustrate performance tuning, isolation, and deployment best practices.

Distributed Systems · Elasticsearch · backend-development
0 likes · 14 min read
21CTO
Oct 12, 2018 · Backend Development

How Elastic’s IPO Mirrors the Rise of Open‑Source Search Engines

The article chronicles Elastic’s journey from a small open‑source search tool to a NYSE‑listed company, explaining Elasticsearch’s technical foundations, its real‑world applications, and what the IPO means for developers and the broader search‑technology ecosystem.

Elasticsearch · IPO · backend-development
0 likes · 10 min read
ITPUB
Oct 8, 2018 · Big Data

From Open‑Source Search to a Billion‑Dollar IPO: The Elastic Story

Elastic's NYSE debut, its rapid stock surge, the origins and technical strengths of Elasticsearch, and what the company's public listing means for developers and tech entrepreneurs are explored in detail, highlighting the journey from a personal recipe‑search tool to a global data‑search platform.

Elasticsearch · IPO · elastic
0 likes · 9 min read
21CTO
Sep 14, 2018 · Backend Development

How Message Queues Enable Near Real‑Time Incremental Indexing in Search Engines

This article examines the high‑real‑time requirements of incremental data ingestion for search engines, compares three update schemes, and details how adopting a Kafka subscription‑based message‑queue approach dramatically improves latency and flexibility for the Nuomi search framework.

Kafka · Message Queue · incremental indexing
0 likes · 8 min read
21CTO
Aug 10, 2018 · Fundamentals

What GO.COM’s Rise and Fall Reveals About Search Engine Ranking Laws

This reflective essay recounts the author’s two‑and‑a‑half‑year stint at Infoseek, the strategic missteps around free‑mail branding, the transition to GO.COM, and the evolution of search‑engine ranking principles—from early link‑analysis patents to the so‑called confidence‑based bidding model.

Internet History · confidence bidding · hyperlink analysis
0 likes · 15 min read
MaGe Linux Operations
Jul 19, 2018 · Databases

Master Elasticsearch with Python: From Installation to Advanced Queries

This tutorial walks you through installing Elasticsearch, creating indices, inserting and updating documents, performing searches via REST API, and integrating Elasticsearch into Python applications using both raw HTTP requests and the official Python client, illustrated with practical examples and screenshots.

Elasticsearch · NoSQL · REST API
0 likes · 11 min read
58 Tech
Jun 1, 2018 · Backend Development

Design and Implementation of Real-Time Indexing in 58.com’s ESearch Search Engine

This article explains how 58.com’s in‑house C++ search kernel ESearch was architected to provide second‑level real‑time indexing, high‑concurrency low‑latency querying, flexible ranking models, and efficient storage structures for billions of daily queries across massive classified data.

Backend · C++ · large scale
0 likes · 13 min read
Architecture Digest
Feb 1, 2018 · Fundamentals

How Search Engines Work: Building Inverted Indexes

This article explains the core of search engine technology by describing what an inverted index is, how it is built using single‑pass memory and multi‑way merge methods, how indexes can be partitioned and incrementally updated, and how Hadoop can be used for large‑scale indexing.

Big Data · Hadoop · indexing
0 likes · 10 min read
360 Zhihui Cloud Developer
Dec 26, 2017 · Big Data

Demystifying Elasticsearch: How Clusters Start, Join, and Process Reads/Writes

This article explains Elasticsearch’s core mechanisms, covering the cluster startup sequence, node discovery and election, handling of failed nodes, cluster management APIs, and the detailed read/write processes including coordinating nodes, shard allocation, memory buffering, transaction logs, and query‑then‑fetch execution.

Elasticsearch · Node Discovery · Read/Write
0 likes · 8 min read
360 Zhihui Cloud Developer
Dec 7, 2017 · Operations

How 360’s Private Cloud Powers Elasticsearch: Architecture, Security, and Scaling

This article explains how 360’s Hulk private cloud platform deploys Elasticsearch with a dedicated master architecture, load‑balancing, per‑business isolated clusters, SearchGuard security, dynamic tokenization, self‑service user features, and advanced monitoring to achieve high‑performance, scalable search services.

Elasticsearch · monitoring · private cloud
0 likes · 6 min read
21CTO
Nov 19, 2017 · Fundamentals

Demystifying PageRank: How Google Ranks Web Pages and Fights Spam

This article explains the core challenges of search engines, the origins and mechanics of the PageRank algorithm, its handling of dead ends and spider traps, extensions like Topic‑Sensitive PageRank, link‑spam attacks such as spam farms, and countermeasures such as TrustRank.

PageRank · algorithm · link spam
0 likes · 23 min read
Meitu Technology
Apr 6, 2017 · Artificial Intelligence

Meipai Text Matters: Mining and Practice of Community Text

The talk demonstrates how Meipai, a leading Chinese short‑video community, leverages large‑scale text mining and machine‑learning techniques—ranging from anti‑spam filtering to AI‑enhanced search—to enrich captions, comments, and metadata, improve user experience, and inspire further research on text data in video platforms.

Community analysis · anti-spam · machine learning
0 likes · 2 min read
Meitu Technology
Apr 6, 2017 · Artificial Intelligence

Meitu Internet Technology Salon: AI and Machine Learning Applications in Practice

The fourth Meitu Internet Technology Salon showcased practical AI and machine learning uses, highlighting Meipai’s text‑anti‑spam, hot‑topic detection, sentiment analysis and personalized video search, while Baidu demonstrated ML‑driven business intelligence tools for multi‑source data mining, user profiling, and intelligent enterprise and HR management.

Artificial Intelligence · Business Intelligence · Sentiment Analysis
0 likes · 7 min read
21CTO
Mar 18, 2017 · Backend Development

Inside Baidu’s First‑Generation Spider: How a C‑Only Backend Powered Fast Search

The article recounts Xu Haiyang’s hands‑on experience designing Baidu’s early Spider system, describing its pure C procedural architecture, bug‑fixing journey, PageRank processing, team‑management analogies, and his later moves into AI and education entrepreneurship.

Backend Architecture · C Programming · PageRank
0 likes · 7 min read
21CTO
Feb 15, 2017 · Fundamentals

How Twitter Evolved Its Search Engine: From MySQL to Earlybird and Beyond

This article explains the fundamentals of search engine architecture, covering text collection, indexing, ranking and evaluation, and then traces Twitter's internal search evolution from MySQL full‑text search to the Earlybird index server, Blender aggregation, and smart memory‑SSD strategies.

Big Data · Twitter · indexing
0 likes · 8 min read
Architecture Digest
Aug 8, 2016 · Databases

Understanding Elasticsearch Architecture: Clusters, Shards, Discovery, and Scaling

This article provides a comprehensive overview of Elasticsearch 2.x, covering its distributed architecture, core concepts such as clusters, nodes, indices, shards and replicas, the ZenDiscovery master‑election process, scaling mechanisms, recovery, query features, and the underlying system components like Guice, Netty, and thread‑pool designs.

Cluster Management · Elasticsearch · NoSQL
0 likes · 20 min read
Architect
Mar 22, 2016 · Backend Development

Youzan Search Engine Practice – Engineering Part: Architecture, Indexing, and Performance Optimization

This article describes the practical architecture of Youzan's commercial e‑commerce search engine, covering data source integration, distributed real‑time indexing with Elasticsearch, Hadoop and Kafka, advanced search modules, and several performance‑tuning techniques for large‑scale deployments.

Backend Architecture · Elasticsearch · Kafka
0 likes · 13 min read
21CTO
Mar 4, 2016 · Backend Development

Inside Taobao’s Billion-Request Engine: Load Balancing, CDN & Big Data

This article explains how Taobao scales to billions of daily page views using DNS‑based load balancing, LVS, domain sharding, CDN nodes, a distributed file system, sophisticated search processing, and massive data storage and real‑time log pipelines.

CDN · Distributed Systems · load balancing
0 likes · 9 min read
ITPUB
Dec 21, 2015 · Information Security

How to Shield Your Personal Data: Cold War Secrets and Modern Privacy Hacks

The article explores historical privacy tactics of the USSR and the United States, offers practical habits for protecting personal information online, explains how to detect leaked data using search engines and social‑media checks, and suggests strategies for mitigating exposure and crafting false identities.

Information Security · identity protection · personal data
0 likes · 6 min read
Architects Research Society
Dec 17, 2015 · Artificial Intelligence

How Search Engine Experience Informs Personalized Recommendation at Toutiao

The article explains how search engine techniques such as large‑scale candidate recall, fine‑grained ranking, user profiling, and multi‑objective optimization are applied to news personalization at Toutiao, highlighting data sampling, machine‑learning pipelines, challenges of news freshness, and architectural evolution.

multi-objective optimization · news recommendation · recommendation
0 likes · 5 min read
21CTO
Sep 14, 2015 · Backend Development

Why Simple‑Looking Sites Like Taobao Need Hundreds of Top Engineers

Although sites like Taobao appear simple to users, they rely on large‑scale distributed search, caching, storage, load‑balancing, CDN, logging, and data‑analysis systems that demand sophisticated backend engineering, extensive infrastructure, and specialized algorithms, which is why hundreds of top engineers are required to keep them running.

Big Data · Distributed Systems · caching
0 likes · 12 min read