Tagged articles

Lucene

93 articles · Page 1 of 1

Sep 24, 2025 · Backend Development

Boost Spring Boot 3 Search with Hibernate Search & Lucene: Full Guide

This article explains why traditional database search struggles with large data sets, introduces Lucene as a high‑performance full‑text engine, and shows step‑by‑step how to integrate it into Spring Boot 3 using Hibernate Search, custom analyzers, entity mapping, repository extensions, and test cases.

Hibernate Search · Java · Lucene

0 likes · 11 min read

dbaplus Community

Apr 22, 2025 · Backend Development

Explore Elasticsearch 9.0: Performance Boosts, AI Features & Security Upgrades

Elasticsearch 9.0, released on April 15, 2025, builds on Lucene 10.1.0 to deliver major performance gains and introduces Better Binary Quantization, Elastic Distributions of OpenTelemetry, LLM observability, AI‑driven attack discovery, and an enhanced ES|QL. It is available via Elastic Cloud, and the article includes deployment tips and examples.

AI · Cloud · Elasticsearch

0 likes · 7 min read

Mingyi World Elasticsearch

Mar 1, 2025 · Backend Development

Why _count and _stats Return Different Document Numbers in Elasticsearch—and How to Fix It

The article explains why Elasticsearch's _count and _stats APIs can return vastly different document totals, especially when nested fields are involved, and provides step‑by‑step analysis, code examples, and practical solutions such as index refresh and data‑model adjustments.

COUNT · Elasticsearch · Lucene

0 likes · 7 min read

JavaEdge

Dec 22, 2024 · Backend Development

How LinkedIn Powers Lightning‑Fast Message Search with RocksDB, Lucene, and In‑Memory Indexing

LinkedIn’s message search system stores messages in RocksDB, builds Lucene inverted indexes on demand, partitions them by user, keeps indexes in memory, and uses a coordinator with D2/Zookeeper for node routing, enabling rapid, cost‑effective searches while minimizing write overhead.

LinkedIn · Lucene · RocksDB

0 likes · 9 min read

IT Services Circle

Jun 24, 2024 · Databases

Understanding Elasticsearch Architecture: Inverted Index, Term Dictionary, Segments, and Distributed Search

This article explains how Elasticsearch transforms simple keyword matching into a high‑performance, scalable search engine by using inverted indexes, term dictionaries, posting lists, term indexes, stored fields, doc values, segments, and distributed node architectures to achieve fast, reliable full‑text search on massive data sets.

Elasticsearch · Lucene · Search Engine

0 likes · 16 min read

Top Architect

Apr 18, 2024 · Big Data

Understanding ElasticSearch Architecture and Its Underlying Lucene Mechanics

This article provides a comprehensive, top‑down and bottom‑up explanation of Elasticsearch’s core architecture, detailing nodes, shards, Lucene segments, inverted indexes, stored fields, doc values, caching, query processing, routing, and scaling considerations for efficient search operations.

Lucene · Search Engine · Sharding

0 likes · 10 min read

Architect

Apr 15, 2024 · Big Data

Understanding the Underlying Working Principles of ElasticSearch

This article explains Elasticsearch’s architecture and core mechanisms (its reliance on Lucene segments, inverted indexes, stored fields, doc values, caching, shard routing, and scaling strategies) while answering common questions about wildcard matching, index compression, and memory usage.

Big Data · Lucene · Search Engine

0 likes · 11 min read

JD Tech

Mar 14, 2024 · Databases

JD ElasticSearch Supports ZSTD Compression: Implementation, Performance Evaluation, and Usage Guide

This article explains how JD ElasticSearch has integrated the high‑performance ZSTD compression algorithm, details the motivations behind its adoption, presents benchmark results comparing it with LZ4 and best_compression, and provides step‑by‑step instructions and code snippets for configuring and using the new jd_zstd codec in Elasticsearch.

Elasticsearch · Java · Lucene

0 likes · 14 min read

Ximalaya Technology Team

Sep 6, 2023 · Backend Development

Design Analysis of Lucene and In-Memory Inverted Index Service for Advertising Retrieval

The team analyzed Lucene’s disk‑based inverted index and built a custom in‑memory inverted‑index service for Ximalaya’s ad engine, encoding terms as 64‑bit keys and supporting real‑time updates, BooleanQuery‑style filtering, and custom expression filtering. This cut query latency from ~50 ms to under 5 ms and enabled massive scaling.

Data Structures · Java · Lucene

0 likes · 27 min read

Didi Tech

Aug 10, 2023 · Big Data

Implementing ZSTD Compression in Didi's Elasticsearch for High‑Performance Log Ingestion

By integrating ZSTD compression into Didi’s Elasticsearch 7.6, the team cut CPU usage by about 15%, reduced index storage by roughly 30%, boosted write throughput by up to 25%, and retired more than 20 servers, demonstrating a faster, more storage‑efficient solution for petabyte‑scale log ingestion.

Big Data · Elasticsearch · Lucene

0 likes · 10 min read

Architects Research Society

Jul 24, 2023 · Artificial Intelligence

Neural Search in Apache Solr: Dense Vector Fields, HNSW Graphs, and K‑Nearest Neighbor Implementation

This article explains how Apache Solr implements neural search using dense vector fields, K‑Nearest Neighbor algorithms, and Hierarchical Navigable Small World graphs, detailing the underlying Lucene support, configuration options, query syntax, and integration with AI‑driven vector representations.

AI · Apache Solr · Dense Vectors

0 likes · 15 min read

Top Architect

Jul 18, 2023 · Fundamentals

Comprehensive Introduction to Elasticsearch: Core Concepts, Architecture, and Practical Usage

This article provides a detailed overview of Elasticsearch, covering its underlying Lucene technology, data types, indexing mechanisms, cluster architecture, shard and replica management, mapping definitions, installation steps, health monitoring, write and storage processes, and performance optimization techniques for production deployments.

Elasticsearch · Indexing · Lucene

0 likes · 36 min read

dbaplus Community

May 30, 2023 · Backend Development

How Index Sorting Cut Elasticsearch Search Latency from 2000ms to 50ms

This article explains how the community reduced Elasticsearch search response time from seconds to tens of milliseconds by applying Index Sorting, detailing the problem background, initial quick fixes, deep Lucene analysis, implementation steps, performance testing, and practical trade‑offs.

BackendOptimization · IndexSorting · Lucene

0 likes · 14 min read

DeWu Technology

May 8, 2023 · Databases

Optimizing Elasticsearch Search Performance with Index Sorting

By defining index sorting on the publish_time field when creating the Elasticsearch index, the team transformed a multi‑second full‑scan query into a sub‑50 ms operation, demonstrating that pre‑ordered storage dramatically speeds up large‑result‑set sorts while modestly affecting write throughput.

DocValues · Elasticsearch · Index Sorting

0 likes · 12 min read

Zhengcaiyun Technology (政采云技术)

Mar 2, 2023 · Fundamentals

Two‑Phase Commit in Lucene: Mechanism, Implementation, and Rollback

This article explains the two‑phase commit protocol, describes how Lucene implements it through a dedicated interface, details the preparation, commit, segment handling, deletion policies, and rollback procedures, and provides code snippets illustrating the core logic.

Lucene · Rollback · distributed transactions

0 likes · 13 min read

Zhengcaiyun Technology (政采云技术)

Mar 2, 2023 · Databases

Understanding Two-Phase Commit and Its Implementation in Lucene

This article explains the two-phase commit protocol for distributed transactions, details its generic workflow, and describes how Apache Lucene implements the protocol through its TwoPhaseCommit interface, including preparation, flushing, commit, segment handling, deletion policies, and rollback mechanisms with illustrative code examples.

Java · Lucene · distributed transactions

0 likes · 12 min read

Architect's Guide

Feb 25, 2023 · Big Data

Elasticsearch Optimization and Performance Tuning for Billion‑Scale Data

This article documents the evolution of a data platform, explains Elasticsearch and Lucene fundamentals, and presents practical index and search performance optimizations—including bulk writes, refresh control, memory allocation, doc‑values tuning, and pagination strategies—that enable cross‑month queries and sub‑second responses on billions of records.

Elasticsearch · Lucene · Performance Tuning

0 likes · 11 min read

Shepherd Advanced Notes

Feb 5, 2023 · Databases

Elasticsearch Basics: Overview, Operations, and Integration Guide

This article introduces Elasticsearch as a scalable distributed search engine, compares it with Solr, walks through local, Docker, and Kubernetes installation, explains its document‑oriented data model, demonstrates CRUD operations via REST API, and shows how to integrate it with Spring Boot using the official Java client.

CRUD · Docker · Elasticsearch

0 likes · 23 min read

Efficient Ops

Dec 21, 2022 · Big Data

How Elasticsearch Leverages Lucene’s Inverted Index for Real‑Time Distributed Search

This article explains the fundamentals of structured and unstructured data, introduces Lucene’s inverted index, and details how Elasticsearch builds on Lucene to provide distributed, near‑real‑time search with concepts such as clusters, shards, replicas, routing, and performance optimizations.

Distributed Search · Elasticsearch · Lucene

0 likes · 36 min read

21CTO

Nov 18, 2022 · Big Data

How to Supercharge Elasticsearch for Billion‑Row Queries: Proven Optimization Techniques

This article details a real‑world case study of optimizing Elasticsearch for massive daily data volumes, covering the underlying Lucene architecture, shard routing, index and search performance tweaks, practical configuration settings, and benchmark results that achieve sub‑second query responses on billions of records.

Indexing · Lucene · Optimization

0 likes · 13 min read

Architect's Guide

Oct 27, 2022 · Big Data

Elasticsearch Overview: Data Types, Lucene Foundations, Core Concepts, Cluster Architecture, Indexing, Storage, and Performance Optimization

This article provides a comprehensive introduction to Elasticsearch, covering the distinction between structured and unstructured data, Lucene’s inverted index, ES core concepts such as clusters, nodes, shards and replicas, mapping, basic usage, storage mechanisms, and practical performance‑tuning tips for large‑scale search deployments.

Elasticsearch · Indexing · Lucene

0 likes · 39 min read

Architect

Sep 23, 2022 · Databases

Elasticsearch Index and Search Performance Optimization for Billion‑Scale Data

This article presents a comprehensive case study of optimizing Elasticsearch and its underlying Lucene structures to achieve sub‑second query responses on billions of records, covering architecture basics, index design, doc‑values tuning, bulk‑write strategies, and extensive performance testing.

Indexing · Lucene · Optimization

0 likes · 12 min read

IT Architects Alliance

Sep 12, 2022 · Backend Development

Elasticsearch Optimization: Lucene Architecture, Index Design, and Performance Tuning

This article presents a comprehensive guide to optimizing Elasticsearch for massive datasets, covering Lucene fundamentals, index and shard architecture, practical performance‑tuning techniques, and real‑world case studies that achieve sub‑second query responses on billions of records.

Elasticsearch · Index Optimization · Lucene

0 likes · 11 min read

Zhengcaiyun Technology (政采云技术)

Aug 30, 2022 · Fundamentals

Understanding Lucene Document Writing Process: Core Classes, Workflow, and Flush Strategies

This article explains the key Lucene classes involved in document indexing, outlines the end‑to‑end write workflow—including preUpdate, obtainAndLock, updateDocument, exception handling, and post‑update flush logic—and discusses the strategies and thresholds that control when in‑memory buffers are flushed to disk.

Document Writing · Indexing · Java

0 likes · 16 min read

Tencent Cloud Developer

Aug 29, 2022 · Big Data

Tencent CLS: High‑Performance Time‑Series Search Engine for Cloud Log Service

Tencent’s Cloud Log Service augments Lucene with a dedicated time‑series index—using timestamp ordering, a secondary index, reverse binary search, and histogram optimization—to cut log query complexity, delivering up to 40‑50× faster responses, higher concurrency, and markedly better performance than traditional ELK‑style and competing cloud log solutions.

Lucene · VLDB · cloud log service

0 likes · 14 min read

Selected Java Interview Questions

Jul 5, 2022 · Big Data

Understanding Elasticsearch: Core Concepts, Architecture, Indexing Mechanics and Performance Optimization

This article explains the fundamentals of structured and unstructured data, introduces Lucene's inverted index, describes Elasticsearch's distributed cluster architecture, node roles, sharding and replication mechanisms, indexing workflow with refresh and translog, storage segment model, and provides practical performance‑tuning recommendations.

Elasticsearch · Lucene · Performance Optimization

0 likes · 36 min read

Zhengcaiyun Technology (政采云技术)

May 12, 2022 · Fundamentals

Understanding Lucene Query Process and Core Principles

This article explains Lucene's query types, the step‑by‑step query execution flow—including entry, rewrite, weight creation, scoring, and result collection—while providing code examples and performance considerations to help developers troubleshoot and optimize search performance.

BM25 · Elasticsearch · Java

0 likes · 15 min read

Senior Brother's Insights

May 10, 2022 · Backend Development

Mastering Elasticsearch: Core Concepts, Architecture, and Performance Tuning

This comprehensive guide explains what Elasticsearch does, its underlying Lucene engine, core concepts like clusters, shards, replicas, mappings, and provides practical steps for installation, configuration, indexing, storage mechanics, and performance optimization.

Lucene · Sharding · cluster management

0 likes · 36 min read

Su San Talks Tech

Apr 17, 2022 · Backend Development

How Elasticsearch Powers Real-Time Search: Core Concepts and Best Practices

This article provides a comprehensive overview of Elasticsearch, explaining its underlying Lucene technology, data modeling, cluster architecture, shard and replica mechanisms, indexing workflow, storage strategies, refresh and translog processes, as well as practical performance and JVM tuning tips for building scalable, near‑real‑time search solutions.

Elasticsearch · Lucene · Search Engine

0 likes · 37 min read

IT Architects Alliance

Apr 10, 2022 · Backend Development

Understanding Elasticsearch: Core Concepts, Architecture, and Performance Tips

This article provides a comprehensive overview of Elasticsearch, covering data types, Lucene fundamentals, cluster discovery, node roles, shard and replica management, mapping, installation, health monitoring, indexing mechanics, storage strategies, refresh and translog processes, segment merging, and practical performance optimizations for production deployments.

Elasticsearch · Indexing · Lucene

0 likes · 39 min read

Top Architect

Apr 9, 2022 · Big Data

Elasticsearch Overview: Architecture, Core Concepts, Indexing Mechanics, and Performance Optimization

This comprehensive article explains what Elasticsearch is, how it builds on Lucene to provide distributed real‑time search and analytics, covering data types, cluster components, shard routing, indexing pipelines, storage formats, segment merging, and practical performance‑tuning tips for production deployments.

Elasticsearch · Indexing · Lucene

0 likes · 36 min read

Selected Java Interview Questions

Mar 9, 2022 · Big Data

Elasticsearch Overview: Core Concepts, Architecture, and Performance Optimization

This article provides a comprehensive overview of Elasticsearch, covering its data types, Lucene-based inverted index, cluster architecture, sharding and replication mechanisms, mapping definitions, basic usage, health monitoring, storage internals, and practical performance tuning tips for large‑scale search deployments.

Elasticsearch · Lucene · Performance Optimization

0 likes · 36 min read

Open Source Linux

Dec 29, 2021 · Backend Development

How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes

This article explains how Elasticsearch uses inverted indexes, term dictionaries, and compression techniques like FOR and Roaring Bitmaps to enable rapid full‑text search, contrasting its approach with traditional relational databases and offering practical indexing tips for large‑scale applications.

Elasticsearch · Lucene · Postings List

0 likes · 15 min read

Open Source Linux

Dec 8, 2021 · Backend Development

How Elasticsearch Uses Lucene’s Inverted Index for Lightning‑Fast Search

This article explains how Elasticsearch leverages Lucene’s inverted index, detailing the structure of term dictionaries, postings lists, compression techniques like Frame‑of‑Reference and Roaring Bitmaps, and query optimizations such as filter caches and skip‑list intersections to achieve fast, memory‑efficient search.

Elasticsearch · Lucene · Search Engine

0 likes · 19 min read

Efficient Ops

Dec 2, 2021 · Backend Development

How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes

This article explains how Elasticsearch uses inverted indexes, term dictionaries, and compression techniques such as Frame‑of‑Reference and Roaring Bitmaps to deliver rapid full‑text search, efficient storage, and fast union queries, while also offering practical indexing tips for production use.

Lucene · Postings List · Roaring Bitmap

0 likes · 15 min read

Java Interview Crash Guide

Nov 11, 2021 · Big Data

How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes and Compression

This article explains how Elasticsearch uses inverted indexes, term dictionaries, and advanced compression techniques like Frame of Reference and Roaring Bitmaps to enable rapid, scalable search over massive datasets, contrasting its approach with traditional relational database queries and detailing practical optimization tips.

Elasticsearch · Lucene · Postings List

0 likes · 16 min read

MaGe Linux Operations

Oct 13, 2021 · Backend Development

How Elasticsearch Achieves Near Real-Time Search: Core Techniques Explained

This article explains how Elasticsearch implements near real-time search by using immutable inverted indexes, segment merging, sharding, and a translog for durability, detailing the challenges and solutions behind its distributed full‑text search architecture.

Elasticsearch · Lucene · Near Real-Time Search

0 likes · 9 min read

IT Architects Alliance

Oct 6, 2021 · Big Data

Understanding Elasticsearch Inverted Index and Efficient Search Retrieval

This article explains how Elasticsearch uses inverted indexes, term dictionaries, and postings lists—along with compression techniques like Frame of Reference and Roaring Bitmaps—to achieve fast, memory‑efficient search queries, and provides practical tips for optimizing indexing and query performance.

Elasticsearch · Lucene · Postings List

0 likes · 14 min read

IT Architects Alliance

Sep 29, 2021 · Databases

Understanding Elasticsearch Inverted Index: Fast Retrieval, Compression, and Query Techniques

This article explains how Elasticsearch uses inverted index structures—including term dictionaries, term indexes, and postings lists—combined with compression methods like Frame‑of‑Reference and Roaring Bitmaps to achieve fast search, efficient storage, and effective union queries compared to traditional relational databases.

Elasticsearch · Lucene · Postings List

0 likes · 14 min read

Java Interview Crash Guide

Sep 23, 2021 · Fundamentals

How Elasticsearch Writes, Reads, and Searches Data: Deep Dive into ES Internals

This article explains Elasticsearch's core mechanisms for indexing, querying, and searching data, covering the roles of coordinating nodes, primary and replica shards, refresh cycles, translog, commit/flush processes, and the underlying Lucene inverted index.

Elasticsearch · Lucene · Search Engine

0 likes · 13 min read

21CTO

Sep 12, 2021 · Big Data

How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes and Compression

This article explains how Elasticsearch uses inverted indexes, term dictionaries, FST structures, and compression techniques like FOR and Roaring Bitmaps to dramatically speed up search queries over massive datasets while minimizing memory and disk usage.

Lucene · Search Engine · compression

0 likes · 14 min read

IT Architects Alliance

Sep 5, 2021 · Databases

Understanding Elasticsearch Fast Retrieval: Inverted Index, Postings List, and Compression Techniques

This article explains how Elasticsearch achieves rapid search by using inverted indexes, detailing the structure of posting lists, term dictionaries, compression methods like Frame‑of‑Reference and Roaring Bitmaps, and how these techniques enable efficient union queries and filter caching.

Elasticsearch · Lucene · Postings List

0 likes · 14 min read

Architecture Digest

Sep 5, 2021 · Databases

How Elasticsearch Achieves Fast Retrieval: Inverted Index, Term Dictionary, and Compression Techniques

This article explains how Elasticsearch leverages Lucene's inverted index, term dictionary, term index, and compression methods such as Frame‑of‑Reference and Roaring Bitmaps to enable rapid search, efficient storage, and fast set operations for large‑scale data retrieval.

Elasticsearch · Lucene · Roaring Bitmap

0 likes · 16 min read

Architect

Sep 4, 2021 · Databases

Understanding Elasticsearch Fast Retrieval: Inverted Index, Term Dictionary, and Compression Techniques

This article explains how Elasticsearch achieves fast data retrieval by comparing it with traditional relational databases, detailing search engine fundamentals, the structure of Lucene's inverted index—including term dictionaries, postings lists, and term indexes—and the compression techniques such as Frame of Reference and Roaring Bitmaps that optimize storage and query performance.

Elasticsearch · Lucene · Postings List

0 likes · 14 min read

Java Interview Crash Guide

Aug 26, 2021 · Backend Development

How to Supercharge Elasticsearch: Practical Index & Search Optimizations

This article shares practical lessons from three iterations of a data platform, focusing on Elasticsearch and Lucene optimizations that enable cross‑month queries, year‑long data export, and sub‑second query responses for tables handling billions of rows per day.

Elasticsearch · Indexing · Lucene

0 likes · 13 min read

Top Architect

Aug 18, 2021 · Big Data

Elasticsearch Indexing and Retrieval Optimization for Billion‑Scale Data

This article describes how Elasticsearch was optimized to handle billions of records, covering Lucene fundamentals, index and shard design, DocValues, query performance tuning, bulk indexing strategies, hardware considerations, and testing methods to achieve sub‑second query responses across multi‑year data ranges.

Big Data · Elasticsearch · Index Optimization

0 likes · 12 min read

Big Data Technology & Architecture

Aug 7, 2021 · Big Data

Elasticsearch Optimization Practices and Performance Tuning Guide

This article presents a comprehensive guide on optimizing Elasticsearch for large‑scale data platforms, covering Lucene fundamentals, index and shard architecture, doc‑values usage, routing strategies, practical performance‑tuning techniques, and real‑world testing results to achieve sub‑second query responses on billions of records.

Elasticsearch · Index Optimization · Lucene

0 likes · 12 min read

vivo Internet Technology

Jul 14, 2021 · Databases

An Overview of Lucene: Architecture, Indexing Workflow, and Code Implementation

The article introduces Apache Lucene 7.3.1, explains its core architecture and index hierarchy, details the two‑phase indexing and search workflow with code examples for document addition, deletion, merging, and query execution, and highlights its suitability for small‑to‑medium projects versus distributed alternatives.

Full-Text Search · Indexing · Java

0 likes · 20 min read

Java Interview Crash Guide

Jul 2, 2021 · Databases

How Elasticsearch Achieves Lightning‑Fast Search with Inverted Indexes

This article explains how Elasticsearch leverages inverted indexes, term dictionaries, and advanced compression techniques like Frame of Reference and Roaring Bitmaps to enable rapid full‑text search, covering the underlying concepts, data structures, and query optimizations essential for high‑performance search applications.

Elasticsearch · Lucene · Postings List

0 likes · 17 min read

Java High-Performance Architecture

Jun 8, 2021 · Big Data

How Elasticsearch Writes, Reads, and Searches Data: Deep Dive into ES Internals

This article explains Elasticsearch's write, read, and search mechanisms, the role of coordinating nodes, primary and replica shards, refresh and commit cycles, Lucene's inverted index, and how data becomes searchable in near‑real‑time.

Elasticsearch · Lucene · Read Process

0 likes · 12 min read

Top Architect

Mar 5, 2021 · Big Data

Elasticsearch Indexing and Search Optimization: Principles, Lucene Internals, and Performance Tuning

This article explains the architecture and core concepts of Elasticsearch and Lucene, outlines the requirements for cross‑month and high‑speed queries on massive datasets, and provides detailed index and search performance tuning techniques—including bulk writes, shard routing, doc‑values management, and pagination strategies—to achieve sub‑second response times on billions of records.

Big Data · Elasticsearch · Index Optimization

0 likes · 13 min read

Architecture Digest

Feb 18, 2021 · Big Data

Elasticsearch Write, Read, and Search Processes: Underlying Mechanisms and Lucene Inverted Index

This article explains how Elasticsearch handles data ingestion, retrieval, and full‑text search by describing the roles of coordinating, primary, and replica nodes, the refresh‑commit‑flush cycle, segment files, translog, and the Lucene‑based inverted index that powers its near‑real‑time capabilities.

Elasticsearch · Lucene · Read Process

0 likes · 11 min read

Architect

Feb 15, 2021 · Big Data

Elasticsearch Optimization Practices for Large-Scale Data Queries

This article explains how to optimize Elasticsearch for cross‑month and multi‑year queries on billions of records, covering Lucene fundamentals, index and search performance tweaks, configuration settings, and practical testing results to achieve sub‑second response times.

Big Data · Elasticsearch · Lucene

0 likes · 14 min read

Qunar Tech Salon

Feb 4, 2021 · Fundamentals

Understanding Lucene Inverted Index: Principles and Implementation

This article explains the concept of inverted indexes, their role in full‑text search, and provides a detailed overview of how Apache Lucene implements inverted indexing, including term dictionaries, posting lists, query processing, and numeric handling with BKDTree.

BKDTree · Lucene · Posting List

0 likes · 15 min read

Programmer DD

Jan 28, 2021 · Databases

How Elasticsearch Writes, Reads, and Searches Data: Inside the Engine

This article explains Elasticsearch's internal mechanisms for writing, reading, and searching data, covering the roles of coordinating nodes, primary and replica shards, buffers, translog, segment files, refresh cycles, commit and flush operations, as well as Lucene's inverted index and how deletions and updates are handled.

Elasticsearch · Lucene · Segment

0 likes · 10 min read

Big Data Technology & Architecture

Jan 1, 2021 · Big Data

Elasticsearch Indexing and Search Optimization for Billion‑Scale Data

This article explains how to design, tune, and optimize an Elasticsearch‑based data platform handling hundreds of billions of records, covering Lucene fundamentals, shard routing, index and query performance tricks, and practical benchmark results for large‑scale deployments.

Elasticsearch · Indexing · Lucene

0 likes · 13 min read

Big Data Technology & Architecture

Dec 17, 2020 · Databases

Deep Dive into Index Implementations of MySQL, InnoDB, MyISAM, and Lucene

This article explains the different index mechanisms used by MySQL (MyISAM and InnoDB) and Lucene, compares them with Elasticsearch's inverted index, and discusses how these structures affect storage, memory usage, and query performance.

Database · Elasticsearch · Indexing

0 likes · 8 min read

Programmer DD

Nov 26, 2020 · Databases

Unveiling Elasticsearch: Inside Nodes, Shards, and Lucene’s Inverted Index

This article explains Elasticsearch’s internal architecture, from cloud clusters and nodes to shards and Lucene’s inverted index, covering indexing, storage structures, query processing, caching, scaling, routing, and real‑world request handling, with detailed diagrams and examples.

Distributed · Indexing · Lucene

0 likes · 13 min read

MaGe Linux Operations

Nov 19, 2020 · Backend Development

Supercharging Elasticsearch: Practical Index & Search Optimizations for Billion-Row Queries

This article shares practical Elasticsearch and Lucene optimization techniques—including index structure tuning, shard routing, DocValues management, and query pagination—to achieve sub‑second search performance on datasets exceeding a billion records while supporting multi‑year historical queries.

Elasticsearch · Indexing · Lucene

0 likes · 13 min read

vivo Internet Technology

Oct 14, 2020 · Artificial Intelligence

Understanding Cosine Similarity: From Mathematical Foundations to Practical Applications

The article explains cosine similarity from basic geometry and vector math, derives its formula, and shows how it powers precision marketing, image classification, and text retrieval, while also detailing its industrial implementation in Lucene’s vector space model.

Cosine Similarity · Lucene · Search Engine

0 likes · 18 min read

Architecture Digest

Sep 14, 2020 · Databases

Understanding the Underlying Mechanics of Elasticsearch and Lucene

This article provides a comprehensive, top‑down and bottom‑up explanation of Elasticsearch’s internal architecture, covering clusters, nodes, shards, Lucene segments, inverted indexes, stored fields, doc values, caching, merging, routing, scaling, and query processing, while addressing common performance questions.

Caching · Elasticsearch · Lucene

0 likes · 11 min read

Selected Java Interview Questions

Sep 12, 2020 · Databases

Understanding Elasticsearch Internals: Architecture, Lucene Indexing, Sharding, and Scaling

This article explains the internal workings of Elasticsearch, covering its cloud‑based cluster architecture, Lucene‑based indexing structures such as segments, shards, inverted indexes, stored fields and doc values, as well as search processing, caching, merging, routing, and scaling strategies.

Big Data · Elasticsearch · Indexing

0 likes · 13 min read

Tencent Cloud Developer

Aug 27, 2020 · Big Data

Elasticsearch Overview: Architecture, Lucene Foundations, Application Scenarios, and Optimizations

Elasticsearch, built on Apache Lucene, provides a distributed, near‑real‑time search platform that scales to billions of documents across thousands of nodes, supporting use cases such as log analytics, time‑series monitoring, and product search, while Tencent’s CES adds advanced availability, performance, and cost‑optimizing features.

Big Data · Elasticsearch · Lucene

0 likes · 17 min read

Selected Java Interview Questions

Aug 20, 2020 · Big Data

Elasticsearch Write, Read, and Search Processes: Underlying Mechanisms and Lucene Inverted Index

This article explains Elasticsearch’s write, read, and search workflows, detailing the roles of coordinating nodes, primary and replica shards, refresh and commit cycles, translog handling, and the underlying Lucene inverted index mechanism.

Elasticsearch · Lucene · Search Engine

0 likes · 11 min read

Ctrip Technology

Aug 20, 2020 · Backend Development

Optimizing Ctrip Hotel Search System: Storage, Intelligent Query, Error Correction, and DSL Design

This article details how Ctrip's hotel search system was optimized through storage compression, spatial indexing, KV storage, semantic query generation, context‑aware error correction, and a custom domain‑specific language, balancing performance, flexibility, and user experience for large‑scale online travel services.

Error Correction · Lucene · backend optimization

0 likes · 19 min read

Programmer DD

Aug 8, 2020 · Artificial Intelligence

How Elasticsearch Handles Write, Read, and Search: Inside the Engine

This article explains Elasticsearch's internal mechanisms for indexing, querying, and retrieving data, covering the roles of coordinating nodes, primary and replica shards, the refresh and commit cycles, near‑real‑time search, and the underlying Lucene inverted index.

Elasticsearch · Indexing · Lucene

0 likes · 12 min read

Swan Home Tech Team

Jul 13, 2020 · Backend Development

Design and Evolution of the DaJia App Search System

This article explains the motivations, requirements, and technical design of the DaJia app's search system, compares relational databases with Lucene‑based solutions, describes the inverted index mechanism, outlines common search workflows, and details the system's three iterative development phases and future improvement plans.

Elasticsearch · Information Retrieval · Lucene

0 likes · 12 min read

Qunar Tech Salon

Jun 30, 2020 · Backend Development

Optimizing Lucene Stored Fields Access with a Custom Codec and In‑Memory Caching

This article describes how the Qunar hotel search team reduced Lucene stored‑fields deserialization overhead and GC pressure by implementing a custom Codec that caches stored fields in memory, redesigning the storage format, and evaluating the performance and space benefits of the approach.

Caching · Lucene · Performance

0 likes · 12 min read

iQIYI Technical Product Team

Jun 19, 2020 · Artificial Intelligence

Emoji Search at iQIYI Douya: From ElasticSearch to Lucene and Semantic Retrieval

iQIYI Douya’s emoji search evolved from ElasticSearch to a pure Lucene implementation and added semantic vector retrieval, enabling fast, scalable, and more accurate text‑based search of AI‑generated images for small‑to‑medium businesses by combining custom tokenization, dense embeddings, and hybrid ranking.

Elasticsearch · Lucene · Search Architecture

0 likes · 14 min read

Architect

May 22, 2020 · Databases

Performance Analysis of Elasticsearch Queries: Lucene Internals and Benchmark Results

This article examines Elasticsearch query performance by explaining Lucene's underlying data structures, describing how composite queries are merged, and presenting benchmark numbers for various query types such as term, range, and combined queries, highlighting optimization techniques and practical conclusions.

BKD-Tree · Benchmark · Elasticsearch

0 likes · 13 min read

dbaplus Community

May 5, 2020 · Databases

Why Elasticsearch Beats Its Competitors: A Deep Technical Comparison

This article offers a detailed, experience‑driven comparison of Elasticsearch against its main rivals—Lucene, Solr, relational databases, OpenTSDB, HBase, MongoDB, ClickHouse, and Druid—highlighting where Elasticsearch excels, where it falls short, and practical guidance for choosing the right data solution.

Comparison · Databases · Elasticsearch

0 likes · 15 min read

ITPUB

May 5, 2020 · Backend Development

How to Optimize Elasticsearch for Billion‑Row Queries and Sub‑Second Responses

This guide explains the background, requirements, Elasticsearch architecture, Lucene fundamentals, and practical tuning steps—including indexing, shard routing, doc values, and hardware choices—to achieve cross‑month, sub‑second query performance on datasets exceeding a billion records.

Lucene

0 likes · 12 min read

Big Data Technology Architecture

Feb 21, 2020 · Databases

Analysis of Elasticsearch Write Operations and Underlying Mechanisms

This article examines how Elasticsearch implements write operations on top of Lucene, detailing the challenges of Lucene's write path and describing Elasticsearch's distributed design, near‑real‑time refresh, translog reliability, shard replication, partial updates, and the complete write workflow from coordinating node to primary and replica shards.

Elasticsearch · Lucene · Shard

0 likes · 14 min read

Architect's Tech Stack

Dec 25, 2019 · Backend Development

Elasticsearch Optimization Practices for Large-Scale Data Platforms

This article explains the architecture of Elasticsearch and Lucene, outlines common performance bottlenecks, and provides concrete indexing and query optimization techniques—including shard routing, refresh intervals, doc values, and hardware considerations—to achieve sub‑second query responses on billions of records.

Elasticsearch · Indexing · Lucene

0 likes · 12 min read

macrozheng

Dec 20, 2019 · Big Data

How to Supercharge Elasticsearch for Billion‑Row Queries: Practical Optimization Guide

This article explains the architecture of Elasticsearch and Lucene, outlines common performance bottlenecks, and provides concrete indexing and search optimization techniques—including bulk writes, shard routing, doc values tuning, and pagination strategies—to achieve sub‑second query responses on billions of records.

Big Data · Elasticsearch · Lucene

0 likes · 14 min read

dbaplus Community

Dec 10, 2019 · Backend Development

How to Optimize Elasticsearch for Billions of Records: Practical Tuning Guide

An in‑depth guide walks through Elasticsearch’s underlying Lucene architecture, explains shard routing and DocValues, then presents concrete index‑ and search‑performance tweaks—bulk writes, refresh intervals, memory allocation, SSD usage, field mapping, pagination strategies—and shows benchmark results that reduce query latency to seconds for billions of records.

Big Data · Elasticsearch · Index Optimization

0 likes · 13 min read

ITPUB

Dec 5, 2019 · Big Data

How to Achieve Sub‑Second Queries on Billions of Records with Elasticsearch

This article explains how a data platform handling billions of daily records can be optimized for cross‑month queries and sub‑second response times by tuning Elasticsearch indexing, shard routing, Lucene structures, and hardware configurations.

Big Data · Indexing · Lucene

0 likes · 13 min read

DevOps Coach

Nov 26, 2019 · Backend Development

Why Elasticsearch Creates Too Many Segments and How Lucene Flush Works

The article explains how Elasticsearch’s use of Lucene’s flush mechanism, concurrent shard writes, and IndexWriter buffering lead to an excess of small segments, outlines the flush conditions, and offers guidance on managing write concurrency for better performance.

Elasticsearch · Flush · IndexWriter

0 likes · 10 min read

Architecture Digest

Nov 22, 2019 · Big Data

Elasticsearch Optimization Practices for Large‑Scale Data Platforms

This article presents a comprehensive guide to optimizing Elasticsearch for massive data volumes, covering Lucene fundamentals, index and shard design, practical performance‑tuning techniques, and real‑world testing results that enable cross‑month queries and sub‑second response times.

Big Data · Elasticsearch · Index Optimization

0 likes · 14 min read

ITPUB

Sep 1, 2019 · Databases

How Elasticsearch Stores and Retrieves Data: Inside Lucene’s Write‑Refresh‑Flush‑Merge Cycle

This article explains the fundamental architecture of Elasticsearch and its underlying Lucene engine, detailing the data model, index hierarchy, and the step‑by‑step write, refresh, flush, and merge processes that enable near‑real‑time search and data durability.

Lucene · Search Engine · data indexing

0 likes · 8 min read

Big Data Technology Architecture

Aug 12, 2019 · Fundamentals

Understanding Full‑Text Search and Comparing Solr, Lucene, and Elasticsearch

This article explains the principles of full‑text search, contrasts structured and unstructured data retrieval methods, introduces Lucene, Solr, and Elasticsearch, and provides a detailed comparison of their features, community support, maturity, and documentation to help developers choose the right search engine for their projects.

Elasticsearch · Full-Text Search · Lucene

0 likes · 15 min read

Full-Stack Internet Architecture

Jun 8, 2019 · Big Data

The Story of Doug Cutting: From Stanford to Hadoop and Beyond

This article chronicles Doug Cutting's journey from his humble beginnings at Stanford through his pioneering work on Lucene, Nutch, and Hadoop, highlighting how his innovations in search and distributed computing reshaped the big data landscape and led to the rise of Cloudera.

Big Data · Cloudera · Doug Cutting

0 likes · 8 min read

Architecture Digest

May 31, 2019 · Operations

Running a 400+ Node Elasticsearch Cluster: Architecture, Scaling, and Performance Tuning

Meltwater details how it processes millions of daily media posts using a custom‑tuned Elasticsearch 1.7.6 cluster of over 400 nodes on AWS, covering data volume, query complexity, node configuration, indexing strategy, performance optimizations, and lessons learned for large‑scale search deployments.

AWS · Big Data · Elasticsearch

0 likes · 12 min read

Java Backend Technology

Apr 30, 2019 · Fundamentals

Solr vs Elasticsearch: Choosing the Right Full‑Text Search Engine

This article explains the fundamentals of full‑text search, compares Lucene‑based engines Solr and Elasticsearch, and provides practical guidance on selecting the appropriate solution based on use cases, performance, scalability, and community support.

Elasticsearch · Full-Text Search · Lucene

0 likes · 17 min read

dbaplus Community

Jan 3, 2019 · Backend Development

Supercharging Elasticsearch for Billion-Row Queries: Practical Tips

This guide details how to optimize Elasticsearch for handling billions of daily records, covering core Lucene concepts, index and shard configuration, performance‑tuning parameters, and practical testing methods to achieve sub‑second query responses and long‑term data retention.

Big Data · Elasticsearch · Indexing

0 likes · 13 min read

Beike Product & Technology

Nov 23, 2018 · Backend Development

Elasticsearch Internals: Distributed Document Storage, Real‑time Search, and Translog Mechanics

This article explains the core Elasticsearch architecture—including shard routing, primary‑replica interaction, document CRUD workflows, multi‑document APIs, segment merging, translog durability, and storage file formats—providing a comprehensive view of how near‑real‑time search is achieved on large‑scale data.

Distributed storage · Elasticsearch · Lucene

0 likes · 20 min read

Java Captain

Mar 29, 2018 · Fundamentals

Understanding Full‑Text Search and Indexing with Lucene: Core Concepts and Processes

This article explains the fundamentals of full‑text search, describing how Lucene builds and uses inverted indexes, the steps of tokenization, linguistic processing, term weighting, and relevance scoring, and illustrates these concepts with examples, tables, and diagrams.

Full-Text Search · Indexing · Information Retrieval

0 likes · 21 min read

Architect's Tech Stack

Jan 18, 2018 · Databases

SolrCloud Introduction and Spring Boot Example with Code

This article introduces SolrCloud, explains its relationship with Lucene and Solr, provides environment setup instructions for a CentOS 7.3 cluster, details Maven dependencies, configuration files, and a comprehensive Java implementation using Spring Boot, including repository interfaces, utility classes, and extensive unit tests for adding, querying, and deleting documents.

DistributedSearch · Java · Lucene

0 likes · 8 min read

vivo Internet Technology

Nov 17, 2017 · Big Data

Elasticsearch Search Tuning Guide: Part 2 - Index Optimization, Mapping, Scripts, and Segment Merging

The second part of the Elasticsearch search‑tuning series explains how to pre‑index data, choose appropriate keyword or text mappings, minimize script use by preferring Painless or Lucene expressions, and efficiently force‑merge read‑only indices into single segments for better performance.

Data Mapping · Elasticsearch · Force Merge

0 likes · 8 min read

Ctrip Technology

Jun 29, 2017 · Backend Development

Understanding Elasticsearch Scoring: Lucene Scoring Functions, Query Boosting, and Function Score Queries

This article explains how Elasticsearch computes relevance scores using Lucene's practical scoring formula, term frequency, inverse document frequency, field-length norms, and query normalization, and demonstrates query-time boosting, constant_score, function_score, decay functions, and script_score with practical DSL examples.

Elasticsearch · Lucene · Query Boosting

0 likes · 14 min read

ITFLY8 Architecture Home

Jun 22, 2017 · Backend Development

How to Build a High‑Performance Distributed Log Query System with Lucene, Ignite, and Log4j2

This article presents a design for a transparent, flexible, and low‑resource distributed logging solution that uses Lucene for indexing, Apache Ignite for service and compute grids, and a custom Log4j2 appender, enabling fast, unified log queries across clustered applications.

Apache Ignite · Logging · Lucene

0 likes · 15 min read

Qunar Tech Salon

Feb 26, 2017 · Big Data

Comparative Analysis of Big Data Storage and Query Solutions

This article reviews major big‑data storage and query architectures—including HBase, Dremel/Parquet, pre‑aggregation systems, Lucene, and the custom Tindex solution—evaluating their strengths, weaknesses, and suitability for real‑time, high‑volume analytical workloads.

Big Data · HBase · Lucene

0 likes · 20 min read

Efficient Ops

Dec 7, 2015 · Backend Development

Mastering the ELK Stack: From Lucene Indexing to ElasticSearch Queries

This article walks through the fundamentals of search engine architecture, explains Lucene's role as an indexing library, details ElasticSearch's distributed design, clustering, sharding, and plugins, and demonstrates practical RESTful API usage and query DSL techniques for effective log analysis.

Indexing · Lucene · Search

0 likes · 23 min read
