Tag

Lucene

1 view collected around this technical thread.

IT Services Circle
Jun 24, 2024 · Databases

Understanding Elasticsearch Architecture: Inverted Index, Term Dictionary, Segments, and Distributed Search

This article explains how Elasticsearch transforms simple keyword matching into a high‑performance, scalable search engine by using inverted indexes, term dictionaries, posting lists, term indexes, stored fields, doc values, segments, and distributed node architectures to achieve fast, reliable full‑text search on massive data sets.

Elasticsearch · Inverted Index · Lucene
0 likes · 16 min read
Top Architect
Apr 18, 2024 · Big Data

Understanding ElasticSearch Architecture and Its Underlying Lucene Mechanics

This article provides a comprehensive, top‑down and bottom‑up explanation of Elasticsearch’s core architecture, detailing nodes, shards, Lucene segments, inverted indexes, stored fields, doc values, caching, query processing, routing, and scaling considerations for efficient search operations.

Big Data · Elasticsearch · Inverted Index
0 likes · 10 min read
Architect
Apr 15, 2024 · Big Data

Understanding the Underlying Working Principles of ElasticSearch

This article explains Elasticsearch’s architecture and core mechanisms, including its reliance on Lucene segments, inverted indexes, stored fields, doc values, caching, shard routing, and scaling strategies, and answers common questions about wildcard matching, index compression, and memory usage.

Big Data · Elasticsearch · Lucene
0 likes · 11 min read
JD Tech
Mar 14, 2024 · Databases

JD ElasticSearch Supports ZSTD Compression: Implementation, Performance Evaluation, and Usage Guide

This article explains how JD ElasticSearch has integrated the high‑performance ZSTD compression algorithm, details the motivations behind its adoption, presents benchmark results comparing it with LZ4 and best_compression, and provides step‑by‑step instructions and code snippets for configuring and using the new jd_zstd codec in Elasticsearch.

BigData · Elasticsearch · Java
0 likes · 14 min read
Ximalaya Technology Team
Sep 6, 2023 · Backend Development

Design Analysis of Lucene and In-Memory Inverted Index Service for Advertising Retrieval

The team analyzed Lucene’s disk‑based inverted index and built a custom in‑memory inverted‑index service for Ximalaya’s ad engine. The service encodes terms as 64‑bit keys, supports real‑time updates, and offers both BooleanQuery‑style and custom expression filtering, which cut query latency from ~50 ms to under 5 ms and enabled massive scaling.

Data Structures · Inverted Index · Java
0 likes · 27 min read
Didi Tech
Aug 10, 2023 · Big Data

Implementing ZSTD Compression in Didi's Elasticsearch for High‑Performance Log Ingestion

By integrating ZSTD compression into Didi’s Elasticsearch 7.6, the team cut CPU usage by about 15 %, reduced index storage by roughly 30 %, boosted write throughput by up to 25 %, and eliminated over 20 servers, demonstrating a faster, more storage‑efficient solution for petabyte‑scale log ingestion.

Big Data · Elasticsearch · Lucene
0 likes · 10 min read
Architects Research Society
Jul 24, 2023 · Artificial Intelligence

Neural Search in Apache Solr: Dense Vector Fields, HNSW Graphs, and K‑Nearest Neighbor Implementation

This article explains how Apache Solr implements neural search using dense vector fields, K‑Nearest Neighbor algorithms, and Hierarchical Navigable Small World graphs, detailing the underlying Lucene support, configuration options, query syntax, and integration with AI‑driven vector representations.

AI · Apache Solr · Dense Vectors
0 likes · 15 min read
Top Architect
Jul 18, 2023 · Fundamentals

Comprehensive Introduction to Elasticsearch: Core Concepts, Architecture, and Practical Usage

This article provides a detailed overview of Elasticsearch, covering its underlying Lucene technology, data types, indexing mechanisms, cluster architecture, shard and replica management, mapping definitions, installation steps, health monitoring, write and storage processes, and performance optimization techniques for production deployments.

Elasticsearch · Indexing · Lucene
0 likes · 36 min read
DeWu Technology
May 8, 2023 · Databases

Optimizing Elasticsearch Search Performance with Index Sorting

By defining index sorting on the publish_time field when creating the Elasticsearch index, the team transformed a multi‑second full‑scan query into a sub‑50 ms operation, demonstrating that pre‑ordered storage dramatically speeds up large‑result‑set sorts while modestly affecting write throughput.

DocValues · Elasticsearch · Index Sorting
0 likes · 12 min read
政采云技术
Mar 2, 2023 · Fundamentals

Two‑Phase Commit in Lucene: Mechanism, Implementation, and Rollback

This article explains the two‑phase commit protocol, describes how Lucene implements it through a dedicated interface, details the preparation, commit, segment handling, deletion policies, and rollback procedures, and provides code snippets illustrating the core logic.

Indexing · Lucene · Rollback
0 likes · 13 min read
政采云技术
Mar 2, 2023 · Databases

Understanding Two-Phase Commit and Its Implementation in Lucene

This article explains the two-phase commit protocol for distributed transactions, details its generic workflow, and describes how Apache Lucene implements the protocol through its TwoPhaseCommit interface, including preparation, flushing, commit, segment handling, deletion policies, and rollback mechanisms with illustrative code examples.

Indexing · Java · Lucene
0 likes · 12 min read
Architect's Guide
Feb 25, 2023 · Big Data

Elasticsearch Optimization and Performance Tuning for Billion‑Scale Data

This article documents the evolution of a data platform, explains Elasticsearch and Lucene fundamentals, and presents practical index and search performance optimizations—including bulk writes, refresh control, memory allocation, doc‑values tuning, and pagination strategies—that enable cross‑month queries and sub‑second responses on billions of records.

Big Data · Elasticsearch · Index Optimization
0 likes · 11 min read
Efficient Ops
Dec 21, 2022 · Big Data

How Elasticsearch Leverages Lucene’s Inverted Index for Real‑Time Distributed Search

This article explains the fundamentals of structured and unstructured data, introduces Lucene’s inverted index, and details how Elasticsearch builds on Lucene to provide distributed, near‑real‑time search with concepts such as clusters, shards, replicas, routing, and performance optimizations.

Elasticsearch · Inverted Index · Lucene
0 likes · 36 min read
Architect's Guide
Oct 27, 2022 · Big Data

Elasticsearch Overview: Data Types, Lucene Foundations, Core Concepts, Cluster Architecture, Indexing, Storage, and Performance Optimization

This article provides a comprehensive introduction to Elasticsearch, covering the distinction between structured and unstructured data, Lucene’s inverted index, ES core concepts such as clusters, nodes, shards and replicas, mapping, basic usage, storage mechanisms, and practical performance‑tuning tips for large‑scale search deployments.

Elasticsearch · Indexing · Lucene
0 likes · 39 min read
Architect
Sep 23, 2022 · Databases

Elasticsearch Index and Search Performance Optimization for Billion‑Scale Data

This article presents a comprehensive case study of optimizing Elasticsearch and its underlying Lucene structures to achieve sub‑second query responses on billions of records, covering architecture basics, index design, doc‑values tuning, bulk‑write strategies, and extensive performance testing.

Big Data · Elasticsearch · Indexing
0 likes · 12 min read
IT Architects Alliance
Sep 12, 2022 · Backend Development

Elasticsearch Optimization: Lucene Architecture, Index Design, and Performance Tuning

This article presents a comprehensive guide to optimizing Elasticsearch for massive datasets, covering Lucene fundamentals, index and shard architecture, practical performance‑tuning techniques, and real‑world case studies that achieve sub‑second query responses on billions of records.

Backend · Elasticsearch · Index Optimization
0 likes · 11 min read
政采云技术
Aug 30, 2022 · Fundamentals

Understanding Lucene Document Writing Process: Core Classes, Workflow, and Flush Strategies

This article explains the key Lucene classes involved in document indexing, outlines the end‑to‑end write workflow—including preUpdate, obtainAndLock, updateDocument, exception handling, and post‑update flush logic—and discusses the strategies and thresholds that control when in‑memory buffers are flushed to disk.

Document Writing · Indexing · Java
0 likes · 16 min read
Tencent Cloud Developer
Aug 29, 2022 · Big Data

Tencent CLS: High‑Performance Time‑Series Search Engine for Cloud Log Service

Tencent’s Cloud Log Service augments Lucene with a dedicated time‑series index—using timestamp ordering, a secondary index, reverse binary search, and histogram optimization—to cut log query complexity, delivering up to 40‑50× faster responses, higher concurrency, and markedly better performance than traditional ELK‑style and competing cloud log solutions.

Lucene · VLDB · cloud log service
0 likes · 14 min read
Selected Java Interview Questions
Jul 5, 2022 · Big Data

Understanding Elasticsearch: Core Concepts, Architecture, Indexing Mechanics and Performance Optimization

This article explains the fundamentals of structured and unstructured data, introduces Lucene's inverted index, describes Elasticsearch's distributed cluster architecture, node roles, sharding and replication mechanisms, indexing workflow with refresh and translog, storage segment model, and provides practical performance‑tuning recommendations.

Cluster · Elasticsearch · Inverted Index
0 likes · 36 min read
政采云技术
May 12, 2022 · Fundamentals

Understanding Lucene Query Process and Core Principles

This article explains Lucene's query types and the step‑by‑step query execution flow (entry, rewrite, weight creation, scoring, and result collection), and provides code examples and performance considerations to help developers troubleshoot and optimize search performance.

BM25 · Elasticsearch · Java
0 likes · 15 min read