Step-by-Step Guide to Installing and Using Elasticsearch for Full‑Text Search

This article is a from-scratch tutorial on Elasticsearch. It covers installation, explains the core concepts (nodes, clusters, indices, documents, and types), and shows how to create, delete, update, and query data, including Chinese tokenization, with curl requests from the command line.

Architecture Digest

Full-text search is a common requirement, and the open-source Elasticsearch (referred to below as Elastic) is one of the most widely used engines for it. It can store, search, and analyze large volumes of data quickly; Wikipedia, Stack Overflow, and GitHub all use it.

Elastic is built on the Lucene library, but you interact with it via its REST API rather than calling Lucene directly.

Elastic 5.5.1 needs a Java 8 environment. Once Java is set up, download the zip package, unzip it, and start the service:

$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.zip
$ unzip elasticsearch-5.5.1.zip
$ cd elasticsearch-5.5.1/
$ ./bin/elasticsearch
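
Before running the commands above, it can help to confirm which Java version is on the PATH. The following is a minimal sketch, not part of the original guide: the helper name java_major is hypothetical, and it assumes the usual java -version output, where the version appears as the first quoted string.

```shell
# Hypothetical helper: extract the major Java version from a version string.
# Java 8 and earlier report strings like "1.8.0_131"; Java 9 and later
# report strings like "11.0.2".
java_major() {
  case $1 in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; take the first quoted string as the version.
  version=$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')
  echo "Java major version: $(java_major "$version")"
else
  echo "java not found on PATH"
fi
```

A reported major version of 8 matches what this guide assumes.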

If startup fails with an error saying that max virtual memory areas (vm.max_map_count) is too low, raise the limit:

$ sudo sysctl -w vm.max_map_count=262144
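
To see whether the limit needs raising at all, the current value can be compared with 262144 first. This is a sketch under assumptions: it expects Linux, where the value is readable at /proc/sys/vm/max_map_count, and the helper name needs_raise is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) when the current limit
# is lower than the required one.
needs_raise() {
  local current=$1 required=$2
  [ "$current" -lt "$required" ]
}

REQUIRED=262144
# Assumption: on systems without this file, treat the current value as 0.
CURRENT=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if needs_raise "$CURRENT" "$REQUIRED"; then
  echo "vm.max_map_count is $CURRENT; raise it to at least $REQUIRED"
else
  echo "vm.max_map_count is $CURRENT; no change needed"
fi
```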

When Elastic runs, it listens on port 9200; you can verify it with:

$ curl localhost:9200

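By default Elastic accepts only local connections. To allow remote access, edit config/elasticsearch.yml, set network.host: 0.0.0.0, and restart Elastic. Because 0.0.0.0 accepts connections from any address, use a specific IP on production servers.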

Basic Concepts

Elastic is a distributed system: each running instance is a node, and a group of nodes forms a cluster.

An index is the top‑level logical unit (similar to a database) that stores an inverted index for fast lookup.
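
The idea behind an inverted index can be sketched with a few lines of shell. This is only a toy illustration, not how Elastic or Lucene implement it: the helper name build_inverted_index and the input format (a document ID followed by already-split words) are assumptions made for the example.

```shell
# Hypothetical helper: read "<doc-id> <word> <word> ..." lines on stdin and
# print one "<word>: <doc-ids>" line per distinct word, sorted by word.
build_inverted_index() {
  awk '{ for (i = 2; i <= NF; i++) index_of[$i] = index_of[$i] " " $1 }
       END { for (word in index_of) print word ":" index_of[word] }' | sort
}

printf '%s\n' \
  '1 database management' \
  '2 system management' |
  build_inverted_index
# database: 1
# management: 1 2
# system: 2
```

Looking up a word then means reading one line of this table instead of scanning every document, which is why lookups stay fast as the data grows.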

A document is a single JSON record stored in an index; documents in the same index need not share the exact schema but similar structures improve performance.

A type groups documents logically within an index (e.g., by city or by data category). Indices created in Elasticsearch 6.x can have only one type; 7.x deprecates types in the APIs, and 8.x removes them entirely.

Creating and Deleting an Index

Create an index with a PUT request:

$ curl -X PUT 'localhost:9200/weather'

Elastic returns a JSON object whose acknowledged field is true when the operation succeeds.

Delete the index with:

$ curl -X DELETE 'localhost:9200/weather'

Chinese Analyzer Configuration

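Tokenizing Chinese text requires a dedicated analyzer plugin. Install the IK analyzer (alternatives such as smartcn also exist), choosing the release that matches your Elastic version: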

$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip

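After restarting Elastic so that it loads the plugin, create an index whose text fields use the Chinese analyzer. The commands in this guide target Elastic 5.5.1; from 6.0 onward, any request that sends a body must also pass -H 'Content-Type: application/json'.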

$ curl -X PUT 'localhost:9200/accounts' -d '
{
  "mappings": {
    "person": {
      "properties": {
        "user": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word"},
        "title": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word"},
        "desc": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word"}
      }
    }
  }
}'
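
The mapping above repeats the same analyzer settings for every field. As a rough sketch, a small shell function can generate that body from a list of field names; the helper name ik_mapping is hypothetical, and the function assumes field names that need no JSON escaping.

```shell
# Hypothetical helper: print a mapping body in which every named field is a
# text field analyzed and searched with ik_max_word.
# Usage: ik_mapping <type> <field> [<field> ...]
ik_mapping() {
  local type=$1; shift
  local props="" sep="" field
  for field in "$@"; do
    props="${props}${sep}\"${field}\":{\"type\":\"text\",\"analyzer\":\"ik_max_word\",\"search_analyzer\":\"ik_max_word\"}"
    sep=","
  done
  printf '{"mappings":{"%s":{"properties":{%s}}}}\n' "$type" "$props"
}

# Prints the same mapping as the request above, in compact form.
ik_mapping person user title desc
```

The generated body can then be sent with -d "$(ik_mapping person user title desc)" in place of the hand-written JSON.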

Data Operations

Insert a document (PUT with an explicit ID):

$ curl -X PUT 'localhost:9200/accounts/person/1' -d '{ "user": "张三", "title": "工程师", "desc": "数据库管理" }'

Elastic returns the index, type, ID, version, and result.

Insert without specifying an ID (POST) to let Elastic generate a random ID:

$ curl -X POST 'localhost:9200/accounts/person' -d '{ "user": "李四", "title": "工程师", "desc": "系统管理" }'

Retrieve a document with a GET request:

$ curl 'localhost:9200/accounts/person/1?pretty=true'

Delete a document with DELETE:

$ curl -X DELETE 'localhost:9200/accounts/person/1'

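Update a document by sending the complete new document with PUT to the same ID. In the response, the version increments and result changes from created to updated: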

$ curl -X PUT 'localhost:9200/accounts/person/1' -d '{ "user": "张三", "title": "工程师", "desc": "数据库管理,软件开发" }'

Data Queries

Return all documents:

$ curl 'localhost:9200/accounts/person/_search'

Full‑text match query (search for the term "软件" in the desc field):

$ curl 'localhost:9200/accounts/person/_search' -d '{ "query": {"match": {"desc": "软件"}} }'

By default a search returns 10 results. Change that number with the size parameter, and paginate with from, the zero-based offset of the first result to return (default 0):

$ curl 'localhost:9200/accounts/person/_search' -d '{ "query": {"match": {"desc": "管理"}}, "size": 1, "from": 1 }'
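
Because from is an offset rather than a page number, page-style navigation needs a small conversion. The sketch below shows one way to do it; the helper names page_offset and paged_match are hypothetical, pages are counted from 1, and the field and text are inserted without JSON escaping.

```shell
# Hypothetical helper: convert a 1-based page number into the "from" offset.
page_offset() {
  local page=$1 size=$2
  echo $(( (page - 1) * size ))
}

# Hypothetical helper: print a match query body for one page of results.
# Usage: paged_match <field> <text> <page> <size>
paged_match() {
  local field=$1 text=$2 page=$3 size=$4
  printf '{"query":{"match":{"%s":"%s"}},"size":%d,"from":%d}\n' \
    "$field" "$text" "$size" "$(page_offset "$page" "$size")"
}

# Page 2 with one result per page: the same body as the request above.
paged_match desc 管理 2 1
# {"query":{"match":{"desc":"管理"}},"size":1,"from":1}
```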

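If the match text contains several terms separated by spaces, for example "软件 系统", Elastic treats them as OR by default. For AND logic, combine the terms in a bool query with must: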

$ curl 'localhost:9200/accounts/person/_search' -d '{ "query": { "bool": { "must": [ {"match": {"desc": "软件"}}, {"match": {"desc": "系统"}} ] } } }'
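
The AND query above grows by one match clause per term. A minimal sketch of generating it for any number of terms follows; the helper name must_all is hypothetical, and terms are inserted without JSON escaping, so it only suits plain words.

```shell
# Hypothetical helper: print a bool query in which every term must match
# the given field.
# Usage: must_all <field> <term> [<term> ...]
must_all() {
  local field=$1; shift
  local clauses="" sep="" term
  for term in "$@"; do
    clauses="${clauses}${sep}{\"match\":{\"${field}\":\"${term}\"}}"
    sep=","
  done
  printf '{"query":{"bool":{"must":[%s]}}}\n' "$clauses"
}

must_all desc 软件 系统
# {"query":{"bool":{"must":[{"match":{"desc":"软件"}},{"match":{"desc":"系统"}}]}}}
```

The output can be passed to the search request with -d "$(must_all desc 软件 系统)".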

References

Elasticsearch official documentation, practical introduction articles, and the original source at ruanyifeng.com.
