Master Elasticsearch Basics: Core Concepts and Hands‑On API Commands

This guide introduces Elasticsearch fundamentals—including its core concepts, main features, and essential terminology—followed by step‑by‑step Docker setup and practical API commands for creating, indexing, searching, and managing documents in a single‑node cluster.

Java High-Performance Architecture

Jan 19, 2020

Content Overview

Elasticsearch fundamentals, focusing on core concepts.

Basic API practical operations.

1. Fundamentals

Elasticsearch (ES) is a database that provides distributed, near‑real‑time search and analytics.

Built on Apache Lucene, it can handle structured data, unstructured data, numeric data, and geospatial data.

Data is stored as loosely‑structured JSON documents.

Main Features

Lightweight, fast full‑text search.

Security analytics and infrastructure monitoring.

Supports massive scale – thousands of servers and petabyte‑level data.

Integrates with visualization tools for application performance analysis, log monitoring, and infrastructure metrics.

Supports machine learning that automatically models the behavior of your data in real time.

Core Concepts

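Index – analogous to a table in a relational database; stores documents. Before version 6.0 an index could contain multiple document types (e.g., Car and Bike). From 6.0 onward a newly created index can hold only one type, so each kind of document needs its own index; in 7.x mapping types are deprecated and documents are addressed through the _doc endpoint.

Each document in an index has a unique _id.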

Document – analogous to a row in a relational database.

Field – analogous to a column in a relational database.

Data Types

1) String – two sub‑types: text and keyword. text is used for full‑text content such as product descriptions or article bodies; Elasticsearch analyzes the content into a list of terms and builds an inverted index.

Example: a field "Description" with value "This phone has dual sim capability" becomes the token list ["this", "phone", "has", "dual", "sim", "capability"]. The inverted index maps each term to the documents containing it (e.g., "this" → doc_1, doc_3).

keyword stores exact values like usernames, email addresses, or zip codes; it is not tokenized and is suitable for exact matching, sorting, and aggregations.

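2) Numeric – stores numeric values such as counts, ages, or percentages; supported types include long, integer, short, byte, double, and float. Identifiers and phone numbers that are only matched exactly and never used in range queries are often better mapped as keyword.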

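3) Date – can be a formatted string matching the field's configured format (e.g., "2015/01/01 12:10:30"), a long holding milliseconds since the epoch, or an integer holding seconds since the epoch (the latter requires the epoch_second format in the mapping); internally, dates are converted to UTC and stored as a long of milliseconds since the epoch.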

4) Boolean

5) IP

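6) Object and nested – a field can hold a JSON object or an array of objects. With the default object type, an array of objects is flattened into multi‑value fields, so the link between values of the same object is lost. With the nested type, each object in the array is indexed as a separate hidden document and can be queried independently.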

Example of an embedded object:

{
  "name":"ABC United",
  "homeGround":"Old Trafford",
  "players":[
    {
      "firstName":"James",
      "lastName":"Cohen",
      "position":"Goal Keeper"
    },
    {
      "firstName":"Paul",
      "lastName":"Pogba",
      "position":"Midfielder"
    }
  ]
}

7) Multi‑fields – the same field can be indexed in more than one way, for example as text with a keyword sub‑field, to support both full‑text and exact matching.
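To make the text versus keyword distinction concrete, here is a minimal Python sketch of tokenization and an inverted index. It is a simplified model, not Elasticsearch's implementation: the analyze function is a rough stand-in for the standard analyzer, and the second document (doc_3) is a made-up value used only to show a term pointing at two documents.

```python
import re
from collections import defaultdict


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


docs = {
    "doc_1": "This phone has dual sim capability",
    "doc_3": "This tablet has a stylus",  # hypothetical second document
}
index = build_inverted_index(docs)

print(analyze(docs["doc_1"]))
print(index["this"])
print(index["dual"])
```

A keyword field skips the analyze step entirely: the whole value is stored as a single term, which is why it only matches exact input.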

Mapping defines an index schema – which fields exist and their types.
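As an illustration of a mapping, the sketch below declares one for the team document shown earlier and models why the choice between object and nested matters for its players array. The mapping itself is an assumption made for this example (the article does not define one), including the sub-field name raw; flatten_object_array is a simplified model of how a plain object field stores an array of objects.

```python
import json

# Hypothetical mapping for the team document: "players" is declared as nested,
# and "name" is a text field with a keyword sub-field (a multi-field).
team_mapping = {
    "properties": {
        "name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
        "homeGround": {"type": "keyword"},
        "players": {
            "type": "nested",
            "properties": {
                "firstName": {"type": "keyword"},
                "lastName": {"type": "keyword"},
                "position": {"type": "keyword"},
            },
        },
    }
}


def flatten_object_array(field, objects):
    """Model a plain object field: one multi-value list per leaf field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


players = [
    {"firstName": "James", "lastName": "Cohen", "position": "Goal Keeper"},
    {"firstName": "Paul", "lastName": "Pogba", "position": "Midfielder"},
]
flat = flatten_object_array("players", players)

# Once flattened, firstName=James AND lastName=Pogba matches even though
# no single player has both values.
cross_match = "James" in flat["players.firstName"] and "Pogba" in flat["players.lastName"]

print(json.dumps(team_mapping, indent=2))
print(flat)
print(cross_match)
```

Declaring players as nested avoids that false match, at the cost of indexing one extra hidden document per array element.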

Setting allows customization of index behavior, including custom analyzers and normalizers.

Important settings:

number_of_shards: number of primary shards (default 1 since 7.0; cannot be changed in place after the index is created).

number_of_replicas: number of replica shards per primary (default 1).

refresh_interval: how often the index is refreshed so that newly indexed documents become searchable (default 1s).

Shard – a Lucene instance managed automatically by ES; ES distributes shards across nodes and rebalances them on node failures or additions.

Replica – a copy of a primary shard; provides failover and can serve read requests to improve performance.

Alias – an alternative name for an index or a group of indices, useful for querying across multiple indices.

Template – defines common mappings and settings for multiple indices that match a pattern; explicit mappings/settings on index creation override the template.
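The override rule for templates can be modelled in a few lines of Python. This is only a sketch: the template name pattern and values are invented for illustration, and fnmatchcase is used as an approximation of index pattern matching (it accepts more wildcard syntax than the * that index patterns use). The dictionary has the shape a 7.x template body is assumed to take, with index_patterns, settings, and mappings keys.

```python
from fnmatch import fnmatchcase

# Hypothetical template: the pattern and values are illustrative only.
template = {
    "index_patterns": ["traveler-*"],
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    "mappings": {"properties": {"name": {"type": "keyword"}}},
}


def effective_settings(index_name, template, explicit_settings=None):
    """Start from the template when a pattern matches, then let explicit settings win."""
    settings = {}
    if any(fnmatchcase(index_name, pattern) for pattern in template["index_patterns"]):
        settings.update(template["settings"])
    settings.update(explicit_settings or {})
    return settings


print(effective_settings("traveler-2020", template))
print(effective_settings("traveler-2020", template, {"number_of_replicas": 0}))
print(effective_settings("orders", template))
```

The second call shows an explicit setting replacing the template value; the third shows that an index whose name does not match the pattern gets nothing from the template.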

2. API Operations

Test Environment Setup

Elasticsearch version used: 7.5.1.

Start a single‑node Docker container:

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.5.1

Verify the node:

$ curl -X GET "localhost:9200/_cat/nodes?v&pretty"
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.17.0.2            7          97   2    0.96    0.61     0.25 dilm      *      245e340eba97

$ curl localhost:9200
{
  "name" : "245e340eba97",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "mq_bxY5zTjCpmJU0xOLSbA",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Reference: Elasticsearch Docker documentation
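The same REST calls can be issued from a script instead of curl. The following is a minimal sketch using only Python's standard library; build_request and send are helper names invented for this example, and BASE_URL assumes the container started above is listening on localhost:9200. Building the request is separated from sending it, so the request can be inspected without a running node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumes the Docker container started above


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON response; needs a running node."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("GET", "/")
print(request.get_method(), request.full_url)
# With the container running, the version can be read with:
# print(send(request)["version"]["number"])
```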

Practical Operations

Create an index

curl -X PUT "localhost:9200/traveler?pretty" -H 'Content-Type: application/json' -d'
{
  "settings":{
    "number_of_shards":5,
    "number_of_replicas":2
  },
  "mappings":{
    "properties":{
      "name":{ "type":"keyword" },
      "age":{ "type":"integer" },
      "background":{ "type":"text" },
      "nationality":{ "type":"keyword" }
    }
  }
}
'
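Note what these settings imply on a single‑node cluster. A replica is never allocated to the node that already holds another copy of the same shard, so with one node none of the replicas can be assigned and the cluster health is reported as yellow. The arithmetic is sketched below; shard_allocation is a hypothetical helper and a simplification that ignores other allocation rules such as disk watermarks.

```python
def shard_allocation(primaries, replicas, data_nodes):
    """Estimate assigned and unassigned shard copies.

    Assumes only one rule: two copies of the same shard never share a node.
    """
    total = primaries * (1 + replicas)
    copies_per_shard = min(1 + replicas, data_nodes)
    assigned = primaries * copies_per_shard
    return {"total": total, "assigned": assigned, "unassigned": total - assigned}


# The traveler index on the single-node test container
print(shard_allocation(primaries=5, replicas=2, data_nodes=1))
# The same index on a three-node cluster
print(shard_allocation(primaries=5, replicas=2, data_nodes=3))
```

For a local test setup, setting number_of_replicas to 0 would keep the index green; the values above are kept as in the original example.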

Insert a document

curl -X PUT "localhost:9200/traveler/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "name":"John Doe",
  "age":23,
  "background":"Born and brought up in California. Engineer by profession. Loves to cook",
  "nationality":"British"
}
'

Read a document

curl -X GET "localhost:9200/traveler/_doc/1?pretty"

Delete a document

curl -X DELETE "localhost:9200/traveler/_doc/1?pretty"

Delete an index

curl -X DELETE "localhost:9200/traveler?pretty"

List all indices

curl -X GET "localhost:9200/_cat/indices"

Check cluster health

curl -X GET "localhost:9200/_cat/health?v"

Get index mapping and settings

# mapping + setting
curl -X GET "localhost:9200/traveler?pretty"
# mapping only
curl -X GET "localhost:9200/traveler/_mapping?pretty"
# settings only
curl -X GET "localhost:9200/traveler/_settings?pretty"

Assign an alias to an index

curl -X POST "localhost:9200/_aliases" -H 'Content-Type: application/json' -d'
{
  "actions":[
    { "add":{ "index":"traveler", "alias":"read_alias" } }
  ]
}
'

Search all documents in the index

curl -X GET "localhost:9200/traveler/_search?pretty"

Key items in the response:

took – query time in milliseconds.

timed_out – whether the query timed out.

_shards – shard execution details (total, successful, skipped, failed).

hits – the actual search results.

hits.total – total number of matching documents; in 7.x this is an object with value and relation fields.

hits.hits – array of result documents (first 10 by default).

hits.max_score – highest relevance score among the results.

hits.hits._score – relevance score of each document.
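A short Python sketch of reading those response fields follows. The sample_response dictionary is hand‑written for illustration and is not captured output from the test cluster; summarize is a hypothetical helper. It handles hits.total both as the object returned by 7.x and as a plain integer, which older versions return.

```python
# Hand-written sample in the shape of a 7.x search response (not real output).
sample_response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {
                "_index": "traveler",
                "_type": "_doc",
                "_id": "1",
                "_score": 1.0,
                "_source": {"name": "John Doe", "age": 23, "nationality": "British"},
            }
        ],
    },
}


def summarize(response):
    """Pull the commonly used fields out of a search response."""
    hits = response["hits"]
    total = hits["total"]
    # 7.x returns an object; older versions return a plain integer
    count = total["value"] if isinstance(total, dict) else total
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": count,
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


print(summarize(sample_response))
```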

Get total document count

curl -X GET "localhost:9200/traveler/_count?pretty"

Match query (full‑text search)

curl -X GET "localhost:9200/traveler/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query":{ "match":{ "background":"brought up California Loves cook" } }
}
'

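The query text is analyzed into the tokens ["brought", "up", "california", "loves", "cook"]. With the default OR operator, a document matches if its "background" field contains any of these tokens; documents matching more tokens score higher.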
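The matching rule can be sketched in Python as follows. This is a simplified model that only answers whether a document matches; it does not compute relevance scores, and analyze is a rough stand‑in for the analyzer rather than Elasticsearch's own. The operator argument mirrors the match query's operator parameter, where "and" requires every token to be present.

```python
import re


def analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def match(field_text, query_text, operator="or"):
    """Return True if the analyzed query matches the analyzed field text."""
    field_terms = set(analyze(field_text))
    found = [term in field_terms for term in analyze(query_text)]
    return all(found) if operator == "and" else any(found)


background = "Born and brought up in California. Engineer by profession. Loves to cook"

print(match(background, "brought up California Loves cook"))
print(match(background, "Texas cook"))
print(match(background, "Texas cook", operator="and"))
```

The second call still matches because "cook" is present; the third fails because "texas" is not.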

Term query (exact match)

curl -X GET "localhost:9200/traveler/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query":{ "term":{ "name":{ "value":"John Doe" } } }
}
'

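Suitable for keyword, numeric, date, or boolean fields. Avoid it on text fields: their stored terms are analyzed (for example lowercased and split into words), so an exact value such as "John Doe" would not match.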

Terms query (multiple exact values)

curl -X GET "localhost:9200/traveler/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query":{ "terms":{ "name":[ "John Doe", "Jack Ripper", "Buzz Aldrin" ] } }
}
'

Comparable to a SQL IN clause.

Prefix query

curl -X GET "localhost:9200/traveler/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query":{ "prefix":{ "name":"Joh" } }
}
'

Regexp query

curl -X GET "localhost:9200/traveler/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query":{ "regexp":{ "name":{ "value":"J.*e" } } }
}
'
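Both the prefix and regexp queries run against whole terms, and for a keyword field the term is the entire value. A regexp pattern must match the complete term, not just part of it. The Python sketch below models that behavior on a small list of names; "Johnny Depp" is a made‑up value added for contrast, and Python's re syntax only approximates the Lucene regular expression syntax that Elasticsearch uses (the two agree for a simple pattern like this one).

```python
import re

# Values as they would be stored in the keyword field "name"
names = ["John Doe", "Jack Ripper", "Buzz Aldrin", "Johnny Depp"]


def prefix_query(values, prefix):
    """Keep the values that start with the given prefix (case-sensitive)."""
    return [v for v in values if v.startswith(prefix)]


def regexp_query(values, pattern):
    """Keep the values that the pattern matches in full, as regexp queries do."""
    return [v for v in values if re.fullmatch(pattern, v)]


print(prefix_query(names, "Joh"))
print(regexp_query(names, "J.*e"))
```

Because matching is case‑sensitive on keyword fields, a prefix of "joh" would return nothing here.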

Multi‑search (multiple queries in one request)

curl -X GET "localhost:9200/_msearch?pretty" -H 'Content-Type: application/x-ndjson' -d'
{"index":"traveler"}
{"query":{"terms":{"name":["John Doe","Jack Ripper","Barack Obama"]}}}
{}
{"query":{"prefix":{"name":"Buzz"}}}
{"index":"traveler"}
{"query":{"match_all":{}}}
'
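The multi‑search body is newline‑delimited JSON: each search is a header line followed by a query line, and the body has to end with a newline. An empty header ({}) falls back to the index named in the URL, and since this URL names none, that search is assumed to run across all indices. A minimal Python sketch for producing such a body is shown below; build_msearch_body is a hypothetical helper, not part of any client library.

```python
import json


def build_msearch_body(searches):
    """Serialize (header, query) pairs as NDJSON, ending with a newline."""
    lines = []
    for header, query in searches:
        lines.append(json.dumps(header, separators=(",", ":")))
        lines.append(json.dumps(query, separators=(",", ":")))
    return "\n".join(lines) + "\n"


searches = [
    ({"index": "traveler"}, {"query": {"terms": {"name": ["John Doe", "Jack Ripper", "Barack Obama"]}}}),
    ({}, {"query": {"prefix": {"name": "Buzz"}}}),
    ({"index": "traveler"}, {"query": {"match_all": {}}}),
]
body = build_msearch_body(searches)

print(body, end="")
```

The response contains a responses array with one entry per search, in the same order as the request.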
