
Master Elasticsearch: From Basics to SpringBoot Integration and Advanced Queries

This comprehensive guide introduces Elasticsearch fundamentals, its features and use cases, then walks through integrating it with SpringBoot, configuring Maven dependencies, performing index and document operations, and demonstrates a variety of query types and aggregations using both RESTful APIs and Java code examples.

Programmer DD

Apr 12, 2020

1. Elasticsearch Overview

Elasticsearch is a distributed, RESTful search engine built on Apache Lucene. It provides full‑text search, analytics, and near‑real‑time indexing and querying, and is released under the Apache license.

Features

Distributed document storage engine

Distributed search and analytics engine

Scales to petabyte‑level data

Typical Use Cases

Search engines (e.g., web search)

Portal statistics, article likes, comments

Advertising targeting and behavior analysis

Log and event data collection for big‑data analysis

2. Basic Concepts

Indices, Types, Documents, and Fields

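An index holds documents, and each document is composed of fields. Older releases allowed several types per index, but Elasticsearch 6.x (the version used in this guide) permits only one type per index, and types are deprecated from 7.0 onward. The hierarchy loosely mirrors relational databases (database → table → row → column).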

Mapping

Mapping defines field data types and how fields are indexed and searched. Elasticsearch can dynamically create mappings, but explicit mappings are recommended for precise control.

3. SpringBoot Integration

Maven Dependencies

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
  <groupId>org.projectlombok</groupId>
  <artifactId>lombok</artifactId>
  <optional>true</optional>
</dependency>
<dependency>
  <groupId>com.alibaba</groupId>
  <artifactId>fastjson</artifactId>
  <version>1.2.61</version>
</dependency>
<dependency>
  <groupId>org.elasticsearch.client</groupId>
  <artifactId>elasticsearch-rest-high-level-client</artifactId>
  <version>6.5.4</version>
</dependency>
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch</artifactId>
  <version>6.5.4</version>
</dependency>

Configuration (application.yml)

server:
  port: 8080
spring:
  application:
    name: springboot-elasticsearch-example
elasticsearch:
  schema: http
  address: 127.0.0.1:9200
  connectTimeout: 5000
  socketTimeout: 5000
  connectionRequestTimeout: 5000
  maxConnectNum: 100
  maxConnectPerRoute: 100

Java Configuration Class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.ArrayList;
import java.util.List;

@Configuration
public class ElasticSearchConfig {
    @Value("${elasticsearch.schema:http}")
    private String schema;
    @Value("${elasticsearch.address}")
    private String address;
    @Value("${elasticsearch.connectTimeout:5000}")
    private int connectTimeout;
    @Value("${elasticsearch.socketTimeout:10000}")
    private int socketTimeout;
    @Value("${elasticsearch.connectionRequestTimeout:5000}")
    private int connectionRequestTimeout;
    @Value("${elasticsearch.maxConnectNum:100}")
    private int maxConnectNum;
    @Value("${elasticsearch.maxConnectPerRoute:100}")
    private int maxConnectPerRoute;

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        List<HttpHost> hostLists = new ArrayList<>();
        String[] hostArray = address.split(",");
        for (String addr : hostArray) {
            String host = addr.split(":")[0];
            int port = Integer.parseInt(addr.split(":")[1]);
            hostLists.add(new HttpHost(host, port, schema));
        }
        HttpHost[] httpHosts = hostLists.toArray(new HttpHost[0]);
        RestClientBuilder builder = RestClient.builder(httpHosts);
        builder.setRequestConfigCallback(rcb -> {
            rcb.setConnectTimeout(connectTimeout);
            rcb.setSocketTimeout(socketTimeout);
            rcb.setConnectionRequestTimeout(connectionRequestTimeout);
            return rcb;
        });
        builder.setHttpClientConfigCallback(hcb -> {
            hcb.setMaxConnTotal(maxConnectNum);
            hcb.setMaxConnPerRoute(maxConnectPerRoute);
            return hcb;
        });
        return new RestHighLevelClient(builder);
    }
}
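
The configuration class above reads each entry with addr.split(":")[1], which throws ArrayIndexOutOfBoundsException when an entry has no port and keeps the leading space in entries written as "a:9200, b:9200". A more tolerant parser might look like the following JDK-only sketch. The names EsAddressParser and HostPort, and the fallback to port 9200, are assumptions made for illustration and are not part of the original article. IPv6 literals are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not from the original article: parses "host:port,host:port". */
final class EsAddressParser {

    /** Assumption: fall back to Elasticsearch's default HTTP port when an entry has none. */
    static final int DEFAULT_PORT = 9200;

    /** Minimal value holder for one parsed entry. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    static List<HostPort> parse(String address) {
        List<HostPort> result = new ArrayList<>();
        for (String entry : address.split(",")) {
            String trimmed = entry.trim(); // tolerate "a:9200, b:9200"
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a doubled comma
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                result.add(new HostPort(trimmed, DEFAULT_PORT));
            } else {
                // Integer.parseInt throws NumberFormatException on a malformed port.
                int port = Integer.parseInt(trimmed.substring(colon + 1).trim());
                result.add(new HostPort(trimmed.substring(0, colon).trim(), port));
            }
        }
        return result;
    }
}
```

Each parsed entry can then be passed to new HttpHost(host, port, schema), exactly as the loop in restHighLevelClient() does.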

4. Index Operations

Create Index (REST)

PUT /mydlq-user
{
  "mappings": {
    "doc": {
      "dynamic": true,
      "properties": {
        "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "address": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "remark": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        "age": {"type": "integer"},
        "salary": {"type": "float"},
        "birthDate": {"type": "date", "format": "yyyy-MM-dd"},
        "createTime": {"type": "date"}
      }
    }
  }
}

Delete Index (REST)

DELETE /mydlq-user

5. Document Operations

Add Document (REST)

POST /mydlq-user/doc/1
{
  "name": "张三",
  "address": "北京市",
  "remark": "来自北京市的张先生",
  "age": 29,
  "salary": 100,
  "birthDate": "1990-01-10",
  "createTime": 1579530727699
}

Get Document (REST)

GET /mydlq-user/doc/1

Update Document (REST, replaces the whole document body)

PUT /mydlq-user/doc/1
{
  "address": "北京市海淀区",
  "age": 29,
  "birthDate": "1990-01-10",
  "createTime": 1579530727699,
  "name": "张三",
  "remark": "来自北京市的张先生",
  "salary": 100
}

Delete Document (REST)

DELETE /mydlq-user/doc/1

6. Data Insertion

Bulk insert a sample dataset of 20 employee records using the _bulk API. Each record contains fields such as name, address, remark, age, salary, birthDate, and createTime.
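
The _bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line. As a rough sketch of how such a body could be assembled for POST /mydlq-user/doc/_bulk, the helper below uses only the JDK. BulkBodyBuilder and the sequential ID scheme are assumptions for illustration, not something the original article prescribes.

```java
import java.util.List;

/** Hypothetical helper, not from the original article. */
final class BulkBodyBuilder {

    /**
     * Builds an NDJSON body for POST /mydlq-user/doc/_bulk.
     * Each element of jsonDocs must be one JSON object on a single line.
     * Assumption: document IDs are assigned sequentially, starting at 1.
     */
    static String build(List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        int id = 1;
        for (String doc : jsonDocs) {
            if (doc.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("each document must fit on one line");
            }
            // Action line: index and type come from the URL, so only the ID is given.
            body.append("{\"index\":{\"_id\":\"").append(id).append("\"}}\n");
            // Source line, terminated by a newline (required even for the last document).
            body.append(doc).append('\n');
            id++;
        }
        return body.toString();
    }
}
```

The resulting string would be sent with the Content-Type header application/x-ndjson.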

7. Query Operations

Exact Term Query

GET /mydlq-user/_search
{
  "query": {
    "term": {"address.keyword": {"value": "北京市通州区"}}
  }
}

Terms Query (multiple values)

GET /mydlq-user/_search
{
  "query": {
    "terms": {"address.keyword": ["北京市丰台区","北京市昌平区","北京市大兴区"]}
  }
}

Match Query

GET /mydlq-user/_search
{
  "query": {"match": {"address": "通州区"}}
}

Match Phrase Query

GET /mydlq-user/_search
{
  "query": {"match_phrase": {"address": "北京市通州区"}}
}

Multi‑Match Query (address and remark)

GET /mydlq-user/_search
{
  "query": {"multi_match": {"query": "北京", "fields": ["address", "remark"]}}
}

Fuzzy Query

GET /mydlq-user/_search
{
  "query": {"fuzzy": {"name": "三"}}
}
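
A fuzzy query matches terms within a maximum edit distance of the query term. When no fuzziness is given, Elasticsearch documents the default as AUTO, which allows 0 edits for terms of 1 or 2 characters, 1 edit for 3 to 5 characters, and 2 edits beyond that. Under that rule a single-character term such as "三" allows no edits at all, so the query above would behave like an exact term match. The sketch below illustrates the rule in plain Java. FuzzySketch is a hypothetical name, the distance is the optimal-string-alignment variant (a swap of adjacent characters counts as one edit), and it counts Java chars rather than Unicode code points, which is a simplification.

```java
/** Hypothetical illustration of AUTO fuzziness, not Elasticsearch's own implementation. */
final class FuzzySketch {

    /** Edit distance with insert, delete, substitute, and adjacent transposition. */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                // Adjacent transposition counts as a single edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** AUTO rule: 0 edits up to length 2, 1 edit up to length 5, otherwise 2. */
    static int autoFuzziness(int termLength) {
        if (termLength <= 2) {
            return 0;
        }
        return termLength <= 5 ? 1 : 2;
    }

    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoFuzziness(queryTerm.length());
    }
}
```

To tolerate one edit on short terms, the REST query can set the fuzziness parameter explicitly instead of relying on the default.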

Range Query (age >= 30)

GET /mydlq-user/_search
{
  "query": {"range": {"age": {"gte": 30}}}
}

Range Query (birthDate within last 30 years)

GET /mydlq-user/_search
{
  "query": {"range": {"birthDate": {"gte": "now-30y"}}}
}
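
The date-math expression now-30y means "the current time minus 30 years", and gte keeps documents whose birthDate is on or after that point. The plain-Java sketch below mirrors the comparison at day precision, which corresponds more closely to the rounded form now-30y/d, since the unrounded expression compares down to the millisecond. DateMathSketch and its method name are hypothetical.

```java
import java.time.LocalDate;

/** Hypothetical illustration of {"range": {"birthDate": {"gte": "now-30y/d"}}}. */
final class DateMathSketch {

    /** Returns true when birthDate is on or after (today minus the given number of years). */
    static boolean bornWithinLastYears(LocalDate birthDate, int years, LocalDate today) {
        LocalDate cutoff = today.minusYears(years);
        return !birthDate.isBefore(cutoff); // gte: the cutoff day itself is included
    }
}
```

Evaluated on this article's publication date (2020-04-12), the cutoff is 1990-04-12, so the sample document with birthDate 1990-01-10 would not match.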

Wildcard Query (names ending with "三")

GET /mydlq-user/_search
{
  "query": {"wildcard": {"name.keyword": {"value": "*三"}}}
}
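
In a wildcard pattern, * stands for zero or more characters and ? for exactly one character, and the pattern must match the whole term (here the untokenized name.keyword value). The following JDK-only sketch shows those semantics by translating a pattern into a regular expression. WildcardSketch is a hypothetical name, and backslash escaping, which Elasticsearch also supports, is left out for brevity.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of wildcard semantics, not Elasticsearch's implementation. */
final class WildcardSketch {

    /** Translates '*' to ".*" and '?' to "."; every other character is matched literally. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** The pattern has to cover the entire value, as with a keyword term. */
    static boolean matches(String wildcard, String value) {
        return toRegex(wildcard).matcher(value).matches();
    }
}
```

With the article's pattern "*三", the value "张三" matches while "三张" does not. Patterns that begin with * tend to be slow on large indices because every term has to be examined.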

Boolean Query (birthDate 1990‑1995 and specific addresses)

GET /mydlq-user/_search
{
  "query": {
    "bool": {
      "filter": {"range": {"birthDate": {"format": "yyyy", "gte": 1990, "lte": 1995}}},
      "must": [{"terms": {"address.keyword": ["北京市昌平区","北京市大兴区","北京市房山区"]}}]
    }
  }
}

8. Aggregation Operations

Metric Aggregations

Statistics on the salary field (count, min, max, avg, sum). Percentiles are not part of stats and need the separate percentiles aggregation:

GET /mydlq-user/_search
{
  "size": 0,
  "aggs": {"salary_stats": {"stats": {"field": "salary"}}}
}
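
For intuition, the five values a stats aggregation reports are the same ones the JDK's DoubleSummaryStatistics computes over an in-memory collection. The sketch below is only an analogy (SalaryStatsSketch is a hypothetical name); Elasticsearch computes the figures across shards on the server.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

/** Hypothetical in-memory analogy for the stats aggregation. */
final class SalaryStatsSketch {

    /** Returns count, min, max, average, and sum over the given salaries. */
    static DoubleSummaryStatistics stats(double... salaries) {
        return DoubleStream.of(salaries).summaryStatistics();
    }
}
```

Calling stats(100.0, 300.0, 500.0) yields getCount() 3, getMin() 100.0, getMax() 500.0, getAverage() 300.0, and getSum() 900.0.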

Bucket Aggregations

Terms aggregation on age:

GET /mydlq-user/_search
{
  "size": 0,
  "aggs": {"age_bucket": {"terms": {"field": "age", "size": 10}}}
}
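
A terms aggregation groups documents by distinct field value and, by default, returns the buckets with the highest document counts first, limited by size. The in-memory sketch below imitates that behavior for a list of ages. AgeBucketsSketch is a hypothetical name, and breaking ties by ascending age is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory analogy for {"terms": {"field": "age", "size": N}}. */
final class AgeBucketsSketch {

    static Map<Integer, Long> topAgeBuckets(List<Integer> ages, int size) {
        // Count documents per distinct age; TreeMap keeps ages in ascending order.
        Map<Integer, Long> counts = new TreeMap<>();
        for (Integer age : ages) {
            counts.merge(age, 1L, Long::sum);
        }
        List<Map.Entry<Integer, Long>> entries = new ArrayList<>(counts.entrySet());
        // Stable sort by count, descending; equal counts keep ascending age order.
        entries.sort(Map.Entry.<Integer, Long>comparingByValue().reversed());
        // Keep only the first `size` buckets, preserving the sorted order.
        Map<Integer, Long> top = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> entry : entries) {
            if (top.size() == size) {
                break;
            }
            top.put(entry.getKey(), entry.getValue());
        }
        return top;
    }
}
```

On a distributed index the counts of a terms aggregation can be approximate when there are more distinct values than size, which this single-list sketch does not model.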

Range aggregation on salary (low, medium, high):

GET /mydlq-user/_search
{
  "aggs": {
    "salary_range_bucket": {
      "range": {
        "field": "salary",
        "ranges": [
          {"key": "低级员工", "to": 3000},
          {"key": "中级员工", "from": 5000, "to": 9000},
          {"key": "高级员工", "from": 9000}
        ]
      }
    }
  }
}

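In a range aggregation each bucket includes its from value and excludes its to value. With the ranges listed in the request above, a salary of exactly 9000 therefore lands in the high bucket, and salaries from 3000 up to (but not including) 5000 fall into no bucket at all. The sketch below restates those boundaries in plain Java. SalaryRangeSketch is a hypothetical name, and the ASCII labels "low", "mid", and "high" stand in for the keys 低级员工, 中级员工, and 高级员工.

```java
/** Hypothetical restatement of the range boundaries used in the request above. */
final class SalaryRangeSketch {

    /** Returns the bucket label for a salary, or null when no range covers it. */
    static String bucketFor(double salary) {
        if (salary < 3000) {
            return "low"; // {"to": 3000}: upper bound excluded
        }
        if (salary >= 5000 && salary < 9000) {
            return "mid"; // {"from": 5000, "to": 9000}
        }
        if (salary >= 9000) {
            return "high"; // {"from": 9000}: lower bound included
        }
        return null; // 3000 <= salary < 5000 is not covered by any range
    }
}
```

If the gap is unintended, starting the middle range at 3000 would close it.
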
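Date histogram on birthDate (yearly buckets). The 6.x versions this guide targets use the interval parameter; from 7.2 onward interval is deprecated in favor of calendar_interval:

GET /mydlq-user/_search
{
  "size": 0,
  "aggs": {"birthday_histogram": {"date_histogram": {"field": "birthDate", "interval": "year", "format": "yyyy"}}}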
}

Metric + Bucket Aggregation (Top Salary per Age)

GET /mydlq-user/_search
{
  "size": 0,
  "aggs": {
    "salary_bucket": {
      "terms": {"field": "age", "size": 10},
      "aggs": {
        "salary_max_user": {
          "top_hits": {"size": 1, "sort": [{"salary": {"order": "desc"}}]}
        }
      }
    }
  }
}
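
The nested request above first buckets documents by age and then, inside each bucket, keeps the single hit with the highest salary. An in-memory equivalent could be sketched as below. TopSalaryPerAgeSketch and Employee are hypothetical names, and the result is ordered by age for readability, whereas the terms aggregation orders buckets by document count.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory analogy for terms(age) with a nested top_hits(size 1, salary desc). */
final class TopSalaryPerAgeSketch {

    /** Minimal stand-in for an indexed employee document. */
    static final class Employee {
        final String name;
        final int age;
        final double salary;

        Employee(String name, int age, double salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }
    }

    /** For every age, keeps the employee with the highest salary. */
    static Map<Integer, Employee> topEarnerByAge(List<Employee> employees) {
        Map<Integer, Employee> best = new TreeMap<>();
        for (Employee e : employees) {
            // On a tie the employee seen first is kept.
            best.merge(e.age, e, (a, b) -> a.salary >= b.salary ? a : b);
        }
        return best;
    }
}
```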

These examples cover creating indices, CRUD operations on documents, a variety of query types, and both metric and bucket aggregations, providing a solid foundation for using Elasticsearch in Java SpringBoot projects.
