How to Build Powerful Search, Log, and Recommendation Solutions with Elasticsearch

This guide walks through five real-world Elasticsearch use cases: full-text product search with highlighting, centralized log collection and analysis, personalized video recommendation, price-range aggregation for e-commerce, and geo-location restaurant search. For each, it covers index design and query syntax, along with Docker setup and front-end integration where relevant.

Java Architecture Stack

Oct 15, 2024

1. Full‑Text Search and Highlighting

Business scenario: An e‑commerce platform needs fast product search across massive data and keyword highlighting to improve user experience.

Solution: Design an index with text fields for product name and description (plus a numeric price); use match or multi_match queries; enable the highlight feature to wrap matching terms.

Implementation steps

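Prepare Elasticsearch (Docker command). Elasticsearch 8.x enables TLS and authentication by default; the extra flag below turns security off so that the plain-HTTP examples in this guide work, which is suitable for local experiments only:

docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" elasticsearch:8.0.0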

Create the products index with mappings:

PUT /products
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "description": {"type": "text"},
      "price": {"type": "float"}
    }
  }
}

Index sample product documents:

POST /products/_doc/1
{
  "name": "huawei mate 70",
  "description": "mate 70 phone runs pure Harmony NEXT OS",
  "price": 6500
}

POST /products/_doc/2
{
  "name": "huawei Mate XT Extraordinary Master",
  "description": "Extraordinary Master 16GB+1TB Black ULTIMATE DESIGN",
  "price": 23999
}

POST /products/_doc/3
{
  "name": "huawei Mate X5",
  "description": "huawei mate x5 12GB+512GB ultra‑thin four‑fold screen",
  "price": 12499
}

Run a multi‑match query to search name and description:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "X5",
      "fields": ["name", "description"]
    }
  }
}

Add highlighting to the query:

GET /products/_search
{
  "query": { "multi_match": { "query": "X5", "fields": ["name", "description"] } },
  "highlight": { "fields": { "name": {}, "description": {} } }
}
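
In application code the query and highlight sections are usually assembled from user input. The sketch below is one way to do that in JavaScript. The function name buildProductSearch and the choice of <mark> tags are illustrative assumptions; pre_tags and post_tags are the standard highlight options for replacing the default <em> wrapper.

```javascript
// Build a search body for the products index (hypothetical helper).
function buildProductSearch(keyword, { tag = "mark" } = {}) {
  const fields = ["name", "description"];
  return {
    query: { multi_match: { query: keyword, fields } },
    highlight: {
      // Replace the default <em>...</em> wrapper with a custom tag.
      pre_tags: [`<${tag}>`],
      post_tags: [`</${tag}>`],
      // One empty options object per highlighted field.
      fields: Object.fromEntries(fields.map((f) => [f, {}])),
    },
  };
}

// Print the body that would be sent to GET /products/_search.
const body = buildProductSearch("X5");
console.log(JSON.stringify(body, null, 2));
```

Keeping the field list in one place means the query and the highlighter cannot drift apart when a field is added later.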

Parse the response; the highlight section contains HTML <em> tags around matching terms, which can be rendered directly in the front‑end.

{
  "hits": {
    "hits": [
      {
        "_source": {"name": "huawei Mate X5", "description": "..."},
        "highlight": {"name": ["huawei Mate <em>X5</em>"], "description": ["huawei mate <em>x5</em> ..."]}
      }
    ]
  }
}
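
Because highlight fragments are returned as HTML, a front end typically prefers the fragment when one exists and falls back to the stored field otherwise. The following sketch assumes the response shape shown above; toDisplayRows and escapeHtml are hypothetical helper names. With default settings, fragments contain the raw field text plus the <em> tags, so if product data can contain user-supplied markup, it may be worth setting the highlighter's encoder option to html before injecting fragments into a page.

```javascript
// Escape text that did not come from the highlighter before using it as HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Map each hit to display strings: the highlighted fragment if present,
// otherwise the escaped value from _source.
function toDisplayRows(response, fields = ["name", "description"]) {
  return response.hits.hits.map((hit) => {
    const row = {};
    for (const field of fields) {
      const fragments = hit.highlight && hit.highlight[field];
      row[field] = fragments
        ? fragments.join(" ... ")
        : escapeHtml(hit._source[field] ?? "");
    }
    return row;
  });
}

// Mock response in the same shape as the example above (made-up values).
const sample = {
  hits: {
    hits: [
      {
        _source: { name: "huawei Mate X5", description: "ultra-thin <b>screen</b>" },
        highlight: { name: ["huawei Mate <em>X5</em>"] },
      },
    ],
  },
};
console.log(toDisplayRows(sample));
```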

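Optional: create a custom analyzer with a synonym filter to match alternative terms (e.g., "x5, mate x5"). Analyzers are defined when an index is created, so delete the existing products index first (DELETE /products) or create a new index and reindex. A lowercase filter is placed before the synonym filter so that "X5" and "x5" are treated alike:

PUT /products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "whitespace",
          "filter": ["lowercase", "synonym_filter"]
        }
      },
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": ["x5, mate x5", "mate70, extraordinary master"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {"type": "text", "analyzer": "synonym_analyzer"},
      "description": {"type": "text", "analyzer": "synonym_analyzer"}
    }
  }
}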

2. Log Collection and Analysis

Business scenario: A SaaS company needs centralized storage, real‑time monitoring, and analysis of distributed application logs to quickly locate errors and performance bottlenecks.

Solution: Use the ELK/EFK stack – Filebeat (or Logstash) to ship logs to Elasticsearch, then visualize and alert with Kibana.

Implementation steps

Start Elasticsearch and Kibana via Docker:

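# Start Elasticsearch (security disabled for local testing only)
docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" elasticsearch:8.0.0

# Start Kibana, linked to the Elasticsearch container
docker run -d --name kibana -p 5601:5601 --link elasticsearch:elasticsearch kibana:8.0.0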

Install and configure Filebeat to read log files and forward them to Elasticsearch:

# Install Filebeat (Debian/Ubuntu; assumes the Elastic APT repository is already configured)
sudo apt-get install filebeat

# filebeat.yml (excerpt)
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/myapp/*.log

output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "changeme"

setup.kibana:
  host: "localhost:5601"

# Optionally enable the system module, load the index template and dashboards, then start Filebeat
sudo filebeat modules enable system
sudo filebeat setup
sudo service filebeat start

Create a dedicated logs index with appropriate mappings. Filebeat stores the event time in @timestamp, so the mapping uses that name:

PUT /logs-system
{
  "mappings": {
    "properties": {
      "@timestamp": {"type": "date"},
      "log.level": {"type": "keyword"},
      "message": {"type": "text"},
      "service.name": {"type": "keyword"},
      "host.name": {"type": "keyword"},
      "process.pid": {"type": "integer"}
    }
  }
}

By default Filebeat writes to its own filebeat-* indices; to send events to logs-system, set the index option under output.elasticsearch (together with setup.template.name and setup.template.pattern). In Kibana, create an index pattern (called a data view in 8.x) such as logs-system* and set @timestamp as the time filter field.

Example KQL queries in Kibana:

All ERROR logs: log.level: "ERROR"
Specific service logs: service.name: "my-service"
ERROR logs in the last hour: log.level: "ERROR" AND @timestamp > "now-1h"

Aggregations for insight:

# Count logs per level
GET /logs-system/_search
{
  "size": 0,
  "aggs": {"by_log_level": {"terms": {"field": "log.level"}}}
}

# ERROR logs per minute (Elasticsearch 8.x requires fixed_interval or calendar_interval; the old interval parameter was removed)
GET /logs-system/_search
{
  "size": 0,
  "query": {"term": {"log.level": "ERROR"}},
  "aggs": {"logs_over_time": {"date_histogram": {"field": "@timestamp", "fixed_interval": "1m"}}}
}
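
Dashboards usually restrict such aggregations to a recent time window. A minimal sketch of a request builder follows. The helper buildErrorHistogram and its parameters are hypothetical; the time field is passed in so it can match whatever name the mapping uses, and the histogram uses fixed_interval, the parameter current Elasticsearch versions expect for fixed-length buckets.

```javascript
// Build an "ERROR logs per interval" request limited to a recent window.
function buildErrorHistogram({ timeField, window = "1h", interval = "1m" }) {
  return {
    size: 0, // only aggregation buckets are needed, not individual hits
    query: {
      bool: {
        // Filter context: exact matches, no relevance scoring required.
        filter: [
          { term: { "log.level": "ERROR" } },
          { range: { [timeField]: { gte: `now-${window}` } } },
        ],
      },
    },
    aggs: {
      logs_over_time: {
        date_histogram: { field: timeField, fixed_interval: interval },
      },
    },
  };
}

// Body for GET /logs-system/_search covering the last hour.
console.log(JSON.stringify(buildErrorHistogram({ timeField: "@timestamp" }), null, 2));
```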

Configure alerting rules in Kibana (Stack Management > Rules and Connectors in 8.x, formerly Alerts and Actions) to trigger notifications when thresholds are exceeded.
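
The threshold logic behind such an alert can be illustrated outside Kibana. The sketch below scans date_histogram buckets (each bucket carries key_as_string and doc_count) and returns the intervals whose error count exceeds a limit. The helper findSpikes, the limit value, and the mock numbers are illustrative assumptions, not a Kibana API or real measurements.

```javascript
// Return the histogram intervals whose document count exceeds the limit.
function findSpikes(response, aggName, limit) {
  const buckets = response.aggregations[aggName].buckets;
  return buckets
    .filter((bucket) => bucket.doc_count > limit)
    .map((bucket) => ({ time: bucket.key_as_string, errors: bucket.doc_count }));
}

// Mock aggregation response with made-up counts.
const mockHistogram = {
  aggregations: {
    logs_over_time: {
      buckets: [
        { key_as_string: "2024-10-14T12:30:00.000Z", doc_count: 3 },
        { key_as_string: "2024-10-14T12:31:00.000Z", doc_count: 42 },
      ],
    },
  },
};

for (const spike of findSpikes(mockHistogram, "logs_over_time", 10)) {
  console.log(`ALERT: ${spike.errors} errors at ${spike.time}`);
}
```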

3. Personalized Recommendation System

Business scenario: An online video platform wants to recommend relevant videos based on user watch history, likes, and behavior to increase engagement and revenue.

Solution: Store user actions and video metadata in Elasticsearch; use more_like_this for content‑based similarity, function_score for weighting, and aggregations for popularity statistics.

Implementation steps

Create a videos index for video metadata:

PUT /videos
{
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "description": {"type": "text"},
      "tags": {"type": "keyword"},
      "category": {"type": "keyword"},
      "release_date": {"type": "date"}
    }
  }
}

Create a user_actions index to record actions (view, like, search, etc.):

PUT /user_actions
{
  "mappings": {
    "properties": {
      "user_id": {"type": "keyword"},
      "video_id": {"type": "keyword"},
      "action_type": {"type": "keyword"},
      "timestamp": {"type": "date"}
    }
  }
}

Index sample user actions, e.g.:

POST /user_actions/_doc
{
  "user_id": "weige",
  "video_id": "video789",
  "action_type": "view",
  "timestamp": "2024-10-14T12:30:00Z"
}

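Identify a user's favorite categories via aggregation. The user_actions mapping above has no category field, so first aggregate the user's most frequent videos and then look up their category in the videos index (alternatively, copy category into each action document at write time and aggregate on it directly):

GET /user_actions/_search
{
  "size": 0,
  "query": {"term": {"user_id": "weige"}},
  "aggs": {"top_videos": {"terms": {"field": "video_id", "size": 5}}}
}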

Content‑based recommendation using more_like_this:

GET /videos/_search
{
  "query": {
    "more_like_this": {
      "fields": ["title", "description", "tags"],
      "like": [{"_id": "video789"}],
      "min_term_freq": 1,
      "max_query_terms": 12
    }
  }
}

Collaborative filtering via aggregations to find users who watched the same video and then recommend videos they liked:

# Find similar users (user_id is already a keyword field, so no .keyword suffix is needed)
GET /user_actions/_search
{
  "size": 0,
  "query": {"term": {"video_id": "video789"}},
  "aggs": {"similar_users": {"terms": {"field": "user_id", "size": 10}}}
}

# Recommend videos watched by those users (replace the IDs with the bucket keys returned by the first request)
GET /user_actions/_search
{
  "size": 10,
  "query": {"terms": {"user_id": ["weige123", "weige456"]}},
  "aggs": {"recommended_videos": {"terms": {"field": "video_id", "size": 5}}}
}
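
The two requests are normally chained in application code: the user IDs from the first response's buckets become the terms filter of the second. A minimal sketch follows, assuming the aggregation names used above. The helpers similarUserIds and buildRecommendedVideosQuery are hypothetical, and restricting to action_type "like" and excluding already-seen videos are assumptions added for illustration.

```javascript
// Extract user IDs from the similar_users terms aggregation,
// leaving out the user we are recommending for.
function similarUserIds(response, currentUser) {
  return response.aggregations.similar_users.buckets
    .map((bucket) => bucket.key)
    .filter((id) => id !== currentUser);
}

// Build the second request: videos liked by the similar users.
function buildRecommendedVideosQuery(userIds, seenVideoIds = []) {
  const bool = {
    filter: [
      { terms: { user_id: userIds } },
      { term: { action_type: "like" } },
    ],
  };
  // Only add the exclusion when there is something to exclude.
  if (seenVideoIds.length > 0) {
    bool.must_not = [{ terms: { video_id: seenVideoIds } }];
  }
  return {
    size: 0,
    query: { bool },
    aggs: { recommended_videos: { terms: { field: "video_id", size: 5 } } },
  };
}

// Mock first-step response (made-up bucket data).
const firstStep = {
  aggregations: {
    similar_users: {
      buckets: [
        { key: "weige", doc_count: 4 },
        { key: "weige123", doc_count: 2 },
        { key: "weige456", doc_count: 1 },
      ],
    },
  },
};
const ids = similarUserIds(firstStep, "weige");
console.log(JSON.stringify(buildRecommendedVideosQuery(ids, ["video789"]), null, 2));
```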

Combine content‑based and collaborative signals in a bool query and sort by recency:

GET /videos/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"category": "user_favorite_category"}},
        {"match": {"tags": "user_favorite_tags"}}
      ]
    }
  },
  "sort": [{"release_date": {"order": "desc"}}]
}
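
The placeholders user_favorite_category and user_favorite_tags would be filled from earlier aggregation results. One possible way to assemble the query is sketched below, using terms clauses because category and tags are keyword fields. The helper names are hypothetical, and the bucket shape is that of a standard terms aggregation.

```javascript
// Read the bucket keys of a terms aggregation from a search response.
function bucketKeys(response, aggName) {
  return response.aggregations[aggName].buckets.map((bucket) => bucket.key);
}

// Build the combined query from lists of favorite categories and tags.
function buildFavoritesQuery(categories, tags) {
  const should = [];
  if (categories.length > 0) should.push({ terms: { category: categories } });
  if (tags.length > 0) should.push({ terms: { tags } });
  // With no signals at all, fall back to the newest videos overall.
  const query =
    should.length > 0
      ? { bool: { should, minimum_should_match: 1 } }
      : { match_all: {} };
  return { query, sort: [{ release_date: { order: "desc" } }] };
}

// Mock aggregation response (made-up categories).
const favorites = {
  aggregations: {
    favorite_categories: { buckets: [{ key: "tech", doc_count: 7 }, { key: "music", doc_count: 3 }] },
  },
};
const categories = bucketKeys(favorites, "favorite_categories");
console.log(JSON.stringify(buildFavoritesQuery(categories, []), null, 2));
```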

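Front-end fetch example (JavaScript). Calling Elasticsearch directly from a browser requires CORS to be enabled on the cluster and exposes it to end users, so outside of local experiments the request would normally go through a backend API:

fetch('http://localhost:9200/videos/_search', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({
    query: {more_like_this: {fields: ['title','description','tags'], like: [{_id: 'video789'}], min_term_freq: 1, max_query_terms: 12}}
  })
})
.then(r => {
  // Surface HTTP errors instead of parsing an error body as results
  if (!r.ok) throw new Error(`Search failed with status ${r.status}`);
  return r.json();
})
.then(data => console.log('Recommended videos:', data.hits.hits))
.catch(err => console.error(err));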

Continuously refine the model using user feedback (likes, clicks) and real‑time streams (e.g., Kafka) to adjust scores.
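
The solution outline for this use case mentions function_score for weighting. As a hedged illustration of score adjustment, the sketch below wraps any query in a function_score with a gauss decay on release_date so that newer videos score higher. The 30d scale, the 0.5 decay, and the helper name withRecencyBoost are assumptions chosen for the example, not tuned values.

```javascript
// Wrap a query so its relevance score is multiplied by a recency factor.
function withRecencyBoost(innerQuery, { scale = "30d", decay = 0.5 } = {}) {
  return {
    query: {
      function_score: {
        query: innerQuery,
        functions: [
          // Score factor is 1.0 for videos released now and falls to
          // `decay` for videos released `scale` ago.
          { gauss: { release_date: { origin: "now", scale, decay } } },
        ],
        boost_mode: "multiply",
      },
    },
  };
}

const similar = {
  more_like_this: {
    fields: ["title", "description", "tags"],
    like: [{ _id: "video789" }],
    min_term_freq: 1,
    max_query_terms: 12,
  },
};
console.log(JSON.stringify(withRecencyBoost(similar), null, 2));
```

Unlike sorting by release_date, this keeps similarity in play: an older but very similar video can still outrank a new, loosely related one.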

4. Product Price‑Range Statistics and Filtering

Business scenario: An online store wants to let shoppers filter products by price ranges and see how many items fall into each bucket.

Solution: Store the price field as a numeric type, then use range queries for filtering and range aggregations (or histogram) for counting.

Implementation steps

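Create a products index with a price field (if the products index from section 1 still exists, delete it first with DELETE /products, since PUT on an existing index fails):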

PUT /products
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "description": {"type": "text"},
      "category": {"type": "keyword"},
      "price": {"type": "float"},
      "in_stock": {"type": "boolean"}
    }
  }
}

Bulk‑load sample products:

POST /products/_bulk
{ "index": {"_id": "1"} }
{ "name": "Smartphone A", "description": "A high‑end smartphone", "category": "electronics", "price": 499.99, "in_stock": true }
{ "index": {"_id": "2"} }
{ "name": "Laptop B", "description": "A powerful laptop", "category": "electronics", "price": 899.99, "in_stock": true }
{ "index": {"_id": "3"} }
{ "name": "Tablet C", "description": "A mid‑range tablet", "category": "electronics", "price": 299.99, "in_stock": true }
{ "index": {"_id": "4"} }
{ "name": "Headphones D", "description": "Noise‑cancelling headphones", "category": "accessories", "price": 199.99, "in_stock": true }
{ "index": {"_id": "5"} }
{ "name": "Smartwatch E", "description": "A fitness‑oriented smartwatch", "category": "accessories", "price": 149.99, "in_stock": false }

Filter by a specific price range (e.g., 200‑500):

GET /products/_search
{
  "query": {"range": {"price": {"gte": 200, "lte": 500}}}
}

Aggregate product counts per price bucket:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          {"to": 200},
          {"from": 200, "to": 500},
          {"from": 500, "to": 1000},
          {"from": 1000}
        ]
      }
    }
  }
}

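The response contains aggregations.price_ranges.buckets with a doc_count for each interval. Each bucket includes its from value and excludes its to value, so a product priced at exactly 500 is counted in the 500 to 1000 bucket.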
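
On a listing page the filter and the bucket counts are often requested together. The sketch below uses post_filter, which narrows the returned hits after aggregations have been computed, so the counts for the other price ranges stay visible while one range is selected. The helper names and label format are assumptions; the bucket fields (from, to, doc_count) follow the range aggregation response, and the mock counts correspond to the five sample products.

```javascript
const PRICE_RANGES = [
  { to: 200 },
  { from: 200, to: 500 },
  { from: 500, to: 1000 },
  { from: 1000 },
];

// Build one request that returns facet counts plus the filtered hits.
function buildFacetedSearch(selected) {
  const body = {
    aggs: { price_ranges: { range: { field: "price", ranges: PRICE_RANGES } } },
  };
  if (selected) {
    // Buckets include `from` and exclude `to`, so gte/lt mirrors them.
    const range = {};
    if (selected.from !== undefined) range.gte = selected.from;
    if (selected.to !== undefined) range.lt = selected.to;
    body.post_filter = { range: { price: range } };
  }
  return body;
}

// Turn range buckets into labels a UI can render next to checkboxes.
function toFacets(response) {
  return response.aggregations.price_ranges.buckets.map((bucket) => {
    let label;
    if (bucket.from === undefined) label = `Under ${bucket.to}`;
    else if (bucket.to === undefined) label = `${bucket.from} and up`;
    else label = `${bucket.from} to ${bucket.to}`;
    return { label, count: bucket.doc_count };
  });
}

// Mock response for the sample products.
const facetResponse = {
  aggregations: {
    price_ranges: {
      buckets: [
        { to: 200, doc_count: 2 },
        { from: 200, to: 500, doc_count: 2 },
        { from: 500, to: 1000, doc_count: 1 },
        { from: 1000, doc_count: 0 },
      ],
    },
  },
};
console.log(toFacets(facetResponse));
console.log(JSON.stringify(buildFacetedSearch(PRICE_RANGES[1]), null, 2));
```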

Update a product’s price or stock status in real time:

POST /products/_update/1
{
  "doc": {"price": 479.99, "in_stock": false}
}

5. Geo‑Location Search for Nearby Restaurants

Business scenario: A food‑delivery app wants to show users restaurants near their current location, sorted by distance, with optional filters such as rating or cuisine.

Solution: Store restaurant coordinates as geo_point, use geo_distance queries to limit the radius, and sort with the _geo_distance sort option.

Implementation steps

Create a restaurants index with a location field of type geo_point:

PUT /restaurants
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "description": {"type": "text"},
      "location": {"type": "geo_point"},
      "rating": {"type": "float"},
      "category": {"type": "keyword"}
    }
  }
}

Bulk‑load sample restaurant data (each document includes lat and lon values):

POST /restaurants/_bulk
{ "index": {"_id": "1"} }
{ "name": "Jiu Cai Ji Dan", "description": "Man's fuel station", "location": {"lat": 40.730610, "lon": -73.935242}, "rating": 4.5, "category": "Italian" }
{ "index": {"_id": "2"} }
{ "name": "Sushi World", "description": "Authentic China sushi", "location": {"lat": 40.742610, "lon": -73.945242}, "rating": 4.7, "category": "Chinese" }
{ "index": {"_id": "3"} }
{ "name": "Burger Town", "description": "Best burgers in town", "location": {"lat": 40.729510, "lon": -73.914342}, "rating": 4.3, "category": "American" }
{ "index": {"_id": "4"} }
{ "name": "Vegan Delight", "description": "Healthy and delicious vegan food", "location": {"lat": 40.715610, "lon": -73.935142}, "rating": 4.6, "category": "Vegan" }

Search for restaurants within 5 km of the user's coordinates (example: 40.730610, -73.935242):

GET /restaurants/_search
{
  "query": {
    "geo_distance": {
      "distance": "5km",
      "location": {"lat": 40.730610, "lon": -73.935242}
    }
  }
}
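
To build intuition for what the 5km radius means for the sample data, the great-circle distance can be approximated client-side with the haversine formula. This is only a sanity check under the assumption of a spherical Earth with a mean radius of 6371 km; Elasticsearch performs its own distance calculation, and its results may differ slightly.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius (spherical approximation)

// Great-circle distance between two {lat, lon} points, in kilometers.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// User position and restaurant coordinates taken from the sample data.
const user = { lat: 40.73061, lon: -73.935242 };
const restaurants = [
  { name: "Jiu Cai Ji Dan", lat: 40.73061, lon: -73.935242 },
  { name: "Sushi World", lat: 40.74261, lon: -73.945242 },
  { name: "Burger Town", lat: 40.72951, lon: -73.914342 },
  { name: "Vegan Delight", lat: 40.71561, lon: -73.935142 },
];

for (const r of restaurants) {
  const km = haversineKm(user, r);
  console.log(`${r.name}: ${km.toFixed(2)} km, within 5 km: ${km <= 5}`);
}
```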

Sort the results by ascending distance:

GET /restaurants/_search
{
  "query": {"geo_distance": {"distance": "5km", "location": {"lat": 40.730610, "lon": -73.935242}}},
  "sort": [{"_geo_distance": {"location": {"lat": 40.730610, "lon": -73.935242}, "order": "asc", "unit": "km"}}]
}
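
When a request sorts by _geo_distance, each hit carries a sort array whose first element is the computed distance in the requested unit. A front end can read it directly instead of recomputing distances. The helper name and the mock values below are assumptions for illustration.

```javascript
// Pair each restaurant name with the distance Elasticsearch sorted by.
function withDistances(response) {
  return response.hits.hits.map((hit) => ({
    name: hit._source.name,
    // sort[0] corresponds to the first (and only) sort clause.
    distanceKm: Number(hit.sort[0].toFixed(2)),
  }));
}

// Mock response with made-up distances, already ordered ascending.
const sortedResponse = {
  hits: {
    hits: [
      { _source: { name: "Jiu Cai Ji Dan" }, sort: [0] },
      { _source: { name: "Sushi World" }, sort: [1.5779] },
    ],
  },
};
console.log(withDistances(sortedResponse));
```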

Adjust the radius (e.g., 3 km) by changing the distance parameter.

GET /restaurants/_search
{
  "query": {"geo_distance": {"distance": "3km", "location": {"lat": 40.730610, "lon": -73.935242}}},
  "sort": [{"_geo_distance": {"location": {"lat": 40.730610, "lon": -73.935242}, "order": "asc", "unit": "km"}}]
}

Update a restaurant’s location when it moves:

POST /restaurants/_update/1
{
  "doc": {"location": {"lat": 40.735610, "lon": -73.930242}}
}

Combine distance with a rating filter to show only highly rated places:

GET /restaurants/_search
{
  "query": {
    "bool": {
      "must": [
        {"geo_distance": {"distance": "5km", "location": {"lat": 40.730610, "lon": -73.935242}}},
        {"range": {"rating": {"gte": 4.5}}}
      ]
    }
  },
  "sort": [{"_geo_distance": {"location": {"lat": 40.730610, "lon": -73.935242}, "order": "asc", "unit": "km"}}]
}
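
The business scenario also mentions filtering by cuisine. Since distance, rating, and category are yes/no conditions that do not need relevance scoring, they can alternatively be placed in the bool query's filter clause, which skips scoring and makes the clauses eligible for caching. A sketch with hypothetical names (buildNearbyQuery and its options):

```javascript
// Build a nearby-restaurants request with optional rating and cuisine filters.
function buildNearbyQuery({ lat, lon, radiusKm = 5, minRating, category }) {
  const origin = { lat, lon };
  const filter = [
    { geo_distance: { distance: `${radiusKm}km`, location: origin } },
  ];
  if (minRating !== undefined) {
    filter.push({ range: { rating: { gte: minRating } } });
  }
  if (category) {
    // category is a keyword field, so an exact term match is used.
    filter.push({ term: { category } });
  }
  return {
    query: { bool: { filter } },
    sort: [{ _geo_distance: { location: origin, order: "asc", unit: "km" } }],
  };
}

const nearby = buildNearbyQuery({
  lat: 40.73061,
  lon: -73.935242,
  minRating: 4.5,
  category: "Vegan",
});
console.log(JSON.stringify(nearby, null, 2));
```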

Overall Summary

The five case studies demonstrate how Elasticsearch can serve as a versatile engine for full‑text search with highlighting, centralized log ingestion and real‑time analytics, content‑driven recommendation, numeric range aggregation for e‑commerce filtering, and geo‑spatial queries for location‑based services. By carefully designing indices, leveraging the rich query DSL, and integrating with Docker, Filebeat, Kibana, or front‑end JavaScript, developers can build scalable, high‑performance solutions across a wide range of backend scenarios.
