Solving Marketing Activity Product Search with Elasticsearch: When to Use Join

The article examines why front‑end product search fails during large marketing events, evaluates Elasticsearch's join feature and its drawbacks, compares nested, reverse‑modeling and flattened approaches, recommends reverse modeling for massive activity‑product data, and provides concrete DSL code, pagination and caching tips.

Mingyi World Elasticsearch

Mar 26, 2025

Problem Statement

In a typical e‑commerce system, all product data resides in Elasticsearch. When a marketing activity (e.g., Double‑11 or 618) involves thousands of products, pulling the entire dataset to the front‑end for search leads to slow loading, search lag, and a poor user experience.

Can join Be Used?

Elasticsearch does support a join field type that models parent-child relationships between documents in the same index, loosely similar to relations in a relational database. However, the author points out three major issues:

Slow: join queries (has_child, has_parent) are more expensive than ordinary queries, and performance degrades sharply as data volume grows.

Complex: The parent-child relationship must be declared in the mapping in advance, and every child document must be indexed with a reference to its parent, which adds work at indexing time.

Inflexible: In a distributed environment, a parent and all of its children must be routed to the same shard. This limits how the data can be spread across shards and makes the approach unsuitable when a single activity has tens of thousands of products.

Therefore, the author concludes that join is not a good fit for the "one activity, many products" scenario.
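
To make the three issues above concrete, here is a minimal sketch of what the join approach would involve, written as Python dicts that mirror the request bodies. The join field name `activity_product`, the relation labels `activity` and `product`, and the choice to use the activity ID as the parent document's `_id` are all assumptions for illustration; none of them come from the original article.

```python
import json

# Hypothetical mapping: one index holds both activities (parents) and
# products (children), linked through a field of type "join".
join_mapping = {
    "mappings": {
        "properties": {
            "activity_product": {
                "type": "join",
                "relations": {"activity": "product"},
            },
            "activity_id": {"type": "keyword"},
            "product_id": {"type": "keyword"},
        }
    }
}


def child_product_doc(product_id: str, parent_activity_id: str) -> dict:
    """Build a child (product) document that points at its parent activity.

    Assumption: the parent document's _id is the activity ID. The child also
    has to be indexed with routing set to the parent ID so both documents
    end up on the same shard.
    """
    return {
        "product_id": product_id,
        "activity_product": {"name": "product", "parent": parent_activity_id},
    }


def products_of_activity_query(activity_id: str) -> dict:
    """has_parent query: return child products whose parent activity matches."""
    return {
        "query": {
            "has_parent": {
                "parent_type": "activity",
                "query": {"term": {"activity_id": activity_id}},
            }
        }
    }


print(json.dumps(products_of_activity_query("act001"), indent=2))
```

Since a child document has exactly one parent, a product that takes part in two activities would need two child documents under this layout, so the join model does not avoid duplication either.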

Alternative Modeling Strategies

1. Nested Fields

Store activity information inside a nested field of the product document. Example document:

{
  "product_id": "123",
  "name": "Mobile phone",
  "price": 2000,
  "activities": [
    {"activity_id": "act001", "activity_name": "Double 11 Promotion"},
    {"activity_id": "act002", "activity_name": "New Year Special"}
  ]
}

Benefit: A nested query on the activities path filters products that belong to a specific activity, and each activity object is matched as a unit.

Drawback: Changing activity information means reindexing the whole product document, and the same activity data is repeated in every product that takes part in it, which costs storage when many products share one activity.

Suitable For: Scenarios with few activities and relatively stable relationships.
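
The example document above only behaves as described if `activities` is mapped as `nested`. A minimal sketch of the mapping and the matching query, written as Python dicts, could look like this. The field names follow the example document; the variable and function names are illustrative assumptions.

```python
import json

# Assumed mapping: "activities" must be declared as "nested" explicitly.
# Without it, Elasticsearch indexes the array as plain object fields and
# loses the pairing between activity_id and activity_name.
nested_mapping = {
    "mappings": {
        "properties": {
            "product_id": {"type": "keyword"},
            "name": {"type": "text"},
            "price": {"type": "float"},
            "activities": {
                "type": "nested",
                "properties": {
                    "activity_id": {"type": "keyword"},
                    "activity_name": {"type": "text"},
                },
            },
        }
    }
}


def nested_activity_query(activity_id: str) -> dict:
    """Find products that contain a nested activity with the given ID."""
    return {
        "query": {
            "nested": {
                "path": "activities",
                "query": {"term": {"activities.activity_id": activity_id}},
            }
        }
    }


print(json.dumps(nested_activity_query("act001"), indent=2))
```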

2. Reverse Modeling (Activity‑Product Index)

Create a separate index where each document represents an activity‑product pair:

{
  "activity_id": "act001",
  "activity_name": "Double 11 Promotion",
  "product_id": "123",
  "product_name": "Mobile phone",
  "price": 2000
}

Benefit: Searching by activity_id is very fast because it is a single term filter on a keyword field, with no join or nested lookup at query time.

Drawback: Data duplication (a space-for-time trade-off) and the need to keep the index synchronized whenever products or activities change.

Suitable For: Activities with a huge number of products where query speed is critical.
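
The synchronization cost mentioned above comes from fanning each product out into one document per activity. The following sketch shows one possible way to do that expansion. The function name and the `activity_id` + `_` + `product_id` document ID scheme are assumptions, chosen so that re-indexing the same pair overwrites it instead of creating a duplicate.

```python
from typing import Dict, Iterator, List


def to_activity_product_docs(product: Dict, activities: List[Dict]) -> Iterator[Dict]:
    """Expand one product into one activity-product document per activity."""
    for activity in activities:
        yield {
            # Hypothetical deterministic ID: the same pair always maps to the same document.
            "_id": f'{activity["activity_id"]}_{product["product_id"]}',
            "_source": {
                "activity_id": activity["activity_id"],
                "activity_name": activity["activity_name"],
                "product_id": product["product_id"],
                "product_name": product["name"],
                "price": product["price"],
            },
        }


product = {"product_id": "123", "name": "Mobile phone", "price": 2000}
activities = [
    {"activity_id": "act001", "activity_name": "Double 11 Promotion"},
    {"activity_id": "act002", "activity_name": "New Year Special"},
]
for doc in to_activity_product_docs(product, activities):
    print(doc["_id"], doc["_source"]["activity_name"])
```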

3. Flattened (Wide Table) Model

Embed activity IDs and names as arrays directly in the product document:

{
  "product_id": "123",
  "name": "Mobile phone",
  "price": 2000,
  "activity_ids": ["act001", "act002"],
  "activity_names": ["Double 11 Promotion", "New Year Special"]
}

Benefit: A simple term or terms query on activity_ids retrieves products quickly.

Drawback: Complex or frequently changing activity information becomes hard to maintain, because every change has to be applied to each affected product document.

Suitable For: Scenarios where activity data is simple and updates are infrequent.
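
As a rough illustration of both the benefit and the maintenance burden of the flattened model, the sketch below builds a terms query and removes one activity from a product while keeping the two arrays aligned. It assumes `activity_ids` is mapped as `keyword`; the helper names are hypothetical.

```python
from typing import Dict, List


def products_in_activities_query(activity_ids: List[str]) -> Dict:
    """Match products that belong to at least one of the given activities."""
    return {"query": {"terms": {"activity_ids": activity_ids}}}


def remove_activity(product: Dict, activity_id: str) -> Dict:
    """Return a copy of the product with one activity dropped from both arrays.

    activity_ids and activity_names are parallel arrays, so they have to be
    edited together or they drift out of sync.
    """
    pairs = [
        (aid, name)
        for aid, name in zip(product["activity_ids"], product["activity_names"])
        if aid != activity_id
    ]
    updated = dict(product)
    updated["activity_ids"] = [aid for aid, _ in pairs]
    updated["activity_names"] = [name for _, name in pairs]
    return updated


flat_product = {
    "product_id": "123",
    "activity_ids": ["act001", "act002"],
    "activity_names": ["Double 11 Promotion", "New Year Special"],
}
print(remove_activity(flat_product, "act001"))
```

When an activity ends, an update like this has to run for every product that was part of it, which is the maintenance cost the drawback refers to.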

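Recommended Solution

For activities that involve a very large number of products, reverse modeling is the recommended choice: it trades extra storage for the simplest and fastest query path. The implementation steps that follow build the activity_products index on that model.

Implementation Steps (DSL Code)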

1. Create Index Mapping

PUT /activity_products
{
  "mappings": {
    "properties": {
      "activity_id": {"type": "keyword"},
      "activity_name": {"type": "text"},
      "product_id": {"type": "keyword"},
      "product_name": {"type": "text"},
      "price": {"type": "float"}
    }
  }
}

2. Bulk Insert Sample Data

POST /activity_products/_bulk
{ "index": {} }
{ "activity_id": "act001", "activity_name": "Double 11 Promotion", "product_id": "123", "product_name": "Xiaomi Phone 14", "price": 3999 }
{ "index": {} }
{ "activity_id": "act002", "activity_name": "New Year Special", "product_id": "123", "product_name": "Xiaomi Phone 14", "price": 3999 }
... (additional activity‑product pairs) ...
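
The bulk request above lets Elasticsearch generate document IDs, so sending the same pairs twice would create duplicates. One way around that, sketched here under assumptions, is to generate the bulk body in code with an explicit `_id` per pair. The function name and the ID scheme are illustrative, not part of the original article.

```python
import json
from typing import Dict, Iterable


def build_bulk_body(pairs: Iterable[Dict]) -> str:
    """Render activity-product pairs as an NDJSON body for the _bulk endpoint.

    Each pair produces an action line followed by the document itself.
    The bulk API requires every line, including the last, to end with a newline.
    """
    lines = []
    for pair in pairs:
        # Hypothetical deterministic ID so repeated loads overwrite instead of duplicating.
        doc_id = f'{pair["activity_id"]}_{pair["product_id"]}'
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(pair, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


sample_pairs = [
    {"activity_id": "act001", "product_id": "123", "product_name": "Xiaomi Phone 14", "price": 3999},
    {"activity_id": "act002", "product_id": "123", "product_name": "Xiaomi Phone 14", "price": 3999},
]
print(build_bulk_body(sample_pairs))
```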

3. Search Products in a Specific Activity

GET /activity_products/_search
{
  "query": {"term": {"activity_id": "act001"}},
  "size": 10,
  "sort": [{"price": "asc"}]
}

The results are sorted by ascending price and limited to the first 10 hits.
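
Since the original problem is product search inside an activity, the same request can be extended with a keyword. The sketch below is one possible shape, assuming the `activity_products` mapping from step 1: the activity is a non-scoring filter and the optional keyword is a `match` on `product_name`. The function name and its parameters are hypothetical.

```python
from typing import Dict, List, Optional


def build_activity_search(activity_id: str, keyword: Optional[str] = None, size: int = 10) -> Dict:
    """Build a search body restricted to one activity, optionally with a keyword."""
    bool_query: Dict[str, List[Dict]] = {
        # Filter context: yes/no match on the activity, no relevance scoring.
        "filter": [{"term": {"activity_id": activity_id}}]
    }
    if keyword:
        # Full-text match on the analyzed product_name field.
        bool_query["must"] = [{"match": {"product_name": keyword}}]
    return {
        "query": {"bool": bool_query},
        "size": size,
        "sort": [{"price": "asc"}, {"product_id": "asc"}],
    }


print(build_activity_search("act001", keyword="phone"))
```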

4. Pagination Optimization with search_after

GET /activity_products/_search
{
  "query": {"term": {"activity_id": "act001"}},
  "size": 10,
  "sort": [{"price": "asc"}, {"product_id": "asc"}],
  "search_after": [2000, "123"]
}

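Passing the last hit's sort values (price, product_id) as search_after lets Elasticsearch continue from that position, which avoids the cost of deep from/size pagination, where every shard has to collect and sort from + size hits. The product_id tiebreaker keeps the order deterministic when several products share the same price.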
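
The paging loop implied by the request above can be sketched as follows. The `search` argument is an assumption: it stands for any callable that sends the body to `GET /activity_products/_search` and returns the parsed JSON response, so the sketch does not depend on a particular client library.

```python
from typing import Callable, Dict, Iterator, List, Optional

SearchFn = Callable[[Dict], Dict]


def iter_activity_products(search: SearchFn, activity_id: str, page_size: int = 10) -> Iterator[Dict]:
    """Yield every product of an activity, one search_after page at a time."""
    after: Optional[List] = None
    while True:
        body: Dict = {
            "query": {"term": {"activity_id": activity_id}},
            "size": page_size,
            "sort": [{"price": "asc"}, {"product_id": "asc"}],
        }
        if after is not None:
            # Resume right after the last hit of the previous page.
            body["search_after"] = after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        for hit in hits:
            yield hit["_source"]
        # Each hit carries its sort values in the "sort" array of the response.
        after = hits[-1]["sort"]
```

A front end would normally request one page per user action and send the last `sort` array back as a cursor; the generator form here just makes the cursor handling easy to follow.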

Additional Tips

Cache: Results for frequently accessed hot activities can be cached in Redis to reduce load on Elasticsearch.

Trim Fields: Store only the fields needed for search and display in the index; other details can be fetched later from a relational database by product ID.
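
The caching tip follows the usual cache-aside pattern: look in the cache first and only query Elasticsearch on a miss. The sketch below uses a small in-process class as a stand-in for Redis so that it runs without any external service. The class name, the key format, and the 30-second TTL are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process stand-in for Redis, used only to illustrate cache-aside."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling loader only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the Elasticsearch round trip
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(ttl_seconds=30)
# The lambda stands for the real search call for the first page of act001.
first_page = cache.get_or_load("act001:page1", lambda: ["123"])
print(first_page)
```

A short TTL keeps hot activity pages cheap to serve while limiting how long a price or stock change can stay stale.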

Conclusion

Moving the search from the front end to Elasticsearch on the back end eliminates the front-end bottleneck. While join works for small parent-child sets, it is unsuitable for massive activity-product relationships. Adjusting the data model, preferably with reverse modeling, delivers fast, scalable queries; a flattened model is an alternative when activity data is simple and rarely changes.
