How to Build a Powerful Site Search with Elasticsearch on Ubuntu

This article walks through installing Elasticsearch on Ubuntu, adding the IK Chinese analyzer and synonym filter, configuring custom analyzers, and using the Node.js client to index and query documents, providing a complete, reproducible setup for site‑wide full‑text search.

21CTO

Install Elasticsearch

The simplest deployment uses the Elasticsearch Dockerfile, but this guide shows a manual installation of version 2.1.1 on Ubuntu 14.04.3 LTS. First ensure Java is installed:

sudo apt-get install openjdk-7-jre-headless

Download and unzip the package:

wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/zip/elasticsearch/2.1.1/elasticsearch-2.1.1.zip
unzip elasticsearch-2.1.1.zip

Rename the extracted directory to ~/es_root (any location is fine) and start the service:

cd ~/es_root/bin/
chmod a+x elasticsearch
./elasticsearch

Verify that it is running (the URL is quoted so the shell does not interpret the question mark):

curl -XGET 'http://127.0.0.1:9200/?pretty'

If the JSON response shows the cluster name and version, Elasticsearch is up. By default it binds only to localhost; to reach it from another machine while debugging, edit ~/es_root/config/elasticsearch.yml and add:

network.bind_host: "0.0.0.0"
network.publish_host: _non_loopback:ipv4_

Do not use this configuration in production: it exposes the unauthenticated HTTP API on every network interface.

Install IK Analyzer

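Elasticsearch’s default tokenizer splits Chinese text into individual characters, so the IK plugin is needed for proper word segmentation. Download the plugin source. The command below fetches the master branch, so check the plugin's README to confirm that the build matches your Elasticsearch version: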

wget -c https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
unzip master.zip

Install Maven, build the plugin, and unpack the resulting zip into the plugins directory:

sudo apt-get install maven
cd elasticsearch-analysis-ik-master/
mvn package
mkdir -p ~/es_root/plugins/ik/
unzip target/releases/elasticsearch-analysis-ik-1.6.2.zip -d ~/es_root/plugins/ik/

Copy the IK configuration files:

mkdir -p ~/es_root/config/ik
cp -r config/ik/* ~/es_root/config/ik/

After restarting Elasticsearch you should see a log line like [plugins] [Libra] loaded [elasticsearch-analysis-ik]. Here Libra is the node name, which will differ on your machine.
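
Beyond the log line, one way to check that segmentation works is to run some text through the _analyze API. The sketch below assumes a client object created with the Node.js elasticsearch package (set up later in this article) and uses the ik_smart tokenizer name from the configuration that follows. listTokens is a hypothetical helper name, not part of the client.

```javascript
// Hypothetical helper: run `text` through a tokenizer via the _analyze API
// and return only the token strings. `client` is assumed to be an instance
// of the Node.js `elasticsearch` client.
function listTokens(client, tokenizer, text) {
  return client.indices.analyze({
    tokenizer: tokenizer,
    text: text
  }).then(function (resp) {
    // The _analyze response carries a `tokens` array; each entry has a `token` field.
    return resp.tokens.map(function (t) { return t.token; });
  });
}

// Usage (needs a running Elasticsearch with the IK plugin loaded):
// listTokens(client, 'ik_smart', '什么是 JS?').then(console.log);
```

If the plugin is loaded, the returned list should contain multi-character words rather than one entry per Chinese character.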

Configure Synonym Filter

Elasticsearch includes a synonym filter. To use it together with IK, define a custom analyzer in ~/es_root/config/elasticsearch.yml:

index:
  analysis:
    analyzer:
      ik_syno:
        type: custom
        tokenizer: ik_max_word
        filter: [my_synonym_filter]
      ik_syno_smart:
        type: custom
        tokenizer: ik_smart
        filter: [my_synonym_filter]
    filter:
      my_synonym_filter:
        type: synonym
        synonyms_path: analysis/synonym.txt

Create ~/es_root/config/analysis/synonym.txt with entries such as:

ua,user-agent,userAgent
js,javascript
internet explorer=>ie
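
The file uses two rule forms: a comma-separated list of equivalent terms, and an explicit mapping written with =>. The sketch below is only a simplified model of how those two forms read, not Elasticsearch's own parser; parseSynonymRule is a hypothetical name.

```javascript
// Illustrative parser for one line of a synonym file.
// This is NOT Elasticsearch's implementation, only a model of the two rule forms.
function parseSynonymRule(line) {
  var trim = function (s) { return s.trim(); };
  var parts = line.split('=>');
  if (parts.length === 2) {
    // Explicit mapping: terms on the left are replaced by terms on the right.
    return {
      type: 'mapping',
      from: parts[0].split(',').map(trim),
      to: parts[1].split(',').map(trim)
    };
  }
  // Equivalent synonyms: every term in the list can stand in for any other.
  var terms = line.split(',').map(trim);
  return { type: 'equivalent', from: terms, to: terms };
}

console.log(parseSynonymRule('js,javascript'));
console.log(parseSynonymRule('internet explorer=>ie'));
```

Read this way, a search for js can also match javascript and vice versa, while the mapping form rewrites the left-hand phrase to ie in one direction only.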

Use the JavaScript API

Install the official Node.js client:

npm install elasticsearch --save

Instantiate the client:

var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({
  host: '10.211.55.23:9200',
  log: 'trace'
});

All client methods support callbacks or promises. Example using a promise to get cluster info:

client.info({}).then(function(data) {
  console.log('result:', data);
}, function(err) {
  console.log('error:', err);
});
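
For comparison, here is the callback form, used for a quick liveness check with the client's ping method. This is a minimal sketch: isClusterUp is a hypothetical helper name and the 3000 ms timeout is an arbitrary assumption.

```javascript
// Hypothetical helper: resolve to true when the cluster answers a ping,
// false otherwise. Uses the callback form of the client API.
function isClusterUp(client) {
  return new Promise(function (resolve) {
    client.ping({ requestTimeout: 3000 }, function (err) {
      // The callback receives an error when the ping fails or times out.
      resolve(!err);
    });
  });
}

// Usage (needs the `client` created above):
// isClusterUp(client).then(function (up) { console.log('cluster up:', up); });
```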

Full‑text Search

Create an index and mapping that uses the custom analyzer:

client.indices.create({index : 'test'}).then(function() {
  // Add the mapping only after the index has been created.
  return client.indices.putMapping({
    index : 'test',
    type : 'article',
    body : {
      article: {
        properties: {
          title: {type: 'string', term_vector: 'with_positions_offsets', analyzer: 'ik_syno', search_analyzer: 'ik_syno'},
          content: {type: 'string', term_vector: 'with_positions_offsets', analyzer: 'ik_syno', search_analyzer: 'ik_syno'},
          slug: {type: 'string'},
          tags: {type: 'string', index: 'not_analyzed'},
          update_date: {type: 'date', index: 'not_analyzed'}
        }
      }
    }
  });
}).then(function() {
  console.log('mapping created');
}, function(err) {
  console.log('error:', err);
});
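
The two analyzed fields in the mapping share the same options, so they can be generated by a small helper. The sketch below also shows one possible variation that this article does not use: indexing with ik_syno but searching with the ik_syno_smart analyzer defined earlier. analyzedText is a hypothetical helper name, and whether the split suits your data is an assumption to test.

```javascript
// Hypothetical helper: field definition for analyzed text that keeps
// term vectors (positions and offsets) for highlighting.
function analyzedText(indexAnalyzer, searchAnalyzer) {
  return {
    type: 'string',
    term_vector: 'with_positions_offsets',
    analyzer: indexAnalyzer,
    // Fall back to the index-time analyzer when no search analyzer is given.
    search_analyzer: searchAnalyzer || indexAnalyzer
  };
}

var articleProperties = {
  title: analyzedText('ik_syno', 'ik_syno_smart'),
  content: analyzedText('ik_syno', 'ik_syno_smart'),
  slug: { type: 'string' },
  tags: { type: 'string', index: 'not_analyzed' },
  update_date: { type: 'date', index: 'not_analyzed' }
};

console.log(JSON.stringify(articleProperties, null, 2));
```

The resulting object can be passed as the properties value of the article mapping in the putMapping call.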

Index a sample document:

client.index({
  index : 'test',
  type : 'article',
  id : '100',
  body : {
    title : '什么是 JS?',
    slug :'what-is-js',
    tags : ['JS', 'JavaScript', 'TEST'],
    content : 'JS 是 JavaScript 的缩写!',
    update_date : '2015-12-15T13:05:55Z'
  }
});
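
A real site has many articles, and indexing them one request at a time is slow. The client also exposes a bulk method that takes alternating action and document entries. The following is a minimal sketch of building that body; buildBulkBody and the shape of the docs array (id plus source) are assumptions made for illustration.

```javascript
// Hypothetical helper: turn a list of {id, source} items into the
// alternating action/document array expected by the bulk API.
function buildBulkBody(index, type, docs) {
  var body = [];
  docs.forEach(function (doc) {
    // Action line: index this document under the given id.
    body.push({ index: { _index: index, _type: type, _id: doc.id } });
    // Source line: the document itself.
    body.push(doc.source);
  });
  return body;
}

var bulkBody = buildBulkBody('test', 'article', [
  { id: '100', source: { title: '什么是 JS?', slug: 'what-is-js' } },
  { id: '101', source: { title: 'UA 是什么?', slug: 'what-is-ua' } }
]);
console.log(bulkBody.length);

// Usage (needs a running cluster and the `client` created earlier):
// client.bulk({ body: bulkBody }).then(function (resp) { console.log(resp.errors); });
```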

Search for the term "JS":

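client.search({
  index : 'test',
  type : 'article',
  q : 'JS'
}).then(function(resp) {
  // resp.hits.hits holds the matching documents
  console.log(resp.hits.hits);
}, function(err) {
  console.log('error:', err);
});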

The response contains the matching document. For more advanced queries, use the Query DSL with dis_max, boosting, and highlighting as shown in the original article.
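
The original article's query is not reproduced here, so the following is only a sketch of what such a request body might look like. It combines a dis_max query over title and content, a higher boost on title matches, and highlighting. buildSearchBody is a hypothetical helper, and the boost, tie_breaker, and tag values are assumptions to tune.

```javascript
// Hypothetical helper: build a Query DSL body that searches title and content,
// scores mainly by the best-matching field (dis_max), weights title matches
// higher, and asks for highlighted fragments.
function buildSearchBody(keyword) {
  return {
    query: {
      dis_max: {
        tie_breaker: 0.3, // assumed value: lets the weaker field contribute a little
        queries: [
          { match: { title: { query: keyword, boost: 2 } } }, // assumed boost
          { match: { content: { query: keyword } } }
        ]
      }
    },
    highlight: {
      pre_tags: ['<b>'],
      post_tags: ['</b>'],
      fields: { title: {}, content: {} }
    }
  };
}

console.log(JSON.stringify(buildSearchBody('JS'), null, 2));

// Usage (needs a running cluster and the `client` created earlier):
// client.search({ index: 'test', type: 'article', body: buildSearchBody('JS') })
//   .then(function (resp) { console.log(resp.hits.hits); });
```

Each hit in the response then carries a highlight object alongside _source, with the matched terms wrapped in the chosen tags.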

Diagram

Elasticsearch vs MySQL terminology diagram

The tutorial concludes that the setup works, but further tuning, more data, and additional analyzers are needed for production use.
