Quickly Deploy a Search Engine with Docker and Searx

This guide shows how to set up the open‑source Searx search engine using a Docker image, walks through the essential Docker commands, explores its Python source code, and explains how to customize the response handling to build your own lightweight search engine.

MaGe Linux Operations

Aug 7, 2021

A group member asked how to quickly build a search engine. One answer is the open‑source project Searx (GitHub: https://github.com/asciimoo/searx), which provides a ready‑to‑use Docker image.

Deploy with Docker

The Docker image can be pulled and run with a few commands. The script below first stops and removes any existing searx container, then starts a new one with the desired environment variables and port mapping.

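# Find the ID of any existing searx container (running or stopped).
cid=$(sudo docker ps -a | grep searx | awk '{print $1}')
echo "searx cid is $cid"
# Stop and remove the old container, if one exists.
if [ -n "$cid" ]; then
    sudo docker stop $cid
    sudo docker rm $cid
fi
# Start a new container; host port 7777 maps to container port 8888.
sudo docker run -d --name searx -e IMAGE_PROXY=True -e BASE_URL=http://yourdomain.com -p 7777:8888 wonderfall/searx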

After the container is running, you can access the search engine on the host port mapped in the script, for example http://localhost:7777 on the Docker host.
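
As a quick way to exercise the running instance, the sketch below builds a search URL for it. It assumes the host port 7777 from the docker run command and a /search endpoint that takes a q parameter; build_search_url is a hypothetical helper, and format=json is only expected to work if JSON output is enabled in the instance's settings.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, fmt=None):
    """Return a search URL for `query` (hypothetical helper)."""
    params = {"q": query}
    if fmt:
        # e.g. "json"; assumes that output format is enabled on the instance
        params["format"] = fmt
    return base_url.rstrip("/") + "/search?" + urlencode(params)


# Host port 7777 is mapped to the container's port 8888 by `-p 7777:8888`.
url = build_search_url("http://localhost:7777", "docker searx")
print(url)  # http://localhost:7777/search?q=docker+searx
```

Opening that URL in a browser, or fetching it with an HTTP client, should return results once the container is up.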

Understanding the Source Code

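At its core, Searx sends the user's query to its configured engines and aggregates the results. An engine's data source can be a database, a file, or an API. Below is a simplified excerpt of an engine that reads from a JSON API: it parses slash‑separated path queries, walks nested dicts and lists to find matching keys, builds the request URL, and formats the response into result entries.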

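from urllib.parse import urlencode  # Python 3 location of urlencode
from json import loads
from collections.abc import Iterable

# Engine settings, normally supplied by the engine configuration.
search_url = None
url_query = None
content_query = None
title_query = None
paging = False  # referenced by request(), so it must be defined
suggestion_query = ''
results_query = ''

page_size = 1
first_page_num = 1

def iterate(iterable):
    # Yield (key, value) pairs; list indices are converted to strings.
    if isinstance(iterable, dict):
        it = iterable.items()
    else:
        it = enumerate(iterable)
    for index, value in it:
        yield str(index), value

def is_iterable(obj):
    # Strings are iterable but are treated as leaf values.
    if isinstance(obj, str):
        return False
    return isinstance(obj, Iterable)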

def parse(query):
    q = []
    for part in query.split('/'):
        if part == '':
            continue
        else:
            q.append(part)
    return q

def do_query(data, q):
    ret = []
    if not q:
        return ret
    qkey = q[0]
    for key, value in iterate(data):
        if len(q) == 1:
            if key == qkey:
                ret.append(value)
            elif is_iterable(value):
                ret.extend(do_query(value, q))
        else:
            if not is_iterable(value):
                continue
            if key == qkey:
                ret.extend(do_query(value, q[1:]))
            else:
                ret.extend(do_query(value, q))
    return ret

def query(data, query_string):
    q = parse(query_string)
    return do_query(data, q)

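def request(query, params):
    # urlencode gives 'q=<encoded>'; drop the leading 'q=' to keep only the value.
    query = urlencode({'q': query})[2:]
    fp = {'query': query}
    if paging and search_url.find('{pageno}') >= 0:
        # Map the 1-based page number onto the API's own page/offset scheme.
        fp['pageno'] = (params['pageno'] - 1) * page_size + first_page_num
    params['url'] = search_url.format(**fp)
    params['query'] = query
    return params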

def response(resp):
    results = []
    json = loads(resp.text)
    if results_query:
        for result in query(json, results_query)[0]:
            url = query(result, url_query)[0]
            title = query(result, title_query)[0]
            content = query(result, content_query)[0]
            results.append({'url': url, 'title': title, 'content': content})
    else:
        for url, title, content in zip(
                query(json, url_query),
                query(json, title_query),
                query(json, content_query)
        ):
            results.append({'url': url, 'title': title, 'content': content})
    if not suggestion_query:
        return results
    for suggestion in query(json, suggestion_query):
        results.append({'suggestion': suggestion})
    return results
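
To see how the slash‑separated path queries behave, here is a minimal Python 3 sketch of the same traversal applied to a made‑up JSON payload. The sample data and its keys are assumptions for illustration; the functions mirror parse, do_query, and query from the excerpt rather than reproduce Searx's code exactly.

```python
from collections.abc import Iterable


def iterate(data):
    """Yield (key, value) pairs; list indices are converted to strings."""
    items = data.items() if isinstance(data, dict) else enumerate(data)
    for key, value in items:
        yield str(key), value


def is_iterable(obj):
    # Strings are iterable, but they are leaf values here.
    return not isinstance(obj, str) and isinstance(obj, Iterable)


def do_query(data, path):
    """Collect every value reachable by matching the `path` keys in order."""
    found = []
    if not path:
        return found
    head = path[0]
    for key, value in iterate(data):
        if len(path) == 1:
            if key == head:
                found.append(value)
            elif is_iterable(value):
                found.extend(do_query(value, path))
        elif is_iterable(value):
            # Consume one path element on a match, else keep searching deeper.
            rest = path[1:] if key == head else path
            found.extend(do_query(value, rest))
    return found


def query(data, query_string):
    path = [part for part in query_string.split("/") if part]
    return do_query(data, path)


# Hypothetical API response, for illustration only.
sample = {
    "data": {
        "items": [
            {"link": "https://example.org/a", "name": "A"},
            {"link": "https://example.org/b", "name": "B"},
        ]
    }
}

print(query(sample, "items/link"))       # ['https://example.org/a', 'https://example.org/b']
print(query(sample, "data/items/name"))  # ['A', 'B']
```

Note that a path does not have to start at the root: a key that does not match is skipped and the search continues one level deeper, which is why items/link finds the links even though items sits under data.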

Customizing Results

By modifying the response function, you can tailor the returned data, whether it comes from the web, a database, or a file, to create a personalized mini search engine. Combining this with a tool such as jieba for Chinese word segmentation extends the approach to Chinese‑language text.
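
As one hedged illustration of such a customization, the sketch below rewrites response for a hypothetical JSON payload: it drops entries without a URL, removes duplicates, and truncates long snippets. The payload layout (hits, href, headline, summary), the max_content_len parameter, and the SimpleNamespace stand‑in for the HTTP response object are all assumptions; only the returned url, title, and content keys come from the excerpt above.

```python
from json import dumps, loads
from types import SimpleNamespace


def response(resp, max_content_len=80):
    """Turn a JSON payload into result dicts (illustrative only)."""
    results = []
    seen_urls = set()
    for hit in loads(resp.text).get("hits", []):
        url = hit.get("href")
        if not url or url in seen_urls:
            continue  # skip entries without a URL and duplicates
        seen_urls.add(url)
        content = hit.get("summary", "")
        if len(content) > max_content_len:
            content = content[:max_content_len].rstrip() + "..."
        results.append({
            "url": url,
            "title": hit.get("headline", url),
            "content": content,
        })
    return results


# Stand-in for the HTTP response; only the `.text` attribute is used.
fake = SimpleNamespace(text=dumps({"hits": [
    {"href": "https://example.org/1", "headline": "First", "summary": "x" * 100},
    {"href": "https://example.org/1", "headline": "Duplicate", "summary": "dup"},
    {"headline": "No link", "summary": "ignored"},
]}))

for item in response(fake):
    print(item["title"], len(item["content"]))  # First 83
```

If the data comes from a database or a file instead, only the part that produces the list of hits needs to change; the filtering and formatting stay the same.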
