Quickly Deploy a Search Engine with Docker and Searx
This guide shows how to set up the open-source Searx search engine using a Docker image, walks through the essential Docker commands, explores its Python source code, and explains how to customize the response handling to build your own lightweight search engine.
A group member asked how to quickly build a search engine, and the author points to the open‑source project Searx (GitHub: https://github.com/asciimoo/searx), which provides a ready‑to‑use Docker image.
Deploy with Docker
A prebuilt Docker image (the example below uses wonderfall/searx) can be pulled and run with a few commands. First, stop and remove any existing Searx container, then start a new one with the desired environment variables.
cid=$(sudo docker ps -a | grep searx | awk '{print $1}')
echo searx cid is $cid
if [ "$cid" != "" ]; then
    sudo docker stop $cid
    sudo docker rm $cid
fi
sudo docker run -d --name searx -e IMAGE_PROXY=True -e BASE_URL=http://yourdomain.com -p 7777:8888 wonderfall/searx
After the container is running, you can access the search engine on host port 7777, which the -p 7777:8888 option maps to port 8888 inside the container.
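Once the container is up, a quick way to check it from Python is to build a query URL against the mapped port. The sketch below assumes the instance is reachable at http://localhost:7777/ and that it accepts q and format=json parameters on its root path, which may depend on the Searx version and its settings. build_search_url is a hypothetical helper, not part of Searx.

```python
from urllib.parse import urlencode

# Assumption: host port 7777 is mapped to the container, as in the command above.
BASE_URL = "http://localhost:7777/"


def build_search_url(query, base_url=BASE_URL, fmt="json"):
    """Return a search URL for the instance (hypothetical helper)."""
    # urlencode escapes the query text, e.g. spaces become '+'.
    params = urlencode({"q": query, "format": fmt})
    return base_url + "?" + params


if __name__ == "__main__":
    print(build_search_url("docker searx"))
    # prints http://localhost:7777/?q=docker+searx&format=json
```

If the assumptions hold, opening the printed URL in a browser, or passing it to urllib.request.urlopen, should return the search results.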
Understanding the Source Code
At its core, Searx sends a request to a data source and aggregates the results that come back. Data sources can be databases, files, or APIs. Below is a simplified excerpt of the Python code that parses path-style queries, walks nested data structures, builds request URLs, and formats responses.
# Ported to Python 3; the original excerpt used Python 2 names
# (urllib.urlencode, dict.iteritems, unicode).
from collections.abc import Iterable
from json import loads
from urllib.parse import urlencode

# Module-level settings; they must be set before request() and response() are used.
search_url = None
url_query = None
content_query = None
title_query = None
suggestion_query = ''
results_query = ''
paging = False  # not defined in the original excerpt; added so request() runs
page_size = 1
first_page_num = 1


def iterate(iterable):
    # Yield (key, value) pairs; list indexes are converted to strings
    # so they can be compared with path segments.
    if isinstance(iterable, dict):
        it = iterable.items()
    else:
        it = enumerate(iterable)
    for index, value in it:
        yield str(index), value


def is_iterable(obj):
    # Strings are iterable in Python, but they are treated as leaf values here.
    if isinstance(obj, (str, bytes)):
        return False
    return isinstance(obj, Iterable)


def parse(query):
    # Split a path such as 'data/items' into its non-empty segments.
    q = []
    for part in query.split('/'):
        if part == '':
            continue
        q.append(part)
    return q


def do_query(data, q):
    # Recursively collect every value reachable by following the path q.
    ret = []
    if not q:
        return ret
    qkey = q[0]
    for key, value in iterate(data):
        if len(q) == 1:
            # Last segment: collect matches, otherwise keep searching deeper.
            if key == qkey:
                ret.append(value)
            elif is_iterable(value):
                ret.extend(do_query(value, q))
        else:
            if not is_iterable(value):
                continue
            if key == qkey:
                # Segment matched: continue with the rest of the path.
                ret.extend(do_query(value, q[1:]))
            else:
                ret.extend(do_query(value, q))
    return ret


def query(data, query_string):
    q = parse(query_string)
    return do_query(data, q)


def request(query, params):
    # Encode the query and drop the leading 'q=' so only the value remains.
    query = urlencode({'q': query})[2:]
    fp = {'query': query}
    if paging and search_url.find('{pageno}') >= 0:
        fp['pageno'] = (params['pageno'] - 1) * page_size + first_page_num
    params['url'] = search_url.format(**fp)
    params['query'] = query
    return params


def response(resp):
    results = []
    json = loads(resp.text)
    if results_query:
        # Each result is an object; extract its fields relative to that object.
        for result in query(json, results_query)[0]:
            url = query(result, url_query)[0]
            title = query(result, title_query)[0]
            content = query(result, content_query)[0]
            results.append({'url': url, 'title': title, 'content': content})
    else:
        # Fields are parallel lists; pair them up by position.
        for url, title, content in zip(
            query(json, url_query),
            query(json, title_query),
            query(json, content_query)
        ):
            results.append({'url': url, 'title': title, 'content': content})
    if not suggestion_query:
        return results
    for suggestion in query(json, suggestion_query):
        results.append({'suggestion': suggestion})
    return results
Customizing Results
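The path-style query strings (results_query, url_query, and so on) decide which parts of a JSON response end up in the results, so they are the first thing to adjust when customizing. Below is a minimal Python 3 sketch of that lookup, condensed from the excerpt above. The sample data is invented for illustration and does not come from any real API.

```python
from collections.abc import Iterable


def iterate(data):
    """Yield (key, value) pairs; list indexes become strings."""
    items = data.items() if isinstance(data, dict) else enumerate(data)
    for key, value in items:
        yield str(key), value


def is_iterable(obj):
    """Treat strings as leaf values, everything else iterable as a container."""
    return isinstance(obj, Iterable) and not isinstance(obj, (str, bytes))


def do_query(data, path):
    """Collect every value reachable by following the path segments."""
    found = []
    if not path:
        return found
    head = path[0]
    for key, value in iterate(data):
        if len(path) == 1:
            if key == head:
                found.append(value)
            elif is_iterable(value):
                found.extend(do_query(value, path))
        elif is_iterable(value):
            # Consume one segment on a match, otherwise keep searching deeper.
            rest = path[1:] if key == head else path
            found.extend(do_query(value, rest))
    return found


def query(data, query_string):
    path = [part for part in query_string.split("/") if part]
    return do_query(data, path)


# Hypothetical API response, invented for this example.
sample = {
    "meta": {"total": 2},
    "data": {
        "items": [
            {"link": "https://example.org/a", "name": "A"},
            {"link": "https://example.org/b", "name": "B"},
        ]
    },
}

if __name__ == "__main__":
    print(query(sample, "link"))
    # prints ['https://example.org/a', 'https://example.org/b']
    print(query(sample, "items/0/name"))
    # prints ['A']
    print(len(query(sample, "data/items")[0]))
    # prints 2
```

Note that a single-segment path such as link matches at any depth, while data/items returns a list containing the items list, which is why the response function takes element [0] of the results_query lookup.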
By modifying the response function, you can tailor the returned data, whether it comes from the web, a database, or a file, to create a personalized mini-search engine. Combining this with tools like jieba for Chinese tokenization makes the project even more versatile.
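As one possible illustration of a customized response step, the sketch below filters records loaded from JSON text (for example, the contents of a local file) and returns them in the same url/title/content shape used above. The function names, the record layout, and the simple substring match are all assumptions made for this example and are not part of Searx.

```python
import json


def search_records(records, keyword):
    """Return result dicts for records whose title or content mentions keyword."""
    needle = keyword.lower()
    results = []
    for record in records:
        title = record.get("title", "")
        content = record.get("content", "")
        # Case-insensitive substring match; a tokenizer could replace this step.
        if needle in (title + " " + content).lower():
            results.append({
                "url": record.get("url", ""),
                "title": title,
                "content": content,
            })
    return results


def response_from_text(raw_json, keyword):
    """Parse a JSON array of records and filter it by keyword."""
    return search_records(json.loads(raw_json), keyword)


# Hypothetical records standing in for the contents of a local JSON file.
raw = json.dumps([
    {"url": "https://example.org/docker", "title": "Docker basics",
     "content": "Running containers"},
    {"url": "https://example.org/python", "title": "Python tips",
     "content": "Parsing JSON"},
])

if __name__ == "__main__":
    for hit in response_from_text(raw, "docker"):
        print(hit["url"], hit["title"])
    # prints https://example.org/docker Docker basics
```

Keeping the output in the same dictionary shape means the rest of the pipeline does not need to know whether the data came from the web, a database, or a file.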
MaGe Linux Operations
