Build a Flask‑Elasticsearch Search Engine: From Config to Deployment

This tutorial walks through building a Flask-based search engine backed by Elasticsearch. It covers the configuration file, logging setup, blueprint routes with pagination, application initialization, and running the app with Flask-Script or Gunicorn. The full source code is hosted on GitHub.

Python Crawling & Data Mining

Configuration

A Config.py file defines database credentials, secret key, SQLAlchemy options, debug mode, and mail server settings. The configuration class is used by the Flask application.

#coding:utf-8
import os
DB_USERNAME = 'root'
DB_PASSWORD = ''  # empty string if no password (None would be formatted into the URI as the text "None")
DB_HOST = '127.0.0.1'
DB_PORT = '3306'
DB_NAME = 'flask_es'

class Config:
    SECRET_KEY = "random characters"  # replace with a long random secret key
    SQLALCHEMY_COMMIT_ON_TEARDOWN = True  # auto commit
    SQLALCHEMY_TRACK_MODIFICATIONS = True  # track modifications
    DEBUG = True  # debug mode
    SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://%s:%s@%s:%s/%s' % (DB_USERNAME, DB_PASSWORD, DB_HOST, DB_PORT, DB_NAME)  # database URL

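    MAIL_SERVER = 'smtp.qq.com'
    MAIL_PORT = 465  # Flask-Mail reads MAIL_PORT
    MAIL_USERNAME = '[email protected]'
    MAIL_PASSWORD = 'email authorization code'  # the mailbox's SMTP authorization code
    FLASK_MAIL_SUBJECT_PREFIX = 'M_KEPLER'
    FLASK_MAIL_SENDER = MAIL_USERNAME  # default sender
    MAIL_USE_SSL = True  # port 465 expects an implicit SSL connection
    MAIL_USE_TLS = False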
    MAIL_DEBUG = False
    ENABLE_THREADS = True
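
The database URI above interpolates the credentials directly into the string, so a password containing characters such as @ or / would produce a malformed URL. The following is a minimal sketch of a helper that percent-encodes the credentials and leaves out the password part when none is set. The name build_db_uri and its parameters are hypothetical and not part of the original project.

```python
from urllib.parse import quote_plus


def build_db_uri(username, password, host, port, name, driver="mysql+pymysql"):
    """Assemble a SQLAlchemy-style URI (hypothetical helper, not from the project).

    Credentials are percent-encoded; an empty or None password is omitted.
    """
    auth = quote_plus(username)
    if password:  # None or "" is treated as "no password"
        auth += ":" + quote_plus(password)
    return "%s://%s@%s:%s/%s" % (driver, auth, host, port, name)


if __name__ == "__main__":
    print(build_db_uri("root", None, "127.0.0.1", "3306", "flask_es"))
```

The result could be assigned to SQLALCHEMY_DATABASE_URI in place of the formatted string.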

Logging

A Logger.py sets up colored console output and a rotating file handler, then creates a logger named log for use throughout the project.

#coding=utf-8
import os
import logging
import logging.config as log_conf
import datetime
import coloredlogs

coloredlogs.DEFAULT_FIELD_STYLES = {
    'asctime': {'color': 'green'},
    'hostname': {'color': 'magenta'},
    'levelname': {'color': 'magenta', 'bold': False},
    'name': {'color': 'green'}
}

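# logs/ directory two levels above this file
log_dir = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), 'logs')
os.makedirs(log_dir, exist_ok=True)  # no error if the directory already exists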

today = datetime.datetime.now().strftime("%Y-%m-%d")
log_path = os.path.join(log_dir, today + ".log")

log_config = {
    'version': 1,
    'formatters': {
        'colored_console': {
            'format': "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
            'datefmt': '%H:%M:%S'
        },
        'detail': {
            'format': "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
            'datefmt': "%Y-%m-%d %H:%M:%S"
        }
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'level': 'DEBUG',
            'formatter': 'colored_console'
        },
        'file': {
            'class': 'logging.handlers.RotatingFileHandler',
            'maxBytes': 1024 * 1024 * 1024,
            'backupCount': 1,
            'filename': log_path,
            'level': 'INFO',
            'formatter': 'detail',
            'encoding': 'utf-8'
        }
    },
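    'loggers': {
        # The key must match the name passed to logging.getLogger() below,
        # otherwise this configuration is never applied.
        'log': {
            'handlers': ['file'],  # the console handler is attached by coloredlogs.install()
            'level': 'DEBUG'
        }
    }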
}

log_conf.dictConfig(log_config)
log_v = logging.getLogger('log')
coloredlogs.install(level='DEBUG', logger=log_v)
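
For comparison, the same console-plus-rotating-file setup can be sketched with the standard library alone, without dictConfig or coloredlogs. This is an illustrative variant under stated assumptions, not the project's code: make_logger, its parameters, and the 10 MB default size are hypothetical.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def make_logger(name, log_dir, max_bytes=10 * 1024 * 1024, backups=1):
    """Return a logger that writes DEBUG+ to the console and INFO+ to a rotating file."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if logger.handlers:
        # Already configured earlier; avoid attaching duplicate handlers.
        return logger

    fmt = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")

    console = logging.StreamHandler()
    console.setLevel(logging.DEBUG)
    console.setFormatter(fmt)

    file_handler = RotatingFileHandler(
        os.path.join(log_dir, name + ".log"),
        maxBytes=max_bytes,
        backupCount=backups,
        encoding="utf-8",
    )
    file_handler.setLevel(logging.INFO)
    file_handler.setFormatter(fmt)

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger
```

Because the file handler is set to INFO, debug messages appear on the console only, which mirrors the handler levels in the dictionary configuration.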

Routes and Blueprints

Two blueprints, baike and math, provide separate entry points. The baike blueprint includes routes for the index page, search handling with pagination, and detail pages identified by a UID.

#-*- coding:utf-8 -*-
import os
from flask_paginate import Pagination, get_page_parameter
from app.Logger.logger import log_v
from app.elasticsearchClass import elasticSearch
from app.home.forms import SearchForm
from app.home.baike import baike
from flask import request, jsonify, render_template, redirect, url_for

baike_es = elasticSearch(index_type="baike_data", index_name="baike")

@baike.route('/')
def index():
    searchForm = SearchForm()
    return render_template('baike/index.html', searchForm=searchForm)

@baike.route('/search', methods=['GET', 'POST'])
def baikeSearch():
    search_key = request.args.get('b', default=None)
    if search_key:
        searchForm = SearchForm()
        log_v.info("[+] Search Keyword: " + search_key)
        match_data = baike_es.search(search_key, count=30)
        hits = match_data["hits"]["hits"]
        PER_PAGE = 10
        page = request.args.get(get_page_parameter(), type=int, default=1)
        start = (page - 1) * PER_PAGE
        end = start + PER_PAGE
        total = len(hits)  # at most 30, the count requested above
        pagination = Pagination(page=page, per_page=PER_PAGE, total=total)
        context = {
            'match_data': hits[start:end],  # only the hits for the current page
            'pagination': pagination,
            'uid_link': '/baike/'
        }
        return render_template('data.html', q=search_key, searchForm=searchForm, **context)
    # redirect() expects a URL, so resolve the endpoint name first
    return redirect(url_for('home.index'))

@baike.route('/<uid>')
def baikeSd(uid):
    base_path = os.path.abspath('app/templates/s_d/')
    old_file = os.listdir(base_path)[0]
    old_path = os.path.join(base_path, old_file)
    file_path = os.path.abspath('app/templates/s_d/{}.html'.format(uid))
    if not os.path.exists(file_path):
        log_v.debug("[-] File does not exist, renaming !!!")
        os.rename(old_path, file_path)
    match_data = baike_es.id_get_doc(uid=uid)
    return render_template('s_d/{}.html'.format(uid), match_data=match_data)
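
The search route fetches a fixed number of hits and slices them in Python. The sketch below isolates that slice arithmetic and shows, as an alternative, how the same page could be requested from Elasticsearch with the from and size parameters of the search body. The helper names page_bounds and build_search_body are hypothetical, and the field names "title" and "content" are assumptions, since the project's elasticSearch wrapper class and index mapping are not shown in this article.

```python
def page_bounds(page, per_page, total):
    """Return (start, end) slice indices for a 1-based page, clamped to total."""
    page = max(page, 1)  # treat page 0 or negative pages as page 1
    start = min((page - 1) * per_page, total)
    end = min(start + per_page, total)
    return start, end


def build_search_body(keyword, page, per_page, fields=("title", "content")):
    """Build a search request body that lets Elasticsearch do the paging.

    The field names are assumptions; adjust them to the real index mapping.
    """
    start = (max(page, 1) - 1) * per_page
    return {
        "from": start,     # offset of the first hit to return
        "size": per_page,  # number of hits to return
        "query": {"multi_match": {"query": keyword, "fields": list(fields)}},
    }


if __name__ == "__main__":
    start, end = page_bounds(page=2, per_page=10, total=30)
    print(start, end)
    print(build_search_body("flask", page=2, per_page=10))
```

Slicing in Python is simple when the result set is capped at a few dozen hits. Paging in the query avoids transferring hits that the current page never displays.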

Application Initialization

The Flask app is created, configured, and extensions such as SQLAlchemy, CSRF protection, and Mail are initialized. Blueprints are registered with appropriate URL prefixes.

#-*- coding:utf8 -*-
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from app.config.config import Config
from flask_mail import Mail
from flask_wtf.csrf import CSRFProtect

app = Flask(__name__, template_folder='templates', static_folder='static')
app.config.from_object(Config)

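# Passing app to the constructor already initializes the extension,
# so a separate db.init_app(app) call is not needed.
db = SQLAlchemy(app)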

csrf = CSRFProtect(app)
mail = Mail(app)
# Import blueprints after db is created
from app.home.baike import baike as baike_blueprint
from app.home.math import math as math_blueprint
from app.home.home import home as home_blueprint

app.register_blueprint(home_blueprint)
app.register_blueprint(math_blueprint, url_prefix="/math")
app.register_blueprint(baike_blueprint, url_prefix="/baike")
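
The url_prefix passed to register_blueprint decides where a blueprint's routes are mounted. A minimal, self-contained sketch of that behavior follows. It depends only on Flask, uses a stand-in blueprint rather than the project's modules, and wraps the setup in a create_app function, which is a common pattern but an assumption here (the original creates the app at module level).

```python
from flask import Blueprint, Flask

# Stand-in blueprint for illustration; the real one lives in app/home/baike.
baike = Blueprint("baike", __name__)


@baike.route("/")
def index():
    return "baike index"


def create_app():
    """Create an app and mount the blueprint under /baike."""
    app = Flask(__name__)
    app.config["SECRET_KEY"] = "change-me"  # placeholder value
    app.register_blueprint(baike, url_prefix="/baike")
    return app


if __name__ == "__main__":
    client = create_app().test_client()
    print(client.get("/baike/").data)
```

With this registration the blueprint's "/" route is served at /baike/, and a request to the bare "/" path returns 404 unless another blueprint handles it.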

Running the Project

Use flask_script to start the development server, or launch the application with gunicorn using a custom configuration file for production.

#-*- coding:utf8 -*-
from app import app
from flask_script import Manager, Server

manage = Manager(app)
manage.add_command("runserver", Server(use_debugger=True))

if __name__ == "__main__":
    manage.run()

# gunicorn configuration (gconfig.py), a separate file from the manager script above
# start with: gunicorn -c gconfig.py app:app
import multiprocessing
from gevent import monkey
monkey.patch_all()

workers = multiprocessing.cpu_count() * 2 + 1
debug = True
reload = True
loglevel = 'debug'
threads = 2
bind = '0.0.0.0:5001'
daemon = False  # boolean, not the string 'false'
worker_class = 'gevent'
worker_connections = 2000
pidfile = 'log/gunicorn.pid'
logfile = 'log/debug.log'
accesslog = 'log/gunicorn_access.log'
errorlog = 'log/gunicorn_error.log'

The development server listens on http://127.0.0.1:5000 by default, while the Gunicorn configuration above binds to port 5001. The full source code is hosted on GitHub.
