Flink-Based Real‑Time Recommendation System: Architecture, Logic, and Docker Deployment Guide

This article presents a comprehensive walkthrough of a Flink‑powered recommendation system, detailing its v2.0 architecture, module functions, recommendation algorithms (hotness, product similarity, collaborative filtering), front‑end and back‑end UI, and step‑by‑step Docker deployment of MySQL, Redis, HBase, and Kafka services.

IT Architects Alliance

May 22, 2021

Introduction – This walkthrough is based on a practical Flink recommendation project on GitHub (https://github.com/CheckChe0803/flink-recommandSystem-demo), whose source code is available to readers interested in a real‑world implementation.

1. System Architecture v2.0

1.1 Architecture diagram – (image omitted)

1.2 Module description

Log data module (flink-2-hbase) contains six Flink jobs:

User‑product browsing history – records user clicks for item‑based collaborative filtering and stores scores in HBase table p_history.

User interest – calculates contextual interest from the intervals between a user's actions; the state is cleared when action=3 (collect) arrives or after 100 s, and the result is stored in u_interest.

User profile – builds a tag‑based profile (color, origin, style), stored in user.

Product profile – records age‑group and gender preferences, stored in prod.

Hotness ranking – uses Flink windows to compute real‑time hotness and caches the results in a Redis list keyed by timestamp.

Log import – consumes Kafka streams and writes raw logs to the HBase con table for downstream aggregation.
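To make the hotness ranking job more concrete, here is a minimal sketch in plain Python rather than Flink. It groups click events into fixed (tumbling) windows and ranks products by click count per window. The window size, the list length, the event shape `(timestamp, product_id)`, and the function name `hot_lists` are all assumptions for illustration; the article does not say which window type or size the project uses.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # assumed window size; not stated in the article
TOP_N = 3            # assumed length of the hot list


def hot_lists(events, window_seconds=WINDOW_SECONDS, top_n=TOP_N):
    """Group (timestamp, product_id) clicks into tumbling windows.

    Returns {window_end_timestamp: [product_id, ...]}, with products
    ordered by click count inside that window.
    """
    counts = defaultdict(Counter)
    for ts, product_id in events:
        # Every event belongs to exactly one window, identified by its end time.
        window_end = (ts // window_seconds + 1) * window_seconds
        counts[window_end][product_id] += 1

    result = {}
    for window_end, counter in sorted(counts.items()):
        # Sort by count descending, then by product id for a stable order.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        result[window_end] = [pid for pid, _ in ranked[:top_n]]
    return result


# Hypothetical click log: (seconds since start, product id).
events = [(1, "p1"), (5, "p2"), (20, "p1"), (59, "p3"),
          (61, "p2"), (70, "p2"), (90, "p1")]
lists = hot_lists(events)
print(lists)
```

Keying each list by its window end time mirrors the idea of a Redis list keyed by timestamp: a reader can fetch the most recent key to get the current ranking.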

Web module:

Front‑end UI – displays recommended product list to users.

Back‑end monitoring page – shows key metrics to administrators.

2. Recommendation Engine Logic

2.1 Hotness‑based recommendation – Re‑ranks the hotness list according to user features, then combines it with two other algorithms to produce final scores.
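The article does not spell out how user features influence the ordering, so the following is only a sketch of one plausible approach: products in the hot list that match more of the user's preferred tags move up, and the original hotness order breaks ties. The function name `rerank`, the tag sets, and the sample data are hypothetical.

```python
def rerank(hot_list, product_tags, user_tags):
    """Reorder a hot list by tag overlap with the user's profile.

    hot_list     -- product ids, hottest first
    product_tags -- {product_id: set of tags}
    user_tags    -- set of tags the user prefers
    """
    def sort_key(indexed):
        position, pid = indexed
        matches = len(product_tags.get(pid, set()) & user_tags)
        # More matches first; the original hotness position breaks ties.
        return (-matches, position)

    return [pid for _, pid in sorted(enumerate(hot_list), key=sort_key)]


# Hypothetical data for illustration only.
hot_list = ["p1", "p2", "p3"]
product_tags = {
    "p1": {"red", "china", "sport"},
    "p2": {"blue", "japan", "casual"},
    "p3": {"blue", "china", "casual"},
}
user_tags = {"blue", "casual"}
reranked = rerank(hot_list, product_tags, user_tags)
print(reranked)
```

In this sample, p2 and p3 each match two of the user's tags and move ahead of p1, while keeping their relative hotness order.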

2.2 Product‑profile similarity – Uses three product attributes (color, country, style) and cosine similarity to compute item‑item relevance, filtering the hotness list.
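As an illustration of the cosine-similarity step, the sketch below one-hot encodes the three attributes and compares the resulting vectors. The encoding scheme, the helper names `one_hot` and `cosine`, and the sample products are assumptions; the project may represent product attributes differently.

```python
import math


def one_hot(product, vocab):
    """Encode a product as a 0/1 vector over every (attribute, value) pair."""
    return [1.0 if product[attr] == value else 0.0 for attr, value in vocab]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical catalogue for illustration only.
products = {
    "p1": {"color": "red", "country": "china", "style": "sport"},
    "p2": {"color": "red", "country": "japan", "style": "sport"},
    "p3": {"color": "blue", "country": "korea", "style": "casual"},
}

# The vocabulary lists every attribute value seen in the catalogue.
vocab = sorted({(attr, value) for p in products.values() for attr, value in p.items()})
vectors = {pid: one_hot(p, vocab) for pid, p in products.items()}

sim_12 = cosine(vectors["p1"], vectors["p2"])  # two of three attributes shared
sim_13 = cosine(vectors["p1"], vectors["p3"])  # nothing shared
print(round(sim_12, 3), sim_13)
```

With this encoding the similarity is simply the fraction of shared attributes, so p1 and p2 score about 0.67 and p1 and p3 score 0.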

2.3 Collaborative‑filtering similarity – Calculates similarity scores from the user‑product HBase table using a formula (image omitted).
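Because the formula image is missing, the sketch below assumes the common item-based co-occurrence measure, sim(i, j) = |N(i) ∩ N(j)| / sqrt(|N(i)| · |N(j)|), where N(i) is the set of users who interacted with product i. This may differ from the formula the project actually uses. The function name `item_similarity` and the sample history are hypothetical.

```python
import math
from collections import defaultdict


def item_similarity(history):
    """Compute pairwise item similarity from user interaction history.

    history -- {user_id: set of product ids the user interacted with}
    Returns {(item_i, item_j): similarity} for i < j with at least one shared user.
    """
    # Invert the history: which users touched each item?
    users_of = defaultdict(set)
    for user, items in history.items():
        for item in items:
            users_of[item].add(user)

    sims = {}
    items = sorted(users_of)
    for i in items:
        for j in items:
            if i < j:
                common = len(users_of[i] & users_of[j])
                if common:
                    sims[(i, j)] = common / math.sqrt(
                        len(users_of[i]) * len(users_of[j])
                    )
    return sims


# Hypothetical browsing history for illustration only.
history = {
    "u1": {"p1", "p2"},
    "u2": {"p1", "p2", "p3"},
    "u3": {"p1"},
    "u4": {"p3"},
}
sims = item_similarity(history)
for pair, score in sorted(sims.items()):
    print(pair, round(score, 3))
```

The square-root term in the denominator keeps very popular products from appearing similar to everything merely because many users have touched them.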

3. Front‑end Recommendation Page – Shows three columns: hotness ranking, collaborative‑filtering results, and product‑profile recommendations (image omitted).

4. Back‑end Data Dashboard – Real‑time display of hotness ranking and one‑hour log ingestion metrics; data originates from other Flink modules and is stored in resource/database.sql (image omitted).

5. Deployment Instructions

All services are containerised with Docker. The following commands illustrate how to pull images and run containers.

MySQL

docker pull mysql:5.7
docker run --name local-mysql -p 3308:3306 -e MYSQL_ROOT_PASSWORD=123456 -d mysql:5.7

Key flags:

--name local-mysql – container name.

-p 3308:3306 – maps host port 3308 to container port 3306.

-e MYSQL_ROOT_PASSWORD=123456 – root password.

-d – run in background.

Redis

docker run --name local-redis -p 6379:6379 -d redis

HBase

docker pull harisekhon/hbase
docker run -d -h base-server \
  -p 2181:2181 -p 8080:8080 -p 8085:8085 -p 9090:9090 \
  -p 9000:9000 -p 9095:9095 -p 16000:16000 \
  -p 16010:16010 -p 16201:16201 -p 16301:16301 \
  -p 16020:16020 \
  --name hbase harisekhon/hbase

After startup, access the web UI at http://localhost:16010/master-status.

Kafka

# Pull images (the Kafka Manager image matches the one started below)
docker pull wurstmeister/zookeeper
docker pull wurstmeister/kafka
docker pull hlebalbau/kafka-manager:stable

# Run Zookeeper
docker run -d --name zookeeper --publish 2181:2181 \
  --volume /etc/localtime:/etc/localtime \
  --restart=always wurstmeister/zookeeper

# Run Kafka (replace 192.168.1.8 with the IP address of your Docker host)
docker run --name kafka -p 9092:9092 \
  --link zookeeper:zookeeper \
  -e KAFKA_ADVERTISED_HOST_NAME=192.168.1.8 \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -d wurstmeister/kafka

# Run Kafka Manager
docker run -d --link zookeeper:zookeeper -p 9000:9000 \
  -e ZK_HOSTS="zookeeper:2181" \
  hlebalbau/kafka-manager:stable -Dpidfile.path=/dev/null

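Optional Kafka‑Manager authentication can be set via the environment variables KAFKA_MANAGER_AUTH_ENABLED, KAFKA_MANAGER_USERNAME, and KAFKA_MANAGER_PASSWORD. Note that the HBase container above already publishes host ports 2181 and 9000, which the Zookeeper and Kafka Manager commands also use; when everything runs on one host, change the host side of one of the conflicting mappings.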
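Before configuring the application modules, it can help to confirm that the published ports are reachable. The following is a small sketch using only the Python standard library; the port numbers are the host ports from the docker run commands above, while the service labels and the function names `is_open` and `check_services` are made up for this example. A successful TCP connection only shows that something is listening, not that the service is healthy.

```python
import socket

# Host ports taken from the docker run commands in this section.
SERVICES = {
    "mysql": 3308,
    "redis": 6379,
    "hbase-master-ui": 16010,
    "kafka": 9092,
    "kafka-manager": 9000,
}


def is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", services=SERVICES):
    """Map each service label to whether its port accepts connections."""
    return {name: is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:16} {'up' if ok else 'DOWN'}")
```

If the containers run on another machine, pass that machine's address as `host`.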

Service Startup

Configure IPs and ports of the deployed services in the flink-2-hbase and web modules.

Run mvn clean install in the flink-2-hbase root to build the JAR.

Start the Flink tasks (right‑click in IDEA).

Launch the SchedulerJob to periodically compute collaborative‑filtering and user‑profile scores.

Open the web project in IDEA; after the generated JAR is imported, start the web service.

Note: initially the recommendation page will show random products until user interaction generates click logs.

6. Future Work

Add monitoring for Flink tasks.

Enhance the data dashboard with more detailed metrics.

Calculate business indicators such as recall and precision.
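As a pointer for the last item, precision and recall for a recommendation list are commonly computed against the items a user actually interacted with. The sketch below shows precision@k and recall@k; the function name `precision_recall_at_k`, the choice of k, and the sample data are assumptions, since the article does not define how these indicators would be measured.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision@k and recall@k.

    recommended -- ordered list of recommended product ids
    relevant    -- set of product ids the user actually interacted with
    k           -- number of top recommendations to evaluate
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data for illustration only.
recommended = ["p2", "p3", "p1", "p7"]
clicked = {"p3", "p7", "p9"}
precision, recall = precision_recall_at_k(recommended, clicked, k=3)
print(round(precision, 3), round(recall, 3))
```

Here only p3 among the top three was clicked, so both values come out to one third: one hit out of three recommendations, and one hit out of three clicked products.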
